JPH0997257A

JPH0997257A - Machine translation system

Info

Publication number: JPH0997257A
Application number: JP7251264A
Authority: JP
Inventors: Mihoko Kitamura; 美穂子北村; Hideki Yamamoto; 秀樹山本; Mitsuo Shimohata; 光夫下畑
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1995-09-28
Filing date: 1995-09-28
Publication date: 1997-04-08

Abstract

PROBLEM TO BE SOLVED: To reduce the need to tune the contents of a grammar dictionary database by obtaining, from the features of an input document, the past translation environment estimated to be optimal for that document, and using the grammar dictionary database contents of that environment for translation. SOLUTION: A document feature extraction part 12 extracts document feature information from an input document. A document feature similarity calculation part 14 calculates the degree of similarity between the document feature information extracted by the document feature extraction part 12 and the document feature information registered in a document feature database 13, and selects the most similar document feature information from the document feature database 13. Each piece of document feature information stored in the document feature database 13 corresponds one-to-one to a translation environment stored in a translation environment database 31. In this way, the translation environments of past translations are stored, the past translation environment estimated to be optimal for the current input document is obtained from the features of that document, and the grammar dictionary database contents of that translation environment are used for translation.

Description

Detailed Description of the Invention

[0001]

[Field of the Invention] The present invention relates to a machine translation system that translates a source-language document into a target-language document, and is particularly suitable for a machine translation system used by multiple users in a network environment.

[0002]

[Description of the Related Art] In the natural-language sentences that a machine translation system translates, the words and expressions that appear are often restricted by the content and purpose of the document. It is therefore desirable to prepare a separate grammar and dictionary for each kind of document and to translate with them.

[0003] However, the content and purpose of the documents to be translated are diverse, and a machine translation system cannot provide grammars and dictionaries for every kind of document from the outset. For this reason, a machine translation system has been proposed that gives users a mechanism for tuning the internally held grammar and dictionary themselves. By tuning the grammar and dictionary on the basis of translation results for the target documents, the user can have documents with any characteristics translated (see JP-A-6-332946).

[0004]

[Problems to Be Solved by the Invention] With the conventional tuning mechanism, however, the user must tune the grammar and dictionary from scratch every time a document with new features, different from earlier documents, is translated.

[0005] For example, suppose a user wants to translate two documents, a business letter and a technical document. Target words and phrasing differ greatly between the two. In a business letter, "book" is generally best translated as 「予約する」 ("to reserve"), whereas in a technical document it is generally best translated as 「本」 ("book", the noun). Because target words and phrasing differ so greatly between business letters and technical documents, the user must create or tune a separate grammar dictionary for each.

[0006] However, tuning grammars and dictionaries is laborious, and from the standpoint of usability it is desirable for a machine translation system to produce translation results suited to the document without such work.

[0007]

[Means for Solving the Problems] To solve this problem, the first invention (the invention of claim 1) is a machine translation system comprising: one or more translation device units that translate an input document; a translation environment database storage unit having a translation environment database that stores one or more translation environments, each including at least the input document of a past translation and the name of the grammar dictionary used; and a plurality of grammar dictionary database storage units, each storing a different grammar dictionary as a database. Each translation device unit comprises the following means.

[0008] That is, each translation device unit comprises:
(1) document feature extraction means that extracts, from the current input document written in natural language, document feature information including at least appearance-frequency information for the words and idioms in that document;
(2) document feature storage means that stores the document feature information of the past input documents associated with each translation environment stored in the translation environment database;
(3) similarity determination and translation environment setting means that obtains the similarity between the document feature information extracted by the document feature extraction means and each piece of document feature information stored in the document feature storage means, retrieves from the translation environment database, on the basis of those similarities, the translation environments presumed to suit the current input document, presents them to the user, and has the user set one of them;
(4) translation processing means that reads, from a grammar dictionary database storage unit, the grammar dictionary database whose name is recorded in the set translation environment, and translates the current input document using its contents;
(5) translation environment extraction means that extracts the translation environment of the current translation process;
(6) translation environment update means that transfers the extracted translation environment to the translation environment database storage unit so that it is added to the translation environment database; and
(7) stored document feature update means that, when the translation environment database is updated, updates the contents of the document feature storage means accordingly.

[0009] In the first invention, the translation environments of past translations are stored. From the features of the current input document, the past translation environment presumed to be optimal for it is obtained, and the grammar dictionary database contents of that environment are used for translation. This reduces the need to tune the grammar dictionary database contents and yields translation results of good quality. In addition, because the current translation environment is extracted and stored, translation environments can be retrieved more accurately in subsequent translations.

[0010] In the machine translation system of the second invention (the invention of claim 2), each translation device unit of the first invention further comprises: (8) tuning means that tunes the grammar dictionary database contents transferred from one of the grammar dictionary database storage units according to instructions from the user and makes them available to the translation processing means; and (9) grammar dictionary database forming means that forms and stores a new grammar dictionary database from the tuned contents. The tuned grammar dictionary database contents are reflected in the extracted translation environment.

[0011] In the second invention, even if the grammar dictionary database obtained from a past translation environment is inadequate, good translation quality can be achieved through tuning. Moreover, because the tuned grammar dictionary database contents are stored as an independent database, the quality improvement from that tuning remains available in the future.

[0012]

[Embodiments of the Invention] An embodiment of the machine translation system according to the present invention is described in detail below with reference to the drawings.

[0013] FIG. 2 is a block diagram showing the overall configuration of the machine translation system of this embodiment. In FIG. 2, the system consists of a plurality of translation device units 200A, 200B, 200C, ..., a plurality of grammar dictionary database storage units 201a, 201b, ..., and one translation environment database storage unit 202, all connected via a network 203.

[0014] Each of the translation device units 200A, 200B, 200C, ... performs translation and can freely use, via the network 203, the grammar dictionary databases (described later) in the grammar dictionary database storage units 201a, 201b, .... Information on the translation environment under which each of these grammar dictionary databases was created is stored in the translation environment database (described later) in the translation environment database storage unit 202. By using this information, each translation device unit 200A, 200B, 200C, ... can select the grammar dictionary database best suited to the document to be translated and perform the translation with it.

[0015] In practice, each translation device unit 200 (200A, 200B, 200C, ...) is an information processing apparatus such as a workstation, minicomputer, or personal computer equipped with input devices such as a keyboard and mouse, output devices such as a CRT display, liquid crystal display, and printer, an auxiliary storage device such as a hard disk drive, and a communication control device such as a network control unit. Each grammar dictionary database storage unit 201 (201a, 201b, ...) and the translation environment database storage unit 202 are likewise, in practice, information processing apparatuses such as workstations, minicomputers, or personal computers equipped with an auxiliary storage device such as a hard disk drive and a communication control device such as a network control unit. The two kinds of database may be hosted on separate machines or on one machine shared by both.

[0016] In terms of the features of this embodiment, the main parts of the translation device unit 200, the grammar dictionary database storage unit 201, and the translation environment database storage unit 202, when divided into functional units, have the configuration shown in the functional block diagram of FIG. 1.

[0017] In FIG. 1, the translation device unit 200 consists of a user interface unit 1, through which the user inputs the document to be translated, sets the translation environment, and views the translation result; a document feature extraction and determination unit 2, which extracts the features of the input document and selects documents similar to it; a translation execution unit 3, which performs the translation; and a communication unit 4, which provides communication with the database storage units 201 and 202 on the network 203.

[0018] The grammar dictionary database storage unit 201 consists of a communication unit 6, which communicates with the translation device unit 200, and the grammar dictionary databases 32 created by the individual users on the network.

[0019] The translation environment database storage unit 202 consists of a communication unit 5, which communicates with the translation device unit 200, and a translation environment database 31.

[0020] When the grammar dictionary database storage unit 201 and the translation environment database storage unit 202 are implemented on the same information processing apparatus, the communication unit 5 and the communication unit 6 are the same hardware.

[0021] The user interface unit 1 consists of a document input unit 9, through which the user inputs the document to be translated; a translation environment setting unit 10, through which the user sets the translation environment; and a translation result display unit 11, which displays the translation result.

[0022] The document feature extraction and determination unit 2 consists of a document feature extraction unit 12, a document feature database 13, and a document feature similarity calculation unit 14. The document feature extraction unit 12 extracts document feature information from the input document. The document feature database 13 stores the feature information of documents translated in the past. The document feature similarity calculation unit 14 calculates the similarity between the document feature information extracted by the document feature extraction unit 12 and the document feature information registered in the document feature database 13, and selects the most similar document feature information from the document feature database 13.

[0023] Each piece of document feature information stored in the document feature database 13 corresponds one-to-one to a translation environment stored in the translation environment database 31, namely the one with the same identification number.

[0024] FIG. 3 illustrates an example of document feature information. The document feature information 302 in this example is basically, as shown in FIG. 3(B), the frequency distribution 3024 of words of a given part of speech (for example, nouns) that appear at least a given number of times (for example, twice) in an input English document (source-language document) 301 such as that shown in FIG. 3(A). In this example, bibliographic information entered along with the input document, such as the file name (filename) 3021, editor information (editor) 3022, and user information (user) 3023, also forms part of the document feature information.
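The feature record described in [0024] can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the dictionary layout, and the `is_target_pos` callback are hypothetical, and the callback stands in for the morphological analysis that the patent delegates to the document feature extraction unit 12.

```python
import re
from collections import Counter


def extract_document_features(text, filename, editor, user,
                              is_target_pos, min_count=2):
    """Build a feature record: bibliographic fields plus a word-frequency table.

    is_target_pos is a hypothetical predicate that says whether a word has
    the part of speech of interest (for example, noun). A real system would
    obtain this from morphological analysis.
    """
    # Crude tokenization into lowercase alphabetic words (an assumption).
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if is_target_pos(w))
    # Keep only words that appear at least min_count times.
    frequent = {w: n for w, n in counts.items() if n >= min_count}
    return {
        "filename": filename,
        "editor": editor,
        "user": user,
        "words": frequent,
    }
```

For instance, with a toy noun set `{"room", "hotel", "book"}` passed as `nouns.__contains__`, a text that mentions "room" and "hotel" twice and "book" once yields a `words` table containing only "room" and "hotel".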

[0025] To extract such document feature information, the document feature extraction unit 12 performs morphological analysis and similar processing. This analysis requires grammar and dictionary contents. These may, for example, be transferred from a general-purpose grammar dictionary database (32), or grammar dictionary contents may be held permanently inside the document feature extraction unit 12.

[0026] The translation execution unit 3 consists of a translation processing unit 15, which performs the translation; a grammar dictionary unit 16, which stores the transferred grammar dictionary contents used during translation; and a translation environment extraction unit 17, which extracts the translation environment in effect when the translation is executed, consisting of the input document, translation result, dictionary used, user, and so on (see FIG. 8, described later).

[0027] The communication unit 4 of the translation device unit 200 consists of communication elements 18 to 20, associated with the document feature extraction and determination unit 2, and communication elements 21 to 24, associated with the translation execution unit 3.

[0028] The communication elements associated with the document feature extraction and determination unit 2 are a translation environment update monitoring unit 18, which monitors whether the translation environment database 31 has been updated; a translation environment transmission request unit 19, which requests transmission of the translation environment information corresponding to the document feature information selected by the document feature similarity calculation unit 14; and a translation environment receiving unit 20, which receives the requested translation environment information.

[0029] The communication elements associated with the translation execution unit 3 are a grammar dictionary transmission request unit 21, which requests transmission of the contents of the grammar dictionary database set by the user; a grammar dictionary receiving unit 22, which receives the requested grammar dictionary database contents; a grammar dictionary transmitting unit 23, which transmits modified (tuned) grammar dictionary database contents; and a translation environment transmitting unit 24, which transmits the translation environment in effect during translation to the translation environment database storage unit 202.

[0030] Correspondingly, the communication unit 5 of the translation environment database storage unit 202 consists of a translation environment update notification unit 25, which notifies the translation environment update monitoring unit 18 that the translation environment database 31 has been updated; a translation environment transmission request receiving unit 26, which receives requests for transmission of a translation environment; a translation environment transmitting unit 27, which transmits the translation environment; and a translation environment receiving unit 28, which receives translation environments newly set by users.
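The update notification path (notification unit 25 informing monitoring unit 18, so that the document feature database 13 can follow the translation environment database 31) resembles a simple observer arrangement. The sketch below is a hedged illustration only: the class and method names are hypothetical, the network is replaced by direct calls, and the assumption that the notification carries the new entry's feature information is ours, not something the text specifies.

```python
class TranslationEnvironmentDatabase:
    """Stands in for database 31 plus update notification unit 25."""

    def __init__(self):
        self._environments = {}
        self._listeners = []

    def subscribe(self, listener):
        # A translation device registers its update monitor (unit 18).
        self._listeners.append(listener)

    def add(self, env_id, environment, features):
        # Store the new environment, then notify every subscribed device.
        self._environments[env_id] = environment
        for listener in self._listeners:
            listener(env_id, features)


class DocumentFeatureDatabase:
    """Stands in for database 13 held by one translation device."""

    def __init__(self):
        self.features_by_id = {}

    def on_environment_added(self, env_id, features):
        # Keep the same identification number as the translation environment.
        self.features_by_id[env_id] = features
```

Each translation device would call `subscribe(local_db.on_environment_added)` once, after which every addition to the shared database is mirrored into its local feature database under the same identification number.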

[0031] The communication unit 6 of the grammar dictionary database storage unit 201 consists of a grammar dictionary transmission/reception unit 29, which receives requests for transmission of grammar dictionary database contents, and a grammar dictionary transmitting unit 30, which transmits the grammar dictionary database contents.

[0032] The translation environment database storage unit 202 naturally stores the translation environment database 31, and the grammar dictionary database storage unit 201 naturally stores the grammar dictionary database 32.

[0033] Here, each translation environment corresponds to one of the grammar dictionary databases (32), though not necessarily one-to-one, and is also information that lets the user infer what kind of translation would result from using that grammar dictionary database (32). In other words, it consists of information from which the user can recognize which grammar dictionary database (32) and related settings are likely to give the best translation quality for the current input document. As shown in FIGS. 7 and 8 (described later), a translation environment consists mainly of information such as the grammar dictionary database used, keywords, and parallel translations from past translations.
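As a rough picture of what one translation environment record might hold, the following Python sketch uses a dataclass. The field names are assumptions chosen from the items this passage lists (grammar dictionary used, keywords, parallel translations, user); the actual layout of FIGS. 7 and 8 is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TranslationEnvironment:
    env_id: int                      # identification number shared with database 13
    grammar_dictionary: str          # name of the grammar dictionary database used
    keywords: List[str] = field(default_factory=list)
    # (source sentence, translated sentence) pairs from the past translation
    parallel_texts: List[Tuple[str, str]] = field(default_factory=list)
    user: str = ""


def environments_using(environments, dictionary_name):
    """Return every environment that refers to the given grammar dictionary.

    Several environments may name the same dictionary, which is why the
    correspondence is not necessarily one-to-one.
    """
    return [e for e in environments if e.grammar_dictionary == dictionary_name]
```

Showing the keywords and parallel translations alongside the dictionary name is what would let a user judge, before translating, whether that dictionary suits the current document.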

[0034] Next, the operation of the machine translation system of this embodiment, made up of the units described above, is explained with reference to the flowcharts of FIGS. 4 and 5. FIGS. 4 and 5 show the processing from the point at which a user inputs a document to a translation device unit 200, through that unit displaying a translation environment suited to the input document, to execution of the translation.

[0035] The following description covers not only the general operation but also a concrete example: translating the input document 301, a business letter, shown in FIG. 3(A).

[0036] When the user inputs the document to be translated (301) through the document input unit 9 of a translation device unit 200 (step 402), the document feature extraction unit 12 extracts document feature information (302) such as that shown in FIG. 3(B) from the input document (301) (step 403).

[0037] The document feature similarity calculation unit 14 then calculates the similarity between the extracted document feature information (302) and each piece of document feature information in the document feature database 13, attempts to retrieve from the document feature database 13 the identification number of the document feature information with the greatest similarity (step 404), and determines whether a similar previously translated document exists (step 405).

[0038] If no similar document feature information exists in the document feature database 13, a message indicating that there is no similar translation environment (for example, "There is no similar translation environment") is displayed (step 407).

[0039] As described above, document feature information centers on the frequency distribution 3024 of the words that appear in the document. The document feature information in the document feature database 13, which stores the information extracted in past translations, has the same format as the extracted document feature information shown in FIG. 3(B), as the example in FIG. 6 illustrates. However, as FIG. 6 shows, each piece of document feature information 601, 602, 603 stored in the document feature database 13 is given an identification number (No.) when it is stored, and the similarity determination retrieves this identification number.

[0040] Any method may be used to calculate the similarity between pieces of document feature information, as long as it expresses similarity. Here the following method is assumed.

[0041] If the document name (see reference numeral 3021) is the same in the extracted document feature information and in a piece of document feature information in the document feature database 13, the similarity is infinite. Otherwise, the similarity is the number of words that appear at least the given number of times (for example, twice) and are present in both pieces of document feature information. If the two pieces of document feature information share no such repeatedly appearing word, it is deemed that the document feature database 13 contains no document feature information similar to the extracted one. If several pieces of document feature information in the document feature database 13 share the maximum similarity, all of them are selected rather than just one.
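The similarity rule of [0041] can be written down compactly. The following Python sketch assumes each piece of document feature information is a dictionary with a `filename` string and a `words` table that already contains only the words meeting the appearance threshold; those names and that layout are illustrative assumptions, not the patent's data format.

```python
import math


def similarity(extracted, stored):
    """Similarity between two feature records under the rule in [0041]."""
    # Same document name: treat as infinitely similar.
    if extracted["filename"] == stored["filename"]:
        return math.inf
    # Otherwise count the frequent words present in both records.
    return len(set(extracted["words"]) & set(stored["words"]))


def most_similar_ids(extracted, database):
    """Return the identification numbers of all entries with maximum similarity.

    database maps identification number -> stored feature record.
    An empty list means no similar entry exists (no shared word anywhere).
    """
    scores = {doc_id: similarity(extracted, feat)
              for doc_id, feat in database.items()}
    best = max(scores.values(), default=0)
    if best == 0:
        return []
    # Ties are all kept, as the text requires.
    return sorted(doc_id for doc_id, s in scores.items() if s == best)
```

Returning a list rather than a single number keeps the tie rule explicit: the caller receives every entry that shares the maximum similarity, and an empty list signals the "no similar translation environment" case.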

[0042] For example, suppose the three pieces of document feature information 601 to 603 shown in FIG. 6 are stored in the document feature database 13 and the document feature information 302 shown in FIG. 3(B) is extracted. The similarity is then determined as follows. The document feature information 601 is judged to have a similarity of 1 to the document feature information 302 ("information" is shared), the document feature information 602 a similarity of 3 ("hotel", "room", and "accommodation" are shared), and the document feature information 603 a similarity of 0 (no shared word). As a result, the document feature information 602 is selected as the document feature information with the greatest similarity.
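The numbers in this example can be reproduced with plain set intersection. In the sketch below only the shared words ("information", "hotel", "room", "accommodation") come from the text; the remaining words in each stored set are hypothetical fillers, since the full contents of FIG. 6 are not given here.

```python
# Frequent words of the extracted feature information 302 (only the words
# named in the text are included).
extracted = {"hotel", "room", "accommodation", "information"}

# Stored entries keyed by identification number. Words other than the
# shared ones are made up for illustration.
stored = {
    1: {"information", "system", "data"},                   # 601
    2: {"hotel", "room", "accommodation", "reservation"},   # 602
    3: {"circuit", "signal"},                               # 603
}

# Similarity = number of frequent words the two records have in common.
scores = {no: len(extracted & words) for no, words in stored.items()}
best_id = max(scores, key=scores.get)

print(scores)   # {1: 1, 2: 3, 3: 0}
print(best_id)  # 2
```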

[0043] The identification number of the document feature information selected in this way is sent from the translation environment transmission request unit 19, via the network 203, to the translation environment transmission request receiving unit 26 of the translation environment database storage unit 202. On receiving it, the translation environment transmission request receiving unit 26 retrieves the translation environment information with that identification number from the translation environment database 31, and the translation environment transmitting unit 27 sends it, via the network 203, to the translation environment receiving unit 20 of the translation device unit 200 that made the request (step 406). The translation environment information received by the translation environment receiving unit 20 is presented to the user through the translation environment setting unit 10 (step 408).

【００４４】例えば、文書名が“letter.doc”の文書特
徴情報６０２の識別番号（２）が翻訳環境送信依頼とし
て送信されると、翻訳環境データベース３１からその識
別番号２を持つ翻訳環境情報が取り出されて返送され、
図７に示すようなその識別番号２に係る翻訳環境７０１
が使用者に提示される。For example, when the identification number (2) of the document feature information 602, whose document name is “letter.doc”, is transmitted as a translation environment transmission request, the translation environment information having identification number 2 is retrieved from the translation environment database 31 and returned, and the translation environment 701 for identification number 2, as shown in FIG. 7, is presented to the user.

【００４５】この実施形態の場合、使用者は、表示され
た翻訳環境をそのまま用いて入力文書を翻訳できるよう
になされており、また、その翻訳環境を修正したした後
に入力文書を翻訳できるようになされている。In this embodiment, the user can translate the input document using the displayed translation environment as it is, or can translate the input document after modifying that translation environment.

【００４６】類似する文書特徴情報がない旨や類似する
文書特徴情報に係る翻訳環境を使用者に提示した後は、
使用者が提示された翻訳環境の設定（又は修正）を指示
したりしたか否かを判定する（ステップ４０９）。翻訳
環境の設定（又は修正）を使用者が指示した場合には、
さらに、使用者が他の翻訳環境の参照を望んだか否かを
判定する（ステップ４１０）。使用者が他の翻訳環境の
参照を望んだ場合には、ステップ４０６以降のその翻訳
環境の取出し、提示処理に戻る。After the user has been informed that there is no similar document feature information, or has been presented with the translation environment for the similar document feature information, it is determined whether the user has instructed that the presented translation environment be set (or modified) (step 409). If the user has so instructed, it is further determined whether the user wishes to refer to another translation environment (step 410). If so, the process returns to the retrieval and presentation of that translation environment from step 406 onward.

【００４７】ここで、他の翻訳環境は、使用者が識別番
号を入力して指示する選択形態で指示したものであって
も良く、また、表示されている翻訳環境の次の類似度
（同一類似度を含む）に基づいて自動的に選択したもの
であっても良く、その選択方法は任意であって良い。い
ずれかの翻訳環境が提示されている状況で使用者が翻訳
環境の確定操作を行なうと、翻訳環境設定部１０は、そ
の提示されている翻訳環境の設定確立動作を行なう（ス
テップ４１１）。Here, the other translation environment may be one that the user specifies by entering an identification number, or one that is selected automatically as the next in similarity (including equal similarity) after the translation environment currently displayed; any selection method may be used. When the user performs a confirming operation while one of the translation environments is presented, the translation environment setting unit 10 establishes the presented translation environment as the setting (step 411).
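One way to picture the automatic "next in similarity" selection is to rank all stored entries once and step through the ranking. The sketch below assumes the simple overlap-count similarity and breaks ties by identification number; the tie-breaking rule, the function names, and the sample data are assumptions made for illustration, since the text leaves the selection method open.

```python
from typing import Optional


def rank_by_similarity(query: set[str],
                       database: dict[int, set[str]]) -> list[int]:
    """Identification numbers ordered by descending overlap with the query.

    Entries with no shared word are dropped; equal scores are ordered by
    identification number (an assumed tie-breaking rule).
    """
    scored = [(len(query & words), ident) for ident, words in database.items()]
    matching = [(score, ident) for score, ident in scored if score > 0]
    matching.sort(key=lambda pair: (-pair[0], pair[1]))
    return [ident for _, ident in matching]


def next_candidate(ranking: list[int], current: int) -> Optional[int]:
    """Entry that follows `current` in the ranking, or None at the end."""
    position = ranking.index(current)
    if position + 1 < len(ranking):
        return ranking[position + 1]
    return None


# Hypothetical stored feature sets keyed by identification number.
sample_db = {1: {"a", "b"}, 2: {"a", "b", "c"}, 3: {"z"}, 4: {"a", "c"}}
ranking = rank_by_similarity({"a", "b", "c"}, sample_db)
print(ranking, next_candidate(ranking, 2))
```

A user who rejects the first presented environment would then be shown the next one in the ranking, including any entry with the same similarity.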

【００４８】以上のような翻訳環境の設定処理が終了す
ると、又は、翻訳環境の設定を使用者が希望しないと、
図５に示したステップ４１２以降の翻訳処理に移行す
る。When the translation environment setting process described above is completed, or when the user does not wish to set a translation environment, the process proceeds to the translation processing from step 412 onward shown in FIG. 5.

【００４９】翻訳環境の設定処理が終了すると、又は、
翻訳環境の設定を使用者が希望しないと、その翻訳環境
内の文法辞書名（なお、翻訳環境が設定されていない場
合には予め定められている文法辞書名）が文法辞書送信
依頼部２１に渡され、文法辞書送信依頼部２１はネット
ワーク内の同じ名前を持つ文法辞書送信依頼受信部２８
宛に文法辞書データベース３２の内容の送信を依頼し、
これを受けて、その名前を持つ文法辞書送信部２９は文
法辞書データベース３２を依頼した文法辞書受信部２２
宛に返信し、文法辞書受信部２２は翻訳実行部３内の文
法辞書部１６に受信した文法辞書データベースの内容を
格納する（ステップ４１２）。When the translation environment setting process is completed, or when the user does not wish to set a translation environment, the grammar dictionary name in that translation environment (or, if no translation environment has been set, a predetermined grammar dictionary name) is passed to the grammar dictionary transmission request unit 21. The grammar dictionary transmission request unit 21 asks the grammar dictionary transmission request receiving unit 28 that has the same name in the network to send the contents of the grammar dictionary database 32. In response, the grammar dictionary transmitting unit 29 having that name returns the grammar dictionary database 32 to the grammar dictionary receiving unit 22 that made the request, and the grammar dictionary receiving unit 22 stores the received contents of the grammar dictionary database in the grammar dictionary unit 16 in the translation execution unit 3 (step 412).
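The name-based dictionary lookup of step 412, including the fallback to a predetermined name, might be sketched as follows. The networked storage units are replaced here by an in-memory mapping, and the default name, the `dic` key, and the function names are all hypothetical.

```python
from typing import Optional

# Hypothetical predetermined name used when no environment is set.
DEFAULT_DICTIONARY_NAME = "default"


def dictionary_name_for(environment: Optional[dict]) -> str:
    """Pick the grammar dictionary name from the environment, if any."""
    if environment is None:
        return DEFAULT_DICTIONARY_NAME
    return environment["dic"]


def load_dictionary(name: str,
                    storage_units: dict[str, dict[str, str]]) -> dict[str, str]:
    """Copy the named dictionary into local storage (stand-in for unit 16).

    `storage_units` stands in for the networked storage units, each of
    which is addressed by its dictionary name.
    """
    return dict(storage_units[name])


# Hypothetical contents of two storage units.
storage_units = {
    "default": {"book": "本"},
    "letter.doc": {"book": "予約する"},
}

local = load_dictionary(dictionary_name_for({"dic": "letter.doc"}), storage_units)
fallback = load_dictionary(dictionary_name_for(None), storage_units)
print(local, fallback)
```

Copying the contents keeps later local tuning from altering the stored dictionary, which matches the idea that a tuned dictionary is saved separately.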

【００５０】そして、格納された文法辞書データベース
内容を利用して、文書入力部９から入力された文書の翻
訳処理が翻訳処理部１５で実行され（ステップ４１
３）、その翻訳結果が翻訳結果表示部１１に表示される
（ステップ４１４）。Then, using the stored contents of the grammar dictionary database, the translation processing unit 15 translates the document input from the document input unit 9 (step 413), and the translation result is displayed on the translation result display unit 11 (step 414).

【００５１】その後、表示された翻訳結果に対応して、
使用者が文法辞書部１６内の文法辞書データベース内容
の修正（例えば単語登録や既存単語の訳語修正等）や翻
訳環境の変更を望んだか否かを判定する（ステップ４１
５、４１７）。Then, with the translation result displayed, it is determined whether the user wishes to correct the contents of the grammar dictionary database in the grammar dictionary unit 16 (for example, by registering a word or correcting the translation of an existing word) or to change the translation environment (steps 415 and 417).

【００５２】使用者が文法辞書部１６内の文法辞書デー
タベース内容の修正を望んだ場合には、その指示した修
正内容を取り込んで修正させた後（ステップ４１６）、
ステップ４１３に戻って翻訳をし直す。一方、使用者が
翻訳環境の変更を望んだ場合には、上述したステップ４
０９に戻って翻訳環境を再設定（修正）させる。If the user wishes to correct the contents of the grammar dictionary database in the grammar dictionary unit 16, the specified corrections are taken in and applied (step 416), and the process returns to step 413 to translate again. If, on the other hand, the user wishes to change the translation environment, the process returns to step 409 described above so that the translation environment can be set again (modified).

【００５３】使用者がその時点で表示されている翻訳結
果を満足し、その後に、文法辞書部１６内の文法辞書デ
ータベース内容の修正や翻訳環境の変更を指示しない場
合には、その翻訳処理が実行された状態の翻訳環境を翻
訳環境抽出部１７が抽出する（ステップ４１８）。If the user is satisfied with the translation result displayed at that point and gives no further instruction to correct the contents of the grammar dictionary database in the grammar dictionary unit 16 or to change the translation environment, the translation environment extraction unit 17 extracts the translation environment under which that translation was executed (step 418).

【００５４】図８は、抽出された翻訳環境８０１の提示
例を示すものである。翻訳環境８０１は、例えば、その
識別番号（Ｎｏ．）、文書名（filename）、編集者（ed
itor）、使用者（user）、翻訳実行日（date）、使用文
法辞書名（dic.）、キーワード（keyword ）、入力文書
（原文書：text）及び翻訳結果（transtext ）からな
る。ここで、キーワードについては、文書特徴として抽
出された単語を自動的にしても良いが、これら単語を使
用者に提示して修正させるようにしても良い。図８は、
使用者が修正したキーワードを示している。FIG. 8 shows an example of how the extracted translation environment 801 is presented. The translation environment 801 consists of, for example, its identification number (No.), document name (filename), editor (editor), user (user), translation execution date (date), name of the grammar dictionary used (dic.), keywords (keyword), input document (original document: text), and translation result (transtext). The words extracted as document features may be used automatically as the keywords, or those words may be presented to the user for correction. FIG. 8 shows keywords that the user has corrected.
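The fields listed for the translation environment map naturally onto a record type. The sketch below uses the field labels given in the text (No., filename, editor, user, date, dic., keyword, text, transtext); the Python types and every sample value are assumptions, since the contents of FIG. 8 are not reproduced here.

```python
from dataclasses import dataclass, asdict


@dataclass
class TranslationEnvironment:
    """One stored translation environment; types are assumed."""
    no: int              # identification number (No.)
    filename: str        # document name
    editor: str          # editor
    user: str            # user
    date: str            # translation execution date
    dic: str             # name of the grammar dictionary used (dic.)
    keyword: list[str]   # keywords, possibly corrected by the user
    text: str            # input (original) document
    transtext: str       # translation result


# Entirely hypothetical sample values.
example_env = TranslationEnvironment(
    no=4,
    filename="reserve.doc",
    editor="editor-a",
    user="user-b",
    date="1995-09-28",
    dic="letter.doc",
    keyword=["hotel", "room", "accommodation"],
    text="I would like to book a room.",
    transtext="部屋を予約したいのですが。",
)

print(list(asdict(example_env)))
```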

【００５５】その後、使用者が良好な翻訳結果を得るた
めに、文法辞書部１６のデータベース内容が使用者によ
って変更（チューニング）されたものか否かを翻訳処理
部１５は判別する（ステップ４１９）。文法辞書部１６
の内容が変更されている場合には、文法辞書送信部２３
によって、ネットワーク内に新しい文法辞書データベー
ス３２（文法辞書データベース格納部２０１）を作成す
る（ステップ４２０）。なお、この際に、抽出された翻
訳環境における使用文法辞書名（既存の文法辞書名）
を、抽出された翻訳環境における文書名に等しくするよ
うな変更を自動的に行なうようにし、既存の文法辞書名
と異なるようにしても良い。また、新たな文法辞書名を
使用者から取込んで、抽出された翻訳環境における文法
辞書名を変更するようにしても良い。After that, the translation processing unit 15 determines whether the database contents of the grammar dictionary unit 16 were changed (tuned) by the user in order to obtain a good translation result (step 419). If the contents of the grammar dictionary unit 16 have been changed, the grammar dictionary transmitting unit 23 creates a new grammar dictionary database 32 (grammar dictionary database storage unit 201) in the network (step 420). At this time, the name of the grammar dictionary used in the extracted translation environment (the existing grammar dictionary name) may be changed automatically to match the document name in the extracted translation environment, so that it differs from the existing grammar dictionary name. Alternatively, a new grammar dictionary name may be obtained from the user and the grammar dictionary name in the extracted translation environment changed accordingly.
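Steps 419 and 420 amount to "save a copy under a new name only if the dictionary was tuned". A minimal sketch, assuming the automatic renaming option in which the new dictionary takes the document name, is given below. Detecting a change by comparing contents, the dictionary-of-dictionaries storage, and the key names are assumptions; a real system would also have to handle a name that is already taken.

```python
def save_tuned_dictionary(environment: dict,
                          local_dictionary: dict[str, str],
                          storage_units: dict[str, dict[str, str]]) -> dict:
    """Store a tuned dictionary as a new database and update the environment.

    If the local contents equal the stored original, nothing is created
    and the environment is returned unchanged (step 419: not tuned).
    Otherwise a new database named after the document is created
    (step 420) and the environment's dictionary name points to it.
    """
    original = storage_units[environment["dic"]]
    if local_dictionary == original:
        return environment
    new_name = environment["filename"]  # assumed renaming rule
    storage_units[new_name] = dict(local_dictionary)
    return {**environment, "dic": new_name}


# Hypothetical data: the user registered one extra word.
units = {"letter.doc": {"book": "予約する"}}
env = {"filename": "reserve.doc", "dic": "letter.doc"}
tuned = {"book": "予約する", "suite": "スイートルーム"}

updated_env = save_tuned_dictionary(env, tuned, units)
print(updated_env["dic"], sorted(units))
```

The original dictionary is left untouched, so environments that already refer to it keep working.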

【００５６】文法辞書データベース（３２）の新規作成
の有無を問わず、続いて、翻訳環境送信部２３が、翻訳
環境抽出部１７によって抽出された翻訳環境を、翻訳環
境データベース格納部２０２の翻訳環境受信部２７宛に
送信し、翻訳環境受信部２７は、翻訳環境データベース
３１を更新する（ステップ４２１）。Regardless of whether a new grammar dictionary database (32) has been created, the translation environment transmitting unit 23 then sends the translation environment extracted by the translation environment extraction unit 17 to the translation environment receiving unit 27 of the translation environment database storage unit 202, and the translation environment receiving unit 27 updates the translation environment database 31 (step 421).

【００５７】かかる更新を終了すると、翻訳環境更新通
知部２５が全ての翻訳装置部２００（２００Ａ、２００
Ｂ、…）の翻訳環境更新監視部１８へ翻訳環境データベ
ース３１が更新されたことを通知し、各翻訳装置部２０
０（２００Ａ、２００Ｂ、…）の翻訳環境更新監視部１
８は、文書特徴データベース１３の内容を更新させて一
連の処理を終了させる（ステップ４２２）。When this update is completed, the translation environment update notification unit 25 notifies the translation environment update monitoring units 18 of all translation device units 200 (200A, 200B, ...) that the translation environment database 31 has been updated, and the translation environment update monitoring unit 18 of each translation device unit 200 (200A, 200B, ...) updates the contents of the document feature database 13 and ends the series of processes (step 422).

【００５８】翻訳処理を実行していた翻訳装置部２００
の翻訳環境更新監視部１８は、このステップ４２２の処
理では、抽出された文書特徴情報に、更新（新規追加）
が通知された翻訳環境に付与された識別番号を付与し
て、自装置内の文書特徴データベース１３を更新する。
また、翻訳処理を実行していない翻訳装置部２００の翻
訳環境更新監視部１８は、このステップ４２２の処理で
は、翻訳処理を実行していた翻訳装置部２００の翻訳環
境更新監視部１８から抽出された文書特徴情報を転送さ
せた後、自装置内の文書特徴データベース１３を更新す
る。In the processing of step 422, the translation environment update monitoring unit 18 of the translation device unit 200 that executed the translation assigns to the extracted document feature information the identification number given to the translation environment whose update (new addition) was notified, and updates the document feature database 13 in its own device. The translation environment update monitoring unit 18 of a translation device unit 200 that did not execute the translation has the extracted document feature information transferred from the translation environment update monitoring unit 18 of the translation device unit 200 that did, and then updates the document feature database 13 in its own device.

【００５９】なお、文書特徴データベース１３の内容更
新を、入力文書が入力された翻訳装置部だけ（すなわ
ち、文書特徴情報を抽出した翻訳装置部だけ）で実行さ
せるようにしても良い。The contents of the document feature database 13 may be updated only by the translation device unit to which the input document is input (that is, only the translation device unit from which the document feature information is extracted).

【００６０】例えば、図３（Ａ）に示す文書３０１が入
力されたときに、各翻訳装置部２００内の文書特徴デー
タベース１３に図６に示す３個の文書特徴情報６０１〜
６０３が格納されているととする。そして、類似度判定
によって図７に示す翻訳環境７０１が使用者に提示さ
れ、この翻訳環境７０１及びこの翻訳環境７０１が規定
する文法辞書データベース３２（letter.doc）をそのま
ま使って、使用者が満足のいく翻訳結果が得られたとす
る。この場合、翻訳環境抽出部１７によって図８に示す
翻訳環境８０１が抽出され、翻訳環境送信部２４及び翻
訳環境受信部２８間の通信によって、翻訳環境データベ
ース３１中にこの翻訳環境が識別番号（文書特徴番号の
識別番号ともなる）４の翻訳環境として追加更新され
る。その後、翻訳環境更新通知部２５が各翻訳環境更新
監視部１８に翻訳環境データベース３１中に識別番号４
の翻訳環境が追加更新されたことを通知し、これを受け
て、各文書特徴データベース１３に、文書特徴抽出部１
２によって抽出された文書特徴情報３０２が、識別番号
４が付与された文書特徴情報として格納されて処理が終
了する。For example, suppose that when the document 301 shown in FIG. 3A is input, the three pieces of document feature information 601 to 603 shown in FIG. 6 are stored in the document feature database 13 in each translation device unit 200. Suppose further that the similarity determination causes the translation environment 701 shown in FIG. 7 to be presented to the user, and that a translation result satisfactory to the user is obtained by using this translation environment 701 and the grammar dictionary database 32 (letter.doc) that it specifies, both unchanged. In this case, the translation environment extraction unit 17 extracts the translation environment 801 shown in FIG. 8, and through communication between the translation environment transmitting unit 24 and the translation environment receiving unit 28, this translation environment is added to the translation environment database 31 as the translation environment with identification number 4 (which also serves as the document feature identification number). The translation environment update notification unit 25 then notifies each translation environment update monitoring unit 18 that the translation environment with identification number 4 has been added to the translation environment database 31. In response, the document feature information 302 extracted by the document feature extraction unit 12 is stored in each document feature database 13 as document feature information with identification number 4, and the processing ends.
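The update-and-notify sequence of steps 421 and 422 resembles a simple observer arrangement. The sketch below is a simplification under stated assumptions: the extracted features travel with the notification itself (in the text, devices that did not perform the translation obtain them from the device that did), the next identification number is taken as the current maximum plus one, and all class and method names are hypothetical.

```python
class EnvironmentDatabase:
    """Stand-in for the translation environment database and its notifier."""

    def __init__(self, environments: dict[int, dict]):
        self.environments = dict(environments)
        self.monitors = []  # callbacks standing in for the monitoring units

    def add(self, environment: dict, features: set[str]) -> int:
        """Store a new environment and notify every registered monitor."""
        new_id = max(self.environments, default=0) + 1  # assumed numbering
        self.environments[new_id] = environment
        for monitor in self.monitors:
            monitor(new_id, features)
        return new_id


class TranslationDevice:
    """Stand-in for one translation device with its local feature database."""

    def __init__(self, feature_db: dict[int, set[str]]):
        self.feature_db = dict(feature_db)

    def on_environment_added(self, new_id: int, features: set[str]) -> None:
        """Register the features under the newly assigned number."""
        self.feature_db[new_id] = set(features)


# Hypothetical initial state: three environments already stored.
existing = {1: {"dic": "a"}, 2: {"dic": "letter.doc"}, 3: {"dic": "c"}}
features = {1: {"information"}, 2: {"hotel", "room"}, 3: {"patent"}}
device_a = TranslationDevice(features)
device_b = TranslationDevice(features)

env_db = EnvironmentDatabase(existing)
env_db.monitors = [device_a.on_environment_added, device_b.on_environment_added]

new_id = env_db.add({"dic": "letter.doc"}, {"hotel", "room", "accommodation"})
print(new_id, sorted(device_a.feature_db), sorted(device_b.feature_db))
```

With three environments already stored, the new one receives number 4 and both devices end up with a matching feature entry, as in the example in the text.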

【００６１】上記実施形態によれば、過去の翻訳時にお
ける翻訳環境を格納しておき、今回の入力文書の特徴か
ら、今回の入力文書に最適と推測できる過去の翻訳環境
を得て、その際の翻訳環境での文法辞書データベース内
容を翻訳に利用することにより、文法辞書データベース
内容のチューニングの必要性を小さくできると共に、良
好な品質の翻訳結果が得られる。また、今回の翻訳環境
を抽出して格納するようにしたので、次回以降の翻訳時
における翻訳環境の抽出精度を高められるようになる。According to the above embodiment, the translation environment at the time of past translation is stored, and the past translation environment which can be estimated to be the optimum for the current input document is obtained from the characteristics of the current input document. By utilizing the contents of the grammar dictionary database in the translation environment for translation, it is possible to reduce the need for tuning the contents of the grammar dictionary database and obtain good quality translation results. Further, since the translation environment of this time is extracted and stored, the accuracy of extraction of the translation environment at the time of the next and subsequent translations can be improved.

【００６２】また、上記実施形態によれば、過去にシス
テム内で変更（チューニング）された文法辞書を独立の
文法辞書として格納すると共に、選択された翻訳環境に
基づいて、最適な文法辞書を翻訳処理に効率良く利用す
るようにしたので、使用者の翻訳作業を軽減でき、かつ
高品質の翻訳結果を得ることができる。Further, according to the above-described embodiment, the grammar dictionary modified (tuned) in the system in the past is stored as an independent grammar dictionary, and the optimum grammar dictionary is translated based on the selected translation environment. Since it is efficiently used for processing, the translation work of the user can be reduced and a high-quality translation result can be obtained.

【００６３】すなわち、使用者が良質の翻訳結果を得る
ためにチューニングした文法辞書をシステム内の文法辞
書データベースとして格納し、それと同時に、それぞれ
の文法辞書データベースがどのような翻訳環境のもとで
（どのような入力文書を用いて）作成されたかという情
報を翻訳環境としてデータベースに格納しておくように
したので、翻訳したい文書又はそれと類似した文書が過
去に誰かによって翻訳されていれば、その文法辞書を直
接又は修正して利用することができ、使用者は、最初か
ら文法辞書をチューニングする必要はなく、簡単に入力
文書に合った文法辞書を作成でき、良好な翻訳品質を達
成することができる。That is, a grammar dictionary that a user has tuned to obtain a good translation result is stored as a grammar dictionary database in the system, and at the same time, information on the translation environment in which each grammar dictionary database was created (that is, which input document was used) is stored in a database as a translation environment. Therefore, if the document to be translated, or a similar document, has been translated by someone in the past, that grammar dictionary can be used as is or after modification. The user does not need to tune a grammar dictionary from scratch, can easily create a grammar dictionary suited to the input document, and can achieve good translation quality.

【００６４】例えば、翻訳対象となる入力文書（３０
１）がビジネスレターに関する文書であって、使用者が
ビジネスレター文書を翻訳した経験がなくても、この実
施形態の機械翻訳システムによれば、過去に翻訳したビ
ジネスレター文書と類似していることを判断し、ビジネ
スレター文書に適した文法辞書データベースを使用して
翻訳し、その結果、良好な翻訳結果が得られる。ビジネ
スレターに関する入力文書に“book”が含まれていれ
ば、ビジネスレター文書に適した「予約する」という訳
語を得ることができ、技術文書に“book”が含まれてい
れば、技術文書に適した「本」という訳語を得ることが
できる。For example, even if the input document (301) to be translated is a business letter and the user has no experience translating business letters, the machine translation system of this embodiment determines that the document resembles a business letter translated in the past and translates it using a grammar dictionary database suited to business letters, so that a good translation result is obtained. If an input document that is a business letter contains “book”, the translation 「予約する」 (“to reserve”), which suits business letters, is obtained; if a technical document contains “book”, the translation 「本」 (the noun “book”), which suits technical documents, is obtained.
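The "book" example comes down to the same source word mapping to different target words depending on which grammar dictionary is loaded: 「予約する」 ("to reserve") for business letters and 「本」 (the noun "book") for technical documents. A toy sketch, with hypothetical dictionary names and a single-word lookup standing in for real translation:

```python
# Hypothetical dictionary names; each maps source words to target words.
DICTIONARIES = {
    "business_letter": {"book": "予約する"},  # "to reserve"
    "technical": {"book": "本"},              # the noun "book"
}


def translate_word(word: str, dictionary_name: str) -> str:
    """Look a word up in the named dictionary; unknown words pass through."""
    return DICTIONARIES[dictionary_name].get(word, word)


print(translate_word("book", "business_letter"))
print(translate_word("book", "technical"))
```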

【００６５】さらに、上記実施形態によれば、同一の文
法辞書データベースを異なる翻訳環境に対応付けること
もできるので、今回の入力文書に対して最適な文法辞書
データベースがいずれであるかをより適切に判断するこ
とができる。例えば、一般的なビジネスレター文書の翻
訳環境と、ビジネスレター文書の一種であり、それより
用途が限定されている予約用のビジネスレター文書の翻
訳環境とに同一の文法辞書データベースを対応付けら
れ、今回の入力文書が予約用のビジネスレター文書であ
れば、後者の翻訳環境が提示されるので、適切に環境選
択を行なうことができる。Further, according to the above embodiment, the same grammar dictionary database can be associated with different translation environments, so that it can be determined more appropriately which grammar dictionary database is best for the current input document. For example, suppose the same grammar dictionary database is associated both with the translation environment of a general business letter and with the translation environment of a reservation business letter, which is a kind of business letter with a more limited use. If the current input document is a reservation business letter, the latter translation environment is presented, so the environment can be selected appropriately.

【００６６】上記実施形態の説明の途中においても、他
の実施形態も説明したが、さらに、上記実施形態を以下
のように変形した他の実施形態も本発明を構成するもの
である。While other embodiments have been described in the middle of the description of the above embodiment, other embodiments obtained by modifying the above embodiment as follows also constitute the present invention.

【００６７】(1) 上記実施形態においては、文書特徴情
報が、主として、文書中に所定回数以上出現した単語の
組情報であるものを示したが、これ以外の情報であって
も良い。例えば、入力文書の長さを反映させるため、文
書の単語総数で出現回数を割った出現率が所定の出現率
以上の単語の組情報を、文書特徴情報及び辞書特徴情報
とするようにしても良い。また、単語だけでなく、イデ
ィオムをも特徴を構成する要素とするようにしても良
い。(1) In the above embodiment, the document characteristic information is mainly set information of words that appear a predetermined number of times or more in the document, but other information may be used. For example, in order to reflect the length of the input document, the group information of words whose appearance rate obtained by dividing the number of appearances by the total number of words in the document is a predetermined appearance rate or more may be used as the document characteristic information and the dictionary characteristic information. good. Further, not only words but also idioms may be used as the constituent elements of the feature.
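Variation (1), selecting words by appearance rate rather than raw count, could look like the sketch below. Whitespace tokenization, lowercasing, and the threshold value are assumptions; idiom handling is omitted.

```python
from collections import Counter


def extract_features(text: str, min_rate: float) -> set[str]:
    """Words whose appearance rate (count / total words) is >= min_rate."""
    words = text.lower().split()  # naive tokenization, for illustration only
    if not words:
        return set()
    counts = Counter(words)
    total = len(words)
    return {word for word, count in counts.items() if count / total >= min_rate}


sample = "room room hotel a"
print(extract_features(sample, 0.5))
```

Dividing by the total word count keeps long documents from accumulating feature words merely because they are long.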

【００６８】(2) 同様に、文書特徴情報間の類似度も、
双方に属する単語数に限定されるものではない。例え
ば、文書作成者の一致不一致を値に換算して類似度の値
に含めるようにしても良い。また、出現回数や出現率が
大きい単語（重要語）については、類似度への加算値を
大きくするようにしても良い。(2) Similarly, the degree of similarity between pieces of document feature information is not limited to the number of words that belong to both. For example, whether the document authors match may be converted into a value and included in the similarity value. In addition, for words with a large number of appearances or a high appearance rate (important words), the value added to the similarity may be made larger.
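Variation (2) can be pictured as a weighted overlap with an extra term for a matching author. The default weight of 1.0, the bonus value, and the function name are assumptions; the text specifies only that such factors may be included.

```python
def weighted_similarity(query_words: set[str],
                        stored_words: set[str],
                        weights: dict[str, float],
                        same_author: bool,
                        author_bonus: float = 1.0) -> float:
    """Sum per-word weights over shared words, plus a bonus for same author.

    Words missing from `weights` count as 1.0, which reduces to the plain
    overlap count when no weights are given and authors differ.
    """
    score = sum(weights.get(word, 1.0) for word in query_words & stored_words)
    if same_author:
        score += author_bonus
    return score


# Hypothetical case: "hotel" is an important word with weight 2.0.
score = weighted_similarity(
    {"hotel", "room"},
    {"hotel", "room", "reservation"},
    weights={"hotel": 2.0},
    same_author=True,
)
print(score)
```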

【００６９】(3) また、上記実施形態においては、英日
機械翻訳システムに本発明を適用したものを示したが、
原言語又は目的言語がこれ以外の機械翻訳システムの本
発明を適用できることは勿論である。また、上記実施形
態は、ネットワークを介して複数の翻訳装置部が接続さ
れたシステムを示したが、１個の機械翻訳装置が中心と
なったシステムにも本発明を適用できることは勿論であ
る。(3) In the above embodiment, the present invention is applied to an English-Japanese machine translation system, but it can of course be applied to machine translation systems with other source or target languages. Further, although the above embodiment shows a system in which a plurality of translation device units are connected via a network, the present invention can of course also be applied to a system centered on a single machine translation device.

【００７０】(4) さらに、上記実施形態においては、翻
訳方向が１方向の機械翻訳システムに本発明を適用した
ものを示したが、翻訳方向が２方向以上の機械翻訳シス
テムに本発明を適用することができる。(4) Further, in the above embodiment, the present invention is applied to a machine translation system having a translation direction of one direction, but the present invention is applied to a machine translation system having translation directions of two or more directions. can do.

【００７１】(5) さらにまた、上記実施形態において
は、文書特徴情報を文書特徴データベース１３に格納し
ておくものを示したが、それぞれ対応する文法辞書デー
タベースに特徴情報の格納エリアを設けて格納しておく
ようにしても良い。また、翻訳環境データベースに対応
する文書特徴情報を格納するようにしても良い。(5) Furthermore, in the above embodiment, the document feature information is stored in the document feature database 13, but a storage area for feature information may instead be provided in each corresponding grammar dictionary database and the information stored there. Alternatively, the corresponding document feature information may be stored in the translation environment database.

【００７２】(6) また、上記実施形態においては、翻訳
処理毎に翻訳環境を抽出して格納するものを示したが、
今回の入力文書に係る翻訳環境を格納するか否かを使用
者に判断させるものであっても良い。(6) In the above embodiment, a translation environment is extracted and stored for every translation process, but the user may instead be allowed to decide whether to store the translation environment for the current input document.

【００７３】(7) さらに、翻訳環境の情報は、上記実施
形態のように入力文書及び翻訳結果の対訳と、使用文法
辞書名とを中心としたものに限定されない。例えば、翻
訳結果を含めず、入力文書と使用文法辞書名とを中心と
した情報であっても良い。(7) Furthermore, the information of the translation environment is not limited to the information centered on the parallel translation of the input document and the translation result and the name of the used grammar dictionary as in the above embodiment. For example, information centered on the input document and the name of the grammar dictionary used may be used without including the translation result.

【００７４】(8) 翻訳環境データベース格納部及び又は
文法辞書データベース格納部は、いずれかの翻訳装置部
と同じ情報処理装置上に構築されたものであっても良い
ことは勿論である。(8) Of course, the translation environment database storage unit and / or the grammar dictionary database storage unit may be constructed on the same information processing device as any of the translation device units.

【００７５】[0075]

【発明の効果】以上のように、第１の本発明の機械翻訳
システムによれば、過去の翻訳時の翻訳環境を格納して
おき、今回の入力文書の特徴から、今回の入力文書に最
適と推測できる過去の翻訳環境を得て、その際の翻訳環
境での文法辞書データベース内容を翻訳に利用すること
により、文法辞書データベース内容のチューニングの必
要性を小さくできると共に、良好な品質の翻訳結果が得
られると共に、また、今回の翻訳環境を抽出して格納す
るようにしたので、次回以降の翻訳時における翻訳環境
の抽出精度を高めることができる。As described above, according to the machine translation system of the first aspect of the present invention, translation environments from past translations are stored, a past translation environment that can be presumed to be the best for the current input document is obtained from the features of the current input document, and the contents of the grammar dictionary database in that translation environment are used for translation. This reduces the need to tune the contents of the grammar dictionary database and yields translation results of good quality. In addition, because the current translation environment is extracted and stored, the accuracy of translation environment extraction in subsequent translations can be improved.

【００７６】また、第２の本発明の機械翻訳システムに
よれば、チューニング手段と、チューニング後の文法辞
書データベース内容を新しい上記文法辞書データベース
を形成させて格納させる文法辞書データベース形成手段
とを設けたので、第１の本発明による効果に加えて、過
去の翻訳環境から得た文法辞書データベースが不十分で
あってもチューニングによって良好な翻訳品質を達成で
きるという効果、及び、チューニング後の文法辞書デー
タベース内容を独立のデータベースと格納したことに基
づくチューニングによる品質向上効果を将来に渡って得
ることができるという効果を奏する。Further, according to the machine translation system of the second aspect of the present invention, tuning means and grammar dictionary database forming means for forming and storing the new grammar dictionary database contents after tuning are provided. Therefore, in addition to the effect of the first aspect of the present invention, the effect that good translation quality can be achieved by tuning even if the grammar dictionary database obtained from the past translation environment is insufficient, and the grammar dictionary database after tuning There is an effect that a quality improvement effect by tuning based on storing the contents in an independent database can be obtained in the future.

【図面の簡単な説明】[Brief description of drawings]

【図１】実施形態の各部詳細構成を示す機能ブロック図
である。FIG. 1 is a functional block diagram showing a detailed configuration of each part of an embodiment.

【図２】実施形態のシステム構成を示すブロック図であ
る。FIG. 2 is a block diagram showing a system configuration of an embodiment.

【図３】入力文書及び文書特徴情報の説明図である。FIG. 3 is an explanatory diagram of an input document and document characteristic information.

【図４】実施形態の動作フローチャート（その１）であ
る。FIG. 4 is an operation flowchart (1) of the embodiment.

【図５】実施形態の動作フローチャート（その２）であ
る。FIG. 5 is an operation flowchart (2) of the embodiment.

【図６】文書特徴データベース内の文書特徴情報の説明
図である。FIG. 6 is an explanatory diagram of document feature information in a document feature database.

【図７】使用者への翻訳環境の提示例を示す説明図であ
る。FIG. 7 is an explanatory diagram showing an example of presentation of a translation environment to a user.

【図８】抽出された翻訳環境例を示す説明図である。FIG. 8 is an explanatory diagram showing an example of an extracted translation environment.

【符号の説明】[Explanation of symbols]

２…文書特徴抽出判定部、３…翻訳実行部、４〜６…通
信部、１０…翻訳環境設定部、１２…文書特徴抽出部、
１３…文書特徴データベース、１４…文書特徴類似度計
算部、１５…翻訳処理部、１６…文法辞書部、１７…翻
訳環境抽出部、３１…翻訳環境データベース、３２…文
法辞書データベース、２００、２００Ａ、２００Ｂ、２
００Ｃ…翻訳装置部、２０１、２０１ａ、２０１ｂ…文
法辞書データベース格納部、２０２…翻訳環境データベ
ース格納部、２０３…ネットワーク。2 ... Document feature extraction and determination unit, 3 ... Translation execution unit, 4 to 6 ... Communication units, 10 ... Translation environment setting unit, 12 ... Document feature extraction unit, 13 ... Document feature database, 14 ... Document feature similarity calculation unit, 15 ... Translation processing unit, 16 ... Grammar dictionary unit, 17 ... Translation environment extraction unit, 31 ... Translation environment database, 32 ... Grammar dictionary database, 200, 200A, 200B, 200C ... Translation device units, 201, 201a, 201b ... Grammar dictionary database storage units, 202 ... Translation environment database storage unit, 203 ... Network.

Claims

【特許請求の範囲】[Claims]

【請求項１】入力された文書を翻訳する１以上の翻訳
装置部と、過去に翻訳したときの少なくとも入力文書と
使用した文法辞書名とを含む１以上の翻訳環境を格納す
る翻訳環境データベースを有する翻訳環境データベース
格納部と、異なる文法辞書をそれぞれデータベースとし
て格納している複数の文法辞書データベース格納部とを
備える機械翻訳システムであって、上記各翻訳装置部がそれぞれ、自然言語で記述された今回の入力文書から、その入力文
書内の単語やイディオムの出現頻度情報を少なくとも含
む文書特徴情報を抽出する文書特徴抽出手段と、上記翻訳環境データベース内に格納されている各翻訳環
境に係る過去の入力文書の文書特徴情報を格納している
文書特徴格納手段と、上記文書特徴抽出手段で抽出された文書特徴情報と、上
記文書特徴格納手段に格納されている各文書特徴情報と
の類似度を得、得られた各類似度に基づいて、上記翻訳
環境データベースから、今回の入力文書に適したと推測
される翻訳環境を取出して使用者に提示して、いずれか
の翻訳環境を設定させる類似度判定・翻訳環境設定手段
と、設定された翻訳環境に記載されている文法辞書名を有す
る文法辞書データベースの内容を上記文法辞書データベ
ース格納部から読み込み、その文法辞書データベース内
容を用いて今回の入力文書を翻訳する翻訳処理手段と、今回の翻訳処理に係る翻訳環境を抽出する翻訳環境抽出
手段と、抽出された翻訳環境を上記翻訳環境データベース格納部
に転送させて上記翻訳環境データベースを更新させる翻
訳環境更新手段と、上記翻訳環境データベースが更新されたとき、それに合
わせて、上記文書特徴格納手段内の格納内容を更新させ
る格納文書特徴更新手段とを備えることを特徴とする機
械翻訳システム。1. A machine translation system comprising: one or more translation device units that translate an input document; a translation environment database storage unit having a translation environment database that stores one or more translation environments, each including at least the input document and the name of the grammar dictionary used in a past translation; and a plurality of grammar dictionary database storage units, each storing a different grammar dictionary as a database, wherein each of the translation device units comprises: document feature extraction means for extracting, from the current input document written in natural language, document feature information including at least appearance frequency information on words and idioms in the input document; document feature storage means for storing the document feature information of the past input document for each translation environment stored in the translation environment database; similarity determination and translation environment setting means for obtaining the similarity between the document feature information extracted by the document feature extraction means and each piece of document feature information stored in the document feature storage means, retrieving from the translation environment database, on the basis of the obtained similarities, a translation environment presumed to be suitable for the current input document, presenting it to the user, and having one of the translation environments set; translation processing means for reading, from the grammar dictionary database storage unit, the contents of the grammar dictionary database having the grammar dictionary name given in the set translation environment, and translating the current input document using those contents; translation environment extraction means for extracting the translation environment for the current translation processing; translation environment updating means for transferring the extracted translation environment to the translation environment database storage unit so that the translation environment database is updated; and stored document feature updating means for updating, when the translation environment database is updated, the contents stored in the document feature storage means accordingly.

【請求項２】上記各翻訳装置部がそれぞれ、いずれかの上記文法辞書データベース格納部から転送さ
れてきた文法辞書データベース内容を、使用者からの指
示に基づいてチューニングして、上記翻訳処理手段の使
用に供するようにさせるチューニング手段と、チューニング後の文法辞書データベース内容を新しい上
記文法辞書データベースを形成させて格納させる文法辞
書データベース形成手段とをさらに備え、このチューニング後の文法辞書データベース内容を抽出
される翻訳環境に反映させることを特徴とする請求項１
に記載の機械翻訳システム。2. The machine translation system according to claim 1, wherein each of the translation device units further comprises: tuning means for tuning, based on instructions from the user, the contents of a grammar dictionary database transferred from one of the grammar dictionary database storage units, and making them available to the translation processing means; and grammar dictionary database forming means for forming a new grammar dictionary database from the tuned contents and storing it, and wherein the tuned contents of the grammar dictionary database are reflected in the extracted translation environment.