JP2019036093A - Model learning device, conversion device, method, and program

Info

Publication number: JP2019036093A
Application number: JP2017156514A
Authority: JP
Inventors: Masaaki Nagata (永田 昌明); Shunsuke Takeno (竹野 峻輔); Kazuhide Yamamoto (山本 和英)
Original assignee: Nippon Telegraph and Telephone Corp; Nagaoka University of Technology NUC
Current assignee: Nippon Telegraph and Telephone Corp; Nagaoka University of Technology NUC
Priority date: 2017-08-14
Filing date: 2017-08-14
Publication date: 2019-03-07
Anticipated expiration: 2037-08-14
Also published as: JP6946842B2

Abstract

PROBLEM TO BE SOLVED: To learn a model that, from an input sentence, simultaneously predicts features of the pair of the input sentence and the output sentence and generates the output sentence. SOLUTION: A conversion model learning unit learns, from a source language sentence and a target language sentence to which a string of symbols has been added as a prefix, a conversion model for converting the source language sentence into the target language sentence with the prefix added. The string of symbols is information representing a feature of the pair of the source language sentence and the target language sentence. SELECTED DRAWING: Figure 2

Description

The present invention relates to a model learning device, a conversion device, a method, and a program, and in particular to a model learning device, a conversion device, a method, and a program for converting an input sentence into an output sentence.

It is difficult for a user to control the sentences output by a neural machine translation system. Input words and the internal states of the neural network are represented as real-valued vectors of roughly 500 to 1,000 dimensions, so the system contains no symbols that a user could understand and manipulate. The neural network (a probabilistic model) that performs machine translation is trained only from pairs of a source language sentence (input sentence, input word sequence) and a target language sentence (output sentence, output word sequence). This is called end-to-end learning. A neural machine translation system trained end to end is a complete black box: it is very difficult for developers to analyze the causes of errors and for users to change the output as they wish.

[Attention-based encoder-decoder model]
First, the attention-based encoder-decoder model, which is the mainstream of current neural machine translation, is described (see Non-Patent Documents 1 and 5).

Let the input sequence be $x = x_1 \dots x_n$ and the model parameters be $\theta$. The encoder-decoder model formulates the likelihood of the output sequence $y = y_1 \dots y_m$ as in equation (1).

$$p(y \mid x; \theta) = \prod_{j=1}^{m} p(y_j \mid y_{<j}, x; \theta) \qquad (1)$$

Here, the subscript $<j$ means that all words preceding position $j$ are taken into account.
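As a minimal sketch of the factorization in equation (1), the following Python fragment computes the log-likelihood of an output sentence as the sum of the logs of its per-word conditional probabilities. The function name `sequence_log_likelihood` and the probability values are hypothetical and serve only as illustration; they are not taken from this publication.

```python
import math


def sequence_log_likelihood(step_probs):
    """Return log p(y | x; theta) from the per-position probabilities
    p(y_j | y_<j, x; theta), following the product form of equation (1)."""
    return sum(math.log(p) for p in step_probs)


# Hypothetical conditional probabilities for a three-word output sentence.
step_probs = [0.5, 0.4, 0.25]
log_lik = sequence_log_likelihood(step_probs)
likelihood = math.exp(log_lik)  # equals 0.5 * 0.4 * 0.25 up to rounding
```

Working in log space avoids numerical underflow for long sentences, which is presumably why the training objective is usually written with logarithms.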

The encoder is a recurrent neural network that maps the input sequence $x$ to a sequence of internal states (hidden states) $h = h_1 \dots h_n$ through a nonlinear transformation. The decoder is a recurrent neural network that predicts the output sequence $y$ one word at a time from the beginning of the sentence. The encoder-decoder model is trained with stochastic gradient descent (SGD) so as to maximize the conditional likelihood of the parallel data $D$, as in equation (2).

$$\hat{\theta} = \operatorname*{arg\,max}_{\theta} \sum_{(x, y) \in D} \log p(y \mid x; \theta) \qquad (2)$$

The attention-based encoder-decoder model is an encoder-decoder model with a feed-forward neural network called an attention layer. When predicting the next word $y_j$ from the preceding target language word $y_{j-1}$, the attention layer computes a weight for each internal state $h_i$ of the encoder (that is, for each source language word $x_i$) based on the preceding internal state of the decoder and each internal state of the encoder.
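The weighting performed by the attention layer can be sketched as follows. This is a simplified illustration under stated assumptions: the text describes the attention layer as a feed-forward network, whereas this sketch substitutes a plain dot product as the scoring function, and all names (`attention_context`, `softmax`, `dot`) and vector values are hypothetical.

```python
import math


def softmax(scores):
    """Normalize scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot-product score; a stand-in for the feed-forward scoring network."""
    return sum(a * b for a, b in zip(u, v))


def attention_context(decoder_state, encoder_states, score=dot):
    """Compute one weight per encoder internal state h_i and return the
    weights together with the weighted sum of the encoder states."""
    weights = softmax([score(decoder_state, h) for h in encoder_states])
    dim = len(encoder_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, encoder_states))
        for d in range(dim)
    ]
    return weights, context


# Hypothetical two-dimensional states for a three-word source sentence.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attention_context(decoder_state, encoder_states)
```

In this toy input the first and third encoder states score equally and higher than the second, so they receive equal, larger weights.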

This completes the description of the attention-based encoder-decoder model.

Next, various techniques considered in machine translation are described in turn.

[Side constraints]
Sennrich et al. proposed side constraints as a method for controlling the politeness of the target language sentence in English-to-German translation (Non-Patent Document 6). In this method, a T-V tag (T-V distinction tag), which distinguishes between the use of familiar forms such as Latin tu and polite forms such as Latin vos in the target language sentence, is appended to the end of the source language sentence.

Side constraints can be regarded as a general method by which a user controls the generated target language sentence: a special token expressing a feature that the target language sentence should satisfy is attached to the end of the source language sentence.

When a translation model is trained, the side constraints are extracted automatically, by some method, from the pairs of source and target language sentences. When translation is performed (at test time), the user must specify the side constraints. There is no general method for obtaining side constraints automatically from the input sentence (source language sentence), so the problem must be solved separately for each kind of side constraint.

Johnson et al. use side constraints to specify the target language to be generated when training a one-to-many multilingual translation model (see Non-Patent Document 2). A special symbol representing the target language, for example Spanish, is prepended to the source language sentence.

However, because Non-Patent Document 2 reverses the source language sentence when training the translation model, the special symbol actually ends up at the end of the source side.
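The two placements of a side constraint described above can be sketched as follows. The helper `add_side_constraint` and the tag strings `<V>` and `<2es>` are illustrative placeholders chosen for this sketch, not values taken from this text.

```python
def add_side_constraint(source_tokens, tag, position="end"):
    """Attach a special token to the source sentence, either at its end
    or at its beginning."""
    if position == "end":
        return source_tokens + [tag]
    if position == "start":
        return [tag] + source_tokens
    raise ValueError("position must be 'end' or 'start'")


src = ["how", "are", "you", "?"]

# Politeness tag appended to the end of the source sentence.
polite = add_side_constraint(src, "<V>")

# Target-language symbol prepended to the source sentence ...
to_spanish = add_side_constraint(src, "<2es>", position="start")

# ... which lands at the end once the source sentence is reversed.
reversed_input = list(reversed(to_spanish))
```

In both cases the constraint sits on the source side, so it has to be known before translation starts.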

[Prefix-constrained decoding]

Wuebker et al. proposed prefix-constrained decoding for neural machine translation based on the encoder-decoder model: the output sentence is generated under the constraint that its prefix matches a prefix specified by the user (see Non-Patent Document 8). They use it for interactive machine translation.

Prefix-constrained decoding is very simple to implement: when the decoder predicts the next word, it ignores the word it predicted at the previous step and instead takes as input the word at the corresponding position in the prefix. Once the prefix is exhausted, decoding returns to ordinary beam search, that is, the candidate that had the highest probability as the preceding word is used to predict the next word.
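The override described above can be sketched as follows. This is a simplified illustration: it uses greedy decoding rather than beam search, and the decoder is replaced by a toy lookup table. The names `decode_with_prefix`, `toy_step`, and `BIGRAMS`, and the symbols `<news>` and `<medical>`, are hypothetical.

```python
def decode_with_prefix(step_fn, prefix, bos="<s>", eos="</s>", max_len=20):
    """Greedy decoding in which the first len(prefix) outputs are forced.

    step_fn(prev_token, state) returns (predicted_token, new_state).
    While inside the prefix, the predicted token is ignored and the prefix
    symbol is emitted and fed back as the next input instead.
    """
    output = []
    prev, state = bos, None
    while len(output) < max_len:
        predicted, state = step_fn(prev, state)
        pos = len(output)
        token = prefix[pos] if pos < len(prefix) else predicted
        if token == eos:
            break
        output.append(token)
        prev = token  # the (possibly overridden) token drives the next step
    return output


# Toy stand-in for a trained decoder: the next token depends only on the
# previous one.
BIGRAMS = {
    "<s>": "<news>", "<news>": "hello", "hello": "world", "world": "</s>",
    "<medical>": "take", "take": "care", "care": "</s>",
}


def toy_step(prev_token, state):
    return BIGRAMS.get(prev_token, "</s>"), state


free = decode_with_prefix(toy_step, prefix=[])
forced = decode_with_prefix(toy_step, prefix=["<medical>"])
```

With an empty prefix the toy decoder follows its own predictions; with a one-symbol prefix the first output is replaced and the continuation is conditioned on the replacement.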

[Bidirectional decoding]
In neural machine translation, the result of generating the target language side from the beginning of the sentence toward the end (left-to-right) generally differs from the result of generating it from the end toward the beginning (right-to-left). Exploiting this property to improve translation accuracy is called bidirectional decoding.

Liu et al. report that, when generating the target language side in neural translation, translation accuracy improves by approximately searching for sentence candidates on which the left-to-right result and the right-to-left result agree (see Non-Patent Document 4). They call this target-bidirectional neural machine translation.

Specifically, the method of Non-Patent Document 4 trains one translation model that generates the target language sentence from left to right and another that generates it from right to left, creates k-best sentence candidates by beam search with each model, and, from the sentence candidates in the intersection of the two lists, selects the candidate that maximizes the product of the probabilities given by the two models.
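The selection step just described can be sketched as follows, assuming the two k-best lists are already available as (sentence, probability) pairs with sentences stored in left-to-right word order. The function name `select_by_agreement`, the handling of an empty intersection (returning `None`), and the example probabilities are assumptions of this sketch.

```python
def select_by_agreement(l2r_kbest, r2l_kbest):
    """Pick the sentence found in both k-best lists that maximizes the
    product of the two models' probabilities.

    Returns (sentence, product), or None if the lists share no sentence.
    """
    r2l_probs = dict(r2l_kbest)
    common = [
        (sentence, p_l2r * r2l_probs[sentence])
        for sentence, p_l2r in l2r_kbest
        if sentence in r2l_probs
    ]
    if not common:
        return None
    return max(common, key=lambda item: item[1])


# Hypothetical 3-best lists from the two models.
l2r_kbest = [(("a", "b"), 0.5), (("a", "c"), 0.3), (("b", "c"), 0.1)]
r2l_kbest = [(("a", "c"), 0.6), (("a", "b"), 0.2), (("c", "c"), 0.1)]

best = select_by_agreement(l2r_kbest, r2l_kbest)
```

Here the candidate ranked first by the left-to-right model loses to the second one because the right-to-left model scores it much lower.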

To realize bidirectional decoding, the method of Non-Patent Document 4 requires two translation models and, in addition to ordinary decoding, a means of searching for candidates on which the two translation results agree.

[Domain adaptation]
When a large amount of parallel data cannot be prepared for the domain to be translated, parallel data from other domains is generally used to improve translation accuracy. This is called domain adaptation.

Kobus et al. worked with parallel data drawn from different domains such as information and communications, literature, medicine, news, parliamentary proceedings, and tourism. They represent the domain to which each source language sentence belongs with a domain tag, append the tag to the end of the source language sentence as a side constraint, and train a single translation model on the parallel data of all domains. At test time, the domain of the source language sentence is estimated automatically and the domain tag is attached. They report that this improves translation accuracy (see Non-Patent Document 3).

Because there is no general method for obtaining side constraints automatically from the input sentence (source language sentence), Non-Patent Document 3 builds a classifier that uses TF-IDF-based features to determine automatically the domain to which a source language sentence belongs.

[Translation of zero pronouns into English]
In translation from a pro-drop language such as Japanese or Chinese, which omits subjects that can be understood from context, into a non-pro-drop language such as English, in which the subject is obligatory (omission of the subject is not allowed), the subjects and objects omitted in the source language sentence must be detected and the corresponding subjects and objects generated in the target language sentence.

Omitted subjects and objects are called zero pronouns.

For Chinese-to-English translation, Wang et al. proposed a technique that uses word alignments created automatically from parallel data and a language model of the source language (Chinese) to estimate, from the pronouns that are explicit in English, the type and position of the pronouns omitted in Chinese. They report that translation accuracy improves when this technique is used to estimate (fill in) the zero pronouns of the source language (Chinese) in the training and test data before neural translation is performed (see Non-Patent Document 7).

The technique of Non-Patent Document 7 requires a list to be given in advance whose entries, such as (subject personal pronoun, 我, him) and (object personal pronoun, 他, him), each consist of the type of a zero pronoun, the source language pronoun equivalent to that zero pronoun, and the target language pronoun corresponding to it.

Their technique also treats the estimation of zero pronouns in the source language and the translation from the source language into the target language as two independent tasks.

[Identification of missing words (unaligned target words)]
Takeno et al. defined the prediction of missing words as a task that subsumes and generalizes the problem of translating zero pronouns in Japanese and Chinese into English pronouns, and proposed a method for identifying missing words using word translation probabilities obtained from parallel data (Non-Patent Document 9). In Non-Patent Document 9, given a pair of sentences that are translations of each other, a word with no corresponding word in the other language is called an unaligned word, and an unaligned word in the target language sentence in particular is called a missing word.

FIG. 1 shows examples of unaligned words in translation between Japanese and English. In general, Japanese case particles such as が (ga), を (wo), and に (ni) and English articles such as "a", "an", and "the" have no corresponding word in the other language. Besides such differences in grammatical function between two languages, some unaligned words arise from a particular linguistic phenomenon or construction of one language. Examples are Japanese zero pronouns (omitted subjects and objects) and English expletives, namely the "there" of there-constructions, the "do" of interrogative sentences, and the formal subject "it".

Unaligned words in the target language sentence are very difficult to translate (generate) correctly in machine translation, because no word from which to generate them is explicitly present in the source language sentence. Non-Patent Document 9 calls them missing words, but from a linguistic point of view they are not necessarily elements missing from the source language sentence, so here they are called unaligned target words.

Non-Patent Document 9 identifies unaligned target words as follows. Let the source language be f, the target language be e, and the empty word be NULL. First, the word translation probabilities p(e|f) and p(f|e) are obtained from the parallel data using automatic word alignment software such as GIZA++. Next, a score $S_u(w)$ representing the degree to which the word NULL of the source language f corresponds to the word w of the target language e is defined as in equation (3), and the top n words in descending order of this score are taken as the list of unaligned target words.

... (3)

Next, for each parallel sentence pair, the word alignment software is used to find the words in the target language sentence that have no corresponding word in the source language sentence. Of these, the words contained in the candidate list of unaligned target words defined above are taken to be the unaligned target words of that sentence pair.
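The per-sentence filtering step can be sketched as follows, assuming the word alignment is already available as (source index, target index) pairs and the candidate list has already been built. The function name `unaligned_target_words`, the example sentence, its alignment, and the candidate list are all hypothetical.

```python
def unaligned_target_words(target_tokens, alignment, candidate_list):
    """Return (position, word) for every target word that has no aligned
    source word and also appears in the candidate list."""
    aligned_positions = {tgt_idx for _, tgt_idx in alignment}
    return [
        (i, word)
        for i, word in enumerate(target_tokens)
        if i not in aligned_positions and word in candidate_list
    ]


# Hypothetical sentence pair; alignment entries are (source, target) indices.
source_tokens = ["彼", "に", "本", "を", "渡した"]
target_tokens = ["i", "gave", "him", "the", "book"]
alignment = [(0, 2), (2, 4), (4, 1)]

# Hypothetical top-n candidate list of unaligned target words.
candidate_list = {"i", "the", "a", "it"}

found = unaligned_target_words(target_tokens, alignment, candidate_list)
```

Requiring membership in the candidate list filters out words that are unaligned only because of alignment errors.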

Non-Patent Document 9 created "oracle input sentences" by projecting the unaligned target words to appropriate positions in the source language sentence, and showed that translation accuracy improves greatly when oracle input sentences are translated with a translation model trained on pairs of oracle input sentences and target language sentences. However, because the unaligned target words are added to the source language sentence (input sentence) while looking at the target language sentence (the correct translation), this result only shows that there is room to improve translation accuracy through better handling of unaligned target words. In actual translation, unaligned target words must be predicted from the information in the source language sentence alone.

Non-Patent Document 1: Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural Machine Translation by Jointly Learning to Align and Translate. In Proceedings of ICLR-2015, 2015.
Non-Patent Document 2: Melvin Johnson, Mike Schuster, Quoc V Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, Fernanda Viegas, Martin Wattenberg, Greg Corrado, Macduff Hughes, and Jeffrey Dean. Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation. arXiv preprint arXiv:1611.04558, 2016.
Non-Patent Document 3: Catherine Kobus, Josep Maria Crego, and Jean Senellart. Domain control for neural machine translation. arXiv preprint arXiv:1612.06140, 2016.
Non-Patent Document 4: Lemao Liu, Masao Utiyama, Andrew Finch, and Eiichiro Sumita. Agreement on target-bidirectional neural machine translation. In Proceedings of the NAACL-HLT, pp. 411-416, 2016.
Non-Patent Document 5: Minh-Thang Luong, Hieu Pham, and Christopher D Manning. Effective approaches to attention-based neural machine translation. In EMNLP-2015, 2015.
Non-Patent Document 6: Rico Sennrich, Barry Haddow, and Alexandra Birch. Controlling Politeness in Neural Machine Translation via Side Constraints. In Proceedings of NAACL-HLT-2016, pp. 35-40, 2016.
Non-Patent Document 7: Longyue Wang, Zhaopeng Tu, Xiaojun Zhang, Hang Li, Andy Way, and Qun Liu. A Novel Approach to Dropped Pronoun Translation. In Proceedings of the NAACL-2016, pp. 983-993, 2016.
Non-Patent Document 8: Joern Wuebker, Spence Green, John DeNero, Sasa Hasan, and Minh-Thang Luong. Models and Inference for Prefix-Constrained Machine Translation. In Proceedings of the ACL-2016, pp. 66-5, 2016.
Non-Patent Document 9: Shunsuke Takeno, Masaaki Nagata, and Kazuhide Yamamoto. Generation of Oracle Input Sentences for Machine Translation by Projecting Missing Words Using Word Alignment (in Japanese). IEICE Technical Report, vol. 116, no. 379, NLC2016-38, pp. 135-140, 2016.

For the techniques listed above, the following problems can be identified in bidirectional decoding, in domain adaptation, and in the generation of unaligned target words, which is taken here as an example of a problem that generalizes the translation of zero pronouns.

For bidirectional decoding, it is cumbersome to prepare two translation models, left-to-right and right-to-left, and, in addition to ordinary decoding, a means of searching for candidates on which the translation results of the two decoding directions agree.

For domain adaptation, it is cumbersome to prepare a separate means of automatically identifying the domain to which a source language sentence belongs.

For the generation of unaligned target words, it is known that there is room to improve translation accuracy, but no method is known for predicting unaligned target words from the information in the source language sentence alone.

The present invention has been made to solve the above problems. One object is to provide a model learning device, method, and program capable of learning a model that, from an input sentence, simultaneously predicts features of the pair of the input sentence and the output sentence and generates the output sentence.

Another object is to provide a conversion device, method, and program capable of simultaneously predicting, from an input sentence, features of the pair of the input sentence and the output sentence and generating the output sentence.

To achieve the above objects, the model learning device according to the first invention includes a model learning unit that learns a conversion model for converting an input sentence into an output sentence with a prefix added at its beginning. The model is learned from the input sentence and the output sentence to whose beginning a sequence of one or more symbols has been added as the prefix, the sequence being information representing features of the pair of the input sentence and the output sentence.

The model learning device according to the second invention includes a sentence creation unit and a model learning unit. Based on the output sentence and a prefix consisting of a sequence of one or more symbols, which is information representing features of the pair of an input sentence and the output sentence, the sentence creation unit applies to the output sentence a process determined according to the prefix and adds the prefix to the beginning of the processing result. Based on the input sentence and the processing result of the output sentence with the prefix added at its beginning by the sentence creation unit, the model learning unit learns a conversion model for converting the input sentence into the processing result of the output sentence with the prefix added at its beginning.

The conversion device according to the third invention includes a conversion unit that converts an input sentence into an output sentence with a prefix added at its beginning, using a conversion model trained in advance to convert the input sentence into the output sentence to whose beginning a sequence of one or more symbols, which is information representing features of the pair of the input sentence and the output sentence, has been added as the prefix. The conversion unit comprises: an encoder that converts the word sequence of the input sentence into a sequence of internal states; an attention layer that computes a weight for each word of the input sentence and outputs the weighted sum of the encoder's internal states for those words; and a decoder that predicts the output sentence with the prefix added at its beginning one word at a time from the beginning. At each step (time) at which it predicts a word, the decoder takes as input the output of the attention layer, the internal state of the decoder at the previous step, and the word output as the prediction at the previous step.

The conversion device according to the fourth invention takes as input an input sentence and a prefix consisting of one or more symbols, which is information representing features of the pair of the input sentence and the output sentence. It includes a conversion unit that converts the input sentence into an output sentence with the prefix added at its beginning, using a conversion model trained in advance to convert an input sentence into the output sentence with the prefix added at its beginning. The conversion unit comprises: an encoder that converts the word sequence of the input sentence into a sequence of internal states; an attention layer that computes a weight for each word of the input sentence and outputs the weighted sum of the encoder's internal states corresponding to those words; and a decoder that predicts the output sentence with the prefix added at its beginning one word at a time from the beginning. At each step (time) at which it predicts a word, the decoder takes as input the output of the attention layer, the internal state of the decoder at the previous step, and the word output as the prediction at the previous step. If the word output as the prediction at the previous step differs from the corresponding symbol of the input prefix, the corresponding symbol of the input prefix is used in place of that word.

According to the model learning device, method, and program of the present invention, a conversion model for converting an input sentence into an output sentence with a prefix added at its beginning is learned from the input sentence and the output sentence to whose beginning a prefix of length 1 or more has been added, the prefix being information representing features of the pair of the input sentence and the output sentence. This has the effect that a model can be learned that, from an input sentence, simultaneously predicts features of the pair of the input sentence and the output sentence and generates the output sentence.

According to the conversion device, method, and program of the present invention, an input sentence is converted into an output sentence with a prefix added at its beginning, using a conversion model trained in advance to convert the input sentence into the output sentence to whose beginning a prefix of length 1 or more has been added, the prefix being information representing features of the pair of the input sentence and the output sentence. This has the effect that, from an input sentence, the prediction of features of the pair of the input sentence and the output sentence and the generation of the output sentence can be performed simultaneously.

A diagram showing examples of unaligned words in translation between Japanese and English.
A block diagram showing the configuration of the model learning device according to the first embodiment of the present invention.
A diagram showing an example of a schematic of the conversion model to be learned.
A flowchart showing the model learning processing routine in the model learning device according to the first embodiment of the present invention.
A block diagram showing the configuration of the conversion device according to the first embodiment of the present invention.
A flowchart showing the conversion processing routine in the conversion device according to the first embodiment of the present invention.
A block diagram showing the configuration of the model learning device according to the second embodiment of the present invention.
A block diagram showing the configuration of the conversion device according to the second embodiment of the present invention.
A diagram showing an example of the case where, in the conversion process, a predicted word differs from the corresponding symbol of the input prefix.
A flowchart showing the conversion processing routine in the conversion device according to the second embodiment of the present invention.
A block diagram showing the configuration of the conversion model learning device according to the third embodiment of the present invention.
A block diagram showing the configuration of the prefix creation unit according to the third embodiment of the present invention.
A flowchart showing the prefix creation processing routine in the conversion device according to the third embodiment of the present invention.
A block diagram showing the configuration of the conversion device according to the third embodiment of the present invention.

Embodiments of the present invention are described in detail below with reference to the drawings.

<Principle of the Embodiments of the Present Invention>

First, the principle underlying the embodiments of the present invention is described.

The embodiments of the present invention propose a general-purpose framework consisting of prefix constraint prediction and prefix-constrained decoding. Each embodiment then describes how this framework is used to realize domain adaptation, bidirectional decoding, and generation of unaligned target-language words.

[Prediction of prefix constraints]

Whereas side constraints append a special symbol to the end of the source language sentence, the embodiments of the present invention propose prepending a special symbol string to the beginning of the target language sentence. These are called prefix constraints. In other words, prefix constraint prediction is translation from a source language sentence into an extended target language sentence that has the special symbol string as its prefix.

Let the sequence of symbols expressing the features obtained from a pair of a source language sentence x and a target language sentence y be

(expression not reproduced in this text)

and let the extended target language sentence be

(expression not reproduced in this text)

Expression (1) of the encoder-decoder model is extended as expression (4) below.

... (4)

Expression (4) indicates that the decoder generates the prefix c and then generates the target language sentence y.

The objective function is likewise extended as expression (5) below.

... (5)

In this way, the network of the original encoder-decoder model with attention is left completely unchanged. Simply prepending a symbol string that expresses the features to the target language sentence makes it possible to predict the symbol string and generate the target language sentence from the source language sentence at the same time.
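Expressions (4) and (5) themselves are not reproduced in this text. The following is only a sketch of the form they plausibly take, assuming that expression (1) is the usual autoregressive factorization of an encoder-decoder model. The symbols c_1 ... c_m for the prefix, y_1 ... y_n for the target words, the training set D, and the parameters θ are notation assumed here, not taken from the original.

```latex
% Assumed notation: prefix c = c_1 ... c_m, target sentence y = y_1 ... y_n.
% The decoder first emits the prefix symbols, then the target words.
\begin{align*}
p(c, y \mid x; \theta)
  &= \prod_{i=1}^{m} p(c_i \mid c_{<i}, x; \theta)
     \prod_{j=1}^{n} p(y_j \mid y_{<j}, c, x; \theta) \\
\mathcal{L}(\theta)
  &= -\frac{1}{|D|} \sum_{(x, y, c) \in D} \log p(c, y \mid x; \theta)
\end{align*}
```

Under this reading, the only change from a plain translation model is that the training target is the concatenation of c and y, which is consistent with the statement that the network itself is not modified.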

[Specification of prefix constraints]

In the embodiments of the present invention, the user can also specify a prefix constraint externally. Specifically, prefix-constrained decoding (see Non-Patent Document 8) is performed with the symbol string representing the features as the prefix. The source language sentence and the prefix are taken as input, and the sentence is translated into the target language with the prefix fixed as specified.

The method described above provides a framework in which features of a source/target language sentence pair are expressed as a symbol or symbol string, and in which prediction of the symbol string and generation of the target language sentence are performed simultaneously. It also provides a framework in which the user specifies the symbol string and the target language sentence is generated under that constraint. Any symbol may be used as long as it expresses a feature of the source/target pair. By designing the symbol system appropriately, translation accuracy can be improved for a specific problem, and different target language sentences can be generated according to the symbols the user specifies.

The first to third embodiments below describe, respectively, the application of this method to bidirectional decoding, domain adaptation, and generation of unaligned target-language words.

<Configuration of the Model Learning Device According to the First Embodiment of the Present Invention>

The configuration of the model learning device according to the first embodiment of the present invention is described. The first embodiment takes as an example the case where prefix constraint prediction is applied to bidirectional decoding.

As shown in FIG. 2, the model learning device 100 according to the first embodiment of the present invention can be configured as a computer including a CPU, a RAM, and a ROM that stores various data and a program for executing the model learning processing routine described later. Functionally, the model learning device 100 includes an input unit 10 and a calculation unit 20, as shown in FIG. 2.

The input unit 10 receives parallel translation data in which a source language sentence and a target language sentence are translations of each other.

The calculation unit 20 includes a source language sentence extraction unit 30, a target language sentence extraction unit 32, a prefix creation unit 34, a sentence creation unit 36, a conversion model learning unit 38, and a conversion model 40.

The source language sentence extraction unit 30 extracts source language sentences, sentence by sentence, from the parallel translation data received by the input unit 10.

The target language sentence extraction unit 32 extracts target language sentences, sentence by sentence, from the parallel translation data received by the input unit 10.

The prefix creation unit 34 creates, for each pair of a source language sentence and a target language sentence in the parallel translation data received by the input unit 10, a string of one or more symbols as a prefix, the string being information that represents a feature of that pair. In this embodiment, the prefix represents a feature related to bidirectional decoding. For example, for each pair of a source language sentence and a target language sentence, the unit creates the tag #L2R, written with a leading sharp sign, to indicate that the target language sentence is generated left-to-right, and the tag #R2L to indicate that it is generated right-to-left. The prefix may be any symbol that does not overlap with the vocabulary of the target language sentences and that is uniquely determined for the feature. The length of a prefix is the number of symbols it contains; in this embodiment the length is 1 (fixed length).

The sentence creation unit 36 adds a prefix to the target language sentence, based on the prefix created by the prefix creation unit 34 and the target language sentence extracted by the target language sentence extraction unit 32. Specifically, for each pair of a source language sentence and a target language sentence in the parallel translation data, it creates one version in which the prefix #L2R is added to the head of the target language sentence and another in which the prefix #R2L is added. When adding #R2L, the sentence creation unit 36 first applies the processing defined for that prefix to the target language sentence and then adds the prefix to the head of the processed sentence. In this embodiment, the processing defined for #R2L reverses the target language sentence so that it can be generated right-to-left.

Adding the prefixes to the target language sentence corresponding to the source language sentence 「京都が好きです」 ("I like Kyoto"), and to the target language sentence obtained as the processing result, gives the following.

(example not reproduced in this text)

In this way, two target language sentences with different word order directions are created, and the corresponding generation direction is added to each as a prefix.
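The creation of the two prefixed target sentences can be sketched as follows. This is a minimal sketch rather than the implementation described in the specification: the function names are hypothetical, and the tokenized English target used in the usage note is an assumed example.

```python
from typing import List, Tuple

L2R = "#L2R"  # tag: the target sentence is generated left-to-right
R2L = "#R2L"  # tag: the target sentence is generated right-to-left


def make_bidirectional_targets(target_tokens: List[str]) -> List[List[str]]:
    """Return the two prefixed target sequences for one sentence pair."""
    forward = [L2R] + list(target_tokens)
    # For #R2L the word order is reversed first, then the tag is prepended.
    backward = [R2L] + list(reversed(target_tokens))
    return [forward, backward]


def make_training_pairs(
    source_tokens: List[str], target_tokens: List[str]
) -> List[Tuple[List[str], List[str]]]:
    """Pair the unchanged source sentence with each prefixed target."""
    return [
        (list(source_tokens), target)
        for target in make_bidirectional_targets(target_tokens)
    ]
```

For instance, calling `make_training_pairs(["京都", "が", "好き", "です"], ["I", "like", "Kyoto"])` yields one pair whose target starts with `#L2R` and keeps the word order, and one whose target starts with `#R2L` and has the words reversed. Each source sentence therefore contributes two training examples.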

The conversion model learning unit 38 learns the conversion model 40, which translates a source language sentence into a target language sentence with a prefix at its head. Learning is based on the source language sentences extracted by the source language sentence extraction unit 30, the target language sentences to which the sentence creation unit 36 added a prefix, and the processed target language sentences to which it added a prefix. In this embodiment, the conversion model 40 is learned using target language sentences prefixed with a tag that indicates the direction in which the words of the target language sentence are predicted.

FIG. 3 shows a schematic diagram of the neural network called an "encoder-decoder model with attention", whose parameters are learned by the conversion model learning unit 38 and held in the conversion model 40. As shown in FIG. 3, the model consists of an encoder based on an RNN (recurrent neural network), an attention layer using an FFNN (feedforward neural network), and a decoder based on an RNN. The encoder is a bidirectional RNN: the internal state of each word of the input sentence is the concatenation of the internal states of an RNN that reads the words from the beginning of the sentence to the end and an RNN that reads them from the end to the beginning. FIG. 3 shows the RNNs unrolled in sequence order for both the source language sentence and the target language sentence. The encoder may instead be a unidirectional RNN, and the encoder and decoder may be stacked (multi-layer) RNNs.

In an RNN, the internal state h_t at a step t is determined from the input x_t at step t and the internal state h_{t-1} at the immediately preceding step t-1. The RNN used in the embodiments of the present invention may be replaced with another neural network having an equivalent function, such as an LSTM (long short-term memory) or a GRU (gated recurrent unit).
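As a hedged illustration of this recurrence, one common (Elman-style) form is shown below. The weight matrices W and U, the bias b, and the tanh nonlinearity are assumptions made for the example, since the text states only which quantities h_t depends on.

```latex
\[
h_t = f(x_t, h_{t-1}), \qquad
\text{for example}\quad h_t = \tanh\left(W x_t + U h_{t-1} + b\right)
\]
```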

The encoder converts the word sequence of the source language sentence into a sequence of internal states. The attention layer is an FFNN (not shown) that computes a weight for each encoder internal state, based on the encoder internal state corresponding to each word of the source language sentence and the decoder internal state at the previous step, and outputs the weighted sum of the encoder internal states. The decoder predicts the prefixed target language sentence one word at a time from the beginning. At each step, it takes as input the output of the attention layer, its internal state at the previous step, and the word output as the prediction at the previous step.
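The weighted sum computed by the attention layer can be sketched in plain Python as follows. The additive scoring function (a tanh layer followed by a dot product with a vector `v`) is an assumption chosen for illustration; the text says only that an FFNN computes the weights. All names and parameters here are hypothetical.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def softmax(scores: List[float]) -> List[float]:
    """Normalize scores into weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix: Matrix, vec: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def attention(
    encoder_states: List[Vector],
    prev_decoder_state: Vector,
    w_enc: Matrix,
    w_dec: Matrix,
    v: Vector,
) -> Tuple[List[float], Vector]:
    """Return (weights, context) for one decoder step."""
    dec_proj = matvec(w_dec, prev_decoder_state)
    scores = []
    for state in encoder_states:
        enc_proj = matvec(w_enc, state)
        hidden = [math.tanh(a + b) for a, b in zip(enc_proj, dec_proj)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Context vector: weighted sum of the encoder internal states.
    context = [
        sum(w * state[d] for w, state in zip(weights, encoder_states))
        for d in range(dim)
    ]
    return weights, context
```

The returned context vector is what the decoder would consume at the next step, together with its previous internal state and the previously predicted word.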

<Operation of the Model Learning Device According to the First Embodiment of the Present Invention>

Next, the operation of the model learning device 100 according to the first embodiment of the present invention is described. When the input unit 10 receives parallel translation data, the model learning device 100 executes the model learning processing routine shown in FIG. 4.

First, in step S100, source language sentences are extracted, sentence by sentence, from the parallel translation data received by the input unit 10.

Next, in step S102, target language sentences are extracted, sentence by sentence, from the parallel translation data received by the input unit 10.

Next, in step S104, for each pair of a source language sentence and a target language sentence in the parallel translation data received by the input unit 10, a string of one or more symbols is created as a prefix, the string being information that represents a feature of the pair.

Next, in step S106, the prefix created in step S104 is added to the target language sentence extracted in step S102. The prefix is also added to the head of the target language sentence obtained by applying the processing defined for that prefix to the target language sentence.

Next, in step S108, the conversion model 40 for translating a source language sentence into a target language sentence with a prefix at its head is learned, based on the source language sentences extracted in step S100, the prefixed target language sentences from step S106, and the prefixed processed target language sentences from step S106. The processing then ends.

As described above, the model learning device according to the first embodiment learns a conversion model for converting a source language sentence into a target language sentence with a prefix at its head, based on source language sentences and on target language sentences prefixed with information that represents a feature of the source/target sentence pair. It can thereby learn a model that, from a source language sentence, simultaneously predicts the prefix representing the feature of the pair and generates the target language sentence.

<Configuration of the Conversion Device According to the First Embodiment of the Present Invention>

Next, the configuration of the conversion device according to the first embodiment of the present invention is described. As shown in FIG. 5, the conversion device 200 according to the first embodiment can be configured as a computer including a CPU, a RAM, and a ROM that stores various data and a program for executing the conversion processing routine described later. Functionally, the conversion device 200 includes an input unit 210, a calculation unit 220, and an output unit 250, as shown in FIG. 5.

The input unit 210 receives a source language sentence to be translated.

The calculation unit 220 includes a conversion unit 230, a shaping unit 232, and a conversion model 240.

The conversion model 240 holds the parameters of the trained neural network, learned by the model learning device 100 described above, for translating a source language sentence into a target language sentence with a prefix at its head.

Using the conversion model 240, the conversion unit 230 translates the source language sentence received by the input unit 210 into a target language sentence with a prefix at its head. The conversion unit 230 consists of an encoder based on an RNN, an attention layer using an FFNN, and a decoder based on an RNN. The encoder converts the word sequence of the source language sentence into a sequence of internal states. The attention layer is an FFNN that computes a weight for each encoder internal state, based on the encoder internal state corresponding to each word of the source language sentence and the decoder internal state at the previous step, and outputs the weighted sum of the encoder internal states. The decoder predicts the prefixed target language sentence one word at a time from the beginning. At each step, it takes as input the output of the attention layer, its internal state at the previous step, and the word output as the prediction at the previous step.

When translation is executed, a tag indicating the direction in which the words of the target language sentence are predicted, either #L2R (left-to-right) or #R2L (right-to-left), is first predicted as the prefix for the input source language sentence, based on the conversion model 240. If the prefix is #L2R, the target language sentence is then generated left-to-right after the prefix; if the prefix is #R2L, it is generated right-to-left. Because beam search ultimately selects the extended target language sentence candidate with the highest probability, an appropriate decoding direction is selected for each input sentence.

The shaping unit 232 takes the prefixed target language sentence output by the conversion unit 230, applies to it the processing defined for its prefix, and outputs the resulting target language sentence to the output unit 250. In this embodiment, for example, if the output prefix is #R2L, the target language sentence is reversed; if the prefix is #L2R, the target language sentence is output as is, without processing.
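The selection of the best candidate and the shaping step can be sketched as follows. This is an illustrative sketch under stated assumptions: the candidate list stands in for the output of beam search, the log-probability scores in the usage note are made up, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

L2R = "#L2R"
R2L = "#R2L"


def select_best(candidates: Sequence[Tuple[List[str], float]]) -> List[str]:
    """Pick the extended target sentence with the highest log-probability."""
    best_tokens, _ = max(candidates, key=lambda item: item[1])
    return best_tokens


def shape_output(extended_tokens: List[str]) -> List[str]:
    """Strip the direction tag and undo the reversal for #R2L output."""
    if not extended_tokens:
        raise ValueError("extended target sentence is empty")
    prefix, body = extended_tokens[0], extended_tokens[1:]
    if prefix == R2L:
        # The sentence was generated right-to-left, so restore word order.
        return list(reversed(body))
    if prefix == L2R:
        return body
    raise ValueError(f"unexpected prefix: {prefix!r}")
```

Given two hypothetical candidates such as `(["#L2R", "I", "like", "Kyoto"], -2.3)` and `(["#R2L", "Kyoto", "like", "I"], -1.7)`, `select_best` returns the second, and `shape_output` restores it to normal word order with the tag removed.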

<Operation of the Conversion Device According to the First Embodiment of the Present Invention>

Next, the operation of the conversion device 200 according to the first embodiment of the present invention is described. When the input unit 210 receives a source language sentence to be translated, the conversion device 200 executes the conversion processing routine shown in FIG. 6.

First, in step S200, the source language sentence received by the input unit 210 is translated, using the conversion model 240, into a target language sentence with a prefix at its head.

Next, in step S202, the processing defined for the prefix is applied to the prefixed target language sentence output by the conversion unit 230, the resulting target language sentence is output to the output unit 250, and the processing ends.

As described above, the conversion device according to the first embodiment converts a source language sentence into a target language sentence with a prefix at its head by using a pre-trained conversion model that translates a source language sentence into a target language sentence prefixed with information representing a feature of the source/target sentence pair. It can thereby simultaneously predict the feature of the pair and generate the target language sentence from the source language sentence.

<Configuration of the Model Learning Device According to the Second Embodiment of the Present Invention>

The configuration of the model learning device according to the second embodiment of the present invention is described. The second embodiment takes as an example the case where prefix constraint prediction and prefix constraint specification are applied to domain adaptation. Parts identical to those of the first embodiment are given the same reference numerals, and their description is omitted.

As shown in FIG. 7, the model learning device 300 according to the second embodiment of the present invention can be configured as a computer including a CPU, a RAM, and a ROM that stores various data and a program for executing the model learning processing routine described later. Functionally, the model learning device 300 includes an input unit 310 and a calculation unit 320, as shown in FIG. 7.

The input unit 310 receives parallel translation data in which source language sentences and target language sentences are paired.

The calculation unit 320 includes a source language sentence extraction unit 30, a target language sentence extraction unit 32, a prefix creation unit 334, a sentence creation unit 336, a conversion model learning unit 338, and a conversion model 340.

The prefix creation unit 334 creates, for each pair of a source language sentence and a target language sentence in the parallel translation data received by the input unit 310, a string of one or more symbols as a prefix, the string being information that represents a feature of that pair. In this embodiment, the prefix represents a feature related to domain adaptation. A domain is the field to which the parallel translation data belongs, such as news, travel conversation, or Wikipedia. A tag representing the domain is created as the prefix from the domain information attached to the parallel translation data or from the name of its database. For example, sharp-marked tags are created as prefixes: #IWSLT for travel articles, #KFTT for Wikipedia articles about Kyoto, and #REUTERS for Reuters news articles.

The sentence creation unit 336 adds the prefix to the head of the target language sentence, based on the prefix created by the prefix creation unit 334 and the target language sentence extracted by the target language sentence extraction unit 32. If the tag described in the first embodiment, which indicates the direction in which the words of the target language sentence are predicted, is also added, the target language sentence is additionally reversed as the processing defined for that prefix.
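Creating a domain tag from a database name and prepending it to the target sentence can be sketched as follows. The lookup table, its lowercase keys, and the function names are hypothetical; only the tags #IWSLT, #KFTT, and #REUTERS come from the text.

```python
from typing import Dict, List

# Hypothetical mapping from corpus (database) name to domain tag.
DOMAIN_TAGS: Dict[str, str] = {
    "iwslt": "#IWSLT",
    "kftt": "#KFTT",
    "reuters": "#REUTERS",
}


def domain_prefix(corpus_name: str) -> str:
    """Look up the domain tag for a corpus name (case-insensitive)."""
    key = corpus_name.strip().lower()
    if key not in DOMAIN_TAGS:
        raise KeyError(f"no domain tag registered for corpus {corpus_name!r}")
    return DOMAIN_TAGS[key]


def add_domain_prefix(target_tokens: List[str], corpus_name: str) -> List[str]:
    """Prepend the domain tag to the head of the target sentence."""
    return [domain_prefix(corpus_name)] + list(target_tokens)
```

With this sketch, `add_domain_prefix(["Kyoto", "is", "old"], "KFTT")` produces a target sequence that begins with `#KFTT`, which is the form the conversion model would be trained on.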

Adding the prefix to the head of the target language sentence corresponding to a source language sentence gives the following.

(example not reproduced in this text)

The conversion model learning unit 338 learns the conversion model 340, which translates a source language sentence into a target language sentence with a prefix at its head, based on the source language sentences extracted by the source language sentence extraction unit 30 and the target language sentences to which the sentence creation unit 336 added a prefix. In this embodiment, the conversion model 340 is learned using target language sentences prefixed with a tag representing the domain.

The other components of the second embodiment are the same as in the first embodiment, so a detailed description is omitted.

The operation of the second embodiment is the same as that of the first embodiment, except that it does not use target language sentences obtained by applying the processing defined for a prefix, so its description is omitted.

As described above, the model learning device according to the second embodiment learns a conversion model for converting a source language sentence into a target language sentence with a prefix at its head, based on source language sentences and on target language sentences prefixed with information that represents a feature of the source/target sentence pair. It can thereby learn a model that, from a source language sentence, simultaneously predicts the prefix representing the feature of the pair and generates the target language sentence.

<Configuration of the Conversion Device According to the Second Embodiment of the Present Invention>

Next, the configuration of the conversion device according to the second embodiment of the present invention is described. As shown in FIG. 8, the conversion device 400 according to the second embodiment can be configured as a computer including a CPU, a RAM, and a ROM that stores various data and a program for executing the conversion processing routine described later. Functionally, the conversion device 400 includes an input unit 410, a calculation unit 420, and an output unit 450, as shown in FIG. 8.

入力部４１０は、接頭辞を予測する場合には翻訳対象の原言語文を、接頭辞を指定する場合には翻訳対象の原言語文と、原言語文と目的言語文との組に関する特徴を表す情報である接頭辞とを受け付ける。 The input unit 410 has features relating to a source language sentence to be translated when a prefix is predicted, a source language sentence to be translated when a prefix is designated, and a combination of the source language sentence and a target language sentence. It accepts a prefix that is information to represent.

演算部４２０は、変換部４３０と、整形部４３２と、変換モデル４４０とを含んで構成されている。 The calculation unit 420 includes a conversion unit 430, a shaping unit 432, and a conversion model 440.

The conversion model 440 holds the parameters of the trained neural network, learned by the conversion model learning device 300 described above, for translating a source-language sentence into a target-language sentence with a prefix added to its head.

The conversion unit 430 takes as input the source-language sentence and the prefix received by the input unit 410 and, using the conversion model 440, translates the source-language sentence into a target-language sentence with the prefix added to its head. The configuration of the conversion unit 430 is the same as that of the conversion unit 230 of the first embodiment.

When translation is executed, the conversion model 440 first predicts a domain tag as the prefix for the input source-language sentence, and the target-language sentence is generated after the prefix. When the domain of the input sentence to be translated is known in advance, the domain can also be specified by giving the input sentence and a domain tag as input and performing prefix-constrained decoding. In this case, if the predicted word differs from the input prefix, the input prefix is adopted and the target-language sentence is generated after it. This handles the case where the expected prefix is not output, since prediction errors are unavoidable. Specifying a domain tag also has the effect of making the decoder generate vocabulary and style that reflect the specified domain (for example, written language such as #KFTT) and that differ from the vocabulary and style of the domain predicted from the input sentence (for example, spoken language such as #IWSLT).

For example, as shown in FIG. 9, at the input of a given decoder step, the word output as the prediction in the previous step (#IWSLT in FIG. 9) may be a prefix and may differ from the input prefix (#KFTT in FIG. 9). In that case, the input prefix is used in place of the word predicted in the previous step. This makes it possible to generate a target-language sentence that corresponds to the input prefix.
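The override described above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: it uses greedy decoding instead of beam search, and `constrained_greedy_decode`, `toy_step`, the `</s>` end symbol, and the toy output words are all hypothetical names and data introduced only for illustration. The function `step` stands in for the trained decoder.

```python
from typing import Callable, List, Optional, Sequence


def constrained_greedy_decode(
    step: Callable[[Sequence[str]], str],
    forced_prefix: Optional[Sequence[str]] = None,
    eos: str = "</s>",
    max_len: int = 50,
) -> List[str]:
    """Greedy decoding in which a user-supplied prefix overrides predictions.

    `step` is a stand-in for the decoder: it maps the symbols fed so far
    to the next predicted symbol.
    """
    forced = list(forced_prefix or [])
    history: List[str] = []
    for t in range(max_len):
        predicted = step(history)
        # Inside the prefix, the specified symbol replaces the prediction,
        # so the following steps are conditioned on the specified prefix.
        symbol = forced[t] if t < len(forced) else predicted
        history.append(symbol)
        if symbol == eos:
            break
    return history


def toy_step(history: Sequence[str]) -> str:
    """Toy decoder (assumption): predicts #IWSLT first, then a style per tag."""
    if not history:
        return "#IWSLT"
    styles = {
        "#IWSLT": ["thanks", "</s>"],
        "#KFTT": ["thank", "you", "</s>"],
    }
    return styles[history[0]][len(history) - 1]


predicted_output = constrained_greedy_decode(toy_step)
specified_output = constrained_greedy_decode(toy_step, forced_prefix=["#KFTT"])
```

With no prefix given, the toy decoder's own tag prediction is kept. With `forced_prefix=["#KFTT"]`, the predicted `#IWSLT` is replaced before the next step, so the remaining words follow the specified tag.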

Then, the target-language sentence candidate with the highest probability is output by beam search.

Based on the prefixed target-language sentence output by the conversion unit 430, the shaping unit 432 applies the processing defined for that prefix to the target-language sentence and outputs the resulting target-language sentence to the output unit 450. In the present embodiment, this processing may consist of removing the prefix that indicates the domain tag.
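As a minimal sketch of the prefix removal mentioned above, the following Python function separates a leading tag from the generated words. The function name `strip_domain_tag` and the convention that a tag is recognized by a leading `#` are assumptions made for this illustration.

```python
from typing import List, Optional, Tuple


def strip_domain_tag(
    tokens: List[str], tag_marker: str = "#"
) -> Tuple[Optional[str], List[str]]:
    """Split a decoded token list into (tag, sentence tokens).

    Assumes the tag, if present, is the first token and starts with
    `tag_marker`; otherwise the tokens are returned unchanged.
    """
    if tokens and tokens[0].startswith(tag_marker):
        return tokens[0], tokens[1:]
    return None, list(tokens)


tag, sentence = strip_domain_tag(["#KFTT", "thank", "you"])
```

Returning the tag alongside the sentence keeps the predicted feature available to the caller even after it is removed from the output text.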

<Operation of the Conversion Device According to the Second Embodiment of the Present Invention>

Next, the operation of the conversion device 400 according to the second embodiment of the present invention will be described. When the input unit 410 receives the source-language sentence to be translated and a prefix, which is information representing a feature of the pair of the source-language sentence and the target-language sentence, the conversion device 400 executes the conversion processing routine shown in FIG. 10. The user can choose whether or not to input a prefix indicating a domain tag.

First, in step S400, the source-language sentence and the prefix received by the input unit 410 are taken as input, and the conversion model 440 is used to translate the source-language sentence into a target-language sentence with the prefix added to its head.

Next, in step S402, based on the prefixed target-language sentence output by the conversion unit 430, the processing defined for the prefix is applied to the target-language sentence, the resulting target-language sentence is output to the output unit 450, and the routine ends.

As described above, the conversion device according to the second embodiment takes as input a source-language sentence and a prefix, which is information representing a feature of the pair of the source-language sentence and the target-language sentence, and converts the source-language sentence into a prefixed target-language sentence using a conversion model trained in advance to perform this translation. When no prefix is specified, the device can therefore simultaneously predict the feature of the pair and generate the target-language sentence from the source-language sentence. When a prefix is specified, it can generate a target-language sentence that corresponds to the specified prefix.

<Configuration of the Model Learning Device According to the Third Embodiment of the Present Invention>

The configuration of the model learning device according to the third embodiment of the present invention will be described. The third embodiment is described using, as an example, the case where the techniques of predicting and specifying prefix constraints are applied to the generation of target-language unaligned words. Parts that are the same as in the first and second embodiments are given the same reference numerals, and their description is omitted.

As shown in FIG. 11, the model learning device 500 according to the third embodiment of the present invention can be implemented by a computer including a CPU, a RAM, and a ROM that stores various data and a program for executing the model learning processing routine described later. Functionally, as shown in FIG. 11, the model learning device 500 includes an input unit 510 and a calculation unit 520.

The input unit 510 receives parallel data in which source-language sentences and target-language sentences are paired.

The calculation unit 520 includes a source-language sentence extraction unit 30, a target-language sentence extraction unit 32, a prefix creation unit 534, a sentence creation unit 536, a conversion model learning unit 538, and a conversion model 540.

For each pair of a source-language sentence and a target-language sentence in the parallel data received by the input unit 510, the prefix creation unit 534 creates a prefix consisting of one or more symbols that expresses information representing a feature of the pair. In the present embodiment, a feature relating to target-language unaligned words is created as the prefix. Two examples of such prefixes are LEX and COUNT, and the prefix for each pair is created according to these examples. With LEX, the sequence of target-language unaligned words is added as the prefix. With COUNT, the number of target-language unaligned words is added as the prefix.

The specific configuration of the prefix creation unit 534 is shown in FIG. 12.

The prefix creation unit 534 includes a word alignment unit 550, a word translation probability calculation unit 552, a target-language unaligned word candidate list creation unit 554, a target-language unaligned word extraction unit 556, and a target-language unaligned word prefix creation unit 558.

The word alignment unit 550 obtains the word alignment for each pair of a source-language sentence and a target-language sentence in the parallel data.

The word translation probability calculation unit 552 calculates the word translation probabilities between words of the source-language sentences and words of the target-language sentences from the word alignments obtained by the word alignment unit 550.

From the word translation probabilities obtained by the word translation probability calculation unit 552, the target-language unaligned word candidate list creation unit 554 creates a candidate list of target-language unaligned words, that is, words of the target-language sentence that have no corresponding word in the source-language sentence. For example, the list of the top n words in descending order of the score of equation (3) is used as the candidate list.

For each pair, the target-language unaligned word extraction unit 556 obtains the target-language unaligned words based on the word alignment obtained by the word alignment unit 550 and the candidate list created by the target-language unaligned word candidate list creation unit 554.

For each pair, the target-language unaligned word prefix creation unit 558 creates a prefix consisting of a sequence of one or more symbols from the target-language unaligned words extracted by the target-language unaligned word extraction unit 556. Here, the length of a prefix is the number of symbols it contains (for example, "#we" and "#you" each count as one symbol). In the present embodiment, the length is variable.

Through the processing of the units described above, the prefix creation unit 534 creates a prefix for each pair of a source-language sentence and a target-language sentence in the parallel data.

For each pair of a source-language sentence and a target-language sentence in the parallel data, the sentence creation unit 536 adds the prefix created by the prefix creation unit 534 to the head of the target-language sentence extracted by the target-language sentence extraction unit 32.

In the following example, target-language unaligned words are underlined in the target-language sentence to which a LEX prefix has been added. Each target-language unaligned word in the prefix is preceded by a sharp sign (#) to distinguish it from the target-language vocabulary. Any symbol may be used as long as it uniquely identifies an element of the candidate list of target-language unaligned words. Because the prefix has variable length, the symbol "#GO" is additionally inserted to separate the prefix from the target-language sentence. Any symbol may be used as this separator as long as it does not overlap with the target-language vocabulary or the vocabulary of the variable-length prefix (in the present embodiment, the elements of the candidate list of target-language unaligned words).

With a COUNT prefix, the number of target-language unaligned words is added to the head of the target-language sentence as the prefix. In the following example, the number of target-language unaligned words is enclosed in "[" and "]" to distinguish the prefix from the target-language sentence. The symbol does not have to contain a digit; any symbol may be used as long as it uniquely identifies the number of target-language unaligned words.
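The two prefix formats can be sketched in Python as follows. This is an illustrative sketch under assumptions, not the embodiment's code: the function names, the toy sentence, the toy alignment, and the toy candidate list are all hypothetical, and whether repeated words are kept in the LEX sequence is not stated in the text (this sketch keeps them in order of occurrence).

```python
from typing import List, Sequence, Set


def unaligned_target_words(
    target: Sequence[str],
    aligned_positions: Set[int],
    candidates: Set[str],
) -> List[str]:
    """Target words with no aligned source word that are in the candidate list."""
    return [
        word
        for j, word in enumerate(target)
        if j not in aligned_positions and word in candidates
    ]


def lex_prefixed(
    target: Sequence[str], unaligned: Sequence[str], separator: str = "#GO"
) -> List[str]:
    """LEX: '#'-marked unaligned words, then the separator, then the sentence."""
    return ["#" + word for word in unaligned] + [separator] + list(target)


def count_prefixed(target: Sequence[str], unaligned: Sequence[str]) -> List[str]:
    """COUNT: the number of unaligned words in brackets, then the sentence."""
    return ["[" + str(len(unaligned)) + "]"] + list(target)


# Toy data (assumption): only "i" has no aligned source word.
target = ["i", "will", "take", "it"]
unaligned = unaligned_target_words(
    target, aligned_positions={1, 2, 3}, candidates={"i", "you", "it", "the"}
)
lex_sentence = lex_prefixed(target, unaligned)
count_sentence = count_prefixed(target, unaligned)
```

The LEX form is variable in length and therefore needs the `#GO` separator, while the COUNT form is always a single symbol.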

Based on the source-language sentence extracted by the source-language sentence extraction unit 30 and the target-language sentence to whose head the sentence creation unit 536 has added the prefix, the conversion model learning unit 538 learns the conversion model 540 for translating a source-language sentence into a prefixed target-language sentence. In the present embodiment, the conversion model 540 is learned using target-language sentences to which a tag relating to target-language unaligned words has been added as the prefix.

The rest of the configuration of the third embodiment is the same as in the second embodiment, so a detailed description is omitted.

<Operation of the Conversion Model Learning Device According to the Third Embodiment of the Present Invention>

Regarding the operation of the third embodiment, the conversion model learning processing routine is the same as in the first embodiment, except that it does not use a target-language sentence obtained as the result of executing the processing defined for the prefix, so its description is omitted. For the third embodiment, the operation of the prefix creation processing routine is described in detail.

As shown in FIG. 13, in step S500, the word alignment is obtained for each pair of a source-language sentence and a target-language sentence in the parallel data.

In step S502, the word translation probabilities are calculated from the word alignments obtained for each pair in step S500.

In step S504, a candidate list of target-language unaligned words is created from the word translation probabilities obtained in step S502.

In step S506, for each pair, the target-language unaligned words are obtained based on the word alignment obtained in step S500 and the candidate list created in step S504.

In step S508, for each pair, a prefix is created from the target-language unaligned words extracted in step S506.
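The flow of steps S502 to S508 can be sketched in Python as below, under several loudly stated assumptions. The word alignments of step S500 are taken as given input (in practice they would come from a word alignment tool). The candidate score used here, the fraction of a target word's occurrences that are unaligned, is only a hypothetical stand-in and is not equation (3). All function names, the `<null>` symbol, and the romanized toy corpus are invented for illustration.

```python
from collections import Counter
from typing import List, Sequence, Set, Tuple

# One corpus entry: source tokens, target tokens, alignment links (i, j).
Pair = Tuple[List[str], List[str], Set[Tuple[int, int]]]
NULL = "<null>"  # assumed pseudo source word for unaligned target words


def translation_counts(corpus: Sequence[Pair]) -> Counter:
    """S502 (sketch): co-occurrence counts of aligned (source, target) words."""
    counts: Counter = Counter()
    for source, target, links in corpus:
        aligned = {j for _, j in links}
        for i, j in links:
            counts[(source[i], target[j])] += 1
        for j, word in enumerate(target):
            if j not in aligned:
                counts[(NULL, word)] += 1
    return counts


def candidate_list(counts: Counter, n: int) -> List[str]:
    """S504 (sketch): top-n target words by a stand-in score, not equation (3)."""
    total: Counter = Counter()
    for (_, target_word), c in counts.items():
        total[target_word] += c
    score = {t: counts[(NULL, t)] / total[t] for t in total}
    ranked = sorted(score, key=lambda t: (-score[t], t))
    return [t for t in ranked if score[t] > 0][:n]


def make_prefixes(corpus: Sequence[Pair], n: int = 10) -> List[List[str]]:
    """S506 and S508 (sketch): extract unaligned words, build LEX prefixes."""
    candidates = set(candidate_list(translation_counts(corpus), n))
    prefixes = []
    for _, target, links in corpus:
        aligned = {j for _, j in links}
        words = [
            w for j, w in enumerate(target) if j not in aligned and w in candidates
        ]
        prefixes.append(["#" + w for w in words] + ["#GO"])
    return prefixes


toy_corpus: List[Pair] = [
    (["iki", "masu"], ["i", "go"], {(0, 1)}),
    (["watashi", "wa", "iki", "masu"], ["i", "go"], {(0, 0), (2, 1)}),
    (["hon", "desu"], ["it", "is", "a", "book"], {(0, 3), (1, 1)}),
]
toy_prefixes = make_prefixes(toy_corpus, n=10)
```

Restricting the extracted words to the candidate list keeps the prefix vocabulary small and closed, which is what allows each prefix symbol to be treated as an ordinary output symbol of the decoder.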

As described above, the model learning device according to the third embodiment creates, from the parallel data, a prefix that is information representing a feature of the pair of a source-language sentence and a target-language sentence, and learns a conversion model for converting a source-language sentence into a prefixed target-language sentence based on the source-language sentence and the prefixed target-language sentence. This makes it possible to learn a single model that predicts both the prefix and the target-language sentence.

<Configuration of the Conversion Device According to the Third Embodiment of the Present Invention>

Next, the configuration of the conversion device according to the third embodiment of the present invention will be described. As shown in FIG. 14, the conversion device 600 according to the third embodiment can be implemented by a computer including a CPU, a RAM, and a ROM that stores various data and a program for executing the conversion processing routine described later. Functionally, as shown in FIG. 14, the conversion device 600 includes an input unit 610, a calculation unit 620, and an output unit 650.

When the prefix is to be predicted, the input unit 610 receives the source-language sentence to be translated. When the prefix is to be specified, it receives the source-language sentence together with a prefix, which is information representing a feature of the pair of the source-language sentence and the target-language sentence. The prefix is, for example, a feature relating to target-language unaligned words, such as "LEX" or "COUNT".

The calculation unit 620 includes a conversion unit 630, a shaping unit 632, and a conversion model 640.

The conversion model 640 holds the parameters of the trained neural network, learned by the conversion model learning device 500 described above, for translating a source-language sentence into a target-language sentence with a prefix added to its head.

The conversion unit 630 takes as input the source-language sentence received by the input unit 610 and, when a prefix is specified, the prefix, and uses the conversion model 640 to translate the source-language sentence into a target-language sentence with the prefix added to its head. The configuration of the conversion unit 630 is the same as that of the conversion unit 230 of the first embodiment.

When translation is executed, the conversion model 640 first predicts, for the input source-language sentence, a prefix expressing a feature relating to target-language unaligned words, and the target-language sentence is generated after the prefix. When a prefix is specified, if a predicted word differs from the corresponding symbol in the input prefix, the symbol in the input prefix is adopted as the input to the next step, and the target-language sentence is generated after the input prefix. Then, the target-language sentence candidate with the highest probability is output by beam search.

Based on the prefixed target-language sentence output by the conversion unit 630, the shaping unit 632 applies the processing defined for that prefix to the target-language sentence and outputs the resulting target-language sentence to the output unit 650. Specifically, in the present embodiment, a predetermined character string may be attached to the target-language sentence, according to the tag relating to target-language unaligned words, to indicate that a word with no corresponding word in the input sentence has been generated in the output sentence. Processing that removes the prefix indicating the tag relating to target-language unaligned words may also be performed.
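One possible reading of this shaping step for a LEX prefix is sketched below in Python. The text does not specify the predetermined character string or where it is attached, so the choices here are assumptions: the function name `shape_lex_output`, the `*` marker wrapped around each word, and the rule that prefix words are matched against the sentence in order.

```python
from typing import List, Sequence


def shape_lex_output(
    tokens: Sequence[str], separator: str = "#GO", mark: str = "*"
) -> List[str]:
    """Remove a LEX prefix and mark the words it announced.

    Tokens before `separator` are '#'-marked prefix symbols. Each one is
    matched, in order, against the sentence that follows the separator.
    """
    if separator not in tokens:
        return list(tokens)
    cut = list(tokens).index(separator)
    pending = [symbol[1:] for symbol in tokens[:cut]]  # drop the leading '#'
    shaped = []
    for word in tokens[cut + 1:]:
        if pending and word == pending[0]:
            shaped.append(mark + word + mark)  # word with no source counterpart
            pending.pop(0)
        else:
            shaped.append(word)
    return shaped


shaped = shape_lex_output(["#i", "#GO", "i", "will", "take", "it"])
```

Marking the words in the final output lets a user see which words the system introduced without a counterpart in the input sentence.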

The operation of the conversion device according to the third embodiment is the same as in the second embodiment, so its description is omitted.

As described above, the conversion device according to the third embodiment takes as input a source-language sentence and a prefix, which is information representing a feature of the pair of the source-language sentence and the target-language sentence, and converts the source-language sentence into a prefixed target-language sentence using a conversion model trained in advance to perform this translation. The device can therefore simultaneously predict the feature of the pair and generate the target-language sentence from the source-language sentence.

[Experimental Results]

Table 1 below shows the parallel data used in the experiments on bidirectional decoding (first embodiment) and domain adaptation (second embodiment).

The experiments on bidirectional decoding and domain adaptation used five publicly available parallel corpora: IWSLT-2005 (travel conversation), KFTT (Wikipedia articles on Kyoto), Global Voices (blog articles on social issues), Reuters (Reuters news articles), and Tatoeba (a crowd-sourced example sentence collection site). Table 1 shows the number of sentences and the average word length of each parallel corpus.

The experiment on the generation of target-language unaligned words used IWSLT-2005, a spoken-language data set that contains many Japanese zero pronouns. Because IWSLT-2005 has only about 20,000 sentences, two more spoken-language corpora were added to make the experiment more reliable. One is the 「大音泉日英対訳データベース」 (a Japanese-English parallel database), a collection of everyday conversation phrases sold by Straightword Inc., with 50,709 sentences (431,258 English words, 471,677 Japanese words). The other is Japanese-English parallel data for speech translation developed by Harbin Institute of Technology for the Beijing Olympics, with 62,727 sentences (635,809 English words, 796,200 Japanese words). The combination of IWSLT-2005, the 大音泉 data, and the Beijing Olympics data is referred to here as IWSLT-2005+EXTRA.

As preprocessing for translation, Japanese text was morphologically analyzed using the morphological analyzer MeCab with the UniDic dictionary.

English text was processed with the tokenization software (tokenize.perl) and the lowercasing software (lowercase.perl) distributed with the statistical machine translation software moses.

For neural machine translation, seq2seq-attn, an open-source translation tool that implements an attention-based encoder-decoder (see Non-Patent Document 5), was used. Translation accuracy was evaluated with BLEU, the most standard automatic evaluation metric for translation.

[Experimental Results of the First Embodiment]

Regarding bidirectional decoding, Table 2 shows the BLEU translation accuracy on three parallel corpora (IWSLT, KFTT, and REUTERS) for four settings: translation in the forward direction (left to right), translation in the backward direction (right to left), translation with the conventional target-language bidirectional method (see Non-Patent Document 4), and translation with the proposed method, which predicts the decoding direction using prefix constraints.

Compared with forward and backward decoding, which decode in only one direction, the proposed method, by predicting the decoding direction, improves translation accuracy to a level equal to or higher than that of the conventional target-language bidirectional method. Compared with the conventional method, the proposed method described in the first embodiment has the advantages that it needs only one conversion model and that it can use the attention-based encoder-decoder model as is.

[Experimental Results of the Second Embodiment]

Regarding domain adaptation, translation accuracy was evaluated for the following four cases on five parallel corpora from different domains.

(1) Single: A conversion model was created using each parallel corpus alone, and translation accuracy was evaluated on the test sentences of the same parallel corpus.
(2) All: The five parallel corpora were simply merged into one to create a conversion model, and translation accuracy was evaluated on the test sentences of each parallel corpus.
(3) Domain prediction: For each parallel corpus, the corpus name was attached as a domain tag, the tagged corpora were merged into one to create a conversion model, and translation accuracy was evaluated on the test sentences of each parallel corpus (prefix constraint prediction).
(4) Domain specification: The conversion model and the test sentences are the same as in domain prediction. During decoding, the correct domain (parallel corpus name) was given and translation was performed (prefix-constrained decoding).
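The training data for case (3) can be sketched in Python as follows: each target sentence receives its corpus name as a domain tag before the corpora are merged. The function name `tag_corpora`, the `#` tag format applied to arbitrary corpus names, and the placeholder sentence pairs are assumptions for illustration only, not the experimental data.

```python
from typing import Dict, List, Tuple

SentencePair = Tuple[str, str]  # (source sentence, target sentence)


def tag_corpora(corpora: Dict[str, List[SentencePair]]) -> List[SentencePair]:
    """Merge corpora, prefixing each target sentence with '#<corpus name>'."""
    tagged: List[SentencePair] = []
    for name, pairs in corpora.items():
        tag = "#" + name
        for source, target in pairs:
            # The source side is left unchanged; only the target gets the tag.
            tagged.append((source, tag + " " + target))
    return tagged


# Placeholder sentence pairs (assumption), not taken from the corpora.
merged = tag_corpora(
    {
        "IWSLT": [("src sentence 1", "tgt sentence 1")],
        "KFTT": [("src sentence 2", "tgt sentence 2")],
    }
)
```

Because the tag is placed on the target side, the model learns to predict it from the source sentence. In case (4), the same model is used and the tag is supplied at decoding time instead.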

The experimental results are shown in Table 3.

Comparing "Single" with "All", translation accuracy decreases for KFTT, which has the largest amount of parallel data, and improves for the other four parallel corpora. By contrast, when "Domain prediction", that is, prefix constraint prediction, is applied, translation accuracy improves for all parallel corpora.

Comparing "Domain prediction", which predicts the domain tag from the source-language sentence, with "Domain specification", which supplies the correct domain tag externally, domain specification gives slightly higher translation accuracy, but there is almost no difference. This shows that predicting the domain tag from the source-language sentence is an easy problem for the neural network and that translation accuracy improves because the domain tag is predicted correctly.

In domain adaptation using side constraints, which is the conventional method, a separate means of predicting the side constraint (that is, the domain tag) from the source-language sentence must be prepared. The proposed method has the advantage that the prediction of the prefix (domain tag) and the generation of the target-language sentence are performed simultaneously within the attention-based encoder-decoder model.

Table 4 shows examples in which the translation result changes depending on the domain tag.

In a conventional attention-based encoder-decoder, the top candidates obtained by beam search differ very little from one another. By contrast, when a prefix constraint is specified externally and prefix-constrained decoding is performed, substantially different translation results are obtained.

[Experimental Results of the Third Embodiment]

Regarding the translation of target-language unaligned words, Table 5 shows the candidate list of the top 50 target-language unaligned words obtained from the score of equation (3) for Japanese-to-English translation on IWSLT-2005.

The list shows that the following are extracted automatically: English pronouns corresponding to zero pronouns, such as i, you, and it; articles such as a and the; light verbs such as take, get, and make (verbs that carry little meaning of their own, like the Japanese "する"); and expletives such as do and does.

When the prefix constraint is predicted, using COUNT as the prefix with the top-10 candidate list improves translation accuracy by about 1 point over the baseline (no prefix). This shows that translation accuracy can be improved by predicting the prefix constraint and generating the target language sentence simultaneously. When the prefix constraint is given externally, using COUNT as the prefix with the top-10 candidate list improves translation accuracy by about 3 points over the baseline, and using LEX as the prefix with the top-50 candidate list improves it by 10 points or more. This shows that translation accuracy can be improved substantially when the user supplies the prefix constraint externally.

As described above, in the method according to the embodiments of the present invention, a feature relating to the pair of a source language sentence and a target language sentence is expressed as a symbol string, and this symbol string is added to the target language sentence as a prefix. A conversion model is learned from pairs of a source language sentence and a prefixed target language sentence, and translation accuracy is improved by simultaneously predicting the prefix and generating the target language sentence for an input source language sentence. The method also allows the user to specify the prefix externally, in which case a target language sentence corresponding to the specified prefix (feature) is generated. Prediction and specification of prefix constraints can be used as a general framework in neural machine translation, both for improving translation accuracy by explicitly predicting arbitrary features relating to the pair of the source and target language sentences and for letting the user control the features of the output target language sentence.
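The construction of a prefixed target language sentence for training, and the separation of prefix and sentence after conversion, can be sketched as follows. This is an illustrative sketch only: the function names `add_prefix` and `split_prefix`, the separator symbol `<sep>`, and the example tag `<news>` are assumptions introduced here. The claims state only that the prefix and the output sentence are separated by an identifier.

```python
from typing import List, Sequence, Tuple

SEPARATOR = "<sep>"  # hypothetical identifier separating prefix and sentence


def add_prefix(target: Sequence[str], prefix: Sequence[str]) -> List[str]:
    """Build the training-side output: prefix symbols, separator, target words."""
    return list(prefix) + [SEPARATOR] + list(target)


def split_prefix(output: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Split a converted output back into (prefix symbols, sentence words)."""
    tokens = list(output)
    if SEPARATOR not in tokens:
        return [], tokens  # no separator found: treat everything as the sentence
    i = tokens.index(SEPARATOR)
    return tokens[:i], tokens[i + 1:]


source = ["こんにちは", "世界"]
training_pair = (source, add_prefix(["hello", "world"], ["<news>"]))
print(training_pair)
print(split_prefix(training_pair[1]))
```

Since the prefix is ordinary output vocabulary, the conversion model itself needs no structural change; only the training data and the post-processing differ.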

Experiments also demonstrated implementations of bidirectional decoding, domain adaptation, and generation of target-language unaligned words in the respective embodiments of the present invention.

For bidirectional decoding, tags representing the decoding directions "left to right" and "right to left" are added as prefixes and serve as constraints on generation of the target language sentence. This realizes bidirectional decoding and improves translation accuracy without any change to the baseline neural machine translation method.
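The use of direction tags can be sketched as follows. The sketch assumes that each sentence pair contributes one training example per direction and that a right-to-left output is restored simply by reversing its words. The tag names `<l2r>` and `<r2l>` and the function names are hypothetical and do not appear in the specification.

```python
from typing import List, Sequence, Tuple

L2R, R2L = "<l2r>", "<r2l>"  # hypothetical direction tags used as prefixes


def make_bidirectional_examples(
    source: Sequence[str], target: Sequence[str]
) -> List[Tuple[List[str], List[str]]]:
    """Create one left-to-right and one right-to-left example from a sentence pair."""
    return [
        (list(source), [L2R] + list(target)),
        (list(source), [R2L] + list(reversed(target))),  # target in reverse order
    ]


def shape(output: Sequence[str]) -> List[str]:
    """Apply the processing determined by the prefix: restore normal word order."""
    tag, body = output[0], list(output[1:])
    if tag == R2L:
        return list(reversed(body))
    if tag == L2R:
        return body
    raise ValueError(f"unknown direction tag: {tag}")


for src, tgt in make_bidirectional_examples(["a", "b"], ["x", "y", "z"]):
    print(src, tgt, "->", shape(tgt))
```

Under these assumptions the same model produces either direction, and the choice is expressed purely through the first output symbol.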

For domain adaptation, a domain tag is added as a prefix and serves as a constraint on generation of the target language sentence. The domain to which the source language sentence belongs is thereby predicted at the same time as the target language sentence is generated, so translation accuracy can be improved without providing a separate means of identifying that domain.

For generation of target-language unaligned words, either the list of surface forms of the unaligned words or the number of unaligned words is added as a prefix and serves as a constraint on generation of the target language sentence. This provides a means of predicting the unaligned words, or information related to them, from the source language sentence alone, and also improves translation accuracy.
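The two prefix types can be sketched as follows. This is a simplified illustration: it assumes that a target word is treated as unaligned whenever it appears in the candidate list, that LEX lists the matching words in sentence order, and that COUNT is written as a single symbol of the form `<count:N>`. The function names, the symbol format, and the four-word candidate list are hypothetical, and the scoring of candidates by formula (3) is not reproduced here.

```python
from typing import List, Sequence, Set


def unaligned_words(target: Sequence[str], candidates: Set[str]) -> List[str]:
    """Return the target words found in the candidate list, in sentence order."""
    return [w for w in target if w in candidates]


def lex_prefix(target: Sequence[str], candidates: Set[str]) -> List[str]:
    """LEX-style prefix: the surface forms of the unaligned words."""
    return unaligned_words(target, candidates)


def count_prefix(target: Sequence[str], candidates: Set[str]) -> List[str]:
    """COUNT-style prefix: a single symbol giving the number of unaligned words."""
    return [f"<count:{len(unaligned_words(target, candidates))}>"]


candidates = {"i", "a", "the", "take"}  # hypothetical candidate list
target = "i would like to take a taxi".split()
print(lex_prefix(target, candidates))
print(count_prefix(target, candidates))
```

Either prefix would then be placed at the head of the target language sentence in the training data, in the same way as a direction tag or a domain tag.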

The present invention is not limited to the embodiments described above, and various modifications and applications are possible without departing from the gist of the invention.

For example, bidirectional decoding, domain adaptation, and generation of target-language unaligned words were described separately in the embodiments above, but they may be implemented in combination. In that case, a plurality of prefixes is added to the target language sentence and the conversion model is learned from the result.

The description above dealt with learning, from parallel data consisting of pairs of a source language sentence and a target language sentence, a conversion model for translating into a target language sentence to which a prefix has been added, but the invention is not limited to this case. For example, it may be applied to a dialogue system between a user and a system: with the user utterance as the input sentence and the system utterance as the output sentence, a conversion model may be learned that converts a user utterance into a system utterance to which a prefix has been added.

10 Input unit
20, 220, 320, 420, 520, 620 Arithmetic unit
30 Source language sentence extraction unit
32 Target language sentence extraction unit
34, 334, 534 Prefix creation unit
36, 336, 536 Sentence creation unit
38, 338, 538 Conversion model learning unit
40, 240, 340, 440, 540, 640 Conversion model
230, 430, 630 Conversion unit
232, 432, 632 Shaping unit
250, 450, 650 Output unit
550 Word alignment unit
552 Word translation probability calculation unit
554 Target-language unaligned word candidate list creation unit
556 Target-language unaligned word extraction unit
558 Target-language unaligned word prefix creation unit

Claims

A model learning device comprising a model learning unit that learns, based on an input sentence and on an output sentence to whose head a string of one or more symbols, which is information representing a feature relating to the pair of the input sentence and the output sentence, has been added as a prefix, a conversion model for converting the input sentence into the output sentence with the prefix added to its head.

The model learning device according to claim 1, further comprising a prefix creation unit that creates the prefix for the pair of the input sentence and the output sentence,
wherein the model learning unit learns the conversion model based on the input sentence and on the output sentence to whose head the prefix created by the prefix creation unit has been added.

A model learning device comprising:
a sentence creation unit that, based on an output sentence and on a prefix consisting of a string of one or more symbols, which is information representing a feature relating to the pair of an input sentence and the output sentence, adds the prefix to the head of a processing result obtained by applying to the output sentence a process determined according to the prefix; and
a model learning unit that learns, based on the input sentence and on the processing result of the output sentence to whose head the prefix has been added by the sentence creation unit, a conversion model for converting the input sentence into the processing result of the output sentence with the prefix added to its head.

The model learning device according to any one of claims 1 to 3, wherein, in learning the conversion model, the conversion is performed using:
an encoder that converts the word sequence of the input sentence into a sequence of internal states;
an attention layer that calculates a weight for each word of the input sentence and outputs a weighted sum of the internal states of the encoder corresponding to the respective words; and
a decoder that predicts the output sentence with the prefix added to its head one word at a time from the head, the decoder taking as input, at each step in which it predicts a word, the output of the attention layer, the internal state of the decoder at the previous step, and the word output as the prediction at the previous step.

The model learning device according to any one of claims 1 to 4, wherein the prefix includes one or more pieces of information representing features relating to the pair of the input sentence and the output sentence, and different prefixes include information representing different features.

The model learning device according to any one of claims 1 to 5, wherein, in the output sentence to which the prefix has been added, the prefix and the output sentence are separated by an identifier.

The model learning device according to any one of claims 1 to 6, wherein the input sentence is a source language sentence and the output sentence is a target language sentence, and
the conversion model is for converting the source language sentence into the target language sentence with the prefix added to its head.

A conversion device comprising a conversion unit that converts an input sentence into an output sentence with a prefix added to its head, using a conversion model learned in advance that converts an input sentence into the output sentence to whose head a string of one or more symbols, which is information representing a feature relating to the pair of the input sentence and the output sentence, has been added as the prefix,
wherein the conversion unit comprises:
an encoder that converts the word sequence of the input sentence into a sequence of internal states;
an attention layer that calculates a weight for each word of the input sentence and outputs a weighted sum of the internal states of the encoder corresponding to the respective words; and
a decoder that predicts the output sentence with the prefix added to its head one word at a time from the head, the decoder taking as input, at each step in which it predicts a word, the output of the attention layer, the internal state of the decoder at the previous step, and the word output as the prediction at the previous step.

A conversion device comprising a conversion unit that receives as input an input sentence and a prefix consisting of one or more symbols, which is information representing a feature relating to the pair of the input sentence and an output sentence, and that converts the input sentence into an output sentence with the prefix added to its head, using a conversion model learned in advance that converts an input sentence into the output sentence with the prefix added to its head,
wherein the conversion unit comprises:
an encoder that converts the word sequence of the input sentence into a sequence of internal states;
an attention layer that calculates a weight for each word of the input sentence and outputs a weighted sum of the internal states of the encoder corresponding to the respective words; and
a decoder that predicts the output sentence with the prefix added to its head one word at a time from the head, the decoder taking as input, at each step in which it predicts a word, the output of the attention layer, the internal state of the decoder at the previous step, and the word output as the prediction at the previous step, and
wherein, when the word output as the prediction at the previous step differs from the corresponding symbol of the input prefix, the corresponding symbol of the input prefix is used in place of the word output as the prediction at the previous step.

The conversion device according to claim 8 or claim 9, further comprising a shaping unit that applies, to the output sentence output by the conversion unit, a process determined according to the prefix output by the conversion unit.

The conversion device according to any one of claims 8 to 10, wherein the conversion unit converts the input sentence into an output sentence to whose head a prefix consisting of one or more symbols has been added, such that the prefix and the output sentence are separated by an identifier.

The conversion device according to any one of claims 8 to 11, wherein the prefix is at least one of: the direction of the order in which the decoder predicts the words of the output sentence; the domain to which the output sentence belongs; information on a feature relating to unaligned words, which are words of the output sentence that have no corresponding word in the input sentence; a string of the surface forms of the unaligned words; and the number of the unaligned words.

The conversion device according to any one of claims 8 to 12, wherein the input sentence is a source language sentence and the output sentence is a target language sentence,
the conversion model is for converting the source language sentence into the target language sentence with the prefix added to its head, and
the conversion unit converts the source language sentence into the target language sentence with the prefix added to its head.

A model learning method comprising a step in which a model learning unit learns, based on an input sentence and on an output sentence to whose head a string of one or more symbols, which is information representing a feature relating to the pair of the input sentence and the output sentence, has been added as a prefix, a conversion model for converting the input sentence into the output sentence with the prefix added to its head.

A model learning method comprising:
a step in which a sentence creation unit, based on an output sentence and on one or more prefixes that are information representing a feature relating to the pair of an input sentence and the output sentence, adds the prefix to the head of a processing result obtained by applying to the output sentence a process determined according to the prefix; and
a step in which a model learning unit learns, based on the input sentence and on the processing result of the output sentence to whose head the prefix has been added by the sentence creation unit, a conversion model for converting the input sentence into the processing result of the output sentence with the prefix added to its head.

A conversion method comprising a step in which a conversion unit converts an input sentence into an output sentence with a prefix added to its head, using a conversion model learned in advance that converts an input sentence into the output sentence to whose head a string of one or more symbols, which is information representing a feature relating to the pair of the input sentence and the output sentence, has been added as the prefix,
wherein the conversion unit comprises:
an encoder that converts the word sequence of the input sentence into a sequence of internal states;
an attention layer that calculates a weight for each word of the input sentence and outputs a weighted sum of the internal states of the encoder corresponding to the respective words; and
a decoder that predicts the output sentence with the prefix added to its head one word at a time from the head, the decoder taking as input, at each step in which it predicts a word, the output of the attention layer, the internal state of the decoder at the previous step, and the word output as the prediction at the previous step.

A conversion method comprising a step in which a conversion unit receives as input an input sentence and a prefix consisting of one or more symbols, which is information representing a feature relating to the pair of the input sentence and an output sentence, and converts the input sentence into an output sentence with the prefix added to its head, using a conversion model learned in advance that converts an input sentence into the output sentence with the prefix added to its head,
wherein the conversion unit comprises:
an encoder that converts the word sequence of the input sentence into a sequence of internal states;
an attention layer that calculates a weight for each word of the input sentence and outputs a weighted sum of the internal states of the encoder corresponding to the respective words; and
a decoder that predicts the output sentence with the prefix added to its head one word at a time from the head, the decoder taking as input, at each step in which it predicts a word, the output of the attention layer, the internal state of the decoder at the previous step, and the word output as the prediction at the previous step, and
wherein, when the word output as the prediction at the previous step differs from the corresponding symbol of the input prefix, the corresponding symbol of the input prefix is used in place of the word output as the prediction at the previous step.

A program for causing a computer to function as each unit of the conversion model learning device according to any one of claims 1 to 7, or of the conversion device according to any one of claims 8 to 13.