CN107273363B - Language text translation method and system - Google Patents

Language text translation method and system

Info

Publication number
CN107273363B
Authority
CN
China
Prior art keywords
translation
probability distribution
text
probability
language text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710335652.4A
Other languages
Chinese (zh)
Other versions
CN107273363A (en)
Inventor
刘洋
张嘉成
孙茂松
栾焕博
许静芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Beijing Sogou Technology Development Co Ltd
Original Assignee
Tsinghua University
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University and Beijing Sogou Technology Development Co Ltd
Priority to CN201710335652.4A
Publication of CN107273363A
Application granted
Publication of CN107273363B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G06F40/44 Statistical methods, e.g. probability models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/58 Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)

Abstract

The present invention provides a language text translation method and system. The method comprises: determining, according to a preset translation candidate set determination rule, the translation candidate set corresponding to a source language text, where the translation candidate set includes multiple translated texts of the source language text and the source language text is the language text to be translated; determining a first probability distribution and a second probability distribution based on the translation candidate set, a preset translation model, and a preset prior knowledge model, where the first probability distribution indicates the probability that each translated text conforms to the prior knowledge model and the second probability distribution indicates the probability that each translated text conforms to the translation model; and determining the translated text of the source language text from the translation candidate set based on the first probability distribution and the second probability distribution. The invention can incorporate arbitrary prior knowledge into the translation model, thereby improving the accuracy and reliability of machine translation.

Description

Language text translation method and system
Technical field
The present invention relates to the field of machine translation technology, and in particular to a language text translation method and system.
Background
As internationalization advances, communication between speakers of different languages grows daily, and translation has become a vital tool for that communication. Machine translation, being convenient, simple, and free, largely meets people's translation needs and improves the efficiency of international exchange, which in turn leads people to demand greater correctness from machine translation.
Machine translation methods fall broadly into two groups: rule-based machine translation and corpus-based machine translation. The key problem in corpus-based machine translation is building a complete corpus, that is, a set of high-quality training samples. The quality of the training samples directly affects translation accuracy. Building high-quality training samples is not easy, however: sample data is limited and cannot characterize the distribution of the original data well, and even when sample data is plentiful, erroneous samples (noisy data) cannot be avoided. A neural network trained on such samples struggles to reflect the true model accurately and may even produce output that violates prior knowledge. Introducing prior knowledge therefore becomes particularly important. A translation rule such as "do not translate anything twice, and do not leave anything untranslated" is an example of prior knowledge. Many studies have shown that incorporating prior knowledge into a neural network model as a constraint can improve the network's performance.
Attention-based neural machine translation (Attention-based NMT) is a branch of corpus-based machine translation and is the method currently used in mainstream translation systems. Its basic idea is to use a non-linear neural network to map source language text directly to target language text end to end, within an encoder-decoder framework: given a source language sentence, an encoder first maps it to a continuous, dense vector, and a decoder then converts that vector into a target language sentence. With this method, however, it is difficult to incorporate prior knowledge into the neural network.
Some existing techniques do incorporate prior knowledge into neural networks. For example, some represent prior knowledge with additional neural network modules, and others add constraint terms to the training objective. Although these techniques can improve translation quality significantly, the former also require the correlations between different pieces of prior knowledge to be modeled, and the latter can add only a small number of simple constraint terms. As a result, these techniques cannot be used to incorporate arbitrary, complex prior knowledge into a neural machine translation model.
Therefore, providing a translation method that can incorporate arbitrary prior knowledge into a neural machine translation model is a problem that urgently needs to be solved.
Summary of the invention
To solve the problem in the prior art that arbitrary prior knowledge cannot be incorporated into a neural network translation model, the present invention provides a language text translation method and system.
In one aspect, the present invention provides a language text translation method, comprising:
determining, according to a preset translation candidate set determination rule, the translation candidate set corresponding to a source language text, where the translation candidate set includes multiple translated texts of the source language text and the source language text is the language text to be translated;
determining a first probability distribution and a second probability distribution based on the translation candidate set, a preset translation model, and a preset prior knowledge model, where the first probability distribution indicates the probability that each translated text conforms to the prior knowledge model and the second probability distribution indicates the probability that each translated text conforms to the translation model;
determining the translated text of the source language text from the translation candidate set based on the first probability distribution and the second probability distribution.
In another aspect, the present invention provides a language text translation system, comprising:
a translation candidate set module, configured to determine, according to a preset translation candidate set determination rule, the translation candidate set corresponding to a source language text, where the translation candidate set includes multiple translated texts of the source language text and the source language text is the language text to be translated;
a training module, configured to determine a first probability distribution and a second probability distribution based on the translation candidate set, a preset translation model, and a preset prior knowledge model, where the first probability distribution indicates the probability that each translated text conforms to the prior knowledge model and the second probability distribution indicates the probability that each translated text conforms to the translation model;
a translation module, configured to determine the translated text of the source language text from the translation candidate set based on the first probability distribution and the second probability distribution.
The language text translation method and system provided by the present invention compute the probability distributions of the prior knowledge model and of the translation model over the translation candidate set and use the difference between the two distributions as part of the training objective, so that the machine translation model can learn arbitrary prior knowledge, improving the accuracy and reliability of machine translation results.
Brief description of the drawings
Fig. 1 is a flow diagram of the language text translation method provided by an embodiment of the present invention;
Fig. 2 is a structural diagram of the language text translation system provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly below with reference to the accompanying drawings. The described embodiments are evidently some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
Fig. 1 is a flow diagram of the language text translation method provided by an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:
Step 101: determine, according to a preset translation candidate set determination rule, the translation candidate set corresponding to a source language text, where the translation candidate set includes multiple translated texts of the source language text and the source language text is the language text to be translated;
Step 102: determine a first probability distribution and a second probability distribution based on the translation candidate set, a preset translation model, and a preset prior knowledge model, where the first probability distribution indicates the probability that each translated text conforms to the prior knowledge model and the second probability distribution indicates the probability that each translated text conforms to the translation model;
Step 103: determine the translated text of the source language text from the translation candidate set based on the first probability distribution and the second probability distribution.
Specifically, first, the preset translation candidate set determination rule reflects the fact that translation is a sequence generation task: the source language text x contains multiple characters or words, and when the translation candidate set is generated, each previously generated character or word serves as input for generating the next one. The size of the true translation candidate set grows exponentially with the length of the source language text x, so it cannot be computed efficiently. In practice, multiple translated texts of the source language text, that is, the translation candidate set S(x), are obtained by random sampling or beam search; this can be implemented with existing techniques and is not described further here.
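As a minimal sketch of how beam search could produce a candidate set S(x), the following Python example expands hypotheses token by token and keeps only the highest-scoring prefixes. The table TOY_MODEL, its probabilities, and the function name beam_search are hypothetical stand-ins for a trained translation model and are not part of the described method.

```python
import math

# Hypothetical next-token table standing in for a trained translation model.
# Keys are the prefixes generated so far; values are next-token probabilities.
TOY_MODEL = {
    (): {"Bush": 1.0},
    ("Bush",): {"held": 0.6, "had": 0.4},
    ("Bush", "held"): {"talks": 1.0},
    ("Bush", "had"): {"lunch": 1.0},
    ("Bush", "held", "talks"): {"</s>": 1.0},
    ("Bush", "had", "lunch"): {"</s>": 1.0},
}


def beam_search(model, beam_size=2, max_len=6, eos="</s>"):
    """Return finished hypotheses as (tokens, log_prob), best first."""
    beams = [((), 0.0)]
    finished = []
    for _ in range(max_len):
        expanded = []
        for prefix, score in beams:
            # Each previously generated token conditions the next one.
            for token, prob in model[prefix].items():
                expanded.append((prefix + (token,), score + math.log(prob)))
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = []
        # Keep only the beam_size best hypotheses at each step.
        for hyp, score in expanded[:beam_size]:
            if hyp[-1] == eos:
                finished.append((hyp, score))
            else:
                beams.append((hyp, score))
        if not beams:
            break
    return sorted(finished, key=lambda item: item[1], reverse=True)


candidate_set = beam_search(TOY_MODEL)
for tokens, log_prob in candidate_set:
    print(" ".join(tokens[:-1]), round(math.exp(log_prob), 2))
```

A larger beam_size yields a larger candidate set at higher cost; random sampling from the same table would be an alternative way to build S(x).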
Then, the first probability distribution $\tilde{Q}(y \mid x; \gamma)$ is determined from the translation candidate set S(x) and the preset prior knowledge model Q(y|x;γ), and the second probability distribution $\tilde{P}(y \mid x; \theta)$ is determined from S(x) and the preset translation model P(y|x;θ). Finally, the translated text y of the source language text is determined from the translation candidate set based on the first and second probability distributions.
For clarity, take the source language text x as input and the translated text y as output, so that they form a sentence pair (x, y). In practice, the same character or word carries different meanings in different contexts, and the source language text x is composed of multiple characters or words arranged in a particular order. Lexical ambiguity and uncertainty in ordering mean that one source language text may correspond to multiple translated texts (y1, y2, y3, and so on). The translated text with the highest probability among them is the best one and, to distinguish it from the others, is called the target language text.
For example, different feature functions φ(x, y) yield different preset prior knowledge models Q(y|x;γ), and the first probability distribution can be determined by the following formula:
$$\tilde{Q}(y \mid x; \gamma) = \frac{\exp\left(\gamma \cdot \phi(x, y)\right)}{\sum_{y' \in S(x)} \exp\left(\gamma \cdot \phi(x, y')\right)}$$
where x is the source language text, y is the target language text, y' is a translated text in S(x), and γ is the preset parameter of the prior knowledge model.
The feature function φ(x, y) expresses the correspondence between source language text and translated text in the prior knowledge database. Given a specific feature function, the prior knowledge model scores each translated text y1, y2, and y3, that is, it computes the probability that each translated text conforms to the prior knowledge model. The better a translated text conforms to the prior knowledge model, the higher its probability.
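The scoring described above can be sketched in Python as a normalization of exp(γ·φ) over the candidate set. This is an illustrative sketch only: the function name prior_distribution and the feature values are assumptions, and the example simply shows how γ changes how strongly the distribution favors high-scoring candidates.

```python
import math


def prior_distribution(features, gamma=1.0):
    """Normalize exp(gamma * phi) over the candidate set S(x)."""
    scores = [gamma * phi for phi in features]
    shift = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - shift) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical feature values phi(x, y) for three candidate translations.
phi = [4.0, 3.0, 2.0]
for gamma in (0.0, 1.0, 2.0):
    print(gamma, [round(q, 3) for q in prior_distribution(phi, gamma)])
```

With γ = 0 every candidate is equally likely, and larger γ concentrates probability on the candidate with the highest feature value.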
The translation model P(y|x;θ) is the scoring model commonly used in machine translation. It can be obtained by training on a parallel corpus, expresses the correspondence between source language text x and translated text y in that corpus, and is used to compute the probability that each translated text conforms to the translation model. It belongs to the prior art and is not described further here.
From the translation candidate set S(x) and the translation model P(y|x;θ), the second probability distribution can be determined by the following formula:
$$\tilde{P}(y \mid x; \theta) = \frac{P(y \mid x; \theta)^{\alpha}}{\sum_{y' \in S(x)} P(y' \mid x; \theta)^{\alpha}}$$
where x is the source language text, y is the target language text, y' is a translated text in S(x), and θ is the parameter of the translation model; α is a preset hyperparameter that controls how sharply peaked the second probability distribution is.
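The role of the hyperparameter α can be illustrated with a short Python sketch, assuming the second probability distribution is obtained by raising each translation-model probability to the power α and renormalizing over S(x). The function name sharpen and the probability values are hypothetical.

```python
def sharpen(model_probs, alpha=1.0):
    """Raise each P(y|x) to the power alpha and renormalize over S(x)."""
    powered = [p ** alpha for p in model_probs]
    total = sum(powered)
    return [p / total for p in powered]


# Hypothetical translation-model probabilities for two candidates.
p = [0.6, 0.4]
print(sharpen(p, alpha=1.0))  # unchanged, since the inputs already sum to 1
print(sharpen(p, alpha=2.0))  # more peaked on the first candidate
print(sharpen(p, alpha=0.5))  # flatter
```

Because of the renormalization, the inputs need not sum to 1; raw sequence probabilities of the candidates in S(x) can be passed in directly.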
The language text translation method provided by this embodiment uses the prior knowledge model and the translation model together to score multiple translated texts from two perspectives, encouraging translated texts that better conform to the prior knowledge model to also receive higher probability under the translation model. The target language text is then determined from the translation candidate set, improving both the performance of the translation model and the accuracy of the translation result.
On the basis of the above embodiment, determining the translated text of the source language text from the translation candidate set based on the first probability distribution and the second probability distribution comprises:
determining a probability difference parameter value based on the first probability distribution and the second probability distribution, where the probability difference parameter indicates the difference between the first probability distribution and the second probability distribution;
determining the translated text of the source language text from the translation candidate set based on the probability difference parameter value.
Specifically, first, the translation candidate set S(x) corresponding to the source language text x is determined according to the preset translation candidate set determination rule. Then, the first probability distribution and the second probability distribution are determined based on the translation candidate set, the translation model, and the prior knowledge model. Next, the probability difference parameter value between the first and second probability distributions is determined. Finally, the translated text y of the source language text x is determined from the translation candidate set S(x) based on the probability difference parameter value.
For example, after logging in to the translation system, a user enters the source language text x, "many airports are all forced to close", in the Chinese input field of the Chinese-to-English translation window. The translation candidate set S(x) determined from x contains two translated texts: y1, "Many Airports were closed to close", and y2, "Many airports were forced to close down".
From the prior knowledge model, the first probability distribution is determined, where Q(y1|x) = 0.2, that is, the probability that the sentence pair (x, y1) conforms to the prior knowledge model is 0.2, and Q(y2|x) = 0.8, that is, the probability that the sentence pair (x, y2) conforms to the prior knowledge model is 0.8.
From the translation model, the second probability distribution is determined, where P(y1|x) = 0.6, that is, the probability that the sentence pair (x, y1) conforms to the translation model is 0.6, and P(y2|x) = 0.4, that is, the probability that the sentence pair (x, y2) conforms to the translation model is 0.4.
From the first and second probability distributions, the difference parameter value between them can be determined. Based on that value, the translation model is adjusted and the two translated texts are scored again, giving P(y1|x) = 0.3 and P(y2|x) = 0.7.
Therefore, the translated text y determined for the source language text x, "many airports are all forced to close", is "Many airports were forced to close down".
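The effect of the adjustment in this example can be checked numerically. The sketch below assumes the difference parameter is the KL distance taken in the direction KL(P||Q), the notation used later in this description, and uses the probabilities quoted above; the helper name kl is hypothetical.

```python
import math


def kl(p, q):
    """KL(p || q) for two distributions over the same candidates."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


prior = [0.2, 0.8]         # Q(y1|x), Q(y2|x) from the example
model_before = [0.6, 0.4]  # P(y1|x), P(y2|x) before adjustment
model_after = [0.3, 0.7]   # P(y1|x), P(y2|x) after adjustment

kl_before = kl(model_before, prior)
kl_after = kl(model_after, prior)
print(round(kl_before, 3), round(kl_after, 3))

# Index of the candidate the adjusted translation model prefers (1 means y2).
best = max(range(len(model_after)), key=lambda i: model_after[i])
print(best)
```

The distance shrinks after adjustment, which matches the stated goal of moving the translation model's distribution toward the prior knowledge model's distribution.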
The above embodiment shows that the language text translation method provided by this embodiment uses the difference parameter value between the first and second probability distributions to adjust the translation model and rescore the multiple translated texts, raising the probability that translated texts conforming to prior knowledge receive under the translation model's probability distribution and so yielding a more accurate translated text for the source language text.
On the basis of the above embodiment, the difference parameter value between the first probability distribution and the second probability distribution is the KL (Kullback-Leibler) distance, which can be determined by the following formula:
$$\mathrm{KL}\left(\tilde{P} \,\|\, \tilde{Q}\right) = \sum_{y \in S(x)} \tilde{P}(y \mid x; \theta) \log \frac{\tilde{P}(y \mid x; \theta)}{\tilde{Q}(y \mid x; \gamma)}$$
where $\tilde{Q}$ denotes the first probability distribution and $\tilde{P}$ the second probability distribution, both defined over the translation candidate set S(x).
On the basis of the above embodiments, determining the translated text of the source language text from the translation candidate set based on the probability difference parameter value comprises:
determining a training objective based on the difference parameter value, where the training objective directs the translation model to approximate the prior knowledge model;
determining the translated text of the source language text from the translation candidate set based on the training objective and a preset reranking model.
Specifically, first, the translation candidate set S(x) corresponding to the source language text x is determined according to the preset translation candidate set determination rule. Then, the first probability distribution and the second probability distribution are determined based on the translation candidate set, the translation model, and the prior knowledge model. Next, the probability difference parameter value between the first and second probability distributions is determined, and the training objective J(θ, γ) is determined from it so that the translation model approximates the prior knowledge model. Finally, the translated text y of the source language text x is determined from the translation candidate set S(x) based on the training objective J(θ, γ) and the preset reranking model.
Conventionally, when translated texts are scored, maximum log-likelihood estimation of the translation model P(y|x;θ) is used as the standard training criterion; that is, the traditional training objective is the log-likelihood function L(θ) = log P(y|x;θ).
After the difference parameter value between the first and second probability distributions is determined, it is added to the traditional training objective to give the new training objective J(θ, γ). Under this objective, the optimal parameters θ and γ encourage the translated text that best conforms to prior knowledge to have the highest probability in the translation model's second probability distribution, so that the translation model is more inclined to select, from the translation candidate set S(x), a translated text that conforms to prior knowledge as the target language text y of the source language text x.
Optionally, if the difference parameter value is the KL distance, the training objective can be determined by the following formula:
$$J(\theta, \gamma) = \lambda_1 L(\theta) - \lambda_2 \sum_{n=1}^{N} \mathrm{KL}\left(\tilde{P}(y \mid x^{(n)}; \theta) \,\|\, \tilde{Q}(y \mid x^{(n)}; \gamma)\right)$$
where λ1 and λ2 are preset hyperparameters that balance the training objective, N is the number of sentence pairs in the training data, x^(n) is the source language text of the n-th sentence pair, and $\tilde{P}$ and $\tilde{Q}$ are the second and first probability distributions over the candidate set S(x^(n)). The KL term is subtracted, so maximizing J reduces the difference between the two distributions.
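A minimal Python sketch of evaluating such an objective follows, assuming it has the form J = λ1·L(θ) − λ2·ΣKL(P||Q) with the log-likelihood summed over sentence pairs. The function names and all numeric values are illustrative assumptions: a single sentence pair is used, and its first candidate is treated as the reference translation.

```python
import math


def kl(p, q):
    """KL(p || q) for two distributions over the same candidates."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def training_objective(log_likelihoods, model_dists, prior_dists,
                       lam1=1.0, lam2=1.0):
    """J = lam1 * sum(log-likelihood) - lam2 * sum(KL(P_n || Q_n))."""
    likelihood_term = sum(log_likelihoods)
    kl_term = sum(kl(p, q) for p, q in zip(model_dists, prior_dists))
    return lam1 * likelihood_term - lam2 * kl_term


# Illustrative values for one sentence pair with three candidates.
prior = [0.67, 0.24, 0.09]
before = [0.4, 0.5, 0.1]
after = [0.6, 0.31, 0.09]

# Assumption: the first candidate is the reference translation.
j_before = training_objective([math.log(before[0])], [before], [prior])
j_after = training_objective([math.log(after[0])], [after], [prior])
print(round(j_before, 3), round(j_after, 3))
```

The objective is higher for the distribution that is both closer to the prior and assigns more probability to the reference, which is the behavior the combined objective is meant to reward.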
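The optimal parameters θ and γ are obtained with the new training objective, and the following reranking model is then used to determine the translated text of the source language text from the translation candidate set:
$$\hat{y} = \operatorname*{argmax}_{y \in S(x)} \left\{ \log P(y \mid x; \theta) + \gamma \cdot \phi(x, y) \right\}$$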
For example, suppose the source language text x is "Bush and Sharon have held talks", and the translation candidate set S(x) determined from x contains three translated texts: y1 is "Bush held a talk with Sharon", y2 is "Bush held a talk with Bush", and y3 is "Bush had lunch with Sharon".
Suppose the feature function φ(x, y) counts the word pairs that occur in the source language text x and the target language text y of a sentence pair, and the word pair set is {(Bush, Bush), (held, held), (talks, talk), (Sharon, Sharon)}. All 4 word pairs occur in the first translated text y1, so φ(x, y1) = 4; likewise, φ(x, y2) = 3 and φ(x, y3) = 2.
The first probability distribution can be determined from the prior knowledge model, where the probability of translated text y1 is:
$$Q(y_1 \mid x) = \frac{e^{4}}{e^{2} + e^{3} + e^{4}}$$
Similarly, Q(y2|x) = e^3/(e^2 + e^3 + e^4) and Q(y3|x) = e^2/(e^2 + e^3 + e^4). The final values are Q(y1|x) = 0.67, Q(y2|x) = 0.24, and Q(y3|x) = 0.09.
These probabilities show that translated text y1 conforms best to the prior knowledge model and is in fact the correct translation. Translated text y2 clearly violates the prior knowledge "do not translate anything twice, and do not leave anything untranslated", so its probability is lower. Translated text y3 departs from the meaning of the source language text, so its probability is also lower.
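The feature counts and probabilities in this example can be reproduced with a short Python sketch. The tokenization is an assumption: English glosses stand in for the segmented source words, and whitespace splitting is used for the candidates. The parameter γ is set to 1, which is what the exponents e^4, e^3, and e^2 in the example imply.

```python
import math

# Assumption: English glosses stand in for the segmented source words.
source = ["Bush", "and", "Sharon", "held", "talks"]
candidates = {
    "y1": "Bush held a talk with Sharon".split(),
    "y2": "Bush held a talk with Bush".split(),
    "y3": "Bush had lunch with Sharon".split(),
}
word_pairs = [("Bush", "Bush"), ("held", "held"),
              ("talks", "talk"), ("Sharon", "Sharon")]


def phi(src_tokens, tgt_tokens, pairs):
    """Count word pairs whose source word is in x and target word is in y."""
    return sum(1 for s, t in pairs if s in src_tokens and t in tgt_tokens)


features = {name: phi(source, toks, word_pairs)
            for name, toks in candidates.items()}

gamma = 1.0
z = sum(math.exp(gamma * f) for f in features.values())
prior = {name: math.exp(gamma * f) / z for name, f in features.items()}

print(features)
print({name: round(q, 2) for name, q in prior.items()})
```

The candidate y2 loses one count because "Sharon" never appears in it, and y3 loses two because neither "held" nor "talk" appears.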
Suppose the translation model before adjustment gives the second probability distribution P(y1|x) = 0.4, P(y2|x) = 0.5, and P(y3|x) = 0.1; the translation model would then output "Bush held a talk with Bush".
Now, if the preset hyperparameters λ1 and λ2 are both 1, the KL distance KL(P||Q) between the two probability distributions is computed by the formula, and the new training objective J(θ, γ) is determined from the KL distance.
The translation model is adjusted based on the training objective and the reranking model. After training, P(y1|x) = 0.6, P(y2|x) = 0.31, and P(y3|x) = 0.09. The new training objective thus raises the probability of translated text y1 and lowers the probabilities of translated texts y2 and y3, so that translated texts that better conform to prior knowledge have higher probability in the translation model's distribution; in other words, the translation model approximates the prior knowledge model.
Therefore, the final output target language text y is "Bush held a talk with Sharon".
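The reranking step for this example can be sketched in Python as follows, scoring each candidate by log P(y|x;θ) + γ·φ(x, y) and taking the maximum. The function name rerank is hypothetical, and γ = 1 is an assumption consistent with the exponents used in the example.

```python
import math


def rerank(candidates, model_probs, features, gamma=1.0):
    """Pick the candidate maximizing log P(y|x) + gamma * phi(x, y)."""
    scores = [math.log(p) + gamma * f
              for p, f in zip(model_probs, features)]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores


candidates = [
    "Bush held a talk with Sharon",
    "Bush held a talk with Bush",
    "Bush had lunch with Sharon",
]
features = [4, 3, 2]      # phi(x, y1), phi(x, y2), phi(x, y3)
before = [0.4, 0.5, 0.1]  # translation model before adjustment
after = [0.6, 0.31, 0.09] # translation model after training

print(rerank(candidates, before, features, gamma=0.0)[0])  # model score only
print(rerank(candidates, before, features, gamma=1.0)[0])
print(rerank(candidates, after, features, gamma=1.0)[0])
```

With γ = 0 the unadjusted model alone selects the candidate that repeats "Bush"; once the feature term is included, the first candidate is selected both before and after training.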
The above embodiment shows that the language text translation method provided by this embodiment adds the KL distance between the probability distribution conforming to the prior knowledge model and the probability distribution conforming to the translation model to the traditional training objective. This encourages translated texts that better conform to the prior knowledge model to also receive higher probability under the translation model and yields better-optimized translation model parameters, so that the target language text is finally determined from the translation candidate set, improving the performance of the translation model and the accuracy of the translation result.
Fig. 2 is a structural diagram of the language text translation system provided by an embodiment of the present invention. As shown in Fig. 2, the system includes a translation candidate set module 21, a training module 22, and a translation module 23. The translation candidate set module 21 is configured to determine, according to a preset translation candidate set determination rule, the translation candidate set corresponding to a source language text, where the translation candidate set includes multiple translated texts of the source language text and the source language text is the language text to be translated. The training module 22 is configured to determine a first probability distribution and a second probability distribution based on the translation candidate set, a preset translation model, and a preset prior knowledge model, where the first probability distribution indicates the probability that each translated text conforms to the prior knowledge model and the second probability distribution indicates the probability that each translated text conforms to the translation model. The translation module 23 is configured to determine the translated text of the source language text from the translation candidate set based on the first probability distribution and the second probability distribution.
It should be noted that the language text translation system implements the above method embodiments; for its specific functions, refer to those embodiments, which are not repeated here.
On the basis of the above embodiment, the translation module 23 in the system is specifically configured to determine a probability difference parameter value based on the first probability distribution and the second probability distribution, where the probability difference parameter indicates the difference between the first probability distribution and the second probability distribution, and to determine the translated text of the source language text from the translation candidate set based on the probability difference parameter value. Optionally, the probability difference parameter is the KL distance.
On the basis of the above embodiments, the translation module 23 in the system is specifically configured to determine a training objective based on the difference parameter value, where the training objective directs the translation model to approximate the prior knowledge model, and to determine the translated text of the source language text from the translation candidate set based on the training objective and a preset reranking model.
With the language text translation method and system provided by the present invention, prior knowledge is incorporated into the translation model during the training stage, which improves the model's performance, and the prior knowledge is then applied during translation. Arbitrary prior knowledge can thus be applied to machine translation without adding extra network modules, ultimately improving the accuracy and reliability of translation results.
Finally, it should be noted that the above embodiments merely illustrate the technical solutions of the present invention and do not limit them. Although the invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (4)

1. A language text translation method, characterized by comprising:
determining, according to a preset translation candidate set determination rule, a translation candidate set corresponding to a source language text, wherein the translation candidate set includes a plurality of translated texts of the source language text, and the source language text is the language text to be translated;
determining a first probability distribution and a second probability distribution based on the translation candidate set, a preset translation model, and a preset prior knowledge model, wherein the first probability distribution indicates the probability that the translated text conforms to the prior knowledge model, and the second probability distribution indicates the probability that the translated text conforms to the translation model;
determining the translated text of the source language text from the translation candidate set based on the first probability distribution and the second probability distribution;
wherein said determining the translated text of the source language text from the translation candidate set based on the first probability distribution and the second probability distribution comprises:
determining a probability difference parameter value based on the first probability distribution and the second probability distribution, wherein the probability difference parameter indicates the difference between the first probability distribution and the second probability distribution;
determining the translated text of the source language text from the translation candidate set based on the probability difference parameter value;
wherein said determining the translated text of the source language text from the translation candidate set based on the probability difference parameter value comprises:
determining a training objective based on the probability difference parameter value, wherein the training objective indicates that the translation model is to approach the prior knowledge model;
determining the translated text of the source language text from the translation candidate set based on the training objective and a preset reranking model.
2. The method according to claim 1, characterized in that the probability difference parameter is the KL distance (Kullback-Leibler divergence).
3. A language text translation system, characterized by comprising:
a translation candidate set module, configured to determine, according to a preset translation candidate set determination rule, a translation candidate set corresponding to a source language text, wherein the translation candidate set includes a plurality of translated texts of the source language text, and the source language text is the language text to be translated;
a training module, configured to determine a first probability distribution and a second probability distribution based on the translation candidate set, a preset translation model, and a preset prior knowledge model, wherein the first probability distribution indicates the probability that the translated text conforms to the prior knowledge model, and the second probability distribution indicates the probability that the translated text conforms to the translation model;
a translation module, configured to determine the translated text of the source language text from the translation candidate set based on the first probability distribution and the second probability distribution;
wherein the translation module is specifically configured to:
determine a probability difference parameter value based on the first probability distribution and the second probability distribution, wherein the probability difference parameter indicates the difference between the first probability distribution and the second probability distribution;
determine the translated text of the source language text from the translation candidate set based on the probability difference parameter value;
and wherein the translation module is further specifically configured to:
determine a training objective based on the probability difference parameter value, wherein the training objective indicates that the translation model is to approach the prior knowledge model;
determine the translated text of the source language text from the translation candidate set based on the training objective and a preset reranking model.
4. The system according to claim 3, characterized in that the probability difference parameter is the KL distance (Kullback-Leibler divergence).
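
The steps recited in claims 1 and 2 can be illustrated with a minimal sketch, which is not the patented implementation. It assumes that both models expose an unnormalized log-score for each candidate, that each distribution is obtained by normalizing those scores over the candidate set, and that the KL distance is taken from the prior knowledge distribution to the translation model distribution. The function names, the `weight` parameter, and the way the two distributions are combined for reranking are all hypothetical choices made for illustration; the claims do not fix them.

```python
import math


def normalize(log_scores):
    """Turn log-scores into a probability distribution over the candidate set."""
    top = max(log_scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_distance(q, p, eps=1e-12):
    """KL(q || p) for two discrete distributions over the same candidates.

    The direction (prior q first, model p second) is an assumption.
    """
    return sum(qi * math.log((qi + eps) / (pi + eps)) for qi, pi in zip(q, p))


def rerank(candidates, prior_log_scores, model_log_scores, weight=1.0):
    """Pick one candidate and report the gap between the two distributions.

    Hypothetical reranking rule: add the log-probabilities of both models,
    with `weight` scaling the prior knowledge term.
    """
    q = normalize(prior_log_scores)   # first probability distribution
    p = normalize(model_log_scores)   # second probability distribution
    gap = kl_distance(q, p)           # probability difference parameter value
    combined = [
        math.log(pi + 1e-12) + weight * math.log(qi + 1e-12)
        for pi, qi in zip(p, q)
    ]
    best = max(range(len(candidates)), key=combined.__getitem__)
    return candidates[best], gap
```

During training, a term such as `gap` would be minimized so that the translation model moves toward the prior knowledge model; in this sketch it is only computed and returned alongside the selected candidate.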

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710335652.4A CN107273363B (en) 2017-05-12 2017-05-12 A kind of language text interpretation method and system

Publications (2)

Publication Number Publication Date
CN107273363A CN107273363A (en) 2017-10-20
CN107273363B true CN107273363B (en) 2019-11-22

Family

ID=60074224

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710335652.4A Active CN107273363B (en) 2017-05-12 2017-05-12 A kind of language text interpretation method and system

Country Status (1)

Country Link
CN (1) CN107273363B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109783824B (en) * 2018-12-17 2023-04-18 北京百度网讯科技有限公司 Translation method, device and storage medium based on translation model
CN110298045B (en) * 2019-05-31 2023-03-24 北京百度网讯科技有限公司 Machine translation method, device, equipment and storage medium
CN110334359B (en) * 2019-06-05 2021-06-15 华为技术有限公司 Text translation method and device
CN111178085B (en) * 2019-12-12 2020-11-24 科大讯飞(苏州)科技有限公司 Text translator training method, and professional field text semantic parsing method and device
CN111368091B (en) * 2020-02-13 2023-09-22 中国工商银行股份有限公司 Document translation method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103646019A (en) * 2013-12-31 2014-03-19 哈尔滨理工大学 Method and device for fusing multiple machine translation systems
CN103678285A (en) * 2012-08-31 2014-03-26 富士通株式会社 Machine translation method and machine translation system
CN105573994A (en) * 2016-01-26 2016-05-11 沈阳雅译网络技术有限公司 Statistic machine translation system based on syntax framework

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9530161B2 (en) * 2014-02-28 2016-12-27 Ebay Inc. Automatic extraction of multilingual dictionary items from non-parallel, multilingual, semi-structured data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
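Knowledge-Based Semantic Embedding for Machine Translation; Chen Shi et al.; Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics; 2016-08-12; pp. 2245-2254 *
An N-best syntactic-knowledge-enhanced pre-reordering model for statistical machine translation; Guo Junbo et al.; Computer Engineering and Applications; 2016-09-01; Vol. 52, No. 17; pp. 160-165 *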

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant