CN111597824A - Training method and device of language translation model - Google Patents

Training method and device of language translation model

Info

Publication number
CN111597824A
CN111597824A
Authority
CN
China
Prior art keywords
corpus
source
target
language
translation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010307663.3A
Other languages
Chinese (zh)
Other versions
CN111597824B (en)
Inventor
陈巍华 (Chen Weihua)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unisound Intelligent Technology Co Ltd
Xiamen Yunzhixin Intelligent Technology Co Ltd
Original Assignee
Unisound Intelligent Technology Co Ltd
Xiamen Yunzhixin Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Unisound Intelligent Technology Co Ltd, Xiamen Yunzhixin Intelligent Technology Co Ltd filed Critical Unisound Intelligent Technology Co Ltd
Priority to CN202010307663.3A priority Critical patent/CN111597824B/en
Publication of CN111597824A publication Critical patent/CN111597824A/en
Application granted granted Critical
Publication of CN111597824B publication Critical patent/CN111597824B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a method and a device for training a language translation model. The method comprises the following steps: step A1, obtaining a source corpus S1 and a target corpus T1 that are mutual translations of each other, and constructing a binary translation network M1 according to the source corpus S1 and the target corpus T1, wherein the binary translation network M1 is capable of judging whether an arbitrary source sentence and target sentence are mutual translations; step A2, training an initial source-language training model with a source corpus P1 to obtain a target source-language training model M2; step A3, obtaining a source corpus S3 and a target corpus T2 corresponding to the source corpus S3 according to the binary translation network M1, the target source-language training model M2 and the source corpus S1; and step A4, obtaining a language translation model according to the source corpus S1, the target corpus T1, the source corpus S3 and the target corpus T2. With this technical scheme, the source corpus can be expanded to yield resource-rich source and target corpora that are mutual translations, and thereby a language translation model with high translation precision, accuracy and quality.

Description

Training method and device of language translation model
Technical Field
The invention relates to the technical field of translation, in particular to a method and a device for training a language translation model.
Background
At present, in translation tasks, most mainstream data augmentation algorithms extend a corpus either by injecting noise (word insertion, deletion, reordering, and the like) or by generating pseudo-parallel bilingual data from large amounts of target-side monolingual text via a Back-Translation method, and then train a language translation model on the resulting bilingual data.
Disclosure of Invention
The embodiment of the invention provides a method and a device for training a language translation model. The technical scheme is as follows:
according to a first aspect of the embodiments of the present invention, there is provided a method for training a language translation model, including:
step A1, obtaining a source corpus S1 and a target corpus T1 that are mutual translations of each other, and constructing a binary translation network M1 according to the source corpus S1 and the target corpus T1, wherein the binary translation network M1 is capable of judging whether an arbitrary source sentence and target sentence are mutual translations;
step A2, training an initial source-language training model with a source corpus P1 to obtain a target source-language training model M2;
step A3, obtaining a source corpus S3 and a target corpus T2 corresponding to the source corpus S3 according to the binary translation network M1, the target source-language training model M2 and the source corpus S1;
and step A4, obtaining the language translation model according to the source corpus S1, the target corpus T1, the source corpus S3 and the target corpus T2.
In one embodiment, step A3 includes:
expanding the source corpus S1 according to the target source-language training model M2 to obtain a candidate source corpus S2;
and screening the candidate source corpus S2 and the target corpus T1 by using the binary translation network M1 to obtain the source corpus S3 and the target corpus T2.
In one embodiment, the screening of the candidate source corpus S2 and the target corpus T1 by using the binary translation network M1 to obtain the source corpus S3 and the target corpus T2 includes:
obtaining a preset corpus probability threshold for mutual-translation corpora;
inputting the candidate source corpus S2 and the target corpus T1 into the binary translation network M1, so as to screen out the source corpus S3 and the target corpus T2 by using the binary translation network M1 and the corpus probability threshold.
In one embodiment, the method further comprises:
taking the source corpus S3 and its corresponding target corpus T2 as the new source corpus S1 and target corpus T1 respectively, and executing step A2 and step A3 again to obtain a target corpus T3 and a source corpus S4 corresponding to the target corpus T3;
step A4 then includes:
obtaining the language translation model according to the source corpus S1, the target corpus T1, the source corpus S3, the target corpus T2, the target corpus T3 and the source corpus S4.
In one embodiment, the method further comprises:
acquiring a preset number of target monolingual sentences;
obtaining, by using the language translation model, a source corpus P2 corresponding to the target monolingual sentences;
and retraining the language translation model according to the source corpus P2 and the target monolingual sentences.
According to a second aspect of the embodiments of the present invention, there is provided a training device for a language translation model, including:
a first processing module, configured to obtain a source corpus S1 and a target corpus T1 that are mutual translations of each other, and to construct a binary translation network M1 according to the source corpus S1 and the target corpus T1, wherein the binary translation network M1 is capable of judging whether an arbitrary source sentence and target sentence are mutual translations;
a training module, configured to train an initial source-language training model with a source corpus P1 to obtain a target source-language training model M2;
a first obtaining module, configured to obtain a source corpus S3 and a target corpus T2 corresponding to the source corpus S3 according to the binary translation network M1, the target source-language training model M2 and the source corpus S1;
and a second obtaining module, configured to obtain the language translation model according to the source corpus S1, the target corpus T1, the source corpus S3 and the target corpus T2.
In one embodiment, the first obtaining module comprises:
an expansion submodule, configured to expand the source corpus S1 according to the target source-language training model M2 to obtain a candidate source corpus S2;
and a screening submodule, configured to screen the candidate source corpus S2 and the target corpus T1 by using the binary translation network M1 to obtain the source corpus S3 and the target corpus T2.
In one embodiment, the screening submodule is specifically configured to:
obtain a preset corpus probability threshold for mutual-translation corpora;
and input the candidate source corpus S2 and the target corpus T1 into the binary translation network M1, so as to screen out the source corpus S3 and the target corpus T2 by using the binary translation network M1 and the corpus probability threshold.
In one embodiment, the device further comprises:
a second processing module, configured to take the source corpus S3 and its corresponding target corpus T2 as the new source corpus S1 and target corpus T1 respectively, and to execute the steps of the training module and the first obtaining module again, so as to obtain a target corpus T3 and a source corpus S4 corresponding to the target corpus T3;
the second obtaining module then includes:
an obtaining submodule, configured to obtain the language translation model according to the source corpus S1, the target corpus T1, the source corpus S3, the target corpus T2, the target corpus T3 and the source corpus S4.
In one embodiment, the device further comprises:
a third obtaining module, configured to acquire a preset number of target monolingual sentences;
a translation module, configured to obtain, by using the language translation model, a source corpus P2 corresponding to the target monolingual sentences;
and the training module, configured to retrain the language translation model according to the source corpus P2 and the target monolingual sentences.
The technical scheme provided by the embodiments of the invention can have the following beneficial effects:
an initial binary translation network M1 can be constructed from a low-resource (i.e., small and/or simple) source corpus S1 and target corpus T1; an initial source-language training model can then be trained with a source corpus P1 to obtain a highly accurate target source-language training model M2; the binary translation network M1 and the target source-language training model M2 can then be used to expand the source corpus S1, yielding a resource-rich source corpus S3 and a corresponding target corpus T2 that are mutual translations; finally, a language translation model with high translation accuracy, precision and quality can be obtained from the enriched corpora (i.e., the source corpus S1, the target corpus T1, the source corpus S3 and the target corpus T2).
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a flow diagram illustrating a method of training a language translation model in accordance with an exemplary embodiment.
FIG. 2 is a block diagram illustrating a training apparatus for a language translation model in accordance with an exemplary embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
In order to solve the above technical problem, an embodiment of the present invention provides a method for training a language translation model, where the method is used in a training program, system or device for a language translation model, and the corresponding execution subject may be a terminal or a server. As shown in FIG. 1, the method includes steps A1 to A4:
Step A1, obtaining a source corpus S1 and a target corpus T1 that are mutual translations of each other, and constructing a binary translation network M1 according to the source corpus S1 and the target corpus T1, wherein the binary translation network M1 is capable of judging whether an arbitrary source sentence and target sentence are mutual translations. Here, the source corpus S1 and the target corpus T1 being mutual translations means that they express the same content in different languages. The source corpus S1 is a low-resource corpus, i.e., the number of sentences in the source corpus S1 is below a preset number; in essence, the invention is therefore a training method for a low-resource translation model. The binary translation network M1 may be any network capable of determining whether an arbitrary source sentence and target sentence are mutual translations, such as a binary convolutional neural network.
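The patent does not fix the architecture of M1 beyond requiring a binary classifier (a binary convolutional neural network is given as one option). The following is a minimal sketch in Python, assuming a BERT-style cross-encoder stands in for M1; the checkpoint name and the fine-tuning scheme described in the comments are illustrative assumptions:

    # Minimal sketch of the binary translation network M1 (assumption:
    # a BERT cross-encoder; the patent allows any binary classifier).
    import torch
    from transformers import BertTokenizer, BertForSequenceClassification

    tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
    m1 = BertForSequenceClassification.from_pretrained(
        "bert-base-multilingual-cased", num_labels=2
    )  # label 1 = "mutual translations", label 0 = "not translations"

    def translation_probability(source_sentence: str, target_sentence: str) -> float:
        """Return M1's probability that the two sentences are mutual translations."""
        inputs = tokenizer(source_sentence, target_sentence,
                           return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = m1(**inputs).logits
        return torch.softmax(logits, dim=-1)[0, 1].item()

    # In training, (S1, T1) sentence pairs would serve as positive examples
    # and randomly mismatched pairs from S1 x T1 as negative examples.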
Step A2, training an initial source-language training model with a source corpus P1 to obtain a target source-language training model M2. The initial source-language training model may be an open-source language training model; the target source-language training model M2 is obtained by training a pre-training model, e.g., BERT (Bidirectional Encoder Representations from Transformers), on the source corpus P1. The trained model M2 has a fill-in-the-blank capability and can be used to expand the corpus. Both the initial source-language training model and the target source-language training model can enrich the corpus, i.e., expand a single source sentence into multiple sentences.
The source corpus P1 may be a monolingual corpus, i.e., a corpus currently available in only one language, or a bilingual corpus, i.e., a corpus currently available in multiple languages.
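A minimal sketch of M2's fill-in-the-blank capability, assuming a BERT-style masked language model; the checkpoint name is a placeholder, and in the method M2 would first be trained or fine-tuned on P1:

    # Minimal sketch of masked-token prediction with M2 (assumption:
    # a stock BERT masked language model stands in for the trained M2).
    import torch
    from transformers import BertTokenizer, BertForMaskedLM

    tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
    m2 = BertForMaskedLM.from_pretrained("bert-base-chinese")

    def fill_mask(masked_sentence: str, top_k: int = 5) -> list:
        """Predict the top-k tokens for the first [MASK] position."""
        inputs = tokenizer(masked_sentence, return_tensors="pt")
        mask_positions = (inputs["input_ids"][0] == tokenizer.mask_token_id)
        mask_index = mask_positions.nonzero(as_tuple=True)[0][0]
        with torch.no_grad():
            logits = m2(**inputs).logits
        top_ids = logits[0, mask_index].topk(top_k).indices
        return [tokenizer.decode([int(i)]) for i in top_ids]

    print(fill_mask("今天天气真[MASK]。"))  # plausible completions of the blank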
Step A3, obtaining a source corpus S3 and a target corpus T2 corresponding to the source corpus S3 according to the binary translation network M1, the target source-language training model M2 and the source corpus S1.
Step A4, obtaining the language translation model according to the source corpus S1, the target corpus T1, the source corpus S3 and the target corpus T2. The language translation model is a model with a (mutual) language translation function, for example one capable of translating Chinese into English, Russian or other languages.
An initial binary translation network M1 can be constructed from a low-resource (i.e., small and/or simple) source corpus S1 and target corpus T1. The initial source-language training model can then be trained with the source corpus P1 to obtain a highly accurate target source-language training model M2. The binary translation network M1 and the target source-language training model M2 can then be used to expand the source corpus S1, yielding a resource-rich source corpus S3 and a corresponding target corpus T2 that are mutual translations. Finally, a language translation model with high translation accuracy, precision and quality can be obtained from the enriched corpora (i.e., the source corpus S1, the target corpus T1, the source corpus S3 and the target corpus T2).
In one embodiment, step A3 includes:
expanding the source corpus S1 according to the target source-language training model M2 to obtain a candidate source corpus S2;
and screening the candidate source corpus S2 and the target corpus T1 by using the binary translation network M1 to obtain the source corpus S3 and the target corpus T2.
Because the source corpus S1 is small in quantity, a language translation model trained only on S1 has low precision. The target source-language training model M2 can therefore be used to expand the source corpus S1 into a larger candidate source corpus S2, and the binary translation network M1 then further screens the candidate source corpus S2 against the target corpus T1, yielding a source corpus S3 and a target corpus T2 with a higher probability of being mutual translations; combining S3 and T2 improves the translation precision and quality of the language translation model. A sketch of the expansion step follows.
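The following is a minimal sketch of the expansion step, reusing the tokenizer and fill_mask function from the M2 sketch above. Masking one random position per variant is an illustrative choice, since the patent permits masking random or fixed positions and inserting extra masks; s1_sentences and t1_sentences are hypothetical parallel lists of sentences:

    # Minimal sketch of expanding S1 into a candidate corpus S2 via M2.
    # Assumes `tokenizer` and `fill_mask` from the previous sketch;
    # `s1_sentences` / `t1_sentences` are hypothetical parallel lists.
    import random

    def expand_sentence(sentence: str, num_variants: int = 3) -> list:
        """Generate variants of one source sentence by masking and refilling."""
        tokens = tokenizer.tokenize(sentence)
        variants = []
        for _ in range(num_variants):
            masked = list(tokens)
            pos = random.randrange(len(masked))
            masked[pos] = tokenizer.mask_token
            best_word = fill_mask(tokenizer.convert_tokens_to_string(masked), top_k=1)[0]
            masked[pos] = best_word
            variants.append(tokenizer.convert_tokens_to_string(masked))
        return variants

    # Each variant stays paired with the target sentence of its original source:
    s2 = [(variant, t) for s, t in zip(s1_sentences, t1_sentences)
          for variant in expand_sentence(s)]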
In one embodiment, the screening of the candidate source corpus S2 and the target corpus T1 by using the binary translation network M1 to obtain the source corpus S3 and the target corpus T2 includes:
obtaining a preset corpus probability threshold for mutual-translation corpora (i.e., corpora that are bilingual translations of each other);
inputting the candidate source corpus S2 and the target corpus T1 into the binary translation network M1, so as to screen out the source corpus S3 and the target corpus T2 by using the binary translation network M1 and the corpus probability threshold.
During the screening, the preset corpus probability threshold is used to filter out the pairs from the candidate source corpus S2 and the target corpus T1 that have a low probability of being mutual translations, ensuring that the retained source corpus S3 and target corpus T2 have a high probability of being mutual translations, as in the sketch below.
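A minimal sketch of the screening step, reusing translation_probability from the M1 sketch above; the threshold value 0.9 is an illustrative assumption, since the patent only requires some preset probability threshold:

    # Minimal sketch of screening candidate pairs with M1 and a threshold.
    # Assumes `translation_probability` from the M1 sketch; the threshold
    # value is a hypothetical choice.
    CORPUS_PROBABILITY_THRESHOLD = 0.9

    def screen_pairs(candidate_pairs):
        """Keep (source, target) pairs that M1 scores above the threshold."""
        return [(s, t) for s, t in candidate_pairs
                if translation_probability(s, t) >= CORPUS_PROBABILITY_THRESHOLD]

    s3_t2_pairs = screen_pairs(s2)  # the screened pseudo-parallel corpus (S3, T2)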
In one embodiment, the method further comprises:
taking the source corpus S3 and its corresponding target corpus T2 as the new source corpus S1 and target corpus T1 respectively, and executing step A2 and step A3 again to obtain a target corpus T3 and a source corpus S4 corresponding to the target corpus T3;
step A4 then includes:
obtaining the language translation model according to the source corpus S1, the target corpus T1, the source corpus S3, the target corpus T2, the target corpus T3 and the source corpus S4.
By re-using the source corpus S3 and its corresponding target corpus T2 as the source corpus S1 and target corpus T1 and re-executing steps A2 and A3, further mutually translated source and target corpora can be obtained, namely a target corpus T3 and a source corpus S4 that are mutual translations of each other. A language translation model with higher translation accuracy and quality can then be obtained from the larger set of mutually translated corpora, i.e., (S1, T1), (S3, T2) and (S4, T3); a sketch of this iteration follows.
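A minimal sketch of the iterated expansion; train_masked_lm and expand_and_screen are hypothetical wrappers for steps A2 and A3, not APIs defined by the patent, and since the embodiment leaves open which data retrains M2 on the second pass, P1 is simply reused here as an assumption:

    # Minimal sketch of running steps A2-A3 twice with swapped corpus roles.
    # `train_masked_lm` (step A2) and `expand_and_screen` (step A3) are
    # hypothetical helpers; reusing P1 on the second pass is an assumption.
    def iterate_expansion(m1, p1, s1, t1):
        m2 = train_masked_lm(p1)                        # step A2
        s3, t2 = expand_and_screen(m1, m2, s1, t1)      # step A3 -> (S3, T2)
        # Second pass: (S3, T2) stand in for (S1, T1), yielding (S4, T3).
        s4, t3 = expand_and_screen(m1, train_masked_lm(p1), s3, t2)
        return (s3, t2), (s4, t3)

    # Step A4 then trains on (S1, T1) + (S3, T2) + (S4, T3).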
In one embodiment, the method further comprises:
acquiring a preset number of target monolingual sentences, where a target monolingual sentence is text currently available in only one (the target) language;
obtaining, by using the language translation model, a source corpus P2 corresponding to the target monolingual sentences;
and retraining the language translation model according to the source corpus P2 and the target monolingual sentences.
By acquiring a preset number of target monolingual sentences and using the language translation model to produce a source corpus P2 that translates them, a large amount of target monolingual data is passed through the language translation model in a Back-Translation manner to obtain higher-quality bilingual data. The language translation model can then be retrained with the source corpus P2 and the target monolingual sentences, so that its translation precision and quality are further improved by incorporating back-translation, as sketched below.
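A minimal sketch of the back-translation step; reverse_model and its translate method are hypothetical placeholders for a trained target-to-source translation model:

    # Minimal sketch of back-translation: pseudo source sentences are
    # generated for real target-side monolingual sentences.
    # `reverse_model.translate` is a hypothetical placeholder.
    def back_translate(target_monolingual_sentences, reverse_model):
        """Build (pseudo source, real target) pairs for retraining."""
        pairs = []
        for target_sentence in target_monolingual_sentences:
            pseudo_source = reverse_model.translate(target_sentence)
            pairs.append((pseudo_source, target_sentence))
        return pairs  # the corpus (P2, target monolinguals)

    # The language translation model is then retrained on the original
    # bilingual data plus these back-translated pairs.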
The invention therefore constructs the translation model by expanding a low-resource bilingual corpus with additional high-quality bilingual data, which improves the translation model's capability; the pseudo-bilingual data obtained by further applying Back-Translation is of higher quality, so that the translation capability of the low-resource bilingual model is improved yet again.
The technical solution of the present invention is explained in further detail below:
Step 1: model the source corpus S1 and target corpus T1 of the low-resource bilingual corpus and train a binary classification network M1, which has the capability of judging whether an arbitrary source sentence and target sentence are mutual translations.
Step 2: for the source corpus (monolingual or bilingual), obtain a pre-trained model M2 (e.g., BERT, Bidirectional Encoder Representations from Transformers) either by taking an open-source pre-trained language model or by training on a large amount of source-side monolingual text.
Step 3: use M2 to expand the source corpus S1: randomly mask tokens of the source sentences, or insert masks at certain positions, and let M2 predict words for the masked positions to obtain a candidate source corpus S2.
Step 4: pass the generated source corpus S2 and the corresponding target corpus T1 through the binary network M1, and keep the data in S2 whose score (the probability that the candidate source sentence and the target sentence are mutual translations) exceeds a preset threshold, obtaining a pseudo source corpus S3 and its corresponding target corpus T1.
Step 5: swap the roles of the source and target corpora and repeat steps 2, 3 and 4 to obtain a pseudo target corpus T3 and its corresponding source corpus S1.
Step 6: train a translation model on the original bilingual corpus (S1, T1) together with the pseudo bilingual corpora (S3, T1) and (S1, T3) to obtain the bilingual translation model M3.
Step 7: pass a large amount of target monolingual text through M3 in a Back-Translation manner to obtain a high-quality pseudo bilingual corpus (S4, T4), and continue training the translation model on the resulting data, improving the low-resource translation effect. The whole pipeline is sketched below.
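A minimal end-to-end sketch of steps 1 through 7; every helper here (train_binary_classifier, train_masked_lm, expand_and_screen, train_translation_model, back_translate) is a hypothetical wrapper for the corresponding step above, not an API defined by the patent:

    # Minimal sketch of the full low-resource training pipeline (steps 1-7).
    # All helper functions are hypothetical wrappers for the steps above;
    # expand_and_screen returns (pseudo corpus, paired original corpus).
    def build_low_resource_translator(s1, t1, target_monolingual_sentences):
        m1 = train_binary_classifier(s1, t1)                        # step 1
        m2 = train_masked_lm(s1)                                    # step 2
        s3, _ = expand_and_screen(m1, m2, s1, t1)                   # steps 3-4
        t3, _ = expand_and_screen(m1, train_masked_lm(t1), t1, s1)  # step 5
        corpora = [(s1, t1), (s3, t1), (s1, t3)]                    # step 6
        m3 = train_translation_model(corpora)
        s4_t4 = back_translate(target_monolingual_sentences, m3)    # step 7
        return train_translation_model(corpora + [s4_t4])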
Finally, it should be noted that those skilled in the art may freely combine the above embodiments according to actual needs.
Corresponding to the above method for training a language translation model provided in the embodiments of the present invention, an embodiment of the present invention further provides a training device for a language translation model. As shown in FIG. 2, the device includes:
a first processing module 201, configured to obtain a source corpus S1 and a target corpus T1 that are mutual translations of each other, and to construct a binary translation network M1 according to the source corpus S1 and the target corpus T1, wherein the binary translation network M1 is capable of judging whether an arbitrary source sentence and target sentence are mutual translations;
a training module 202, configured to train an initial source-language training model with a source corpus P1 to obtain a target source-language training model M2;
a first obtaining module 203, configured to obtain a source corpus S3 and a target corpus T2 corresponding to the source corpus S3 according to the binary translation network M1, the target source-language training model M2 and the source corpus S1;
and a second obtaining module 204, configured to obtain the language translation model according to the source corpus S1, the target corpus T1, the source corpus S3 and the target corpus T2.
In one embodiment, the first obtaining module comprises:
an expansion submodule, configured to expand the source corpus S1 according to the target source-language training model M2 to obtain a candidate source corpus S2;
and a screening submodule, configured to screen the candidate source corpus S2 and the target corpus T1 by using the binary translation network M1 to obtain the source corpus S3 and the target corpus T2.
In one embodiment, the screening submodule is specifically configured to:
obtain a preset corpus probability threshold for mutual-translation corpora;
and input the candidate source corpus S2 and the target corpus T1 into the binary translation network M1, so as to screen out the source corpus S3 and the target corpus T2 by using the binary translation network M1 and the corpus probability threshold.
In one embodiment, the device further comprises:
a second processing module, configured to take the source corpus S3 and its corresponding target corpus T2 as the new source corpus S1 and target corpus T1 respectively, and to execute the steps of the training module and the first obtaining module again, so as to obtain a target corpus T3 and a source corpus S4 corresponding to the target corpus T3;
the second obtaining module then includes:
an obtaining submodule, configured to obtain the language translation model according to the source corpus S1, the target corpus T1, the source corpus S3, the target corpus T2, the target corpus T3 and the source corpus S4.
In one embodiment, the device further comprises:
a third obtaining module, configured to acquire a preset number of target monolingual sentences;
a translation module, configured to obtain, by using the language translation model, a source corpus P2 corresponding to the target monolingual sentences;
and the training module, configured to retrain the language translation model according to the source corpus P2 and the target monolingual sentences.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (10)

1. A method for training a language translation model, comprising:
step A1, obtaining a source corpus S1 and a target corpus T1 that are mutual translations of each other, and constructing a binary translation network M1 according to the source corpus S1 and the target corpus T1, wherein the binary translation network M1 is capable of judging whether an arbitrary source sentence and target sentence are mutual translations;
step A2, training an initial source-language training model with a source corpus P1 to obtain a target source-language training model M2;
step A3, obtaining a source corpus S3 and a target corpus T2 corresponding to the source corpus S3 according to the binary translation network M1, the target source-language training model M2 and the source corpus S1;
and step A4, obtaining the language translation model according to the source corpus S1, the target corpus T1, the source corpus S3 and the target corpus T2.
2. The method of claim 1, wherein step A3 includes:
expanding the source corpus S1 according to the target source-language training model M2 to obtain a candidate source corpus S2;
and screening the candidate source corpus S2 and the target corpus T1 by using the binary translation network M1 to obtain the source corpus S3 and the target corpus T2.
3. The method of claim 2, wherein the screening of the candidate source corpus S2 and the target corpus T1 by using the binary translation network M1 to obtain the source corpus S3 and the target corpus T2 includes:
obtaining a preset corpus probability threshold for mutual-translation corpora;
inputting the candidate source corpus S2 and the target corpus T1 into the binary translation network M1, so as to screen out the source corpus S3 and the target corpus T2 by using the binary translation network M1 and the corpus probability threshold.
4. The method of claim 2, further comprising:
taking the source corpus S3 and its corresponding target corpus T2 as the new source corpus S1 and target corpus T1 respectively, and executing step A2 and step A3 again to obtain a target corpus T3 and a source corpus S4 corresponding to the target corpus T3;
wherein step A4 includes:
obtaining the language translation model according to the source corpus S1, the target corpus T1, the source corpus S3, the target corpus T2, the target corpus T3 and the source corpus S4.
5. The method according to any one of claims 1 to 4, further comprising:
acquiring a preset number of target monolingual sentences;
obtaining, by using the language translation model, a source corpus P2 corresponding to the target monolingual sentences;
and retraining the language translation model according to the source corpus P2 and the target monolingual sentences.
6. A training device for a language translation model, comprising:
a first processing module, configured to obtain a source corpus S1 and a target corpus T1 that are mutual translations of each other, and to construct a binary translation network M1 according to the source corpus S1 and the target corpus T1, wherein the binary translation network M1 is capable of judging whether an arbitrary source sentence and target sentence are mutual translations;
a training module, configured to train an initial source-language training model with a source corpus P1 to obtain a target source-language training model M2;
a first obtaining module, configured to obtain a source corpus S3 and a target corpus T2 corresponding to the source corpus S3 according to the binary translation network M1, the target source-language training model M2 and the source corpus S1;
and a second obtaining module, configured to obtain the language translation model according to the source corpus S1, the target corpus T1, the source corpus S3 and the target corpus T2.
7. The device of claim 6, wherein the first obtaining module comprises:
an expansion submodule, configured to expand the source corpus S1 according to the target source-language training model M2 to obtain a candidate source corpus S2;
and a screening submodule, configured to screen the candidate source corpus S2 and the target corpus T1 by using the binary translation network M1 to obtain the source corpus S3 and the target corpus T2.
8. The device of claim 7, wherein the screening submodule is specifically configured to:
obtain a preset corpus probability threshold for mutual-translation corpora;
and input the candidate source corpus S2 and the target corpus T1 into the binary translation network M1, so as to screen out the source corpus S3 and the target corpus T2 by using the binary translation network M1 and the corpus probability threshold.
9. The device of claim 7, further comprising:
a second processing module, configured to take the source corpus S3 and its corresponding target corpus T2 as the new source corpus S1 and target corpus T1 respectively, and to execute the steps of the training module and the first obtaining module again, so as to obtain a target corpus T3 and a source corpus S4 corresponding to the target corpus T3;
wherein the second obtaining module includes:
an obtaining submodule, configured to obtain the language translation model according to the source corpus S1, the target corpus T1, the source corpus S3, the target corpus T2, the target corpus T3 and the source corpus S4.
10. The device of any one of claims 6 to 9, further comprising:
a third obtaining module, configured to acquire a preset number of target monolingual sentences;
a translation module, configured to obtain, by using the language translation model, a source corpus P2 corresponding to the target monolingual sentences;
and the training module, configured to retrain the language translation model according to the source corpus P2 and the target monolingual sentences.
CN202010307663.3A 2020-04-17 2020-04-17 Training method and device for language translation model Active CN111597824B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010307663.3A CN111597824B (en) 2020-04-17 2020-04-17 Training method and device for language translation model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010307663.3A CN111597824B (en) 2020-04-17 2020-04-17 Training method and device for language translation model

Publications (2)

Publication Number Publication Date
CN111597824A 2020-08-28
CN111597824B 2023-05-26

Family

ID=72190412

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010307663.3A Active CN111597824B (en) 2020-04-17 2020-04-17 Training method and device for language translation model

Country Status (1)

Country Link
CN (1) CN111597824B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113887252A (en) * 2021-10-18 2022-01-04 浙江香侬慧语科技有限责任公司 Unsupervised rephrase text generation method, unsupervised rephrase text generation device and unsupervised rephrase text generation medium based on machine translation


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090326912A1 (en) * 2006-08-18 2009-12-31 Nicola Ueffing Means and a method for training a statistical machine translation system
CN105389303A (en) * 2015-10-27 2016-03-09 北京信息科技大学 Automatic heterogenous corpus fusion method
CN110334361A (en) * 2019-07-12 2019-10-15 电子科技大学 A kind of neural machine translation method towards rare foreign languages language

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Lü Xueqiang; Wu Yongxu; Zhou Qiang; Liu Yin: "Research on Heterogeneous Corpus Fusion" (异源语料融合研究) *
Yao Liang; Hong Yu; Liu Hao; Liu Le; Yao Jianmin: "Domain Adaptation of Translation Models Based on Semantic Distribution Similarity" (基于语义分布相似度的翻译模型领域自适应研究) *


Also Published As

Publication number Publication date
CN111597824B (en) 2023-05-26


Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant