CN112380876B - Translation method, device, equipment and medium based on multilingual machine translation model - Google Patents


Info

Publication number
CN112380876B
CN112380876B (application number CN202011409340.1A)
Authority
CN
China
Prior art keywords
target
sublayer
current
translation
component
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011409340.1A
Other languages
Chinese (zh)
Other versions
CN112380876A (en)
Inventor
赵程绮
朱耀明
王明轩
封江涛
李磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Youzhuju Network Technology Co Ltd
Original Assignee
Beijing Youzhuju Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Youzhuju Network Technology Co Ltd filed Critical Beijing Youzhuju Network Technology Co Ltd
Priority to CN202011409340.1A priority Critical patent/CN112380876B/en
Publication of CN112380876A publication Critical patent/CN112380876A/en
Priority to PCT/CN2021/131090 priority patent/WO2022116821A1/en
Application granted granted Critical
Publication of CN112380876B publication Critical patent/CN112380876B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language
    • G06F40/42Data-driven translation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of the present disclosure provide a translation method, apparatus, device, and medium based on a multilingual machine translation model. The method includes: acquiring an original sentence to be translated and the translation language information of the original sentence; determining a target adapter corresponding to that translation language information, where the target adapter is used to correct translation errors of a preset multilingual machine translation model; and translating the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence. By using an adapter to correct the translation errors of the multilingual machine translation model, the embodiments of the present disclosure can improve the accuracy of the translation results output by the model.

Description

Translation method, device, equipment and medium based on multilingual machine translation model
Technical Field
The embodiment of the disclosure relates to the technical field of computers, in particular to a translation method, a translation device, translation equipment and translation media based on a multilingual machine translation model.
Background
Machine translation (MT) is one of the core tasks in natural language processing; it aims to use a computer program to translate text from one natural language into another.
Conventional machine translation models are typically bilingual models that handle a single translation direction, such as translating English into Chinese. When many languages are involved, a very large number of bilingual models must be trained to cover every language pair. For this reason, multilingual machine translation models have gradually replaced bilingual models in recent years and have become one of the most commonly used kinds of machine translation model.
However, under the same parameter configuration and model architecture, a multilingual machine translation model often performs worse than a bilingual one, so the translation results it outputs contain larger errors.
Disclosure of Invention
The embodiment of the disclosure provides a translation method, a translation device, translation equipment and translation media based on a multi-language machine translation model, so as to improve the accuracy of translation results output by the multi-language machine translation model.
In a first aspect, an embodiment of the present disclosure provides a translation method based on a multilingual machine translation model, including:
acquiring an original sentence to be translated and translation language information of the original sentence;
Determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used for correcting the translation error of a preset multi-language machine translation model;
And translating the original sentence based on the multi-language machine translation model and the target adapter to obtain a target sentence.
In a second aspect, an embodiment of the present disclosure further provides a translation apparatus based on a multilingual machine translation model, including:
the sentence acquisition module is used for acquiring an original sentence to be translated and translation language information of the original sentence;
The adapter determining module is used for determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used for correcting translation errors of a preset multilingual machine translation model;
And the translation module is used for translating the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, including:
one or more processors;
A memory for storing one or more programs,
The one or more programs, when executed by the one or more processors, cause the one or more processors to implement the methods as described in embodiments of the present disclosure.
In a fourth aspect, the disclosed embodiments also provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method as described in the disclosed embodiments.
The translation method, apparatus, device, and medium based on a multilingual machine translation model provided by the embodiments of the present disclosure acquire an original sentence to be translated and the translation language information of the original sentence, determine a target adapter that corresponds to that translation language information and is used for correcting translation errors of a preset multilingual machine translation model, and translate the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence. By using an adapter to correct the translation errors of the multilingual machine translation model, the embodiments of the present disclosure can improve the accuracy of the translation results output by the model.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 is a flow chart of a translation method based on a multilingual machine translation model according to an embodiment of the present disclosure;
FIG. 2 is a schematic view of an adapter according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a multi-language machine translation model according to an embodiment of the disclosure;
FIG. 4 is a schematic diagram of a connection relationship between target adapters according to an embodiment of the disclosure;
FIG. 5 is a flow chart of another translation method based on a multilingual machine translation model according to an embodiment of the present disclosure;
FIG. 6 is a block diagram of a translation device based on a multilingual machine translation model according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Definitions of other terms will be given in the description below.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that references to "a" and "an" in this disclosure are illustrative rather than limiting; those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
Fig. 1 is a flow chart of a translation method based on a multilingual machine translation model according to an embodiment of the present disclosure. The method may be performed by a translation apparatus based on a multilingual machine translation model, where the apparatus may be implemented in software and/or hardware and may be configured in an electronic device, typically a cell phone, tablet, or computer. As shown in fig. 1, the translation method based on the multilingual machine translation model provided in this embodiment may include:
s101, acquiring an original sentence to be translated and translation language information of the original sentence.
The original sentence is the sentence to be translated. It can be entered by the user through an input device such as a keyboard, or recognized through text recognition or speech recognition. That is, when a user needs a sentence translated, the sentence can be entered into the electronic device by text or voice input; alternatively, a picture containing the sentence can be captured, or a document containing the sentence can be acquired, and the picture or document can be imported into the electronic device. Accordingly, the translation method based on the multilingual machine translation model provided in this embodiment can translate text or speech entered by the user, or translate sentences contained in an imported picture or document; speech input and sentences in pictures are first converted into an original sentence in text form before translation. The description below uses text input of the original sentence as the example. The translation language information of the original sentence can be understood as the translation direction of the current request: it includes the source language information (the language of the original sentence to be translated) and the target language information (the language into which the original sentence needs to be translated), where a language can be, for example, English, Chinese, or German.
When a user needs to translate an original sentence, the user enters the sentence, the language information of the source language of the sentence, and the language information of the target language into which it should be translated into the electronic device, thereby generating a translation instruction for the sentence. Upon receiving the translation instruction, the electronic device acquires the original sentence and determines its translation language information; for example, the language selected by the user as the source on the translation page is determined as the source language information, and the language selected as the target is determined as the target language information.
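As a hedged illustration of the "translation language information" of S101, the translation direction can be represented as a (source, target) language-pair key that selects an adapter set. All names below (`ADAPTERS`, `get_target_adapter`, the string values) are hypothetical and not taken from the patent:

```python
# Hypothetical mapping from a translation direction (source, target)
# to the adapter set trained for that direction.
ADAPTERS = {
    ("en", "zh"): "adapter_set_en_zh",
    ("zh", "en"): "adapter_set_zh_en",
    ("en", "de"): "adapter_set_en_de",
}

def get_target_adapter(source_lang, target_lang):
    """Return the adapter set for this translation direction, or None."""
    return ADAPTERS.get((source_lang, target_lang))
```

For example, `get_target_adapter("en", "zh")` selects the English-to-Chinese adapter set, while an unregistered direction returns `None`.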
S102, determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used for correcting the translation error of a preset multilingual machine translation model.
In this embodiment, after the multilingual machine translation model has been trained, adapters can additionally be configured for the model for different translation scenarios (i.e., for each setting in which the model is used to translate an original sentence). When translating an original sentence in a given scenario, the adapter corresponding to that scenario is used to correct the translation errors of the multilingual machine translation model, improving the accuracy of the translation results the model outputs. Moreover, because an adapter has very few parameters (fewer than one-twentieth of the parameters of the multilingual machine translation model, and the larger the model, the smaller this ratio becomes), correcting translation errors by configuring adapters adds very few parameters and is easy to deploy.
Specifically, after obtaining the translation language information of the original sentence, the electronic device may obtain, from among the preset adapters, the adapter corresponding to the translation language information as the target adapter according to the translation language information.
The number of adapters corresponding to a given translation language information may be one or more. That is, this embodiment may set a single adapter for each translation language information, in which case only that adapter is used to correct the translation errors of the multilingual machine translation model when translating a matching original sentence; or it may set multiple adapters for each translation language information, in which case all of them are used, further improving the accuracy of the translation results output by the model. The description below uses the latter case as the example. Different translation language information may use different adapters. The structure of an adapter can also be chosen flexibly: for example, each adapter may consist of a normalization layer, a first feed-forward layer, and a second feed-forward layer connected in sequence, with an activation function configured between the two feed-forward layers; the activation function may be a Gaussian Error Linear Unit (GELU), as shown in fig. 2.
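The adapter structure just described (normalization layer, first feed-forward layer, GELU activation, second feed-forward layer) can be sketched in plain Python. This is a minimal illustration under assumed names (`layer_norm`, `adapter_forward`, etc.), not the patented implementation:

```python
import math

def gelu(x):
    """Gaussian Error Linear Unit (tanh approximation)."""
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi)
                                      * (x + 0.044715 * x ** 3)))

def layer_norm(vec, eps=1e-5):
    """Normalize a vector to zero mean and unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]

def matvec(weights, vec):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

def adapter_forward(vec, w_down, w_up):
    """Normalization -> first feed-forward (down-projection)
    -> GELU -> second feed-forward (up-projection)."""
    h = layer_norm(vec)
    h = [gelu(v) for v in matvec(w_down, h)]
    return matvec(w_up, h)
```

The small inner (bottleneck) dimension of `w_down`/`w_up` is what keeps the adapter's parameter count far below that of the base model.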
In this embodiment, the type of the multilingual machine translation model can be set as needed. For example, the model may be a Transformer, as shown in fig. 3 (only one encoder and one decoder are shown in fig. 3 by way of example). The model may include at least one encoder and at least one decoder, and preferably multiple of each, for example 6 encoders and 6 decoders. Each encoder is provided with at least two encoder sublayers: a self-attention layer and a feed-forward layer. Each decoder is provided with at least three decoder sublayers: a self-attention layer, an encoder-decoder attention layer, and a feed-forward layer. The encoders are connected in series, and the feed-forward layer of the last encoder is connected to the encoder-decoder attention layer of each decoder.
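The sublayer layout described above can be summarized as a small data structure; the names below are illustrative stand-ins, not the patent's terminology:

```python
# Each encoder has a self-attention sublayer and a feed-forward sublayer;
# each decoder additionally has an encoder-decoder attention sublayer.
ENCODER_SUBLAYERS = ["self_attention", "feed_forward"]
DECODER_SUBLAYERS = ["self_attention", "encoder_decoder_attention",
                     "feed_forward"]

def build_model(num_encoders=6, num_decoders=6):
    """Return a sketch of the sublayer layout of the Transformer model."""
    return {
        "encoders": [list(ENCODER_SUBLAYERS) for _ in range(num_encoders)],
        "decoders": [list(DECODER_SUBLAYERS) for _ in range(num_decoders)],
    }
```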
For example, when setting an adapter corresponding to a given translation language information for the multilingual machine translation model, a corresponding adapter may be set for each encoder and each decoder; alternatively, a corresponding adapter may be set for each encoder sublayer and each decoder sublayer; this embodiment is not limited in this respect. To further reduce the number of adapters the multilingual machine translation model needs while preserving the accuracy of the translation results it outputs, this embodiment may divide the encoder sublayers of each encoder into one or more encoder sublayer components, divide the decoder sublayers of each decoder into one or more decoder sublayer components, and set a corresponding adapter for each encoder sublayer component and each decoder sublayer component. In this case, preferably, the multilingual machine translation model includes an encoder and a decoder; the encoder includes at least one encoder sublayer component, each formed from at least one encoder sublayer; the decoder includes at least one decoder sublayer component, each formed from at least one decoder sublayer; and each encoder sublayer component and each decoder sublayer component is provided with a different adapter for each different translation language information.
The parameter values of the adapters corresponding to different sublayers may differ. The self-attention layer and the feed-forward layer in each encoder may each individually constitute an encoder sublayer component; the self-attention layer and the encoder-decoder attention layer in each decoder may together constitute one decoder sublayer component, while the feed-forward layer in each decoder may individually constitute another, as shown in fig. 4 (fig. 4 only shows one encoder and one decoder by way of example).
In one embodiment, before acquiring the original sentence to be translated and its translation language information, the method further includes: for each translation language information, acquiring a plurality of training samples that conform to that translation language information, and inputting each training sample into the multilingual machine translation model to train the adapter corresponding to that translation language information.
In the above embodiment, the adapters corresponding to each sublayer component (including encoder sublayer components and decoder sublayer components) under different translation language information can be obtained through training. Specifically, for each translation language information: after the multilingual machine translation model itself has been trained, its parameters are fixed; an initial adapter is set for each sublayer component of the model; training samples conforming to the translation language information are used to train each initial adapter; test samples are used to measure the translation error of the model; and the per-component adapters obtained when the translation error falls below a preset error threshold are determined as the adapters corresponding to the model under that translation language information.
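The training recipe above (fix the trained multilingual model's parameters, then train only the adapters) can be sketched as follows. The `Param` class and function names are hypothetical stand-ins for illustration, not the patent's implementation:

```python
class Param:
    """A toy parameter with a value and a trainable flag."""
    def __init__(self, value, trainable=True):
        self.value = value
        self.trainable = trainable

def trainable_parameters(base_params, adapter_params):
    """Freeze every base-model parameter, leave adapters trainable,
    and return the list of parameters the optimizer should update."""
    for p in base_params:
        p.trainable = False
    for p in adapter_params:
        p.trainable = True
    return [p for p in base_params + adapter_params if p.trainable]
```

Only the returned (adapter) parameters would be passed to the optimizer, which is why the added training cost and deployment footprint stay small.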
S103, translating the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
The target sentence is a sentence obtained by translating the original sentence.
For example, under a given translation language information: when the multilingual machine translation model is configured with only one corresponding adapter, the data output by the model can be obtained and corrected by the target adapter to yield the target sentence. When each encoder and each decoder in the model is configured with a corresponding adapter, the data output by an encoder/decoder is corrected by that encoder's/decoder's adapter, the corrected data is input into the next layer, and the sentence finally output by the model is determined as the target sentence. When each sublayer component in the model is configured with a corresponding adapter, the original data output by a sublayer component is corrected by that component's adapter, the corrected target data is input into the next layer, and the sentence finally output by the model is determined as the target sentence. In this case, preferably, translating the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence includes: translating the original sentence with the multilingual machine translation model, and correcting the original output data of each sublayer component with the first target adapter of that component to obtain the target sentence, where a sublayer component includes an encoder sublayer component and/or a decoder sublayer component.
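The preferred per-sublayer-component variant of S103 can be sketched as a loop in which each component's raw output is corrected by its adapter before being passed to the next component. The callables below are hypothetical stand-ins; note that, as in S205, the adapter receives the same target input data as its component:

```python
def run_with_adapters(x, sublayer_components, adapters):
    """Run each sublayer component and correct its raw output with the
    component's adapter (parallel lists of callables over vectors)."""
    for component, adapter in zip(sublayer_components, adapters):
        raw = component(x)        # original output data of the component
        correction = adapter(x)   # correction parameters from the adapter
        # corrected target output, fed forward as the next input
        x = [r + c for r, c in zip(raw, correction)]
    return x
```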
The translation method based on the multilingual machine translation model provided by this embodiment acquires an original sentence to be translated and its translation language information, determines a target adapter that corresponds to that translation language information and is used for correcting translation errors of a preset multilingual machine translation model, and translates the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence. By using an adapter to correct the translation errors of the multilingual machine translation model, this embodiment can improve the accuracy of the translation results output by the model.
Fig. 5 is a flow chart of another translation method based on a multilingual machine translation model according to an embodiment of the present disclosure; the solution in this embodiment may be combined with one or more alternatives of the above embodiments. Optionally, translating the original sentence with the multilingual machine translation model and correcting the original output data of each sublayer component with the first target adapter of that component to obtain the target sentence includes: according to the connection relationships of the sublayer components in the multilingual machine translation model, determining the first sublayer component in the model as the current sublayer component, and acquiring the target input data of the current sublayer component; determining the first target adapter of the current sublayer component as the current target adapter; inputting the target input data into the current sublayer component and the current target adapter respectively, to obtain the original output data of the current sublayer component and the current correction parameters output by the current target adapter; correcting the original output data with the current correction parameters to obtain the target output data of the current sublayer component; determining the target output data as the target input data of the next sublayer component, determining the next sublayer component as the current sublayer component, and returning to the operation of determining the first target adapter of the current sublayer component as the current target adapter, until no next sublayer component exists; and when no next sublayer component exists, inputting the target output data of the current sublayer component into the layer following the current sublayer component to obtain the target sentence.
Accordingly, as shown in fig. 5, the translation method based on the multilingual machine translation model provided in this embodiment may include:
S201, acquiring an original sentence to be translated and translation language information of the original sentence.
S202, determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used for correcting the translation error of a preset multilingual machine translation model.
S203, determining a first sublayer component in the multi-language machine translation model as a current sublayer component according to the connection relation of each sublayer component in the multi-language machine translation model, and acquiring target input data of the current sublayer component.
The target input data may be understood as the data input into the current sublayer component.
For example, when the original sentence is translated, it may be input into the multilingual machine translation model, and according to the connection relationships between the layers of the model, each layer is controlled in turn to process the information output by the previous layer. When the target output data of the layer preceding the first sublayer component in the model is obtained, the first sublayer component is determined as the current sublayer component, and the target output data of that previous layer is determined as the target input data of the current sublayer component.
S204, determining the first target adapter of the current sublayer component as the current target adapter.
The first target adapter may be understood as the adapter configured for a sublayer component to correct that component's original output data, i.e., an adapter configured for an encoder sublayer component of an encoder or a decoder sublayer component of a decoder. In this embodiment, the target adapters corresponding to the translation language information of the original sentence may include a first target adapter configured for each sublayer component in the multilingual machine translation model, and may further include a second target adapter configured for a word embedding layer (including an input word embedding layer and an output word embedding layer) of the model.
Specifically, from the target adapters determined in S202, a first target adapter corresponding to the component identification information of the current sub-layer component may be selected as the current target adapter according to the component identification information of the current sub-layer component.
S205, respectively inputting the target input data into the current sublayer assembly and the current target adapter to obtain the original output data of the current sublayer assembly and the current correction parameters output by the current target adapter.
The original output data of the current sublayer component may be understood as the uncorrected data computed by the current sublayer component from its target input data. The current correction parameters may be understood as parameters for correcting that original output data, computed from the same target input data by the current target adapter configured for the current sublayer component.
For example, after determining the current target adapter of the current sublayer assembly, the target input data of the current sublayer assembly may be respectively input into the current sublayer assembly and the current target adapter, and the data output by the current sublayer assembly may be obtained as the original output data of the current sublayer assembly, and the data output by the current target adapter may be obtained as the current correction parameter.
S206, correcting the original output data by adopting the current correction parameters to obtain target output data of the current sublayer assembly.
For example, the current correction parameters may be used to correct the original output data, such as correcting the original output data to be the sum of the original output data and the current correction parameters, and determining the data corrected from the original output data as the target output data of the current sub-layer component.
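Steps S205 and S206 can be sketched as a residual-style correction: the same target input is fed to both the sublayer component and its current target adapter, and the target output is their element-wise sum, as in the patent's example. The function and argument names are illustrative, and `sublayer` and `adapter` are placeholder callables, not an API from the patent.

```python
def corrected_sublayer_output(target_input, sublayer, adapter):
    """S205: obtain original output data and current correction parameters
    from the same target input; S206: correct by element-wise addition."""
    original_output = sublayer(target_input)  # original output data
    correction = adapter(target_input)        # current correction parameters
    return [o + c for o, c in zip(original_output, correction)]
```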
S207, determining whether a next sublayer component exists; if so, executing S208; if not, executing S209.
For example, according to the connection relationship of the sublayer components in the multilingual machine translation model, it can be determined whether the layer connected to the output end of the current sublayer component belongs to a sublayer component; if so, a next sublayer component exists, and the sublayer component to which that layer belongs is determined to be the next sublayer component.
It can be understood that, if the output end of the current sublayer component is connected to the input ends of multiple sublayers, the sublayer component to which each of those sublayers belongs can be determined as a next sublayer component according to the flow direction of data in the multilingual machine translation model.
S208, determining the target output data as the target input data of the next sublayer component, determining the next sublayer component as the current sublayer component, and returning to S204.
S209, inputting the target output data of the current sublayer component into the next layer of the current sublayer component to obtain a target sentence.
For example, as shown in fig. 4, the output end of the last sublayer component in the multilingual machine translation model is connected to the output layer of the model. If the current sublayer component has no next sublayer component, the current sublayer component is the last sublayer component in the model. In this case, after obtaining the target output data of the current sublayer component, the electronic device may pass the target output data to the output layer of the model, so that the model outputs, through the output layer, the target sentence obtained by translating the original sentence.
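The control flow of S204 through S209 can be sketched as a single loop over the ordered chain of sublayer components. All names here are illustrative: `components` is an ordered list of (component id, sublayer callable) pairs following the model's connection relation, and `adapters` maps each component id to its first target adapter.

```python
def run_model(target_input, components, adapters, output_layer):
    """Walk the chain of sublayer components (S204-S208), correcting each
    component's original output with its first target adapter, then feed
    the final target output into the model's output layer (S209)."""
    data = target_input
    for component_id, sublayer in components:
        adapter = adapters[component_id]      # S204: current target adapter
        original = sublayer(data)             # S205: original output data
        correction = adapter(data)            # S205: current correction parameters
        data = [o + c for o, c in zip(original, correction)]  # S206
    return output_layer(data)                 # S209: output layer yields the target sentence
```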
In one embodiment, the multi-language machine translation model further comprises an input word embedding layer and an output word embedding layer, wherein an output end of the input word embedding layer is connected with an input end of a first encoder sublayer component in the multi-language machine translation model, and an output end of the output word embedding layer is connected with an input end of a first decoder sublayer component in the multi-language machine translation model.
In the above embodiment, as shown in fig. 3, an input word embedding layer and an output word embedding layer may be further provided in the multi-language machine translation model, where an output end of the input word embedding layer may be connected to an input end of the self-attention layer of the first encoder in the multi-language machine translation model, and an output end of the output word embedding layer may be connected to an input end of the self-attention layer of the first decoder in the multi-language machine translation model.
To further improve the accuracy of the translation result output by the multilingual machine translation model, in the above embodiment, as shown in fig. 4, corresponding adapters may also be provided for the input word embedding layer and the output word embedding layer, to help these layers better model word semantics. In this case, the translation method based on the multilingual machine translation model preferably further includes: when original word embedding output data of a word embedding layer is received, inputting the original word embedding output data into a second target adapter of the word embedding layer, and acquiring the word embedding correction parameters output by the second target adapter, wherein the word embedding layer is the input word embedding layer or the output word embedding layer; and correcting the original word embedding output data with the word embedding correction parameters to obtain target word embedding output data of the word embedding layer, the target word embedding output data serving as the target input data of the sublayer component connected to the word embedding layer.
The second target adapter may be understood as an adapter configured for the input word embedding layer or the output word embedding layer and used for correcting the original word embedding output data of that layer. The original word embedding output data may be understood as the data output by the word embedding layer.
After obtaining the first original word embedding output data output by the input word embedding layer, the electronic device may first acquire the second target adapter corresponding to the identification information of the input word embedding layer, input the first original word embedding output data into that second target adapter, and obtain its output as the word embedding correction parameters; the electronic device then corrects the first original word embedding output data with the word embedding correction parameters, takes the corrected data as the target word embedding output data of the input word embedding layer, and inputs it into the self-attention layer connected to the output end of the input word embedding layer. Similarly, after obtaining the second original word embedding output data output by the output word embedding layer, the electronic device may first acquire the second target adapter corresponding to the identification information of the output word embedding layer, input the second original word embedding output data into that second target adapter, and obtain its output as the word embedding correction parameters; the electronic device then corrects the second original word embedding output data with the word embedding correction parameters, takes the corrected data as the target word embedding output data of the output word embedding layer, and inputs it into the self-attention layer connected to the output end of the output word embedding layer.
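The word-embedding path described above differs from the sublayer path in one detail: the second target adapter takes the embedding layer's original output data, rather than the layer's input, as its own input. A minimal sketch with placeholder callables (`embed` and `second_adapter` are illustrative names, not the patent's API):

```python
def corrected_word_embedding(token_ids, embed, second_adapter):
    """The second target adapter computes word embedding correction parameters
    from the original word embedding output data; the corrected result then
    feeds the self-attention layer connected to the embedding layer."""
    original = embed(token_ids)            # original word embedding output data
    correction = second_adapter(original)  # word embedding correction parameters
    return [o + c for o, c in zip(original, correction)]
```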
According to the translation method based on the multilingual machine translation model provided in this embodiment, adapters corresponding to different translation language information are respectively configured for the encoder sublayer components and decoder sublayer components in the multilingual machine translation model. When an original sentence is translated, the adapters configured for these components are respectively used to correct the data output by the corresponding encoder or decoder, so that the translation accuracy of the multilingual machine translation model can be further improved while adding only a few parameters.
Fig. 6 is a block diagram of a translation device based on a multilingual machine translation model according to an embodiment of the present disclosure. The device can be implemented in software and/or hardware and can be configured in an electronic device, typically a mobile phone, a tablet computer, or a computer, and can perform sentence translation by executing the translation method based on the multilingual machine translation model. As shown in fig. 6, the translation device based on the multilingual machine translation model provided in this embodiment may include: a statement acquisition module 601, an adapter determination module 602, and a translation module 603, wherein,
A sentence acquisition module 601, configured to acquire an original sentence to be translated and translation language information of the original sentence;
An adapter determining module 602, configured to determine a target adapter corresponding to translation language information of the original sentence, where the target adapter is configured to correct a preset translation error of a multi-language machine translation model;
and the translation module 603 is configured to translate the original sentence based on the multilingual machine translation model and the target adapter, to obtain a target sentence.
According to the translation device based on the multilingual machine translation model provided in this embodiment, the sentence acquisition module acquires an original sentence to be translated and the translation language information of the original sentence; the adapter determining module determines the target adapter that corresponds to the translation language information of the original sentence and is used for correcting a preset translation error of the multilingual machine translation model; and the translation module translates the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence. By adopting this technical scheme, the adapter corrects the translation errors of the multilingual machine translation model, so that the accuracy of the translation result output by the model can be improved.
Optionally, the multilingual machine translation model includes an encoder and a decoder, wherein the encoder includes at least one encoder sublayer component, and the encoder sublayer component is formed by at least one encoder sublayer; the decoder comprises at least one decoder sublayer assembly, the decoder sublayer assembly is composed of at least one decoder sublayer, and each encoder sublayer assembly and each decoder sublayer assembly are provided with different adapters corresponding to different translation language information.
Optionally, the translation module 603 is specifically configured to: and translating the original sentence by adopting the multi-language machine translation model, and correcting the original output value of each sublayer component by adopting a first target adapter of each sublayer component to obtain a target sentence, wherein the sublayer component comprises an encoder sublayer component and/or a decoder sublayer component.
Optionally, the translation module 603 includes: a component determining unit, configured to determine the first sublayer component in the multilingual machine translation model as the current sublayer component according to the connection relationship of the sublayer components in the model, and to acquire target input data of the current sublayer component; an adapter acquisition unit, configured to determine the first target adapter of the current sublayer component as the current target adapter; a parameter determining unit, configured to respectively input the target input data into the current sublayer component and the current target adapter to obtain the original output data of the current sublayer component and the current correction parameters output by the current target adapter; a correction unit, configured to correct the original output data with the current correction parameters to obtain target output data of the current sublayer component; a calling unit, configured to determine the target output data as the target input data of the next sublayer component, determine the next sublayer component as the current sublayer component, and invoke the adapter acquisition unit again until no next sublayer component exists; and an input unit, configured to input the target output data of the current sublayer component into the next layer of the current sublayer component when no next sublayer component exists, to obtain a target sentence.
In the above scheme, the multi-language machine translation model further includes an input word embedding layer and an output word embedding layer, wherein an output end of the input word embedding layer is connected to an input end of a first encoder sublayer component in the multi-language machine translation model, and an output end of the output word embedding layer is connected to an input end of a first decoder sublayer component in the multi-language machine translation model.
Optionally, the translation device based on the multilingual machine translation model provided in this embodiment further includes: the adapter input module is used for inputting the original word embedding output data into a second target adapter of the word embedding layer when receiving the original word embedding output data of the word embedding layer, and acquiring word embedding correction parameters output by the second target adapter, wherein the word embedding layer is an input word embedding layer or an output word embedding layer; and the embedding layer correction module is used for correcting the original word embedding output data by adopting the word embedding correction parameters to obtain target word embedding output data of the word embedding layer, so that the target word embedding output data is used as target input data of a sublayer component connected with the word embedding layer.
Optionally, the translation device based on the multilingual machine translation model provided in this embodiment further includes: the adapter training module is used for acquiring a plurality of training samples conforming to the translation language information for each translation language information before acquiring the original sentence to be translated and the target language information to be translated, and inputting each training sample into a multi-language machine translation model so as to train and obtain an adapter corresponding to the translation language information.
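The per-language training described by the adapter training module can be sketched as follows: the shared multilingual model stays frozen while, for each translation language information (a translation direction), a fresh adapter is fitted on the training samples of that direction. The data layout, `new_adapter`, and `update_step` callables below are assumptions for illustration; the patent only states that samples conforming to each translation language information are fed through the model to train its adapter.

```python
def train_language_adapters(frozen_model, samples_by_direction, new_adapter, update_step):
    """For each translation direction (e.g. "en->zh"), train a dedicated
    adapter on its samples while the shared model parameters stay frozen."""
    adapters = {}
    for direction, samples in samples_by_direction.items():
        adapter_state = new_adapter()
        for sample in samples:
            # Only the adapter state changes; frozen_model is read-only here.
            adapter_state = update_step(adapter_state, frozen_model, sample)
        adapters[direction] = adapter_state
    return adapters
```

Training only the adapters keeps the number of trainable parameters per added language small, which is the parameter-efficiency advantage the disclosure claims.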
The translation device based on the multilingual machine translation model provided by the embodiment of the disclosure can execute the translation method based on the multilingual machine translation model provided by any embodiment of the disclosure, and has the corresponding functional modules and beneficial effects of executing the translation method based on the multilingual machine translation model. Technical details not described in detail in this embodiment may be found in the translation method based on the multilingual machine translation model provided in any embodiment of the present disclosure.
Referring now to fig. 7, a schematic diagram of an electronic device (e.g., terminal device) 700 suitable for use in implementing embodiments of the present disclosure is shown. The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 7 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.
As shown in fig. 7, the electronic device 700 may include a processing means (e.g., a central processor, a graphics processor, etc.) 701, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage means 706 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the electronic device 700 are also stored. The processing device 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
In general, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 707 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 706 including, for example, magnetic tape, hard disk, etc.; and a communication device 709. The communication means 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. While fig. 7 shows an electronic device 700 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via communication device 709, or installed from storage 706, or installed from ROM 702. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 701.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some embodiments, clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring an original sentence to be translated and translation language information of the original sentence; determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used for correcting the translation error of a preset multi-language machine translation model; and translating the original sentence based on the multi-language machine translation model and the target adapter to obtain a target sentence.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by software or by hardware. In some cases, the name of a unit does not constitute a limitation of the unit itself.
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In accordance with one or more embodiments of the present disclosure, example 1 provides a translation method based on a multilingual machine translation model, comprising:
acquiring an original sentence to be translated and translation language information of the original sentence;
Determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used for correcting the translation error of a preset multi-language machine translation model;
And translating the original sentence based on the multi-language machine translation model and the target adapter to obtain a target sentence.
According to one or more embodiments of the present disclosure, example 2 the method of example 1, the multi-language machine translation model comprising an encoder and a decoder, the encoder including at least one encoder sublayer component therein, the encoder sublayer component being comprised of at least one encoder sublayer; the decoder comprises at least one decoder sublayer assembly, the decoder sublayer assembly is composed of at least one decoder sublayer, and each encoder sublayer assembly and each decoder sublayer assembly are provided with different adapters corresponding to different translation language information.
According to one or more embodiments of the present disclosure, example 3 is the method of example 2, the translating the original sentence based on the multi-language machine translation model and the target adapter to obtain a target sentence, including:
And translating the original sentence by adopting the multi-language machine translation model, and correcting the original output value of each sublayer component by adopting a first target adapter of each sublayer component to obtain a target sentence, wherein the sublayer component comprises an encoder sublayer component and/or a decoder sublayer component.
According to one or more embodiments of the present disclosure, example 4 is the method of example 3, the translating the original sentence using the multi-language machine translation model, and correcting the original output value of each sub-layer component using the first target adapter of the sub-layer component to obtain a target sentence, including:
according to the connection relation of each sublayer component in a multi-language machine translation model, determining the first sublayer component in the multi-language machine translation model as a current sublayer component, and acquiring target input data of the current sublayer component;
determining a first target adapter of the current sublayer component as a current target adapter;
Respectively inputting the target input data into the current sublayer assembly and the current target adapter to obtain the original output data of the current sublayer assembly and the current correction parameters output by the current target adapter;
Correcting the original output data by adopting the current correction parameters to obtain target output data of the current sublayer assembly;
determining the target output data as the target input data of the next sublayer component, determining the next sublayer component as the current sublayer component, and returning to execute the operation of determining the first target adapter of the current sublayer component as the current target adapter, until no next sublayer component exists;
And when the next sub-layer component does not exist, inputting the target output data of the current sub-layer component into the next layer of the current sub-layer component to obtain a target sentence.
According to one or more embodiments of the present disclosure, example 5 is the method of example 3 or 4, the multi-language machine translation model further comprising an input word embedding layer and an output word embedding layer, the output of the input word embedding layer being connected to the input of a first encoder sublayer component in the multi-language machine translation model, the output of the output word embedding layer being connected to the input of a first decoder sublayer component in the multi-language machine translation model.
According to one or more embodiments of the present disclosure, example 6 is the method of example 5, further comprising:
When original word embedding output data of a word embedding layer are received, inputting the original word embedding output data into a second target adapter of the word embedding layer, and acquiring word embedding correction parameters output by the second target adapter, wherein the word embedding layer is an input word embedding layer or an output word embedding layer;
Correcting the original word embedding output data by adopting the word embedding correction parameters to obtain target word embedding output data of the word embedding layer, wherein the target word embedding output data is used as target input data of a sublayer component connected with the word embedding layer.
According to one or more embodiments of the present disclosure, example 7, the method according to any one of examples 1 to 4 further includes, before acquiring the original sentence to be translated and the target language information to be translated:
For each translation language information, a plurality of training samples conforming to the translation language information are acquired, and each training sample is input into a multi-language machine translation model to train to obtain an adapter corresponding to the translation language information.
Example 8 provides a translation apparatus based on a multilingual machine translation model, according to one or more embodiments of the present disclosure, comprising:
the sentence acquisition module is used for acquiring an original sentence to be translated and translation language information of the original sentence;
The adapter determining module is used for determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used for correcting the preset translation error of the multi-language machine translation model;
And the translation module is used for translating the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence.
Example 9 provides an electronic device according to one or more embodiments of the present disclosure, comprising:
one or more processors;
A memory for storing one or more programs,
The one or more programs, when executed by the one or more processors, cause the one or more processors to implement a translation method based on a multilingual machine translation model as set forth in any one of examples 1-7.
In accordance with one or more embodiments of the present disclosure, example 10 provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a multi-language machine translation model-based translation method as described in any of examples 1-7.
The foregoing description is merely of the preferred embodiments of the present disclosure and an explanation of the technical principles employed. Persons skilled in the art will appreciate that the scope of disclosure involved herein is not limited to the technical solutions formed by the specific combinations of the features described above, and also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the concept of the disclosure, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the present disclosure.
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (8)

1. A method of translating based on a multilingual machine translation model, comprising:
Acquiring an original sentence to be translated and translation language information of the original sentence; the translation language information of the original sentence is translation direction information in the translation process, and the translation direction information comprises original language information and target language information of the translation process;
Determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used for correcting translation errors of the preset multi-language machine translation model;
Translating the original sentence based on the multi-language machine translation model and the target adapter to obtain a target sentence;
wherein the translating the original sentence based on the multi-language machine translation model and the target adapter to obtain a target sentence comprises:
translating the original sentence by using the multi-language machine translation model, and correcting an original output value of each sublayer component by using a first target adapter of the sublayer component to obtain the target sentence, wherein the sublayer component comprises an encoder sublayer component and/or a decoder sublayer component;
wherein the translating the original sentence by using the multi-language machine translation model and correcting the original output value of each sublayer component by using the first target adapter of the sublayer component to obtain the target sentence comprises:
determining, according to the connection relation of the sublayer components in the multi-language machine translation model, the first sublayer component in the multi-language machine translation model as a current sublayer component, and acquiring target input data of the current sublayer component;
determining a first target adapter of the current sublayer component as a current target adapter;
inputting the target input data into the current sublayer component and the current target adapter respectively, to obtain original output data of the current sublayer component and current correction parameters output by the current target adapter,
wherein the current correction parameters are parameters for correcting the original output data of the current sublayer component, and are calculated by the current target adapter configured for the current sublayer component according to the target input data of the current sublayer component;
correcting the original output data by using the current correction parameters to obtain target output data of the current sublayer component;
determining the target output data as target input data of the next sublayer component, determining the next sublayer component as the current sublayer component, and returning to the operation of determining the first target adapter of the current sublayer component as the current target adapter, until no next sublayer component exists; and
when no next sublayer component exists, inputting the target output data of the current sublayer component into the layer following the current sublayer component to obtain the target sentence.
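As a rough illustration of the loop described in claim 1 (not the claimed implementation), the per-sublayer correction can be sketched in Python. The sublayer and adapter callables below are toy stand-ins, and the additive combination of output and correction is an assumption, being one common adapter design.

```python
# Illustrative sketch of the claim-1 loop: each sublayer's raw output is
# corrected by the adapter selected for the current language pair. The
# additive combination and all callables here are assumptions.

def run_with_adapters(sublayers, adapters, target_input):
    x = target_input
    for sublayer, adapter in zip(sublayers, adapters):
        raw = sublayer(x)   # original output data of the current sublayer
        corr = adapter(x)   # correction parameters computed from the
                            # target input data of the current sublayer
        # target output data, which becomes the next sublayer's target input
        x = [r + c for r, c in zip(raw, corr)]
    return x  # fed to the layer following the last sublayer component

# Toy stand-ins: a "sublayer" doubles each value; an "adapter" adds 0.1.
sublayers = [lambda v: [2.0 * e for e in v]] * 2
adapters = [lambda v: [0.1] * len(v)] * 2
out = run_with_adapters(sublayers, adapters, [1.0, 2.0])
```

Note that the adapter receives the same input as the sublayer it corrects, so a different adapter can be swapped in per translation direction without touching the sublayer itself.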
2. The method of claim 1, wherein the multi-language machine translation model comprises an encoder and a decoder, the encoder comprises at least one encoder sublayer component, and each encoder sublayer component is composed of at least one encoder sublayer; the decoder comprises at least one decoder sublayer component, and each decoder sublayer component is composed of at least one decoder sublayer; and each encoder sublayer component and each decoder sublayer component is provided with different adapters corresponding to different translation language information.
3. The method of claim 1, wherein the multi-language machine translation model further comprises an input word embedding layer and an output word embedding layer, the output of the input word embedding layer being coupled to the input of a first encoder sublayer component in the multi-language machine translation model, the output of the output word embedding layer being coupled to the input of a first decoder sublayer component in the multi-language machine translation model.
4. The method according to claim 3, further comprising:
when original word-embedding output data of a word embedding layer is received, inputting the original word-embedding output data into a second target adapter of the word embedding layer, and acquiring word-embedding correction parameters output by the second target adapter, wherein the word embedding layer is the input word embedding layer or the output word embedding layer; and
correcting the original word-embedding output data by using the word-embedding correction parameters to obtain target word-embedding output data of the word embedding layer, wherein the target word-embedding output data serves as target input data of the sublayer component connected to the word embedding layer.
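The word-embedding correction of claim 4 differs from the sublayer case in that the second target adapter consumes the raw embedding output itself rather than the layer's input. A minimal sketch, with all callables and the additive correction assumed for illustration:

```python
# Sketch of the claim-4 word-embedding correction: the second target
# adapter takes the raw embedding output (unlike the sublayer adapters,
# which take the sublayer input). Additive correction and all callables
# below are illustrative assumptions.

def embed_with_adapter(embed, adapter, tokens):
    raw = embed(tokens)    # original word-embedding output data
    corr = adapter(raw)    # word-embedding correction parameters
    # target word-embedding output data, passed to the connected sublayer
    return [r + c for r, c in zip(raw, corr)]

# Toy stand-ins: the "embedding" maps token ids to floats; the adapter
# shifts every dimension by 0.5.
embed = lambda toks: [float(t) for t in toks]
adapter = lambda vec: [0.5] * len(vec)
out = embed_with_adapter(embed, adapter, [1, 2])
```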
5. The method according to any one of claims 1-2, further comprising, before the acquiring the original sentence to be translated and the translation language information of the original sentence:
for each piece of translation language information, acquiring a plurality of training samples that conform to the translation language information, and inputting each training sample into the multi-language machine translation model for training, to obtain an adapter corresponding to the translation language information.
6. A multi-language machine translation model-based translation device, comprising:
The sentence acquisition module is used for acquiring an original sentence to be translated and translation language information of the original sentence; the translation language information of the original sentence is translation direction information in the translation process, and the translation direction information comprises original language information and target language information of the translation process;
The adapter determination module is used for determining a target adapter corresponding to the translation language information of the original sentence, wherein the target adapter is used for correcting translation errors of the preset multi-language machine translation model;
The translation module is used for translating the original sentence based on the multilingual machine translation model and the target adapter to obtain a target sentence;
The translation module is specifically used for: translating the original sentence by using the multi-language machine translation model, and correcting an original output value of each sublayer component by using a first target adapter of the sublayer component to obtain the target sentence, wherein the sublayer component comprises an encoder sublayer component and/or a decoder sublayer component;
The translation module comprises:
The component determining unit is used for determining a first sublayer component in the multi-language machine translation model as a current sublayer component according to the connection relation of each sublayer component in the multi-language machine translation model and acquiring target input data of the current sublayer component;
an adapter acquisition unit, configured to determine a first target adapter of the current sublayer component as a current target adapter;
The parameter determination unit is used for inputting the target input data into the current sublayer component and the current target adapter respectively, to obtain original output data of the current sublayer component and current correction parameters output by the current target adapter, wherein the current correction parameters are parameters for correcting the original output data of the current sublayer component, and are calculated by the current target adapter configured for the current sublayer component according to the target input data of the current sublayer component;
The correction unit is used for correcting the original output data by using the current correction parameters to obtain target output data of the current sublayer component;
The calling unit is used for determining the target output data as target input data of the next sublayer component, determining the next sublayer component as the current sublayer component, and calling back the adapter acquisition unit until no next sublayer component exists; and
the input unit is used for, when no next sublayer component exists, inputting the target output data of the current sublayer component into the layer following the current sublayer component to obtain the target sentence.
7. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the translation method based on a multi-language machine translation model according to any one of claims 1-5.
8. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements a translation method based on a multilingual machine translation model as claimed in any one of claims 1 to 5.
CN202011409340.1A 2020-12-04 2020-12-04 Translation method, device, equipment and medium based on multilingual machine translation model Active CN112380876B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011409340.1A CN112380876B (en) 2020-12-04 2020-12-04 Translation method, device, equipment and medium based on multilingual machine translation model
PCT/CN2021/131090 WO2022116821A1 (en) 2020-12-04 2021-11-17 Translation method and apparatus employing multi-language machine translation model, device, and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011409340.1A CN112380876B (en) 2020-12-04 2020-12-04 Translation method, device, equipment and medium based on multilingual machine translation model

Publications (2)

Publication Number Publication Date
CN112380876A CN112380876A (en) 2021-02-19
CN112380876B true CN112380876B (en) 2024-06-14

Family

ID=74590507

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011409340.1A Active CN112380876B (en) 2020-12-04 2020-12-04 Translation method, device, equipment and medium based on multilingual machine translation model

Country Status (2)

Country Link
CN (1) CN112380876B (en)
WO (1) WO2022116821A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112380876B (en) * 2020-12-04 2024-06-14 北京有竹居网络技术有限公司 Translation method, device, equipment and medium based on multilingual machine translation model
CN115438678B (en) * 2022-11-08 2023-03-24 苏州浪潮智能科技有限公司 Machine translation method, device, electronic equipment and storage medium
CN115688815B (en) * 2022-12-30 2023-03-31 北京澜舟科技有限公司 Multilingual translation model construction method and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111178093A (en) * 2019-12-20 2020-05-19 沈阳雅译网络技术有限公司 Neural machine translation system training acceleration method based on stacking algorithm

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7039625B2 (en) * 2002-11-22 2006-05-02 International Business Machines Corporation International information search and delivery system providing search results personalized to a particular natural language
US8880770B2 (en) * 2012-06-07 2014-11-04 Apple Inc. Protocol translating adapter
US10909331B2 (en) * 2018-03-30 2021-02-02 Apple Inc. Implicit identification of translation payload with neural machine translation
US11138392B2 (en) * 2018-07-26 2021-10-05 Google Llc Machine translation using neural network models
CN109543824B (en) * 2018-11-30 2023-05-23 腾讯科技(深圳)有限公司 Sequence model processing method and device
CN110852116B (en) * 2019-11-07 2021-08-31 腾讯科技(深圳)有限公司 Non-autoregressive neural machine translation method, device, computer equipment and medium
CN111222347B (en) * 2020-04-15 2020-07-28 北京金山数字娱乐科技有限公司 Sentence translation model training method and device and sentence translation method and device
CN111460838B (en) * 2020-04-23 2023-09-22 腾讯科技(深圳)有限公司 Pre-training method, device and storage medium of intelligent translation model
CN111859927B (en) * 2020-06-01 2024-03-15 北京先声智能科技有限公司 Grammar correction model based on attention sharing convertors
CN112380876B (en) * 2020-12-04 2024-06-14 北京有竹居网络技术有限公司 Translation method, device, equipment and medium based on multilingual machine translation model

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111178093A (en) * 2019-12-20 2020-05-19 沈阳雅译网络技术有限公司 Neural machine translation system training acceleration method based on stacking algorithm

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Ankur Bapna et al., "Simple, Scalable Adaptation for Neural Machine Translation", arXiv, 2019, pp. 1-11. *

Also Published As

Publication number Publication date
CN112380876A (en) 2021-02-19
WO2022116821A1 (en) 2022-06-09

Similar Documents

Publication Publication Date Title
CN112380876B (en) Translation method, device, equipment and medium based on multilingual machine translation model
CN112183120A (en) Speech translation method, device, equipment and storage medium
CN111339789B (en) Translation model training method and device, electronic equipment and storage medium
CN111046677B (en) Method, device, equipment and storage medium for obtaining translation model
CN113139391B (en) Translation model training method, device, equipment and storage medium
CN112509562B (en) Method, apparatus, electronic device and medium for text post-processing
CN111563390B (en) Text generation method and device and electronic equipment
CN112270200B (en) Text information translation method and device, electronic equipment and storage medium
CN113378586B (en) Speech translation method, translation model training method, device, medium, and apparatus
CN112712801A (en) Voice wake-up method and device, electronic equipment and storage medium
CN113204977A (en) Information translation method, device, equipment and storage medium
CN115640815A (en) Translation method, translation device, readable medium and electronic equipment
CN110009101B (en) Method and apparatus for generating a quantized neural network
CN111104796A (en) Method and device for translation
CN114120975A (en) Method, apparatus and storage medium for speech recognition punctuation recovery
CN112257459B (en) Language translation model training method, translation method, device and electronic equipment
WO2023138361A1 (en) Image processing method and apparatus, and readable storage medium and electronic device
CN117034923A (en) Training method, text evaluation method, device, medium and equipment
CN113591498B (en) Translation processing method, device, equipment and medium
CN112380883B (en) Model training method, machine translation method, device, equipment and storage medium
CN116072108A (en) Model generation method, voice recognition method, device, medium and equipment
CN112509581B (en) Error correction method and device for text after voice recognition, readable medium and electronic equipment
CN111382557B (en) Batch processing method, device, terminal and storage medium for non-fixed-length input data
CN112286808B (en) Application program testing method and device, electronic equipment and medium
CN111382577B (en) Document translation method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant