CN115879469A - Text data processing method, model training method, device and medium - Google Patents

Text data processing method, model training method, device and medium

Info

Publication number: CN115879469A
Application number: CN202211737328.2A
Authority: CN (China)
Other languages: Chinese (zh)
Other versions: CN115879469B (granted)
Inventors: 高杨帆, 孙辉丰, 孙叔琦, 常月
Assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Prior art keywords: text, target, sample, tag, sample text
Legal status: Active (granted)

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Machine Translation (AREA)

Abstract

The present disclosure provides a text data processing method, a model training method, an apparatus and a medium, which relate to the technical field of artificial intelligence, and in particular to the fields of text data processing, deep learning, natural language processing and dialog systems. The implementation scheme is as follows: generating an original text for replying to the input text based on the input text of the user; acquiring target style information; and generating a target text corresponding to the target style based on the original text and the target style information.

Description

Text data processing method, model training method, device and medium
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, in particular to the fields of text data processing, deep learning, natural language processing, and dialog systems, and more particularly to a text data processing method, a model training method, an apparatus, an electronic device, a computer-readable storage medium, and a computer program product.
Background
Artificial intelligence is the discipline of making computers simulate certain human thought processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), covering both hardware-level and software-level technologies. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, knowledge graph technologies, and the like.
In a dialogue chat system, the conversation robot needs to generate reasonable, content-rich replies according to the user's utterances, and users are highly sensitive to the robot's reply style. For example, a plain reply such as "i am not a stubborn egg, i am a smart egg" and a vividly styled reply such as "i could never be a stubborn egg, i am a smart egg" give the user completely different experiences, and giving the robot an expression style close to that of a real person is one of the goals pursued by dialog systems. At present, many conversation robots offer TTS (Text To Speech) voice packets in different styles, and the source of the TTS is still text; therefore, ensuring stylistic consistency between text and speech, as well as stylistic consistency across all of the robot's historical replies, lets the user on the other side of the screen perceive the robot's persona and expressive characteristics during the conversation.
The approaches described in this section are not necessarily approaches that have been previously conceived or pursued. Unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, unless otherwise indicated, the problems mentioned in this section should not be considered as having been acknowledged in any prior art.
Disclosure of Invention
The present disclosure provides a text data processing method, a model training method, an apparatus, an electronic device, a computer-readable storage medium, and a computer program product.
According to an aspect of the present disclosure, there is provided a text data processing method including: generating, based on an input text of a user, an original text for replying to the input text; acquiring target style information, wherein the target style information comprises at least one of a target style label and a target style dictionary, the target style label indicating the target style into which the original text is to be converted, and the target style dictionary comprising at least one corpus text corresponding to the target style; and generating, based on the original text and the target style information, a target text corresponding to the target style.
According to another aspect of the present disclosure, there is provided a model training method for a model that converts an original text into a target-style text, the method including: acquiring a sample data set, wherein the sample data set comprises at least one sample text pair, and each of the at least one sample text pair comprises an original sample text and a target sample text corresponding to a target style; for each of the at least one sample text pair, acquiring a labeling sequence corresponding to the sample text pair, wherein the labeling sequence comprises at least one operation label corresponding to at least one character of the original sample text in the sample text pair, the at least one operation label comprises a retention label and a modification label, the retention label indicating a character of the original sample text that is to be retained relative to the target sample text in the sample text pair, and the modification label comprises an insertion label indicating a character that is to be inserted into the original sample text relative to the target sample text; determining, as corpus texts, the characters corresponding to insertion labels in the labeling sequence corresponding to each of the at least one sample text pair, so as to construct a target style dictionary corresponding to the target style; and for each sample text pair in the sample data set, performing the following operations: inputting the corpus texts in the target style dictionary and the original sample text and the target sample text in the sample text pair into a model to obtain a labeling sequence prediction result output by the model; and training the model based on the labeling sequence prediction result and the labeling sequence corresponding to the sample text pair.
According to another aspect of the present disclosure, there is provided a model training method including: acquiring a sample data set, wherein the sample data set comprises a plurality of target style labels and at least one sample text pair corresponding to each of the plurality of target style labels, and each sample text pair comprises an original sample text and a target sample text having the corresponding target style; and for each sample text pair in the sample data set, performing the following operations: inputting the original sample text and the target sample text in the sample text pair, together with the target style label corresponding to the sample text pair, into a model to obtain a target text prediction result output by the model; and training the model based on the target text prediction result and the target sample text in the sample text pair.
According to another aspect of the present disclosure, there is provided a text data processing apparatus including: a first generating unit configured to generate, based on an input text of a user, an original text for replying to the input text; a first obtaining unit, configured to obtain target style information, where the target style information includes at least one of a target style tag and a target style dictionary, the target style tag is used to indicate the target style into which the original text is to be converted, and the target style dictionary includes at least one corpus text corresponding to the target style; and a second generating unit configured to generate a target text corresponding to the target style based on the original text and the target style information.
According to another aspect of the present disclosure, there is provided a model training apparatus for converting an original text into a target-style text, comprising: the second obtaining unit is configured to obtain a sample data set, wherein the sample data set comprises at least one sample text pair, and each sample text pair in the at least one sample text pair comprises an original sample text and a target sample text corresponding to a target style; a third obtaining unit, configured to obtain, for each sample text pair in at least one sample text pair, a labeling sequence corresponding to the sample text pair, where the labeling sequence includes at least one operation tag corresponding to at least one character of an original sample text in the sample text pair, and the at least one operation tag includes a retention tag and a modification tag, the retention tag is used to indicate a character that needs to be retained compared with a target sample text in the sample text pair, and the modification tag includes an insertion tag used to indicate a character that needs to be inserted in the original sample text compared with the target sample text in the sample text pair; the determining unit is configured to determine characters corresponding to the inserted labels in the labeling sequences corresponding to each sample text pair in at least one sample text pair as corpus texts so as to construct a target style dictionary corresponding to a target style; and a first execution unit configured to, for each sample text pair in the sample data set, perform the following sub-unit operations, the first execution unit comprising: the first input subunit is configured to input the corpus text in the target style dictionary, the original sample text in the sample text pair and the target sample text into a model so as to obtain a labeling sequence prediction result output by the model; and a first training subunit configured to train a model based on the annotation sequence prediction result and an annotation sequence corresponding to the sample text pair.
According to another aspect of the present disclosure, there is provided a model training apparatus including: the fourth acquisition unit is configured to acquire a sample data set, wherein the sample data set comprises a plurality of target style labels and at least one sample text pair corresponding to each target style label in the plurality of target style labels, and each sample text pair comprises an original sample text and a target sample text with a corresponding target style; and a second execution unit configured to, for each sample text pair in the sample data set, perform the following sub-unit operations, the second execution unit comprising: the second input subunit is configured to input the original sample text, the target sample text and the target style label corresponding to the sample text pair in the model so as to obtain a target text prediction result output by the model; and a second training subunit configured to train a model based on the target text prediction result and the target sample text in the sample text pair.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the text data processing method or the model training method described above.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the above text data processing method or model training method.
According to another aspect of the present disclosure, a computer program product is provided, comprising a computer program, wherein the computer program, when executed by a processor, implements the above-described text data processing method or model training method.
According to one or more embodiments of the present disclosure, the complexity of model training can be reduced, and the model training efficiency can be improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the disclosure and, together with the description, serve to explain exemplary implementations of those embodiments. The illustrated embodiments are for purposes of example only and do not limit the scope of the claims. Throughout the drawings, identical reference numbers designate similar, but not necessarily identical, elements.
FIG. 1 illustrates a schematic diagram of an exemplary system in which various methods described herein may be implemented, according to an embodiment of the present disclosure;
FIG. 2 shows a flow diagram of a text data processing method according to an embodiment of the present disclosure;
FIG. 3 shows an architectural schematic of a dialog system according to an exemplary embodiment of the present disclosure;
FIG. 4 shows a flow diagram of a method of generating target text in accordance with an embodiment of the present disclosure;
FIG. 5 shows a schematic diagram of a first model according to an exemplary embodiment of the present disclosure;
FIG. 6 shows an architectural diagram of a second model in accordance with an exemplary embodiment of the present disclosure;
FIG. 7 shows a flow diagram of a model training method according to an embodiment of the present disclosure;
FIG. 8 shows a flow diagram of a model training method according to an embodiment of the present disclosure;
FIG. 9 shows a structural block diagram of a text data processing apparatus according to an embodiment of the present disclosure;
FIG. 10 shows a block diagram of a model training apparatus according to an embodiment of the present disclosure;
FIG. 11 shows a block diagram of a model training apparatus according to an embodiment of the present disclosure;
FIG. 12 illustrates a block diagram of an exemplary electronic device that can be used to implement embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, it will be recognized by those of ordinary skill in the art that various changes and modifications may be made to the embodiments described herein without departing from the scope of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the present disclosure, unless otherwise specified, the use of the terms "first", "second", and the like to describe various elements is not intended to limit the positional relationship, the temporal relationship, or the importance relationship of the elements, and such terms are used only to distinguish one element from another. In some examples, a first element and a second element may refer to the same instance of the element, while in some cases they may refer to different instances based on the context of the description.
The terminology used in the description of the various described examples in this disclosure is for the purpose of describing the particular examples only and is not intended to be limiting. Unless the context clearly indicates otherwise, if the number of elements is not specifically limited, the elements may be one or more. Furthermore, the term "and/or" as used in this disclosure is intended to encompass any and all possible combinations of the listed items.
In the related art, a large dialog model with a style conversion function is trained to directly generate the style-converted reply text; this requires a large amount of style corpora and computing resources, so the training cost and complexity are high and the training efficiency is low.
Embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.
Fig. 1 illustrates a schematic diagram of an exemplary system 100 in which various methods and apparatus described herein may be implemented in accordance with embodiments of the present disclosure. Referring to fig. 1, the system 100 includes one or more client devices 101, 102, 103, 104, 105, and 106, a server 120, and one or more communication networks 110 coupling the one or more client devices to the server 120. Client devices 101, 102, 103, 104, 105, and 106 may be configured to execute one or more applications.
In embodiments of the present disclosure, server 120 may run one or more services or software applications that enable the text data processing method and model training methods described above to be performed.
In some embodiments, the server 120 may also provide other services or software applications, which may include non-virtual environments and virtual environments. In certain embodiments, these services may be provided as web-based services or cloud services, for example, provided to users of client devices 101, 102, 103, 104, 105, and/or 106 under a software as a service (SaaS) model.
In the configuration shown in fig. 1, server 120 may include one or more components that implement the functions performed by server 120. These components may include software components, hardware components, or a combination thereof, which may be executed by one or more processors. A user operating a client device 101, 102, 103, 104, 105, and/or 106 may, in turn, utilize one or more client applications to interact with the server 120 to take advantage of the services provided by these components. It should be understood that a variety of different system configurations are possible, which may differ from system 100. Accordingly, fig. 1 is one example of a system for implementing the various methods described herein, and is not intended to be limiting.
The user may use client devices 101, 102, 103, 104, 105, and/or 106 to enter textual data. The client device may provide an interface that enables a user of the client device to interact with the client device. The client device may also output information to the user via the interface. Although fig. 1 depicts only six client devices, those skilled in the art will appreciate that any number of client devices may be supported by the present disclosure.
Client devices 101, 102, 103, 104, 105, and/or 106 may include various types of computer devices, such as portable handheld devices, general purpose computers (such as personal computers and laptop computers), workstation computers, wearable devices, smart screen devices, self-service terminal devices, service robots, gaming systems, thin clients, various messaging devices, sensors or other sensing devices, and so forth. These computer devices may run various types and versions of software applications and operating systems, such as MICROSOFT Windows, APPLE iOS, UNIX-like operating systems, Linux, or Linux-like operating systems (e.g., GOOGLE Chrome OS); or include various mobile operating systems such as MICROSOFT Windows Mobile OS, iOS, Windows Phone, and Android. Portable handheld devices may include cellular telephones, smart phones, tablet computers, Personal Digital Assistants (PDAs), and the like. Wearable devices may include head-mounted displays (such as smart glasses) and other devices. The gaming system may include a variety of handheld gaming devices, Internet-enabled gaming devices, and the like. The client device is capable of executing a variety of different applications, such as various Internet-related applications, communication applications (e.g., email applications), Short Message Service (SMS) applications, and may use a variety of communication protocols.
Network 110 may be any type of network known to those skilled in the art that may support data communications using any of a variety of available protocols, including but not limited to TCP/IP, SNA, IPX, etc. By way of example only, one or more networks 110 may be a Local Area Network (LAN), an Ethernet-based network, a token ring, a Wide Area Network (WAN), the Internet, a virtual network, a Virtual Private Network (VPN), an intranet, an extranet, a blockchain network, a Public Switched Telephone Network (PSTN), an infrared network, a wireless network (e.g., Bluetooth, WiFi), and/or any combination of these and/or other networks.
The server 120 may include one or more general purpose computers, special purpose server computers (e.g., PC (personal computer) servers, UNIX servers, mid-range servers), blade servers, mainframe computers, server clusters, or any other suitable arrangement and/or combination. The server 120 may include one or more virtual machines running a virtual operating system, or other computing architecture involving virtualization (e.g., one or more flexible pools of logical storage that may be virtualized to maintain virtual storage for the server). In various embodiments, the server 120 may run one or more services or software applications that provide the functionality described below.
The computing units in server 120 may run one or more operating systems including any of the operating systems described above, as well as any commercially available server operating systems. The server 120 may also run any of a variety of additional server applications and/or middle tier applications, including HTTP servers, FTP servers, CGI servers, JAVA servers, database servers, and the like.
In some implementations, the server 120 can include one or more applications to analyze and consolidate data feeds and/or event updates received from users of the client devices 101, 102, 103, 104, 105, and/or 106. Server 120 may also include one or more applications to display data feeds and/or real-time events via one or more display devices of client devices 101, 102, 103, 104, 105, and/or 106.
In some embodiments, the server 120 may be a server of a distributed system, or a server incorporating a blockchain. The server 120 may also be a cloud server, or a smart cloud computing server or smart cloud host employing artificial intelligence technology. A cloud server is a host product in a cloud computing service system that overcomes the defects of high management difficulty and weak service expansibility found in traditional physical hosts and Virtual Private Server (VPS) services.
The system 100 may also include one or more databases 130. In some embodiments, these databases may be used to store data and other information. For example, one or more of the databases 130 may be used to store information such as audio files and video files. The database 130 may reside in various locations. For example, the database used by the server 120 may be local to the server 120, or may be remote from the server 120 and may communicate with the server 120 via a network-based or dedicated connection. The database 130 may be of different types. In certain embodiments, the database used by the server 120 may be, for example, a relational database. One or more of these databases may store, update, and retrieve data in response to commands.
In some embodiments, one or more of the databases 130 may also be used by applications to store application data. The databases used by the application may be different types of databases, such as key-value stores, object stores, or regular stores supported by a file system.
The system 100 of fig. 1 may be configured and operated in various ways to enable application of the various methods and apparatus described in accordance with the present disclosure.
According to an embodiment of the present disclosure, as shown in fig. 2, there is provided a text data processing method including: step S201, generating, based on an input text of a user, an original text for replying to the input text; step S202, obtaining target style information, wherein the target style information comprises at least one of a target style label and a target style dictionary, the target style label indicating the target style into which the original text is to be converted, and the target style dictionary comprising at least one corpus text corresponding to the target style; and step S203, generating, based on the original text and the target style information, a target text corresponding to the target style.
In this way, the reply text is first generated by the large dialogue model based on the user's input text, and is then fed into the text style conversion model to obtain the style-converted reply text. Retraining the large dialogue model can thus be avoided, and the style conversion model can be trained with only a small number of style conversion samples, which reduces the complexity of model training and improves model training efficiency.
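By way of illustration, the following minimal Python sketch shows the shape of this two-stage pipeline. The function names, signatures, and placeholder outputs are assumptions introduced for illustration only; they are not the API of the disclosed models.

```python
from typing import Optional

# Hypothetical stand-ins for the large dialogue model and the style conversion
# model; their names, signatures, and outputs are illustrative assumptions.
def generate_reply(input_text: str, history: Optional[list] = None) -> str:
    """Large dialogue model: produce an original reply for the user's input."""
    return "good i go soon"  # placeholder output

def convert_style(original_text: str, style_tag: str,
                  style_dictionary: dict) -> str:
    """Style conversion model: rewrite the original reply in the target style."""
    return original_text + " ya"  # placeholder output

def reply_with_style(input_text: str, style_tag: str,
                     style_dictionary: dict) -> str:
    # Step S201: generate the original reply text.
    original = generate_reply(input_text)
    # Step S202 is implicit: the target style tag and dictionary are supplied.
    # Step S203: convert the original reply into the target style.
    return convert_style(original, style_tag, style_dictionary)
```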
In some embodiments, the input text may be obtained by receiving a text input or a voice input from the user. After the target text corresponding to the target style is obtained, the target text can be displayed on the display device in text form, or TTS speech can be generated based on the target text and fed back to the user as a voice broadcast. The style of the TTS voice packet can correspond to the target style.
Fig. 3 shows an architectural schematic of a dialog system according to an exemplary embodiment of the present disclosure.
In some exemplary embodiments, as shown in fig. 3, the user's input text may be obtained by the conversation robot 310 and sent to the conversation model 320; the conversation model 320 may generate an original reply text for replying to the input text based on the user's input and the historical context information, and input the original reply text into the style conversion model 330, which converts it into a target reply text in the target style and returns it to the conversation robot 310 to be delivered to the user as the reply.
In some embodiments, the dialog model may be a Transformer-based dialog model.
In some exemplary embodiments, the dialogue model can be a PLATO large dialogue model, which applies a Unified-Transformer structure in its network architecture and can jointly model dialogue understanding and reply generation; by introducing multi-role-aware input representations, consistency across multiple rounds of conversation is improved.
In some embodiments, the style conversion model may be an edit-based sequence annotation model.
In some embodiments, as shown in fig. 4, generating the target text corresponding to the target style based on the original text and the target style information may include: step S401, determining, based on the target style label, a first model corresponding to the target style, wherein the first model is trained based on at least one first sample text pair and the target style dictionary, and each of the at least one first sample text pair comprises a first original text and a first target text corresponding to the target style; step S402, acquiring a character sequence of the original text, wherein the character sequence comprises at least one character of the original text; step S403, performing, based on the target style dictionary, sequence labeling on the character sequence by using the first model to obtain a labeling sequence, wherein the labeling sequence includes at least one operation tag corresponding respectively to the at least one character, the at least one operation tag includes a retention tag and an insertion tag, the retention tag indicating that its corresponding character is retained, and the insertion tag corresponding to one of the at least one corpus text and indicating that the corresponding corpus text is inserted into the character sequence; and step S404, generating the target text based on the labeling sequence.
In this way, the model corresponding to the target style performs sequence labeling on the input text based on the corpus dictionary corresponding to that style, so as to obtain an operation tag for each character of the input text, and the output text is determined based on these operation tags. Parts of the text can thus be retained, or new text added, by the model, realizing relatively simple style conversion with fewer computing resources and higher computing efficiency.
In some embodiments, the target style label may be used to determine the first model and the target style dictionary corresponding to the target style to be converted to, and style conversion to that target style may then be performed based on the corresponding first model and dictionary.
In some embodiments, the target style dictionary may include one or more corpus texts corresponding to the target style, and a corpus text may be selected by the first model to be inserted at the corresponding position in the original text, so as to implement style conversion of the original text. For example, if the original text is "good" and the dictionary contains the corpus text "ya", then the target text "haya" with a lively style can be obtained by inserting the corpus text "ya" into the original text.
Fig. 5 shows a schematic diagram of a first model according to an exemplary embodiment of the present disclosure.
In some exemplary embodiments, as shown in fig. 5, for the original text "good i go soon", its corresponding character sequence may be obtained first, where the character sequence includes all the characters of the original text and may include a sequence start token [CLS] and a sequence end token [SEP]; the character sequence may then be: [CLS], "good", "of", "i", "horse", "up", "down", [SEP] (each quoted item being a literal gloss of one Chinese character of the original text).
After the character sequence is embedded, it can be input into the first model 500. In some embodiments, at least one corpus text in the target style dictionary may simultaneously be combined into a corpus sequence and also input into the first model 500, and the first model 500 may predict the corresponding annotation sequence (for example, "keep, keep, …, keep-X"). Here, "keep" is a retention tag used to indicate that the character corresponding to the tag is to be retained at the corresponding position in the target text. "keep-X" is an insertion tag used to indicate that the text "X" is inserted in front of the character corresponding to the tag while that character is retained, where "X" may be a corpus text selected by the first model 500 from the target style dictionary. It is understood that the form of the sequence tags can be set by those skilled in the art as needed, and is not limited herein.
Based on the labeling sequence, the corresponding target text "good, i will go right away" can be generated, thereby realizing text style conversion by inserting the selected characters into the original text.
In some embodiments, the number of the at least one character may be plural, and the at least one operation tag may include a retention tag and at least one of an insertion tag and a deletion tag, the deletion tag indicating that its corresponding character is deleted from the character sequence.
This further enriches the operations available on the text and improves the flexibility of text style conversion.
In some exemplary embodiments, for the original text "i do not go to work, i play at home", its corresponding character sequence may first be obtained: [CLS], "I", "No", "on", "off", ",", "i", "at", "home", "in", "play", [SEP].
After the character sequence is embedded, it can be input into the first model 500. In some embodiments, at least one corpus text in the target style dictionary may simultaneously be combined into a corpus sequence and also input into the first model 500, so that the first model 500 can predict the corresponding annotation sequence (for example, "keep, …, delete, …, keep-X"). Here, "delete" is a deletion tag used to indicate that the character corresponding to the tag is to be deleted at the corresponding position in the target text. The character corresponding to the deletion tag may be a corpus text selected by the first model 500 from the target style dictionary. It is understood that the form of the sequence tags can be set by those skilled in the art as needed, and is not limited herein.
Based on the labeling sequence, the corresponding target text "i do not go to work and i play at home" can be generated, thereby realizing text style conversion by inserting and/or deleting selected characters in the original text.
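The decoding step of these two examples can be sketched as follows. The tag strings ("keep", "delete", "keep-X") follow the examples above; the exact tag format is an assumption for illustration, not a prescribed encoding.

```python
# A minimal sketch of decoding a predicted annotation sequence into target text.
def apply_tags(chars: list[str], tags: list[str]) -> str:
    out = []
    for ch, tag in zip(chars, tags):
        if tag == "delete":
            continue  # drop this character from the target text
        if tag.startswith("keep-"):
            out.append(tag[len("keep-"):])  # insert the selected corpus text first
        out.append(ch)  # retain the original character ("keep" and "keep-X")
    return "".join(out)

# Example: keep "a" and "b", delete "c", insert "ya" before the retained "d".
print(apply_tags(list("abcd"), ["keep", "keep", "delete", "keep-ya"]))
# -> "abyad"
```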
In some embodiments, the target style dictionary may further include reference information, and the reference information may include at least one of a first operation tag corresponding to each corpus text of the at least one corpus text and a usage probability, where the first operation tag is used to indicate that an operation corresponding to the corresponding corpus text is an insertion or a deletion, and the usage probability is determined based on a frequency of occurrence of the corresponding corpus text when the target style dictionary is constructed.
In some embodiments, sequence tagging the sequence of characters with the first model based on the target style dictionary to obtain a tagged sequence may include: and carrying out sequence labeling on the character sequence by utilizing the first model based on at least one corpus text and reference information in the target style dictionary to obtain a labeled sequence.
Therefore, multidimensional reference information in the corpus dictionary can be introduced into model prediction, and the effects of model prediction and text style conversion are further improved.
When a target style dictionary of a given style is constructed, dynamic programming is applied to sample text pairs prepared in advance to obtain the longest common subsequence of the original text sequence and the target text sequence; removing the longest common subsequence from the target text sequence (or from the original text sequence) yields the text that needs to be inserted (or deleted), which is labeled with the corresponding modification tag (insertion tag or deletion tag) and added, together with that tag, into the target style dictionary as one of the corpus texts.
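The following sketch illustrates this dynamic-programming step: it derives keep/delete tags for the original sequence and collects the inserted texts that would populate the target style dictionary. The tag names are assumptions carried over from the examples above.

```python
def derive_labels(source: list[str], target: list[str]):
    n, m = len(source), len(target)
    # Standard longest-common-subsequence (LCS) table.
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if source[i] == target[j]
                                else max(dp[i][j + 1], dp[i + 1][j]))
    tags, inserts = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and source[i - 1] == target[j - 1]:
            tags.append("keep")            # character lies on the LCS: retain it
            i, j = i - 1, j - 1
        elif j > 0 and (i == 0 or dp[i][j - 1] >= dp[i - 1][j]):
            inserts.append(target[j - 1])  # only in target: must be inserted
            j -= 1
        else:
            tags.append("delete")          # only in source: must be deleted
            i -= 1
    tags.reverse()
    inserts.reverse()
    return tags, inserts

tags, inserts = derive_labels(list("abc"), list("aXbc"))
# tags == ["keep", "keep", "keep"], inserts == ["X"]
```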
In some embodiments, word frequency statistics may further be performed on the corpus texts in the target style dictionary obtained by the above method, so as to obtain the occurrence probability of each corpus text; this information is also recorded in the dictionary as part of the reference information.
In some embodiments, when the labeling sequence is generated, at least one of the modification tag and the occurrence probability in the reference information may be input into the first model at the same time, so that the multidimensional reference information in the corpus dictionary is introduced into model prediction, further improving the effects of model prediction and text style conversion.
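One possible layout of a dictionary entry carrying this reference information is sketched below. The field names and the (text, operation) sample format are assumptions for illustration; the disclosure does not prescribe a storage format.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class CorpusEntry:
    text: str                 # the corpus text itself
    operation: str            # first operation tag: "insert" or "delete"
    usage_probability: float  # relative frequency seen during construction

def build_reference_info(modified_texts: list[tuple[str, str]]) -> list[CorpusEntry]:
    """modified_texts: (text, operation) pairs collected while building the dictionary."""
    counts = Counter(modified_texts)
    total = sum(counts.values())
    return [CorpusEntry(text, op, c / total)
            for (text, op), c in counts.most_common()]

# e.g. build_reference_info([("ya", "insert"), ("ya", "insert"), ("wool", "delete")])
```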
In some embodiments, the first model may be a Transformer-based language model or a sequence annotation model. It is understood that the model applied can be determined by the person skilled in the art, without limitation.
In some embodiments, the style conversion model may be an end-to-end generative model.
In some embodiments, generating the target text corresponding to the target style based on the original text and the target style information may include: and inputting the target style label and the original text into a second model to obtain a target text output by the second model, wherein the second model is obtained by training based on the plurality of style labels and at least one second sample text pair corresponding to each style label in the plurality of style labels, and each second sample text pair in the at least one second sample text pair comprises the second original text and a second target text corresponding to the corresponding style label.
Fig. 6 shows an architectural diagram of a second model according to an exemplary embodiment of the present disclosure.
In some embodiments, the second model 600 may be a generative model capable of multiple style transformations. To specify the target style to be converted to, the target style tag may be spliced with the original text sequence, and the spliced sequence is embedded and then input into the second model 600 to obtain the target text sequence of the corresponding target style. For example, if the original text is "thank you for a quart prize" and the target style label is "ancient style", the target text generated by the second model 600 may be "thank you for a good taste".
In this way, by introducing style labels as guide information in the model prediction process, conversion among multiple styles can be realized with a single model, while the fluency of the generated text is further optimized and more complex style conversion is realized.
In some embodiments, the second model may be a Transformer-based language model. In some embodiments, the second model may be obtained based on UniLM model training. It is understood that the model applied can be determined by those skilled in the art, without limitation.
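A minimal sketch of the splicing described above is given below: the target style tag is prepended to the original text sequence before embedding. The [CLS]/[SEP] separator convention mirrors the sequences shown earlier; the exact input format of the second model is an assumption.

```python
def build_generation_input(style_tag: str, original_text: str) -> str:
    # Splice the target style tag with the original text sequence.
    return f"[CLS] {style_tag} [SEP] {original_text} [SEP]"

# e.g. build_generation_input("ancient style", "thank you for a quart prize")
# -> "[CLS] ancient style [SEP] thank you for a quart prize [SEP]"
```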
In some embodiments, as shown in fig. 7, there is provided a model training method for a model that converts an original text into a target-style text, comprising: step S701, acquiring a sample data set, wherein the sample data set comprises at least one sample text pair, and each of the at least one sample text pair comprises an original sample text and a target sample text corresponding to a target style; step S702, for each of the at least one sample text pair, obtaining a labeling sequence corresponding to the sample text pair, wherein the labeling sequence includes at least one operation tag corresponding to at least one character of the original sample text in the sample text pair, the at least one operation tag includes a retention tag and a modification tag, the retention tag indicating a character of the original sample text that is to be retained relative to the target sample text in the sample text pair, and the modification tag includes an insertion tag indicating a character that is to be inserted into the original sample text relative to the target sample text; step S703, determining, as corpus texts, the characters corresponding to insertion tags in the labeling sequence corresponding to each of the at least one sample text pair, so as to construct a target style dictionary corresponding to the target style; and for each sample text pair in the sample data set, performing the following operations: step S704, inputting the corpus texts in the target style dictionary and the original sample text and the target sample text in the sample text pair into a model to obtain a labeling sequence prediction result output by the model; and step S705, training the model based on the labeling sequence prediction result and the labeling sequence corresponding to the sample text pair.
In this way, the trained model performs sequence labeling on the input text based on the corpus dictionary corresponding to the target style, so as to obtain an operation tag for each character of the input text, and the output text is determined based on these operation tags. Parts of the text can thus be retained, or new text added, by the model, realizing relatively simple style conversion with fewer computing resources and higher computing efficiency.
In some embodiments, for the training of the first model, the target style into which the first model is to convert text may first be determined, and corresponding sample text pairs may then be obtained based on that target style.
In some embodiments, dynamic programming may be applied to each sample text pair to obtain the longest common subsequence of the original text sequence and the target text sequence; the text to be retained is given by the longest common subsequence, while the text to be inserted (or deleted) is obtained by removing the longest common subsequence from the target (or original) text sequence, which yields the labeling sequence corresponding to the sample text pair. Meanwhile, the text labeled with a modification tag (insertion tag or deletion tag) and its corresponding tag can be added into the target style dictionary so that the text serves as one of the corpus texts; the target style dictionary corresponding to the target style is thus constructed from the sample data.
In some embodiments, the number of the at least one character may be plural, and the modification tag may include at least one of an insertion tag and a deletion tag, the deletion tag indicating a character that needs to be deleted from the original sample text compared to the target sample text. This further enriches the operations available on the text and improves the flexibility of text style conversion.
In some embodiments, the target style dictionary may further include a corresponding modification tag for each corpus text. Therefore, multi-dimensional reference information (such as a corresponding modification label of each corpus text) in the corpus dictionary is introduced into model training, and the effects of model prediction and text style conversion are further improved.
In some embodiments, constructing the target style dictionary corresponding to the target style may further comprise: counting the occurrence frequency of the characters corresponding to insertion tags in the labeling sequence corresponding to each of the at least one sample text pair, so as to obtain at least one first character ordered by occurrence frequency; constructing the target style dictionary based on a preset number of first characters with the highest occurrence frequency, and deleting the remaining first characters; deleting the sample text pairs corresponding to the deleted first characters, so as to update the sample data set; and training the model based on the updated sample data set.
In this way, when the corpus dictionary is constructed, the corpora are sorted by word frequency so that corpora with low occurrence frequency are deleted, further saving computing resources; correspondingly, the sample pairs corresponding to the deleted corpora are removed, so that samples the dictionary can no longer cover are discarded and the effect of model training is not impaired.
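The pruning step can be sketched as follows: keep only a preset number of the most frequent inserted texts, then drop sample pairs that the pruned dictionary can no longer cover. The (original, target, inserts) sample layout is an assumption for illustration.

```python
from collections import Counter

def prune_dictionary_and_samples(samples, top_k):
    # samples: list of (original_text, target_text, inserted_texts) triples.
    counts = Counter(text for _, _, inserts in samples for text in inserts)
    dictionary = {text for text, _ in counts.most_common(top_k)}
    kept = [s for s in samples
            if all(text in dictionary for text in s[2])]  # dictionary covers sample
    return dictionary, kept
```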
After pruning, for each sample text pair in the updated sample data set, the corresponding original text sequence and target text sequence may be embedded and input into the model (i.e., the first model). In some embodiments, at least one corpus text in the target style dictionary may simultaneously be combined into a corpus sequence and also input into the model, and the model predicts the labeling sequence prediction result; a loss function is then calculated based on the labeling sequence corresponding to the sample text pair and the prediction result, so as to train the model based on the loss function.
In some embodiments, the above loss function may be a cross-entropy loss function. It is understood that the loss function can be determined by those skilled in the art based on actual needs, and is not limited herein.
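As one non-limiting sketch, the per-character tagging loss could look as follows, assuming the first model emits one logit vector per input character over the tag vocabulary (keep, delete, keep-X, ...). The tensor shapes and padding convention are assumptions for illustration.

```python
import torch
import torch.nn as nn

def tagging_loss(tag_logits: torch.Tensor, gold_tags: torch.Tensor) -> torch.Tensor:
    # tag_logits: (batch, seq_len, num_tags); gold_tags: (batch, seq_len)
    criterion = nn.CrossEntropyLoss(ignore_index=-100)  # -100 marks padded positions
    return criterion(tag_logits.reshape(-1, tag_logits.size(-1)),
                     gold_tags.reshape(-1))
```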
In some embodiments, as shown in fig. 8, there is provided a model training method comprising: step S801, acquiring a sample data set, wherein the sample data set comprises a plurality of target style labels and at least one sample text pair corresponding to each of the plurality of target style labels, and each sample text pair comprises an original sample text and a target sample text having the corresponding target style; and for each sample text pair in the sample data set, performing the following operations: step S802, inputting the original sample text and the target sample text in the sample text pair, together with the target style label corresponding to the sample text pair, into a model to obtain a target text prediction result output by the model; and step S803, training the model based on the target text prediction result and the target sample text in the sample text pair.
In some embodiments, sample text pairs corresponding to a plurality of target styles may first be collected separately, where each sample text pair has its own target style label. In the training process of the model (i.e., the second model), for each sample, the corresponding original text sequence, target text sequence, and target style label may be spliced into one sequence, which is embedded and then input into the model to obtain the text sequence prediction result of the corresponding target style output by the model; a loss function may then be calculated based on the target text sequence and the text sequence prediction result, so as to train the model based on the loss function.
In some embodiments, the above loss function may likewise be a cross-entropy loss function. It is understood that the loss function can be determined by those skilled in the art based on actual needs, and is not limited herein.
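For the generative second model, a teacher-forced loss is the usual choice: each target token is predicted from the spliced style-label/original-text prefix. The sketch below shifts logits and labels by one position in the standard language-model fashion; the shapes are assumptions for illustration.

```python
import torch
import torch.nn as nn

def generation_loss(logits: torch.Tensor, target_ids: torch.Tensor) -> torch.Tensor:
    # logits: (batch, tgt_len, vocab_size); target_ids: (batch, tgt_len)
    shifted_logits = logits[:, :-1, :].reshape(-1, logits.size(-1))
    shifted_labels = target_ids[:, 1:].reshape(-1)
    criterion = nn.CrossEntropyLoss(ignore_index=-100)  # -100 marks padding
    return criterion(shifted_logits, shifted_labels)
```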
In this way, by introducing style labels as guide information in the model prediction process, corpora of multiple styles are jointly trained, which improves training efficiency; at the same time, the model can convert among multiple styles, the fluency of the generated text is further optimized, and more complex style conversion is realized.
In some embodiments, as shown in fig. 9, there is provided a text data processing apparatus 900 including: a first generating unit 910 configured to generate an original text for replying to an input text based on the input text of a user; a first obtaining unit 920, configured to obtain target style information, where the target style information includes at least one of a target style tag and a target style dictionary, the target style tag is used to indicate a target style to be converted from an original text, and the target style dictionary includes at least one corpus text corresponding to a target style; and a second generating unit 930 configured to generate a target text corresponding to the target style based on the original text and the target style information.
The operations of the units 910 to 930 in the apparatus 900 are similar to the operations in the steps S201 to S203 of the text processing method, and are not described herein again.
In some embodiments, the second generating unit may include: a first determining subunit configured to determine, based on the target style label, a first model corresponding to a target style, the first model being obtained based on at least one first sample text pair and a target style dictionary training, each of the at least one first sample text pair including a first original text and a first target text corresponding to the target style; a first obtaining subunit configured to obtain a character sequence of an original text, the character sequence including at least one character of the original text; a second obtaining subunit, configured to perform sequence labeling on the character sequence by using the first model based on the target style dictionary to obtain a labeled sequence, where the labeled sequence includes at least one operation tag corresponding to at least one character, respectively, and the at least one operation tag includes a reserved tag and an inserted tag, the reserved tag is used for indicating a character corresponding to the reserved tag, and the inserted tag corresponds to one of the at least one corpus text and is used for indicating that the corresponding corpus text is inserted into the character sequence; and a first generation subunit configured to generate the target text based on the annotation sequence.
In some embodiments, the number of the at least one character may be plural, and the at least one operation tag may include a retention tag, and at least one of an insertion tag and a deletion tag, the deletion tag indicating that a corresponding character of the deletion tag is deleted from the character sequence.
In some embodiments, the target style dictionary may further include reference information, the reference information may include at least one of a first operation tag and a usage probability corresponding to each corpus text of the at least one corpus text, the first operation tag is used to indicate that an operation corresponding to the corresponding corpus text is an insertion or a deletion, the usage probability is determined based on an occurrence frequency of the corresponding corpus text when the target style dictionary is constructed, and the second obtaining subunit may be further configured to: and performing sequence labeling on the character sequence by utilizing the first model based on at least one corpus text and reference information in the target style dictionary to obtain a labeled sequence.
In some embodiments, the second generating unit may include: and a third obtaining subunit, configured to input the target style label and the original text into a second model to obtain a target text output by the second model, wherein the second model is obtained by training based on the plurality of style labels and at least one second sample text pair corresponding to each style label in the plurality of style labels, and each second sample text pair in the at least one second sample text pair comprises a second original text and a second target text corresponding to the corresponding style label.
In some embodiments, as shown in fig. 10, there is provided a model training apparatus 1000, wherein a model is used for converting an original text into a target style text, the apparatus 1000 comprising: a second obtaining unit 1010 configured to obtain a sample data set, where the sample data set includes at least one sample text pair, and each sample text pair in the at least one sample text pair includes an original sample text and a target sample text corresponding to a target style; a third obtaining unit 1020 configured to obtain, for each sample text pair in at least one sample text pair, a labeling sequence corresponding to the sample text pair, where the labeling sequence includes at least one operation tag corresponding to at least one character of an original sample text in the sample text pair, and the at least one operation tag includes a retention tag and a modification tag, where the retention tag is used to indicate a character that needs to be retained compared to a target sample text in the sample text pair, and the modification tag includes an insertion tag used to indicate a character that needs to be inserted in the original sample text compared to the target sample text; a determining unit 1030 configured to determine, as a corpus text, a character corresponding to an insertion tag in a labeling sequence corresponding to each of at least one sample text pair, so as to construct a target style dictionary corresponding to a target style; and a first execution unit 1040 configured to, for each sample text pair in the sample data set, perform the following sub-unit operations, the first execution unit 1040 including: a first input subunit 1041, configured to input the corpus text in the target style dictionary, the original sample text in the sample text pair, and the target sample text into a model to obtain a prediction result of the labeling sequence output by the model; and a first training subunit 1042 configured to train a model based on the annotation sequence prediction result and the annotation sequence corresponding to the sample text pair.
The operations of the units 1010-1040, the subunit 1041, and the subunit 1042 in the apparatus 1000 are similar to the operations in steps S701-S705 of the model training method, and are not repeated herein.
In some embodiments, as shown in FIG. 11, there is provided a model training apparatus 1100, comprising: a fourth obtaining unit 1110, configured to obtain a sample data set, where the sample data set includes a plurality of target style labels and at least one sample text pair corresponding to each of the plurality of target style labels, and each sample text pair includes an original sample text and a target sample text having a corresponding target style; and a second performing unit 1120 configured to perform, for each sample text pair in the sample data set, the following sub-units of operations, the second performing unit 1120 comprising: a second input subunit 1121, configured to input the original sample text, the target sample text, and the target style label corresponding to the sample text pair in the sample text pair into a model to obtain a target text prediction result output by the model; and a second training subunit 1122 configured to train a model based on the target text prediction and the target sample text in the sample text pair.
The operations of the units 1110, 1120, 1121, and 1122 in the apparatus 1100 are similar to the operations in steps S801-S803 of the model training method, and are not repeated herein.
According to an embodiment of the present disclosure, there is also provided an electronic device, a readable storage medium, and a computer program product.
Referring to FIG. 12, a structural block diagram of an electronic device 1200, which may be a server or a client of the present disclosure and is an example of a hardware device that may be applied to aspects of the present disclosure, will now be described. The electronic device is intended to represent various forms of digital electronic computer devices, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 12, the electronic device 1200 includes a computing unit 1201, which can perform various appropriate actions and processes in accordance with a computer program stored in a Read-Only Memory (ROM) 1202 or loaded from a storage unit 1208 into a Random Access Memory (RAM) 1203. The RAM 1203 may also store various programs and data necessary for the operation of the electronic device 1200. The computing unit 1201, the ROM 1202, and the RAM 1203 are connected to one another by a bus 1204. An input/output (I/O) interface 1205 is also connected to the bus 1204.
Various components in the electronic device 1200 are connected to the I/O interface 1205, including an input unit 1206, an output unit 1207, the storage unit 1208, and a communication unit 1209. The input unit 1206 may be any type of device capable of inputting information to the electronic device 1200; it may receive input numeric or character information and generate key signal inputs related to user settings and/or function control of the electronic device, and may include, but is not limited to, a mouse, a keyboard, a touch screen, a track pad, a track ball, a joystick, a microphone, and/or a remote controller. The output unit 1207 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, a video/audio output terminal, a vibrator, and/or a printer. The storage unit 1208 may include, but is not limited to, magnetic disks and optical discs. The communication unit 1209 allows the electronic device 1200 to exchange information/data with other devices over a computer network, such as the Internet, and/or various telecommunications networks, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication transceivers, and/or chipsets, such as Bluetooth devices, 802.11 devices, Wi-Fi devices, WiMax devices, cellular communication devices, and/or the like.
The computing unit 1201 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 1201 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 1201 performs the methods and processes described above, such as the text data processing method or the model training method. For example, in some embodiments, the text data processing method or the model training method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 1208. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 1200 via the ROM 1202 and/or the communication unit 1209. When the computer program is loaded into the RAM 1203 and executed by the computing unit 1201, one or more steps of the text data processing method or the model training method described above may be performed. Alternatively, in other embodiments, the computing unit 1201 may be configured in any other suitable manner (e.g., by means of firmware) to perform the text data processing method or the model training method.
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field-Programmable Gate Arrays (FPGAs), Application-Specific Integrated Circuits (ASICs), Application-Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be performed. The program code may execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine, or entirely on a remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
It should be understood that the various forms of flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be performed in parallel, sequentially, or in a different order, and no limitation is imposed herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
Although embodiments or examples of the present disclosure have been described with reference to the accompanying drawings, it is to be understood that the above-described methods, systems, and apparatuses are merely exemplary embodiments or examples, and that the scope of the present invention is not limited by these embodiments or examples but only by the granted claims and their equivalents. Various elements in the embodiments or examples may be omitted or replaced with equivalents thereof. Further, the steps may be performed in an order different from that described in the present disclosure, and the various elements in the embodiments or examples may be combined in various ways. Importantly, as technology evolves, many of the elements described herein may be replaced with equivalent elements that appear after the present disclosure.

Claims (20)

1. A text data processing method, comprising:
generating original text for replying to the input text based on the input text of the user;
acquiring target style information, wherein the target style information comprises at least one of a target style label and a target style dictionary, the target style label is used for indicating a target style into which the original text is to be converted, and the target style dictionary comprises at least one corpus text corresponding to the target style; and
generating a target text corresponding to the target style based on the original text and the target style information.
2. The method of claim 1, wherein the generating target text corresponding to the target style based on the original text and the target style information comprises:
determining a first model corresponding to the target style based on the target style label, the first model being obtained by training based on at least one first sample text pair and the target style dictionary, each of the at least one first sample text pair comprising a first original text and a first target text corresponding to the target style;
acquiring a character sequence of the original text, wherein the character sequence comprises at least one character of the original text;
based on the target style dictionary, performing sequence labeling on the character sequence by using the first model to obtain a labeled sequence, wherein the labeled sequence comprises at least one operation tag corresponding to the at least one character respectively, the at least one operation tag comprises a reserved tag and an inserted tag, the reserved tag is used for indicating that the character corresponding to the reserved tag is reserved, and the inserted tag corresponds to one of the at least one corpus text and is used for indicating that the corresponding corpus text is inserted into the character sequence; and
generating the target text based on the labeling sequence.
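Applying a labeling sequence to recover the target text, as in the final step of claim 2, can be sketched as follows, reusing the hypothetical KEEP/DELETE/"INSERT:" encoding from the dictionary-construction sketch above; the patent's actual tag vocabulary may differ.

```python
def realize(pairs):
    """Apply operation tags to produce the target text (illustrative)."""
    out = []
    for char, tag in pairs:
        parts = tag.split("|")
        if parts[0] == KEEP:
            out.append(char)  # retention tag: copy the character through
        # a DELETE head tag simply drops the character
        for part in parts[1:]:
            out.append(part[len("INSERT:"):])  # insert the corpus text
    return "".join(out)

# e.g. realize(annotate("好的", "好的哦")) == "好的哦"
```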
3. The method of claim 2, wherein the at least one character is plural in number, the at least one operation tag includes a retention tag and at least one of an insertion tag and a deletion tag, and the deletion tag is used to indicate that the character corresponding to the deletion tag is deleted from the character sequence.
4. The method according to claim 2 or 3, wherein the target style dictionary further includes reference information, the reference information includes at least one of a first operation tag and a usage probability corresponding to each corpus text of the at least one corpus text, the first operation tag is used to indicate whether the operation corresponding to the respective corpus text is insertion or deletion, the usage probability is determined based on the occurrence frequency of the respective corpus text when the target style dictionary is constructed, and the performing sequence labeling on the character sequence by using the first model based on the target style dictionary to obtain a labeled sequence includes:
performing sequence labeling on the character sequence by using the first model based on the at least one corpus text and the reference information in the target style dictionary to obtain the labeled sequence.
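One way the usage probability in this claim could be obtained from the dictionary-construction frequencies is simple renormalization; the normalization itself is an assumption, since the claim only ties the probability to occurrence frequency during dictionary construction.

```python
def usage_probabilities(counts):
    """Normalize occurrence frequencies into usage probabilities (sketch)."""
    total = sum(counts.values())
    return {text: freq / total for text, freq in counts.items()}
```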
5. The method of claim 1, wherein the generating target text corresponding to the target style based on the original text and the target style information comprises:
inputting the target style label and the original text into a second model to obtain the target text output by the second model, wherein the second model is obtained by training based on a plurality of style labels and at least one second sample text pair corresponding to each style label in the plurality of style labels, and each second sample text pair in the at least one second sample text pair comprises a second original text and a second target text corresponding to the corresponding style label.
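At inference time (claim 5), the same hypothetical prefix scheme reduces to a single generate call, again assuming a Hugging-Face-style interface; the wrapper name and parameters are illustrative.

```python
def stylize(model, tokenizer, original, style_label, max_new_tokens=64):
    """Generate the target-style text with the second model (illustrative)."""
    batch = tokenizer(f"<{style_label}> {original}", return_tensors="pt")
    ids = model.generate(**batch, max_new_tokens=max_new_tokens)
    return tokenizer.decode(ids[0], skip_special_tokens=True)
```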
6. A method of model training, the model for converting original text to target-style text, the method comprising:
acquiring a sample data set, wherein the sample data set comprises at least one sample text pair, and each sample text pair in the at least one sample text pair comprises an original sample text and a target sample text corresponding to the target style;
for each sample text pair in the at least one sample text pair, obtaining a labeling sequence corresponding to the sample text pair, where the labeling sequence includes at least one operation tag corresponding to at least one character of an original sample text in the sample text pair, and the at least one operation tag includes a retention tag and a modification tag, where the retention tag is used to indicate a character that needs to be retained in comparison with a target sample text in the sample text pair, and the modification tag includes an insertion tag used to indicate a character that needs to be inserted in the original sample text in comparison with the target sample text;
determining, as corpus texts, the characters corresponding to the insertion tags in the labeling sequences corresponding to the at least one sample text pair, so as to construct a target style dictionary corresponding to the target style; and
for each sample text pair in the sample data set, performing the following:
inputting the corpus text in the target style dictionary, the original sample text in the sample text pair, and the target sample text into the model to obtain a labeling sequence prediction result output by the model; and
training the model based on the labeling sequence prediction result and the labeling sequence corresponding to the sample text pair.
7. The method of claim 6, wherein the at least one character is plural in number, the modification tag includes at least one of an insertion tag and a deletion tag, and the deletion tag is used to indicate a character that needs to be deleted from the original sample text compared with the target sample text.
8. The method according to claim 6 or 7, wherein the target style dictionary further comprises a corresponding modification tag for each corpus text.
9. The method of any of claims 6-8, wherein the constructing a target style dictionary corresponding to the target style further comprises:
counting occurrence frequencies of the characters corresponding to the insertion tags in the labeling sequence corresponding to each sample text pair in the at least one sample text pair, to obtain at least one first character ordered by occurrence frequency;
constructing the target style dictionary based on a preset number of the first characters with the highest occurrence frequencies, and deleting the remaining first characters;
deleting the sample text pairs corresponding to the deleted first characters, so as to update the sample data set; and
training the model based on the updated sample data set.
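A sketch of the frequency pruning in claim 9 follows, reusing the hypothetical annotate helper and frequency counts from the earlier dictionary-construction sketch; the cutoff k and all names are assumptions, as the claim only fixes the preset-number criterion.

```python
def prune_dictionary(counts, sample_pairs, k=1000):
    """Keep the k most frequent corpus texts; drop pairs needing others."""
    kept = {text for text, _ in counts.most_common(k)}

    def needs_only_kept(original, target):
        needed = {part[len("INSERT:"):]
                  for _, tag in annotate(original, target)
                  for part in tag.split("|")[1:]}
        return needed <= kept  # every required insertion survives pruning

    updated = [(o, t) for o, t in sample_pairs if needs_only_kept(o, t)]
    return kept, updated  # retrain on `updated` with the pruned dictionary
```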
10. A model training method, comprising:
obtaining a sample data set, wherein the sample data set comprises a plurality of target style labels and at least one sample text pair corresponding to each target style label in the plurality of target style labels, and each sample text pair comprises an original sample text and a target sample text having a corresponding target style; and
for each sample text pair in the sample data set, performing the following:
inputting the original sample text and the target sample text in the sample text pair, together with the target style label corresponding to the sample text pair, into the model to obtain a target text prediction result output by the model; and
training the model based on the target text prediction result and the target sample text in the sample text pair.
11. A text data processing apparatus comprising:
a first generating unit configured to generate an original text for replying to an input text of a user based on the input text;
a first obtaining unit configured to obtain target style information, wherein the target style information includes at least one of a target style tag and a target style dictionary, the target style tag is used to indicate a target style into which the original text is to be converted, and the target style dictionary includes at least one corpus text corresponding to the target style; and
a second generating unit configured to generate a target text corresponding to the target style based on the original text and the target style information.
12. The apparatus of claim 11, wherein the second generating unit comprises:
a first determining subunit configured to determine, based on the target style label, a first model corresponding to the target style, the first model being obtained by training based on at least one first sample text pair and the target style dictionary, each of the at least one first sample text pair including a first original text and a first target text corresponding to the target style;
a first obtaining subunit configured to obtain a character sequence of the original text, the character sequence including at least one character of the original text;
a second obtaining subunit, configured to perform sequence labeling on the character sequence by using the first model based on the target style dictionary to obtain a labeled sequence, where the labeled sequence includes at least one operation tag respectively corresponding to the at least one character, and the at least one operation tag includes a retention tag and an insertion tag, the retention tag is used for indicating that a character corresponding to the retention tag is retained, and the insertion tag corresponds to one of the at least one corpus text and is used for indicating that a corresponding corpus text is inserted into the character sequence; and
a first generating subunit configured to generate the target text based on the annotation sequence.
13. The apparatus of claim 12, wherein the at least one character is plural in number, the at least one operation tag includes a retention tag and at least one of an insertion tag and a deletion tag, and the deletion tag is used to indicate that the character corresponding to the deletion tag is deleted from the character sequence.
14. The apparatus according to claim 12 or 13, wherein the target style dictionary further includes reference information, the reference information including at least one of a first operation tag and a usage probability corresponding to each corpus text of the at least one corpus text, the first operation tag indicating whether the operation corresponding to the respective corpus text is insertion or deletion, the usage probability being determined based on the occurrence frequency of the respective corpus text when the target style dictionary is constructed, and the second obtaining subunit is further configured to:
perform sequence labeling on the character sequence by using the first model based on the at least one corpus text and the reference information in the target style dictionary to obtain the labeled sequence.
15. The apparatus of claim 11, wherein the second generating unit comprises:
a third obtaining subunit, configured to input the target style label and the original text into a second model to obtain the target text output by the second model, wherein the second model is obtained by training based on a plurality of style labels and at least one second sample text pair corresponding to each style label in the plurality of style labels, and each second sample text pair in the at least one second sample text pair comprises a second original text and a second target text corresponding to the corresponding style label.
16. An apparatus for training a model for converting original text into a target style of text, the apparatus comprising:
a second obtaining unit, configured to obtain a sample data set, wherein the sample data set includes at least one sample text pair, and each sample text pair in the at least one sample text pair includes an original sample text and a target sample text corresponding to the target style;
a third obtaining unit, configured to obtain, for each sample text pair in the at least one sample text pair, an annotation sequence corresponding to the sample text pair, where the annotation sequence includes at least one operation tag respectively corresponding to at least one character of an original sample text in the sample text pair, where the at least one operation tag includes a retention tag and a modification tag, the retention tag is used to indicate a character that needs to be retained in comparison with a target sample text in the sample text pair, and the modification tag includes an insertion tag used to indicate a character that needs to be inserted in the original sample text in comparison with the target sample text;
a determining unit, configured to determine, as a corpus text, a character corresponding to an insertion tag in a labeling sequence corresponding to each of the at least one sample text pair, so as to construct a target style dictionary corresponding to the target style; and
a first execution unit configured to perform the following sub-unit operations for each sample text pair in the sample data set, the first execution unit comprising:
a first input subunit configured to input the corpus text in the target style dictionary, the original sample text in the sample text pair, and the target sample text into the model to obtain a labeling sequence prediction result output by the model; and
a first training subunit configured to train the model based on the labeling sequence prediction result and the labeling sequence corresponding to the sample text pair.
17. A model training apparatus comprising:
a fourth obtaining unit, configured to obtain a sample data set, where the sample data set includes a plurality of target style labels and at least one sample text pair corresponding to each of the plurality of target style labels, and each sample text pair includes an original sample text and a target sample text having a corresponding target style; and
a second execution unit configured to, for each sample text pair in the sample data set, perform the following sub-unit operations, the second execution unit comprising:
a second input subunit configured to input the original sample text and the target sample text in the sample text pair, together with the target style label corresponding to the sample text pair, into the model to obtain a target text prediction result output by the model; and
a second training subunit configured to train the model based on the target text prediction result and the target sample text in the sample text pair.
18. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
The memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-10.
19. A non-transitory computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-10.
20. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the method of any one of claims 1-10.
CN202211737328.2A 2022-12-30 2022-12-30 Text data processing method, model training method, device and medium Active CN115879469B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211737328.2A CN115879469B (en) 2022-12-30 2022-12-30 Text data processing method, model training method, device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211737328.2A CN115879469B (en) 2022-12-30 2022-12-30 Text data processing method, model training method, device and medium

Publications (2)

Publication Number Publication Date
CN115879469A true CN115879469A (en) 2023-03-31
CN115879469B CN115879469B (en) 2023-10-03

Family

ID=85757775

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211737328.2A Active CN115879469B (en) 2022-12-30 2022-12-30 Text data processing method, model training method, device and medium

Country Status (1)

Country Link
CN (1) CN115879469B (en)


Patent Citations (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090031211A1 (en) * 2007-07-23 2009-01-29 Yitao Yao Programming extension for authoring style rules
JP2017004051A (en) * 2015-06-04 2017-01-05 日本電信電話株式会社 Rewriting rule acquisition device, method, and program
CN108304436A (zh) * 2017-09-12 2018-07-20 深圳市腾讯计算机系统有限公司 The generation method of style sentence, the training method of model, device and equipment
CN108305612A (en) * 2017-11-21 2018-07-20 腾讯科技(深圳)有限公司 Text-processing, model training method, device, storage medium and computer equipment
CN111414732A (en) * 2019-01-07 2020-07-14 北京嘀嘀无限科技发展有限公司 Text style conversion method and device, electronic equipment and storage medium
US20200311195A1 (en) * 2019-04-01 2020-10-01 International Business Machines Corporation Controllable Style-Based Text Transformation
CN111797597A (en) * 2019-04-01 2020-10-20 国际商业机器公司 Controllable style-based text conversion
CN111865752A (en) * 2019-04-23 2020-10-30 北京嘀嘀无限科技发展有限公司 Text processing device, method, electronic device and computer readable storage medium
CN112016271A (en) * 2019-05-30 2020-12-01 北京三星通信技术研究有限公司 Language style conversion model training method, text processing method and device
KR20210000599A (en) * 2019-06-25 2021-01-05 주식회사 엔씨소프트 Mehtod and apparatus for learning style transfer
CN110738026A (en) * 2019-10-23 2020-01-31 腾讯科技(深圳)有限公司 Method and device for generating description text
JP2021106017A (en) * 2020-09-21 2021-07-26 ベイジン バイドゥ ネットコム サイエンス アンド テクノロジー カンパニー リミテッド Method for creating text, device, apparatus, and storage medium
KR20220044011A (en) * 2020-09-29 2022-04-06 아주대학교산학협력단 Method and system for text style transfer, and learning method for implementing same
CN112528605A (en) * 2020-11-11 2021-03-19 北京百度网讯科技有限公司 Text style processing method and device, electronic equipment and storage medium
CN113094490A (en) * 2021-05-13 2021-07-09 重庆度小满优扬科技有限公司 Session interaction method and device, electronic equipment and storage medium
CN113205811A (en) * 2021-05-25 2021-08-03 上海汽车集团股份有限公司 Conversation processing method and device and electronic equipment
CN113822064A (en) * 2021-06-22 2021-12-21 腾讯科技(深圳)有限公司 Text style migration method and device, electronic equipment and storage medium
CN113468857A (en) * 2021-07-13 2021-10-01 北京百度网讯科技有限公司 Method and device for training style conversion model, electronic equipment and storage medium
CN114490967A (en) * 2021-12-28 2022-05-13 北京百度网讯科技有限公司 Training method of dialogue model, dialogue method and device of dialogue robot and electronic equipment
CN114757188A (en) * 2022-05-20 2022-07-15 大连大学 Standard medical text rewriting method based on generation of confrontation network
CN115017288A (en) * 2022-06-17 2022-09-06 平安科技(深圳)有限公司 Model training method, model training device, equipment and storage medium
CN115186056A (en) * 2022-07-01 2022-10-14 Oppo广东移动通信有限公司 Text style migration method and device, electronic equipment and storage medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
MARTINA TOSHEVSKA ET AL: "A Review of Text Style Transfer Using Deep Learning", IEEE TRANSACTIONS ON ARTIFICIAL INTELLIGENCE, vol. 3, no. 5, pages 669-684, XP011924089, DOI: 10.1109/TAI.2021.3115992 *
XU MINZHANG ET AL: "Text Style Transfer between Classical and Modern Chinese through Prompt-Based Reinforcement Learning", WORLD WIDE WEB, vol. 26, no. 2, pages 733-750 *
YANG SHUO: "Tagging Without Rewriting: A Probabilistic Model for Unpaired Sentiment and Style Transfer", PROCEEDINGS OF THE 12TH WORKSHOP ON COMPUTATIONAL APPROACHES TO SUBJECTIVITY, SENTIMENT & SOCIAL MEDIA ANALYSIS, pages 293 *
ZHIQIANG HU ET AL: "Text Style Transfer: A Review and Experimental Evaluation", ACM SIGKDD EXPLORATIONS NEWSLETTER, vol. 24, no. 1, pages 14-45 *
CHEN XIAOLONG (陈小龙): "Research and Implementation of Text Style Transfer Based on Deep Learning", CHINA MASTER'S THESES FULL-TEXT DATABASE, INFORMATION SCIENCE AND TECHNOLOGY, no. 5, pages 138-1698 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116151194A (en) * 2023-04-04 2023-05-23 上海燧原科技有限公司 Method, device, equipment and storage medium for generating Chinese universal language

Also Published As

Publication number Publication date
CN115879469B (en) 2023-10-03

Similar Documents

Publication Publication Date Title
CN113807440A (en) Method, apparatus, and medium for processing multimodal data using neural networks
CN112579909A (en) Object recommendation method and device, computer equipment and medium
CN116303962B (en) Dialogue generation method, training method, device and equipment for deep learning model
CN112532748B (en) Message pushing method, device, equipment, medium and computer program product
CN114860995B (en) Video script generation method and device, electronic equipment and medium
CN115409922A (en) Three-dimensional hairstyle generation method and device, electronic equipment and storage medium
CN115470381A (en) Information interaction method, device, equipment and medium
CN115631251A (en) Method, apparatus, electronic device, and medium for generating image based on text
CN112559721A (en) Method, apparatus, device, medium and program product for adjusting man-machine dialog system
CN115879469B (en) Text data processing method, model training method, device and medium
CN114821581A (en) Image recognition method and method for training image recognition model
CN115862031A (en) Text processing method, neural network training method, device and equipment
CN113204616A (en) Method and device for training text extraction model and extracting text
CN113722594A (en) Recommendation model training method, recommendation device, electronic equipment and medium
CN113033179A (en) Knowledge acquisition method and device, electronic equipment and readable storage medium
CN112817463A (en) Method, equipment and storage medium for acquiring audio data by input method
CN116842156B (en) Data generation method, device, equipment and medium
CN113722534B (en) Video recommendation method and device
CN115019048B (en) Three-dimensional scene segmentation method, model training method and device and electronic equipment
CN115617968A (en) Dialogue method and device, equipment and medium
CN115577081A (en) Dialogue method and device, equipment and medium
CN114117046A (en) Data processing method, device, electronic equipment and medium
CN115578451A (en) Image processing method, and training method and device of image processing model
CN116756367A (en) Method and device for generating meeting summary, electronic equipment and medium
CN115879468A (en) Text element extraction method, device and equipment based on natural language understanding

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant