CN113268952A - Text generation method and device and electronic equipment - Google Patents

Info

Publication number
CN113268952A
Authority
CN
China
Prior art keywords
text
attribute
type
unit
generation model
Prior art date
Legal status
Granted
Application number
CN202110554777.2A
Other languages
Chinese (zh)
Other versions
CN113268952B
Inventor
张荣升
江琳
毛晓曦
范长杰
胡志鹏
Current Assignee
Netease Hangzhou Network Co Ltd
Original Assignee
Netease Hangzhou Network Co Ltd
Priority date
Filing date
Publication date
Application filed by Netease Hangzhou Network Co Ltd
Publication of CN113268952A
Application granted
Publication of CN113268952B
Legal status: Active

Classifications

    • G06F 40/126: Handling natural language data; character encoding
    • G06F 40/117: Text processing; tagging; marking up; designating a block; setting of attributes
    • G06F 40/216: Natural language analysis; parsing using statistical methods
    • G06N 3/045: Neural networks; combinations of networks
    • G06N 3/08: Neural networks; learning methods
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a text generation method and apparatus, and an electronic device. The method comprises: acquiring a text structure attribute and a text content attribute of a target text to be generated; and inputting the text structure attribute and the text content attribute into a pre-trained text generation model, which outputs the target text. The text structure attribute controls the text structure of the target text output by the text generation model to conform to the text structure attribute, and the text content attribute controls the text content of the target text to conform to the text content attribute. In this way, the structure and the content of the text output by the model can be controlled simultaneously, so that the output text satisfies specific requirements on both text structure and text content, improving the practicability of the generated literary works.

Description

Text generation method and device and electronic equipment
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a text generation method and device and electronic equipment.
Background
The main principle of literary creation by Artificial Intelligence (AI) is to feed sample corpora into a deep learning model, such as an autoregressive language model. By continuously optimizing its parameters, the model learns the statistical properties of natural language and the textual structure of the corpus, predicts the next character from the existing character sequence, and generates a complete text by iterating character by character. After being trained on a large amount of text, the model learns which characters frequently appear together and assigns a probability to each candidate character at the current position, thereby generating relatively fluent text. Different genres of literary works have different structural requirements. Taking lyrics as an example, a complete lyric often contains structures such as a verse, a refrain, and a bridge, and the text content differs between these structures. However, text created in the above way can hardly satisfy specific text structure requirements, so the practicability of the generated literary works is poor.
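By way of illustration only (the following code is not part of the patent), the character-iteration principle described above can be sketched as below; the model interface and the toy stand-in model are assumptions:

```python
import torch

def sample_next(model, ids):
    # Score every candidate character given the existing sequence,
    # turn the scores into probabilities, and sample one character.
    logits = model(ids)                       # vector of vocabulary scores
    probs = torch.softmax(logits, dim=-1)     # probability of each character
    return torch.multinomial(probs, 1).item()

# Toy stand-in model: uniform scores over a 6-character vocabulary.
# A real autoregressive language model would condition on `ids`.
toy_model = lambda ids: torch.zeros(6)

ids = [0]                                     # seed character
for _ in range(10):                           # iterate character by character
    ids.append(sample_next(toy_model, ids))
print(ids)
```

A real system stops at an end-of-text character rather than after a fixed number of steps; the fixed loop here only keeps the toy terminating.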
Disclosure of Invention
In view of the above, the present invention provides a text generation method, a text generation apparatus, and an electronic device, so that the text output by a model meets a specific text structure requirement and the practicability of the generated literary work is improved.
In a first aspect, an embodiment of the present invention provides a text generation method. The method comprises: acquiring a text structure attribute and a text content attribute of a target text to be generated; and inputting the text structure attribute and the text content attribute into a pre-trained text generation model, which outputs the target text. The text structure attribute is used for controlling the text structure of the target text output by the text generation model to conform to the text structure attribute; the text content attribute is used for controlling the text content of the target text output by the text generation model to conform to the text content attribute.

The text structure attribute includes at least one text structure type arranged in a preset order; each text structure type is used for controlling the text generation model to output text matching that type.

The target text comprises a lyric text; the text structure type comprises one or more of a verse, a refrain, and a bridge.

The step of inputting the text structure attribute and the text content attribute into the pre-trained text generation model and outputting the target text comprises: inputting the text structure attribute and the text content attribute into the text generation model, and performing the following operations through the text generation model: generating a type identifier of a text structure type contained in the text structure attribute; generating a partial text corresponding to the text structure type based on the type identifier; and obtaining the target text from the partial texts corresponding to the text structure types, the text content of the target text matching the text content attribute.

The step of generating the type identifier of the text structure type included in the text structure attribute comprises: based on the text structure type contained in the text structure attribute, when the text generation model generates the first character corresponding to the text structure type, adjusting the probabilities of the candidate characters used for generating the first character, the candidate characters including the type identifier of the text structure type, such that after adjustment the type identifier has the largest probability among the candidate characters; and outputting the type identifier of the text structure type as the first character.

The text structure attribute further includes a specified text amount corresponding to the text structure type, counted in units of a preset unit text. The step of generating the partial text corresponding to the text structure type based on the type identifier comprises: generating the unit text corresponding to the type identifier; if the number of unit texts generated so far is smaller than the specified text amount, generating a unit identifier and generating the unit text corresponding to the unit identifier based on the type identifier and the unit texts already generated; and if the number of unit texts generated so far reaches the specified text amount, combining the generated unit texts to obtain the partial text corresponding to the text structure type.

The step of combining the generated unit texts when their number reaches the specified text amount comprises: counting the total number of type identifiers and unit identifiers; and if this total reaches the specified text amount, combining the generated unit texts to obtain the partial text corresponding to the text structure type.

The text generation model is obtained by training in the following way: acquiring a sample text set, which comprises sample texts together with a text structure label and a text content label of each sample text; generating the text structure labels of at least some of the sample texts through a pre-trained structure label generation model; and training the text generation model based on the sample text set.

The structure label generation model generates the text structure labels as follows: the sample text is input into the structure label generation model, and the following operations are performed through the model: setting the text structure label of the first unit text in the sample text; and, for each unit text other than the first, setting a unit identifier label for the current unit text if its text structure type is the same as that of the immediately preceding unit text, or setting a text structure label for the current unit text if its text structure type differs from that of the immediately preceding unit text.
In a second aspect, an embodiment of the present invention provides a text generating apparatus, where the apparatus includes: the attribute determining module is used for acquiring the text structure attribute and the text content attribute of the target text to be generated; the text output module is used for inputting the text structure attribute and the text content attribute into a text generation model which is trained in advance and outputting a target text; wherein the text structure attribute is used for: controlling the text structure of the target text output by the text generation model to accord with the text structure attribute; the text content attributes are used to: and controlling the text content of the target text output by the text generation model to accord with the text content attribute.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a processor and a memory, where the memory stores machine executable instructions capable of being executed by the processor, and the processor executes the machine executable instructions to implement the text generation method.
In a fourth aspect, embodiments of the present invention provide a machine-readable storage medium storing machine-executable instructions that, when invoked and executed by a processor, cause the processor to implement the text generation method described above.
The embodiment of the invention has the following beneficial effects:
in the text generation method and apparatus and the electronic device described above, the text structure attribute and the text content attribute of a target text to be generated are first acquired; the text structure attribute and the text content attribute are then input into a pre-trained text generation model, which outputs the target text. The text structure attribute controls the text structure of the target text output by the text generation model to conform to the text structure attribute, and the text content attribute controls the text content of the target text to conform to the text content attribute. In this way, the structure and the content of the text output by the model can be controlled simultaneously, so that the output text satisfies specific requirements on both text structure and text content, improving the practicability of the generated literary works.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a flowchart of a text generation method according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating a GPT model trained with lyric corpus according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a sample text set constructed by manual labeling and model labeling together according to an embodiment of the present invention;
FIG. 4 is a diagram of a pre-trained Chinese BERT model predicting text structure labels for sample lyrics, according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a text generating apparatus according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Literary creation by AI is a popular research and application direction in natural language generation; for example, using AI to create poems, couplets, lyrics, novels, news articles, and other literary works is now widespread. In the related art, the technical solutions for literary creation through AI have mainly gone through the following stages:

First, when computing power and model sizes were small, literary works such as lyrics were generated sentence by sentence. The main process is to generate the first sentence with a seq2seq encoder-decoder framework, generate the second sentence from the first, generate the third sentence from the first two, and iterate in turn to obtain the complete work. The seq2seq framework consists of an encoder and a decoder: the encoder encodes an input sentence into an intermediate representation vector, and the decoder decodes the next character of the text by combining this intermediate vector with the partial text sequence decoded so far. The encoder and decoder of seq2seq may be implemented with a Recurrent Neural Network (RNN), a Convolutional Neural Network (CNN), a Transformer model, or the like.

Since generating a literary work iteratively one sentence at a time is slow and the relevance between preceding and following sentences in the result is weak, once computing power and model sizes improved, complete literary works were generated directly with an autoregressive language model. The principle of text generation by an autoregressive language model is to encode the characters generated so far, predict the probability distribution of the next character, and generate the whole work in sequence. Like the decoder of seq2seq, the neural network of the language model can be implemented based on an RNN, CNN, or Transformer structure.

With further improvements in computing power and the accumulation of large text corpora, pre-trained language models have greatly improved the performance of natural language generation. A pre-trained language model is an autoregressive language model trained on a large amount of unsupervised text. Because such a model has a large number of neural network parameters and is pre-trained on large-scale text, it learns the most basic statistical characteristics of natural language, grasps the correlation between long spans of text better, and produces more fluent generated text. On the basis of the pre-trained language model, the parameters are fine-tuned with a corpus of a specific type of literary work (such as lyrics), and the fine-tuned model can then be used to create works of that type.

Literary works created by a pre-trained language model have good fluency and relevance, but the generation process depends on probabilistic character sampling, so it has a certain randomness and lacks controllability of the output content. For a specific type of literary work, besides requiring high-quality text, the special structure of that type is often required. Taking lyrics as an example, a complete lyric includes structures such as a verse, a refrain, and a bridge, and the characteristics of the text differ somewhat between structures: the verse of a lyric is often a narrative of events, while the text in the refrain is more an expression of emotion. However, text created in the ways above can hardly satisfy specific text structure requirements, so the practicability of the generated literary works is poor.
Based on the above problems, the text generation method, the text generation device and the electronic device provided by the embodiments of the present invention can be applied to the creation process of various texts and literary works, and especially can be applied to the creation process of a literary work with a specific text structure.
First, referring to a flowchart of a text generation method shown in fig. 1, the method includes the following steps:
step S102, acquiring a text structure attribute and a text content attribute of a target text to be generated;
the user can determine and input the text structure attribute and the text content attribute of the target text according to the actual text generation requirement. The target text can be various types of literary works such as poems, couplets, lyrics, novels, news manuscripts and the like; the text structure attribute of the target text is generally determined according to the type of the target text. For example, when the target text is lyrics, the text structure attributes typically include a verse, a refrain, a bridge, etc.; when the target text is a couplet, the text structure attributes typically include upper-up, lower-down, horizontal batch, and the like. In this embodiment, the type of the target text and the text structure attribute of the target text are not specifically limited. The specific content of the text structure attribute of the target text can be determined according to the generation requirement of the target text, for example, the text structure attribute of the target text can be a song master, a song servant, a song master and a song servant by taking lyrics as an example; taking couplet as an example, the text structure attribute of the target text can be upper-level couplet and lower-level couplet.
Step S104, inputting the text structure attribute and the text content attribute into a text generation model which is trained in advance, and outputting a target text; wherein the text structure attribute is used for: controlling the text structure of the target text output by the text generation model to accord with the text structure attribute; the text content attributes are used to: and controlling the text content of the target text output by the text generation model to accord with the text content attribute.
During training, the text generation model learns the textual characteristics of each text structure and of each kind of text content. After training is completed, when the text structure attribute is input to the text generation model, the model can generate text whose content matches each text structure contained in the attribute; when the text content attribute is input, the model can generate text matching the content attributes it contains, such as keywords, topics, a text outline, or the beginning and ending content of the text.

Again taking lyrics as an example, if the text structure attribute is verse, refrain, verse, the text generation model first generates the text corresponding to a verse, then the text corresponding to a refrain, and finally the text corresponding to a verse, and the three texts are arranged in sequence and combined into the target text. The text structure of the target text then conforms to the text structure attribute input to the model, and no manual adjustment of the structure is needed. Meanwhile, during generation, the specific text content is controlled by the text content attribute; for example, if the text content attribute is the subject word "rainbow", at least part of the target text is associated with "rainbow".

With this text generation method, the text structure attribute and the text content attribute of the target text to be generated are first acquired and then input into the pre-trained text generation model, which outputs the target text; the text structure attribute controls the text structure of the target text output by the model to conform to the text structure attribute, and the text content attribute controls the text content to conform to the text content attribute. In this way, the structure and the content of the text output by the model can be controlled simultaneously, so that the output satisfies specific requirements on both text structure and text content, improving the practicability of the generated literary works.
In a specific implementation, the text structure attribute includes at least one text structure type arranged in a preset order, and each text structure type is used for controlling the text generation model to output text matching that type. If the text structure attribute includes only one text structure type, the model outputs only text of that type. If the text structure attribute includes a plurality of text structure types, their order of arrangement usually determines the order of the texts of the individual structures in the final target text. Because the model learns the characteristics of the text of each structure type during training, the text structure types in the attribute can control the model to generate text matching each type. In this way, the model can be controlled to output matching text for each text structure type, allowing finer and more accurate control of the content within each structure of the target text.

As an example, the target text includes a lyric text, and the text structure types in the text structure attribute include one or more of a verse, a refrain, and a bridge. A text structure attribute may include all of the text structure types; for example, the attribute may be verse, refrain, bridge, refrain, verse. It may also include only some of the text structure types; for example, the attribute may be verse, refrain, verse, refrain. Within one text structure attribute, the same structure type may appear one or more times; this is not specifically limited.
The following embodiment describes a specific implementation of inputting the text structure attribute and the text content attribute into the pre-trained text generation model and outputting the target text. The text structure attribute and the text content attribute are input into the text generation model, and the following operations are performed through the model: generating the type identifier of a text structure type contained in the text structure attribute; generating the partial text corresponding to the text structure type based on the type identifier; and obtaining the target text from the partial texts corresponding to the text structure types, the text content of the target text matching the text content attribute.

The text generation model usually predicts the next character from the existing text characters or from characters it has already predicted, and the complete target text is obtained by loop iteration. On this basis, the model generates the type identifier of a text structure type contained in the text structure attribute and predicts the following characters from that identifier, obtaining the partial text corresponding to the type; when the text structure attribute includes a plurality of text structure types, the partial text of each type is predicted from the corresponding type identifier, and the partial texts of all types are combined to obtain the target text.

In a specific implementation, when the text structure attribute includes a plurality of text structure types, the above operations are performed for each text structure type one by one, in the order in which the types are arranged, as the sketch after this paragraph illustrates. As an example, suppose the text structure attribute includes three text structure types arranged in sequence, type 1, type 2, and type 3, which may be the same or different. First, the type identifier of type 1 is generated, and the model predicts the following characters from it to obtain the partial text corresponding to type 1; since this partial text is predicted from the type identifier of type 1, it is strongly related to type 1 and its content matches type 1. After the partial text of type 1 is complete, the type identifier of type 2 is generated, and the model predicts the following characters to obtain the partial text corresponding to type 2; when predicting the characters of type 2, the partial text of type 1 can be taken as input, so the content of the type 2 text has a certain relevance to that of type 1. After the partial text of type 2 is complete, the type identifier of type 3 is generated in the same way, and the model predicts the partial text corresponding to type 3; when predicting the characters of type 3, the partial texts of type 1 and type 2 can be taken as input, so the content of the type 3 text has a certain relevance to that of types 1 and 2. Finally, the partial texts of type 1, type 2, and type 3 are arranged in sequence to obtain the target text.
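A minimal sketch of this ordered, context-carrying walk over structure types, assuming a stand-in `next_section` function in place of the real text generation model:

```python
def generate_by_structure(next_section, structure_types):
    # Produce one section per structure type, in order; each new section
    # is conditioned on all previously generated text so that later
    # sections stay relevant to earlier ones.
    context, sections = "", []
    for t in structure_types:            # e.g. type 1, type 2, type 3
        s = next_section(t, context)     # conditioned on earlier sections
        context += s                     # becomes input for the next type
        sections.append(s)
    return "".join(sections)

# Toy stand-in for the model.
toy = lambda t, ctx: f"<{t}> line about {t} (saw {len(ctx)} chars)\n"
print(generate_by_structure(toy, ["zhu", "fu", "zhu"]))
```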
The type identifier of the text structure type may also be understood as a character predicted by the text generation model. In order to match the output type identifier with the text structure type and simultaneously match the text content of the text output by the model with the text structure type, the type identifier of the text structure type can be used as the first character in the text corresponding to the text structure type.
Specifically, based on a text structure type contained in the text structure attribute, when the text generation model generates the first character corresponding to that text structure type, the probabilities of the candidate characters used for generating the first character are adjusted; the candidate characters include the type identifier of the text structure type, and after adjustment the type identifier has the largest probability among the candidate characters, so the type identifier of the text structure type is output as the first character. Each time the text generation model predicts a character, it first computes the probability of every candidate character and then takes a candidate with a higher probability as the output character. On this basis, when the text structure attribute contains a text structure type, the model is required to output text content corresponding to that type; to do so, it must first output the type identifier corresponding to the type and then predict the subsequent characters from that identifier. The first character corresponding to the text structure type therefore needs to be controlled to be the type identifier. To achieve this, when the candidate characters for the first character are generated, each candidate usually has an initial probability; the probability of the type identifier corresponding to the text structure type can then be adjusted to a large probability value, for example 1 or 0.99, ensuring that the first character is the type identifier corresponding to the text structure type.

When the text structure attribute includes a plurality of text structure types, the probability of the type identifier can be adjusted to a high value for each text structure type in turn, ensuring that the first character corresponding to each type is its type identifier. In this way, the type identifier of the text structure type is generated first, and the text content within the structure type is predicted from it, so that the content matches the current text structure type.
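A sketch of the probability adjustment itself, assuming the model exposes a vector of scores (logits) over the candidate characters; in effect the type identifier's probability is pushed to approximately 1:

```python
import torch

def force_first_character(logits, type_identifier_id):
    # Adjust the candidate-character distribution so that the type
    # identifier is certain to be emitted as the section's first character.
    forced = torch.full_like(logits, float("-inf"))
    forced[type_identifier_id] = 0.0     # softmax gives probability ~1 here
    return forced

# Toy check: 5-character vocabulary, id 3 plays the role of <zhu>.
logits = torch.randn(5)
probs = torch.softmax(force_first_character(logits, 3), dim=-1)
print(probs)   # essentially all probability mass on index 3
```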
In addition, in order to control the text content of the target text, a text subject may be input besides the text structure attribute. For example, if the keyword "spring" is input, the subsequently generated text may begin with "spring" or take spring as its subject, thereby controlling the text content.

Further, in addition to the text structure types, the text structure attribute may include a specified text amount corresponding to each text structure type. The specified text amount is counted in units of a preset unit text, which can be a character, a sentence, a line, a paragraph, or the like; the specified text amount corresponding to each text structure type can be determined as required. For example, the specified text amount of text structure type 1 may be five sentences and that of text structure type 2 three sentences; or the amount of type 1 may be two paragraphs and that of type 2 four paragraphs, and so on. When the target text is lyrics, the text structure attribute can be, for example, a verse of 5 lines, a refrain of 4 lines, a verse of 5 lines, and a refrain of 4 lines.
By setting the specified text amount of each text structure type, the text structure of the target text can be more accurately controlled, and the specific authoring requirements are met on the text amount.
On the above basis, the partial text corresponding to a text structure type is generated from the type identifier in the following way: generate the unit text corresponding to the type identifier; if the number of unit texts generated so far is smaller than the specified text amount, generate a unit identifier and then generate the unit text corresponding to that identifier, based on the type identifier and the unit texts already generated; and if the number of unit texts generated so far reaches the specified text amount, combine the generated unit texts to obtain the partial text corresponding to the text structure type. The unit text can likewise be a character, a sentence, a line, a paragraph, or the like. Each time a unit text is generated, it is judged whether the number of unit texts generated so far has reached the specified text amount; if not, unit texts must continue to be generated. To make it convenient to count the generated unit texts, a unit identifier can be set before each unit text; for the first unit text, the type identifier serves as its unit identifier, and no separate unit identifier is set; for each unit text generated after the first, a unit identifier is generated before it. When counting the number of unit texts generated so far, the total number of type identifiers and unit identifiers is counted; when this total reaches the specified text amount, the generated unit texts are combined to obtain the partial text corresponding to the text structure type.

As an example, suppose the type identifier of a certain text structure type is <zhu> and the unit identifier is <0>. If the specified text amount of this text structure type is five unit texts, the identifiers of the five unit texts are, in turn, <zhu> <0> <0> <0> <0>, and the content output by the text generation model is: <zhu> unit text 1 <0> unit text 2 <0> unit text 3 <0> unit text 4 <0> unit text 5. When the first type identifier <zhu> is generated, its probability can be set to 1, ensuring that the first output character is the type identifier; when a unit identifier is generated, its probability can likewise be set to 1, ensuring that the output character is the unit identifier.
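The identifier-counting scheme of the <zhu>/<0> example can be sketched as follows; the stand-in line generator and the identifier strings are illustrative assumptions:

```python
import random

TYPE_IDS = {"verse": "<zhu>", "refrain": "<fu>"}
UNIT_ID = "<0>"

def toy_next_unit(context):
    # Stand-in for the model: a real system would decode characters
    # until a line boundary, conditioned on `context`.
    return "".join(random.choice("abcdef") for _ in range(6))

def generate_section(structure_type, amount):
    # Emit the type identifier before the first unit text and a unit
    # identifier before each later one; stop when the total number of
    # identifiers (type + unit) reaches the specified text amount.
    out = [TYPE_IDS[structure_type]]          # identifier count = 1
    out.append(toy_next_unit("".join(out)))   # unit text 1
    identifiers = 1
    while identifiers < amount:
        out.append(UNIT_ID)                   # one more identifier
        identifiers += 1
        out.append(toy_next_unit("".join(out)))
    return "".join(out)

print(generate_section("verse", 5))  # <zhu>u1<0>u2<0>u3<0>u4<0>u5
```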
As described in the foregoing embodiments, the text generation model needs to be trained in advance. Specifically, it is trained in the following way: acquire a sample text set, which comprises sample texts together with a text structure label and a text content label of each sample text; generate the text structure labels of at least some of the sample texts through a pre-trained structure label generation model; and train the text generation model based on the sample text set. The text structure label indicates the text structure of the sample text and contains the same kind of content as the text structure attribute; for details, refer to the description of the text structure attribute in the foregoing embodiments. The text content label indicates the text content of the sample text and contains the same kind of content as the text content attribute; for details, refer to the description of the text content attribute in the foregoing embodiments.

During training, the text generation model learns the influence of the text structure label and the text content label on the output text. After the training of the text generation model is completed, a user can set the number of sentences or paragraphs of each text structure type; the resulting text structure attribute is input into the model, the model controls the decoding process according to it, and text conforming to the user's settings is returned to the user. Likewise, the user can set a specific text content attribute; after it is input into the model, the model controls the decoding process according to the text content attribute so that the text content conforms to it.
For example only, the text generation model may be implemented with the language model GPT (Generative Pre-Training), and the parameters of the model may be fine-tuned according to the genre of the target text; for example, if the target text is lyrics, the model can be retrained on a lyric corpus to adjust its parameters. The GPT model is an autoregressive language model built on the Transformer framework: the words that have already appeared are used to predict the next word, and during training the maximum likelihood loss is computed from the probability of the predicted word, thereby optimizing the model parameters. The loss function can be expressed as Loss(U) = Σ_i log P(u_i | u_<i; θ), where u_i is the word currently being output, u_<i are the words output so far, and θ denotes the model parameters being optimized. Because the GPT model has many parameters and is pre-trained on a large sample, it can grasp the correlation between long spans of text, improving the fluency of the output text.
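A sketch of this training objective in code; minimizing the negative of the sum above is equivalent to maximizing the likelihood. The tensor shapes and toy data are assumptions:

```python
import torch
import torch.nn.functional as F

def autoregressive_loss(logits, targets):
    # loss(U) = -sum_i log P(u_i | u_<i; theta): `logits` holds the
    # model's scores for the next character at each position
    # ([seq_len, vocab]); `targets` holds the characters that actually
    # come next in the training lyric ([seq_len]).
    log_probs = F.log_softmax(logits, dim=-1)
    picked = log_probs[torch.arange(targets.numel()), targets]
    return -picked.sum()

# Toy check: a 4-character sequence over a 10-character vocabulary.
logits = torch.randn(4, 10)
targets = torch.tensor([1, 5, 2, 7])
print(autoregressive_loss(logits, targets))
```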
FIG. 2 shows a schematic diagram of training the GPT model with a lyric corpus. The samples input into the model carry text structure labels, such as <zhu> and <O> in the figure, and are fed into the GPT model character by character, so that the model learns the relation between the text structure types and the text content until the model's loss converges, yielding the text generation model.

In the related art, the labels of sample texts are usually annotated manually, which is costly, time-consuming, labor-intensive, and inefficient, and the number of labelled sample texts obtained is limited; if a text generation model is trained on only a small number of sample texts, the training effect is hard to guarantee. In order to obtain a large number of labelled sample texts quickly and accurately, in this embodiment at least some of the sample texts are labelled by a pre-trained structure label generation model. This improves the efficiency of sample labelling, reduces the labelling cost, quickly and accurately yields a large number of labelled samples, and improves the training effect of the text generation model.

First, the structure label generation model itself must be trained: its training samples are annotated with text structure labels in advance, and the structure label generation model is then trained on these labelled samples so that it learns the relation between the text structure labels and the text content. After training is finished, inputting a sample text into the structure label generation model yields the text structure type corresponding to each part of the sample text. The text structure labels of the training samples in the sample text set used to train the text generation model can all be produced by the structure label generation model, or one part can be labelled by the model while the other part is labelled manually.

FIG. 3 shows a schematic diagram of constructing the sample text set through manual annotation and model annotation together. Taking lyric text as an example, the text structure attribute of a lyric text can also be called a paragraph attribute. After the lyric corpus is obtained, the text structure labels of part of the lyric texts are annotated manually, the structure label generation model is trained on the manually annotated samples, and the trained structure label generation model then labels the remaining unlabelled lyric corpus, yielding the model-labelled lyric corpus.
After the training of the structure label generation model is completed, the model generates text structure labels in the following way. The sample text is input into the structure label generation model, and the following operations are performed through the model: setting the text structure label of the first unit text in the sample text; and, for each unit text other than the first, setting a unit identifier label for the current unit text if its text structure type is the same as that of the immediately preceding unit text, or setting a text structure label for the current unit text if its text structure type differs from that of the immediately preceding unit text.
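The labelling rule just described reduces to a few lines; the tag spellings follow the <zhu>/<fu>/<O> convention used below, and the per-unit structure types are assumed to come from the label model:

```python
def label_units(unit_types):
    # First unit text: its text structure label. Later unit texts: the
    # unit identifier <O> if the structure type is unchanged from the
    # previous unit text, otherwise a new text structure label.
    tags = []
    for i, t in enumerate(unit_types):
        if i == 0 or t != unit_types[i - 1]:
            tags.append(f"<{t}>")   # structure label on first unit / type change
        else:
            tags.append("<O>")      # same type as the previous unit text
    return tags

print(label_units(["zhu", "zhu", "fu", "fu", "zhu"]))
# ['<zhu>', '<O>', '<fu>', '<O>', '<zhu>']
```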
In the structure label generation model, the text structure type is identified for each unit text and a text structure label is set. In most cases, several consecutive unit texts belong to the same text structure type; in that case the text structure label is not repeated, and a unit identifier label is set instead, which connects the consecutive unit texts belonging to the same structure type. For ease of understanding, an example of a sample text provided with text structure labels is given:
[The labelled sample lyric appears as an image in the original publication; it shows paragraphs of lyric text, each preceded by an identifier such as <zhu>, <fu>, or <O>.]
In the above example of lyric text, a text structure label or a unit identifier label is set for each paragraph of text: the unit text is a paragraph, and one label is set per paragraph. Here <zhu> is the type identifier of the text structure type verse, <fu> is the type identifier of the text structure type refrain, and <O> is the unit identifier.
For example only, the structure label generation model may use a pre-trained Chinese BERT (Bidirectional Encoder Representations from Transformers) model, which is built on the Transformer framework and trained with a large amount of Chinese corpus. FIG. 4 shows a schematic diagram of the pre-trained Chinese BERT model predicting text structure labels for sample lyrics; each time a unit of text is input, a label such as <zhu> or <O> in the figure is output.
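A hedged sketch of such a per-unit classifier built on a public Chinese BERT checkpoint; the checkpoint name, the label set, and the use of a sequence-classification head are assumptions rather than the patent's specification, and the classification head would have to be fine-tuned on the manually labelled lyrics before its predictions are meaningful:

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

LABELS = ["<zhu>", "<fu>", "<O>"]            # assumed label set

tok = BertTokenizer.from_pretrained("bert-base-chinese")
clf = BertForSequenceClassification.from_pretrained(
    "bert-base-chinese", num_labels=len(LABELS))

def predict_structure_label(unit_text: str) -> str:
    # Classify one unit text (a lyric line or paragraph) into a
    # structure label, in the spirit of FIG. 4.
    inputs = tok(unit_text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = clf(**inputs).logits        # shape [1, num_labels]
    return LABELS[logits.argmax(dim=-1).item()]

print(predict_structure_label("窗外的雨下了一整夜"))
```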
Corresponding to the above method embodiment, referring to fig. 5, a schematic structural diagram of a text generating apparatus is shown, where the apparatus includes:
the attribute determining module 50 is configured to obtain a text structure attribute and a text content attribute of a target text to be generated;
the text output module 52 is configured to input the text structure attribute and the text content attribute into a text generation model that is trained in advance, and output a target text;
wherein the text structure attribute is used for: controlling the text structure of the target text output by the text generation model to accord with the text structure attribute; the text content attributes are used to: and controlling the text content of the target text output by the text generation model to accord with the text content attribute.
With the above text generation apparatus, the text structure attribute and the text content attribute of a target text to be generated are first acquired and then input into a pre-trained text generation model, which outputs the target text; the text structure attribute controls the text structure of the target text output by the model to conform to the text structure attribute, and the text content attribute controls the text content to conform to the text content attribute. In this way, the structure and the content of the text output by the model can be controlled simultaneously, so that the output satisfies specific requirements on both text structure and text content, improving the practicability of the generated literary works.

The text structure attribute includes at least one text structure type arranged in a preset order; each text structure type is used for controlling the text generation model to output text matching that type.

The target text comprises a lyric text; the text structure type comprises one or more of a verse, a refrain, and a bridge.

The text output module is further configured to: input the text structure attribute and the text content attribute into the text generation model, and perform the following operations through the text generation model: generating a type identifier of a text structure type contained in the text structure attribute; generating a partial text corresponding to the text structure type based on the type identifier; and obtaining the target text from the partial texts corresponding to the text structure types, the text content of the target text matching the text content attribute.

The text output module is further configured to: based on the text structure type contained in the text structure attribute, when the text generation model generates the first character corresponding to the text structure type, adjust the probabilities of the candidate characters used for generating the first character, the candidate characters including the type identifier of the text structure type, such that after adjustment the type identifier has the largest probability among the candidate characters; and output the type identifier of the text structure type as the first character.

The text structure attribute further includes a specified text amount corresponding to the text structure type, counted in units of a preset unit text. The text output module is further configured to: generate the unit text corresponding to the type identifier; if the number of unit texts generated so far is smaller than the specified text amount, generate a unit identifier and generate the unit text corresponding to the unit identifier based on the type identifier and the unit texts already generated; and if the number of unit texts generated so far reaches the specified text amount, combine the generated unit texts to obtain the partial text corresponding to the text structure type.

The text output module is further configured to: count the total number of type identifiers and unit identifiers; and if this total reaches the specified text amount, combine the generated unit texts to obtain the partial text corresponding to the text structure type.

The apparatus further comprises a model training module for obtaining the text generation model through the following training: acquiring a sample text set, which comprises sample texts together with a text structure label and a text content label of each sample text; generating the text structure labels of at least some of the sample texts through a pre-trained structure label generation model; and training the text generation model based on the sample text set.

The apparatus further comprises a label generation module by which the structure label generation model generates the text structure labels as follows: the sample text is input into the structure label generation model, and the following operations are performed through the model: setting the text structure label of the first unit text in the sample text; and, for each unit text other than the first, setting a unit identifier label for the current unit text if its text structure type is the same as that of the immediately preceding unit text, or setting a text structure label for the current unit text if its text structure type differs from that of the immediately preceding unit text.
The embodiment also provides an electronic device, which comprises a processor and a memory, wherein the memory stores machine executable instructions capable of being executed by the processor, and the processor executes the machine executable instructions to realize the text generation method. The electronic device may be a server or a terminal device.
Referring to fig. 6, the electronic device includes a processor 100 and a memory 101, the memory 101 stores machine executable instructions capable of being executed by the processor 100, and the processor 100 executes the machine executable instructions to implement the text generation method.
Further, the electronic device shown in fig. 6 further includes a bus 102 and a communication interface 103, and the processor 100, the communication interface 103, and the memory 101 are connected through the bus 102.
The memory 101 may include a high-speed random access memory (RAM) and may also include a non-volatile memory, for example at least one disk memory. The communication connection between a network element of the system and at least one other network element is realized through at least one communication interface 103 (which may be wired or wireless); the internet, a wide area network, a local area network, a metropolitan area network, and the like can be used. The bus 102 may be an ISA bus, a PCI bus, an EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one double-headed arrow is shown in FIG. 6, but this does not indicate only one bus or one type of bus.

The processor 100 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or by instructions in the form of software in the processor 100. The processor 100 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The methods, steps, and logic blocks disclosed in the embodiments of the present invention may be implemented or performed by such a processor. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the embodiments of the present invention may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in a decoding processor. The software module may be located in RAM, flash memory, ROM, PROM, EPROM, registers, or other storage media well known in the art. The storage medium is located in the memory 101, and the processor 100 reads the information in the memory 101 and completes the steps of the method of the foregoing embodiments in combination with its hardware.
The present embodiments also provide a machine-readable storage medium having stored thereon machine-executable instructions that, when invoked and executed by a processor, cause the processor to implement the text generation method described above.
The computer program product of the text generation method and apparatus and the electronic device provided by the embodiments of the present invention includes a computer-readable storage medium storing program code. The instructions included in the program code may be used to execute the method described in the foregoing method embodiments; for specific implementations, refer to the method embodiments, which are not repeated here.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In addition, in the description of the embodiments of the present invention, unless otherwise explicitly specified or limited, the terms "mounted," "connected," and "coupled" are to be construed broadly, e.g., as a fixed connection, a removable connection, or an integral connection; as a mechanical connection or an electrical connection; as a direct connection, an indirect connection through an intermediate medium, or internal communication between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific circumstances.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In the description of the present invention, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc., indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of description and simplicity of description, but do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that the foregoing embodiments are merely specific implementations of the present invention, used to illustrate rather than limit its technical solutions, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that any person familiar with the technical field may still modify the technical solutions described in the foregoing embodiments, or easily conceive of changes, or substitute equivalents for some of the technical features within the technical scope of the present disclosure; such modifications, changes, or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present invention, and shall all be covered within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (12)

1. A method of text generation, the method comprising:
acquiring a text structure attribute and a text content attribute of a target text to be generated;
inputting the text structure attribute and the text content attribute into a pre-trained text generation model, and outputting the target text;
wherein the text structure attribute is used to control the text structure of the target text output by the text generation model to conform to the text structure attribute, and the text content attribute is used to control the text content of the target text output by the text generation model to conform to the text content attribute.
2. The method of claim 1, wherein the text structure attribute comprises: at least one text structure type arranged in a preset order; the text structure type is used to control the text generation model to output text matching the text structure type.
3. The method of claim 2, wherein the target text comprises lyric text, and the text structure type comprises one or more of a verse, a chorus, and a bridge.
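As an illustrative reading of claims 1-3, the two attributes can be pictured as a serialized conditioning prefix fed to the generation model. The sketch below is an assumption for orientation only: the tag format, attribute names, and the build_condition helper are hypothetical, not the patent's actual encoding.

```python
from typing import Dict, List

def build_condition(structure: List[str], content: Dict[str, str]) -> str:
    """Serialize a text structure attribute (ordered section types) and a
    text content attribute (topic keywords) into one conditioning prefix."""
    structure_part = " ".join(f"<{section}>" for section in structure)
    content_part = " ".join(f"{key}={value}" for key, value in content.items())
    return f"[STRUCTURE] {structure_part} [CONTENT] {content_part} [TEXT]"

# Lyric example using claim 3's section types: verse, chorus, bridge.
prefix = build_condition(
    ["verse", "chorus", "bridge", "chorus"],
    {"theme": "homesickness", "keyword": "moonlight"},
)
print(prefix)
# [STRUCTURE] <verse> <chorus> <bridge> <chorus> [CONTENT] theme=homesickness keyword=moonlight [TEXT]
```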
4. The method of claim 1, wherein the step of inputting the text structure attribute and the text content attribute into a pre-trained text generation model and outputting the target text comprises:
inputting the text structure attribute and the text content attribute into the text generation model, and executing the following operations through the text generation model:
generating a type identifier of a text structure type contained in the text structure attribute;
generating a partial text corresponding to the text structure type based on the type identifier;
obtaining the target text from the partial texts corresponding to the text structure types, wherein the text content of the target text matches the text content attribute.
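Claim 4's model-side loop — one type identifier, then one partial text per structure type, combined into the target text — can be sketched as follows. The names generate_identifier and generate_section are hypothetical stand-ins for the model's two decoding steps, not an API from the patent.

```python
def generate_target_text(structure_types, generate_identifier, generate_section):
    """For each text structure type in order: emit its type identifier,
    generate the partial text for that section, then combine the sections."""
    parts = []
    for structure_type in structure_types:
        identifier = generate_identifier(structure_type)  # e.g. "<verse>"
        parts.append(generate_section(identifier))        # partial text
    return "\n\n".join(parts)

# Toy stand-ins so the sketch runs; a real model would decode both steps.
gen_id = lambda t: f"<{t}>"
gen_sec = lambda ident: f"{ident}\nplaceholder line 1\nplaceholder line 2"
print(generate_target_text(["verse", "chorus"], gen_id, gen_sec))
```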
5. The method according to claim 4, wherein the step of generating the type identifier of the text structure type contained in the text structure attribute comprises:
based on the text structure type contained in the text structure attribute, when the text generation model generates the first character corresponding to the text structure type, adjusting the probabilities of the candidate characters for generating the first character, wherein the candidate characters include the type identifier of the text structure type, and the adjusted probability of the type identifier is the largest among the adjusted probabilities of the candidate characters;
and outputting the type identifier of the text structure type as the first character.
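The probability adjustment in claim 5 amounts to rescoring the candidate characters at a section's first position so that the type identifier wins the argmax. A minimal sketch under that reading, using plain score lists in place of a real model's logits:

```python
def force_type_identifier(scores, type_token_index):
    """At the first character of a section, adjust candidate-character
    scores so the type identifier has the maximum score (claim 5)."""
    adjusted = list(scores)
    adjusted[type_token_index] = max(adjusted) + 1.0  # guarantee the argmax
    return adjusted

# Toy vocabulary of four candidate characters; index 2 is "<chorus>".
scores = [0.2, 1.3, -0.5, 0.8]
adjusted = force_type_identifier(scores, 2)
assert max(range(len(adjusted)), key=adjusted.__getitem__) == 2
print(adjusted)  # [0.2, 1.3, 2.3, 0.8]
```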
6. The method of claim 4, wherein the text structure attribute further comprises: a specified text amount corresponding to the text structure type, the specified text amount being counted in numbers of a preset unit text;
the step of generating the partial text corresponding to the text structure type based on the type identifier comprises:
generating a unit text corresponding to the type identifier based on the type identifier;
if the number of unit texts generated so far is smaller than the specified text amount, generating a unit identifier, and generating the unit text corresponding to the unit identifier based on the type identifier and the unit texts already generated;
and if the number of unit texts generated so far reaches the specified text amount, combining the generated unit texts to obtain the partial text corresponding to the text structure type.
7. The method of claim 6, wherein the step of combining the generated unit texts to obtain the partial text corresponding to the text structure type, if the number of unit texts generated so far reaches the specified text amount, comprises:
counting the total number of the type identifiers and the unit identifiers;
and if the total number of identifiers reaches the specified text amount, combining the generated unit texts to obtain the partial text corresponding to the text structure type.
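Claims 6 and 7 describe a stopping rule: each unit text (e.g., a lyric line) is opened by either the type identifier or a unit identifier, and the section closes once the identifier count reaches the specified text amount. A runnable sketch of that rule, with a stub in place of the model's decoding:

```python
import itertools

def generate_section(generate_unit, type_token, unit_token, quota):
    """Generate one section unit by unit; stop when the total number of
    type + unit identifiers reaches the specified text amount (claim 7)."""
    identifiers = [type_token]               # type identifier opens the section
    units = [generate_unit(identifiers, [])]
    while len(identifiers) < quota:          # count of identifiers emitted so far
        identifiers.append(unit_token)       # unit identifier opens each further unit
        units.append(generate_unit(identifiers, units))
    return "\n".join(units)                  # combine into the partial text

# Stub: numbered placeholder lines instead of model-decoded unit texts.
counter = itertools.count(1)
stub = lambda idents, units: f"line {next(counter)}"
print(generate_section(stub, "<verse>", "<unit>", quota=4))
```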
8. The method of claim 1, wherein the text generation model is trained by:
acquiring a sample text set, wherein the sample text set comprises sample texts and the text structure labels and text content labels of the sample texts, and the text structure labels of at least a part of the sample texts are generated by a pre-trained structure label generation model;
training the text generation model based on the sample text set.
9. The method of claim 8, wherein the structure label generation model generates the text structure labels by:
inputting the sample text into the structure label generation model, and executing the following operations through the structure label generation model:
setting a text structure label for the first unit text in the sample text;
for each unit text other than the first, if the text structure type of the current unit text is the same as that of the previous unit text, setting a unit identifier label for the current unit text;
and if the text structure type of the current unit text differs from that of the previous unit text, setting a text structure label for the current unit text.
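The labeling rule of claim 9 — a structure label for the first unit text, a unit identifier when the type repeats, a fresh structure label when it changes — can be sketched as below. Here classify_type is a hypothetical per-unit classifier standing in for the structure label generation model's internal typing.

```python
def label_units(units, classify_type, unit_token="<unit>"):
    """Interleave labels with unit texts to build training samples: the
    first unit always gets a structure label (prev_type starts as None),
    later units get a unit identifier only when their type repeats."""
    labeled, prev_type = [], None
    for unit in units:
        unit_type = classify_type(unit)
        labeled.append(unit_token if unit_type == prev_type else f"<{unit_type}>")
        labeled.append(unit)
        prev_type = unit_type
    return labeled

# Toy classifier (illustrative rule only): repeated "la" lines are chorus.
toy_type = lambda line: "chorus" if line.startswith("la") else "verse"
print(label_units(["walking home", "under the rain", "la la la", "la la la"], toy_type))
# ['<verse>', 'walking home', '<unit>', 'under the rain',
#  '<chorus>', 'la la la', '<unit>', 'la la la']
```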
10. An apparatus for generating text, the apparatus comprising:
the attribute determining module is used for acquiring the text structure attribute and the text content attribute of the target text to be generated;
the text output module is used for inputting the text structure attribute and the text content attribute into a pre-trained text generation model, and outputting the target text;
wherein the text structure attribute is used to control the text structure of the target text output by the text generation model to conform to the text structure attribute, and the text content attribute is used to control the text content of the target text output by the text generation model to conform to the text content attribute.
11. An electronic device comprising a processor and a memory, the memory storing machine executable instructions executable by the processor, the processor executing the machine executable instructions to implement the text generation method of any of claims 1-9.
12. A machine-readable storage medium having stored thereon machine-executable instructions which, when invoked and executed by a processor, cause the processor to implement the text generation method of any of claims 1-9.
CN202110554777.2A 2021-04-26 2021-05-20 Text generation method and device and electronic equipment Active CN113268952B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110457116 2021-04-26
CN2021104571168 2021-04-26

Publications (2)

Publication Number Publication Date
CN113268952A true CN113268952A (en) 2021-08-17
CN113268952B CN113268952B (en) 2024-03-01

Family

ID=77232242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110554777.2A Active CN113268952B (en) 2021-04-26 2021-05-20 Text generation method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN113268952B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011175006A (en) * 2010-02-23 2011-09-08 Sony Corp Information processing apparatus, automatic composition method, learning device, learning method and program
CN110097085A (en) * 2019-04-03 2019-08-06 阿里巴巴集团控股有限公司 Lyrics document creation method, training method, device, server and storage medium
CN111460833A (en) * 2020-04-01 2020-07-28 合肥讯飞数码科技有限公司 Text generation method, device and equipment
CN111783455A (en) * 2020-07-13 2020-10-16 网易(杭州)网络有限公司 Training method and device of text generation model and text generation method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
秋至明月华6: "Tutorial: using lyric-filling software (AI lyric-writing software) to automatically fill in lyrics for a song", Retrieved from the Internet <URL:https://haokan.***.com/v?pd=wisenatural&vid=4598011642394653704> *

Also Published As

Publication number Publication date
CN113268952B (en) 2024-03-01

Similar Documents

Publication Publication Date Title
CN110489555B (en) Language model pre-training method combined with similar word information
CN108460013B (en) Sequence labeling model and method based on fine-grained word representation model
CN110580292B (en) Text label generation method, device and computer readable storage medium
CN112101041B (en) Entity relationship extraction method, device, equipment and medium based on semantic similarity
CN111709242B (en) Chinese punctuation mark adding method based on named entity recognition
CN108710704B (en) Method and device for determining conversation state, electronic equipment and storage medium
CN112101031B (en) Entity identification method, terminal equipment and storage medium
CN113255320A (en) Entity relation extraction method and device based on syntax tree and graph attention machine mechanism
CN111859964A (en) Method and device for identifying named entities in sentences
CN113177412A (en) Named entity identification method and system based on bert, electronic equipment and storage medium
CN110751234A (en) OCR recognition error correction method, device and equipment
CN111291565A (en) Method and device for named entity recognition
CN115859164A (en) Method and system for identifying and classifying building entities based on prompt
CN112036186A (en) Corpus labeling method and device, computer storage medium and electronic equipment
CN111898339B (en) Ancient poetry generating method, device, equipment and medium based on constraint decoding
CN113076749A (en) Text recognition method and system
CN112069816A (en) Chinese punctuation adding method, system and equipment
CN113268952B (en) Text generation method and device and electronic equipment
CN115906854A (en) Multi-level confrontation-based cross-language named entity recognition model training method
CN113901210B (en) Method for marking verbosity of Thai and Burma characters by using local multi-head attention to mechanism fused word-syllable pair
CN115221284A (en) Text similarity calculation method and device, electronic equipment and storage medium
CN115129843A (en) Dialog text abstract extraction method and device
CN111090720B (en) Hot word adding method and device
Qiang et al. Back-translation-style data augmentation for Mandarin Chinese polyphone disambiguation
CN114298032A (en) Text punctuation detection method, computer device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant