CN112287667A - Text generation method and equipment - Google Patents

Text generation method and equipment

Info

Publication number
CN112287667A
Authority
CN
China
Prior art keywords
text
generation
model
initial
probability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011156277.5A
Other languages
Chinese (zh)
Inventor
卫海天
丁若谷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Minglue Zhaohui Technology Co Ltd
Original Assignee
Beijing Minglue Zhaohui Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Minglue Zhaohui Technology Co Ltd
Priority to CN202011156277.5A
Publication of CN112287667A
Legal status: Pending (current)

Classifications

    • G06F40/216 — Handling natural language data; natural language analysis; parsing using statistical methods
    • G06F16/35 — Information retrieval of unstructured textual data; clustering; classification
    • G06F40/44 — Processing or translation of natural language; data-driven translation; statistical methods, e.g. probability models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a text generation method and device. The text generation method comprises the following steps: acquiring an input text; performing parameter-optimization training on a pre-training model according to the input text to obtain a text generation model; and applying the text generation model to the input text to generate a corresponding output text. With the text generation method provided by the invention, writing can be targeted at a specific theme without large-scale retraining of the pre-trained model; only individual parameters of the model need to be dynamically adjusted, which rapidly improves the accuracy of the output text.

Description

Text generation method and equipment
Technical Field
One or more embodiments of the present disclosure relate to the field of natural language processing technologies, and in particular, to a text generation method and apparatus.
Background
Natural language processing, and in particular natural language generation, has long been considered one of the most challenging tasks. Natural language generation studies how to give computers human-like expression and writing capabilities: given some key information and its internal representation, a computer should, through a planning process, automatically generate high-quality natural language text.
Natural language generation is a branch of artificial intelligence and computational linguistics. A language generation system is a computer model of language information processing whose workflow is the reverse of natural language analysis: starting from an abstract conceptual level, it generates text by selecting and applying certain semantic and grammatical rules.
Computer text generation is applied ever more widely. Brand merchants in particular need to generate copy matching their brands and products in a timely manner, so that it can be distributed online and reach users promptly. Current computer text generation has the following problems:
1. the application range is narrow: existing methods are mainly applied to fields such as customer-service robots and question-answering systems and are difficult to extend to other domains;
2. only a small amount of content related to the input text can be generated, and the written content cannot be controlled from the perspective of a writing theme;
3. when a long text must be output, a reasonable mechanism for keeping the content related to the theme is lacking, so a large number of weakly related words are output;
4. writing cannot be targeted at a particular subject, such as a specific brand or product.
Disclosure of Invention
In view of the above technical background, the present invention discloses a text generation method for writing text targeted at a specific theme, which can output a large number of manuscripts related to a preset theme in a short time, thereby improving work efficiency.
The invention provides a text generation method, which comprises the following steps:
s1, acquiring an input text;
s2, performing parameter optimization training on a pre-training model according to the input text to obtain a text generation model;
and S3, applying the text generation model to the input text to generate a corresponding output text.
As a further improvement of the present invention, step S3 specifically includes the following steps:
S31, applying the text generation model to the input text to generate a corresponding initial text;
S32, calculating the theme generation probability of the initial text;
S33, judging whether the theme generation probability is greater than a set threshold;
S34, if the theme generation probability is greater than the set threshold, directly outputting the initial text; if the theme generation probability is smaller than the set threshold, updating the text generation model according to the theme generation probability to generate a new initial text, and adjusting the theme generation probability until it meets the threshold requirement, then outputting the initial text;
S35, judging the text length of the initial text;
S36, if the initial text length reaches a preset text length, taking the initial text as the output text; if the initial text length does not reach the preset text length, repeating steps S31 to S36 in a loop to continually obtain new initial text until the accumulated length reaches the preset text length, thereby obtaining the output text.
As a further improvement of the invention, the pre-training model in step S2 is a GPT-2 model.
As a further improvement of the present invention, the theme generation probability in step S32 is the probability that the initial text belongs to a preset theme, and it is calculated from the initial text by a bag-of-words model.
As a further improvement of the present invention, updating the text generation model according to the theme generation probability in step S34 specifically includes the following steps:
S341, calculating the update gradient of the initial text according to the theme generation probability;
and S342, updating the internal parameters of the text generation model according to the update gradient.
Based on the same inventive concept, the present application further discloses a text generation device based on the method disclosed in any of the above embodiments.
the text generation device includes:
the text acquisition module is used for acquiring an input text;
the model generation module is used for performing parameter optimization training on a pre-training model according to the input text to obtain a text generation model;
and the text output module is used for applying the text generation model to the input text to generate a corresponding output text.
As a further improvement of the present invention, the text output module includes:
the initial text generation module is used for applying the text generation model to the input text to generate a corresponding initial text;
the probability calculation module is used for calculating the theme generation probability of the initial text;
the probability judging module is used for judging whether the theme generation probability is greater than a set threshold;
the model updating module is used for directly outputting the initial text if the theme generation probability is greater than the set threshold, and, if the theme generation probability is smaller than the set threshold, for updating the text generation model according to the theme generation probability to generate a new initial text and adjusting the theme generation probability until it meets the threshold requirement;
the text length judging module is used for judging the text length of the initial text;
and the text length adjusting module is used for taking the initial text as the output text if the initial text length reaches a preset text length, and, if the initial text length does not reach the preset text length, for repeating the operations of the above modules in a loop to continually obtain new initial text until the accumulated length reaches the preset text length, thereby obtaining the output text.
As a further improvement of the invention, the pre-training model in the model generation module is a GPT-2 model.
As a further improvement of the present invention, the theme generation probability in the probability calculation module is the probability that the initial text belongs to a preset theme, and it is calculated from the initial text by a bag-of-words model.
As a further improvement of the present invention, the model updating module comprises:
the update gradient calculation module is used for calculating the update gradient of the initial text according to the theme generation probability;
and the model parameter updating module is used for updating the internal parameters of the text generation model according to the update gradient.
Compared with the prior art, the invention has the following beneficial effects:
1. a text generation method is provided that can write specifically for a given theme, improving writing efficiency and reducing manpower input;
2. the pre-trained model does not need large-scale retraining; only individual parameters of the model are dynamically adjusted, rapidly improving the accuracy of the output text;
3. manuscripts conforming to a preset theme are output in a short time, improving the work efficiency of copywriters and saving labor;
4. features that copywriters cannot analyze can be learned and captured from massive texts for writing, ensuring that the output text is related to the preset theme.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the application.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present invention and should therefore not be considered limiting of its scope; those skilled in the art can derive other related drawings from them without inventive effort. In the drawings:
fig. 1 is an overall flowchart of a text generation method according to an embodiment of the present invention;
FIG. 2 is an overall flowchart of the step S3 disclosed in FIG. 1;
FIG. 3 is a flowchart of the updating of the text generation model in step S34 disclosed in FIG. 2;
FIG. 4 is a structural framework diagram of a text generating device according to an embodiment of the present invention;
fig. 5 is a block diagram of a computer device according to an embodiment of the present invention.
In the above figures:
100. a text acquisition module; 200. a model generation module; 300. a text output module; 301. an initial text generation module; 302. a probability calculation module; 303. a probability judgment module; 304. a model update module; 3041. updating the gradient calculation module; 3042. a model parameter updating module; 305. a text length judgment module; 306. a text length adjusting module; 80. a bus; 81. a processor; 82. a memory; 83. a communication interface.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application.
It is obvious that the drawings in the following description are only examples or embodiments of the present application, and that it is also possible for a person skilled in the art to apply the present application to other similar contexts on the basis of these drawings without inventive effort. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms used herein shall have the ordinary meaning understood by those of ordinary skill in the art to which this application belongs. The words "a", "an", "the", and similar words in this application do not denote a limitation of quantity and may refer to the singular or the plural. The terms "including", "comprising", "having", and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that comprises a list of steps or modules (elements) is not limited to the listed steps or elements but may include other steps or elements not expressly listed or inherent to such process, method, product, or device. Words such as "connected" and "coupled" are not limited to physical or mechanical connections and may include electrical connections, whether direct or indirect. The term "plurality" means two or more. "And/or" describes an association between objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A alone, A and B together, or B alone. The character "/" generally indicates an "or" relationship between the associated objects. The terms "first", "second", "third", and the like merely distinguish similar objects and do not denote a particular ordering.
The present invention is described in detail with reference to the embodiments shown in the drawings, but it should be understood that these embodiments are not intended to limit the present invention, and those skilled in the art should understand that functional, methodological, or structural equivalents or substitutions made by these embodiments are within the scope of the present invention.
Before describing in detail the various embodiments of the present invention, the core inventive concepts of the present invention are summarized and described in detail by the following several embodiments.
The invention discloses a text generation method for writing text targeted at a specific theme. The method does not require large-scale retraining of a pre-trained model; only individual parameters need to be dynamically adjusted, which rapidly improves the accuracy of the output text. It outputs manuscripts conforming to a preset theme in a short time, improving the work efficiency of copywriters and saving labor, and it can learn and capture, from massive texts, features that copywriters cannot analyze, ensuring that the output text is related to the preset theme.
Embodiment one:
referring to fig. 1, the present example discloses a specific implementation of a text generation method (hereinafter referred to as "method").
Specifically, referring to fig. 1, the method disclosed in this embodiment mainly includes the following steps:
First, step S1 is executed to acquire the input text.
Specifically, in step S1, the input text is corpus text from a specific industry: text is collected according to the requirements of different industries, and the collected industry corpus is used to prepare the training of the text generation model. For example, to write text on a specific theme in the cosmetics industry, a corpus of the cosmetics industry is selected and prepared.
Then, step S2 is executed to perform parameter optimization training on a pre-training model according to the input text to obtain a text generation model.
In this embodiment, the pre-training model is the GPT-2 model. GPT-2 is a large-scale unsupervised Natural Language Processing (NLP) model, dubbed "the strongest general-purpose NLP model in history". It can generate coherent paragraphs of text, refreshed the benchmark results on 7 major datasets, and can complete many different language tasks such as reading comprehension, question answering, and machine translation without task-specific training. GPT-2 is a very large Transformer-based model trained on a massive dataset. A minimal fine-tuning sketch is given below.
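For illustration only, the following is a minimal sketch of steps S1 and S2 using the open-source Hugging Face transformers library; the patent itself does not prescribe a specific library, and names such as domain_corpus.txt are illustrative placeholders.

```python
# Sketch of S1 (acquire industry corpus) and S2 (parameter-optimization
# training of the pre-trained GPT-2 model). Assumes `torch` and
# `transformers` are installed; "domain_corpus.txt" is a placeholder.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# S1: acquire the input text -- a corpus collected for the target industry.
with open("domain_corpus.txt", encoding="utf-8") as f:
    lines = [line.strip() for line in f if line.strip()]

# S2: lightweight fine-tuning using a causal language-modeling loss.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for line in lines:
    enc = tokenizer(line, truncation=True, max_length=128, return_tensors="pt")
    loss = model(input_ids=enc["input_ids"], labels=enc["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```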
Finally, referring to fig. 2, step S3 is executed to apply the text generation model to generate an output text corresponding to the input text according to the input text.
Specifically, step S3 includes the following steps:
S31, applying the text generation model to the input text to generate a corresponding initial text;
S32, calculating the theme generation probability of the initial text;
S33, judging whether the theme generation probability is greater than a set threshold;
S34, if the theme generation probability is greater than the set threshold, directly outputting the initial text; if the theme generation probability is smaller than the set threshold, updating the text generation model according to the theme generation probability to generate a new initial text, and adjusting the theme generation probability until it meets the threshold requirement, then outputting the initial text;
S35, judging the text length of the initial text;
S36, if the initial text length reaches a preset text length, taking the initial text as the output text; if the initial text length does not reach the preset text length, repeating steps S31 to S36 in a loop to continually obtain new initial text until the accumulated length reaches the preset text length, thereby obtaining the output text. A sketch of this control loop is given below.
Specifically, the theme generation probability is calculated with a bag-of-words model. The theme generation probability is the probability that the current initial text belongs to the preset theme, from which it can be seen directly whether the text output meets the requirements. For example, for the generated text "this facial cleanser is good", the probability that the content belongs to the theme "facial cleanser" is calculated. The bag-of-words model is a simplified representation used in natural language processing and information retrieval: under a bag-of-words model, text such as a sentence or document is represented as the bag of its words, disregarding grammar and word order. Bag-of-words models are widely used in document classification and computer vision, where the frequency of word occurrence can serve as a feature for training a classifier. A minimal sketch of such a theme classifier follows.
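The following is a minimal sketch of such a bag-of-words theme classifier, assuming a scikit-learn environment; the training sentences and labels are illustrative placeholders, not data from the patent.

```python
# Bag-of-words theme probability: word order is discarded and word
# frequencies serve as classifier features, as described above.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

docs = ["this facial cleanser is good",
        "deep cleansing foam for oily skin",
        "new phone with a large screen",
        "fast laptop for office work"]
labels = [1, 1, 0, 0]   # 1 = preset theme "facial cleanser", 0 = other

vectorizer = CountVectorizer()
clf = MultinomialNB().fit(vectorizer.fit_transform(docs), labels)

def topic_probability(text: str) -> float:
    """Probability that `text` belongs to the preset theme."""
    return clf.predict_proba(vectorizer.transform([text]))[0, 1]

print(topic_probability("this facial cleanser is good"))  # close to 1.0
```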
In some embodiments, referring to fig. 3, updating the text generation model according to the theme generation probability in step S34 includes the following steps:
S341, calculating the update gradient of the initial text according to the theme generation probability;
and S342, updating the internal parameters of the text generation model according to the update gradient.
Specifically, the update gradient of the initial text is calculated as:

$$\Delta H_t' = \Delta H_t + \alpha \,\frac{\nabla_{\Delta H_t} \log p(a \mid H_t + \Delta H_t)}{\bigl\| \nabla_{\Delta H_t} \log p(a \mid H_t + \Delta H_t) \bigr\|^{\gamma}}$$

where ΔH_t' denotes the update gradient, H_t denotes the already-generated text content, ∇ denotes the gradient operator, ΔH_t denotes the gradient over the generated content, α denotes the parameter-adjustment strength, γ denotes the normalization strength, a denotes the preset theme, ‖·‖ denotes the norm, and p denotes the theme generation probability of the initial text.
Specifically, the internal historical parameters of the text generation model are updated according to the update gradient, and center sampling is then performed on the updated parameters to generate a new initial text. A minimal code sketch of this update follows.
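The following is a minimal sketch of steps S341-S342 implementing the formula above, in the spirit of plug-and-play controlled generation; topic_log_prob is an assumed differentiable scorer returning log p(a | H_t + ΔH_t) as a scalar, and the values of α and γ are illustrative.

```python
# One update step: ΔH_t' = ΔH_t + α · ∇ log p / ‖∇ log p‖^γ.
import torch

def update_hidden(delta_h, h_t, topic_log_prob, alpha=0.02, gamma=1.5):
    delta_h = delta_h.detach().requires_grad_(True)
    log_p = topic_log_prob(h_t + delta_h)   # scalar log p(a | H_t + ΔH_t)
    log_p.backward()                        # ∇_{ΔH_t} log p
    grad = delta_h.grad
    norm = grad.norm() ** gamma + 1e-10     # ‖∇ log p‖^γ, eps for stability
    return (delta_h + alpha * grad / norm).detach()
```

After such an update, sampling from the model with the adjusted internal parameters yields a new initial text whose theme generation probability is re-evaluated against the threshold.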
With the text generation method disclosed in this embodiment, writing is targeted at a specific theme without large-scale retraining of the pre-trained model; only individual parameters of the model are dynamically adjusted, rapidly improving the accuracy of the output text. Manuscripts conforming to the preset theme are output in a short time, which improves the work efficiency of copywriters and saves labor, and features that copywriters cannot analyze can be learned and captured from massive texts for writing, ensuring that the output text is related to the preset theme.
Embodiment two:
in conjunction with a text generation method disclosed in the first embodiment, this embodiment discloses a specific implementation example of a text generation device (hereinafter referred to as "device").
Referring to fig. 4, the apparatus includes:
a text acquisition module 100, configured to acquire an input text;
the model generation module 200 is configured to perform parameter optimization training on a pre-training model according to the input text to obtain a text generation model;
a text output module 300, configured to apply the text generation model to the input text to generate a corresponding output text.
In some embodiments, the input text is corpus text from a specific industry: text is collected according to the requirements of different industries, and the collected industry corpus is used to prepare the training of the text generation model. For example, to write text on a specific theme in the cosmetics industry, a corpus of the cosmetics industry is selected and prepared.
In some of these embodiments, the pre-training model is the GPT-2 model. GPT-2 is a large-scale unsupervised Natural Language Processing (NLP) model, dubbed "the strongest general-purpose NLP model in history". It can generate coherent paragraphs of text, refreshed the benchmark results on 7 major datasets, and can complete many different language tasks such as reading comprehension, question answering, and machine translation without task-specific training. GPT-2 is a very large Transformer-based model trained on a massive dataset.
In some of these embodiments, text output module 300 includes:
an initial text generation module 301, configured to apply the text generation model to the input text to generate a corresponding initial text;
a probability calculation module 302, configured to calculate the theme generation probability of the initial text;
a probability judging module 303, configured to judge whether the theme generation probability is greater than a set threshold;
a model updating module 304, configured to directly output the initial text if the theme generation probability is greater than the set threshold, and, if the theme generation probability is smaller than the set threshold, to update the text generation model according to the theme generation probability to generate a new initial text and adjust the theme generation probability until it meets the threshold requirement;
a text length judging module 305, configured to judge the text length of the initial text;
a text length adjusting module 306, configured to take the initial text as the output text if the initial text length reaches a preset text length, and, if the initial text length does not reach the preset text length, to repeat the operations of the above modules in a loop, continually obtaining new initial text until the accumulated length reaches the preset text length, thereby obtaining the output text.
Specifically, the model update module 304 includes:
an update gradient calculation module 3041, configured to calculate the update gradient of the initial text according to the theme generation probability;
a model parameter updating module 3042, configured to update the internal parameters of the text generation model according to the update gradient.
In some of these embodiments, the theme generation probability is calculated with a bag-of-words model. The theme generation probability is the probability that the currently output initial text belongs to the preset theme, from which it can be seen directly whether the text output meets the requirements. The bag-of-words model is a simplified representation used in natural language processing and information retrieval: under a bag-of-words model, text such as a sentence or document is represented as the bag of its words, disregarding grammar and word order. Bag-of-words models are widely used in document classification and computer vision, where the frequency of word occurrence can serve as a feature for training a classifier.
For the remaining technical details of the text generation device, please refer to the description of the text generation method in the first embodiment; they are not repeated here.
Embodiment three:
referring to FIG. 5, the embodiment discloses an embodiment of a computer device. The computer device may comprise a processor 81 and a memory 82 in which computer program instructions are stored.
Specifically, the processor 81 may include a Central Processing Unit (CPU) or an Application-Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.
The memory 82 may include mass storage for data or instructions. By way of example, and not limitation, the memory 82 may include a Hard Disk Drive (HDD), a floppy disk drive, a Solid State Drive (SSD), flash memory, an optical disk, a magneto-optical disk, magnetic tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these.
The memory 82 may include removable or non-removable (or fixed) media, where appropriate. The memory 82 may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory 82 is non-volatile memory. In particular embodiments, the memory 82 includes Read-Only Memory (ROM) and Random-Access Memory (RAM). The ROM may be mask-programmed ROM, Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), Electrically Alterable ROM (EAROM), or flash memory, or a combination of two or more of these, where appropriate. The RAM may be Static Random-Access Memory (SRAM) or Dynamic Random-Access Memory (DRAM), where the DRAM may be Fast Page Mode DRAM (FPM DRAM), Extended Data Out DRAM (EDO DRAM), Synchronous DRAM (SDRAM), and the like.
The memory 82 may be used to store or cache various data files for processing and/or communication use, as well as possible computer program instructions executed by the processor 81.
The processor 81 implements any of the text generation methods in the above embodiments by reading and executing computer program instructions stored in the memory 82.
In some of these embodiments, the computer device may also include a communication interface 83 and a bus 80. As shown in fig. 5, the processor 81, the memory 82, and the communication interface 83 are connected via the bus 80 to complete communication therebetween.
The communication interface 83 is used for implementing communication between the modules, devices, units, and/or equipment in the embodiments of the present application. The communication interface 83 may also implement data communication with other components such as external devices, image/data acquisition equipment, databases, external storage, and image/data processing workstations.
The bus 80 includes hardware, software, or both, coupling the components of the computer device to each other. The bus 80 includes, but is not limited to, at least one of the following: a Data Bus, an Address Bus, a Control Bus, an Expansion Bus, and a Local Bus. By way of example, and not limitation, the bus 80 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front-Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), another suitable bus, or a combination of two or more of these. The bus 80 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
The computer device may perform text generation for a particular topic, thereby implementing the text generation method described in connection with fig. 1.
In addition, in combination with the text generation method in the above embodiments, the embodiments of the present application provide a computer-readable storage medium having computer program instructions stored thereon; when executed by a processor, the computer program instructions implement any of the text generation methods in the above embodiments.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these features are described, but any combination should be considered within the scope of this specification as long as it contains no contradiction.
In summary, the invention provides a text generation method that can write specifically for a given theme, improving writing efficiency and reducing manpower input. The pre-trained model needs no large-scale retraining; only individual parameters are dynamically adjusted, rapidly improving the accuracy of the output text. Manuscripts conforming to a preset theme are output in a short time, improving the work efficiency of copywriters and saving labor, and features that copywriters cannot analyze can be learned and captured from massive texts for writing, ensuring that the output text is related to the preset theme.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A method of text generation, the method comprising:
s1, acquiring an input text;
s2, performing parameter optimization training on a pre-training model according to the input text to obtain a text generation model;
and S3, applying the text generation model to the input text to generate a corresponding output text.
2. The text generation method according to claim 1, wherein step S3 specifically includes the following steps:
S31, applying the text generation model to the input text to generate a corresponding initial text;
S32, calculating the theme generation probability of the initial text;
S33, judging whether the theme generation probability is greater than a set threshold;
S34, if the theme generation probability is greater than the set threshold, directly outputting the initial text; if the theme generation probability is smaller than the set threshold, updating the text generation model according to the theme generation probability to generate a new initial text, and adjusting the theme generation probability until it meets the threshold requirement, then outputting the initial text;
S35, judging the text length of the initial text;
S36, if the initial text length reaches a preset text length, taking the initial text as the output text; if the initial text length does not reach the preset text length, repeating steps S31 to S36 in a loop to continually obtain new initial text until the accumulated length reaches the preset text length, thereby obtaining the output text.
3. The text generation method of claim 1, wherein the pre-trained model in step S2 is a GPT-2 model.
4. The text generation method according to claim 2, wherein the theme generation probability in step S32 is the probability that the initial text belongs to a preset theme, and the theme generation probability is calculated from the initial text by a bag-of-words model.
5. The text generation method according to claim 2, wherein updating the text generation model according to the theme generation probability in step S34 specifically includes the following steps:
S341, calculating the update gradient of the initial text according to the theme generation probability;
and S342, updating the internal parameters of the text generation model according to the update gradient.
6. A text generation device that operates the text generation method according to any one of claims 1 to 5, characterized by comprising:
the text acquisition module is used for acquiring an input text;
the model generation module is used for performing parameter optimization training on a pre-training model according to the input text to obtain a text generation model;
and the text output module is used for applying the text generation model to the input text to generate a corresponding output text.
7. The text generation apparatus of claim 6, wherein the text output module comprises:
the initial text generation module is used for applying the text generation model to the input text to generate a corresponding initial text;
the probability calculation module is used for calculating the theme generation probability of the initial text;
the probability judging module is used for judging whether the theme generation probability is greater than a set threshold;
the model updating module is used for directly outputting the initial text if the theme generation probability is greater than the set threshold, and, if the theme generation probability is smaller than the set threshold, for updating the text generation model according to the theme generation probability to generate a new initial text and adjusting the theme generation probability until it meets the threshold requirement;
the text length judging module is used for judging the text length of the initial text;
and the text length adjusting module is used for taking the initial text as the output text if the initial text length reaches a preset text length, and, if the initial text length does not reach the preset text length, for repeating the operations of the above modules in a loop to continually obtain new initial text until the accumulated length reaches the preset text length, thereby obtaining the output text.
8. The text generation device of claim 6, wherein the pre-trained model in the model generation module is a GPT-2 model.
9. The text generation device according to claim 7, wherein the theme generation probability in the probability calculation module is the probability that the initial text belongs to a preset theme, and the theme generation probability is calculated from the initial text by a bag-of-words model.
10. The text generation device of claim 7, wherein the model update module comprises:
the update gradient calculation module is used for calculating the update gradient of the initial text according to the theme generation probability;
and the model parameter updating module is used for updating the internal parameters of the text generation model according to the update gradient.
CN202011156277.5A (filed 2020-10-26, priority 2020-10-26) — Text generation method and equipment — Pending — published as CN112287667A

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011156277.5A CN112287667A (en) 2020-10-26 2020-10-26 Text generation method and equipment


Publications (1)

Publication Number Publication Date
CN112287667A 2021-01-29

Family

ID=74373388

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011156277.5A Pending CN112287667A (en) 2020-10-26 2020-10-26 Text generation method and equipment

Country Status (1)

Country Link
CN (1) CN112287667A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101587493A (en) * 2009-06-29 2009-11-25 中国科学技术大学 Text classification method
CN111460833A (en) * 2020-04-01 2020-07-28 合肥讯飞数码科技有限公司 Text generation method, device and equipment
CN111523304A (en) * 2020-04-27 2020-08-11 华东师范大学 Automatic generation method of product description text based on pre-training model

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113420129A (en) * 2021-05-08 2021-09-21 天津大学 Method for controlling dialog generation based on large-scale general pre-training model
CN113420129B (en) * 2021-05-08 2022-11-18 天津大学 Method for controlling dialog generation based on large-scale general pre-training model
CN113792543A (en) * 2021-09-14 2021-12-14 安徽咪鼠科技有限公司 Writing method, writing device and storage medium

Similar Documents

Publication Publication Date Title
US11004448B2 (en) Method and device for recognizing text segmentation position
CN109947931B (en) Method, system, device and medium for automatically abstracting text based on unsupervised learning
CN111738251B (en) Optical character recognition method and device fused with language model and electronic equipment
CN105069143B (en) Extract the method and device of keyword in document
CN109710916B (en) Label extraction method and device, electronic equipment and storage medium
CN108052505A (en) Text emotion analysis method and device, storage medium, terminal
CN111858878B (en) Method, system and storage medium for automatically extracting answer from natural language text
CN113722438B (en) Sentence vector generation method and device based on sentence vector model and computer equipment
WO2023137911A1 (en) Intention classification method and apparatus based on small-sample corpus, and computer device
CN110968725B (en) Image content description information generation method, electronic device and storage medium
US20220318230A1 (en) Text to question-answer model system
CN110851594A (en) Text classification method and device based on multi-channel deep learning model
CN110851601A (en) Cross-domain emotion classification system and method based on layered attention mechanism
CN108846120A (en) Method, system and storage medium for classifying to text set
CN111368037A (en) Text similarity calculation method and device based on Bert model
CN112287667A (en) Text generation method and equipment
CN114265937A (en) Intelligent classification analysis method and system of scientific and technological information, storage medium and server
CN114579746A (en) Optimized high-precision text classification method and device
CN115422324A (en) Text processing method and equipment
CN112232070A (en) Natural language processing model construction method, system, electronic device and storage medium
CN110309513B (en) Text dependency analysis method and device
CN115631261A (en) Training method of image generation model, image generation method and device
D’silva et al. Automatic text summarization of konkani texts using pre-trained word embeddings and deep learning
Tsuneda et al. Kdelab at ImageCLEF 2021: Medical Caption Prediction with Effective Data Pre-processing and Deep Learning.
CN112307738B (en) Method and device for processing text

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination