CN116860930A - Dialog generation method and device - Google Patents

Dialog generation method and device

Info

Publication number
CN116860930A
CN116860930A CN202310772066.1A
Authority
CN
China
Prior art keywords
dialogue
knowledge
prompt
information
problem information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310772066.1A
Other languages
Chinese (zh)
Inventor
周婉月
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CCB Finetech Co Ltd
Original Assignee
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CCB Finetech Co Ltd filed Critical CCB Finetech Co Ltd
Priority to CN202310772066.1A priority Critical patent/CN116860930A/en
Publication of CN116860930A publication Critical patent/CN116860930A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a dialogue generation method and device, wherein the method comprises the following steps: receiving question information of a first dialogue input by a user; retrieving a document knowledge base according to the question information of the first dialogue to obtain a plurality of first knowledge documents; retrieving a first seed database according to the question information of the first dialogue to obtain a plurality of first prompt samples; inputting the question information of the first dialogue, the first knowledge documents and the plurality of first prompt samples into a large-scale pre-trained language model, and outputting first knowledge content; retrieving a second seed database according to the question information of the first dialogue to obtain a plurality of second prompt samples; and inputting the question information of the first dialogue, the first knowledge content and the plurality of second prompt samples into the large-scale pre-trained language model, and outputting the reply information of the first dialogue. The invention can generate high-quality dialogue replies suitable for various dialogue scenarios, improve user experience, reduce sample construction effort and lower cost.

Description

Dialog generation method and device
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a dialog generation method and device.
Background
This section is intended to provide a background or context to the embodiments of the invention that are recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
With the rapid development of deep learning technology, dialogue systems are used more and more frequently in daily life and receive increasing attention from researchers, vendors and users. However, in many real-world scenarios the replies generated by a dialogue system still have many defects: it is easy to produce generic, meaningless or factually wrong reply sentences, such as "good" or "understood", which fail to effectively identify the user's intention and accurately answer the questions posed by the user, harming the user experience.
In addition, in the prior art, a large amount of training data is still required to train a dialogue system, and the quality of the result depends heavily on the diversity and quality of that training data. The trained dialogue system is not only complex and costly to operate, but also lacks general natural language understanding capability and cannot be effectively generalized to dialogue scenarios in other knowledge domains.
Disclosure of Invention
The embodiment of the invention provides a dialogue generation method for generating high-quality dialogue replies applicable to various dialogue scenarios, improving user experience, reducing sample construction effort and lowering cost, the method comprising:
Receiving question information of a first dialogue input by a user;
retrieving a document knowledge base according to the question information of the first dialogue to obtain a plurality of first knowledge documents;
retrieving a first seed database according to the question information of the first dialogue to obtain a plurality of first prompt samples, wherein each first prompt sample comprises question information of a dialogue together with the knowledge document and knowledge content corresponding to that question information;
inputting the question information of the first dialogue, the first knowledge documents and the plurality of first prompt samples into a large-scale pre-trained language model, and outputting first knowledge content, wherein the large-scale pre-trained language model is constructed based on an imitation learning strategy;
retrieving a second seed database according to the question information of the first dialogue to obtain a plurality of second prompt samples, wherein each second prompt sample comprises question information of a dialogue together with the knowledge content and reply information corresponding to that question information;
and inputting the question information of the first dialogue, the first knowledge content and the plurality of second prompt samples into the large-scale pre-trained language model, and outputting the reply information of the first dialogue.
The embodiment of the invention also provides a dialogue generation device for generating high-quality dialogue replies suitable for various dialogue scenarios, improving user experience, reducing sample construction effort and lowering cost, the device comprising:
A question information receiving module, used for receiving question information of a first dialogue input by a user;
a document knowledge base retrieval module, used for retrieving a document knowledge base according to the question information of the first dialogue to obtain a plurality of first knowledge documents;
a first seed database retrieval module, used for retrieving the first seed database according to the question information of the first dialogue to obtain a plurality of first prompt samples, wherein each first prompt sample comprises question information of a dialogue together with the knowledge document and knowledge content corresponding to that question information;
a first knowledge content output module, used for inputting the question information of the first dialogue, the first knowledge documents and the plurality of first prompt samples into the large-scale pre-trained language model and outputting the first knowledge content, wherein the large-scale pre-trained language model is constructed based on the imitation learning strategy;
a second seed database retrieval module, used for retrieving a second seed database according to the question information of the first dialogue to obtain a plurality of second prompt samples, wherein each second prompt sample comprises question information of a dialogue together with the knowledge content and reply information corresponding to that question information;
and a reply information output module, used for inputting the question information of the first dialogue, the first knowledge content and the plurality of second prompt samples into the large-scale pre-trained language model and outputting the reply information of the first dialogue.
The embodiment of the invention also provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the above dialogue generation method when executing the computer program.
The embodiment of the invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above dialogue generation method.
The embodiment of the invention also provides a computer program product comprising a computer program which, when executed by a processor, implements the above dialogue generation method.
In the embodiment of the invention, question information of a first dialogue input by a user is received; a document knowledge base is retrieved according to the question information of the first dialogue to obtain a plurality of first knowledge documents; a first seed database is retrieved according to the question information of the first dialogue to obtain a plurality of first prompt samples, each comprising question information of a dialogue together with the knowledge document and knowledge content corresponding to that question information; the question information of the first dialogue, the first knowledge documents and the plurality of first prompt samples are input into a large-scale pre-trained language model constructed based on an imitation learning strategy, and first knowledge content is output; a second seed database is retrieved according to the question information of the first dialogue to obtain a plurality of second prompt samples, each comprising question information of a dialogue together with the knowledge content and reply information corresponding to that question information; and the question information of the first dialogue, the first knowledge content and the plurality of second prompt samples are input into the large-scale pre-trained language model, and the reply information of the first dialogue is output.
By introducing knowledge, the knowledge selection process found in related work on knowledge-driven dialogue systems is converted into a knowledge generation task based on an imitation learning strategy. The input prompt of the pre-trained language model is constructed from prompt samples, retrieved from the seed database, that are similar to the dialogue question input by the user; the model is thereby prompted to reason over the knowledge document and generate specific knowledge content, which can improve the accuracy of knowledge selection in complex scenarios. In the reply generation stage of the dialogue, the imitation learning strategy is likewise adopted, so that with only a small amount of constructed sample data, the large-scale language model can generate high-quality replies containing correct knowledge from the user input and the knowledge content obtained in the knowledge generation stage.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the invention, and that other drawings may be obtained from them by a person skilled in the art without inventive effort. In the drawings:
FIG. 1 is a process flow diagram of a dialog generation method in an embodiment of the present invention;
FIG. 2 is a flowchart of a method for retrieving a first seed database to obtain a plurality of first prompt samples according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for retrieving a second seed database to obtain a plurality of second prompt samples according to an embodiment of the present invention;
FIG. 4 is a schematic diagram showing a specific example of a knowledge content generation stage in an embodiment of the invention;
FIG. 5 is a schematic diagram showing a specific example of a reply generation stage according to an embodiment of the present invention;
FIG. 6 is a schematic overall flow chart of a dialogue generating method according to an embodiment of the invention;
FIG. 7 is a schematic diagram of a dialogue generating device according to an embodiment of the present invention;
fig. 8 is a schematic diagram of a computer device according to an embodiment of the invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings. The exemplary embodiments of the present invention and their descriptions herein are for the purpose of explaining the present invention, but are not to be construed as limiting the invention.
First, technical terms in the embodiment of the present invention will be described:
Prefix prompt: also called a prefix template in the industry, a character-string input prefix used to elicit output from a generative pre-trained language model. Prefix prompts can instruct the model to accomplish a specific task by adding human-readable natural-language instructions to the input; they are collectively referred to herein as prompts.
Imitation learning: also called demonstration learning in the industry, adding a small number of similar samples as task cues for a large-scale language model. For example, given the input prompt "I like this director's movies. The sentiment of this sentence is: <language model output>" in a sentiment analysis task, example texts from similar scenarios can be stitched before this prompt, such as "This product is very convenient to use. The sentiment of this sentence is <positive>." and "I regret this purchase. The sentiment of this sentence is <negative>." With these two added sample sentences, the large-scale language model can predict the result "<positive>" by "imitating" them.
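The demonstration-learning prompt described above can be sketched in a few lines. The helper name, example sentences and exact prompt wording below are illustrative assumptions, not part of the patent's disclosure:

```python
# Minimal sketch of demonstration ("imitation") learning prompt construction.
# Example sentences and prompt wording are illustrative assumptions.

def build_demonstration_prompt(examples, query):
    """Prepend labeled example sentences so the model can 'imitate' them."""
    lines = [f"{text} The sentiment of this sentence is <{label}>."
             for text, label in examples]
    # The query line is left open so the model completes the "<...>" slot.
    lines.append(f"{query} The sentiment of this sentence is <")
    return "\n".join(lines)

examples = [
    ("This product is very convenient to use.", "positive"),
    ("I regret this purchase.", "negative"),
]
prompt = build_demonstration_prompt(examples, "I like this director's movies.")
print(prompt)
```

Given such a prompt, a generative model is expected to continue with "positive>", imitating the two preceding examples.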
Seed database: a small sample library used in the invention to construct imitation-learning examples, mainly comprising a knowledge generation seed database and a reply generation seed database in the task domain.
The inventors have found that, for a dialogue system to generate more satisfactory replies, one important approach is to introduce knowledge so that a reply fitting the dialogue context and rich in knowledge can be produced. In particular, for human-machine interaction systems in archival settings, the dialogue system needs to answer users' questions based on a large knowledge base stored in archive or document form, which often requires incorporating appropriate knowledge from the documents. For the knowledge-driven dialogue generation task, current research mainly preprocesses document content into a set of knowledge items, annotates the corresponding correct knowledge item in dialogue data, and uses this as training data; the training process models the knowledge selection and knowledge fusion capabilities of the dialogue system.
However, the knowledge-driven dialogue systems described above require preparing a high-quality set of knowledge items in advance as the external knowledge base of the dialogue generation task, and annotating a large amount of dialogue data in the relevant knowledge domain. Processing document information into a high-quality knowledge item set is tedious work; the quality of the trained system depends heavily on the diversity of the external knowledge base and the quality of the training data; and the system lacks general natural language understanding capability and cannot be effectively generalized to dialogue scenarios in other knowledge domains. In addition, in the knowledge selection stage, existing knowledge dialogue systems mainly retrieve knowledge items related to the user input from the preprocessed knowledge item set; this approach performs poorly when the user input involves complex logic or knowledge reasoning, and often cannot locate the correct knowledge content. Based on this, the inventors propose a dialogue generation method to solve the foregoing technical problems.
Fig. 1 is a process flow diagram of a dialog generation method in an embodiment of the invention. As shown in fig. 1, the dialog generating method in the embodiment of the present invention may include:
Step 101, receiving question information of a first dialogue input by a user;
step 102, retrieving a document knowledge base according to the question information of the first dialogue to obtain a plurality of first knowledge documents;
step 103, retrieving a first seed database according to the question information of the first dialogue to obtain a plurality of first prompt samples, wherein each first prompt sample comprises question information of a dialogue together with the knowledge document and knowledge content corresponding to that question information;
step 104, inputting the question information of the first dialogue, the first knowledge documents and the plurality of first prompt samples into a large-scale pre-trained language model, and outputting first knowledge content, wherein the large-scale pre-trained language model is constructed based on an imitation learning strategy;
step 105, retrieving a second seed database according to the question information of the first dialogue to obtain a plurality of second prompt samples, wherein each second prompt sample comprises question information of a dialogue together with the knowledge content and reply information corresponding to that question information;
and step 106, inputting the question information of the first dialogue, the first knowledge content and the plurality of second prompt samples into the large-scale pre-trained language model, and outputting the reply information of the first dialogue.
The following describes specific implementation steps of the dialog generating method in the embodiment of the present invention:
In practice, the question information of a first dialogue input by a user may first be received, i.e. the user may trigger the start of a dialogue by asking a question.
Then, in step 102, the document knowledge base may be searched according to the question information of the first dialogue to obtain a plurality of first knowledge documents, where the degree of correlation between each first knowledge document and the question information of the first dialogue is not less than a first preset threshold.
In one embodiment, before receiving the question information of the first dialogue input by the user, the method may further include: pre-establishing a document knowledge base comprising a large number of knowledge documents covering various types of knowledge.
In specific implementations, the first preset threshold can be adjusted according to actual conditions. For example, if few knowledge documents are related to the question information of the first dialogue, the first preset threshold may be set to 0 to ensure that the introduced knowledge is rich, i.e. all retrieved knowledge documents are taken as first knowledge documents. If many knowledge documents are related, the first preset threshold may be set to a higher value to avoid repeatedly introducing the same knowledge and burdening the subsequent model processing, thereby appropriately reducing the number of first knowledge documents.
Alternatively, the knowledge document with the highest correlation to the question information of the first dialogue can be retrieved directly as the first knowledge document, without referring to the first preset threshold. For example, assuming the question information of the first dialogue input by the user is q, the external document knowledge base is first pre-searched, and the knowledge document m with the highest correlation to q can be retrieved using a general full-text retrieval method.
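As a hedged sketch of this pre-retrieval step, each knowledge document can be scored against the user question q by term overlap, a simple stand-in for the "general full text search method" mentioned above (the patent does not fix a particular retrieval algorithm; the corpus, function name and scoring below are assumptions):

```python
# Score each knowledge document against the user question by term overlap,
# a stand-in for a general full-text retrieval method such as BM25.
# The threshold semantics mirror the first preset threshold described above.

def retrieve_documents(question, knowledge_base, threshold=0.0):
    """Return documents whose correlation with the question is >= threshold,
    most relevant first (threshold 0 keeps every matching document)."""
    q_terms = set(question.lower().split())
    scored = []
    for doc in knowledge_base:
        overlap = q_terms & set(doc.lower().split())
        score = len(overlap) / max(len(q_terms), 1)  # fraction of query terms hit
        if overlap and score >= threshold:
            scored.append((score, doc))
    scored.sort(key=lambda pair: -pair[0])
    return [doc for _, doc in scored]
```

Taking only the first element of the returned list corresponds to retrieving the single document m with the highest correlation to q.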
Next, step 103 may be executed to retrieve the first seed database according to the question information of the first dialogue to obtain a plurality of first prompt samples, where each first prompt sample comprises question information of a dialogue together with the knowledge document and knowledge content corresponding to that question information, and the similarity between each first prompt sample and the question information of the first dialogue is not less than a second preset threshold.
In one embodiment, before receiving the question information of the first dialogue input by the user, the method may further include: pre-establishing a first seed database comprising prompt samples composed of question information of different dialogues together with the knowledge documents and knowledge contents corresponding to that question information.
By way of example, the prompt samples in the first seed database can be constructed as follows.
The prompt samples in the first seed database may be constructed based on historical dialogue data and annotated knowledge document information. For the knowledge generation stage, a prompt sample of the first seed database is constructed from the user input q, the corresponding first knowledge document m, and the knowledge content k that the model is expected to generate. This first seed database is called the knowledge generation seed database D1, and each prompt sample d1 is composed of: m + q + k. The knowledge generation prompt template is constructed as follows:
[knowledge document] m [user input] q => k
FIG. 2 is a flowchart of a method for retrieving the first seed database to obtain a plurality of first prompt samples according to an embodiment of the present invention. As shown in fig. 2, in one embodiment, retrieving the first seed database according to the question information of the first dialogue to obtain a plurality of first prompt samples may include:
Step 201, using a pre-trained sentence encoder to obtain a vector for each prompt sample in the first seed database and a vector for the question information of the first dialogue;
step 202, calculating the cosine similarity between the vector of each prompt sample in the first seed database and the vector of the question information of the first dialogue;
step 203, taking the calculated cosine similarity as the similarity between each prompt sample in the first seed database and the question information of the first dialogue;
and step 204, taking the prompt samples in the first seed database whose similarity to the question information of the first dialogue is not less than the second preset threshold as the first prompt samples.
After retrieving the plurality of first prompt samples, step 104 may be performed: the question information of the first dialogue, the first knowledge document, and the plurality of first prompt samples are input into the large-scale pre-trained language model, which is constructed based on the imitation learning strategy, and the first knowledge content is output.
In one embodiment, the knowledge content may be characterized as new knowledge derived by reasoning over the question information of the dialogue and the knowledge document. For example, assume the question posed by the user is: how many points are needed to promote a member's level from level 5 to level 7? By retrieving the document knowledge base, the related knowledge document obtained is: a member level needs 5 points to go from level 5 to level 6, and 6 points to go from level 6 to level 7. The knowledge content that can be inferred from the above question information and knowledge document is: promotion from level 5 to level 7 requires 11 points.
After outputting the knowledge content, in order to reply to the user in the form of a complete dialogue, reply information needs to be generated based on the knowledge content. In step 105, the second seed database may be searched according to the question information of the first dialogue to obtain a plurality of second prompt samples, where each second prompt sample comprises question information of a dialogue together with the knowledge content and reply information corresponding to that question information, and the similarity between each second prompt sample and the question information of the first dialogue is not less than a third preset threshold.
In one embodiment, before receiving the question information of the first dialogue input by the user, the method may further include: pre-establishing a second seed database comprising prompt samples composed of question information of different dialogues together with the knowledge content and reply information corresponding to that question information.
By way of example, the prompt samples in the second seed database can be constructed as follows.
For the reply generation stage, a prompt sample in the second seed database is constructed from the user input, the corresponding knowledge content, and the knowledge-grounded reply sentence r that the model is expected to generate. This second seed database is called the reply generation seed database D2, and each sample d2 is composed of: k + q + r. The reply generation prompt template is constructed as follows:
[knowledge content] k [user input] q => r
Under the limitation of scarce labeling resources, the seed databases may be small in size, but should be as diverse as possible.
FIG. 3 is a flowchart of a method for retrieving the second seed database to obtain a plurality of second prompt samples according to an embodiment of the present invention. As shown in fig. 3, in one embodiment, retrieving the second seed database to obtain a plurality of second prompt samples may include:
step 301, using a pre-trained sentence encoder to obtain a vector for each prompt sample in the second seed database and a vector for the question information of the first dialogue;
step 302, calculating the cosine similarity between the vector of each prompt sample in the second seed database and the vector of the question information of the first dialogue;
step 303, taking the calculated cosine similarity as the similarity between each prompt sample in the second seed database and the question information of the first dialogue;
and step 304, taking the prompt samples in the second seed database whose similarity to the question information of the first dialogue is not less than the third preset threshold as the second prompt samples.
After the plurality of second prompt samples are obtained, step 106 may be performed: the question information of the first dialogue, the first knowledge content, and the plurality of second prompt samples are input into the large-scale pre-trained language model, and the reply information of the first dialogue is output.
In one embodiment, the large-scale pre-trained language model may include one of the following models: GPT-3 model, GLM model or ERNIE model.
It should be noted that, because the large-scale pre-trained language model in the embodiment of the present invention is constructed based on the imitation learning strategy, it does not need to be trained on a large amount of training data before being applied: based on the imitation learning strategy, it can output the first knowledge content from the first knowledge documents and first prompt samples, or output the reply information of the first dialogue from the first knowledge content and second prompt samples. That is, the large-scale pre-trained language models in step 104 and step 106 may be the same model or different models, as appropriate: to save cost, a single large-scale pre-trained language model may execute both step 104 and step 106; to improve efficiency, two large-scale pre-trained language models may be deployed to execute step 104 and step 106 respectively.
The dialog generation method of the present invention is described more fully below in connection with fig. 4, 5 and 6:
FIG. 4 is a schematic diagram showing a specific example of the knowledge content generation stage in the embodiment of the invention. As shown in fig. 4, during the knowledge generation stage, an input prompt is constructed by retrieving samples similar to the user input q from the seed database D1. To ensure that the selected samples are correlated with q, a pre-trained sentence encoder may be employed to obtain sentence vectors for q and for the question q_i in each data sample; the cosine similarity between the sentence vectors of q and q_i is taken as the similarity score, and the n example samples with the highest scores are used to construct the input Prompt. The i-th example sample is represented as follows:
sample_i = [knowledge document] m_i [user input] q_i => k_i
The samples are separated by the line-feed symbol "\n", and the user input and the retrieved knowledge document are spliced at the end. The prompt is composed as follows:
Prompt1 = sample_1 \n sample_2 \n ... sample_n \n [knowledge document] m [user input] q =>
The prompt is input into the pre-trained large-scale language model (abbreviated LM below) to generate the knowledge content k':
k′=LM(Prompt1)
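The construction of Prompt1 and the generation of k' described above can be sketched as follows. This is a minimal illustration only: the bracketed field labels follow the notation of the embodiment, while the function and variable names (`build_knowledge_prompt`, `samples`, and the toy sample values) are hypothetical and not prescribed by the embodiment.

```python
def build_knowledge_prompt(samples, m, q):
    """Assemble Prompt1 from n retrieved example samples, the retrieved
    knowledge document m, and the user input q.

    Each element of `samples` is a (m_i, q_i, k_i) triple: example
    knowledge document, example user input, and example knowledge content.
    """
    parts = [
        f"[knowledge document]{m_i}[user input]{q_i}=>{k_i}"
        for (m_i, q_i, k_i) in samples
    ]
    # Samples are separated by "\n"; the current query is spliced at the
    # tail with "=>" left open, so the model completes it with k'.
    return "\n".join(parts) + f"\n[knowledge document]{m}[user input]{q}=>"


prompt1 = build_knowledge_prompt(
    [("doc A", "question A", "knowledge A"),
     ("doc B", "question B", "knowledge B")],
    m="retrieved doc",
    q="user question",
)
```

Leaving the trailing "=>" open is what invites the language model to complete the pattern established by the examples, which is the essence of the simulated learning strategy.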
FIG. 5 is a schematic diagram showing a specific example of the reply generation stage in the embodiment of the invention. As shown in FIG. 5, in the reply generation stage, an input prompt is constructed by retrieving similar samples from the seed database D2 based on the user input q. As in the knowledge generation stage, a pre-trained sentence encoder may be employed to obtain sentence vectors for q and for the user input q_i in each data sample; the cosine similarity between the sentence vectors of q and q_i is taken as the similarity score, and the n example samples with the highest scores are used to construct the input Prompt. The i-th example sample, sample_i, is represented as follows:
sample_i = [knowledge content] k_i [user input] q_i => r_i
A line-feed symbol "\n" separates adjacent samples; the user input and the knowledge content k' generated in the previous stage are then spliced at the tail. The specific constitution of the prompt is as follows:
Prompt2 = sample_1 \n sample_2 \n ... \n sample_n \n [knowledge content] k' [user input] q =>
The prompt is input into the pre-trained large-scale language model to generate the reply sentence r':
r′=LM(Prompt2)
Fig. 6 is a schematic overall flow chart of a dialog generation method according to an embodiment of the present invention. As shown in FIG. 6, given the user input, the whole process is divided into two stages: knowledge generation and reply generation. In the knowledge generation stage, related knowledge documents are first retrieved from the document knowledge base according to the user input; then n example samples similar to the user input are retrieved from the knowledge generation seed database; an input prompt for the pre-trained language model is constructed based on the example samples and the knowledge documents, and the prompt is input into the pre-trained model to generate knowledge content. In the reply generation stage, n example samples are retrieved from the reply generation seed database according to the user input, and an input prompt is constructed based on the example samples and the knowledge content generated in the previous stage to guide the pre-trained language model to predict the reply sentence.
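The two-stage flow shown in FIG. 6 can be sketched end-to-end as follows. All callables here (`retrieve_documents`, `retrieve_samples`, `lm`) are hypothetical placeholders for the document knowledge base retrieval, the seed database retrieval, and the large-scale pre-trained language model; the embodiment does not prescribe concrete APIs for them.

```python
def generate_reply(q, retrieve_documents, retrieve_samples, lm, n=3):
    """Two-stage knowledge-driven dialogue generation as in FIG. 6."""
    # Stage 1: knowledge generation.
    m = retrieve_documents(q)          # document knowledge base lookup
    d1 = retrieve_samples("D1", q, n)  # knowledge-generation seed samples
    prompt1 = "\n".join(
        f"[knowledge document]{mi}[user input]{qi}=>{ki}" for mi, qi, ki in d1
    ) + f"\n[knowledge document]{m}[user input]{q}=>"
    k = lm(prompt1)                    # knowledge content k'

    # Stage 2: reply generation.
    d2 = retrieve_samples("D2", q, n)  # reply-generation seed samples
    prompt2 = "\n".join(
        f"[knowledge content]{ki}[user input]{qi}=>{ri}" for ki, qi, ri in d2
    ) + f"\n[knowledge content]{k}[user input]{q}=>"
    return lm(prompt2)                 # reply sentence r'
```

Because the same `lm` callable serves both stages, this sketch also reflects the cost-saving option noted earlier of using a single model for both steps; passing two different callables would correspond to the two-model deployment.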
The dialog generation method in the embodiment of the invention has the following beneficial effects:
(1) The method relies on the general natural language processing capability of a large-scale pre-trained language model. Based on the simulated learning strategy, it can migrate quickly and efficiently to dialogue scenarios in other knowledge fields using only a small amount of sample data, avoids re-assembling an external knowledge base and dialogue data for the migration field, and omits the fine-tuning and retraining step, thereby saving the time and resource consumption caused by field migration.
(2) The knowledge selection process in related work on prior knowledge-driven dialogue systems is converted into a knowledge generation task based on the simulated learning strategy. An input prompt for the pre-trained language model is constructed from example samples, similar to the user input, retrieved from a seed database, and the model is prompted to generate specific knowledge content by reasoning over the knowledge documents. This can effectively improve the accuracy of knowledge selection within knowledge documents in complex scenarios, and thereby improve reply quality.
The embodiment of the invention also provides a dialogue generating device, which is described in the following embodiment. Because the principle by which the device solves the problem is similar to that of the dialogue generation method, the implementation of the device may refer to the implementation of the dialogue generation method, and repeated description is omitted.
Fig. 7 is a schematic structural diagram of a dialogue generating device according to an embodiment of the present invention. As shown in fig. 7, in an embodiment of the present invention, the dialogue generating device may specifically include:
a question information receiving module 701, configured to receive question information of a first dialogue input by a user;
the document knowledge base retrieval module 702 is configured to retrieve a document knowledge base according to the problem information of the first dialogue, so as to obtain a plurality of first knowledge documents, where the correlation degree between the plurality of first knowledge documents and the problem information of the first dialogue is not less than a first preset threshold;
a first sub-database retrieval module 703, configured to retrieve a first sub-database according to the problem information of the first dialogue to obtain a plurality of first prompt samples, where the first prompt samples include the problem information of a dialogue and the knowledge documents and knowledge contents corresponding to that problem information, and the similarity between the plurality of first prompt samples and the problem information of the first dialogue is not less than a second preset threshold;
a first knowledge content output module 704, configured to input the problem information of the first dialogue, the first knowledge document, and the plurality of first prompt samples into a large-scale pre-training language model, and output the first knowledge content, where the large-scale pre-training language model is constructed based on a simulated learning strategy;
a second sub-database retrieval module 705, configured to retrieve a second sub-database according to the problem information of the first dialogue to obtain a plurality of second prompt samples, where the second prompt samples include the problem information of a dialogue and the knowledge content and reply information corresponding to that problem information, and the similarity between the plurality of second prompt samples and the problem information of the first dialogue is not less than a third preset threshold;
the reply information output module 706 is configured to input the question information, the first knowledge content, and the plurality of second prompt samples of the first dialogue into the large-scale pre-training language model, and output reply information of the first dialogue.
In one embodiment, the device further includes a document knowledge base building module, configured to, before the question information receiving module 701 receives the question information of the first dialogue input by the user:
pre-establish a document knowledge base, wherein the document knowledge base includes massive knowledge documents related to various types of knowledge.
In one embodiment, the device further includes a first sub-database creation module, configured to, before the question information receiving module 701 receives the question information of the first dialogue input by the user:
A first seed database is established in advance, wherein the first seed database comprises prompt samples consisting of problem information of different dialogues, knowledge documents corresponding to the problem information of the different dialogues and knowledge contents.
In one embodiment, the first sub-database retrieval module 703 is specifically configured to:
a pre-trained sentence encoder is adopted to obtain a vector corresponding to each prompt sample in the first seed database and a vector corresponding to the problem information of the first dialogue;
calculating the cosine similarity between the vector corresponding to each prompt sample in the first seed database and the vector corresponding to the problem information of the first dialogue;
taking the calculated cosine similarity as the similarity between each prompt sample in the first seed database and the problem information of the first dialogue;
and taking a prompt sample in the first seed database whose similarity with the problem information of the first dialogue is not less than the second preset threshold as a first prompt sample.
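The four retrieval steps above amount to a thresholded nearest-neighbour search under cosine similarity. A minimal sketch follows, assuming the sentence vectors have already been produced by a pre-trained sentence encoder (the encoder itself, the toy vectors and the threshold value are placeholder assumptions, not part of the embodiment):

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two sentence vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def select_prompt_samples(query_vec, samples, threshold):
    """Return the seed-database prompt samples whose cosine similarity to
    the query vector is not less than the preset threshold, ranked from
    most to least similar.  `samples` is a list of (vector, sample) pairs.
    """
    scored = [(cosine_similarity(query_vec, v), s) for v, s in samples]
    kept = [(score, s) for score, s in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in kept]

# Toy two-dimensional vectors standing in for sentence-encoder output:
hits = select_prompt_samples(
    [1.0, 0.0],
    [([0.9, 0.1], "sample close to q"), ([0.0, 1.0], "sample far from q")],
    threshold=0.8,
)
```

In practice the vectors would come from the pre-trained sentence encoder, and only the n highest-scoring samples among those passing the threshold would be used to construct the prompt; the same selection logic applies to the second seed database with the third preset threshold.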
In one embodiment, the device further includes a second sub-database creation module, configured to, before the question information receiving module 701 receives the question information of the first dialogue input by the user:
a second seed database is established in advance, wherein the second seed database comprises prompt samples consisting of problem information of different dialogues, knowledge content corresponding to the problem information of the different dialogues and reply information.
In one embodiment, the second sub-database retrieval module 705 is specifically configured to:
a pre-trained sentence encoder is adopted to obtain a vector corresponding to each prompt sample in the second seed database and a vector corresponding to the problem information of the first dialogue;
calculating the cosine similarity between the vector corresponding to each prompt sample in the second seed database and the vector corresponding to the problem information of the first dialogue;
taking the calculated cosine similarity as the similarity between each prompt sample in the second seed database and the problem information of the first dialogue;
and taking a prompt sample in the second seed database whose similarity with the problem information of the first dialogue is not less than the third preset threshold as a second prompt sample.
In one embodiment, the knowledge content characterizes new knowledge derived by reasoning over the question information of the dialogue and the knowledge documents associated with the question information of the dialogue.
In one embodiment, the large-scale pre-trained language model includes one of the following models:
GPT-3 model, GLM model or ERNIE model.
Based on the foregoing inventive concept, as shown in fig. 8, the present invention further proposes a computer device 800, including a memory 810, a processor 820, and a computer program 830 stored in the memory 810 and capable of running on the processor 820, where the processor 820 implements the foregoing dialog generating method when executing the computer program 830.
The embodiment of the invention also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the dialog generation method when being executed by a processor.
The embodiment of the invention also provides a computer program product, which comprises a computer program, and the computer program realizes the dialog generation method when being executed by a processor.
In summary, in the embodiment of the present invention, the problem information of the first dialogue input by the user is received; retrieving a document knowledge base according to the problem information of the first dialogue to obtain a plurality of first knowledge documents; retrieving a first sub-database according to the question information of the first dialogue to obtain a plurality of first prompt samples, wherein the first prompt samples comprise the question information of the dialogue, and knowledge documents and knowledge contents corresponding to the question information of the dialogue; inputting the problem information of the first dialogue, the first knowledge document and a plurality of first prompt samples into a large-scale pre-training language model, and outputting first knowledge content, wherein the large-scale pre-training language model is constructed based on a simulated learning strategy; retrieving a second sub-database according to the question information of the first dialogue to obtain a plurality of second prompt samples, wherein the second prompt samples comprise the question information of the dialogue, and knowledge content and reply information corresponding to the question information of the dialogue; and inputting the problem information, the first knowledge content and a plurality of second prompt samples of the first dialogue into a large-scale pre-training language model, and outputting the reply information of the first dialogue.
By introducing knowledge, the knowledge selection process in related work on knowledge-driven dialogue systems is converted into a knowledge generation task based on the simulated learning strategy: an input prompt for the pre-trained language model is constructed from prompt samples, retrieved from a seed database, that are similar to the dialogue question input by the user, and the model is prompted to reason over the knowledge documents and generate specific knowledge content, which can improve the accuracy of knowledge selection in complex scenarios. In the reply generation stage of the dialogue, the simulated learning strategy is likewise adopted, so that with only a small amount of constructed sample data the large-scale language model can generate high-quality replies containing correct knowledge from the user input and the knowledge content obtained in the knowledge generation stage.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing description of the embodiments is provided to illustrate the general principles of the invention and is not intended to limit its scope to the particular embodiments described; any modifications, equivalent substitutions, improvements and the like made within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (21)

1. A dialog generation method, comprising:
receiving question information of a first dialogue input by a user;
retrieving a document knowledge base according to the problem information of the first dialogue to obtain a plurality of first knowledge documents;
retrieving a first sub-database according to the question information of the first dialogue to obtain a plurality of first prompt samples, wherein the first prompt samples comprise the question information of the dialogue, and knowledge documents and knowledge contents corresponding to the question information of the dialogue;
Inputting the problem information of the first dialogue, the first knowledge document and a plurality of first prompt samples into a large-scale pre-training language model, and outputting first knowledge content, wherein the large-scale pre-training language model is constructed based on a simulated learning strategy;
retrieving a second sub-database according to the question information of the first dialogue to obtain a plurality of second prompt samples, wherein the second prompt samples comprise the question information of the dialogue, and knowledge content and reply information corresponding to the question information of the dialogue;
and inputting the problem information, the first knowledge content and a plurality of second prompt samples of the first dialogue into a large-scale pre-training language model, and outputting the reply information of the first dialogue.
2. The method of claim 1, wherein the relevance between the plurality of first knowledge documents and the problem information of the first dialogue is not less than a first preset threshold;
the similarity between the plurality of first prompt samples and the problem information of the first dialogue is not smaller than a second preset threshold value;
and the similarity between the plurality of second prompt samples and the problem information of the first dialogue is not smaller than a third preset threshold value.
3. The method of claim 1, further comprising, prior to receiving the question information of the first dialog entered by the user:
And pre-establishing a document knowledge base, wherein the document knowledge base comprises massive knowledge documents related to various types of knowledge.
4. The method of claim 1, further comprising, prior to receiving the question information of the first dialog entered by the user:
a first seed database is established in advance, wherein the first seed database comprises prompt samples consisting of problem information of different dialogues, knowledge documents corresponding to the problem information of the different dialogues and knowledge contents.
5. The method of claim 4, wherein retrieving the first seed database based on the problem information for the first session to obtain a plurality of first hint samples comprises:
a pre-trained sentence encoder is adopted to obtain a vector corresponding to each prompt sample in the first seed database and a vector corresponding to the problem information of the first dialogue;
calculating the cosine similarity between the vector corresponding to each prompt sample in the first seed database and the vector corresponding to the problem information of the first dialogue;
taking the calculated cosine similarity as the similarity between each prompt sample in the first seed database and the problem information of the first dialogue;
and taking a prompt sample in the first seed database whose similarity with the problem information of the first dialogue is not less than the second preset threshold as a first prompt sample.
6. The method of claim 1, further comprising, prior to receiving the question information of the first dialog entered by the user:
a second seed database is established in advance, wherein the second seed database comprises prompt samples consisting of problem information of different dialogues, knowledge content corresponding to the problem information of the different dialogues and reply information.
7. The method of claim 6, wherein retrieving the second seed database based on the problem information for the first session to obtain a plurality of second hint samples comprises:
a pre-trained sentence encoder is adopted to obtain a vector corresponding to each prompt sample in the second seed database and a vector corresponding to the problem information of the first dialogue;
calculating the cosine similarity between the vector corresponding to each prompt sample in the second seed database and the vector corresponding to the problem information of the first dialogue;
taking the calculated cosine similarity as the similarity between each prompt sample in the second seed database and the problem information of the first dialogue;
and taking a prompt sample in the second seed database whose similarity with the problem information of the first dialogue is not less than the third preset threshold as a second prompt sample.
8. The method of claim 1, wherein the knowledge content characterizes new knowledge derived by reasoning over the question information of a dialogue and the knowledge documents associated with the question information of the dialogue.
9. The method of claim 1, wherein the large-scale pre-trained language model comprises one of the following models:
GPT-3 model, GLM model or ERNIE model.
10. A dialog generation device, comprising:
the problem information receiving module is used for receiving problem information of a first dialogue input by a user;
the document knowledge base retrieval module is used for retrieving a document knowledge base according to the problem information of the first dialogue to obtain a plurality of first knowledge documents;
the first sub-database retrieval module is used for retrieving the first sub-database according to the problem information of the first dialogue to obtain a plurality of first prompt samples, wherein the first prompt samples comprise the problem information of the dialogue, and knowledge documents and knowledge contents corresponding to the problem information of the dialogue;
the first knowledge content output module is used for inputting the problem information of the first dialogue, the first knowledge document and a plurality of first prompt samples into the large-scale pre-training language model, and outputting the first knowledge content, wherein the large-scale pre-training language model is constructed based on the imitation learning strategy;
the second sub-database retrieval module is used for retrieving a second sub-database according to the problem information of the first dialogue to obtain a plurality of second prompt samples, wherein the second prompt samples comprise the problem information of the dialogue, and knowledge content and reply information corresponding to the problem information of the dialogue;
and the reply information output module is used for inputting the problem information, the first knowledge content and the plurality of second prompt samples of the first dialogue into the large-scale pre-training language model and outputting the reply information of the first dialogue.
11. The apparatus of claim 10, wherein the relevance between the plurality of first knowledge documents and the problem information of the first dialogue is not less than a first preset threshold;
the similarity between the plurality of first prompt samples and the problem information of the first dialogue is not smaller than a second preset threshold value;
and the similarity between the plurality of second prompt samples and the problem information of the first dialogue is not smaller than a third preset threshold value.
12. The apparatus of claim 10, further comprising a document knowledge base creation module for, prior to the question information receiving module receiving the question information of the first dialogue entered by the user:
pre-establishing a document knowledge base, wherein the document knowledge base comprises massive knowledge documents related to various types of knowledge.
13. The apparatus of claim 10, further comprising a first sub-database creation module for, prior to the question information reception module receiving the question information of the first dialogue entered by the user:
a first seed database is established in advance, wherein the first seed database comprises prompt samples consisting of problem information of different dialogues, knowledge documents corresponding to the problem information of the different dialogues and knowledge contents.
14. The apparatus of claim 13, wherein the first seed database retrieval module is specifically configured to:
a pre-trained sentence encoder is adopted to obtain a vector corresponding to each prompt sample in the first seed database and a vector corresponding to the problem information of the first dialogue;
calculating the cosine similarity between the vector corresponding to each prompt sample in the first seed database and the vector corresponding to the problem information of the first dialogue;
taking the calculated cosine similarity as the similarity between each prompt sample in the first seed database and the problem information of the first dialogue;
and taking a prompt sample in the first seed database whose similarity with the problem information of the first dialogue is not less than the second preset threshold as a first prompt sample.
15. The apparatus of claim 10, further comprising a second sub-database creation module for, prior to the question information reception module receiving the question information of the first dialogue entered by the user:
a second seed database is established in advance, wherein the second seed database comprises prompt samples consisting of problem information of different dialogues, knowledge content corresponding to the problem information of the different dialogues and reply information.
16. The apparatus of claim 15, wherein the second seed database retrieval module is specifically configured to:
a pre-trained sentence encoder is adopted to obtain a vector corresponding to each prompt sample in the second seed database and a vector corresponding to the problem information of the first dialogue;
calculating the cosine similarity between the vector corresponding to each prompt sample in the second seed database and the vector corresponding to the problem information of the first dialogue;
taking the calculated cosine similarity as the similarity between each prompt sample in the second seed database and the problem information of the first dialogue;
and taking a prompt sample in the second seed database whose similarity with the problem information of the first dialogue is not less than the third preset threshold as a second prompt sample.
17. The apparatus of claim 10, wherein the knowledge content characterizes new knowledge derived by reasoning over the question information of a dialogue and the knowledge documents associated with the question information of the dialogue.
18. The apparatus of claim 10, wherein the large-scale pre-trained language model comprises one of the following models:
GPT-3 model, GLM model or ERNIE model.
19. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of claims 1 to 9 when executing the computer program.
20. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, implements the method of any of claims 1 to 9.
21. A computer program product, characterized in that the computer program product comprises a computer program which, when executed by a processor, implements the method of any of claims 1 to 9.
CN202310772066.1A 2023-06-27 2023-06-27 Dialog generation method and device Pending CN116860930A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310772066.1A CN116860930A (en) 2023-06-27 2023-06-27 Dialog generation method and device


Publications (1)

Publication Number Publication Date
CN116860930A true CN116860930A (en) 2023-10-10

Family

ID=88231368



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination