CN111382257A - Method and system for generating dialog context - Google Patents
- Publication number
- CN111382257A (application CN202010470216.XA)
- Authority
- CN
- China
- Prior art keywords
- knowledge
- vector
- current time
- generating
- context
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 70
- 239000013598 vector Substances 0.000 claims abstract description 207
- 230000004927 fusion Effects 0.000 claims abstract description 38
- 230000006870 function Effects 0.000 claims description 28
- 238000012545 processing Methods 0.000 claims description 10
- 230000004913 activation Effects 0.000 claims description 9
- 238000012549 training Methods 0.000 claims description 7
- 230000015654 memory Effects 0.000 claims description 4
- 238000004422 calculation algorithm Methods 0.000 claims description 3
- 230000008569 process Effects 0.000 description 19
- 238000004364 calculation method Methods 0.000 description 15
- 230000007246 mechanism Effects 0.000 description 6
- 238000010586 diagram Methods 0.000 description 5
- 238000012986 modification Methods 0.000 description 5
- 230000004048 modification Effects 0.000 description 5
- 239000011159 matrix material Substances 0.000 description 4
- 230000009466 transformation Effects 0.000 description 4
- 230000006872 improvement Effects 0.000 description 3
- 239000000463 material Substances 0.000 description 3
- 238000003058 natural language processing Methods 0.000 description 3
- 230000000644 propagated effect Effects 0.000 description 3
- 230000006978 adaptation Effects 0.000 description 2
- 238000013473 artificial intelligence Methods 0.000 description 2
- 238000004590 computer program Methods 0.000 description 2
- 230000004075 alteration Effects 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- 230000000712 assembly Effects 0.000 description 1
- 238000000429 assembly Methods 0.000 description 1
- 230000008846 dynamic interplay Effects 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 230000002996 emotional effect Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 239000000835 fiber Substances 0.000 description 1
- 230000014509 gene expression Effects 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 239000003607 modifier Substances 0.000 description 1
- 210000002569 neuron Anatomy 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 230000011218 segmentation Effects 0.000 description 1
- 230000006403 short-term memory Effects 0.000 description 1
- 238000000844 transformation Methods 0.000 description 1
- 230000000007 visual effect Effects 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Human Computer Interaction (AREA)
- Machine Translation (AREA)
Abstract
The embodiments of this specification disclose a method and system for generating a dialog context (the system's reply). The method comprises the following steps: acquiring a dialog history, acquiring at least one knowledge text related to the dialog history, and generating at least one knowledge vector k_1~k_m corresponding to the at least one knowledge text, the knowledge texts being stored in a knowledge base; generating a knowledge fusion vector kf_t for the current time from the at least one knowledge vector k_1~k_m and the decoding hidden state S_t of the current time using a first attention model; and generating the dialog context word y_t at the current time based on the knowledge fusion vector kf_t of the current time, the context vector c_t of the current time, and the decoding hidden state S_t of the current time. The dialog context words y_1~y_t constitute the dialog context, y_1 representing the dialog context word at t = 1.
Description
Technical Field
The present description relates to the field of Natural Language Processing (NLP), and more particularly to a method and system for generating a dialog context.
Background
In recent years, with the rise of artificial intelligence, human-machine conversation has received wide attention from academia and industry as an important challenge of artificial intelligence. Task-independent chit-chat dialogue has gradually become a research focus because it can provide users with a more intelligent and lively conversational experience and address their emotional needs. Current approaches to generating chit-chat dialogue fall roughly into retrieval-based and generation-based methods.
A retrieval-based dialogue system uses information retrieval technology to recall relevant candidate replies from a corpus according to the user's input. Unlike retrieval methods, generative methods do not select historical replies from a corpus but generate entirely new ones. Because chit-chat is open-ended, with no definite goal and no limited knowledge range, the challenge for generative dialogue is how to smoothly introduce knowledge from an external knowledge base and reduce "safe" replies such as "yes" and "I see".
Therefore, a method for generating a dialog context is desired that can dynamically select knowledge while the system generates a reply, so that the dialog system can naturally switch topics during a human-machine conversation.
Disclosure of Invention
One embodiment of the present specification provides a method for generating a dialog context. The method comprises the following steps:
acquiring a dialog history, acquiring at least one knowledge text related to the dialog history, and generating at least one knowledge vector k_1~k_m corresponding to the at least one knowledge text, the knowledge texts being stored in a knowledge base; generating a knowledge fusion vector kf_t for the current time from the at least one knowledge vector k_1~k_m and the decoding hidden state S_t of the current time using a first attention model; and generating the dialog context word y_t at the current time based on the knowledge fusion vector kf_t of the current time, the context vector c_t of the current time, and the decoding hidden state S_t of the current time. The dialog context words y_1~y_t constitute the dialog context, y_1 representing the dialog context word at t = 1.
One of the embodiments of the present specification provides a system for generating a dialog context, the system comprising:
a knowledge vector generation module, configured to acquire a dialog history, acquire at least one knowledge text related to the dialog history, and generate at least one knowledge vector k_1~k_m corresponding to the at least one knowledge text, the knowledge texts being stored in a knowledge base; a knowledge fusion vector generation module, configured to generate a knowledge fusion vector kf_t for the current time from the at least one knowledge vector k_1~k_m and the decoding hidden state S_t of the current time using a first attention model; a dialog context word generation module, configured to generate the dialog context word y_t at the current time based on the knowledge fusion vector kf_t of the current time, the context vector c_t of the current time, and the decoding hidden state S_t of the current time; and a dialog context generation module, configured to compose the dialog context from the dialog context words y_1~y_t, y_1 representing the dialog context word at t = 1.
One of the embodiments of the present specification provides an apparatus for generating a dialog context, the apparatus including:
at least one processor and at least one memory; the at least one memory is for storing computer instructions; the at least one processor is configured to execute at least some of the computer instructions to implement a method of generating a dialog context.
One of the embodiments of the present specification provides a computer-readable storage medium storing computer instructions, and when the computer instructions in the storage medium are read by a computer, the computer executes at least part of the instructions to realize a method for generating a dialog context.
Drawings
The present description will be further explained by way of exemplary embodiments, which will be described in detail by way of the accompanying drawings. These embodiments are not intended to be limiting, and in these embodiments like numerals are used to indicate like structures, wherein:
FIG. 1 is a structured flow diagram of a method of generating a dialog context, shown in accordance with some embodiments of the present description;
FIG. 2 is a schematic structural diagram of a first attention model according to some embodiments herein;
FIG. 3 is a diagram of an application scenario for generating a dialog context, shown in accordance with some embodiments of the present description;
FIG. 4 is an exemplary conversation between a chat robot and a user, shown in accordance with some embodiments of the present description.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings used in the description of the embodiments will be briefly described below. It is obvious that the drawings in the following description are only examples or embodiments of the present description, and that for a person skilled in the art, the present description can also be applied to other similar scenarios on the basis of these drawings without inventive effort. Unless otherwise apparent from the context, or otherwise indicated, like reference numbers in the figures refer to the same structure or operation.
It should be understood that "system", "device", "unit" and/or "module" as used herein is a method for distinguishing different components, elements, parts, portions or assemblies at different levels. However, other words may be substituted by other expressions if they accomplish the same purpose.
As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; those steps and elements do not form an exclusive list, and a method or apparatus may also include other steps or elements.
Flow charts are used in this description to illustrate operations performed by a system according to embodiments of the present description. It should be understood that the preceding or following operations are not necessarily performed in the exact order in which they are performed. Rather, the various steps may be processed in reverse order or simultaneously. Meanwhile, other operations may be added to the processes, or a certain step or several steps of operations may be removed from the processes.
FIG. 1 is a structured flow diagram of a method of generating a dialog context, shown in some embodiments in accordance with the present description.
In some embodiments, the dialog history may be the dialog statements of a user chatting with the dialog system. A chit-chat dialog system is a kind of human-machine conversation system that enables a machine to understand natural language and use it to communicate with a user emotionally. Unlike human-machine dialog systems for specific services such as information query and retrieval, conversation through a chit-chat dialog system is open-ended: the user's dialog statements can be any reasonable natural-language sentences, such as "I want to go mountain climbing" or "Do you like watching movies?". The user's dialog may be entered into the dialog system through a human-machine interface including, but not limited to, voice input or text input. In some embodiments, so that the system reply can use more of the preceding information, the dialog history may include all of the historical turns of the user and the system during a conversation.
In some embodiments, the dialog system may be implemented with a Sequence-to-Sequence (Seq2Seq) network architecture based on a context attention mechanism. A Seq2Seq model consists of a dialog encoder and a dialog decoder: the dialog encoder converts the variable-length dialog history into a fixed-length vector representation, and the dialog decoder converts that fixed-length vector into a variable-length dialog context. In some embodiments, the dialog encoder may be constructed based on a Bidirectional Long Short-Term Memory (Bi-directional LSTM) model. A Bi-directional LSTM is composed of two LSTM models: the first processes the input token sequence from left to right, the other from right to left; at each encoding step the hidden states obtained by the two LSTMs are combined and output as the hidden state of the whole model. Because a Bi-directional LSTM takes the information of the whole context into account when encoding, it encodes better than a unidirectional LSTM. In some embodiments, the dialog encoder may also be constructed based on other sequence models, such as a bidirectional GRU (Gated Recurrent Unit), and is not limited by the description herein.
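As a minimal sketch of the bidirectional encoding just described, the following runs one left-to-right and one right-to-left pass over a toy token sequence and concatenates the two hidden states at each step. A simple tanh RNN cell stands in for the LSTM cell, and all sizes and random parameters are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d_emb, d_hid, n = 4, 3, 5          # embedding size, hidden size, sequence length

# Hypothetical parameters of one simple tanh RNN cell (stand-in for an LSTM cell).
def make_cell():
    return (rng.normal(size=(d_hid, d_emb)) * 0.1,   # W_x: input -> hidden
            rng.normal(size=(d_hid, d_hid)) * 0.1,   # W_h: hidden -> hidden
            np.zeros(d_hid))                          # bias

def run_rnn(cell, xs):
    """Run one direction over the sequence, returning all hidden states."""
    W_x, W_h, b = cell
    state = np.zeros(d_hid)
    states = []
    for x in xs:
        state = np.tanh(W_x @ x + W_h @ state + b)
        states.append(state)
    return states

tokens = [rng.normal(size=d_emb) for _ in range(n)]   # word-embedding vectors

fwd = run_rnn(make_cell(), tokens)              # left-to-right pass
bwd = run_rnn(make_cell(), tokens[::-1])[::-1]  # right-to-left pass, realigned

# At each time step, concatenate the two hidden states: h_1 ~ h_n.
h = [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]
print(len(h), h[0].shape)   # n hidden states, each of size 2 * d_hid
```

The last state h[n-1] plays the role of h_n below: it is the only vector that has seen the whole sequence in the forward direction, combined with the backward view of the first step.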
In some embodiments, all tokens in the dialog history are converted into word-embedding vectors using a word-embedding model and then input to the dialog encoder, which outputs a sequence of encoding hidden states h_1~h_n, one per time step, where h_n contains all of the information of the dialog history. The word-embedding model may include, but is not limited to: the Word2vec model, the Term Frequency-Inverse Document Frequency (TF-IDF) model, the SSWE-C (skip-gram based combined sentiment word embedding) model, or the like.
In some embodiments, the encoding hidden state h_n generated by the dialog encoder at the last moment of step 110 is used as the initial intermediate semantic vector between the dialog encoder and the dialog decoder; that is, the encoding hidden state h_n serves as the initial decoding hidden state S_0 of the dialog decoder. At time t = 1, the dialog decoder decodes from the initial decoding hidden state S_0 to generate the first decoding hidden state S_1.
In some embodiments, because the decoding process of the dialog context is unidirectional, the dialog decoder may be constructed based on a unidirectional LSTM model. In some embodiments, the dialog decoder may also be constructed based on other sequence-based models, and is not limited by the description herein.
In some embodiments, at the current time (any time t after the first), the decoding hidden state S_{t-1} of the previous time, the decoder input x_t of the current time, and the knowledge fusion vector kf_{t-1} of the previous time can be taken as the input of the dialog decoder to generate the decoding hidden state S_t of the current time:

S_t = LSTM(S_{t-1}, [x_t, g_{t-1} * kf_{t-1}])  (1)

where kf_{t-1} is the knowledge fusion vector of the previous time (see step 150 for details of the knowledge fusion vector), and g_{t-1} is a knowledge gate that determines the proportion of the knowledge fusion vector in the input. The gate is obtained by the following formula:

g_{t-1} = sigmoid(W_g [S_{t-1}, kf_{t-1}] + b_g)  (2)

where W_g and b_g are learnable parameters: the concatenation of S_{t-1} and kf_{t-1} is linearly transformed by W_g and b_g, and the result is mapped into the range (0, 1) by the sigmoid function. The sigmoid function, also called the logistic function, is commonly used for the output of hidden-layer neurons.

x_t = W_x [e(y_{t-1}), c_{t-1}] + b_x  (3)

where W_x and b_x are learnable parameters, c_{t-1} is the context vector generated at the previous time (see step 160 for details of the context vector), and e(y_{t-1}) is the word-embedding vector of the dialog context word y_{t-1} generated at the previous time; x_t is obtained by linearly transforming the concatenation of e(y_{t-1}) and c_{t-1} with W_x and b_x.

As can be seen from equations (1) to (3), when the decoding hidden state S_t is generated at the current time, not only are the context vector c_{t-1} and the dialog context word y_{t-1} of the previous time used as decoder input (as in an existing context-attention-based decoder), but the knowledge fusion vector kf_{t-1} of the previous time also dynamically integrates external knowledge into the decoding hidden state S_t generated by the dialog decoder. As described in step 150, the knowledge fusion vector is generated dynamically with an attention mechanism, so that the dialog decoder can focus on different parts of the external knowledge at each decoding step.
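One decoding step with the knowledge gate can be sketched as follows. The gate is computed from the concatenation of the previous decoding hidden state and the previous knowledge fusion vector, and the gated fusion vector is concatenated into the decoder input; a simple tanh cell stands in for the LSTM cell, and all names, sizes, and random parameters are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
d_s, d_x, d_k = 4, 3, 4      # sizes of S_t, x_t, and the knowledge fusion vector

# Hypothetical learnable parameters for the gate and the (stand-in) recurrent cell.
W_g = rng.normal(size=(d_k, d_s + d_k)) * 0.1
b_g = np.zeros(d_k)
W_s = rng.normal(size=(d_s, d_s + d_x + d_k)) * 0.1

def decoder_step(S_prev, x_t, k_fuse_prev):
    """One decoding step: gate the previous knowledge fusion vector into the input."""
    g = sigmoid(W_g @ np.concatenate([S_prev, k_fuse_prev]) + b_g)  # knowledge gate
    inp = np.concatenate([x_t, g * k_fuse_prev])                    # gated input
    # A simple tanh cell stands in for the LSTM cell here.
    S_t = np.tanh(W_s @ np.concatenate([S_prev, inp]))
    return S_t, g

S_prev = np.zeros(d_s)
x_t = rng.normal(size=d_x)
k_fuse_prev = rng.normal(size=d_k)
S_t, g = decoder_step(S_prev, x_t, k_fuse_prev)
print(S_t.shape, g.shape)
```

Because the gate values lie strictly in (0, 1), the decoder can scale how much of the previous step's knowledge enters the current state, rather than always injecting it in full.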
In some embodiments, at least one knowledge text related to the dialog history may be obtained according to the dialog history. Specifically, at least one knowledge text related to the dialog history is recalled by querying a knowledge base.
In some embodiments, the knowledge text may be in triple format, i.e., subject + predicate + object. For example: "Zhang San, height, 226" or "Li Qu, representative work, Drama A". In some embodiments, the knowledge text may also be a passage of text or in another format, and is not limited by the description herein. In some embodiments, the language of the knowledge text may be Chinese, English, or another language, and is not limited by the description of the present specification. In some embodiments, the knowledge base may be queried by a retrieval system according to keywords or words in the dialog history, the relevant knowledge texts recalled and ranked, and the top N knowledge texts selected; the value of N may be chosen by weighing the computation load of the model against the richness of the acquired knowledge. For example, N may be 30 or another value. In some embodiments, the content of the knowledge base may come from an open-source data set, such as Wizard-of-Wikipedia or DuConv.
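The recall-and-rank retrieval described above might be sketched as follows; the toy knowledge base, the keyword-overlap scoring, and the tokenizer are illustrative assumptions, not the patent's retrieval system:

```python
# A minimal sketch of recalling the top-N knowledge texts for a dialog history.
knowledge_base = [
    "Tom is a cat",
    "Jerry is a mouse",
    "Tom chases Jerry in the cartoon",
    "Gold is a metal",
]

def tokenize(text):
    """Lowercase and strip trailing punctuation; a stand-in for a real segmenter."""
    return {w.strip("?.,!").lower() for w in text.split()}

def recall_knowledge(dialog_history, top_n=2):
    """Score each knowledge text by word overlap with the history, keep the top N."""
    query = tokenize(dialog_history)
    scored = [(len(query & tokenize(k)), k) for k in knowledge_base]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [k for score, k in scored[:top_n] if score > 0]

print(recall_knowledge("Why does Tom chase Jerry?"))
# -> ['Tom chases Jerry in the cartoon', 'Tom is a cat']
```

A production system would replace the overlap score with a proper ranking model, but the recall-then-rank shape is the same.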
In some embodiments, for any knowledge text, a knowledge encoder may be used to encode each word in the knowledge text, generating a plurality of word vectors. Specifically, a word segmentation model may be used to split the knowledge text into a token sequence; the token sequence is input into the knowledge encoder, which encodes each token and generates the word-vector sequence corresponding to the knowledge text. Finally, sequences Z_1~Z_m of word vectors corresponding to the m knowledge texts are obtained. In some embodiments, the Transformer's encoder may be used as the knowledge encoder. The Transformer is a classic model of natural language processing; its encoder does not adopt the sequential structure of an RNN model but processes each input token in parallel through self-attention layers and feed-forward layers, with residual connections between sub-layers, and finally encodes each input token into a word-vector representation carrying global information.
In some embodiments, a third attention model may be used to combine the obtained m word-vector sequences Z_1~Z_m into m knowledge vectors k_1~k_m. See step 140 for a detailed description of the third attention model. In some embodiments, other methods may also be used to obtain the at least one knowledge vector k_1~k_m corresponding to the at least one knowledge text; for example, the encoding may be performed by a knowledge-graph embedding method, and is not limited by the description of the present specification.
In some embodiments, there may be many knowledge texts obtained in step 130 (e.g., 30 knowledge texts). In order to use all the information contained in these texts for decoding, each knowledge text is first represented by one knowledge vector, and the knowledge fusion vector is then generated from the knowledge vectors of all knowledge texts according to the method described in step 150. Thus, in some embodiments, a knowledge vector may be generated from the plurality of generated word vectors using the third attention model.
In some embodiments, the input of the third attention model may be the j-th of the m word-vector sequences Z_1~Z_m obtained in step 130, Z_j, which comprises the word vectors z_j^1~z_j^l; the output may be the j-th of the m knowledge vectors k_1~k_m described in step 130, k_j. The third attention model is implemented as follows:

(I) Each of the word vectors z_j^i is weighted, and the result is processed with an activation function to generate the word attention vector w_j:

w_j^i = V_z tanh(W_z z_j^i)  (4)

where V_z and W_z are learnable parameters. In some embodiments, through training, the word attention value w_j^i corresponding to a word vector more relevant to the dialog history can be higher.

(II) Based on the word attention vector w_j, a scoring function is used to generate the word attention weights a_j^1~a_j^l corresponding to the word vectors:

a_j^i = softmax(w_j^i)  (5)

where the softmax function normalizes each element of w_j into the range (0, 1). In some embodiments, the word attention weights may also be obtained by computing cosine similarity. In some embodiments, other ways of calculating the word attention weights may be used, and are not limited by the description herein.

(III) The products of the word vectors z_j^i and the word attention weights a_j^i are computed and summed to generate the knowledge vector k_j:

k_j = sum_i a_j^i z_j^i  (6)

By using an attention mechanism, the word vectors z_j^1~z_j^l are combined by importance into the knowledge vector k_j. The knowledge vector k_j has the same dimension as a word vector and contains the information of all the word vectors, which is convenient for subsequent calculation.
In some embodiments, the knowledge fusion vector kf_t of the current time may be generated from the at least one knowledge vector k_1~k_m and the decoding hidden state S_t of the current time using the first attention model.

In some embodiments, as shown in FIG. 2, the input of the first attention model may be the at least one knowledge vector k_1~k_m and the decoding hidden state S_t of the current time, and the output may be the knowledge fusion vector kf_t of the current time. See step 130 for details of the knowledge vectors k_1~k_m. Regarding the decoding hidden state S_t of the current time, referring to step 120, at each decoding step the dialog decoder generates a decoding hidden state according to the method described in step 120, so that t-1 decoding hidden states S_1~S_{t-1} have been generated before the current time t. The first attention model is implemented as follows:

(I) Each of the knowledge vectors k_1~k_m is weighted together with the decoding hidden state S_t of the current time, and the result is processed with an activation function to generate the knowledge attention vector e_t:

e_t^i = V_k tanh(W_k [k_i, S_t])  (7)

where V_k and W_k are learnable parameters. The obtained e_t, a vector of m real numbers, represents the degree of correlation between the knowledge vectors k_1~k_m and the decoding hidden state S_t of the current time. In some embodiments, the parameters are trained so that a knowledge vector k_i highly correlated with the current decoding hidden state S_t yields a higher e_t^i.

(II) Based on the knowledge attention vector e_t, a scoring function is used to generate the knowledge attention weights d_t^1~d_t^m corresponding to the knowledge vectors k_1~k_m:

d_t^i = softmax(e_t^i)  (8)

where the softmax function normalizes each element of e_t into the range (0, 1). In some embodiments, the knowledge attention weights may also be obtained by computing cosine similarity. In some embodiments, other ways of computing the knowledge attention weights may be used, and are not limited by the description herein.

(III) The products of the knowledge vectors k_1~k_m and the knowledge attention weights d_t^1~d_t^m are computed and summed to generate the knowledge fusion vector kf_t of the current time:

kf_t = sum_i d_t^i k_i  (9)

In the embodiments described in this specification, by introducing the knowledge fusion vector kf_t in the decoding process, the dialog decoder can access the knowledge vectors k_1~k_m at every decoding step, which avoids losing part of the knowledge information during decoding. Moreover, the knowledge attention weights d_t computed from the current decoding hidden state S_t give the knowledge vectors k_1~k_m different attention at different times. For example: at the current time, if the knowledge information contained in the knowledge vector k_2 is highly correlated with the current decoding hidden state S_t, the corresponding knowledge attention weight d_2 will also be higher; at the next time, if the knowledge information contained in the knowledge vector k_3 is highly correlated with the decoding hidden state S_{t+1} of the next time, the corresponding knowledge attention weight d_3 will also be higher. This lets the dialog decoder focus on different parts of the external knowledge at different decoding steps; for example, at the current time in the example above the decoder may attend to k_2 among k_1~k_m, and at the next time to k_3.
In step 120, the encoding hidden state h_n, which stores the semantic information of the dialog history, is used as the initial intermediate semantic vector C between the dialog encoder and the dialog decoder. Since the length of this vector is fixed, when the dialog history is long the intermediate semantic vector C cannot hold all of the semantic information, which limits the comprehension of the dialog decoder. Thus, in some embodiments, a dynamic attention mechanism is used to generate the context vector c_t of the current time as the intermediate semantic vector of the current time between the dialog encoder and the dialog decoder. In some embodiments, the context vector c_t of the current time may be generated from the sequence of encoding hidden states h_1~h_n and the decoding hidden state S_t of the current time using a second attention model.

In some embodiments, the input of the second attention model may be the decoding hidden state S_t of the current time and the sequence of encoding hidden states h_1~h_n, and the output may be the context vector c_t of the current time. The second attention model is implemented as follows:

(I) Each encoding hidden state in the sequence h_1~h_n is weighted together with the decoding hidden state S_t of the current time, and the result is processed with an activation function to generate the context attention vector u_t:

u_t^i = V_h tanh(W_h [h_i, S_t])  (10)

where V_h and W_h are learnable parameters. The obtained u_t, a vector of n real numbers, represents the correlation between the sequence of encoding hidden states h_1~h_n and the decoding hidden state S_t of the current time; an encoding hidden state h_i with high correlation has a higher corresponding u_t^i.

(II) Based on the context attention vector u_t, a scoring function is used to generate the context attention weights b_t^1~b_t^n corresponding to the sequence of encoding hidden states h_1~h_n:

b_t^i = softmax(u_t^i)  (11)

where the softmax function normalizes each element of u_t into the range (0, 1). In some embodiments, the context attention weights may also be obtained by computing cosine similarity. In some embodiments, other ways of computing the context attention weights may be used, and are not limited by the description herein.

(III) The products of the encoding hidden states h_1~h_n and the context attention weights b_t^1~b_t^n are computed and summed to generate the context vector c_t of the current time:

c_t = sum_i b_t^i h_i  (12)

By using c_t, the dialog decoder can access all the information in the sequence of encoding hidden states h_1~h_n at every decoding step. At the same time, the context attention weights b_t computed from the current decoding hidden state S_t give the sequence h_1~h_n different attention at different times, so that the dialog decoder can focus on different parts of the dialog history at each decoding step. For example: the dialog encoder generates the sequence of encoding hidden states h_1~h_3 from the dialog history "Tom chases Jerry". At decoding time t = 1, the context attention weights give h_1 the highest attention, so the dialog decoder may focus in c_1 on the information of the word "Tom"; similarly, at decoding time t = 2 the dialog decoder may focus in c_2 on the information of the word "chases", and at t = 3 it may focus in c_3 on the information of the word "Jerry". If the sequence h_1~h_n were not given different attention at different times, the information available to the dialog decoder would be the same at every decoding step.
At step 170, the dialog context word at the current time is generated.
In some embodiments, the dialog context word yt at the current time may be generated based on the knowledge fusion vector of the current time, the context vector of the current time, and the decoding hidden state St at the current time. Specifically, the splicing matrix formed by the context vector of the current time, the decoding hidden state St at the current time, and the knowledge fusion vector of the current time is input into a word selection model, and the word selection model predicts the dialog context word yt at the current time. The calculation formula is as follows:

yt = softmax(V2 · (V1 · [St, ct, k̃t] + b1) + b2)    (13)

wherein V1, V2, b1 and b2 are learnable parameters, ct denotes the context vector of the current time, and k̃t denotes the knowledge fusion vector of the current time. In equation (13), the splicing matrix [St, ct, k̃t] composed of St, ct and k̃t undergoes two linear transformations with the above parameters: 1. [St, ct, k̃t] is dot-multiplied with V1, and the resulting vector is added to the offset vector b1. 2. The result of linear transformation 1 is dot-multiplied with V2, and the resulting vector is added to the offset vector b2, finally yielding a vector of M real numbers. M is the size of the vocabulary used by the dialog system, and the M real numbers respectively represent the similarity between the predicted dialog context word and the M words in the vocabulary. Then, this vector is normalized by a softmax function to obtain M probability scores in the range 0-1, and the word in the vocabulary corresponding to the highest probability is selected as the dialog context word yt at the current time.
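The two linear transformations and softmax described above can be sketched as follows; the dimensions, parameter values, and input vectors are random stand-ins, not the trained parameters of the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
dim, M = 6, 10        # concat dimension and vocabulary size (illustrative values)

# Stand-ins for the learnable parameters V1, V2, b1, b2 of equation (13);
# real values would come from training, these are random.
V1, b1 = rng.normal(size=(dim, dim)), rng.normal(size=dim)
V2, b2 = rng.normal(size=(dim, M)), rng.normal(size=M)

def select_word(S_t, c_t, k_t):
    """Two linear transformations over the splice [S_t, c_t, k_t], then softmax."""
    x = np.concatenate([S_t, c_t, k_t])   # splice the three input vectors
    h = x @ V1 + b1                       # linear transformation 1
    logits = h @ V2 + b2                  # linear transformation 2 -> M scores
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                  # softmax over the vocabulary
    return int(np.argmax(probs)), probs   # highest-probability word index

word_index, probs = select_word(rng.normal(size=2), rng.normal(size=2),
                                rng.normal(size=2))
```

The returned index would then be looked up in the vocabulary to emit the word yt.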
In some embodiments, the dialog context words y1~yt constitute the dialog context, where y1 represents the dialog context word at t = 1. The embodiments described in this specification add a knowledge fusion vector to the existing context-attention-based Seq2Seq network structure, realizing dynamic interaction between external knowledge and the dialog decoder, so that the dialog decoder can dynamically introduce external knowledge during decoding and the dialog system can switch to new topics naturally and smoothly. The following takes the set of human-machine dialog and knowledge texts shown in fig. 4 as an example:
In the history dialog shown in fig. 4, A is the chat robot's side of the dialog and B is the user's side. The dialog so far is: "A: Let me recommend a well-reviewed movie to you.", "A: OK. Movie a, by director F, may be a truly inspiring movie.", "B: That director is really impressive!". At the start of the conversation, the chat robot recalls the set of knowledge texts shown in fig. 4 from the knowledge base: "F, ancestry, USA", "F, gender, male", "F, representative work, movie b", "F, profession, director", "F, date of birth, May 12, 1925", "movie a, director, F", "movie a, reviews, a well-reviewed romance movie", "movie a, awards, American golden award nomination, television category, best miniseries", for a total of 8 knowledge texts; please refer to step 130 for the related description of knowledge texts. Knowledge vectors k1~k8 corresponding to the above 8 knowledge texts can then be generated according to the methods described in steps 130 and 140 of this specification. The generated dialog shown in fig. 4 has integrated the knowledge texts "movie a, reviews, a well-reviewed romance movie" and "movie a, director, F". The last dialog in fig. 4 is "B: That director is really impressive!
", focusing the dialog on the director-F, the dialog decoder generates a decoded hidden state S at time t =1 from the intermediate semantic vector generated above the dialog1(see step 120 for details), then based on the knowledge vector k using a dynamic attention mechanism as described in step 1501~k8And decoding the state S1Generating knowledge fusion vectorsWherein knowledge fusion vectors are being generated due to the conversational contextIn the process of (1), the knowledge text "F, date of birth, 5/12/1925" and "F" is given a higher attention to the knowledge vector corresponding to the movie b ", wherein the degree of matching between" F, date of birth, 5/12/1925 "and the dialogue text is relatively higher, so that the attention given to the corresponding knowledge vector is the highest. The knowledge is then fused into vectors as described in step 170As one of the inputs of the word selection model, the word selection model may extract the knowledge information with the highest current attention in the process of generating the following words: "F, date of birth, 5 months and 12 days 1925". At the time t =2, knowledgeFused vectorAgain, this is one of the inputs to the dialog decoder, so the decoded hidden state S2 generated by the dialog decoder at time t =2 contains the most recently interesting knowledge information: "F, date of birth, 5 months and 12 days 1925". By analogy, vectors are fused by the knowledge described aboveThe repeated interaction process with the dialog decoder can coherently integrate the knowledge text "F, date of birth, 5/12/1925" and "F, representative, movie b" into the dialog context, resulting in a dialog context "he was born at 5/12/1925" and he also has a representation, movie b ".
In some embodiments, a training set consisting of the dialog context, the at least one knowledge text, and the corresponding dialog context words may be obtained, and the dialog system composed of the dialog encoder, the dialog decoder, the knowledge encoder, the first attention model, the second attention model, the third attention model, and the softmax function may be trained using a back propagation algorithm. Specifically, the dialog context words can be used as labels, and model training is performed in an end-to-end manner to obtain a trained dialog system.
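The end-to-end pattern (cross-entropy loss over the vocabulary, gradients propagated back through all modules) can be illustrated with a deliberately tiny stand-in model; the sizes, random data, and single weight matrix below are placeholders for the full encoder/decoder/attention stack:

```python
import numpy as np

rng = np.random.default_rng(1)
M, d, n = 5, 4, 8                        # vocabulary size, feature dim, batch (toy)
W = rng.normal(scale=0.1, size=(d, M))   # stands in for ALL learnable parameters
X = rng.normal(size=(n, d))              # toy features from the decoder side
y = rng.integers(0, M, size=n)           # target dialog context words as labels

def predict(X):
    """Softmax over vocabulary scores, as in the word selection model."""
    logits = X @ W
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

losses = []
for _ in range(100):                     # back-propagation training loop
    P = predict(X)
    losses.append(-np.log(P[np.arange(n), y]).mean())  # cross-entropy loss
    G = P.copy()
    G[np.arange(n), y] -= 1.0            # gradient of the loss w.r.t. logits
    W -= 0.1 * (X.T @ G / n)             # gradient-descent parameter update
```

In the real system the same loss would back-propagate through the attention models and both encoders rather than a single matrix.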
It should be noted that the above description of the process 100 is for illustration and description only, and does not limit the scope of the application of the present disclosure. Various modifications and alterations to process 100 will become apparent to those skilled in the art in light of the present description. However, such modifications and variations are intended to be within the scope of the present description. For example, step 130 and step 140 may be combined into one step, and a word vector corresponding to the knowledge text is generated in the same step, and a knowledge vector is generated based on the word vector.
FIG. 3 is a diagram of an application scenario for generating a dialog context, shown in some embodiments according to the present description.
As shown in fig. 3, during the chat between the chat robot and the user, the following dialog occurs: "Let me recommend a TV series to you.", "OK, please recommend one.". Using the method described in this specification, the chat robot generates the dialog context and carries on a smooth conversation with the user: "There is a TV series called a, starring the actor Zhang San.", "I really like Zhang San.", .... For details of the method for generating the dialog context, please refer to fig. 1, which is not described herein again.
The method described in this specification can also be applied to other application scenarios, and is not limited by the description of this specification.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be regarded as illustrative only and not as limiting the present specification. Various modifications, improvements and adaptations to the present description may occur to those skilled in the art, although not explicitly described herein. Such modifications, improvements and adaptations are proposed in the present specification and thus fall within the spirit and scope of the exemplary embodiments of the present specification.
Also, the description uses specific words to describe embodiments of the description. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with at least one embodiment of the specification is included. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the specification may be combined as appropriate.
Moreover, those skilled in the art will appreciate that aspects of the present description may be illustrated and described in terms of several patentable species or situations, including any new and useful combination of processes, machines, manufacture, or materials, or any new and useful improvement thereof. Accordingly, aspects of this description may be performed entirely by hardware, entirely by software (including firmware, resident software, micro-code, etc.), or by a combination of hardware and software. The above hardware or software may be referred to as a "data block," "module," "engine," "unit," "component," or "system." Furthermore, aspects of the present description may be represented as a computer product, including computer readable program code, embodied in one or more computer readable media.
The computer storage medium may comprise a propagated data signal with the computer program code embodied therewith, for example, on baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, etc., or any suitable combination. A computer storage medium may be any computer-readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for the operation of various portions of this specification may be written in any one or more programming languages, including an object-oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, Python, and the like, a conventional programming language such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, a dynamic programming language such as Python, Ruby, and Groovy, or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any network format, such as a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet), or in a cloud computing environment, or as a service, such as a software as a service (SaaS).
Additionally, the order in which the elements and sequences of the process are recited in the specification, the use of alphanumeric characters, or other designations, is not intended to limit the order in which the processes and methods of the specification occur, unless otherwise specified in the claims. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the present specification, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not to be interpreted as implying that the claimed subject matter requires more features than are expressly recited in each claim. Indeed, the claimed subject matter may be characterized as having less than all of the features of a single embodiment disclosed above.
Numerals describing the number of components, attributes, etc. are used in some embodiments, it being understood that such numerals used in the description of the embodiments are modified in some instances by the use of the modifier "about", "approximately" or "substantially". Unless otherwise indicated, "about", "approximately" or "substantially" indicates that the number allows a variation of ± 20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may vary depending upon the desired properties of the individual embodiments. In some embodiments, the numerical parameter should take into account the specified significant digits and employ a general digit preserving approach. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the range are approximations, in the specific examples, such numerical values are set forth as precisely as possible within the scope of the application.
For each patent, patent application publication, and other material cited in this specification, such as articles, books, specifications, publications, and documents, the entire contents thereof are hereby incorporated by reference into this specification, except for any application history document that is inconsistent with or conflicts with the contents of this specification, and except for any document that would limit the broadest scope of the claims of this specification (whether presently or later appended to this specification). It is to be understood that if the descriptions, definitions, and/or uses of terms in the materials accompanying this specification are inconsistent with or contrary to the descriptions, definitions, and/or uses of terms in this specification, the descriptions, definitions, and/or uses of terms in this specification shall control.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments of the present disclosure. Other variations are also possible within the scope of the present description. Thus, by way of example, and not limitation, alternative configurations of the embodiments of the specification can be considered consistent with the teachings of the specification. Accordingly, the embodiments of the present description are not limited to only those embodiments explicitly described and depicted herein.
Claims (24)
1. A method of generating a dialog context, the method comprising:
acquiring a dialog context, acquiring at least one knowledge text related to the dialog context according to the dialog context, and generating at least one knowledge vector k1~km corresponding to the at least one knowledge text; the knowledge text is stored in a knowledge base;
according to at least one knowledge vector k1~kmAnd decoding hidden state S at current timetGenerating a knowledge fusion vector for a current time using a first attention model;
generating the dialog context word yt at the current time based on the knowledge fusion vector of the current time, the context vector of the current time, and the decoding hidden state St of the current time;
the dialog context words y1~yt make up the dialog context, wherein y1 represents the dialog context word at t = 1.
2. The method of claim 1, wherein the context vector of the current time is generated in the following manner:
generating a sequence of encoded hidden states h1~hn from the dialog context using a dialog encoder;
obtaining the decoding hidden state St of the current time;
3. The method of claim 2, wherein the generating at least one knowledge vector k1~km corresponding to the at least one knowledge text comprises:
for any of the knowledge texts, encoding each word in the knowledge text using a knowledge encoder, and generating the knowledge vector using a third attention model based on the generated plurality of word vectors.
4. The method of claim 3, wherein the generating the knowledge vector using a third attention model from the generated plurality of word vectors comprises:
for each of the plurality of word vectors, performing a weighted operation on the word vector, and processing an operation result by using an activation function to generate a word attention vector;
generating a plurality of word attention weights corresponding to a plurality of the word vectors using a scoring function based on the word attention vectors;
generating the knowledge vector based on a plurality of the word vectors and a plurality of the word attention weights.
5. The method of claim 4, wherein the generating a knowledge fusion vector of the current time using a first attention model based on at least one of the knowledge vectors k1~km and the decoding hidden state St at the current time comprises:
for each knowledge vector of the at least one knowledge vector k1~km, performing a weighted summation operation on the knowledge vector and the decoding hidden state St at the current time, and processing the operation result by using an activation function to generate a knowledge attention vector;
generating at least one knowledge attention weight corresponding to at least one of the knowledge vectors using a scoring function based on the knowledge attention vector;
6. The method of claim 5, wherein the obtaining the decoding hidden state St of the current time comprises:
at the current time, generating the decoding hidden state St of the current time by taking the decoding hidden state St-1 of the previous time, the decoding input of the current time, and the knowledge fusion vector of the previous time as inputs of a dialog decoder; wherein a knowledge gate is used to determine the ratio of the decoding input of the current time to the knowledge fusion vector of the previous time;
7. The method of claim 6, wherein the generating a context vector of the current time using a second attention model according to the sequence of encoded hidden states h1~hn and the decoding hidden state St of the current time comprises:
for each encoded hidden state in the sequence of encoded hidden states h1~hn, performing a weighted summation operation on the encoded hidden state and the decoding hidden state St at the current time, and processing the operation result by using an activation function to generate a context attention vector;
generating a plurality of contextual attention weights corresponding to the encoded hidden states h1~hn using a scoring function based on the context attention vector;
8. The method of claim 7, wherein the generating the dialog context word yt at the current time based on the knowledge fusion vector of the current time, the context vector of the current time, and the decoding hidden state St of the current time comprises:
9. The method of claim 8, wherein said obtaining at least one knowledge text from the dialog context comprises:
querying the knowledge base, and recalling the at least one knowledge text related to the dialog context.
10. The method of claim 9, wherein the knowledge encoder is a Transformer encoder.
11. The method of claim 10, wherein the method further comprises:
obtaining a training set consisting of the dialog context, the at least one knowledge text, and the dialog context words, and training a dialog system composed of the dialog encoder, the dialog decoder, the knowledge encoder, the first attention model, the second attention model, the third attention model, and the softmax function using a back propagation algorithm.
12. A system for generating a dialog context, the system comprising:
a knowledge vector generation module, configured to obtain a dialog context, obtain at least one knowledge text related to the dialog context according to the dialog context, and generate at least one knowledge vector k1~km corresponding to the at least one knowledge text; the knowledge text is stored in a knowledge base;
a knowledge fusion vector generation module, configured to generate a knowledge fusion vector of the current time using the first attention model based on at least one knowledge vector k1~km and the decoding hidden state St of the current time;
a dialog context word generation module, configured to generate the dialog context word yt of the current time based on the knowledge fusion vector of the current time, the context vector of the current time, and the decoding hidden state St of the current time;
a dialog context generation module, configured to make up the dialog context from the dialog context words y1~yt, wherein y1 represents the dialog context word at t = 1.
13. The system of claim 12, further comprising:
an encoding module, configured to generate a sequence of encoded hidden states h1~hn from the dialog context using a dialog encoder;
a decoding module, configured to obtain the decoding hidden state St of the current time;
14. The system of claim 13, wherein the generating at least one knowledge vector k1~km corresponding to the at least one knowledge text comprises:
for any of the knowledge texts, encoding each word in the knowledge text using a knowledge encoder, and generating the knowledge vector using a third attention model based on the generated plurality of word vectors.
15. The system of claim 14, wherein the generating the knowledge vector using a third attention model from the generated plurality of word vectors comprises:
for each of the plurality of word vectors, performing a weighted operation on the word vector, and processing an operation result by using an activation function to generate a word attention vector;
generating a plurality of word attention weights corresponding to a plurality of the word vectors using a scoring function based on the word attention vectors;
generating the knowledge vector based on a plurality of the word vectors and a plurality of the word attention weights.
16. The system of claim 15, wherein the generating a knowledge fusion vector of the current time using a first attention model based on at least one of the knowledge vectors k1~km and the decoding hidden state St at the current time comprises:
for each knowledge vector of the at least one knowledge vector k1~km, performing a weighted summation operation on the knowledge vector and the decoding hidden state St at the current time, and processing the operation result by using an activation function to generate a knowledge attention vector;
generating at least one knowledge attention weight corresponding to at least one of the knowledge vectors using a scoring function based on the knowledge attention vector;
17. The system of claim 16, wherein the obtaining the decoding hidden state St of the current time comprises:
at the current time, generating the decoding hidden state St of the current time by taking the decoding hidden state St-1 of the previous time, the decoding input of the current time, and the knowledge fusion vector of the previous time as inputs of a dialog decoder; wherein a knowledge gate is used to determine the ratio of the decoding input of the current time to the knowledge fusion vector of the previous time;
18. The system of claim 17, wherein the generating a context vector of the current time using a second attention model according to the sequence of encoded hidden states h1~hn and the decoding hidden state St of the current time comprises:
for each encoded hidden state in the sequence of encoded hidden states h1~hn, performing a weighted summation operation on the encoded hidden state and the decoding hidden state St at the current time, and processing the operation result by using an activation function to generate a context attention vector;
generating a plurality of contextual attention weights corresponding to the encoded hidden states h1~hn using a scoring function based on the context attention vector;
19. The system of claim 18, wherein the generating the dialog context word yt at the current time based on the knowledge fusion vector of the current time, the context vector of the current time, and the decoding hidden state St of the current time comprises:
20. The system of claim 19, wherein said obtaining at least one knowledge text from the dialog context comprises:
querying the knowledge base, and recalling the at least one knowledge text related to the dialog context.
21. The system of claim 20, wherein the knowledge encoder is a Transformer encoder.
22. The system of claim 21, wherein the system further comprises:
a training module, configured to obtain a training set of the dialog context, the at least one knowledge text, and the dialog context words, and to train a dialog system composed of the dialog encoder, the dialog decoder, the knowledge encoder, the first attention model, the second attention model, the third attention model, and the softmax function using a back propagation algorithm.
23. An apparatus to generate a dialog context, wherein the apparatus comprises at least one processor and at least one memory;
the at least one memory is for storing computer instructions;
the at least one processor is configured to execute at least some of the computer instructions to implement the method of any of claims 1-11.
24. A computer-readable storage medium storing computer instructions which, when read by a computer, cause the computer to perform the method of any one of claims 1 to 11.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010470216.XA CN111382257A (en) | 2020-05-28 | 2020-05-28 | Method and system for generating dialog context |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010470216.XA CN111382257A (en) | 2020-05-28 | 2020-05-28 | Method and system for generating dialog context |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111382257A true CN111382257A (en) | 2020-07-07 |
Family
ID=71217697
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010470216.XA Pending CN111382257A (en) | 2020-05-28 | 2020-05-28 | Method and system for generating dialog context |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111382257A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112084314A (en) * | 2020-08-20 | 2020-12-15 | 电子科技大学 | Knowledge-introducing generating type session system |
CN112214591A (en) * | 2020-10-29 | 2021-01-12 | 腾讯科技(深圳)有限公司 | Conversation prediction method and device |
CN112328756A (en) * | 2020-10-13 | 2021-02-05 | 山东师范大学 | Context-based dialog generation method and system |
CN113032545A (en) * | 2021-05-29 | 2021-06-25 | 成都晓多科技有限公司 | Method and system for conversation understanding and answer configuration based on unsupervised conversation pre-training |
CN113919293A (en) * | 2021-09-29 | 2022-01-11 | 北京搜狗科技发展有限公司 | Formula recognition model training method and device |
CN114942986A (en) * | 2022-06-21 | 2022-08-26 | 平安科技(深圳)有限公司 | Text generation method and device, computer equipment and computer readable storage medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110399460A (en) * | 2019-07-19 | 2019-11-01 | 腾讯科技(深圳)有限公司 | Dialog process method, apparatus, equipment and storage medium |
CN110851575A (en) * | 2019-09-23 | 2020-02-28 | 上海深芯智能科技有限公司 | Dialogue generating system and dialogue realizing method |
CN110858215A (en) * | 2018-08-23 | 2020-03-03 | 广东工业大学 | End-to-end target guiding type dialogue method based on deep learning |
CN111125333A (en) * | 2019-06-06 | 2020-05-08 | 北京理工大学 | Generation type knowledge question-answering method based on expression learning and multi-layer covering mechanism |
CN111159368A (en) * | 2019-12-12 | 2020-05-15 | 华南理工大学 | Reply generation method for personalized dialogue |
CN111159467A (en) * | 2019-12-31 | 2020-05-15 | 青岛海信智慧家居***股份有限公司 | Method and equipment for processing information interaction |
CN111191015A (en) * | 2019-12-27 | 2020-05-22 | 上海大学 | Neural network movie knowledge intelligent dialogue method |
- 2020-05-28 CN CN202010470216.XA patent/CN111382257A/en active Pending
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110858215A (en) * | 2018-08-23 | 2020-03-03 | 广东工业大学 | End-to-end target guiding type dialogue method based on deep learning |
CN111125333A (en) * | 2019-06-06 | 2020-05-08 | 北京理工大学 | Generation type knowledge question-answering method based on expression learning and multi-layer covering mechanism |
CN110399460A (en) * | 2019-07-19 | 2019-11-01 | 腾讯科技(深圳)有限公司 | Dialog process method, apparatus, equipment and storage medium |
CN110851575A (en) * | 2019-09-23 | 2020-02-28 | 上海深芯智能科技有限公司 | Dialogue generating system and dialogue realizing method |
CN111159368A (en) * | 2019-12-12 | 2020-05-15 | 华南理工大学 | Reply generation method for personalized dialogue |
CN111191015A (en) * | 2019-12-27 | 2020-05-22 | 上海大学 | Neural network movie knowledge intelligent dialogue method |
CN111159467A (en) * | 2019-12-31 | 2020-05-15 | 青岛海信智慧家居***股份有限公司 | Method and equipment for processing information interaction |
Non-Patent Citations (6)
Title |
---|
HAO ZHOU等: "Commonsense Knowledge Aware Conversation Generation with Graph Attention", 《INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE 2018》 * |
JANE: "对话清华大学周昊,详解IJCAI杰出论文及其背后的故事", 《HTTPS://JUEJIN.IM/POST/5B6A9E085188251A8D37136D》 * |
PETAR VELICKOVIC等: "GRAPH ATTENTION NETWORKS", 《PUBLISHED AS A CONFERENCE PAPER AT ICLR 2018》 * |
学习ML的皮皮虾: "基于常识知识图谱的对话模型【阅读笔记】", 《HTTPS://ZHUANLAN.ZHIHU.COM/P/50502922》 * |
李少博等: "基于知识拷贝机制的生成式对话模型", 《第十八届全国计算语言学学术会议THE 18TH CHINESE NATIONAL CONFERENCE ON COMPUTATIONAL LINGUISTICS (CCL 2019)》 * |
陈晨等: "基于深度学习的开放领域对话***研究综述", 《计算机学报》 * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112084314A (en) * | 2020-08-20 | 2020-12-15 | 电子科技大学 | Knowledge-introducing generating type session system |
CN112084314B (en) * | 2020-08-20 | 2023-02-21 | 电子科技大学 | Knowledge-introducing generating type session system |
CN112328756A (en) * | 2020-10-13 | 2021-02-05 | 山东师范大学 | Context-based dialog generation method and system |
CN112214591A (en) * | 2020-10-29 | 2021-01-12 | 腾讯科技(深圳)有限公司 | Conversation prediction method and device |
CN112214591B (en) * | 2020-10-29 | 2023-11-07 | 腾讯科技(深圳)有限公司 | Dialog prediction method and device |
CN113032545A (en) * | 2021-05-29 | 2021-06-25 | 成都晓多科技有限公司 | Method and system for conversation understanding and answer configuration based on unsupervised conversation pre-training |
CN113919293A (en) * | 2021-09-29 | 2022-01-11 | 北京搜狗科技发展有限公司 | Formula recognition model training method and device |
CN114942986A (en) * | 2022-06-21 | 2022-08-26 | 平安科技(深圳)有限公司 | Text generation method and device, computer equipment and computer readable storage medium |
CN114942986B (en) * | 2022-06-21 | 2024-03-19 | 平安科技(深圳)有限公司 | Text generation method, text generation device, computer equipment and computer readable storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11423233B2 (en) | On-device projection neural networks for natural language understanding | |
CN110782870B (en) | Speech synthesis method, device, electronic equipment and storage medium | |
Kamath et al. | Deep learning for NLP and speech recognition | |
CN110326002B (en) | Sequence processing using online attention | |
CN111382257A (en) | Method and system for generating dialog context | |
CN111312245B (en) | Voice response method, device and storage medium | |
CN109508377A (en) | Text feature extraction method and device based on a fusion model, chat robot, and storage medium | |
CN112214591B (en) | Dialog prediction method and device | |
US11132994B1 (en) | Multi-domain dialog state tracking | |
JP7229345B2 (en) | Sentence processing method, sentence decoding method, device, program and device | |
US11961515B2 (en) | Contrastive Siamese network for semi-supervised speech recognition | |
WO2023231513A1 (en) | Conversation content generation method and apparatus, and storage medium and terminal | |
Mathur et al. | A scaled‐down neural conversational model for chatbots | |
Pieraccini | AI assistants | |
Hsueh et al. | A Task-oriented Chatbot Based on LSTM and Reinforcement Learning | |
CN116611459B (en) | Translation model training method and device, electronic equipment and storage medium | |
CN114373443A (en) | Speech synthesis method and apparatus, computing device, storage medium, and program product | |
CN112150103B (en) | Schedule setting method, schedule setting device and storage medium | |
CN117980915A (en) | Contrast learning and masking modeling for end-to-end self-supervised pre-training | |
CN115204181A (en) | Text detection method and device, electronic equipment and computer readable storage medium | |
KR20230146398A (en) | Sequence text summary processing device using bart model and control method thereof | |
Li et al. | Audio-LLM: Activating the Capabilities of Large Language Models to Comprehend Audio Data | |
CN117521674B (en) | Method, device, computer equipment and storage medium for generating countermeasure information | |
CN117727288B (en) | Speech synthesis method, device, equipment and storage medium | |
CN115577084B (en) | Prediction method and prediction device for dialogue strategy |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200707 |