CN110929476B - Task type multi-round dialogue model construction method based on mixed granularity attention mechanism - Google Patents

Task type multi-round dialogue model construction method based on mixed granularity attention mechanism

Info

Publication number
CN110929476B
Authority
CN
China
Prior art keywords
word
decoder
layer
granularity
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910929777.9A
Other languages
Chinese (zh)
Other versions
CN110929476A (en)
Inventor
仇婕
王鹏
马婷婷
窦海波
高玮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unit 63626 of PLA
Original Assignee
Unit 63626 of PLA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Unit 63626 of PLA
Priority to CN201910929777.9A
Publication of CN110929476A
Application granted
Publication of CN110929476B
Active legal status
Anticipated expiration of legal status

Landscapes

  • Machine Translation (AREA)

Abstract

The invention provides a task-oriented multi-turn dialogue model construction method based on a mixed-granularity attention mechanism, comprising the following steps. S1: preprocess the text by word segmentation, stop-word removal, word-vector encoding and the like; S2: use an input encoder to encode the converted high-dimensional vectors into sentence vectors that memorize the details of the dialogue; S3: encode the sentence vectors with a context encoder; S4: combine the output of the context encoding layer with sentence-granularity attention to obtain the context encoding; S5: feed the output of step S4 into the first layer of the output decoder, which performs the decoding; S6: calculate the word-granularity attention value; S7: combine the output of the first decoder layer with the word-granularity attention value calculated in step S6, map the output generated by the decoder to the dimension of the vocabulary size, and output the result. On real data sets the method greatly improves the accuracy of reply generation in multi-turn dialogue tasks.

Description

Task type multi-round dialogue model construction method based on mixed granularity attention mechanism
Technical Field
The invention relates to the field of natural language processing, in particular to a task type multi-round dialogue model construction method based on a mixed granularity attention mechanism.
Background
A task-oriented multi-turn dialogue system in natural language processing fulfils user requirements in specific domains, with typical applications such as helping users navigate, find products, check the weather, schedule appointments and book flights; it is an important way of realizing human-computer interaction. Such a system can markedly reduce labor costs and improve service efficiency, offers a more convenient and natural way for people to obtain information, and therefore has clear practical value and important application prospects.
Research on task-oriented multi-turn dialogue systems in recent years falls into two categories: traditional pipeline-based systems and end-to-end systems. A traditional pipeline system solves each module, from input to output, with a different method and model. This modular approach works well in practice, but it faces several challenges: first, because the modules are independent of one another, information from lower layers is difficult to feed back to upper modules; second, traditional pipeline systems scale poorly to new domains; third, most of them require large amounts of manually annotated, task-specific corpora. In recent years many researchers have therefore applied sequence-to-sequence models to task-oriented multi-turn dialogue to obtain end-to-end solutions. Although end-to-end dialogue systems are more extensible and more resistant to error propagation than traditional pipeline systems, the standard sequence-to-sequence model they adopt cannot model historical dialogue information well, and in a dialogue system constructing context information is critical. To make the sentences generated by end-to-end systems better fit the characteristics of multi-turn dialogue, a number of solutions have been proposed from different perspectives. Some researchers introduce a knowledge base into the dialogue system: Wen et al. (2014) built an end-to-end trainable dialogue system on this idea. Although this reduces manual intervention to some extent, it requires structured knowledge-base data for the relevant professional domain, which is hard to obtain and usually needs analysis by domain experts. Mou et al. (2016) use statistical methods to compute the topic words that should appear in the response, which reduces meaningless replies, but for a multi-turn dialogue system a single topic word is clearly insufficient at the semantic level. Serban et al. (2017) proposed the Hierarchical Recurrent Encoder-Decoder (HRED) model, whose core idea is to add a context encoder on top of the standard sequence-to-sequence model to encode the dialogue history. For modeling multi-turn dialogue this hierarchical structure has advantages over conventional sequence-to-sequence models: during back-propagation, the context vector of a flat sequence-to-sequence structure is gradually diluted by the information of new utterances, whereas HRED, with its separate context encoder that models the dialogue history from a global perspective, captures semantic information better. Serban et al. (2017) further proposed the VHRED model, which introduces Gaussian random variables into the context information on top of HRED to increase the diversity of replies.
Disclosure of Invention
The invention provides a task-oriented multi-turn dialogue model construction method based on a mixed-granularity attention mechanism. Based on the hierarchical structure of multi-turn dialogue, which consists of sentences and words, the invention designs a mixed attention mechanism over the sentence granularity and the word granularity of the model. Sentence-granularity attention focuses on global information such as the overall context and intention of the multi-turn dialogue, while word-granularity attention focuses on details; combining the two extracts more effective context information at different levels, so that the generated replies are more meaningful. In addition, the invention draws on five existing key modeling and training techniques. To make the model more accurate, a multi-layer network structure is adopted, which in turn raises the risk of vanishing or exploding gradients, so the model introduces a residual connection in every sub-layer. Because the data set is of limited size, a complex model tends to overfit; a dropout mechanism and label smoothing are combined to control overfitting from different angles. To make training of the deep network more stable, layer normalization is applied to the input of every layer of the model. Considering the diversity of responses in a multi-turn dialogue system, beam search is used to find the most probable response sequence. These five techniques are combined organically and further tuned to obtain a new hybrid model, greatly improving the accuracy of reply generation of the end-to-end task-oriented dialogue model.
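For illustration only, a minimal PyTorch-style sketch of one such sub-layer wrapper is given below; it combines the residual connection, layer normalization and dropout described above. It is not the patented implementation, and the module name, argument names and default dropout probability are assumptions made for the example.

import torch
import torch.nn as nn

class ResidualSubLayer(nn.Module):
    """Wraps an arbitrary sub-layer with layer norm, dropout and a residual connection."""
    def __init__(self, hidden_size: int, dropout: float = 0.2):
        super().__init__()
        self.norm = nn.LayerNorm(hidden_size)   # layer normalization applied to the sub-layer input
        self.dropout = nn.Dropout(dropout)      # dropout applied to the sub-layer output

    def forward(self, x: torch.Tensor, sublayer: nn.Module) -> torch.Tensor:
        # residual connection: output = x + Dropout(SubLayer(LayerNorm(x)))
        return x + self.dropout(sublayer(self.norm(x)))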
A task type multi-round dialogue model construction method based on a mixed granularity attention mechanism comprises the following steps:
S1: the input natural text X_1, X_2, ..., X_N is put through a series of natural language processing steps such as stop-word removal and word embedding, so that each word in a sentence is converted into a fixed-length vector representation;
S2: an input encoder encodes the converted high-dimensional vectors into sentence vectors E_1, ..., E_N that memorize the details of the dialogue, where M denotes the number of layers of the encoder and decoder;
S3: the sentence vectors serve as the input at each time step of the context encoder, which encodes them: (h_1, ..., h_t) = RNNContextEncoder(E_1, ..., E_N);
S4: the output of the context encoding layer is combined with a sentence-granularity attention mechanism to obtain the context encoding;
S5: the output of step S4 serves as the input to the first layer of the output decoder, which decodes it: D_1 = RNNDecoder_1(v);
S6: calculating an attention value of word granularity;
S7: the output of the first decoder layer is combined with the word-granularity attention value calculated in step S6, decoding proceeds step by step until the decoder generates the terminator, the output generated by the decoder is mapped to the dimension of the vocabulary size, and the result is output.
Further, the specific process of step S4 is:
S4.1: introduce a sentence vector u_s, initialized by random assignment; apply a nonlinear transformation to h_i to obtain u_i:
u_i = tanh(W_s h_i + b_s)
S4.2: compute the similarity between u_i and u_s as a weight, and obtain the normalized weight α_i after softmax:
α_i = exp(u_i^T u_s) / Σ_j exp(u_j^T u_s)
S4.3: take the weighted average of the h_i to obtain the final context vector v:
v = Σ_i α_i h_i
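As an illustration of steps S4.1-S4.3, a minimal PyTorch-style sketch of the sentence-granularity attention follows. It assumes the context-encoder states h_1, ..., h_N are stacked into a single tensor; the class and parameter names are assumptions for the example, not taken from the patent.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceAttention(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, hidden_size)     # W_s and b_s of step S4.1
        self.u_s = nn.Parameter(torch.randn(hidden_size))   # randomly initialized sentence vector u_s

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (N, hidden) stacked context-encoder states h_1..h_N
        u = torch.tanh(self.proj(h))                         # S4.1: u_i = tanh(W_s h_i + b_s)
        alpha = F.softmax(u @ self.u_s, dim=0)               # S4.2: normalized weights alpha_i
        return (alpha.unsqueeze(-1) * h).sum(dim=0)          # S4.3: v = sum_i alpha_i * h_i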
Further, the specific process of step S6 is:
S6.1: to obtain information from different subspaces, D_1 and E_N are first passed through different linear transformations to obtain (Query_1, Value_1, Key_1), ..., (Query_N, Value_N, Key_N). With the three trainable weight matrices W_i^Q, W_i^K and W_i^V of the i-th linear transformation, the i-th set of values is computed as:
Query_i = D_1 · W_i^Q
Key_i = E_N · W_i^K
Value_i = E_N · W_i^V
S6.2: the scaled dot product is then computed, where dim is the dimension of Key_i; the i-th value is:
Att_i = softmax(Query_i · Key_i^T / √dim) · Value_i
S6.3: finally, the N values computed in step S6.2 are concatenated, and a simple linear transformation yields the multi-head attention value of the desired dimension:
mulAttention = concat(Att_1, ..., Att_N) · W_out
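A minimal PyTorch-style sketch of the word-granularity multi-head attention of step S6 follows, assuming the first decoder layer output D_1 supplies the queries and the encoder word states supply the keys and values; the head count, tensor shapes and module names are illustrative assumptions.

import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class WordGranularityAttention(nn.Module):
    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        assert hidden_size % num_heads == 0
        self.dim = hidden_size // num_heads
        self.num_heads = num_heads
        # S6.1: per-head linear transformations for queries, keys and values
        self.w_q = nn.Linear(hidden_size, hidden_size)
        self.w_k = nn.Linear(hidden_size, hidden_size)
        self.w_v = nn.Linear(hidden_size, hidden_size)
        self.w_out = nn.Linear(hidden_size, hidden_size)     # W_out of S6.3

    def forward(self, d1: torch.Tensor, e_n: torch.Tensor) -> torch.Tensor:
        # d1: (T_dec, hidden) first decoder layer output; e_n: (T_enc, hidden) encoder word states
        def split(x):
            return x.view(x.size(0), self.num_heads, self.dim).transpose(0, 1)
        q, k, v = split(self.w_q(d1)), split(self.w_k(e_n)), split(self.w_v(e_n))
        # S6.2: scaled dot-product attention per head
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.dim)   # (heads, T_dec, T_enc)
        att = F.softmax(scores, dim=-1) @ v                      # (heads, T_dec, dim)
        # S6.3: concatenate the heads and apply a final linear map
        att = att.transpose(0, 1).contiguous().view(d1.size(0), -1)
        return self.w_out(att)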
Further, the specific process of step S7 is:
S7.1: initialize the decoding input d_0 = d_initial;
S7.2: decoding proceeds step by step through the second to last layers of the decoder, where L_maxsize denotes the maximum length of the generated reply and the decoding step j satisfies 0 < j ≤ L_maxsize (the layer-by-layer decoding formula appears only as an image in the original);
S7.3: map the output generated by the decoder to the dimension of the vocabulary size through a linear transformation;
S7.4: obtain the distribution of the step-j output over the vocabulary through softmax normalization;
S7.5: find the vocabulary ID corresponding to the word with the highest probability at each step;
S7.6: convert the word IDs into a readable character string;
S7.7: when the decoder generates the terminator, decoding stops and the concatenated words form the reply of round N+1:
Y = join(y_1, y_2, ..., y_end)
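A minimal sketch of the output stage of step S7 follows: the decoder output is projected to the vocabulary size, normalized, the most probable word ID is taken at each step, and the IDs are joined into the reply once the terminator appears. The projection layer, vocabulary mapping and end-of-sequence ID are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

def decode_to_reply(decoder_outputs: torch.Tensor,
                    out_proj: nn.Linear,
                    id2word: dict,
                    eos_id: int) -> str:
    logits = out_proj(decoder_outputs)        # S7.3: (T, hidden) -> (T, |V|)
    probs = F.softmax(logits, dim=-1)         # S7.4: distribution over the vocabulary at each step
    ids = probs.argmax(dim=-1).tolist()       # S7.5: most probable word ID per step
    words = []
    for i in ids:                             # S7.6-S7.7: convert IDs and stop at the terminator
        if i == eos_id:
            break
        words.append(id2word[i])
    return "".join(words)                     # "".join for Chinese text; " ".join would suit English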
The end-to-end task oriented dialogue model construction method based on the mixed attention mechanism has the following advantages:
1. The invention uses a mixed-granularity attention mechanism: sentence-granularity attention focuses on global information such as the overall context and intention of the multi-turn dialogue, while word-granularity attention attends to finer details; combining the two extracts more effective context information at different levels, so that the generated replies are more meaningful.
2. The end-to-end task-oriented model provided by the invention needs only raw text data and can automatically assemble a relatively large training set through simple preprocessing, ensuring the data-scaling capability of the system. This effectively alleviates the problems that task-oriented multi-turn dialogue systems lack data and that most training sets require expensive manual annotation.
3. The invention is evaluated on two real data sets, the Jingdong (JD) customer-service data set and the Ubuntu Dialogue Corpus, and the experimental results are superior to previous end-to-end dialogue system models, demonstrating the effectiveness of the model in practical application scenarios. In addition, thanks to its end-to-end structure and the absence of any manual-annotation requirement, the model is easy to deploy and highly transferable, overcoming the shortcomings of traditional task-oriented multi-turn dialogue models.
Drawings
FIG. 1 is a schematic diagram of a task-based multi-turn dialogue model construction process based on a mixed-granularity attention mechanism according to the present invention;
FIG. 2 is a schematic overall structure diagram of an embodiment of the present invention;
FIG. 3 is a block diagram of the model computation in a three-turn dialogue scenario in accordance with an embodiment of the present invention;
FIG. 4 compares the experimental results of seq2seq, HRED, VHRED and the model of the present invention on the JD customer-service dataset and the Ubuntu Dialogue Corpus dataset, respectively;
FIG. 5 shows sample multi-turn dialogue replies used to qualitatively analyze and compare the characteristics of the replies generated by the various methods in a multi-turn dialogue scenario;
FIG. 6 illustrates the effect of various optimization techniques on model performance;
FIG. 7 shows the effect of beam size on model performance.
Detailed Description
Specific embodiments of the present invention are described below in conjunction with the accompanying drawings so that those skilled in the art can better understand the present invention.
An end-to-end task oriented dialogue model construction method based on a mixed attention mechanism comprises the following steps:
S1: the Jingdong dialogue data set and the Ubuntu Dialogue Corpus multi-turn dialogue data set are taken as the data sets of this example. The input natural text X_1, X_2, ..., X_N is preprocessed by stop-word removal, word embedding and the like (the Jingdong dialogue data set additionally requires jieba word segmentation), and each word of the input text is converted into a high-dimensional vector. A dictionary with a vocabulary of 21,000 entries is built. The word vectors for the Jingdong customer-service data set are initialized with wiki-encyclopedia pre-trained word vectors, those for the Ubuntu Dialogue Corpus with Google News training data, and the word-vector dimension is set to 300. Each word in a sentence is thus converted into a fixed-length vector representation.
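A minimal sketch of this preprocessing for the Chinese data set follows, assuming jieba segmentation, a stop-word list and a pre-built word-to-ID vocabulary of about 21,000 entries; the function and variable names are illustrative.

import jieba

def preprocess(utterance: str, stopwords: set, word2id: dict, unk_id: int = 0) -> list:
    # word segmentation followed by stop-word removal
    tokens = [t for t in jieba.cut(utterance) if t not in stopwords]
    # map each remaining word to its vocabulary ID
    return [word2id.get(t, unk_id) for t in tokens]

# The resulting ID sequence indexes an embedding table, e.g. nn.Embedding(21000, 300),
# to obtain the fixed-length 300-dimensional word vectors described above.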
S2: the converted high-dimensional vectors are encoded into sentence vectors E_1, ..., E_N by the input encoder, which memorizes the details of the conversation. The encoder and decoder adopt a six-layer structure, and the context encoder has a single layer. The encoder, the context encoder and the decoder all use Gated Recurrent Units (GRUs).
S3: the sentence vectors serve as the input at each time step of the context encoder, which encodes them:
(h_1, ..., h_t) = GRUContextEncoder(E_1, ..., E_N)
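A minimal PyTorch-style sketch of steps S2-S3 follows: a GRU utterance encoder produces one sentence vector per utterance, and a single-layer GRU context encoder consumes the sequence of sentence vectors. The hidden size and the exact way the sentence vector is read out are assumptions for the example.

import torch
import torch.nn as nn

class HierarchicalEncoder(nn.Module):
    def __init__(self, embed_dim: int = 300, hidden: int = 512):
        super().__init__()
        self.utt_encoder = nn.GRU(embed_dim, hidden, num_layers=6, batch_first=True)  # S2: six-layer GRU input encoder
        self.ctx_encoder = nn.GRU(hidden, hidden, num_layers=1, batch_first=True)     # S3: single-layer GRU context encoder

    def forward(self, utterances: list) -> torch.Tensor:
        # utterances: list of N tensors, each of shape (1, T_n, embed_dim)
        sentence_vecs = []
        for u in utterances:
            _, h_n = self.utt_encoder(u)        # encode one utterance
            sentence_vecs.append(h_n[-1])       # final state of the last layer as the sentence vector E_n
        e = torch.stack(sentence_vecs, dim=1)   # (1, N, hidden)
        h, _ = self.ctx_encoder(e)              # dialogue-history states h_1..h_t
        return h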
S4: the output of the context encoding layer is combined with the sentence-granularity attention mechanism to obtain the context encoding;
the specific steps of step S4 are:
S4.1: introduce a sentence vector u_s, initialized by random assignment; apply a nonlinear transformation to h_i to obtain u_i, with the formula:
u_i = tanh(W_s h_i + b_s)
S4.2: compute the similarity between u_i and u_s as a weight, and obtain the normalized weight α_i after softmax:
α_i = exp(u_i^T u_s) / Σ_j exp(u_j^T u_s)
S4.3: take the weighted average of the h_i to obtain the final context vector v:
v = Σ_i α_i h_i
S5: the output of step S4 is used as the input of the first layer of the output decoder, which carries out the decoding:
D_1 = GRUDecoder_1(v)
S6: calculate the word-granularity attention value;
the specific step of step S6 is:
S6.1: to obtain information from different subspaces, D_1 and E_N are first passed through different linear transformations to obtain (Query_1, Value_1, Key_1), ..., (Query_N, Value_N, Key_N). Three different weight matrices are trained for the i-th linear transformation, and the formulas for the i-th computation are:
Query_i = D_1 · W_i^Q
Key_i = E_N · W_i^K
Value_i = E_N · W_i^V
S6.2: the scaled dot product is then computed, where dim is the dimension of Key_i; the i-th value is:
Att_i = softmax(Query_i · Key_i^T / √dim) · Value_i
S6.3: finally, the N values computed in step S6.2 are concatenated, and a simple linear transformation yields the multi-head attention value of the desired dimension:
mulAttention = concat(Att_1, ..., Att_N) · W_out
S7: the output of the first decoder layer is combined with the word-granularity attention value calculated in step S6, decoding proceeds step by step until the decoder generates the terminator, the output generated by the decoder is mapped to the dimension of the vocabulary size, and the result is output.
The specific step of step S7 is:
S7.1: initialize the decoding input d_0 = d_initial;
S7.2: decoding proceeds step by step through the second to sixth layers of the decoder, where L_maxsize denotes the maximum length of the generated reply and the decoding step j satisfies 0 < j ≤ L_maxsize (the layer-by-layer decoding formula appears only as an image in the original);
S7.3: map the output generated by the decoder to the dimension of the vocabulary size through a linear transformation;
S7.4: obtain the distribution of the step-j output over the vocabulary through softmax normalization;
S7.5: find the vocabulary ID corresponding to the word with the highest probability at each step;
S7.6: convert the word IDs into a readable character string;
S7.7: when the decoder generates the terminator, decoding stops and the concatenated words form the reply of round N+1:
Y = join(y_1, y_2, ..., y_end)
Under this framework the five key modeling and training techniques are applied: the model is optimized with Adam at a learning rate of 0.0001; a dropout probability of 0.2 is applied to the output of each sub-layer; residual connections and layer normalization are introduced; the beam size is 4; and uniform label smoothing with uncertainty ε = 0.1 is applied to the cross-entropy computed on the output.
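A minimal PyTorch-style sketch of this training configuration follows, using Adam with learning rate 1e-4 and label-smoothed cross-entropy with ε = 0.1; the placeholder model and vocabulary size are illustrative assumptions.

import torch
import torch.nn as nn

vocab_size = 21000
model = nn.Linear(512, vocab_size)                       # placeholder standing in for the full dialogue model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)     # uniform label smoothing, epsilon = 0.1

def train_step(hidden_states: torch.Tensor, target_ids: torch.Tensor) -> float:
    optimizer.zero_grad()
    logits = model(hidden_states)                         # (T, |V|) logits over the vocabulary
    loss = criterion(logits, target_ids)                  # label-smoothed cross-entropy
    loss.backward()
    optimizer.step()
    return loss.item()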
To verify the effectiveness of the invention, comparative experiments were performed on two data sets: the Jingdong multi-turn dialogue data set and the Ubuntu Dialogue Corpus multi-turn dialogue data set. The Jingdong dialogue data set, provided by the 2018 JD Dialog Challenge, is in Chinese and contains real dialogues between JD customers and JD human customer-service agents; it comprises 110,000 multi-turn dialogues with an average of 13 turns per dialogue. The Ubuntu Dialogue Corpus is a public English data set of multi-turn technical-support dialogues about Ubuntu-related problems; it comprises roughly one million dialogues with an average of 8 turns per dialogue.
This embodiment is compared on the test set with advanced end-to-end approaches published in recent years. The task is to generate a reply for a given dialogue segment. The deltaBLEU metric is adopted to evaluate the quality of the generated replies. The baseline models are a sequence-to-sequence model (seq2seq), the Hierarchical Recurrent Encoder-Decoder (HRED), and its variant VHRED.
FIG. 4 compares the experimental results of seq2seq, HRED, VHRED and the model of the invention on the JD customer-service data set and the Ubuntu Dialogue Corpus. For multi-turn dialogue, the model of the invention achieves the best results on both data sets; for single-turn dialogue the gap to the other methods narrows. The proposed model is therefore most advantageous in multi-turn dialogue scenarios.
Through the multi-turn dialogue reply samples in FIG. 5, the characteristics of the replies generated by the various methods in a multi-turn dialogue scenario are analyzed and compared qualitatively. Overall, given the context of a multi-turn conversation, the model of the invention creates more consistent and accurate responses and reduces the generation of meaningless replies. The context encoders of the baseline HRED and VHRED models can also exploit context information to some extent, but not as effectively as the present model. FIGS. 6 and 7 explore the effect of the various optimization techniques on model performance and training. Each technique is deleted individually, the resulting model is compared against the original model, and the results are observed; in FIG. 6, "-A" denotes deleting technique A, and "-" denotes that the training run was unstable. The comparison for the mixed attention mechanism shows that it greatly improves the quality of the replies generated by the model. The results for label smoothing, dropout and beam search show that all three have a positive influence on the model and slightly raise deltaBLEU; for beam search, a larger beam size gives better performance. The comparison for layer normalization shows that it is crucial for stabilizing training: without it, training is not stable enough and the training parameters would have to be re-tuned, so its influence cannot be quantified and compared. The comparison for residual connections shows that when the depth reaches six layers, the model without residual connections is far worse than even a single-layer model, because the multi-layer network suffers from vanishing and exploding gradients, which residual connections largely resolve. The experimental results show that these key modeling and training techniques all benefit the model to varying degrees.
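For reference, a minimal sketch of beam search as used for reply generation follows, assuming a step function that returns log-probabilities over the vocabulary for the next word given a partial hypothesis; the beam width of 4 matches the setting above, and everything else is an assumption for the example.

def beam_search(step_fn, bos_id: int, eos_id: int, beam_size: int = 4, max_len: int = 50):
    # each hypothesis is a pair (token_ids, cumulative log-probability)
    beams = [([bos_id], 0.0)]
    for _ in range(max_len):
        candidates = []
        for ids, score in beams:
            if ids[-1] == eos_id:                  # finished hypotheses are carried over unchanged
                candidates.append((ids, score))
                continue
            log_probs = step_fn(ids)               # mapping: next token id -> log-probability
            for tok, lp in log_probs.items():
                candidates.append((ids + [tok], score + lp))
        # keep only the beam_size best partial replies
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
        if all(ids[-1] == eos_id for ids, _ in beams):
            break
    return beams[0][0]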
The above quantitative and qualitative analyses show that the model proposed by the invention makes better use of context information, is well suited to multi-turn dialogue scenarios, and outperforms previous end-to-end models.

Claims (4)

1. A task-type multi-round dialogue model construction method based on a mixed-granularity attention mechanism is characterized by comprising the following steps of:
S1: the input natural text X_1, X_2, ..., X_N is put through a series of natural language processing steps such as stop-word removal and word embedding, so that each word in a sentence is converted into a fixed-length vector representation;
S2: an input encoder encodes the converted high-dimensional vectors into sentence vectors E_1, ..., E_N that memorize the details of the dialogue, where M denotes the number of layers of the encoder and decoder;
S3: the sentence vectors serve as the input at each time step of the context encoder, which encodes them: (h_1, ..., h_t) = RNNContextEncoder(E_1, ..., E_N);
S4: the output of the context encoding layer is combined with a sentence-granularity attention mechanism to obtain the context encoding;
S5: the output of step S4 serves as the input to the first layer of the output decoder, which decodes it: D_1 = RNNDecoder_1(v);
S6: calculating an attention value of word granularity;
S7: the output of the first decoder layer is combined with the word-granularity attention value calculated in step S6, decoding proceeds step by step until the decoder generates the terminator, the output generated by the decoder is mapped to the dimension of the vocabulary size, and the result is output.
2. The task-based multi-turn dialogue model construction method based on the mixed-granularity attention mechanism according to claim 1, wherein the specific process of step S4 is as follows:
S4.1: introduce a sentence vector u_s, initialized by random assignment; apply a nonlinear transformation to h_i to obtain u_i:
u_i = tanh(W_s h_i + b_s);
S4.2: compute the similarity between u_i and u_s as a weight, and obtain the normalized weight α_i after softmax:
α_i = exp(u_i^T u_s) / Σ_j exp(u_j^T u_s);
S4.3: take the weighted average of the h_i to obtain the final context vector v,
v = Σ_i α_i h_i
3. the task-based multi-turn dialogue model construction method based on the mixed-granularity attention mechanism according to claim 2, wherein the specific process of step S6 is as follows:
S6.1: to obtain information from different subspaces, D_1 and E_N are first passed through different linear transformations to obtain (Query_1, Value_1, Key_1), ..., (Query_N, Value_N, Key_N); with the trainable weight matrices W_i^Q, W_i^K and W_i^V of the i-th linear transformation, the i-th set of values is computed as:
Query_i = D_1 · W_i^Q
Key_i = E_N · W_i^K
Value_i = E_N · W_i^V
S6.2: the scaled dot product is then computed, where dim is the dimension of Key_i; the i-th value is:
Att_i = softmax(Query_i · Key_i^T / √dim) · Value_i
S6.3: finally, the N values computed in step S6.2 are concatenated, and a simple linear transformation yields the multi-head attention value of the desired dimension,
mulAttention = concat(Att_1, ..., Att_N) · W_out
4. the method according to claim 3, wherein the specific process of step S7 is as follows:
S7.1: initialize the decoding input d_0 = d_initial;
S7.2: decoding proceeds step by step through the second to last layers of the decoder, where L_maxsize denotes the maximum length of the generated reply and the decoding step j satisfies 0 < j ≤ L_maxsize (the layer-by-layer decoding formula appears only as an image in the original);
S7.3: map the output generated by the decoder to the dimension of the vocabulary size through a linear transformation;
S7.4: obtain the distribution of the step-j output over the vocabulary through normalization;
S7.5: find the vocabulary ID corresponding to the word with the highest probability at each step;
S7.6: convert the word IDs into a readable character string;
S7.7: when the decoder generates the terminator, decoding stops and the concatenated words form the reply of round N+1, Y = join(y_1, y_2, ..., y_end).
CN201910929777.9A 2019-09-27 2019-09-27 Task type multi-round dialogue model construction method based on mixed granularity attention mechanism Active CN110929476B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910929777.9A CN110929476B (en) 2019-09-27 2019-09-27 Task type multi-round dialogue model construction method based on mixed granularity attention mechanism

Publications (2)

Publication Number Publication Date
CN110929476A CN110929476A (en) 2020-03-27
CN110929476B true CN110929476B (en) 2022-09-30

Family

ID=69849047

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910929777.9A Active CN110929476B (en) 2019-09-27 2019-09-27 Task type multi-round dialogue model construction method based on mixed granularity attention mechanism

Country Status (1)

Country Link
CN (1) CN110929476B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112163080A (en) * 2020-10-12 2021-01-01 辽宁工程技术大学 Generation type dialogue system based on multi-round emotion analysis
CN112417125B (en) * 2020-12-01 2023-03-24 南开大学 Open domain dialogue reply method and system based on deep reinforcement learning
CN113868395A (en) * 2021-10-11 2021-12-31 北京明略软件系统有限公司 Multi-round dialogue generation type model establishing method and system, electronic equipment and medium
CN114357129B (en) * 2021-12-07 2023-02-14 华南理工大学 High-concurrency multi-round chat robot system and data processing method thereof

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108763504A (en) * 2018-05-30 2018-11-06 浙江大学 It is a kind of that generation method and system are replied based on the dialogue for strengthening binary channels Sequence Learning
CN110134771A (en) * 2019-04-09 2019-08-16 广东工业大学 A kind of implementation method based on more attention mechanism converged network question answering systems

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于分层编码的深度增强学习对话生成 (Dialogue generation via deep reinforcement learning based on hierarchical encoding); 赵宇晴 et al.; 《计算机应用》 (Journal of Computer Applications); 2017-10-10 (No. 10); full text *

Also Published As

Publication number Publication date
CN110929476A (en) 2020-03-27

Similar Documents

Publication Publication Date Title
CN110929476B (en) Task type multi-round dialogue model construction method based on mixed granularity attention mechanism
US11194972B1 (en) Semantic sentiment analysis method fusing in-depth features and time sequence models
CN108153913B (en) Training method of reply information generation model, reply information generation method and device
CN111858932A (en) Multiple-feature Chinese and English emotion classification method and system based on Transformer
CN113158665A (en) Method for generating text abstract and generating bidirectional corpus-based improved dialog text
CN112989796B (en) Text naming entity information identification method based on syntactic guidance
CN111125333B (en) Generation type knowledge question-answering method based on expression learning and multi-layer covering mechanism
CN114116994A (en) Welcome robot dialogue method
CN110717341B (en) Method and device for constructing old-Chinese bilingual corpus with Thai as pivot
CN113536804B (en) Natural language feature extraction method based on keyword enhancement GRU and Kronecker
CN114881042B (en) Chinese emotion analysis method based on graph-convolution network fusion of syntactic dependency and part of speech
Huang et al. End-to-end sequence labeling via convolutional recurrent neural network with a connectionist temporal classification layer
CN113392265A (en) Multimedia processing method, device and equipment
CN113515619A (en) Keyword generation method based on significance information gating mechanism
CN115630145A (en) Multi-granularity emotion-based conversation recommendation method and system
CN115545033A (en) Chinese field text named entity recognition method fusing vocabulary category representation
CN115512195A (en) Image description method based on multi-interaction information fusion
CN114387537A (en) Video question-answering method based on description text
CN117332789A (en) Semantic analysis method and system for dialogue scene
CN113239663B (en) Multi-meaning word Chinese entity relation identification method based on Hopkinson
CN117235261A (en) Multi-modal aspect-level emotion analysis method, device, equipment and storage medium
CN115376547B (en) Pronunciation evaluation method, pronunciation evaluation device, computer equipment and storage medium
CN110738989A (en) method for solving automatic recognition task of location-based voice by using end-to-end network learning of multiple language models
CN114896969A (en) Method for extracting aspect words based on deep learning
Ghorpade et al. ITTS model: speech generation for image captioning using feature extraction for end-to-end synthesis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant