CN113626566B - Knowledge dialogue cross-domain learning method based on synthetic data - Google Patents

Knowledge dialogue cross-domain learning method based on synthetic data

Info

Publication number
CN113626566B
CN113626566B (Application CN202110763112.2A)
Authority
CN
China
Prior art keywords
dialogue
knowledge
cross
corpus
entity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110763112.2A
Other languages
Chinese (zh)
Other versions
CN113626566A (en)
Inventor
魏凯敏
林健成
张继连
刘志全
冯丙文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinan University
Original Assignee
Jinan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinan University filed Critical Jinan University
Priority to CN202110763112.2A
Publication of CN113626566A
Application granted
Publication of CN113626566B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a knowledge dialogue cross-domain learning method based on synthetic data. To address the shortage of data resources in cross-domain learning for knowledge dialogue systems, the method provides the following strategies: for question answering, synthetic data are constructed jointly by templates and a multi-turn dialogue generation model; for catastrophic forgetting, synthetic data are constructed by a knowledge-retention and template method; and to exploit mismatched dialogue corpora, synthetic data are constructed by retrieval, filtering, ranking, and similar methods. A model trained on the synthetic data approaches the performance of a model trained on manually annotated data, effectively relieving the dependence of knowledge dialogue cross-domain learning on data resources.

Description

Knowledge dialogue cross-domain learning method based on synthetic data
Technical Field
The invention relates to the technical field of natural language processing, in particular to a knowledge dialogue cross-domain learning method based on synthetic data.
Background
In the field of dialogue systems, knowledge dialogue is widely used to generate replies that are more informative and more convincing, and it is applicable both to dialogue robots that address users' emotional needs and to dialogue robots that accomplish specific tasks. However, current knowledge dialogue systems share a common problem: because the corpora they are trained on age quickly, they perform poorly when facing new domains. In a new domain, training data are expensive to collect, and often only a little data, or even no data at all, is available. This makes cross-domain learning very difficult for a deployed knowledge dialogue system. According to how knowledge is organized, it is divided into structured and unstructured knowledge; structured knowledge usually exists in the form of knowledge-graph triples, and the invention concerns knowledge dialogue systems that use structured knowledge.
Current research on knowledge dialogue systems focuses on how to better exploit knowledge for dialogue generation within limited domains. How a knowledge dialogue system can be updated online after deployment has received little study, so current applications of knowledge dialogue systems remain rather limited.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provides a knowledge dialogue cross-domain learning method based on synthetic data.
The aim of the invention can be achieved by adopting the following technical scheme:
the method realizes the cross-domain learning of the structured knowledge dialogue system by constructing the synthetic data aiming at four scenes including question-answering, boring, disastrous forgetting and only having mismatched dialogue corpus, the method aims at the realization process of different scenes as follows,
(1) The steps of cross-domain learning for question and answer scenes are as follows:
s11, manually presetting a template;
s12, for any new domain knowledge triplet (Entity, attr, value), wherein the Entity represents a certain Entity in the real world, such as a specific movie name, attr represents an attribute of the Entity, such as a director, the movie Entity has an actor attribute, and Value is a specific Value of the attribute, and the attribute is substituted into the position of the template to obtain a piece of synthesized data;
s13, repeating the step S11 and the step S12 until the knowledge in the new field is completely constructed into corresponding synthetic data;
(2) The steps of cross-domain learning for the chit-chat scenario are as follows:
S21, pre-train a DialoGPT dialogue model on a large-scale dialogue corpus;
S22, aggregate the new-domain knowledge triples {Entity, Attr, Value} by Entity to obtain a number of groups G;
S23, for each group, use "Why don't you talk with me about {Entity}?" "Sure." as the opening of a multi-turn dialogue;
S24, generate a random number p with 0 ≤ p ≤ 1;
S25, if the random number p generated in step S24 is greater than 0.5, continue the multi-turn dialogue opened in step S23 using the DialoGPT dialogue model;
S26, if the random number p generated in step S24 is less than 0.5, continue the multi-turn dialogue using templates;
S27, repeat steps S24-S26 until every knowledge triple {Entity, Attr, Value} in the group has been covered by the dialogue, generating the corresponding synthetic data;
S28, repeat steps S22-S27 until all knowledge in the new domain is covered;
(3) The steps of cross-domain learning for the catastrophic-forgetting scenario are as follows:
S31, for the N pieces of new-domain knowledge to be learned, construct N pieces of synthetic dialogue data using templates;
S32, randomly sample N pieces of knowledge, together with their dialogue corpus, from the old domain;
S33, for the N pieces of knowledge sampled in step S32, construct N pieces of synthetic dialogue data using templates;
S34, mix the synthetic data constructed in steps S31 and S33 with the dialogue corpus sampled in step S32 to form a new data set;
(4) Synthetic data for the scenario where only mismatched dialogue corpora exist are constructed as follows:
S41, label the dialogue corpus collected from social networks, taking the dialogue context as context and the corresponding dialogue reply as response; then segment the response part into words and build an inverted index with a tool, obtaining a database D;
S42, pre-train a BERT model on the data set constructed in step S41;
S43, segment the N pieces of new-domain knowledge to be learned and the corresponding short sentences with the jieba word segmentation tool; jieba is a Chinese word segmentation tool that is simple to install and can be downloaded from https://github.com/fxsjy/jieba;
S44, search database D using the Value of each knowledge triple (Entity, Attr, Value) in the segmentation result as the keyword, and return the top-50 dialogue contexts by relevance score;
S45, filter the retrieved dialogue contexts;
S46, score the remaining contexts with the BERT model trained in step S42, and combine the highest-scoring context with the corresponding short sentence from step S43 to form the synthetic data.
Further, when the scenario is question answering, the manually preset template is "Do you know the {Attr} of {Entity}?" "Yes, it is {Value}."; here {Entity}, {Attr} and {Value} are the filling positions corresponding to the knowledge triple (Entity, Attr, Value).
Further, when the scenario is chit-chat, the DialoGPT model is trained on the public data set LCCC;
The DialoGPT model comprises 12 Transformer layers, each with 12 attention heads and 768-dimensional hidden states;
The loss function of the DialoGPT model is:
-log P(y_n | y_0, y_1, ..., y_{n-1}, C)
where C is the context in the dialogue corpus, y_0, ..., y_{n-1} are the characters already generated in the reply, and y_n is the character currently to be generated.
Further, for the scenario where only mismatched dialogue corpora exist, the BERT model comprises 12 Transformer layers, each with 12 attention heads and 768-dimensional hidden states;
The loss function of the BERT model is:
-log P(1 | c, r^+) - log P(0 | c, r^-)
where c is the context in the dialogue corpus, r^+ is a positive sample (the original response paired with the context in the corpus), and r^- is a negative sample (a sentence randomly drawn from the dialogue corpus).
Further, when only mismatched dialogue corpora exist, the filtering of the retrieved contexts in step S45 includes sensitive-word filtering and person-name filtering.
Compared with the prior art, the invention has the following advantages and effects:
1. since the method of using synthetic data construction supports online operation, the dialog model can update parameters online. Therefore, the invention can lead the learning dialogue system to still learn the new knowledge after deployment, and offline training and redeployment are not needed.
2. In the dialogue field, traditional data augmentation can only replace some characters in an existing dialogue corpus to diversify its expressions; when the new domain has zero dialogue corpus, it cannot be used at all. Because the synthetic data proposed here depend only on the domain knowledge, not on how much dialogue corpus is already available, the invention suits stricter data environments than traditional data augmentation, for example when the new domain has no knowledge dialogue corpus at all, or only text in non-dialogue form.
3. The invention supports synthesizing the data online while new knowledge is being learned, and the synthetic data are used to update the dialogue model online. They therefore need not be stored after learning, which reduces storage pressure.
Drawings
FIG. 1 is a flow chart of a method of knowledge dialogue cross-domain learning based on synthetic data as disclosed in the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Examples
As shown in FIG. 1, learning is first performed on the existing dialogue corpus to obtain an initial version of the knowledge dialogue system. While the system interacts with users online, the real world keeps producing new knowledge, such as newly released movies and freshly published news; the synthetic-data strategy of this embodiment constructs the corresponding synthetic data online and updates the model with it. This embodiment provides a knowledge dialogue cross-domain learning method based on synthetic data that targets four scenarios (question answering, chit-chat, catastrophic forgetting, and the case where only mismatched dialogue corpora are available) and realizes cross-domain learning of a structured knowledge dialogue system by constructing synthetic data.
In this embodiment the model needs no offline update: by constructing synthetic data with the four strategies, it can learn new-domain knowledge online. After learning finishes, the model that has absorbed the new-domain knowledge is saved, and the synthetic data constructed along the way can be discarded, so storage pressure does not grow with the number of new domains. The specific steps are as follows:
(1) The steps of cross-domain learning for the question-answering scenario are as follows:
S11, manually preset a template;
The manually preset template is "Do you know the {Attr} of {Entity}?" "Yes, it is {Value}."; here {Entity}, {Attr} and {Value} are the filling positions corresponding to the knowledge triple (Entity, Attr, Value).
S12, substitute each new-domain knowledge triple (Entity, Attr, Value) into the template slots to obtain one piece of synthetic data;
S13, repeat steps S11 and S12 until all new-domain knowledge has been turned into corresponding synthetic data.
(2) The steps of cross-domain learning for the chit-chat scenario are as follows:
S21, use a Transformer-based dialogue generation model, DialoGPT, pre-trained on a large-scale dialogue corpus;
S22, aggregate the new-domain knowledge triples {Entity, Attr, Value} by Entity to obtain a number of groups G;
S23, for each group, use "Why don't you talk with me about {Entity}?" "Sure." as the opening of a multi-turn dialogue;
S24, generate a random number p with 0 ≤ p ≤ 1;
S25, if the random number p generated in step S24 is greater than 0.5, continue the multi-turn dialogue using the DialoGPT model mentioned in step S21;
S26, if the random number p generated in step S24 is less than 0.5, continue the multi-turn dialogue using templates;
S27, repeat steps S24-S26 until every knowledge triple {Entity, Attr, Value} in the group has been covered by the dialogue;
S28, repeat steps S22-S27 until all knowledge in the new domain is covered.
In this embodiment, the DialoGPT model is trained on the public data set LCCC;
The DialoGPT model comprises 12 Transformer layers, each with 12 attention heads and 768-dimensional hidden states. A head is a component of the Transformer: each head produces an attention distribution over the context that indicates how important each character is to the task, and the hidden state is the output of each Transformer layer;
The loss function of the DialoGPT model mentioned in step S21 is:
-log P(y_n | y_0, y_1, ..., y_{n-1}, C)
where C is the context in the dialogue corpus, y_0, ..., y_{n-1} are the characters already generated in the reply, and y_n is the character currently to be generated.
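This is the standard autoregressive negative log-likelihood. A PyTorch sketch is below; the shift-by-one alignment between logits and targets is the usual language-modeling convention and an assumption about how the reply is encoded:

```python
import torch.nn.functional as F

def dialogpt_loss(logits, reply_ids):
    """Sum of -log P(y_n | y_0..y_{n-1}, C) over the reply tokens.

    logits:    (seq_len, vocab_size), where position t predicts token t+1
               given the dialogue context C and the reply prefix y_0..y_t.
    reply_ids: (seq_len,) integer ids of the reply tokens.
    """
    return F.cross_entropy(logits[:-1], reply_ids[1:], reduction="sum")
```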
(3) The steps of cross-domain learning for the catastrophic-forgetting scenario are as follows:
S31, for the N pieces of new-domain knowledge to be learned, construct N pieces of synthetic dialogue data using templates;
S32, randomly sample N pieces of knowledge, together with their dialogue corpus, from the old domain;
S33, for the N pieces of knowledge sampled in step S32, construct N pieces of synthetic dialogue data using templates;
S34, mix the synthetic data constructed in steps S31 and S33 with the dialogue corpus sampled in step S32 to form a new data set;
(4) The steps of cross-domain learning for the scenario where only mismatched dialogue corpora exist are as follows:
S41, label the dialogue corpus collected from social networks, taking the dialogue context as context and the corresponding dialogue reply as response; then segment the response part into words and build an inverted index with a tool, obtaining a database D;
S42, pre-train a BERT model on the data set constructed in step S41;
S43, segment the N pieces of new-domain knowledge to be learned and the corresponding short sentences with the jieba word segmentation tool (https://github.com/fxsjy/jieba);
S44, search database D using the Value of each knowledge triple (Entity, Attr, Value) in the segmentation result as the keyword, and return the top-50 dialogue contexts by relevance score;
S45, filter the retrieved dialogue contexts;
S46, score the remaining contexts with the BERT model trained in step S42, and combine the highest-scoring context with the corresponding short sentence from step S43 to form the synthetic data.
In this embodiment, the filtering of the retrieved contexts in step S45 includes sensitive-word filtering, person-name filtering, and the like.
The BERT model comprises 12 Transformer layers, each with 12 attention heads and 768-dimensional hidden states. A head is a component of the Transformer: each head produces an attention distribution over the context that indicates how important each character is to the task, and the hidden state is the output of each Transformer layer;
The loss function of the BERT model is:
-log P(1 | c, r^+) - log P(0 | c, r^-)
where c is the context in the dialogue corpus, r^+ is a positive sample (the original response paired with the context in the corpus), and r^- is a negative sample (a sentence randomly drawn from the dialogue corpus).
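The loss treats response selection as binary classification over (context, response) pairs. A PyTorch sketch with an assumed cross-encoder scoring head:

```python
import torch
import torch.nn.functional as F

def bert_selection_loss(score_fn, context, pos_response, neg_response):
    """-log P(1 | c, r+) - log P(0 | c, r-).

    score_fn(c, r) is assumed to return a scalar logit tensor for the
    claim that r is the true response to c (e.g. a linear head on
    BERT's [CLS] vector).
    """
    pos = score_fn(context, pos_response)
    neg = score_fn(context, neg_response)
    return (F.binary_cross_entropy_with_logits(pos, torch.ones_like(pos))
            + F.binary_cross_entropy_with_logits(neg, torch.zeros_like(neg)))
```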
After the synthetic data are obtained for the four scenarios, the knowledge dialogue model is trained on them with the Adam optimizer, a learning rate of 0.0001, and a batch size of 1. Once all the synthetic data have been learned, the final model is obtained.
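A sketch of that update schedule; only the optimizer, learning rate, and batch size of 1 come from the text, and `compute_loss` is an assumed task-specific callback:

```python
import torch

def finetune_on_synthetic(model, synthetic_data, compute_loss):
    """One pass over the synthetic data: Adam, lr=0.0001, batch size 1."""
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    model.train()
    for example in synthetic_data:       # batch size = 1: one step each
        optimizer.zero_grad()
        loss = compute_loss(model, example)
        loss.backward()
        optimizer.step()
    return model
```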
In this example, all models used are neural networks based on the Transformer structure, which handles long-range dependencies in sequences better than RNN and LSTM models. The Transformer uses residual connections, which better alleviate the vanishing-gradient problem during training.
The above examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited thereto; any change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention shall be an equivalent replacement and shall fall within the protection scope of the present invention.

Claims (5)

1. A knowledge dialogue cross-domain learning method based on synthetic data, which realizes cross-domain learning of a structured knowledge dialogue system by constructing synthetic data for four scenarios: question answering, chit-chat, catastrophic forgetting, and the case where only mismatched dialogue corpora are available; the method is characterized in that the implementation for each scenario is as follows,
(1) The steps of cross-domain learning for the question-answering scenario are as follows:
S11, manually preset a template;
S12, substitute each new-domain knowledge triple (Entity, Attr, Value) into the template slots to obtain one piece of synthetic data, where Entity denotes a concrete real-world entity, Attr denotes an attribute of the entity, and Value is the concrete value of that attribute;
S13, repeat steps S11 and S12 until all new-domain knowledge has been turned into corresponding synthetic data;
(2) The steps of cross-domain learning for the chit-chat scenario are as follows:
S21, pre-train a DialoGPT dialogue model on a large-scale dialogue corpus;
S22, aggregate the new-domain knowledge triples {Entity, Attr, Value} by Entity to obtain a number of groups G;
S23, for each group, use "Why don't you talk with me about {Entity}?" "Sure." as the opening of a multi-turn dialogue;
S24, generate a random number p with 0 ≤ p ≤ 1;
S25, if the random number p generated in step S24 is greater than 0.5, continue the multi-turn dialogue opened in step S23 using the DialoGPT dialogue model;
S26, if the random number p generated in step S24 is less than 0.5, continue the multi-turn dialogue using templates;
S27, repeat steps S24-S26 until every knowledge triple {Entity, Attr, Value} in the group has been covered by the dialogue, generating the corresponding synthetic data;
S28, repeat steps S22-S27 until all knowledge in the new domain is covered;
(3) The steps of cross-domain learning for the catastrophic-forgetting scenario are as follows:
S31, for the N pieces of new-domain knowledge to be learned, construct N pieces of synthetic dialogue data using templates;
S32, randomly sample N pieces of knowledge, together with their dialogue corpus, from the old domain;
S33, for the N pieces of knowledge sampled in step S32, construct N pieces of synthetic dialogue data using templates;
S34, mix the synthetic data constructed in steps S31 and S33 with the dialogue corpus sampled in step S32 to form a new data set;
(4) Synthetic data for the scenario where only mismatched dialogue corpora exist are constructed as follows:
S41, label the dialogue corpus collected from social networks, taking the dialogue context as context and the corresponding dialogue reply as response; then segment the response part into words and build an inverted index with a tool, obtaining a database D;
S42, pre-train a BERT model on the data set constructed in step S41;
S43, segment the N pieces of new-domain knowledge to be learned and the corresponding short sentences with the jieba word segmentation tool;
S44, search database D using the Value of each knowledge triple (Entity, Attr, Value) in the segmentation result as the keyword, and return the top-50 dialogue contexts by relevance score;
S45, filter the retrieved dialogue contexts;
S46, score the remaining contexts with the BERT model trained in step S42, and combine the highest-scoring context with the corresponding short sentence from step S43 to form the synthetic data.
2. The knowledge dialogue cross-domain learning method based on synthetic data according to claim 1, wherein when the scenario is question answering, the manually preset template is "Do you know the {Attr} of {Entity}?" "Yes, it is {Value}."; here {Entity}, {Attr} and {Value} are the filling positions corresponding to the knowledge triple (Entity, Attr, Value).
3. The knowledge dialogue cross-domain learning method based on synthetic data according to claim 1, wherein when the scenario is chit-chat, the DialoGPT model is trained on the public data set LCCC;
The DialoGPT model comprises 12 Transformer layers, each with 12 attention heads and 768-dimensional hidden states;
The loss function of the DialoGPT model is:
-log P(y_n | y_0, y_1, ..., y_{n-1}, C)
where C is the context in the dialogue corpus, y_0, ..., y_{n-1} are the characters already generated in the reply, and y_n is the character currently to be generated.
4. The knowledge dialogue cross-domain learning method based on synthetic data according to claim 1, wherein, for the scenario where only mismatched dialogue corpora exist, the BERT model comprises 12 Transformer layers, each with 12 attention heads and 768-dimensional hidden states;
The loss function of the BERT model is:
-log P(1 | c, r^+) - log P(0 | c, r^-)
where c is the context in the dialogue corpus, r^+ is a positive sample (the original response paired with the context in the corpus), and r^- is a negative sample (a sentence randomly drawn from the dialogue corpus).
5. The knowledge dialogue cross-domain learning method based on synthetic data according to claim 1, wherein, when only mismatched dialogue corpora exist, the filtering of the retrieved contexts in step S45 includes sensitive-word filtering and person-name filtering.
CN202110763112.2A 2021-07-06 2021-07-06 Knowledge dialogue cross-domain learning method based on synthetic data Active CN113626566B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110763112.2A CN113626566B (en) 2021-07-06 2021-07-06 Knowledge dialogue cross-domain learning method based on synthetic data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110763112.2A CN113626566B (en) 2021-07-06 2021-07-06 Knowledge dialogue cross-domain learning method based on synthetic data

Publications (2)

Publication Number Publication Date
CN113626566A CN113626566A (en) 2021-11-09
CN113626566B (en) 2023-07-18

Family

ID=78379156

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110763112.2A Active CN113626566B (en) 2021-07-06 2021-07-06 Knowledge dialogue cross-domain learning method based on synthetic data

Country Status (1)

Country Link
CN (1) CN113626566B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111428055A (en) * 2020-04-20 2020-07-17 神思电子技术股份有限公司 Industry-oriented context omission question-answering method
CN111914074A (en) * 2020-07-16 2020-11-10 华中师范大学 Method and system for generating limited field conversation based on deep learning and knowledge graph
CN112287090A (en) * 2020-11-23 2021-01-29 深圳季连科技有限公司 Financial question asking back method and system based on knowledge graph


Also Published As

Publication number Publication date
CN113626566A (en) 2021-11-09

Similar Documents

Publication Publication Date Title
CN108959627B (en) Question-answer interaction method and system based on intelligent robot
CN115238101B (en) Multi-engine intelligent question-answering system oriented to multi-type knowledge base
CN107943998B (en) Man-machine conversation control system and method based on knowledge graph
CN110825881B (en) Method for establishing electric power knowledge graph
CN107944027B (en) Method and system for creating semantic key index
CN102262634B (en) Automatic questioning and answering method and system
CN112800170A (en) Question matching method and device and question reply method and device
CN108710647B (en) Data processing method and device for chat robot
CN108984778A (en) A kind of intelligent interaction automatically request-answering system and self-teaching method
CN109271459B (en) Chat robot based on Lucene and grammar network and implementation method thereof
CN113672708A (en) Language model training method, question and answer pair generation method, device and equipment
CN117370580A (en) Knowledge-graph-based large language model enhanced dual-carbon field service method
CN114153955B (en) Construction method of multi-skill task type dialogue system fusing chatting and common knowledge
CN112506945A (en) Self-adaptive learning guiding method and system based on knowledge graph
CN111428104A (en) Epilepsy auxiliary medical intelligent question-answering method based on viewpoint type reading understanding
CN113065324A (en) Text generation method and device based on structured triples and anchor templates
CN113626566B (en) Knowledge dialogue cross-domain learning method based on synthetic data
Mihaylov et al. A Space Conversational Agent for Retrieving Lessons-learned and Expert Training
CN118093841B (en) Model training method and question-answering method for question-answering system
CN117035064B (en) Combined training method for retrieving enhanced language model and storage medium
Kumar et al. Building conversational Question Answer Machine and comparison of BERT and its different variants
CN116127051B (en) Dialogue generation method based on deep learning, electronic equipment and storage medium
CN112818090B (en) Method and system for generating answer questions and questions based on harmonic words
Szymanski et al. Semantic memory knowledge acquisition through active dialogues
CN112818108B (en) Text semantic misinterpretation chat robot based on shape and near words and data processing method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant