CN108280218A - A pipeline system for hybrid retrieval-based and generative question answering - Google Patents
A pipeline system for hybrid retrieval-based and generative question answering
- Publication number
- CN108280218A (application CN201810123117.7A)
- Authority
- CN
- China
- Prior art keywords
- question
- retrieval
- model
- answer
- classifier
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a pipeline system for hybrid retrieval-based and generative question answering, comprising a classifier, a candidate set, a database, and model selection. The beneficial effects of the invention are as follows. The classifier can perceive the context of a question and therefore classifies questions with high accuracy. It classifies questions by combining a deep-learning classification model with regular-expression matching, can extract the important attributes in a question, and calls the corresponding API for real-time queries. During question matching, the candidate-set retrieval system builds an inverted index and performs both original-sentence retrieval and synonym-based query-expansion retrieval, so it can find the sentence most similar to the question, which addresses inaccurate retrieval. The dialogue model generates replies with a seq2seq model to which an Attention mechanism is added, and a BeamSearch mechanism is added at the decoder; the generated sentences are more logical and better structured, and the replies are more diverse.
Description
Technical field
The present invention relates to a pipeline system, specifically a pipeline system for hybrid retrieval-based and generative question answering, and belongs to the technical field of information retrieval and processing.
Background art
In recent years, question-answering robots have attracted growing attention from technology companies and research institutions because of their wide range of application scenarios and large commercial value, and many notable products have appeared, such as Microsoft's XiaoIce, Apple's Siri, and Google Assistant. Unlike traditional apps, these products do not require users to enter fixed commands (such as "submit" or "purchase"); users can communicate with the app in natural language.
Question answering has long been regarded as one of the hardest problems in artificial intelligence. In recent years, however, the emergence of question-and-answer communities and social networking sites has caused the amount of dialogue corpus data to grow explosively, and advances in hardware have greatly increased computing power. Both developments create new opportunities for question answering systems.
Question answering systems can be divided into vertical-domain and open-domain systems. Open-domain systems are mainly chat oriented, while vertical-domain systems are mainly assistant oriented. The mainstream techniques for building dialogue robots fall into two groups: retrieval models and generative models.
In a retrieval model, the system searches a database of question-answer pairs for the stored question that is semantically most similar to the user's question and returns the corresponding answer. This approach has two main problems. First, the number of question-answer pairs in the database is limited, so an answer to the user's question may not be retrievable. Second, the question-answer pairs are fixed, so the returned answer may not fully correspond to the question the user asked.
In a generative model, the dialogue system first interprets the user's question and then generates the answer word by word. The mainstream method at present is the Seq2Seq model from deep learning: the encoder encodes the question into a vector representation, and the decoder decodes that vector into a reply. The main problem of this model is that it tends to generate generic, dull replies (such as "I don't know" or "OK"), which carry little information and have no substantive meaning.
Summary of the invention
The purpose of the present invention is to solve the above problems by providing a pipeline system for hybrid retrieval-based and generative question answering.
The present invention achieves this purpose through the following technical solution. A pipeline system for hybrid retrieval-based and generative question answering comprises:
A classifier, which classifies a query.
A candidate set: for a question that fails classification, the retrieval system looks for the questions closest to it, and the candidate sentences screened out form the candidate set.
A database, which stores question sentences so that the question semantically most similar to the user's question can be found.
Model selection, which calls the generation system to generate a corresponding answer and give a reply.
The classifier divides questions into six types: "weather", "news", "joke", "flight/high-speed rail", "nearby (geographic location)", and "other". The candidate set obtains a vector representation of each sentence using an autoencoder model based on a recurrent neural network, and uses BM25 scoring to compute the similarity between the question and the sentences in the database. Model selection uses a Seq2Seq-based model to build the generation system.
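The way the four components hand a question to one another can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function `answer_question`, its callback parameters, and the `min_score` threshold are hypothetical names introduced here, and the text does not say how the system decides that retrieval has failed and generation should take over.

```python
from typing import Callable, Optional, Tuple


def answer_question(
    question: str,
    classify: Callable[[str], str],
    call_api: Callable[[str, str], str],
    retrieve: Callable[[str], Optional[Tuple[str, float]]],
    generate: Callable[[str], str],
    min_score: float = 1.0,  # assumed threshold, not taken from the patent text
) -> str:
    """Route a question through classifier, retrieval, and generation."""
    category = classify(question)
    # Questions in a recognized category are answered through the matching API.
    if category != "other":
        return call_api(category, question)
    # Otherwise look for the closest stored question in the retrieval system.
    hit = retrieve(question)  # (answer, similarity score) or None
    if hit is not None and hit[1] >= min_score:
        return hit[0]
    # Fall back to the generation system when retrieval is not confident enough.
    return generate(question)
```

Passing the components in as callables keeps the sketch independent of any particular classifier, index, or dialogue model.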
The pipeline system for hybrid retrieval-based and generative question answering mainly includes the following steps:
Step A: build the query classifier using two methods, a classification model based on a convolutional neural network (CNN) and regular expressions.
Step B: when building the retrieval system, use the key-value in-memory database redis to build the inverted index and store the question-answer corpus; implement ordinary retrieval, query expansion, and BM25 similarity scoring in python; and train an autoencoder model with Tensorflow to handle semantic recognition of sentences during retrieval.
Step C: build the dialogue generation system by training the dialogue model with the open-source framework Tensorflow. Tensorflow is an artificial intelligence framework that can be used in many deep learning fields, such as image processing and natural language processing.
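The BM25 similarity scoring mentioned in step B can be sketched as follows. This is a minimal sketch of the standard Okapi BM25 formula, not the inventors' implementation: the function name `bm25_scores`, the pre-tokenized input format, and the parameter values `k1=1.5` and `b=0.75` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score every tokenized stored question against a tokenized query."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(doc) for doc in docs_tokens) / n_docs
    # Document frequency: in how many stored questions each term occurs.
    df = Counter()
    for doc in docs_tokens:
        df.update(set(doc))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # Smoothed inverse document frequency (always positive).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

The stored question with the highest score would then be the retrieval candidate; rarer query terms contribute more because of the IDF factor.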
Preferably, so that questions are classified with high accuracy, the classifier is able to perceive the context of a question and takes that context into account when classifying it.
Preferably, so that the important attributes in a question can be extracted for real-time queries, the classifier classifies questions by combining a deep-learning classification model with regular-expression matching.
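One way to picture the combination of regular expressions and a learned model is the sketch below. Everything in it is hypothetical: the English patterns in `RULES`, the attribute names (`city`, `origin`, `destination`), and the `fallback_model` callable standing in for the CNN classifier are assumptions made for illustration and do not come from the patent text.

```python
import re

# Hypothetical rules: each pattern both recognizes a category and
# captures the attributes a downstream API call would need.
RULES = [
    ("weather", re.compile(r"weather in (?P<city>[A-Za-z]+)", re.IGNORECASE)),
    (
        "flight/high-speed rail",
        re.compile(
            r"(?:flight|train) from (?P<origin>[A-Za-z]+) to (?P<destination>[A-Za-z]+)",
            re.IGNORECASE,
        ),
    ),
]


def classify(question, fallback_model=None):
    """Return (category, attributes) for a question."""
    for category, pattern in RULES:
        match = pattern.search(question)
        if match:
            return category, match.groupdict()
    # No rule matched: defer to the learned classifier if one is supplied.
    if fallback_model is not None:
        return fallback_model(question), {}
    return "other", {}
```

For example, `classify("What is the weather in Beijing tomorrow?")` yields the category `"weather"` together with the extracted attribute `{"city": "Beijing"}`, which is the kind of information a real-time query would need.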
Preferably, to address inaccurate retrieval, the candidate-set retrieval system builds an inverted index during question matching and performs both original-sentence retrieval and synonym-based query-expansion retrieval.
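The inverted index and synonym-based query expansion can be sketched in a few lines. This sketch keeps the index in an in-memory dictionary as a stand-in for the redis store named elsewhere in the text; the function names and the synonym-table format are assumptions introduced for illustration.

```python
from collections import defaultdict


def build_inverted_index(questions):
    """Map each token to the ids of the stored questions that contain it."""
    index = defaultdict(set)
    for question_id, tokens in enumerate(questions):
        for token in tokens:
            index[token].add(question_id)
    return index


def expand_query(tokens, synonyms):
    """Append the synonyms of every query token, without duplicates."""
    expanded = list(tokens)
    for token in tokens:
        for synonym in synonyms.get(token, ()):
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded


def candidate_ids(tokens, index, synonyms):
    """Collect stored questions matching the original or expanded terms."""
    ids = set()
    for token in expand_query(tokens, synonyms):
        ids |= index.get(token, set())
    return ids
```

With a synonym table such as `{"buy": ["purchase"]}`, a query containing "buy" also reaches stored questions phrased with "purchase", which a plain term lookup would miss.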
Preferably, so that the generated sentences are more logical and better structured, in step C the dialogue model generates replies with a seq2seq model to which an Attention mechanism is added, and a BeamSearch mechanism is added at the decoder.
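Beam search at the decoder can be illustrated independently of any neural network. In the sketch below, `step_fn` is a hypothetical callable that returns a dictionary of next-token log-probabilities for a partial sequence; in a real system the seq2seq decoder would play that role. The beam width and maximum length are assumed values.

```python
def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=20):
    """Return hypotheses as (tokens, log_prob), best first."""
    beams = [([start_token], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        # Extend every live hypothesis by every possible next token.
        for tokens, score in beams:
            for token, log_prob in step_fn(tokens).items():
                candidates.append((tokens + [token], score + log_prob))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        # Keep only the best beam_width extensions.
        for tokens, score in candidates[:beam_width]:
            if tokens[-1] == end_token:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    # Hypotheses still unfinished at max_len are returned as well.
    finished.extend(beams)
    return sorted(finished, key=lambda c: c[1], reverse=True)
```

Because several hypotheses are kept alive at each step, the search can recover a reply whose first token is not the single most probable one, which greedy decoding would never find.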
The beneficial effects of the invention are as follows. The pipeline system for hybrid retrieval-based and generative question answering is reasonably designed. The classifier can perceive the context of a question and takes that context into account, so questions are classified with high accuracy. The classifier classifies questions by combining a deep-learning classification model with regular-expression matching; it can extract the important attributes in a question and call the corresponding API for real-time queries, which gives the system strong real-time capability. During question matching, the candidate-set retrieval system builds an inverted index and performs both original-sentence retrieval and synonym-based query-expansion retrieval, so it can find the sentence most similar to the question, which addresses inaccurate retrieval. In step C, the dialogue model generates replies with a seq2seq model to which an Attention mechanism is added, and a BeamSearch mechanism is added at the decoder. Sentences generated by the Seq2Seq model with Attention and BeamSearch are more logical and better structured, and the replies are more diverse.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, a pipeline system for hybrid retrieval-based and generative question answering comprises:
A classifier, which classifies a query.
A candidate set: for a question that fails classification, the retrieval system looks for the questions closest to it, and the candidate sentences screened out form the candidate set.
A database, which stores question sentences so that the question semantically most similar to the user's question can be found.
Model selection, which calls the generation system to generate a corresponding answer and give a reply.
The classifier divides questions into six types: "weather", "news", "joke", "flight/high-speed rail", "nearby (geographic location)", and "other". The candidate set obtains a vector representation of each sentence using an autoencoder model based on a recurrent neural network, and uses BM25 scoring to compute the similarity between the question and the sentences in the database. Model selection uses a Seq2Seq-based model to build the generation system.
The pipeline system for hybrid retrieval-based and generative question answering mainly includes the following steps:
Step A: build the query classifier using two methods, a classification model based on a convolutional neural network (CNN) and regular expressions.
Step B: when building the retrieval system, use the key-value in-memory database redis to build the inverted index and store the question-answer corpus; implement ordinary retrieval, query expansion, and BM25 similarity scoring in python; and train an autoencoder model with Tensorflow to handle semantic recognition of sentences during retrieval.
Step C: build the dialogue generation system by training the dialogue model with the open-source framework Tensorflow. Tensorflow is an artificial intelligence framework that can be used in many deep learning fields, such as image processing and natural language processing.
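Once the autoencoder has turned each sentence into a vector, semantically close sentences can be found by comparing vectors. The sketch below uses cosine similarity for that comparison; this choice is an assumption made for illustration, since the text does not state which vector similarity measure is used, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_u * norm_v)


def most_similar(query_vec, sentence_vecs):
    """Index of the stored sentence vector closest to the query vector."""
    return max(
        range(len(sentence_vecs)),
        key=lambda i: cosine_similarity(query_vec, sentence_vecs[i]),
    )
```

A vector comparison of this kind can match sentences that share meaning but few surface words, which complements the term-based BM25 score.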
The classifier can perceive the context of a question and takes that context into account, so questions are classified with high accuracy. The classifier classifies questions by combining a deep-learning classification model with regular-expression matching; it can extract the important attributes in a question and call the corresponding API for real-time queries, which gives the system strong real-time capability. During question matching, the candidate-set retrieval system builds an inverted index and performs both original-sentence retrieval and synonym-based query-expansion retrieval, so it can find the sentence most similar to the question, which addresses inaccurate retrieval. In step C, the dialogue model generates replies with a seq2seq model to which an Attention mechanism is added, and a BeamSearch mechanism is added at the decoder. Sentences generated by the Seq2Seq model with Attention and BeamSearch are more logical and better structured, and the replies are more diverse.
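The Attention mechanism added to the seq2seq model can be illustrated with a small sketch. It uses dot-product scoring between the decoder state and each encoder state, which is an assumption: the text names the mechanism but not the scoring function, and the function name and list-based vectors are chosen here only for readability.

```python
import math


def attention(decoder_state, encoder_states):
    """Return (weights, context) for one decoding step."""
    # Alignment score of the decoder state with every encoder state.
    scores = [
        sum(d * e for d, e in zip(decoder_state, state))
        for state in encoder_states
    ]
    # Numerically stable softmax turns scores into weights summing to 1.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    # Context vector: weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context
```

At each step the decoder thus reads a context vector focused on the most relevant question words, rather than relying on a single fixed vector for the whole question.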
It is obvious to a person skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the invention can be implemented in other specific forms without departing from its spirit or essential characteristics. The embodiments should therefore be regarded in all respects as illustrative and not restrictive. The scope of the invention is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalency of the claims are intended to be embraced by the invention. No reference sign in a claim should be construed as limiting that claim.
In addition, although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution. The specification is written this way only for clarity. Those skilled in the art should treat the specification as a whole; the technical solutions in the embodiments may be suitably combined to form other embodiments that those skilled in the art can understand.
Claims (6)
1. A pipeline system for hybrid retrieval-based and generative question answering, characterized by comprising:
a classifier, which classifies a query;
a candidate set: for a question that fails classification, the retrieval system looks for the questions closest to it, and the candidate sentences screened out form the candidate set;
a database, which stores question sentences so that the question semantically most similar to the user's question can be found;
model selection, which calls the generation system to generate a corresponding answer and give a reply;
wherein the classifier divides questions into six types, "weather", "news", "joke", "flight/high-speed rail", "nearby", and "other"; the candidate set obtains a vector representation of each sentence using an autoencoder model based on a recurrent neural network, and uses BM25 scoring to compute the similarity between the question and the sentences in the database; and the model selection uses a Seq2Seq-based model to build the generation system.
2. The pipeline system for hybrid retrieval-based and generative question answering according to claim 1, characterized in that the pipeline system includes the following steps:
step A: build the query classifier using two methods, a classification model based on a convolutional neural network and regular expressions;
step B: when building the retrieval system, use the key-value in-memory database redis to build the inverted index and store the question-answer corpus; implement ordinary retrieval, query expansion, and BM25 similarity scoring in python; and train an autoencoder model with Tensorflow to handle semantic recognition of sentences during retrieval;
step C: build the dialogue generation system by training the dialogue model with the open-source framework Tensorflow, which can be used in many deep learning fields such as image processing and natural language processing.
3. The pipeline system for hybrid retrieval-based and generative question answering according to claim 1, characterized in that the classifier is able to perceive the context of a question.
4. The pipeline system for hybrid retrieval-based and generative question answering according to claim 1, characterized in that the classifier classifies questions by combining a deep-learning classification model with regular-expression matching.
5. The pipeline system for hybrid retrieval-based and generative question answering according to claim 1, characterized in that the candidate-set retrieval system builds an inverted index during question matching and performs both original-sentence retrieval and synonym-based query-expansion retrieval.
6. The pipeline system for hybrid retrieval-based and generative question answering according to claim 2, characterized in that, in step C, the dialogue model generates replies with a seq2seq model to which an Attention mechanism is added, and a BeamSearch mechanism is added at the decoder.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810123117.7A CN108280218A (en) | 2018-02-07 | 2018-02-07 | A pipeline system for hybrid retrieval-based and generative question answering |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108280218A true CN108280218A (en) | 2018-07-13 |
Family
ID=62807935
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810123117.7A Pending CN108280218A (en) | 2018-02-07 | 2018-02-07 | A pipeline system for hybrid retrieval-based and generative question answering |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108280218A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109657126A (en) * | 2018-12-17 | 2019-04-19 | 北京百度网讯科技有限公司 | Answer generation method, device, equipment and medium |
CN109657041A (en) * | 2018-12-04 | 2019-04-19 | 南京理工大学 | The problem of based on deep learning automatic generation method |
CN109918484A (en) * | 2018-12-28 | 2019-06-21 | 中国人民大学 | Talk with generation method and device |
CN110297895A (en) * | 2019-05-24 | 2019-10-01 | 山东大学 | A kind of dialogue method and system based on free text knowledge |
CN110362651A (en) * | 2019-06-11 | 2019-10-22 | 华南师范大学 | Dialogue method, system, device and the storage medium that retrieval and generation combine |
CN111090664A (en) * | 2019-07-18 | 2020-05-01 | 重庆大学 | High-imitation person multi-mode dialogue method based on neural network |
CN111966782A (en) * | 2020-06-29 | 2020-11-20 | 百度在线网络技术(北京)有限公司 | Retrieval method and device for multi-turn conversations, storage medium and electronic equipment |
CN113220856A (en) * | 2021-05-28 | 2021-08-06 | 天津大学 | Multi-round dialogue system based on Chinese pre-training model |
US20210365810A1 (en) * | 2020-05-12 | 2021-11-25 | Bayestree Intelligence Pvt Ltd. | Method of automatically assigning a classification |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1928864A (en) * | 2006-09-22 | 2007-03-14 | 浙江大学 | FAQ based Chinese natural language ask and answer method |
CN101373532A (en) * | 2008-07-10 | 2009-02-25 | 昆明理工大学 | FAQ Chinese request-answering system implementing method in tourism field |
CN104050256A (en) * | 2014-06-13 | 2014-09-17 | 西安蒜泥电子科技有限责任公司 | Initiative study-based questioning and answering method and questioning and answering system adopting initiative study-based questioning and answering method |
CN105824933A (en) * | 2016-03-18 | 2016-08-03 | 苏州大学 | Automatic question answering system based on main statement position and implementation method thereof |
CN107463699A (en) * | 2017-08-15 | 2017-12-12 | 济南浪潮高新科技投资发展有限公司 | A kind of method for realizing question and answer robot based on seq2seq models |
CN107562792A (en) * | 2017-07-31 | 2018-01-09 | 同济大学 | A kind of question and answer matching process based on deep learning |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1928864A (en) * | 2006-09-22 | 2007-03-14 | 浙江大学 | FAQ based Chinese natural language ask and answer method |
CN101373532A (en) * | 2008-07-10 | 2009-02-25 | 昆明理工大学 | FAQ Chinese request-answering system implementing method in tourism field |
CN104050256A (en) * | 2014-06-13 | 2014-09-17 | 西安蒜泥电子科技有限责任公司 | Initiative study-based questioning and answering method and questioning and answering system adopting initiative study-based questioning and answering method |
CN105824933A (en) * | 2016-03-18 | 2016-08-03 | 苏州大学 | Automatic question answering system based on main statement position and implementation method thereof |
CN107562792A (en) * | 2017-07-31 | 2018-01-09 | 同济大学 | A kind of question and answer matching process based on deep learning |
CN107463699A (en) * | 2017-08-15 | 2017-12-12 | 济南浪潮高新科技投资发展有限公司 | A kind of method for realizing question and answer robot based on seq2seq models |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109657041A (en) * | 2018-12-04 | 2019-04-19 | 南京理工大学 | The problem of based on deep learning automatic generation method |
CN109657041B (en) * | 2018-12-04 | 2023-09-29 | 南京理工大学 | Deep learning-based automatic problem generation method |
CN109657126A (en) * | 2018-12-17 | 2019-04-19 | 北京百度网讯科技有限公司 | Answer generation method, device, equipment and medium |
CN109657126B (en) * | 2018-12-17 | 2021-03-23 | 北京百度网讯科技有限公司 | Answer generation method, device, equipment and medium |
CN109918484A (en) * | 2018-12-28 | 2019-06-21 | 中国人民大学 | Talk with generation method and device |
CN109918484B (en) * | 2018-12-28 | 2020-12-15 | 中国人民大学 | Dialog generation method and device |
CN110297895A (en) * | 2019-05-24 | 2019-10-01 | 山东大学 | A kind of dialogue method and system based on free text knowledge |
CN110297895B (en) * | 2019-05-24 | 2021-09-17 | 山东大学 | Dialogue method and system based on free text knowledge |
CN110362651A (en) * | 2019-06-11 | 2019-10-22 | 华南师范大学 | Dialogue method, system, device and the storage medium that retrieval and generation combine |
CN111090664A (en) * | 2019-07-18 | 2020-05-01 | 重庆大学 | High-imitation person multi-mode dialogue method based on neural network |
US20210365810A1 (en) * | 2020-05-12 | 2021-11-25 | Bayestree Intelligence Pvt Ltd. | Method of automatically assigning a classification |
CN111966782A (en) * | 2020-06-29 | 2020-11-20 | 百度在线网络技术(北京)有限公司 | Retrieval method and device for multi-turn conversations, storage medium and electronic equipment |
CN111966782B (en) * | 2020-06-29 | 2023-12-12 | 百度在线网络技术(北京)有限公司 | Multi-round dialogue retrieval method and device, storage medium and electronic equipment |
US11947578B2 (en) | 2020-06-29 | 2024-04-02 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method for retrieving multi-turn dialogue, storage medium, and electronic device |
CN113220856A (en) * | 2021-05-28 | 2021-08-06 | 天津大学 | Multi-round dialogue system based on Chinese pre-training model |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108280218A (en) | A pipeline system for hybrid retrieval-based and generative question answering | |
WO2021159632A1 (en) | Intelligent questioning and answering method and apparatus, computer device, and computer storage medium | |
US10649990B2 (en) | Linking ontologies to expand supported language | |
CN104461525B (en) | A kind of intelligent consulting platform generation system that can customize | |
CN110209897B (en) | Intelligent dialogue method, device, storage medium and equipment | |
CN109960786A (en) | Chinese Measurement of word similarity based on convergence strategy | |
CN110096567B (en) | QA knowledge base reasoning-based multi-round dialogue reply selection method and system | |
TW202009749A (en) | Human-machine dialog method, device, electronic apparatus and computer readable medium | |
CN111639252A (en) | False news identification method based on news-comment relevance analysis | |
CN110019729B (en) | Intelligent question-answering method, storage medium and terminal | |
CN111353049A (en) | Data updating method and device, electronic equipment and computer readable storage medium | |
CN112632239A (en) | Brain-like question-answering system based on artificial intelligence technology | |
Dsouza et al. | Chat with bots intelligently: A critical review & analysis | |
CN108364066B (en) | Artificial neural network chip and its application method based on N-GRAM and WFST model | |
CN110377752A (en) | A kind of knowledge base system applied to the operation of government affairs hall | |
CN116932733A (en) | Information recommendation method and related device based on large language model | |
CN112364148A (en) | Deep learning method-based generative chat robot | |
CN116541493A (en) | Interactive response method, device, equipment and storage medium based on intention recognition | |
CN117251552A (en) | Dialogue processing method and device based on large language model and electronic equipment | |
KR20180116103A (en) | Continuous conversation method and system by using automating conversation scenario network | |
CN114330704A (en) | Statement generation model updating method and device, computer equipment and storage medium | |
CN113065324A (en) | Text generation method and device based on structured triples and anchor templates | |
CN111767386A (en) | Conversation processing method and device, electronic equipment and computer readable storage medium | |
CN115378890B (en) | Information input method, device, storage medium and computer equipment | |
CN116957128A (en) | Service index prediction method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20180713 |