CN110377911A - Intent recognition method and device under a dialogue framework - Google Patents
Intent recognition method and device under a dialogue framework
- Publication number
- CN110377911A (application number CN201910666196.0A)
- Authority
- CN
- China
- Prior art keywords
- intent recognition
- corpus
- model
- training
- label
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
The present invention provides an intent recognition method and device under a dialogue framework. The method comprises: obtaining the matching degree between a corpus to be recognized and each rule monomer in a preset rule template, the preset rule template including a plurality of rule monomers, each rule monomer corresponding to one label; judging whether any matching degree is greater than a preset threshold; if so, taking the label of the rule monomer with the maximum matching degree as the intent recognition result; if not, inputting the corpus to be recognized into a pretrained intent recognition model and taking the output of that model as the intent recognition result. By falling back to a machine learning model only when the rule template cannot effectively identify the user's intent, the method accurately identifies user intent with a lower demand for training samples, promoting the development of artificial intelligence devices such as question answering systems and dialogue robots.
Description
Technical field
The present invention relates to the field of artificial intelligence, and more particularly to an intent recognition method and device under a dialogue framework.
Background art
With the development of search engine technology, modern search engines, question answering systems and dialogue robots are no longer limited to simple relevance-based information retrieval; they must deeply understand users' information needs and accurately provide the required services. Correctly identifying the user's intent is a key step toward this goal.
At present, user intent is mainly identified with machine learning models, which generally require hundreds of thousands of annotated samples to train a model with acceptable recognition performance. A newly built question answering system or dialogue robot cannot collect dialogue-scenario corpora at such a scale, so an effective machine learning model cannot be trained and user intent cannot be accurately identified. In addition, for dialogue-based intent recognition, user inputs are usually very short, which makes accurately identifying intent under a dialogue framework even harder and severely constrains the development of artificial intelligence devices such as question answering systems and dialogue robots.
Summary of the invention
In view of the problems in the prior art, the present invention provides an intent recognition method and device under a dialogue framework, an electronic device, and a computer-readable storage medium, which can at least partially solve the problems in the prior art.
To achieve the goals above, the present invention adopts the following technical scheme:
In a first aspect, an intent recognition method under a dialogue framework is provided, comprising:
obtaining the matching degree between a corpus to be recognized and each rule monomer in a preset rule template, the preset rule template including a plurality of rule monomers, each rule monomer corresponding to one label;
judging whether any matching degree is greater than a preset threshold;
if so, taking the label of the rule monomer with the maximum matching degree as the intent recognition result;
if not, inputting the corpus to be recognized into a pretrained intent recognition model and taking the output of the pretrained intent recognition model as the intent recognition result.
Further, obtaining the matching degree between the corpus to be recognized and the rule monomers in the preset rule template comprises:
parsing the corpus to be recognized to obtain a plurality of feature words to be matched and their parts of speech;
obtaining related rule monomers from the rule template according to the plurality of feature words to be matched;
calculating the matching degree between the corpus to be recognized and each related rule monomer according to the plurality of feature words to be matched and their parts of speech.
Further, parsing the corpus to be recognized to obtain a plurality of feature words to be matched and their parts of speech comprises:
segmenting the corpus to be recognized to obtain a plurality of feature words;
removing stopwords from the plurality of feature words to obtain the feature words to be matched;
tagging the parts of speech of the feature words to be matched to obtain the feature words to be matched and their parts of speech.
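The three parsing sub-steps (segment, remove stopwords, tag parts of speech) can be sketched as follows. This is a toy stand-in: a real Chinese pipeline would use a proper segmenter and tagger such as jieba, and `STOPWORDS` / `POS_LEXICON` below are invented illustrative data, not anything from the patent:

```python
# Illustrative parser for the three steps above: segmentation, stopword
# removal, part-of-speech tagging. Whitespace split and a toy POS lexicon
# stand in for a real segmenter/tagger.
STOPWORDS = {"i", "to", "the", "a", "want"}
POS_LEXICON = {"check": "verb", "transfer": "verb", "balance": "noun", "500": "numeral"}

def parse_corpus(corpus):
    tokens = corpus.lower().split()                       # step 1: segmentation
    features = [t for t in tokens if t not in STOPWORDS]  # step 2: stopword removal
    return [(t, POS_LEXICON.get(t, "unknown")) for t in features]  # step 3: POS tags

print(parse_corpus("I want to check the balance"))
# [('check', 'verb'), ('balance', 'noun')]
```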
Further, each rule monomer is composed of a plurality of word slots, and each word slot holds a keyword and its synonyms, or a part of speech.
Obtaining related rule monomers from the rule template according to the plurality of feature words to be matched comprises: searching the rule template by each feature word to be matched, and taking every rule monomer whose word slots include that feature word as a related rule monomer.
Further, each word slot corresponds to a weight value.
Calculating the matching degree between the corpus to be recognized and a related rule monomer according to the plurality of feature words to be matched and their parts of speech comprises:
matching the plurality of feature words to be matched and their parts of speech against the word slots at the corresponding positions of the related rule monomer, by word or by part of speech;
accumulating the weight values of the successfully matched word slots to obtain the matching degree between the corpus to be recognized and the related rule monomer.
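The per-slot weighting just described might look like the following sketch. The slot field names (`keyword`, `synonyms`, `pos`, `weight`) are assumptions made for illustration, not the patent's actual data format:

```python
# Hedged sketch of the weighted word-slot match: a slot matches if the feature
# word equals the slot keyword (or a synonym) or shares its part of speech;
# matched slot weights are summed into the matching degree.
def matching_degree(features, rule_monomer):
    """features: list of (word, pos); rule_monomer: list of slot dicts."""
    degree = 0.0
    for (word, pos), slot in zip(features, rule_monomer):
        word_ok = word == slot.get("keyword") or word in slot.get("synonyms", ())
        pos_ok = slot.get("pos") is not None and pos == slot["pos"]
        if word_ok or pos_ok:
            degree += slot["weight"]     # accumulate weights of matched slots
    return degree

rule = [
    {"pos": "verb", "weight": 0.4},                                 # slot A: any verb
    {"keyword": "balance", "synonyms": ("funds",), "weight": 0.6},  # slot B: keyword
]
print(matching_degree([("check", "verb"), ("balance", "noun")], rule))  # 1.0
```

The positional `zip` mirrors the "corresponding position" wording above: slot i is only ever compared with feature word i.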
Further, the method also comprises:
constructing an intent recognition model;
training the intent recognition model with unlabeled corpus samples to obtain the pretrained intent recognition model.
Further, training the intent recognition model with unlabeled corpus samples to obtain the pretrained intent recognition model comprises:
clustering the unlabeled corpus samples to obtain unlabeled corpus samples of preset categories;
sampling the unlabeled corpus samples of each preset category to obtain initial samples;
training the intent recognition model with the annotated initial samples to obtain an initial intent recognition model;
inputting the unlabeled corpus samples remaining after sampling into the initial intent recognition model to obtain a label for each remaining unlabeled corpus sample;
further training the initial intent recognition model with the remaining samples, after their labels have been corrected, to obtain the pretrained intent recognition model.
Further, the method also comprises:
obtaining test corpora with known labels;
testing the pretrained intent recognition model with the test corpora of known labels, taking the output of the model as the test result;
judging, based on the test result and the known labels, whether the pretrained intent recognition model meets a preset requirement;
if so, taking the current model as the target model for intent recognition.
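The test stage reduces to comparing model outputs with the known labels against an acceptance bar. A minimal sketch; the 0.9 accuracy requirement and all names are invented placeholders, since the patent leaves the preset requirement unspecified:

```python
# Hedged sketch: accept the pretrained model only if its accuracy on labeled
# test corpora meets the preset requirement.
def meets_requirement(model, test_set, required_accuracy=0.9):
    correct = sum(1 for text, label in test_set if model(text) == label)
    return correct / len(test_set) >= required_accuracy

model = lambda text: "query_balance" if "balance" in text else "other"
sample_tests = [("my balance", "query_balance"), ("weather", "other")]
print(meets_requirement(model, sample_tests))  # True
```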
Further, the method also comprises:
if the current model does not meet the preset requirement, optimizing the current model and/or re-training the model with an updated training sample set.
In a second aspect, an intent recognition device under a dialogue framework is provided, comprising:
a matching degree obtaining module, which obtains the matching degree between a corpus to be recognized and each rule monomer in a preset rule template, the preset rule template including a plurality of rule monomers, each rule monomer corresponding to one label;
a matching judgment module, which judges whether any matching degree is greater than a preset threshold;
a first intent recognition module, which, if a matching degree greater than the preset threshold exists, takes the label of the rule monomer with the maximum matching degree as the intent recognition result;
a second intent recognition module, which, if no matching degree greater than the preset threshold exists, inputs the corpus to be recognized into a pretrained intent recognition model and takes the output of the pretrained intent recognition model as the intent recognition result.
Further, the matching degree obtaining module comprises:
a parsing unit, which parses the corpus to be recognized to obtain a plurality of feature words to be matched and their parts of speech;
a rule filtering unit, which obtains related rule monomers from the rule template according to the plurality of feature words to be matched;
a matching degree computing unit, which calculates the matching degree between the corpus to be recognized and each related rule monomer according to the plurality of feature words to be matched and their parts of speech.
Further, the parsing unit comprises:
a segmentation subunit, which segments the corpus to be recognized to obtain a plurality of feature words;
a stopword removal subunit, which removes stopwords from the plurality of feature words to obtain the feature words to be matched;
a part-of-speech tagging subunit, which tags the parts of speech of the feature words to be matched to obtain the feature words to be matched and their parts of speech.
Further, each rule monomer is composed of a plurality of word slots, and each word slot holds a keyword and its synonyms, or a part of speech.
The rule filtering unit comprises: a search subunit, which searches the rule template by each feature word to be matched and takes every rule monomer whose word slots include that feature word as a related rule monomer.
Further, each word slot corresponds to a weight value.
The matching degree computing unit comprises:
a matching subunit, which matches the plurality of feature words to be matched and their parts of speech against the word slots at the corresponding positions of a related rule monomer, by word or by part of speech;
a weight accumulation subunit, which accumulates the weight values of the successfully matched word slots to obtain the matching degree between the corpus to be recognized and the related rule monomer.
Further, the device also comprises:
a model construction module, which constructs an intent recognition model;
a training module, which trains the intent recognition model with unlabeled corpus samples to obtain the pretrained intent recognition model.
Further, the training module comprises:
a clustering unit, which clusters the unlabeled corpus samples to obtain unlabeled corpus samples of preset categories;
a sampling unit, which samples the unlabeled corpus samples of each preset category to obtain initial samples;
a first training unit, which trains the intent recognition model with the annotated initial samples to obtain an initial intent recognition model;
a labeling unit, which inputs the unlabeled corpus samples remaining after sampling into the initial intent recognition model to obtain a label for each remaining unlabeled corpus sample;
a second training unit, which further trains the initial intent recognition model with the remaining samples, after their labels have been corrected, to obtain the pretrained intent recognition model.
Further, the device also comprises:
a test sample obtaining module, which obtains test corpora with known labels;
a test module, which tests the pretrained intent recognition model with the test corpora of known labels, taking the output of the model as the test result;
a test judgment module, which judges, based on the test result and the known labels, whether the pretrained intent recognition model meets a preset requirement;
a model output module, which, if the pretrained intent recognition model meets the preset requirement, takes the current model as the target model for intent recognition.
Further, the device also comprises:
a retraining module, which, if the pretrained intent recognition model does not meet the preset requirement, optimizes the current model and/or re-trains the model with an updated training sample set.
In a third aspect, an electronic device is provided, comprising a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein the processor, when executing the program, implements:
obtaining the matching degree between a corpus to be recognized and each rule monomer in a preset rule template, the preset rule template including a plurality of rule monomers, each rule monomer corresponding to one label;
judging whether any matching degree is greater than a preset threshold;
if so, taking the label of the rule monomer with the maximum matching degree as the intent recognition result.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored, the computer program, when executed by a processor, implementing:
obtaining the matching degree between a corpus to be recognized and each rule monomer in a preset rule template, the preset rule template including a plurality of rule monomers, each rule monomer corresponding to one label;
judging whether any matching degree is greater than a preset threshold;
if so, taking the label of the rule monomer with the maximum matching degree as the intent recognition result.
Embodiments of the present invention provide an intent recognition method and device under a dialogue framework, an electronic device, and a computer-readable storage medium. The method comprises: obtaining the matching degree between a corpus to be recognized and each rule monomer in a preset rule template, the preset rule template including a plurality of rule monomers, each rule monomer corresponding to one label; judging whether any matching degree is greater than a preset threshold; if so, taking the label of the rule monomer with the maximum matching degree as the intent recognition result; if not, inputting the corpus to be recognized into a pretrained intent recognition model and taking the output of that model as the intent recognition result. By falling back to a machine learning model only when the rule template cannot effectively identify the user's intent, user intent can be accurately identified with a lower demand for training samples, promoting the development of artificial intelligence devices such as question answering systems and dialogue robots.
In addition, embodiments of the present invention use unlabeled corpora as training samples: the unlabeled corpora are classified with a clustering algorithm and uniform training samples are obtained by sampling, which effectively guarantees the coverage of the samples, effectively reduces the annotation workload, and improves the speed and convenience of model training.
To make the above and other objects, features and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the accompanying drawings required for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; those of ordinary skill in the art may derive other drawings from them without creative effort. In the drawings:
Fig. 1 is an architecture diagram of the server S1 and a client device B1 in an embodiment of the present invention;
Fig. 2 is an architecture diagram of the server S1, a client device B1 and a database server S2 in an embodiment of the present invention;
Fig. 3 is flow diagram one of the intent recognition method under the dialogue framework in an embodiment of the present invention;
Fig. 4 shows the specific steps of step S100 in Fig. 3;
Fig. 5 shows the specific steps of step S110 in Fig. 4;
Fig. 6 shows the specific steps of step S130 in Fig. 4;
Fig. 7 is flow diagram two of the intent recognition method under the dialogue framework in an embodiment of the present invention;
Fig. 8 shows one set of specific steps of step S20 in Fig. 7;
Fig. 9 shows another set of specific steps of step S20 in Fig. 7;
Fig. 10 is a structural block diagram of the intent recognition device under the dialogue framework in an embodiment of the present invention;
Fig. 11 is a structure diagram of an electronic device according to an embodiment of the present invention.
Specific embodiment
To help those skilled in the art better understand the solutions of the present application, the technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only a part, rather than all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort shall fall within the scope of protection of the present application.
Those skilled in the art will understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may take the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
It should be noted that the terms "comprise" and "have" in the description and claims of the present application and in the above drawings, and any variants thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device containing a series of steps or units is not necessarily limited to the steps or units explicitly listed, but may include other steps or units that are not explicitly listed or that are inherent to the process, method, product or device.
It should be noted that, in the absence of conflict, the embodiments of the present application and the features therein may be combined with each other. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Intent recognition means that, when a user communicates with a robot, the robot can quickly judge the user's true intent from the direct or indirect information the user provides. For a dialogue robot, intent recognition is a core function point: correctly identifying the user's intent is a basic capability of the robot. The present application proposes a robot intent recognition method that provides a basis for quickly building robots meeting user needs; the method is described in detail below using the construction of a collection robot for the financial industry as an example.
The prior art mainly identifies user intent with machine learning models, which generally require hundreds of thousands of annotated samples to train a model with acceptable recognition performance. A newly built question answering system or dialogue robot cannot collect dialogue-scenario corpora at such a scale, so an effective machine learning model cannot be trained and user intent cannot be accurately identified. In addition, for dialogue-based intent recognition, user inputs are usually very short, which makes accurately identifying intent under a dialogue framework even harder and severely constrains the development of artificial intelligence devices such as question answering systems and dialogue robots.
To at least partly solve the above technical problems in the prior art, embodiments of the present invention provide an intent recognition method under a dialogue framework that falls back to a machine learning model only when the rule template cannot effectively identify the user's intent. User intent can thus be accurately identified with a lower demand for training samples, promoting the development of artificial intelligence devices such as question answering systems and dialogue robots.
In addition, embodiments of the present invention use unlabeled corpora as training samples: the unlabeled corpora are classified with a clustering algorithm and uniform training samples are obtained by sampling, which effectively guarantees the coverage of the samples, effectively reduces the annotation workload, and improves the speed and convenience of model training.
In view of this, the present application provides an intent recognition device under a dialogue framework. The device may be a server S1. Referring to Fig. 1, the server S1 may be communicatively connected with at least one client device B1; the client device B1 may send a corpus to be recognized to the server S1, and the server S1 may receive the corpus to be recognized online. The server S1 may preprocess the obtained corpus to be recognized online or offline, obtain the matching degree between the corpus to be recognized and each rule monomer in a preset rule template (the preset rule template including a plurality of rule monomers, each rule monomer corresponding to one label), and judge whether any matching degree is greater than a preset threshold. If so, the label of the rule monomer with the maximum matching degree is taken as the intent recognition result; if not, the corpus to be recognized is input into a pretrained intent recognition model, and the output of the pretrained intent recognition model is taken as the intent recognition result.
In addition, referring to Fig. 2, the server S1 may also be communicatively connected with at least one database server S2, the database server S2 being used to store the pretrained intent recognition model and/or historical unlabeled corpus samples and the preset rule template. The database server S2 sends the pretrained intent recognition model and/or historical unlabeled corpus samples and the preset rule template to the server S1 online.
Based on the above, the database server S2 may also be used to store test corpora with known labels. The database server S2 sends the test corpora of known labels to the server S1 online, and the server S1 may receive them online, obtain test samples from at least one test corpus of known label, apply the test samples to test the model, and take the output of the model as the test result. Then, based on the test result and the known evaluation results of the at least one test corpus of known label, the server judges whether the current model meets a preset requirement. If so, the current model is taken as the target model for intent recognition under the dialogue framework; if the current model does not meet the preset requirement, the current model is optimized and/or model training is performed again with an updated training sample set.
It can be understood that the client device B1 may include a smartphone, a tablet device, a network set-top box, a portable computer, a desktop computer, a personal digital assistant (PDA), a vehicle-mounted device, a smart wearable device, and the like. The smart wearable device may include smart glasses, a smart watch, a smart bracelet, and the like.
In practical applications, part of the intent recognition under the dialogue framework may be performed on the server S1 side as described above, i.e., in the architecture shown in Fig. 1; alternatively, all operations may be completed in the client device B1, with the client device B1 communicating directly with the database server S2. The choice may be made according to the processing capability of the client device B1, the restrictions of the user's usage scenario, and so on; the present application does not limit this. If all operations are completed in the client device B1, the client device B1 may further include a processor for the specific processing of intent recognition under the dialogue framework.
Any suitable network protocol may be used for communication between the server and the client device, including network protocols not yet developed as of the filing date of the present application. The network protocol may include, for example, TCP/IP, UDP/IP, HTTP, and HTTPS. Of course, the network protocol may also include, for example, RPC (Remote Procedure Call) and REST (Representational State Transfer) protocols used on top of the above protocols.
In one or more embodiments of the present application, the test corpora are unlabeled corpus samples that were not included in model training, and for each test corpus its known label needs to be obtained.
Fig. 3 is flow diagram one of the intent recognition method under the dialogue framework in an embodiment of the present invention. As shown in Fig. 3, the intent recognition method under the dialogue framework may include the following:
Step S100: obtaining the matching degree between a corpus to be recognized and each rule monomer in a preset rule template, the preset rule template including a plurality of rule monomers, each rule monomer corresponding to one label.
It is worth noting that, whether the customer service is human or a dialogue robot, voice dialogues are produced in the course of serving customers; these voice dialogues can be transcribed into corpora to be recognized by speech recognition technology.
A rule monomer may include a plurality of word slots. Each word slot may hold one word or define the part of speech of the word in the slot; in addition, a synonym table is provided for the word in each slot to extend the ways the word can be expressed.
For example, a rule monomer may have the structure "slot A + slot B", where the part of speech of slot A is adjective and the word in slot B is "Industrial and Commercial Bank"; this rule monomer can be written as "「adjective」 Industrial and Commercial Bank".
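A rule monomer like the one in the example might be represented as a small data structure; the field names and the `icbc_related` label below are illustrative assumptions, not the patent's format:

```python
# Hypothetical representation of the example rule monomer: slot A constrains
# part of speech, slot B fixes a keyword with a synonym list.
icbc_rule = {
    "label": "icbc_related",            # invented label for this monomer
    "slots": [
        {"pos": "adjective", "keyword": None, "synonyms": ()},      # slot A
        {"pos": None, "keyword": "industrial and commercial bank",  # slot B
         "synonyms": ("ICBC",)},
    ],
}

def slot_accepts(slot, word, pos):
    return (word == slot["keyword"] or word in slot["synonyms"]
            or (slot["pos"] is not None and pos == slot["pos"]))

print(slot_accepts(icbc_rule["slots"][1], "ICBC", "noun"))  # True (via synonym)
```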
Obtaining the matching degree between the corpus to be recognized and a rule monomer means obtaining the similarity between the corpus to be recognized and the rule template. The rule template is set by maintenance personnel by summarizing the common corpora of the application scenario.
Step S200: judging whether any matching degree is greater than a preset threshold; if so, executing step S300; otherwise, executing step S400.
The matching degree quantifies the similarity between the corpus to be recognized and each rule in the rule template: the higher the similarity, the larger the matching degree.
When the matching degree between the corpus to be recognized and at least one rule template is greater than the preset value, the similarity between the corpus to be recognized and the rule template has reached the preset requirement, and the rule template can be considered able to identify the intent of the corpus to be recognized.
When the matching degrees between the corpus to be recognized and all rule templates are less than the preset value, the similarity between the corpus to be recognized and the rule templates has not reached the preset requirement; the rule templates are then considered unable to identify the intent of the corpus to be recognized, and the pretrained intent recognition model is needed to identify its intent.
Specifically, the threshold is chosen according to practical application requirements and may be, for example, any value in [0.4, 2], such as 0.5, 0.7, 0.8, 0.9, 1.2 or 1.5.
Step S300: taking the label of the rule monomer with the maximum matching degree as the intent recognition result.
That is, the label of the rule monomer most similar to the corpus to be recognized is selected as the intent recognition result.
For example, suppose the corpus to be recognized is "I want to check the balance" and the rule monomer is "「verb」 balance" with the corresponding label "query balance". The similarity between the corpus to be recognized and the rule monomer is very high, so the intent recognition result for this corpus is "query balance".
Step S400: inputting the corpus to be recognized into the pretrained intent recognition model and taking the output of the pretrained intent recognition model as the intent recognition result.
The pretrained intent recognition model is a kind of machine learning or deep learning model, such as a text classification model, obtained after training on unlabeled corpus samples.
The above technical solution shows that, in the intention recognition method under a dialogue framework provided by the embodiments of the present invention, a machine learning model is used to identify the user's intention whenever the rule template cannot do so effectively. Rule matching takes effect immediately, which overcomes the problem that a machine learning model needs training time and cannot take effect in real time. The method identifies user intentions accurately, has a low requirement on the quantity of training samples, and promotes the development of artificial intelligence devices such as question answering systems and dialogue robots. By fully combining the advantages of rule templates and machine learning models, it significantly improves recognition accuracy even when little corpus is available, effectively raising the intelligence of dialogue robots and accelerating their production.
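The dispatch logic of steps S100 to S400 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the rule set, the threshold value, and the stand-in fallback classifier are all invented, and slot matching is reduced to keyword lookup.

```python
# Try the rule template first; fall back to a (stand-in) pre-trained model
# when no rule monomer exceeds the threshold.

def rule_match(corpus_tokens, rule_monomers):
    """Return (best_label, best_score) over all rule monomers (S100)."""
    best_label, best_score = None, 0.0
    for monomer in rule_monomers:
        score = sum(w for word, w in monomer["slots"] if word in corpus_tokens)
        if score > best_score:
            best_label, best_score = monomer["label"], score
    return best_label, best_score

def fallback_model(corpus_tokens):
    # Stand-in for the pre-trained intention assessment model.
    return "label_other"

def recognize_intent(corpus_tokens, rule_monomers, threshold=0.8):
    label, score = rule_match(corpus_tokens, rule_monomers)
    if score > threshold:                     # S200/S300: rule label wins
        return label
    return fallback_model(corpus_tokens)      # S400: model fallback

rules = [{"label": "query_balance", "slots": [("check", 0.3), ("balance", 0.7)]}]
print(recognize_intent(["check", "balance"], rules))   # rule fires: 1.0 > 0.8
print(recognize_intent(["hello"], rules))              # falls back to the model
```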
In an alternative embodiment, step S100 may include the following, referring to Fig. 4:
Step S110: parse the corpus to be identified to obtain multiple feature words to be matched and their parts of speech.
For example, if the corpus to be identified is "I want to transfer 500 yuan", then "transfer" (verb) and "500 yuan" (numeral-quantifier) can be extracted.
Parts of speech include types such as quantifier, adjective, verb, pronoun, adverb, time word, noun, and proper noun.
Step S120: obtain relevant rule monomers from the rule template according to the multiple feature words to be matched.
The rule template contains multiple rule monomers; each rule monomer consists of multiple word slots, and each word slot either specifies a word or specifies the part of speech of the word it can hold.
For each of the feature words to be matched, every rule monomer in the rule template is traversed, and any rule monomer whose word slots contain that feature word is taken as a relevant rule monomer; the relevant rule monomers obtained by traversing with all the feature words are combined into a group. During the traversal, a feature word is compared not only with the keyword in a rule monomer but also with the synonyms in the thesaurus associated with that keyword, so that relevant rule monomers are obtained more completely.
If the traversal with the feature words to be matched yields no rule monomer containing any of them, the matching degree of the feature words to be matched is 0.
Step S130: calculate the matching degree between the corpus to be identified and the relevant rule monomers according to the multiple feature words to be matched and their parts of speech.
Specifically, the feature words to be matched and their parts of speech are matched against the keywords and parts of speech of the word slots in the relevant rule monomers.
In an alternative embodiment, step S110 may include the following, referring to Fig. 5:
Step S111: segment the corpus to be identified to obtain multiple feature words;
Step S112: remove the stop words from the multiple feature words to obtain multiple feature words to be matched;
Step S113: annotate the parts of speech of the multiple feature words to be matched, obtaining the feature words to be matched together with their parts of speech.
The work of segmenting the corpus to be identified, removing stop words, and annotating parts of speech can be done with open-source text processing software such as jieba.
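Steps S111 to S113 can be sketched as below. In practice a toolkit such as jieba would perform the segmentation and part-of-speech tagging; to keep the sketch self-contained, a toy whitespace tokenizer, an invented stop word list, and a hand-made POS dictionary stand in for it.

```python
# Minimal stand-in for the parse step: segment, drop stop words, tag POS.
STOP_WORDS = {"i", "to", "the", "want"}
POS_DICT = {"transfer": "verb", "500": "numeral", "yuan": "quantifier",
            "check": "verb", "balance": "noun"}

def parse(corpus):
    tokens = corpus.lower().split()                       # S111: segmentation
    tokens = [t for t in tokens if t not in STOP_WORDS]   # S112: stop words
    return [(t, POS_DICT.get(t, "unknown")) for t in tokens]  # S113: POS tags

print(parse("I want to transfer 500 yuan"))
# [('transfer', 'verb'), ('500', 'numeral'), ('yuan', 'quantifier')]
```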
In an alternative embodiment, step S130 may include the following, referring to Fig. 6:
Step S131: match the multiple feature words to be matched and their parts of speech, by word or by part of speech, against the word slots at the corresponding positions of a relevant rule monomer;
Step S132: sum the weight values of the successfully matched word slots to obtain the matching degree between the corpus to be identified and the relevant rule monomer.
The weight value of each word slot is defined in the rule monomer.
Specifically, the weight of each word slot can be configured manually when the rule template is set up. Alternatively, word statistics can be computed over existing positive and negative sample sets of a corpus: the difference between the occurrence frequencies of a word or part of speech in the two sets, divided by its total number of occurrences, is taken as the weight of that word.
In the following, the calculation of the matching degree is illustrated with an example.
For example, take the rule monomer "[adjective] + ICBC", where the weight of the adjective word slot is 0.2 and the weight of the ICBC word slot is 0.7. The matching degree between the corpus to be identified "smart ICBC" and the rule is 0.2 + 0.7 = 0.9, while the matching degree between the corpus to be identified "developing Alibaba" and the rule is 0.2 + 0 = 0.2.
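The weighted word-slot accumulation (S131/S132) and the corpus-statistics weight rule can be sketched as follows, reproducing the worked example above. The slot matchers and the tiny positive/negative corpora are illustrative assumptions.

```python
# Matching degree: sum the weights of the word slots that match by word or POS.
def matching_degree(features, slots):
    """features: list of (word, pos); slots: list of (matcher, weight)."""
    score = 0.0
    for matcher, weight in slots:
        if any(matcher(word, pos) for word, pos in features):
            score += weight
    return score

slots = [
    (lambda w, p: p == "adjective", 0.2),   # slot A: any adjective
    (lambda w, p: w == "ICBC", 0.7),        # slot B: the keyword "ICBC"
]

print(matching_degree([("smart", "adjective"), ("ICBC", "noun")], slots))       # ~0.9
print(matching_degree([("developing", "adjective"), ("Alibaba", "noun")], slots))  # 0.2

# Corpus-statistics slot weight: (positive-set frequency - negative-set
# frequency) divided by the total number of occurrences, as in the text.
def slot_weight(word, positive, negative):
    pos_n = sum(s.split().count(word) for s in positive)
    neg_n = sum(s.split().count(word) for s in negative)
    total = pos_n + neg_n
    return (pos_n - neg_n) / total if total else 0.0
```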
Fig. 7 is a second flow diagram of the intention recognition method under a dialogue framework in an embodiment of the present invention. Referring to Fig. 7, on the basis of the steps shown in Fig. 3, the intention recognition method under a dialogue framework may further include:
Step S10: build the intention assessment model.
The model can be an encoder-decoder framework model, which includes a coding layer. Encoding converts the input sequence into a vector of fixed length; decoding converts that fixed-length vector into the output sequence. During the training of an encoder-decoder model, a semantic encoding with good representation and generalization ability can be obtained. The Encoder-Decoder framework is a model framework in deep learning; models of this framework include but are not limited to sequence-to-sequence (Sequence to Sequence, abbreviated Seq2Seq) models. The Encoder and Decoder layers of a Seq2Seq model can be implemented with Long Short-Term Memory (LSTM) or Gated Recurrent Unit (GRU) algorithms, and the Encoder-Decoder layers can also be implemented with the Transformer algorithm.
For example, the Seq2Seq model includes a Word embedding layer, an Encoder layer, and a Softmax layer. The question contained in each piece of training sample data, such as "Hello, may I ask whether you are XXX", is input to the Word embedding layer, which converts each word of the question into a term vector of fixed length and outputs it to the Encoder layer. The Encoder layer processes the input term vectors and outputs a state variable C to the Softmax layer, where C serves as the initial value of the Softmax layer. After training, the Softmax layer can output the label corresponding to the question; for example, for the question "Hello, may I ask whether you are XXX" it outputs the corresponding label (e.g., "label_ask_whether_me").
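The embedding → Encoder → Softmax pipeline above can be sketched in miniature. This is a toy stand-in, not the patent's model: mean pooling replaces the LSTM/GRU/Transformer encoder, and the vocabulary, dimensions, labels, and random weights are all invented.

```python
# Toy pipeline: Word embedding -> "Encoder" (mean pooling into state C)
# -> Softmax over intent labels.
import math
import random

random.seed(0)
VOCAB = ["hello", "are", "you", "xxx", "balance"]
DIM = 4
LABELS = ["label_ask_identity", "label_query_balance"]
EMB = {w: [random.uniform(-1, 1) for _ in range(DIM)] for w in VOCAB}
W = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in LABELS]

def encode(tokens):
    """'Encoder layer': mean-pool word vectors into a fixed-length state C."""
    vecs = [EMB[t] for t in tokens if t in EMB]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(DIM)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    return [e / sum(exps) for e in exps]

def classify(tokens):
    c = encode(tokens)                                   # state variable C
    logits = [sum(wi * ci for wi, ci in zip(row, c)) for row in W]
    probs = softmax(logits)
    return LABELS[probs.index(max(probs))], probs

label, probs = classify(["hello", "are", "you", "xxx"])
print(label, [round(p, 3) for p in probs])
```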
Step S20: train the intention assessment model with unlabeled corpus samples to obtain the pre-trained intention assessment model.
To address the heavy workload of manual annotation in machine learning model training, a scheme that trains the model with only a small number of annotated samples is designed, with the following process:
Referring to Fig. 8, the training process may include the following steps:
Step S21: cluster the unlabeled corpus samples to obtain unlabeled corpus samples of preset categories.
To obtain the unlabeled corpus samples, saved customer service recordings can be transcribed offline into text using Automatic Speech Recognition (ASR), yielding the original corpus. The original corpus is then proofread manually by dialogue scenario to obtain the unlabeled corpus samples; the proofreading includes but is not limited to error correction and sentence alignment. Each piece of corpus data in the resulting unlabeled corpus samples contains one question and one answer. For example: Question: Hello, may I ask what the bank's housing loan interest rate is? Answer: Hello, the current rate is 5.6%. Or: Question: What is my current credit card limit? Answer: Hello, your current limit is 50,000 RMB.
The clustering algorithm can be LDA (Latent Dirichlet Allocation) or K-means. Clustering requires the number of preset categories, but the labels corresponding to the preset categories are not yet known.
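Step S21 can be sketched as below. The text names LDA or K-means; this is a tiny pure-Python K-means over bag-of-words vectors, with invented sample questions and naive initialization, standing in for a real clustering library.

```python
# Cluster unlabeled corpus samples into a preset number of categories (k).
def bow(text, vocab):
    return [text.split().count(w) for w in vocab]

def kmeans(points, k, iters=10):
    centers = points[:k]                      # naive init: first k points
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        assign = []
        for p in points:
            d = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            i = d.index(min(d))
            groups[i].append(p)
            assign.append(i)
        centers = [[sum(col) / len(g) for col in zip(*g)] if g else c
                   for g, c in zip(groups, centers)]
    return assign

corpus = ["loan interest rate", "loan rate today",
          "credit card amount", "card amount now"]
vocab = sorted({w for s in corpus for w in s.split()})
labels = kmeans([bow(s, vocab) for s in corpus], k=2)
print(labels)   # the two loan questions and the two card questions separate
```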
Step S22: sample the unlabeled corpus samples of each preset category to obtain the original sample.
That is, a certain number of corpus samples are drawn from the unlabeled corpus samples of each preset category to obtain the original sample. The ratio or quantity of the sampling is configured according to actual needs and is not limited by the embodiments of the present invention.
The small sample obtained after sampling is annotated manually according to the preset categories, yielding the label corresponding to each piece of corpus data and thus the annotated original sample.
Step S23: train the intention assessment model with the annotated original sample to obtain an initial intention assessment model.
The model may include a coding (Encoder) layer and a classification layer; the output of the Encoder layer is the input of the classification layer, and the classification layer can use the Softmax algorithm. The question contained in each piece of training sample data serves as the input of the model, and the label contained in that piece of training sample data serves as the output of the model; the parameters of the Encoder layer are kept constant while the model is trained. Training only the parameters of the classification layer in this way both exploits the relationships among the unlabeled corpus samples and trains the classification layer in a supervised manner, completing the text classification task and yielding a model with higher generalization from fewer training samples.
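The frozen-encoder training of step S23 can be sketched as follows. The "frozen encoder" here is a fixed, deterministic featurizer standing in for the pre-trained Encoder layer, and the classification head is a logistic-regression layer trained by gradient descent; the data, dimensions, and learning rate are invented.

```python
# Train only the classification head; the encoder's parameters never change.
import math

def frozen_encoder(text, dim=8):
    """Stand-in for the pre-trained Encoder: fixed, never updated."""
    v = [0.0] * dim
    for tok in text.split():
        v[sum(ord(c) for c in tok) % dim] += 1.0
    return v

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def train_head(data, dim=8, lr=0.5, epochs=200):
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for text, y in data:
            x = frozen_encoder(text, dim)           # encoder stays frozen
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y                               # gradient of the log-loss
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

data = [("check my balance", 0), ("balance please", 0),
        ("talk to human agent", 1), ("human please", 1)]
w, b = train_head(data)
preds = [sigmoid(sum(wi * xi for wi, xi in zip(w, frozen_encoder(t))) + b) > 0.5
         for t, _ in data]
print(preds)   # matches the labels: [False, False, True, True]
```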
Step S24: input the unlabeled corpus samples remaining after sampling into the initial intention assessment model to obtain a label for each remaining unlabeled corpus sample.
Specifically, after the initial model is obtained, it is used to annotate the remaining unlabeled corpus samples: the question contained in each remaining piece of corpus data is fed to the initial model as input, yielding the label corresponding to that piece of corpus data. The labels of the remaining pieces of corpus data are then corrected manually, and the corrected data serve as supplementary training samples; each piece of supplementary training sample data contains one question and the corresponding label.
Step S25: further train the initial intention assessment model with the corrected remaining unlabeled corpus samples to obtain the pre-trained intention assessment model.
This scheme solves the problem of scarce corpus in the early stage of a project; as the number of human-machine dialogues grows after the project goes online, the newly added data continuously flow into the model for self-learning and continuous optimization.
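Steps S24 and S25 amount to a self-training loop, which can be sketched as below. The "model" here is a nearest-centroid classifier over toy bag-of-words vectors, manual correction is omitted, and all data are invented.

```python
# Pseudo-label the remaining unlabeled samples with the initial model,
# then retrain on the enlarged training set.
def featurize(text, vocab):
    return [text.split().count(w) for w in vocab]

def centroids(samples, vocab):
    sums, counts = {}, {}
    for text, label in samples:
        v = featurize(text, vocab)
        acc = sums.setdefault(label, [0.0] * len(vocab))
        sums[label] = [a + b for a, b in zip(acc, v)]
        counts[label] = counts.get(label, 0) + 1
    return {l: [x / counts[l] for x in s] for l, s in sums.items()}

def predict(text, cents, vocab):
    v = featurize(text, vocab)
    return min(cents, key=lambda l: sum((a - b) ** 2 for a, b in zip(cents[l], v)))

labeled = [("check balance", "balance"), ("transfer money", "transfer")]
unlabeled = ["balance now please", "money transfer today"]
vocab = sorted({w for t, _ in labeled for w in t.split()}
               | {w for t in unlabeled for w in t.split()})

cents = centroids(labeled, vocab)                             # initial model (S23)
pseudo = [(t, predict(t, cents, vocab)) for t in unlabeled]   # S24: pseudo-labels
cents = centroids(labeled + pseudo, vocab)                    # S25: retrain
print(pseudo)
```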
In an alternative embodiment, step S20 may further include the following steps, referring to Fig. 9:
Step S26: obtain test corpora with known labels;
Step S27: test the pre-trained intention assessment model with the test corpora of known labels, taking the output of the model as the test result;
Step S28: based on the test result and the known labels, judge whether the pre-trained intention assessment model meets the preset requirement;
if so, execute step S29; if not, execute step S30.
Step S29: take the current model as the target model for intention assessment.
Step S30: optimize the current model and/or update the training sample set, then return to step S21.
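The evaluation gate of steps S26 to S30 can be sketched as follows; the model, the test corpora, and the 90% preset requirement are stand-ins for illustration.

```python
# Score the pre-trained model on labeled test corpora; keep it only if the
# accuracy clears the preset requirement, otherwise send it back for retraining.
def evaluate(model, test_set):
    hits = sum(1 for text, label in test_set if model(text) == label)
    return hits / len(test_set)

def toy_model(text):
    return "query_balance" if "balance" in text else "other"

tests = [("check balance", "query_balance"), ("hello there", "other"),
         ("balance please", "query_balance"), ("transfer money", "other")]

accuracy = evaluate(toy_model, tests)
meets_requirement = accuracy >= 0.9        # preset requirement, e.g. 90%
print(accuracy, meets_requirement)         # 1.0 True
```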
Based on the same inventive concept, the embodiments of the present application also provide an intention assessment device under a dialogue framework, which can be used to implement the method described in the above embodiments, as described in the following embodiments. Since the principle by which the intention assessment device under a dialogue framework solves the problem is similar to that of the above method, the implementation of the device may refer to the implementation of the method, and repeated parts will not be described again. As used below, the term "unit" or "module" may denote a combination of software and/or hardware implementing predetermined functions. Although the device described in the following embodiments is preferably realized in software, a realization in hardware or in a combination of software and hardware is also possible and contemplated.
Fig. 10 is a structural block diagram of the intention assessment device under a dialogue framework in an embodiment of the present invention. As shown in Fig. 10, the intention assessment device under a dialogue framework may include: a matching degree obtaining module 10, a matching judgment module 20, a first intention recognition module 30, and a second intention recognition module 40.
The matching degree obtaining module 10 obtains the matching degree between the corpus to be identified and the rule monomers in a preset rule template; the preset rule template includes a plurality of rule monomers, each with a corresponding label.
It is worth noting that, whether the customer service is human or a dialogue robot, voice dialogues are produced in the course of serving customers; these voice dialogues can be transcribed into the corpus to be identified by speech recognition technology.
A rule monomer may include multiple word slots; each word slot can hold a word or define the part of speech of the word in the slot. In addition, a thesaurus is provided for the word in each slot to cover alternative expressions of the word.
For example, a rule monomer can have the structure word slot A + word slot B, where the part of speech of word slot A is adjective and the word in word slot B is ICBC; this rule monomer can be expressed as "[adjective] + ICBC".
Obtaining the matching degree between the corpus to be identified and a rule monomer amounts to obtaining the similarity between the corpus to be identified and the rule template. The rule template is set by maintenance personnel by summarizing the common corpora of the application scenario.
The matching judgment module 20 judges whether any matching degree is greater than the preset threshold.
Here the matching degree quantifies the similarity between the corpus to be identified and each rule in the rule template: the higher the similarity, the larger the matching degree.
When the matching degree between the corpus to be identified and at least one rule monomer of the template exceeds the preset value, the similarity between the corpus and the rule template has reached the preset requirement; in that case, the rule template can be used to identify the intention of the corpus to be identified.
When the matching degrees between the corpus to be identified and all rule monomers are below the preset value, the similarity cannot reach the preset requirement; the rule template then cannot identify the intention of the corpus to be identified, and the pre-trained intention assessment model is needed instead.
Specifically, the preset value is chosen according to the practical application requirements; it may be, for example, any value in the range [0.4, 2], such as 0.5, 0.7, 0.8, 0.9, 1.2, or 1.5.
If there exists a matching degree greater than the preset threshold, the first intention recognition module 30 takes the label of the rule monomer with the maximum matching degree as the intention assessment result.
That is: the label of the rule monomer most similar to the corpus to be identified is selected as the intention assessment result.
For example: suppose the corpus to be identified is "I want to check my balance" and the rule monomer is "[verb] + balance", with corresponding label "query balance". The corpus and the rule monomer are highly similar, so the intention assessment result for this corpus is "query balance".
If no matching degree is greater than the preset threshold, the second intention recognition module 40 inputs the corpus to be identified into the pre-trained intention assessment model and takes the output of the pre-trained intention assessment model as the intention assessment result.
The pre-trained intention assessment model is a machine learning or deep learning model, such as a text classification model, obtained by training on unlabeled corpus samples.
The above technical solution shows that, in the intention assessment device under a dialogue framework provided by the embodiments of the present invention, a machine learning model is used to identify the user's intention whenever the rule template cannot do so effectively. Rule matching takes effect immediately, which overcomes the problem that a machine learning model needs training time and cannot take effect in real time. The device identifies user intentions accurately, has a low requirement on the quantity of training samples, and promotes the development of artificial intelligence devices such as question answering systems and dialogue robots. By fully combining the advantages of rule templates and machine learning models, it significantly improves recognition accuracy even when little corpus is available, effectively raising the intelligence of dialogue robots and accelerating their production.
In an alternative embodiment, the matching degree obtaining module includes: a parsing unit, a rule filtering unit, and a matching degree computing unit.
The parsing unit parses the corpus to be identified to obtain multiple feature words to be matched and their parts of speech.
For example, if the corpus to be identified is "I want to transfer 500 yuan", then "transfer" (verb) and "500 yuan" (numeral-quantifier) can be extracted.
Parts of speech include types such as quantifier, adjective, verb, pronoun, adverb, time word, noun, and proper noun.
The rule filtering unit obtains relevant rule monomers from the rule template according to the multiple feature words to be matched.
The rule template contains multiple rule monomers; each rule monomer consists of multiple word slots, and each word slot either specifies a word or specifies the part of speech of the word it can hold.
For each of the feature words to be matched, every rule monomer in the rule template is traversed, and any rule monomer whose word slots contain that feature word is taken as a relevant rule monomer; the relevant rule monomers obtained by traversing with all the feature words are combined into a group. During the traversal, a feature word is compared not only with the keyword in a rule monomer but also with the synonyms in the thesaurus associated with that keyword, so that relevant rule monomers are obtained more completely.
If the traversal with the feature words to be matched yields no rule monomer containing any of them, the matching degree of the feature words to be matched is 0.
The matching degree computing unit calculates the matching degree between the corpus to be identified and the relevant rule monomers according to the multiple feature words to be matched and their parts of speech.
Specifically, the feature words to be matched and their parts of speech are matched against the keywords and parts of speech of the word slots in the relevant rule monomers.
In an alternative embodiment, the parsing unit includes: a segmentation subunit, a stop word removal subunit, and a part-of-speech annotation subunit.
The segmentation subunit segments the corpus to be identified to obtain multiple feature words;
the stop word removal subunit removes the stop words from the multiple feature words to obtain multiple feature words to be matched;
the part-of-speech annotation subunit annotates the parts of speech of the multiple feature words to be matched, obtaining the feature words to be matched together with their parts of speech.
The work of segmenting the corpus to be identified, removing stop words, and annotating parts of speech can be done with open-source text processing software such as jieba.
In an alternative embodiment, the rule monomer consists of multiple word slots, each word slot being a keyword and its synonyms, or a part of speech.
The rule filtering unit includes a search subunit, which searches the rule template with a feature word to be matched and takes any rule monomer whose word slots contain that feature word as a relevant rule monomer.
In an alternative embodiment, each word slot has a corresponding weight value.
The matching degree computing unit includes: a matching subunit and a weight accumulation subunit.
The matching subunit matches the multiple feature words to be matched and their parts of speech, by word or by part of speech, against the word slots at the corresponding positions of a relevant rule monomer.
The weight accumulation subunit sums the weight values of the successfully matched word slots to obtain the matching degree between the corpus to be identified and the relevant rule monomer.
The weight value of each word slot is defined in the rule monomer.
Specifically, the weight of each word slot can be configured manually when the rule template is set up. Alternatively, word statistics can be computed over existing positive and negative sample sets of a corpus: the difference between the occurrence frequencies of a word or part of speech in the two sets, divided by its total number of occurrences, is taken as the weight of that word.
In the following, the calculation of the matching degree is illustrated with an example.
For example, take the rule monomer "[adjective] + ICBC", where the weight of the adjective word slot is 0.2 and the weight of the ICBC word slot is 0.7. The matching degree between the corpus to be identified "smart ICBC" and the rule is 0.2 + 0.7 = 0.9, while the matching degree between the corpus to be identified "developing Alibaba" and the rule is 0.2 + 0 = 0.2.
In an alternative embodiment, the intention assessment device under a dialogue framework further includes: a model construction module and a training module.
The model construction module builds the intention assessment model.
The model can be an encoder-decoder framework model, which includes a coding layer. Encoding converts the input sequence into a vector of fixed length; decoding converts that fixed-length vector into the output sequence. During the training of an encoder-decoder model, a semantic encoding with good representation and generalization ability can be obtained. The Encoder-Decoder framework is a model framework in deep learning; models of this framework include but are not limited to sequence-to-sequence (Sequence to Sequence, abbreviated Seq2Seq) models. The Encoder and Decoder layers of a Seq2Seq model can be implemented with Long Short-Term Memory (LSTM) or Gated Recurrent Unit (GRU) algorithms, and the Encoder-Decoder layers can also be implemented with the Transformer algorithm.
For example, the Seq2Seq model includes a Word embedding layer, an Encoder layer, and a Softmax layer. The question contained in each piece of training sample data, such as "Hello, may I ask whether you are XXX", is input to the Word embedding layer, which converts each word of the question into a term vector of fixed length and outputs it to the Encoder layer. The Encoder layer processes the input term vectors and outputs a state variable C to the Softmax layer, where C serves as the initial value of the Softmax layer. After training, the Softmax layer can output the label corresponding to the question; for example, for the question "Hello, may I ask whether you are XXX" it outputs the corresponding label (e.g., "label_ask_whether_me").
The training module trains the intention assessment model with unlabeled corpus samples to obtain the pre-trained intention assessment model.
To address the heavy workload of manual annotation in machine learning model training, in an alternative embodiment the training module includes: a clustering unit, a sampling unit, a first training unit, an annotation unit, and a second training unit; through the cooperation of these units, the heavy manual annotation workload during training is reduced.
The clustering unit clusters the unlabeled corpus samples to obtain unlabeled corpus samples of preset categories.
To obtain the unlabeled corpus samples, saved customer service recordings can be transcribed offline into text using Automatic Speech Recognition (ASR), yielding the original corpus. The original corpus is then proofread manually by dialogue scenario to obtain the unlabeled corpus samples; the proofreading includes but is not limited to error correction and sentence alignment. Each piece of corpus data in the resulting unlabeled corpus samples contains one question and one answer. For example: Question: Hello, may I ask what the bank's housing loan interest rate is? Answer: Hello, the current rate is 5.6%. Or: Question: What is my current credit card limit? Answer: Hello, your current limit is 50,000 RMB.
The clustering algorithm can be LDA (Latent Dirichlet Allocation) or K-means. Clustering requires the number of preset categories, but the labels corresponding to the preset categories are not yet known.
The sampling unit samples the unlabeled corpus samples of each preset category to obtain the original sample.
That is, a certain number of corpus samples are drawn from the unlabeled corpus samples of each preset category to obtain the original sample. The ratio or quantity of the sampling is configured according to actual needs and is not limited by the embodiments of the present invention.
The small sample obtained after sampling is annotated manually according to the preset categories, yielding the label corresponding to each piece of corpus data and thus the annotated original sample.
The first training unit trains the intention assessment model with the annotated original sample to obtain an initial intention assessment model.
The model may include a coding (Encoder) layer and a classification layer; the output of the Encoder layer is the input of the classification layer, and the classification layer can use the Softmax algorithm. The question contained in each piece of training sample data serves as the input of the model, and the label contained in that piece of training sample data serves as the output of the model; the parameters of the Encoder layer are kept constant while the model is trained. Training only the parameters of the classification layer in this way both exploits the relationships among the unlabeled corpus samples and trains the classification layer in a supervised manner, completing the text classification task and yielding a model with higher generalization from fewer training samples.
The annotation unit inputs the unlabeled corpus samples remaining after sampling into the initial intention assessment model to obtain a label for each remaining unlabeled corpus sample.
Specifically, after the initial model is obtained, it is used to annotate the remaining unlabeled corpus samples: the question contained in each remaining piece of corpus data is fed to the initial model as input, yielding the label corresponding to that piece of corpus data. The labels of the remaining pieces of corpus data are then corrected manually, and the corrected data serve as supplementary training samples; each piece of supplementary training sample data contains one question and the corresponding label.
The second training unit further trains the initial intention assessment model with the corrected remaining unlabeled corpus samples to obtain the pre-trained intention assessment model.
This scheme solves the problem of scarce corpus in the early stage of a project; as the number of human-machine dialogues grows after the project goes online, the newly added data continuously flow into the model for self-learning and continuous optimization.
In an alternative embodiment, the intention assessment device further includes: a test sample obtaining module, a test module, a test judgment module, a model output module, and a retraining module.
The test sample obtaining module obtains test corpora with known labels;
the test module tests the pre-trained intention assessment model with the test corpora of known labels, taking the output of the model as the test result;
the test judgment module judges, based on the test result and the known labels, whether the pre-trained intention assessment model meets the preset requirement;
if the pre-trained intention assessment model meets the preset requirement, the model output module takes the current model as the target model for intention assessment;
if the pre-trained intention assessment model does not meet the preset requirement, the retraining module optimizes the current model and/or re-performs model training with the updated training sample set.
The devices, modules, or units described in the above embodiments can be realized by a computer chip or entity, or by a product with certain functions. A typical realization device is an electronic device; specifically, the electronic device may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an electronic mail device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
In a typical example, the electronic device specifically includes a memory, a processor, and a computer program stored in the memory and runnable on the processor, and the processor implements the following steps when executing the program:
obtaining the matching degree between a corpus to be identified and the rule units in a preset rule template, wherein the preset rule template includes a plurality of rule units and every rule unit corresponds to a label;
judging whether there is a matching degree greater than a preset threshold;
if so, taking the label of the rule unit with the maximum matching degree as the intention recognition result;
if not, inputting the corpus to be identified into a pre-trained intention recognition model, and taking the output of the pre-trained intention recognition model as the intention recognition result.
As can be seen from the above description, the electronic device provided in an embodiment of the present invention can be used for intention recognition under a dialogue framework. By falling back to a machine learning model when the rule template cannot effectively identify the user's intention, it can accurately identify the user's intention with a lower requirement on the quantity of training samples, promoting the development of artificial intelligence devices such as question answering systems and dialogue robots.
Referring now to Figure 11, it shows a schematic structural diagram of an electronic device 600 suitable for implementing the embodiments of the present application.
As shown in Figure 11, the electronic device 600 includes a central processing unit (CPU) 601, which can execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores the various programs and data required for the operation of the device 600. The CPU 601, the ROM 602 and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The I/O interface 605 is connected to the following components: an input section 606 including a keyboard, a mouse and the like; an output section 607 including a cathode ray tube (CRT), a liquid crystal display (LCD) and the like, as well as a loudspeaker and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 executes communication processing via a network such as the Internet. A driver 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disk or a semiconductor memory, is mounted on the driver 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to an embodiment of the present invention, the processes described above with reference to the flowcharts may be implemented as a computer software program. For example, an embodiment of the present invention includes a computer-readable storage medium on which a computer program is stored, and the computer program implements the following steps when executed by a processor:
obtaining the matching degree between a corpus to be identified and the rule units in a preset rule template, wherein the preset rule template includes a plurality of rule units and every rule unit corresponds to a label;
judging whether there is a matching degree greater than a preset threshold;
if so, taking the label of the rule unit with the maximum matching degree as the intention recognition result;
if not, inputting the corpus to be identified into a pre-trained intention recognition model, and taking the output of the pre-trained intention recognition model as the intention recognition result.
As can be seen from the above description, the computer-readable storage medium provided in an embodiment of the present invention can be used for intention recognition under a dialogue framework. By falling back to a machine learning model when the rule template cannot effectively identify the user's intention, it can accurately identify the user's intention with a lower requirement on the quantity of training samples, promoting the development of artificial intelligence devices such as question answering systems and dialogue robots.
In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
For convenience of description, the above apparatus is described by dividing its functions into various units. Of course, when implementing the present application, the functions of the units may be realized in one or more pieces of software and/or hardware.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to the embodiments of the present invention. It should be understood that every flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be realized by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can guide a computer or another programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce a manufactured article including an instruction apparatus, and the instruction apparatus realizes the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thus provide steps for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
It should also be noted that the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or also includes elements inherent to such a process, method, commodity or device. In the absence of further restrictions, an element defined by the sentence "including a ..." does not exclude the existence of other identical elements in the process, method, commodity or device that includes the element.
Those skilled in the art will understand that the embodiments of the present application may be provided as a method, a system or a computer program product. Therefore, the present application may take the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage and the like) containing computer-usable program code.
The present application may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures and the like that perform specific tasks or implement specific abstract data types. The present application may also be practiced in distributed computing environments, in which tasks are executed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in local and remote computer storage media including storage devices.
All the embodiments in this specification are described in a progressive manner; the same and similar parts between the embodiments may refer to each other, and each embodiment focuses on its differences from the other embodiments. In particular, since the system embodiment is basically similar to the method embodiment, its description is relatively simple, and for the relevant parts refer to the explanation of the method embodiment.
The above description is only an example of the present application and is not intended to limit the present application. For those skilled in the art, various changes and variations are possible in the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present application should be included within the scope of the claims of this application.
Claims (20)
1. An intention recognition method under a dialogue framework, characterized by comprising:
obtaining the matching degree between a corpus to be identified and the rule units in a preset rule template, wherein the preset rule template comprises a plurality of rule units and every rule unit corresponds to a label;
judging whether there is a matching degree greater than a preset threshold;
if so, taking the label of the rule unit with the maximum matching degree as the intention recognition result;
if not, inputting the corpus to be identified into a pre-trained intention recognition model, and taking the output of the pre-trained intention recognition model as the intention recognition result.
2. The intention recognition method under a dialogue framework according to claim 1, characterized in that the obtaining of the matching degree between the corpus to be identified and the rule units in the preset rule template comprises:
parsing the corpus to be identified to obtain a plurality of feature words to be matched and their parts of speech;
obtaining relevant rule units from the rule template according to the plurality of feature words to be matched;
calculating the matching degree between the corpus to be identified and the relevant rule units according to the plurality of feature words to be matched and their parts of speech.
3. The intention recognition method under a dialogue framework according to claim 2, characterized in that the parsing of the corpus to be identified to obtain a plurality of feature words to be matched and their parts of speech comprises:
segmenting the corpus to be identified to obtain a plurality of feature words;
removing the stop words among the plurality of feature words to obtain a plurality of feature words to be matched;
tagging the parts of speech of the plurality of feature words to be matched to obtain the plurality of feature words to be matched and their parts of speech.
4. The intention recognition method under a dialogue framework according to claim 2, characterized in that a rule unit is composed of a plurality of word slots, and each word slot gives the words in the slot or the part of speech of the words in the slot;
the obtaining of relevant rule units from the rule template according to the plurality of feature words to be matched comprises:
searching the rule template according to each feature word to be matched, and taking the rule units whose word slots contain the feature word to be matched as relevant rule units.
5. The intention recognition method under a dialogue framework according to claim 4, characterized in that every word slot corresponds to a weighted value;
the calculating of the matching degree between the corpus to be identified and the relevant rule units according to the plurality of feature words to be matched and their parts of speech comprises:
matching the plurality of feature words to be matched and their parts of speech, by word or by part of speech, against the word slots at the corresponding positions of a relevant rule unit;
accumulating the weighted values of the successfully matched word slots to obtain the matching degree between the corpus to be identified and the relevant rule unit.
6. The intention recognition method under a dialogue framework according to claim 1, characterized by further comprising:
constructing an intention recognition model;
training the intention recognition model with unlabeled corpus samples to obtain the pre-trained intention recognition model.
7. The intention recognition method under a dialogue framework according to claim 6, characterized in that the training of the intention recognition model with unlabeled corpus samples to obtain the pre-trained intention recognition model comprises:
clustering the unlabeled corpus samples to obtain unlabeled corpus samples of preset categories;
sampling the unlabeled corpus samples of each preset category to obtain initial samples;
training the intention recognition model with the annotated initial samples to obtain an initial intention recognition model;
inputting the unlabeled corpus samples remaining after the sampling into the initial intention recognition model to obtain the label of every remaining unlabeled corpus sample;
further training the initial intention recognition model with the remaining unlabeled corpus samples whose labels have been corrected, to obtain the pre-trained intention recognition model.
8. The intention recognition method under a dialogue framework according to claim 6, characterized by further comprising:
obtaining test corpus with known labels;
testing the pre-trained intention recognition model with the test corpus with the known labels, and taking the output of the model as the test result;
judging, based on the test result and the known labels, whether the pre-trained intention recognition model meets a preset requirement;
if so, taking the current model as the target model for intention recognition.
9. The intention recognition method under a dialogue framework according to claim 8, characterized by further comprising:
if the current model does not meet the preset requirement, optimizing the current model and/or retraining the model with an updated training sample set.
10. An intention recognition apparatus under a dialogue framework, characterized by comprising:
a matching degree acquisition module, which obtains the matching degree between a corpus to be identified and the rule units in a preset rule template, wherein the preset rule template comprises a plurality of rule units and every rule unit corresponds to a label;
a matching judgment module, which judges whether there is a matching degree greater than a preset threshold;
a first intention recognition module, which, if there is a matching degree greater than the preset threshold, takes the label of the rule unit with the maximum matching degree as the intention recognition result;
a second intention recognition module, which, if there is no matching degree greater than the preset threshold, inputs the corpus to be identified into a pre-trained intention recognition model and takes the output of the pre-trained intention recognition model as the intention recognition result.
11. The intention recognition apparatus under a dialogue framework according to claim 10, characterized in that the matching degree acquisition module comprises:
a parsing unit, which parses the corpus to be identified to obtain a plurality of feature words to be matched and their parts of speech;
a rule filtering unit, which obtains relevant rule units from the rule template according to the plurality of feature words to be matched;
a matching degree calculation unit, which calculates the matching degree between the corpus to be identified and the relevant rule units according to the plurality of feature words to be matched and their parts of speech.
12. The intention recognition apparatus under a dialogue framework according to claim 11, characterized in that the parsing unit comprises:
a segmentation subunit, which segments the corpus to be identified to obtain a plurality of feature words;
a stop-word removal subunit, which removes the stop words among the plurality of feature words to obtain a plurality of feature words to be matched;
a part-of-speech tagging subunit, which tags the parts of speech of the plurality of feature words to be matched to obtain the plurality of feature words to be matched and their parts of speech.
13. The intention recognition apparatus under a dialogue framework according to claim 11, characterized in that a rule unit is composed of a plurality of word slots, and each word slot is a keyword and its synonyms or a part of speech;
the rule filtering unit comprises:
a search subunit, which searches the rule template according to each feature word to be matched and takes the rule units whose word slots contain the feature word to be matched as relevant rule units.
14. The intention recognition apparatus under a dialogue framework according to claim 13, characterized in that every word slot corresponds to a weighted value;
the matching degree calculation unit comprises:
a matching subunit, which matches the plurality of feature words to be matched and their parts of speech, by word or by part of speech, against the word slots at the corresponding positions of a relevant rule unit;
a weighted value accumulation subunit, which accumulates the weighted values of the successfully matched word slots to obtain the matching degree between the corpus to be identified and the relevant rule unit.
15. The intention recognition apparatus under a dialogue framework according to claim 10, characterized by further comprising:
a model construction module, which constructs an intention recognition model;
a training module, which trains the intention recognition model with unlabeled corpus samples to obtain the pre-trained intention recognition model.
16. The intention recognition apparatus under a dialogue framework according to claim 15, characterized in that the training module comprises:
a clustering unit, which clusters the unlabeled corpus samples to obtain unlabeled corpus samples of preset categories;
a sampling unit, which samples the unlabeled corpus samples of each preset category to obtain initial samples;
a first training unit, which trains the intention recognition model with the annotated initial samples to obtain an initial intention recognition model;
a labeling unit, which inputs the unlabeled corpus samples remaining after the sampling into the initial intention recognition model to obtain the label of every remaining unlabeled corpus sample;
a second training unit, which further trains the initial intention recognition model with the remaining unlabeled corpus samples whose labels have been corrected, to obtain the pre-trained intention recognition model.
17. The intention recognition apparatus under a dialogue framework according to claim 15, characterized by further comprising:
a test sample acquisition module, which obtains test corpus with known labels;
a test module, which tests the pre-trained intention recognition model with the test corpus with the known labels, and takes the output of the model as the test result;
a test judgment module, which judges, based on the test result and the known labels, whether the pre-trained intention recognition model meets a preset requirement;
a model output module, which, if the pre-trained intention recognition model meets the preset requirement, takes the current model as the target model for intention recognition.
18. The intention recognition apparatus under a dialogue framework according to claim 17, characterized by further comprising:
a retraining module, which, if the pre-trained intention recognition model does not meet the preset requirement, optimizes the current model and/or retrains the model with an updated training sample set.
19. An electronic device comprising a memory, a processor and a computer program stored in the memory and runnable on the processor, characterized in that the processor, when executing the program, realizes the steps of the intention recognition method under a dialogue framework according to any one of claims 1 to 9.
20. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, realizes the steps of the intention recognition method under a dialogue framework according to any one of claims 1 to 9.
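The position-wise word-slot matching described in claims 4, 5, 13 and 14 (each slot specifies allowed words or a part of speech plus a weight; weights of matched slots are accumulated) can be sketched as follows. The slot contents, weights and part-of-speech tags are hypothetical illustrations, not values from the patent:

```python
# A rule unit is a sequence of word slots; each slot specifies either the
# allowed words or an allowed part of speech, plus a weight (hypothetical).
rule_unit = [
    {"words": {"transfer", "send"}, "pos": None,   "weight": 0.5},
    {"words": None,                 "pos": "NOUN", "weight": 0.3},
    {"words": {"please"},           "pos": None,   "weight": 0.2},
]

def slot_matches(slot, word, pos):
    # A slot matches by word when it lists words, otherwise by part of speech.
    if slot["words"] is not None:
        return word in slot["words"]
    return pos == slot["pos"]

def matching_degree(features, rule_unit):
    """features: [(word, part_of_speech), ...] from the corpus to be identified
    after segmentation, stop-word removal and POS tagging (claims 2-3).
    Slots are compared position by position; weights of matched slots are
    accumulated into the matching degree."""
    degree = 0.0
    for slot, (word, pos) in zip(rule_unit, features):
        if slot_matches(slot, word, pos):
            degree += slot["weight"]
    return degree

features = [("send", "VERB"), ("money", "NOUN"), ("now", "ADV")]
print(matching_degree(features, rule_unit))  # weights of the two matched slots
```

This matching degree is what the method compares against the preset threshold when deciding between the rule path and the model fallback.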
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910666196.0A CN110377911B (en) | 2019-07-23 | 2019-07-23 | Method and device for identifying intention under dialog framework |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110377911A true CN110377911A (en) | 2019-10-25 |
CN110377911B CN110377911B (en) | 2023-07-21 |
Family
ID=68255091
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910666196.0A Active CN110377911B (en) | 2019-07-23 | 2019-07-23 | Method and device for identifying intention under dialog framework |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110377911B (en) |
Cited By (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110851581A (en) * | 2019-11-19 | 2020-02-28 | 东软集团股份有限公司 | Model parameter determination method, device, equipment and storage medium |
CN110890097A (en) * | 2019-11-21 | 2020-03-17 | 京东数字科技控股有限公司 | Voice processing method and device, computer storage medium and electronic equipment |
CN111027667A (en) * | 2019-12-06 | 2020-04-17 | 北京金山安全软件有限公司 | Intention category identification method and device |
CN111143561A (en) * | 2019-12-26 | 2020-05-12 | 北京百度网讯科技有限公司 | Intention recognition model training method and device and electronic equipment |
CN111143524A (en) * | 2019-12-06 | 2020-05-12 | 联想(北京)有限公司 | User intention determining method and electronic equipment |
CN111177351A (en) * | 2019-12-20 | 2020-05-19 | 北京淇瑀信息科技有限公司 | Method, device and system for acquiring natural language expression intention based on rule |
CN111259625A (en) * | 2020-01-16 | 2020-06-09 | 平安科技(深圳)有限公司 | Intention recognition method, device, equipment and computer readable storage medium |
CN111291156A (en) * | 2020-01-21 | 2020-06-16 | 同方知网(北京)技术有限公司 | Question-answer intention identification method based on knowledge graph |
CN111324727A (en) * | 2020-02-19 | 2020-06-23 | 百度在线网络技术(北京)有限公司 | User intention recognition method, device, equipment and readable storage medium |
CN111339770A (en) * | 2020-02-18 | 2020-06-26 | 百度在线网络技术(北京)有限公司 | Method and apparatus for outputting information |
CN111368045A (en) * | 2020-02-21 | 2020-07-03 | 平安科技(深圳)有限公司 | User intention identification method, device, equipment and computer readable storage medium |
CN111400466A (en) * | 2020-03-05 | 2020-07-10 | 中国工商银行股份有限公司 | Intelligent dialogue method and device based on reinforcement learning |
CN111523311A (en) * | 2020-04-21 | 2020-08-11 | 上海优扬新媒信息技术有限公司 | Search intention identification method and device |
CN111737436A (en) * | 2020-06-24 | 2020-10-02 | 网易(杭州)网络有限公司 | Corpus intention identification method and device, electronic equipment and storage medium |
CN111858900A (en) * | 2020-09-21 | 2020-10-30 | 杭州摸象大数据科技有限公司 | Method, device, equipment and storage medium for generating question semantic parsing rule template |
CN112035641A (en) * | 2020-08-31 | 2020-12-04 | 康键信息技术(深圳)有限公司 | Intention extraction model verification method and device, computer equipment and storage medium |
CN112069302A (en) * | 2020-09-15 | 2020-12-11 | 腾讯科技(深圳)有限公司 | Training method of conversation intention recognition model, conversation intention recognition method and device |
CN112131357A (en) * | 2020-08-21 | 2020-12-25 | 国网浙江省电力有限公司杭州供电公司 | User intention identification method and device based on intelligent dialogue model |
CN112149429A (en) * | 2020-10-21 | 2020-12-29 | 成都小美伴旅信息技术有限公司 | High-accuracy semantic understanding and identifying method based on word slot order model |
CN112395392A (en) * | 2020-11-27 | 2021-02-23 | 浪潮云信息技术股份公司 | Intention identification method and device and readable storage medium |
CN112417121A (en) * | 2020-11-20 | 2021-02-26 | 平安普惠企业管理有限公司 | Client intention recognition method and device, computer equipment and storage medium |
CN112597748A (en) * | 2020-12-18 | 2021-04-02 | 深圳赛安特技术服务有限公司 | Corpus generation method, apparatus, device and computer readable storage medium |
CN112784024A (en) * | 2021-01-11 | 2021-05-11 | 软通动力信息技术(集团)股份有限公司 | Man-machine conversation method, device, equipment and storage medium |
CN113065364A (en) * | 2021-03-29 | 2021-07-02 | 网易(杭州)网络有限公司 | Intention recognition method and device, electronic equipment and storage medium |
WO2021135534A1 (en) * | 2020-06-16 | 2021-07-08 | 平安科技(深圳)有限公司 | Speech recognition-based dialogue management method, apparatus, device and medium |
CN113381973A (en) * | 2021-04-26 | 2021-09-10 | 深圳市任子行科技开发有限公司 | Method, system and computer readable storage medium for identifying SSR flow |
CN113449089A (en) * | 2021-06-11 | 2021-09-28 | 车智互联(北京)科技有限公司 | Intent recognition method of query statement, question answering method and computing device |
CN113495489A (en) * | 2020-04-07 | 2021-10-12 | 深圳爱根斯通科技有限公司 | Automatic configuration method and device, electronic equipment and storage medium |
CN113807148A (en) * | 2020-06-16 | 2021-12-17 | 阿里巴巴集团控股有限公司 | Text recognition matching method and device and terminal equipment |
CN113887643A (en) * | 2021-10-12 | 2022-01-04 | 西安交通大学 | New dialogue intention recognition method based on pseudo label self-training and source domain retraining |
WO2022089546A1 (en) * | 2020-10-28 | 2022-05-05 | 华为云计算技术有限公司 | Label generation method and apparatus, and related device |
TWI768513B (en) * | 2020-10-20 | 2022-06-21 | 宏碁股份有限公司 | Artificial intelligence training system and artificial intelligence training method |
WO2022141875A1 (en) * | 2020-12-30 | 2022-07-07 | 平安科技(深圳)有限公司 | User intention recognition method and apparatus, device, and computer-readable storage medium |
CN115269809A (en) * | 2022-09-19 | 2022-11-01 | 支付宝(杭州)信息技术有限公司 | Method and device for training intention recognition model and method and device for recognizing intention |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107909101A (en) * | 2017-11-10 | 2018-04-13 | 清华大学 | Semi-supervised transfer learning character identifying method and system based on convolutional neural networks |
US20180365322A1 (en) * | 2017-06-20 | 2018-12-20 | Accenture Global Solutions Limited | Automatic extraction of a training corpus for a data classifier based on machine learning algorithms |
CN109063221A (en) * | 2018-11-02 | 2018-12-21 | 北京百度网讯科技有限公司 | Query intention recognition methods and device based on mixed strategy |
CN109241255A (en) * | 2018-08-20 | 2019-01-18 | 华中师范大学 | A kind of intension recognizing method based on deep learning |
CN109376847A (en) * | 2018-08-31 | 2019-02-22 | 深圳壹账通智能科技有限公司 | User's intension recognizing method, device, terminal and computer readable storage medium |
- 2019-07-23: application CN201910666196.0A filed in China; granted as CN110377911B (active)
Cited By (47)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110851581A (en) * | 2019-11-19 | 2020-02-28 | 东软集团股份有限公司 | Model parameter determination method, device, equipment and storage medium |
CN110890097A (en) * | 2019-11-21 | 2020-03-17 | 京东数字科技控股有限公司 | Voice processing method and device, computer storage medium and electronic equipment |
CN111027667A (en) * | 2019-12-06 | 2020-04-17 | 北京金山安全软件有限公司 | Intention category identification method and device |
CN111143524A (en) * | 2019-12-06 | 2020-05-12 | 联想(北京)有限公司 | User intention determining method and electronic equipment |
CN111027667B (en) * | 2019-12-06 | 2023-10-17 | 北京金山安全软件有限公司 | Method and device for identifying intention category |
CN111177351A (en) * | 2019-12-20 | 2020-05-19 | 北京淇瑀信息科技有限公司 | Method, device and system for acquiring natural language expression intention based on rule |
CN111143561A (en) * | 2019-12-26 | 2020-05-12 | 北京百度网讯科技有限公司 | Intention recognition model training method and device and electronic equipment |
CN111143561B (en) * | 2019-12-26 | 2023-04-07 | 北京百度网讯科技有限公司 | Intention recognition model training method and device and electronic equipment |
CN111259625A (en) * | 2020-01-16 | 2020-06-09 | 平安科技(深圳)有限公司 | Intention recognition method, device, equipment and computer readable storage medium |
CN111259625B (en) * | 2020-01-16 | 2023-06-27 | 平安科技(深圳)有限公司 | Intention recognition method, device, equipment and computer readable storage medium |
CN111291156A (en) * | 2020-01-21 | 2020-06-16 | 同方知网(北京)技术有限公司 | Question-answer intention identification method based on knowledge graph |
CN111291156B (en) * | 2020-01-21 | 2024-01-12 | 同方知网(北京)技术有限公司 | Knowledge graph-based question and answer intention recognition method |
CN111339770B (en) * | 2020-02-18 | 2023-07-21 | 百度在线网络技术(北京)有限公司 | Method and device for outputting information |
CN111339770A (en) * | 2020-02-18 | 2020-06-26 | 百度在线网络技术(北京)有限公司 | Method and apparatus for outputting information |
CN111324727B (en) * | 2020-02-19 | 2023-08-01 | 百度在线网络技术(北京)有限公司 | User intention recognition method, device, equipment and readable storage medium |
CN111324727A (en) * | 2020-02-19 | 2020-06-23 | 百度在线网络技术(北京)有限公司 | User intention recognition method, device, equipment and readable storage medium |
CN111368045A (en) * | 2020-02-21 | 2020-07-03 | 平安科技(深圳)有限公司 | User intention identification method, device, equipment and computer readable storage medium |
CN111368045B (en) * | 2020-02-21 | 2024-05-07 | 平安科技(深圳)有限公司 | User intention recognition method, device, equipment and computer readable storage medium |
CN111400466A (en) * | 2020-03-05 | 2020-07-10 | 中国工商银行股份有限公司 | Intelligent dialogue method and device based on reinforcement learning |
CN113495489A (en) * | 2020-04-07 | 2021-10-12 | 深圳爱根斯通科技有限公司 | Automatic configuration method and device, electronic equipment and storage medium |
CN111523311B (en) * | 2020-04-21 | 2023-10-03 | 度小满科技(北京)有限公司 | Search intention recognition method and device |
CN111523311A (en) * | 2020-04-21 | 2020-08-11 | 上海优扬新媒信息技术有限公司 | Search intention identification method and device |
CN113807148A (en) * | 2020-06-16 | 2021-12-17 | 阿里巴巴集团控股有限公司 | Text recognition matching method and device and terminal equipment |
WO2021135534A1 (en) * | 2020-06-16 | 2021-07-08 | 平安科技(深圳)有限公司 | Speech recognition-based dialogue management method, apparatus, device and medium |
CN111737436A (en) * | 2020-06-24 | 2020-10-02 | 网易(杭州)网络有限公司 | Corpus intention identification method and device, electronic equipment and storage medium |
CN112131357A (en) * | 2020-08-21 | 2020-12-25 | 国网浙江省电力有限公司杭州供电公司 | User intention identification method and device based on intelligent dialogue model |
CN112035641A (en) * | 2020-08-31 | 2020-12-04 | 康键信息技术(深圳)有限公司 | Intention extraction model verification method and device, computer equipment and storage medium |
CN112069302A (en) * | 2020-09-15 | 2020-12-11 | 腾讯科技(深圳)有限公司 | Training method of conversation intention recognition model, conversation intention recognition method and device |
CN112069302B (en) * | 2020-09-15 | 2024-03-08 | 腾讯科技(深圳)有限公司 | Training method of conversation intention recognition model, conversation intention recognition method and device |
CN111858900A (en) * | 2020-09-21 | 2020-10-30 | 杭州摸象大数据科技有限公司 | Method, device, equipment and storage medium for generating question semantic parsing rule template |
TWI768513B (en) * | 2020-10-20 | 2022-06-21 | 宏碁股份有限公司 | Artificial intelligence training system and artificial intelligence training method |
CN112149429A (en) * | 2020-10-21 | 2020-12-29 | 成都小美伴旅信息技术有限公司 | High-accuracy semantic understanding and identifying method based on word slot order model |
WO2022089546A1 (en) * | 2020-10-28 | 2022-05-05 | 华为云计算技术有限公司 | Label generation method and apparatus, and related device |
CN112417121A (en) * | 2020-11-20 | 2021-02-26 | 平安普惠企业管理有限公司 | Client intention recognition method and device, computer equipment and storage medium |
CN112395392A (en) * | 2020-11-27 | 2021-02-23 | 浪潮云信息技术股份公司 | Intention identification method and device and readable storage medium |
CN112597748B (en) * | 2020-12-18 | 2023-08-11 | 深圳赛安特技术服务有限公司 | Corpus generation method, corpus generation device, corpus generation equipment and computer-readable storage medium |
CN112597748A (en) * | 2020-12-18 | 2021-04-02 | 深圳赛安特技术服务有限公司 | Corpus generation method, apparatus, device and computer readable storage medium |
WO2022141875A1 (en) * | 2020-12-30 | 2022-07-07 | 平安科技(深圳)有限公司 | User intention recognition method and apparatus, device, and computer-readable storage medium |
CN112784024A (en) * | 2021-01-11 | 2021-05-11 | 软通动力信息技术(集团)股份有限公司 | Man-machine conversation method, device, equipment and storage medium |
CN112784024B (en) * | 2021-01-11 | 2023-10-31 | 软通动力信息技术(集团)股份有限公司 | Man-machine conversation method, device, equipment and storage medium |
CN113065364A (en) * | 2021-03-29 | 2021-07-02 | 网易(杭州)网络有限公司 | Intention recognition method and device, electronic equipment and storage medium |
CN113381973B (en) * | 2021-04-26 | 2023-02-28 | 深圳市任子行科技开发有限公司 | Method, system and computer readable storage medium for identifying SSR flow |
CN113381973A (en) * | 2021-04-26 | 2021-09-10 | 深圳市任子行科技开发有限公司 | Method, system and computer readable storage medium for identifying SSR flow |
CN113449089B (en) * | 2021-06-11 | 2023-12-01 | 车智互联(北京)科技有限公司 | Intent recognition method, question-answering method and computing device of query statement |
CN113449089A (en) * | 2021-06-11 | 2021-09-28 | 车智互联(北京)科技有限公司 | Intent recognition method of query statement, question answering method and computing device |
CN113887643A (en) * | 2021-10-12 | 2022-01-04 | 西安交通大学 | New dialogue intention recognition method based on pseudo label self-training and source domain retraining |
CN115269809A (en) * | 2022-09-19 | 2022-11-01 | 支付宝(杭州)信息技术有限公司 | Method and device for training intention recognition model and method and device for recognizing intention |
Also Published As
Publication number | Publication date |
---|---|
CN110377911B (en) | 2023-07-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110377911A (en) | Intention recognition method and device under a dialogue framework | |
CN106776936B (en) | Intelligent interaction method and system | |
CN109635117A (en) | A knowledge-graph-based user intention recognition method and device | |
CN109101620A (en) | Similarity calculating method, clustering method, device, storage medium and electronic equipment | |
CN110309514A (en) | A semantic recognition method and device | |
CN109408622A (en) | Sentence processing method and device, equipment and storage medium | |
US11966389B2 (en) | Natural language to structured query generation via paraphrasing | |
US20220100963A1 (en) | Event extraction from documents with co-reference | |
CN108416032A (en) | A text classification method, device and storage medium | |
CN110597966A (en) | Automatic question answering method and device | |
CN112966089A (en) | Question processing method, device, equipment, medium and product based on a knowledge base | |
CN110297909A (en) | A classification method and device for unlabeled corpus | |
CN112015896B (en) | Emotion classification method and device based on artificial intelligence | |
CN109684354A (en) | Data query method and apparatus | |
CN110210038A (en) | Core entity determination method and system, server and computer-readable medium | |
CN112989761A (en) | Text classification method and device | |
CN104077327B (en) | Core word importance recognition method and device, and search result ranking method and device | |
CN114782054A (en) | Customer service quality detection method based on deep learning algorithm and related equipment | |
CN110209561A (en) | Evaluating method and evaluating apparatus for dialogue platform | |
CN107305559A (en) | An application recommendation method and apparatus | |
CN110059172A (en) | Method and apparatus for recommending an answer based on natural language understanding | |
EP4222635A1 (en) | Lifecycle management for customized natural language processing | |
CN117291185A (en) | Task processing method, entity identification method and task processing data processing method | |
CN117076598A (en) | Semantic retrieval model fusion method and system based on self-adaptive weight | |
CN116756278A (en) | Machine question-answering method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||