CN108304424A - Text key word extracting method and text key word extraction element - Google Patents
- Publication number
- CN108304424A CN108304424A CN201710203566.8A CN201710203566A CN108304424A CN 108304424 A CN108304424 A CN 108304424A CN 201710203566 A CN201710203566 A CN 201710203566A CN 108304424 A CN108304424 A CN 108304424A
- Authority
- CN
- China
- Prior art keywords
- text
- training
- trained
- network model
- extracted
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3346—Query execution using probabilistic model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Artificial Intelligence (AREA)
- Probability & Statistics with Applications (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A text keyword extraction method and apparatus. In one embodiment the method includes: obtaining a text to be extracted; scanning an associated keyword database to match the keywords in the text to be extracted; determining, from the matched keywords and the text to be extracted, all combinations of text clauses and corresponding keywords; analyzing, according to a keyword probability network model, the probability that each text clause together with its corresponding keyword combination holds; and determining the keyword combination with the highest analyzed probability as the keyword combination extracted from the text to be extracted. This embodiment responds quickly, simplifies the difficulty of extracting text keywords, and improves the accuracy of the extracted keywords.
Description
Technical field
The present invention relates to the field of intelligent interaction, and more particularly to a text keyword extraction method and a text keyword extraction apparatus.
Background art
Taking intelligent interaction devices such as smart speakers or intelligent assistants as an example, such devices typically interact with the user through dialogue: the user's speech is first recognized as text, and the keywords in that text (in some technical applications also called entity words) are then extracted. In this kind of interaction, however, the text is usually very short, often only a few words, which makes extracting keywords such as a singer's name or a song title extremely difficult. Moreover, for short texts, unlike long texts, large amounts of data cannot simply be crawled from the Internet, no large public annotated datasets are available, and public corpus data in vertical domains is scarce and must be collected by developers themselves, which is very unfavorable during a project's cold-start phase. There is therefore an urgent need for a text keyword extraction approach that yields better results.
At present, in the absence of annotated data, text keywords are mainly extracted with maximum matching algorithms or with template-matching methods. Maximum matching algorithms, which include forward maximum matching and backward maximum matching, are commonly used in Chinese word segmentation. Taking forward maximum matching as an example, consecutive characters of the text to be segmented are matched, from left to right, against the vocabulary of an entity library (also called a keyword database); whenever a match is found, the longest matching word is cut out. For example, for the short text "I want to listen to the song of ABC" (where A, B, and C each stand for a specific character) and a singer entity library {"AB", "ABC"}, the maximum-matching principle extracts the entity (keyword) "ABC" rather than "AB". Template-matching methods instead pre-design a set of common templates, such as "I want to listen to [song] of [singer]". If the user's query string is "I want to listen to SX of ABC", template matching extracts the keywords "ABC" and "SX", which are then looked up in the corresponding entity library; if present, the result is returned. However, although maximum matching is fast, its results are poor and it cannot distinguish identically named keywords: "Kiss Goodbye", for example, may be either a song or an album. As for template matching, users phrase requests in highly varied ways, so achieving good results may require hundreds of thousands of templates per scenario; this is not only slow, but whenever the user's phrasing is not covered by a template, no keyword can be extracted at all.
Summary of the invention
Based on this, the present embodiments provide a text keyword extraction method and a text keyword extraction apparatus that improve the accuracy of text keyword extraction while remaining fast.
A text keyword extraction method includes:

obtaining a text to be extracted;

scanning an associated keyword database to match the keywords in the text to be extracted;

determining, from the matched keywords and the text to be extracted, all combinations of text clauses and corresponding keywords;

analyzing, according to a keyword probability network model, the probability that each text clause together with its corresponding keyword combination holds; and

determining the keyword combination with the highest analyzed probability as the keyword combination extracted from the text to be extracted.
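The claimed steps can be sketched end to end as follows. The patent does not disclose the probability network model itself, so the scoring function here is a stand-in, and all names and the toy scoring rule are assumptions:

```python
from itertools import combinations

def match_keywords(text, keyword_db):
    """Step 2: return every database keyword that occurs in the text."""
    return [kw for kw in keyword_db if kw in text]

def enumerate_splits(text, matched):
    """Step 3: yield (clause, keyword_combination) pairs, where each
    combination of non-overlapping matched keywords is replaced by a slot."""
    for r in range(1, len(matched) + 1):
        for combo in combinations(matched, r):
            clause, ok = text, True
            for kw in combo:
                if kw not in clause:
                    ok = False  # keywords overlapped; skip this combination
                    break
                clause = clause.replace(kw, "[slot]", 1)
            if ok:
                yield clause, combo

def extract(text, keyword_db, score):
    """Steps 4-5: score every split, keep the highest-probability combo."""
    candidates = list(enumerate_splits(text, match_keywords(text, keyword_db)))
    if not candidates:
        return ()
    best = max(candidates, key=lambda c: score(*c))
    return best[1]

# Toy scorer (assumption): prefer combinations that consume more of the text.
toy_score = lambda clause, combo: sum(len(k) for k in combo)
print(extract("I want to listen to ABC", {"AB", "ABC"}, toy_score))  # ('ABC',)
```

In the patented scheme, `toy_score` would be replaced by the trained keyword probability network model evaluating whether the clause/keyword split holds.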
A text keyword extraction apparatus includes:

a text acquisition module for obtaining a text to be extracted;

a keyword matching module for scanning an associated keyword database to match the keywords in the text to be extracted;

a combination determining module for determining, from the matched keywords and the text to be extracted, all combinations of text clauses and corresponding keywords;

a probability analysis module for analyzing, according to a keyword probability network model, the probability that each text clause together with its corresponding keyword combination holds; and

an extraction determining module for determining the keyword combination with the highest probability analyzed by the probability analysis module as the keyword combination extracted from the text to be extracted.
According to the embodiments above, when the keywords in a text to be extracted need to be extracted, the scheme scans an associated keyword database to match the keywords in the text, determines all combinations of text clauses and corresponding keywords on the basis of those keywords, analyzes with the keyword probability network model the probability that each text clause together with its corresponding keyword combination holds, and determines the keyword combination with the highest analyzed probability as the keyword combination extracted from the text to be extracted. Because all clause/keyword combinations are determined on the basis of the keywords matched in the text, and the probability of each combination is then computed by the keyword probability network model, the scheme not only responds quickly but also simplifies the difficulty of extracting text keywords and improves their accuracy.
Description of the drawings
Fig. 1 is a schematic diagram of the application environment of the scheme in one embodiment;
Fig. 2 is a schematic diagram of the structure of the terminal in one embodiment;
Fig. 3 is a schematic diagram of the structure of the server in one embodiment;
Fig. 4 is a flow diagram of the text keyword extraction method in one embodiment;
Fig. 5 is a schematic diagram of the principle of generating the keyword probability network model in one embodiment;
Fig. 6 is a schematic diagram of the principle of extracting text keywords in one embodiment;
Fig. 7 is a flow diagram of generating the keyword probability network model in a specific example;
Fig. 8 is a flow diagram of generating the keyword probability network model in another specific example;
Fig. 9 is a flow diagram of generating the keyword probability network model in yet another specific example;
Fig. 10 is a structural schematic diagram of the text keyword extraction apparatus in one embodiment;
Fig. 11 is a structural schematic diagram of the text keyword extraction apparatus in another embodiment;
Fig. 12 is a structural schematic diagram of the model generation module in a specific example.
Detailed description

To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein serve only to explain the present invention and do not limit its scope of protection.
Fig. 1 shows a schematic diagram of the working environment in one embodiment of the invention. As shown in Fig. 1, the working environment involves a terminal 101 and possibly also a server 102; terminal 101 and server 102 can communicate over a network. Terminal 101 can interact intelligently with the end user, receiving text input by the user or recognizing the user's speech as text. After the keywords in the text content are extracted, subsequent related services can be carried out, such as querying locally or over the network and playing the song corresponding to the extracted keywords, querying locally or over the network the corresponding film, or querying the weather corresponding to the extracted keywords. The extraction of the keywords in the text content can be performed on terminal 101, or terminal 101 can send the text content to server 102 and the extraction is performed on server 102. In the present example scheme, the keywords in the text content can be extracted with a keyword probability network model. This model can be determined by server 102 and stored locally on server 102, which then performs the subsequent extraction of keywords from text content; alternatively, server 102 can send the keyword probability network model to terminal 101, which then performs the subsequent extraction. The keyword probability network model can also be determined by a terminal 101, sent to server 102, and distributed by server 102 to other terminals 101 for execution. The present embodiments concern the scheme by which terminal 101 or server 102 extracts the keywords in text content.
Fig. 2 shows a structural schematic diagram of terminal 101 in one embodiment. Terminal 101 includes a processor, a storage medium, a communication interface, a power interface, and a memory connected through a system bus. The storage medium of terminal 101 stores a text keyword extraction apparatus, which implements a text keyword extraction method. The communication interface of terminal 101 is used to connect and communicate with server 102 or other servers in the network; the power interface of terminal 101 is used to connect with an external power supply, which powers terminal 101 through it. Terminal 101 can be any device capable of intelligent input and output, such as a mobile terminal (e.g., a mobile phone or tablet computer) or a smart speaker, or any other smart device with the structure above.
Fig. 3 shows a structural schematic diagram of server 102 in one embodiment. It includes a processor, a power supply module, a storage medium, a memory, and a communication interface connected through a system bus. The storage medium of server 102 stores an operating system, a database, and a text keyword extraction apparatus, which implements a text keyword extraction method. The communication interface of the server is used to connect and communicate with terminal 101 and other servers in the network.
Fig. 4 shows a flow diagram of the text keyword extraction method in one embodiment. As shown in Fig. 4, the text keyword extraction method in this embodiment includes:

Step S401: obtaining a text to be extracted;

Step S402: scanning an associated keyword database to match the keywords in the text to be extracted;

Step S403: determining, from the matched keywords and the text to be extracted, all combinations of text clauses and corresponding keywords, where any one of the determined text clauses together with its corresponding keyword combination constitutes the text to be extracted;

Step S404: analyzing, according to a keyword probability network model, the probability that each text clause together with its corresponding keyword combination holds;

Step S405: determining the keyword combination with the highest analyzed probability as the keyword combination extracted from the text to be extracted.
According to the embodiments above, when the keywords in a text to be extracted need to be extracted, the scheme scans an associated keyword database to match the keywords in the text to be extracted, then determines all combinations of text clauses and corresponding keywords on the basis of those keywords, analyzes with the keyword probability network model the probability that each text clause together with its corresponding keyword combination holds, and determines the keyword combination with the highest probability as the keyword combination extracted from the text. Because all clause/keyword combinations are determined on the basis of the keywords extracted from the text, and the probability of each combination is then computed by the keyword probability network model, the scheme not only responds quickly but also simplifies the difficulty of extracting text keywords and improves their accuracy.
The scheme in the embodiments above can be executed on a terminal or on a server.

When it is executed on a terminal, the text to be extracted can be text input by the terminal user, for example text entered through an interaction device such as a keyboard or touch screen, or text obtained by recognizing the terminal user's speech. In the present embodiment, the text to be extracted can be obtained by receiving text input by the user or by translating the user's speech into text; in other embodiments the text to be extracted can also be obtained in other ways.

Also when executing on a terminal, the keyword probability network model can be generated in advance by the terminal, in which case the method can further include, before obtaining the text to be extracted, the step of generating the keyword probability network model. Alternatively, after a server generates the keyword probability network model, the terminal can obtain the model from the server, in which case the method can further include, before obtaining the text to be extracted, the step of obtaining the keyword probability network model generated by the server.
When executing on a server, the text to be extracted can be received from a terminal: after obtaining the text to be extracted, the terminal uploads it to the server. The text to be extracted can be text input by the terminal user, for example through an interaction device such as a keyboard or touch screen, or text obtained by recognizing the terminal user's speech; in other embodiments it can also be text obtained in other ways.

Also when executing on a server, the keyword probability network model can be generated in advance by the server, in which case the method can further include, before obtaining the text to be extracted, the step of generating the keyword probability network model.
In a specific example, when the terminal or server generates the keyword probability network model, the specific manner may include:

obtaining training texts, the training texts including clause rule templates and corpus texts of various domains; and

training on the training texts to obtain the keyword probability network model.

Here, a clause rule template expresses a specific clause rule. Since the set of clause rules may not cover every clause, for example some colloquial clauses, the training texts also include corpus texts of various domains, which can be colloquial texts. In one concrete implementation, the corpus texts of the various domains can be obtained by web crawling.

When training on the training texts to obtain the keyword probability network model, since the training texts contain both kinds of text, clause rule templates and domain corpus texts, the training can also be arranged according to the actual technical needs.
In one specific example, the training does not distinguish whether a training text is a clause rule template or a domain corpus text; in each training pass one text is drawn at random. The specific manner may include:

randomly drawing one current training text from the training texts, this current training text being a clause rule template or a corpus text, i.e., the text drawn may be either;

inputting the drawn current training text into the current network model to be trained and training it, obtaining the trained network model;

when the clause rule templates and the domain corpus texts have not all been drawn, updating the current network model to be trained with the trained network model, and returning to the step of randomly drawing one current training text, until all clause rule templates and all domain corpus texts in the training texts have been drawn; and

determining the trained network model thus obtained as the keyword probability network model.
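The random-draw loop above can be sketched as follows. This is a hedged sketch under stated assumptions: `train_step` stands in for one training update on whatever model the implementer chooses, and its name and signature are not from the patent:

```python
import random

def train_random(templates, corpus_texts, model, train_step):
    """Draw one training text at random per pass, without replacement,
    until every clause rule template and corpus text has been used once."""
    pool = list(templates) + list(corpus_texts)
    random.shuffle(pool)                      # makes each draw random
    while pool:
        sample = pool.pop()                   # one randomly chosen training text
        model = train_step(model, sample)     # trained model replaces the
    return model                              # current model to be trained
```

When the pool is exhausted, the returned model is what the patent calls the keyword probability network model.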
In the specific example above, the description updates the current network model to be trained with the trained network model only after judging that the clause rule templates and domain corpus texts have not all been drawn. In a particular technical application, it is equally possible to first update the current network model to be trained with the trained network model and then judge whether all clause rule templates and domain corpus texts have been drawn; in that case, once all of them have been drawn, the updated current network model to be trained is determined as the keyword probability network model.

From this example of training to obtain the keyword probability network model, it will be understood that, since one training text is drawn at random each time, the texts drawn in two adjacent training passes may be of the same type (for example, both clause rule templates, or both corpus texts) or of different types (for example, a clause rule template in one pass and a corpus text in the next).
In another specific example, the number of clause rule templates in the training texts can be set equal to the number of corpus texts, and the training can then alternate between the clause rule templates and the corpus texts of the various domains. The specific manner may include:

drawing one clause rule template from the clause rule templates, inputting it into the current network model to be trained and training it, obtaining the trained network model;

updating the current network model to be trained with the trained network model, drawing one corpus text from the domain corpus texts, inputting it into the updated current network model to be trained and training it, obtaining the trained network model;

when the clause rule templates and the domain corpus texts have not all been drawn, updating the current network model to be trained with the trained network model and returning to the step of drawing one clause rule template, until all clause rule templates and all domain corpus texts in the training texts have been drawn; and

determining the trained network model thus obtained as the keyword probability network model.
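The alternating variant can be sketched similarly. `train_step` is an assumed helper that performs one training update and returns the updated model; it is not part of the patent:

```python
def train_alternating(templates, corpus_texts, model, train_step):
    """Feed the model one clause rule template, then one corpus text,
    in strict alternation, until both equal-sized pools are exhausted."""
    assert len(templates) == len(corpus_texts)  # the example sets these equal
    for template, corpus_text in zip(templates, corpus_texts):
        model = train_step(model, template)      # one clause rule template
        model = train_step(model, corpus_text)   # then one corpus text
    return model
```

The design choice here mirrors the text: equal counts guarantee the two kinds of training data alternate cleanly to the end.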
As in the previous example, the description above updates the current network model to be trained and returns to drawing a clause rule template only after judging that the clause rule templates and domain corpus texts have not all been drawn. In a particular technical application, it is equally possible to first update the current network model to be trained with the trained network model and then judge whether all clause rule templates and domain corpus texts have been drawn; in that case, once all of them have been drawn, the updated current network model to be trained is determined as the keyword probability network model.

In both specific examples above, when the drawn clause rule template or corpus text is input into the current network model to be trained, it can be input character by character, so as to obtain better generalization ability.
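Character-level input can be illustrated with a minimal encoder; the vocabulary handling and the `<unk>` token are illustrative assumptions rather than part of the patent:

```python
def build_char_vocab(texts):
    """Assign each character seen in the training texts an integer id."""
    vocab = {"<unk>": 0}                      # id 0 reserved for unseen chars
    for text in texts:
        for ch in text:
            vocab.setdefault(ch, len(vocab))
    return vocab

def encode(text, vocab):
    """Turn a text into the character-id sequence fed to the model."""
    return [vocab.get(ch, vocab["<unk>"]) for ch in text]

vocab = build_char_vocab(["I want to hear a song"])
print(encode("song", vocab))  # [11, 7, 5, 12]
```

Feeding ids per character rather than per word keeps the input space small, which is the generalization benefit the embodiment describes for scarce corpora.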
Based on the embodiments and specific examples above, in a particular technical realization the scheme can be divided into two processes: offline model training and online text entity extraction. In offline model training, after the training data (the training texts) is obtained, the final keyword probability network model is obtained by training on the training texts, as shown in Fig. 5. In the online text entity extraction stage, keywords are extracted with the keyword probability network model obtained above, as shown in Fig. 6.
For offline model training, two kinds of training data can be prepared. One kind is the rule templates of each vertical service domain. Taking the music scenario as an example, rule templates can be: "I want to listen to [song] of [singer]", "who sings [song]", "which songs are on [album]", where [singer] denotes a singer, [song] denotes a song, and [album] denotes an album. In the present embodiment these rule templates are called clause rule templates. Different vertical service domains, such as music, film, or weather, can have different clause rule templates, so that a different corresponding keyword probability network model is trained for each vertical service domain.

These clause rule templates can be corpora annotated from collected user data, or simple templates written by developers by hand. In the initial stage of a vertical service domain, the clause rule templates are usually rule templates written by developers.
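The clause rule templates can be illustrated as simple slotted strings. The template wordings follow the patent's music examples; the slot values and helper name are assumptions:

```python
# Illustrative clause rule templates for a music vertical service domain.
music_templates = [
    "I want to listen to [song] of [singer]",
    "who sings [song]",
    "which songs are on [album]",
]

def fill(template, slots):
    """Expand a clause rule template by substituting slot values,
    e.g. to generate concrete training sentences from entity-library entries."""
    for name, value in slots.items():
        template = template.replace(f"[{name}]", value)
    return template

print(fill(music_templates[0], {"song": "Kiss Goodbye", "singer": "some singer"}))
# I want to listen to Kiss Goodbye of some singer
```

Templates like these stay domain-specific, which is why the patent trains one model per vertical service domain.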
The other kind of training data is corpus texts from non-vertical domains, usually colloquial corpus data, which supplement expressions that the clause rule templates may lack and thereby improve the generalization ability of the trained keyword probability network model; in the present embodiment they are called the corpus texts of the various domains. For example, if the expression "hear" in "I want to hear a song" never appears in the clause rule templates, the trained model will generalize poorly: when "hear" occurs in a text to be extracted, the model may fail to recognize a keyword or recognize a wrong one. Adding some colloquial texts (the domain corpus texts) therefore improves the generalization ability of the trained model. In order not to affect keyword extraction in the individual vertical domains, this part of the corpus can be chosen from non-vertical domains; that is, these corpus texts are suitable for training the models of all vertical domains. In a particular application, these corpus texts can be obtained by web crawling, and the number crawled can be determined according to actual needs.

When obtaining the training texts, actual needs can be taken into account. As described above, depending on how the numbers of clause rule templates and domain corpus texts in the training texts are constrained, the specific training process can differ.
Fig. 7 shows the flow diagram of the generation key words probabilities network model in a specific example, this specifically shows
It is to be illustrated for not treating the language material text that training text is clause rule template or each field and distinguishing in example.
As shown in fig. 7, obtain the language material text comprising clause rule template and each field first waits for training text, with sound
For happy field, clause rule template can be I want to listen [song] of [singer], [song] be it is that who sings, [album] it is inner
Which song etc., the language material text in each field can be include texts such as " I want to hear song ", wherein clause rule template
It is the text for being only applicable to current area, the language material text in each field is to be applicable not only to current area, can be applicable to it
The text in his field.
Then, as shown in Fig. 7, the specific training process may be:
Randomly extract one clause rule template or corpus text from the training texts; the current training text extracted at this point may be either a clause rule template or a corpus text;
Input the extracted clause rule template or corpus text into the current network model to be trained and train it, obtaining a trained network model;
Judge whether every clause rule template and every field's corpus text in the training texts has been extracted;
If extraction is not finished, that is, at least one clause rule template or field corpus text in the training texts has not yet been extracted, update the current network model to be trained with the trained network model, return to the above step of randomly extracting one clause rule template or corpus text from the training texts, and repeat the above process until every clause rule template and every field's corpus text in the training texts has been extracted;
If extraction is finished, that is, all clause rule templates and all fields' corpus texts in the training texts have been extracted, determine the trained network model thus obtained as the keyword probability network model, completing the training process.
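A minimal sketch of this randomized training loop. The `train_step` update function and the `model` object are hypothetical placeholders (the patent does not specify the update itself, only the extract-train-update-repeat flow):

```python
import random

def train_keyword_model(model, templates, corpus_texts, train_step):
    """Randomly draw each training text once; update the model after every draw."""
    pool = list(templates) + list(corpus_texts)  # clause rule templates + field corpus texts
    random.shuffle(pool)                         # random extraction order
    while pool:                                  # until every training text has been extracted
        text = pool.pop()                        # one template or corpus text
        model = train_step(model, text)          # trained model replaces the current one
    return model                                 # determined as the keyword probability network model
```

The loop ends exactly when the judgement step in the text would report "extraction finished".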
In the above specific example, the current network model to be trained is updated with the trained network model only after judging that the clause rule templates and the corpus texts of each field in the training texts have not all been extracted. In a particular technical application, the current network model to be trained may instead be updated with the trained network model first, and then it is judged whether every clause rule template and every field's corpus text in the training texts has been extracted; in that case, once every clause rule template and every field's corpus text has been extracted, the updated current network model to be trained is determined as the keyword probability network model.
When the extracted clause rule template or corpus text is input into the current network model to be trained, it may be input character by character, that is, with single characters rather than whole words as the input unit. Training with character-unit input avoids the situation where, with word-unit input and relatively little corpus, the results are very sparse and the effect is poor; it yields better generalization ability and improves the extraction accuracy for shorter texts. For different service fields, different corresponding keyword probability network models may be trained.
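Character-unit input can be illustrated as follows; the vocabulary mapping here is invented purely for illustration (the patent does not define one):

```python
def to_char_ids(text, vocab):
    """Map a clause rule template or corpus text to character-level ids.
    Characters not in the vocabulary share an <unk> id."""
    unk = vocab.get("<unk>", 0)
    return [vocab.get(ch, unk) for ch in text]

# Hypothetical vocabulary for demonstration.
vocab = {"<unk>": 0, "I": 1, " ": 2, "w": 3, "a": 4, "n": 5, "t": 6}
ids = to_char_ids("I want", vocab)
```

Each character becomes one input step of the network, which keeps the input vocabulary small even when the corpus is limited.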
The above current network model to be trained may adopt any feasible model in light of actual needs. In one specific application example, an LSTM (Long Short-Term Memory) network may be used as the model to be trained. As a special kind of recurrent neural network, an LSTM can learn long-term dependency information well, so using an LSTM makes it possible to approximate well the probability that a clause holds. Since an LSTM network contains many unknown parameters, the above training process can estimate the specific values of these parameters, which are then used in the actual keyword extraction to extract the keywords in the text to be extracted. During training, the LSTM network may be trained with the BPTT (Back Propagation Through Time) algorithm.
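One timestep of an LSTM cell, written out in plain Python to show the gating that lets the network carry long-range dependency information. This follows the standard LSTM formulation with scalar states for readability; it is an illustrative sketch, not the patent's implementation, and the weight layout is an assumption:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, W):
    """One timestep of a scalar LSTM cell.
    W maps gate name -> (w_x, w_h, b); all states are scalars for clarity."""
    i = sigmoid(W["i"][0] * x + W["i"][1] * h_prev + W["i"][2])    # input gate
    f = sigmoid(W["f"][0] * x + W["f"][1] * h_prev + W["f"][2])    # forget gate
    o = sigmoid(W["o"][0] * x + W["o"][1] * h_prev + W["o"][2])    # output gate
    g = math.tanh(W["g"][0] * x + W["g"][1] * h_prev + W["g"][2])  # candidate value
    c = f * c_prev + i * g    # cell state carries long-term information
    h = o * math.tanh(c)      # hidden state / output
    return h, c
```

BPTT estimates the `W` parameters by unrolling such steps over the characters of each training text and back-propagating through time.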
Fig. 8 shows a schematic flowchart of generating the keyword probability network model in another specific example. In this example, the number of clause rule templates in the training texts is the same as the number of corpus texts, and training alternates between the clause rule templates and the corpus texts of each field.
As shown in Fig. 8, the specific training process may be:
Extract one clause rule template from the clause rule templates;
Input the extracted clause rule template into the current network model to be trained and train it, obtaining a trained network model;
Update the current network model to be trained with the trained network model;
Extract one corpus text from the corpus texts of the fields;
Input the extracted corpus text into the updated current network model to be trained and train it, obtaining a trained network model;
Judge whether every clause rule template and every field's corpus text in the training texts has been extracted;
If extraction is not finished, that is, some clause rule template or field corpus text in the training texts has not yet been extracted, update the current network model to be trained with the trained network model and return to the step of extracting one clause rule template from the clause rule templates, until every clause rule template and every field's corpus text in the training texts has been extracted;
If extraction is finished, determine the trained network model thus obtained as the keyword probability network model.
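The alternating scheme of Fig. 8 can be sketched as follows, again with a hypothetical `train_step` update function standing in for the unspecified gradient update:

```python
def train_alternating(model, templates, corpus_texts, train_step):
    """Alternate template/corpus updates; assumes len(templates) == len(corpus_texts)."""
    assert len(templates) == len(corpus_texts)
    for template, corpus in zip(templates, corpus_texts):
        model = train_step(model, template)  # train on one clause rule template, then update
        model = train_step(model, corpus)    # train on one field corpus text, then update
    return model  # determined as the keyword probability network model
```

Because the two lists are the same length, the loop exhausts both at once, which is the termination condition described above.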
The above specific example is described by first extracting a clause rule template for training and then extracting a field corpus text; in another example, a field corpus text may be extracted and trained on first, with a clause rule template extracted and trained on afterwards.
In addition, the above specific example updates the current network model to be trained with the trained network model and returns to extracting a clause rule template only after judging that the clause rule templates and field corpus texts in the training texts have not all been extracted. In a particular technical application, the current network model to be trained may instead be updated with the trained network model first, and then it is judged whether every clause rule template and every field's corpus text in the training texts has been extracted, as shown in Fig. 9; in that case, once every clause rule template and every field's corpus text has been extracted, the updated current network model to be trained is determined as the keyword probability network model.
The other technical features of generating the keyword probability network model in the examples shown in Figs. 8 and 9 may be the same as in the example shown in Fig. 7.
After the keyword probability network model is obtained through training, it can be applied to extract the keywords in a text to be extracted. When the keyword probability network model is trained by a server, the server may send the model to a terminal, which then performs the text keyword extraction, or the server itself may perform the extraction after receiving a text to be extracted from a terminal. When the model is trained by a terminal, the terminal itself may perform text keyword extraction based on the model, or it may send the model to a server, which distributes it to other terminals; the server and each terminal can then perform text keyword extraction based on the keyword probability network model.
When specifically performing text keyword extraction, the text to be extracted is first obtained. This text may be text that a terminal user enters through a user-interaction device such as a keyboard or a touch screen, text obtained by recognizing the terminal user's speech, or text obtained in other ways.
In this embodiment, after the text to be extracted is obtained, its current field may first be determined, and the extraction of text keywords is then performed using the keyword dictionary and keyword probability network model corresponding to that field. When keyword extraction is needed for only one field, for example on a smart speaker, a default keyword dictionary and keyword probability network model can be bound directly. When keyword extraction may be performed for multiple fields, for example when executed on a server, the field is first determined and the keyword dictionary and keyword probability network model corresponding to that field are then used. The following examples are described assuming the field has been determined.
After the text to be extracted is obtained, the associated keyword dictionary of its field is scanned according to that field, matching the keywords in the text to be extracted and thereby exhaustively listing its keywords. Then, according to the text to be extracted and the matched keywords, all text clauses and corresponding keyword combinations are determined, where any one determined text clause together with its corresponding keyword combination jointly constitutes the text to be extracted. Those skilled in the art will understand that matching the keywords in the text to be extracted means matching all words in the text that match words in the keyword dictionary, and that determining all text clauses and corresponding keyword combinations means enumerating all possible clauses of the text to be extracted and the keywords under each clause. Suppose the text to be extracted is "I want to listen to ABC's QLX", the singer entity library is {"AB", "ABC"}, and the song entity library is {"QLX"}, where A, B, C, Q, L, X each denote a specific word or character. Then, for the text to be extracted "I want to listen to ABC's QLX", according to the singer entity library {"AB", "ABC"} and the song entity library {"QLX"}, the matched keywords are AB, ABC, and QLX, and the possible text clauses determined include: "I want to listen to ABC's QLX", "I want to listen to [singer]C's QLX", "I want to listen to [singer]'s QLX", "I want to listen to [singer]C's [song]", and "I want to listen to [singer]'s [song]". The possible text clauses and corresponding keyword combinations obtained from this are shown in Table 1 below.
Table 1
Possible combination | [singer] | [song] | Probability |
I want to listen to ABC's QLX | | | 0.001 |
I want to listen to [singer]C's QLX | AB | | 0.002 |
I want to listen to [singer]'s QLX | ABC | | 0.009 |
I want to listen to [singer]C's [song] | AB | QLX | 0.011 |
I want to listen to [singer]'s [song] | ABC | QLX | 0.051 |
As can be seen from Table 1, the clause "I want to listen to [singer]C's QLX" together with its corresponding keyword combination "[singer]: AB" jointly constitutes the original text to be extracted, "I want to listen to ABC's QLX"; the clause "I want to listen to [singer]'s QLX" together with its corresponding keyword combination "[singer]: ABC" likewise constitutes the original text to be extracted; and the clause "I want to listen to [singer]C's [song]" together with its corresponding keyword combination "[singer]: AB; [song]: QLX" also jointly constitutes the original text to be extracted, "I want to listen to ABC's QLX".
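The enumeration step can be sketched as follows, using the entity libraries and slot labels from the example. The generation strategy (replacing every non-overlapping subset of entity matches) is an illustrative assumption; note that it also produces a song-only candidate that the example's table happens to omit:

```python
from itertools import combinations

def enumerate_clauses(text, entity_libs):
    """Enumerate every clause obtained by replacing non-overlapping entity
    matches with their slot labels; each clause plus its keyword combination
    jointly reconstitutes the original text."""
    matches = []  # (start, end, slot, entity) occurrences in the text
    for slot, entities in entity_libs.items():
        for entity in entities:
            start = text.find(entity)
            while start != -1:
                matches.append((start, start + len(entity), slot, entity))
                start = text.find(entity, start + 1)
    results = [(text, {})]  # the clause in which no keyword is replaced
    for r in range(1, len(matches) + 1):
        for subset in combinations(matches, r):
            spans = sorted(subset)
            if any(spans[i][1] > spans[i + 1][0] for i in range(len(spans) - 1)):
                continue  # overlapping replacements cannot form one clause
            clause, shift, combo = text, 0, {}
            for start, end, slot, entity in spans:
                clause = clause[:start + shift] + slot + clause[end + shift:]
                shift += len(slot) - (end - start)
                combo[slot] = entity
            results.append((clause, combo))
    return results
```

Applied to "I want to listen to ABC's QLX" with the singer and song entity libraries above, this yields the unreplaced clause plus the single- and double-replacement candidates.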
Then, each text clause is input into the above keyword probability network model, giving the probability that each text clause together with its corresponding keyword combination holds, as shown in the last column of Table 1. As can be seen from Table 1, the largest probability value is 0.051, i.e., the probability that "I want to listen to [singer]'s [song]" holds is the largest, so the text clause and corresponding keyword combination with the maximum probability 0.051 are selected, and the extracted keyword combination is finally determined to be {[singer]: ABC; [song]: QLX}.
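The selection step reduces to an argmax over the model's clause probabilities. In this sketch, a hypothetical `score` function stands in for the keyword probability network model:

```python
def extract_keywords(candidates, score):
    """Pick the keyword combination whose clause the model scores highest.

    candidates: (clause, combination) pairs; score(clause) returns the
    probability that the clause holds.
    """
    best_clause, best_combo = max(candidates, key=lambda pair: score(pair[0]))
    return best_combo
```

With the probabilities of Table 1, the clause scored 0.051 wins and its combination is returned.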
Based on the same idea as described above, this embodiment also provides a text keyword extraction apparatus. Figure 10 shows a schematic structural diagram of the text keyword extraction apparatus in one embodiment.
As shown in Figure 10, the text keyword extraction apparatus of this embodiment includes:
a text acquisition module 101, configured to obtain a text to be extracted;
a keyword matching module 102, configured to scan an associated keyword dictionary and match the keywords in the text to be extracted;
a combination determining module 103, configured to determine, according to the text to be extracted and the matched keywords in it, all text clauses and corresponding keyword combinations, where any one determined text clause together with its corresponding keyword combination jointly constitutes the text to be extracted;
a probability analysis module 104, configured to analyze and determine, according to the keyword probability network model, the probability that each text clause together with its corresponding keyword combination holds;
an extraction determining module 105, configured to determine the keyword combination corresponding to the maximum of the probabilities determined by the probability analysis module 104 as the keyword combination extracted from the text to be extracted.
According to the scheme of the embodiment described above, when the keywords in a text to be extracted need to be extracted, an associated keyword dictionary is scanned to match the keywords in the text to be extracted, all text clauses and corresponding keyword combinations are then determined based on those keywords, the probability that each text clause together with its corresponding keyword combination holds is analyzed and determined according to the keyword probability network model, and the keyword combination corresponding to the maximum of the determined probabilities is determined as the keyword combination extracted from the text to be extracted. Because all text clauses and corresponding keyword combinations are determined on the basis of the keywords extracted from the text, and the probability of each text clause and corresponding keyword combination is then determined based on the keyword probability network model, the scheme not only responds quickly but also reduces the difficulty of extracting text keywords and improves the accuracy of the extracted keywords.
The scheme in the embodiment described above may be executed on a terminal or on a server.
When executed on a terminal, the text to be extracted may be text entered by the terminal user, for example through a user-interaction device such as a keyboard or a touch screen, text obtained by recognizing the terminal user's speech, or, in other embodiments, text obtained in other ways.
When executed on a server, the text to be extracted may be received from a terminal, which uploads it to the server after obtaining it. This text may likewise be entered by the terminal user through a user-interaction device such as a keyboard or a touch screen, obtained by recognizing the terminal user's speech, or obtained in other ways.
On the other hand, whether the apparatus is arranged on a terminal or a server, the keyword probability network model may be generated in advance by the terminal or the server. Therefore, in one specific example, as shown in Figure 11, the text keyword extraction apparatus may further include:
a model generation module 106, configured to generate the keyword probability network model.
In addition, when the apparatus is arranged on a terminal, the server may generate the keyword probability network model and the terminal may then obtain it from the server. Therefore, as shown in Figure 11, in another embodiment the text keyword extraction apparatus may further include:
a model acquisition module 107, configured to obtain the keyword probability network model generated by the server.
Figure 12 shows a schematic structural diagram of the model generation module 106 in a specific example. As shown in Figure 12, the model generation module 106 includes:
a training text acquisition module 1061, configured to obtain training texts, the training texts including the clause rule templates and the corpus texts of each field;
a training module 1062, configured to train on the training texts to obtain the keyword probability network model.
Here, the clause rule templates express specific clause rules. Since the set clause rules may not cover all clauses, for example some colloquial clauses, the training texts may also include the corpus texts of each field, which may be colloquial texts. In one concrete application, the corpus texts of each field may be obtained by web crawling.
As shown in Figure 12, the training module 1062 may specifically include a training text extraction unit 10621, a training unit 10622, and a model determination unit 10623.
When training on the training texts to obtain the keyword probability network model, since the training texts include both kinds of text, the clause rule templates and the corpus texts of each field, the training may also be arranged according to actual technical needs.
In one specific example, when training on the training texts, the clause rule templates and the corpus texts of each field need not be distinguished, and one text is randomly extracted in each training round. In this case:
the training text extraction unit 10621 is configured to randomly extract one current training text from the training texts, the current training text being a clause rule template or a corpus text, and, after the training unit obtains the trained network model, to randomly extract another current training text from the training texts whenever the clause rule templates or the corpus texts of each field in the training texts have not all been extracted, until every clause rule template and every field's corpus text in the training texts has been extracted;
the training unit 10622 is configured to input the current training text extracted by the training text extraction unit into the current network model to be trained and train it, obtaining a trained network model, and to update the current network model to be trained with the trained network model;
the model determination unit 10623 is configured to determine the trained network model obtained by the training unit as the keyword probability network model when every clause rule template and every field's corpus text in the training texts has been extracted.
In another specific example, the number of clause rule templates in the training texts may be set equal to the number of corpus texts, and training may then alternate between the clause rule templates and the corpus texts of each field. In this case:
the training text extraction unit 10621 is configured, whenever the clause rule templates or the corpus texts of each field in the training texts have not all been extracted, to alternately extract one clause rule template from the clause rule templates or one corpus text from the corpus texts;
the training unit 10622 is configured to input the clause rule template or corpus text extracted by the training text extraction unit into the current network model to be trained and train it, obtaining a trained network model, and to update the current network model to be trained with the trained network model;
the model determination unit 10623 is configured to determine the trained network model obtained by the training unit as the keyword probability network model when every clause rule template and every field's corpus text in the training texts has been extracted.
In both of the above specific examples, when inputting the extracted clause rule template or corpus text into the current network model to be trained, the training unit 10622 may input it character by character, so as to obtain better generalization ability.
Those of ordinary skill in the art will appreciate that all or part of the flows in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a non-volatile computer-readable storage medium; in the embodiments of the present invention, the program may be stored in a storage medium of a computer system and executed by at least one processor of the computer system to implement the flows of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features contains no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present invention, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be pointed out that those of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept, and these all belong to the protection scope of the present invention. Therefore, the protection scope of this patent shall be determined by the appended claims.
Claims (14)
1. A text keyword extraction method, characterized by comprising:
obtaining a text to be extracted;
scanning an associated keyword dictionary and matching the keywords in the text to be extracted;
determining, according to the text to be extracted and the matched keywords in it, all text clauses and corresponding keyword combinations;
analyzing and determining, according to a keyword probability network model, the probability that each text clause together with its corresponding keyword combination holds;
determining the keyword combination corresponding to the maximum of the determined probabilities as the keyword combination extracted from the text to be extracted.
2. The text keyword extraction method according to claim 1, characterized by further comprising, before obtaining the text to be extracted, the step of:
generating the keyword probability network model.
3. The text keyword extraction method according to claim 2, characterized in that the manner of generating the keyword probability network model comprises:
obtaining training texts, the training texts comprising clause rule templates and the corpus texts of each field;
training on the training texts to obtain the keyword probability network model.
4. The text keyword extraction method according to claim 3, characterized in that training on the training texts to obtain the keyword probability network model comprises:
randomly extracting one current training text from the training texts, the current training text being a clause rule template or a corpus text;
inputting the extracted current training text into a current network model to be trained and training it, obtaining a trained network model;
when the clause rule templates or the corpus texts of each field in the training texts have not all been extracted, updating the current network model to be trained with the trained network model and returning to the step of randomly extracting one current training text from the training texts, until every clause rule template and every field's corpus text in the training texts has been extracted;
determining the trained network model thus obtained as the keyword probability network model.
5. The text keyword extraction method according to claim 3, characterized in that the number of clause rule templates in the training texts is the same as the number of corpus texts;
training on the training texts to obtain the keyword probability network model comprises:
extracting one clause rule template from the clause rule templates, inputting the extracted clause rule template into a current network model to be trained, and training it, obtaining a trained network model;
after updating the current network model to be trained with the trained network model, extracting one corpus text from the corpus texts, inputting the extracted corpus text into the current network model to be trained, and training it, obtaining a trained network model;
when the clause rule templates or the corpus texts of each field in the training texts have not all been extracted, updating the current network model to be trained with the trained network model and returning to the step of extracting one clause rule template from the clause rule templates, until every clause rule template and every field's corpus text in the training texts has been extracted;
determining the trained network model thus obtained as the keyword probability network model.
6. The text keyword extraction method according to claim 4 or 5, characterized in that the extracted clause rule template or corpus text is input character by character into the current network model to be trained for training.
7. The text keyword extraction method according to claim 1, characterized by further comprising, before obtaining the text to be extracted, the step of:
obtaining the keyword probability network model generated by a server.
8. A text keyword extraction apparatus, characterized by comprising:
a text acquisition module, configured to obtain a text to be extracted;
a keyword matching module, configured to scan an associated keyword dictionary and match the keywords in the text to be extracted;
a combination determining module, configured to determine, according to the text to be extracted and the matched keywords in it, all text clauses and corresponding keyword combinations;
a probability analysis module, configured to analyze and determine, according to a keyword probability network model, the probability that each text clause together with its corresponding keyword combination holds;
an extraction determining module, configured to determine the keyword combination corresponding to the maximum of the probabilities determined by the probability analysis module as the keyword combination extracted from the text to be extracted.
9. The text keyword extraction apparatus according to claim 8, characterized by further comprising:
a model generation module, configured to generate the keyword probability network model.
10. The text keyword extraction apparatus according to claim 9, characterized in that the model generation module comprises:
a training text acquisition module, configured to obtain training texts, the training texts comprising clause rule templates and the corpus texts of each field;
a training module, configured to train on the training texts to obtain the keyword probability network model.
11. The text keyword extraction apparatus according to claim 10, characterized in that the training module comprises:
a training text extraction unit, configured to randomly extract one current training text from the training texts, the current training text being a clause rule template or a corpus text, and, after the training unit obtains the trained network model, to randomly extract another current training text from the training texts whenever the clause rule templates or the corpus texts of each field have not all been extracted, until every clause rule template and every field's corpus text in the training texts has been extracted;
a training unit, configured to input the current training text extracted by the training text extraction unit into a current network model to be trained and train it, obtaining a trained network model, and to update the current network model to be trained with the trained network model;
a model determination unit, configured to determine the trained network model obtained by the training unit as the keyword probability network model when every clause rule template and every field's corpus text in the training texts has been extracted.
12. The text keyword extraction apparatus according to claim 10, characterized in that the number of clause rule templates in the training texts is the same as the number of corpus texts;
the training module comprises:
a training text extraction unit, configured, whenever the clause rule templates or the corpus texts of each field in the training texts have not all been extracted, to alternately extract one clause rule template from the clause rule templates or one corpus text from the corpus texts;
a training unit, configured to input the clause rule template or corpus text extracted by the training text extraction unit into a current network model to be trained and train it, obtaining a trained network model, and to update the current network model to be trained with the trained network model;
a model determination unit, configured to determine the trained network model obtained by the training unit as the keyword probability network model when every clause rule template and every field's corpus text in the training texts has been extracted.
13. The text keyword extraction apparatus according to claim 11 or 12, characterized in that the training unit inputs the extracted clause rule template or corpus text character by character into the current network model to be trained for training.
14. The text keyword extraction apparatus according to claim 8, characterized by further comprising:
a model acquisition module, configured to obtain the keyword probability network model generated by a server.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710203566.8A CN108304424B (en) | 2017-03-30 | 2017-03-30 | Text keyword extraction method and text keyword extraction device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108304424A true CN108304424A (en) | 2018-07-20 |
CN108304424B CN108304424B (en) | 2021-09-07 |
Family
ID=62872103
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710203566.8A Active CN108304424B (en) | 2017-03-30 | 2017-03-30 | Text keyword extraction method and text keyword extraction device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108304424B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103186509A (en) * | 2011-12-29 | 2013-07-03 | 北京百度网讯科技有限公司 | Wildcard character class template generalization method and device and general template generalization method and system |
CN104239300A (en) * | 2013-06-06 | 2014-12-24 | 富士通株式会社 | Method and device for mining semantic keywords from text |
US20150347383A1 (en) * | 2014-05-30 | 2015-12-03 | Apple Inc. | Text prediction using combined word n-gram and unigram language models |
CN105138515A (en) * | 2015-09-02 | 2015-12-09 | 百度在线网络技术(北京)有限公司 | Named entity recognition method and device |
- 2017-03-30: CN application CN201710203566.8A, granted as patent CN108304424B, status active
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110377916A (en) * | 2018-08-17 | 2019-10-25 | 腾讯科技(深圳)有限公司 | Word prediction method, device, computer equipment and storage medium |
CN110377916B (en) * | 2018-08-17 | 2022-12-16 | 腾讯科技(深圳)有限公司 | Word prediction method, word prediction device, computer equipment and storage medium |
CN109271521A (en) * | 2018-11-16 | 2019-01-25 | 北京九狐时代智能科技有限公司 | Text classification method and device |
CN111309878A (en) * | 2020-01-19 | 2020-06-19 | 支付宝(杭州)信息技术有限公司 | Retrieval type question-answering method, model training method, server and storage medium |
CN111309878B (en) * | 2020-01-19 | 2023-08-22 | 支付宝(杭州)信息技术有限公司 | Search type question-answering method, model training method, server and storage medium |
CN111324722A (en) * | 2020-05-15 | 2020-06-23 | 支付宝(杭州)信息技术有限公司 | Method and system for training word weight model |
CN111737979A (en) * | 2020-06-18 | 2020-10-02 | 龙马智芯(珠海横琴)科技有限公司 | Keyword correction method, device, correction equipment and storage medium for voice text |
CN113010648A (en) * | 2021-04-15 | 2021-06-22 | 联仁健康医疗大数据科技股份有限公司 | Content search method, content search device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN108304424B (en) | 2021-09-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106649825B (en) | Voice interaction system and creation method and device thereof | |
CN108304424A (en) | Text keyword extraction method and text keyword extraction device | |
CN108491443B (en) | Computer-implemented method and computer system for interacting with a user | |
CN108446286A (en) | Method, device and server for generating answers to natural-language questions | |
CN109325040B (en) | FAQ question-answer library generalization method, device and equipment | |
CN107818164A (en) | Intelligent question-answering method and system | |
CN111708869B (en) | Processing method and device for man-machine conversation | |
CN107818781A (en) | Intelligent interactive method, equipment and storage medium | |
CN110032623B (en) | Method and device for matching question of user with title of knowledge point | |
CN108711420A (en) | Multilingual hybrid model building and data acquisition method and device, and electronic equipment | |
CN105956053B (en) | Network-information-based search method and device | |
CN109271493A (en) | Language text processing method, device and storage medium | |
CN102165518A (en) | System and method for generating natural language phrases from user utterances in dialog systems | |
CN109857846B (en) | Method and device for matching user question and knowledge point | |
CN110147544B (en) | Instruction generation method and device based on natural language and related equipment | |
CN107704453A (en) | Word semantic analysis method, terminal and storage medium | |
CN110457689A (en) | Semantic processing method and related apparatus | |
Dinarelli et al. | Discriminative reranking for spoken language understanding | |
CN111209363B (en) | Corpus data processing method, corpus data processing device, server and storage medium | |
CN107832439A (en) | Method, system and terminal device for multi-turn state tracking | |
CN109918627A (en) | Document creation method, device, electronic equipment and storage medium | |
CN112527955A (en) | Data processing method and device | |
CN110413992A (en) | Semantic analysis and recognition method, system, medium and device | |
CN112559718B (en) | Method, device, electronic equipment and storage medium for dialogue processing | |
CN113779987A (en) | Event co-reference disambiguation method and system based on self-attention enhanced semantics |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |