Detailed description of the invention
To make the purpose, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort shall fall within the protection scope of the present application.
To facilitate understanding of the embodiments of the present application, further illustration is given below with specific embodiments in conjunction with the accompanying drawings; these embodiments do not constitute a limitation on the embodiments of the present application.
The method for predicting a user's word to be entered provided by the embodiments of the present application can be combined with any input method. Based on the term vectors corresponding to the words the user has already entered and the term vectors of the multiple words recorded in a word/term-vector correspondence table, the method predicts the user's word to be entered and displays the predicted candidates, so that the user can directly select the word he or she actually intends to enter from the displayed candidates. The method thus predicts the word to be entered in combination with the context, which improves the hit rate of the prediction and thereby effectively increases the input speed of the input method.
Here, a term vector is a vector of fixed length, and the length can be set arbitrarily. In the present embodiment, words are mapped to term vectors, which serve as the basis for quantitative computation in the process of predicting the user's word to be entered. The word/term-vector correspondence table is a set of word-to-vector correspondences obtained by training on a large amount of standard text. That is, a word/term-vector correspondence table must be trained before the method for predicting a user's word to be entered of the present application can be executed; the training of this table is completed before the prediction method is performed.
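As a purely illustrative sketch (not part of the claimed method), the word/term-vector correspondence table can be thought of as a plain mapping from words to fixed-length vectors, with lookup as a dictionary access; the words and vector values below are hypothetical placeholders:

```python
# A minimal sketch of a word/term-vector correspondence table: each word maps
# to exactly one fixed-length vector. All values here are illustrative.
word_vector_table = {
    "we":      [0.12, -0.40, 0.33],
    "need":    [-0.08, 0.91, -0.27],
    "control": [0.55, 0.10, -0.64],
    "city":    [-0.31, 0.22, 0.78],
}

def lookup(word):
    """Return the stored vector for a word, or None if it is not yet recorded."""
    return word_vector_table.get(word)

assert lookup("city") == [-0.31, 0.22, 0.78]
assert lookup("haze") is None  # not yet recorded in the table
```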
Fig. 1 is a flowchart of a method for predicting a user's word to be entered provided by an embodiment of the present application. The execution subject of the method may be any device with processing capability: a server, a system, or an apparatus. As shown in Fig. 1, the method may specifically include:
Step 110: read the natural language entered by the user from the current statistic unit, where a statistic unit is a semantic unit between predetermined punctuation marks in the natural language entered by the user.
Optionally, before step 110 is performed, a step of training the word/term-vector correspondence table may also be included. This step improves the accuracy of the word-to-vector correspondences recorded in the table, and thereby the correctness of predicting the user's word to be entered from the term vectors of the words already entered. The training specifically includes the following steps:
Step A: select a sample from a corpus and perform natural language processing on the sample to obtain multiple consecutive words.
It should be noted that the words are consecutive, i.e., the selected words all lie within one statistic unit. A statistic unit refers to a semantic unit between predetermined punctuation marks. The predetermined punctuation marks include: comma, full stop, question mark, semicolon, exclamation mark, ellipsis, enumeration comma, and colon.
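The division of text into statistic units between predetermined punctuation marks can be sketched as follows; the exact delimiter set is an assumption mirroring the marks listed above:

```python
import re

# Predetermined punctuation marks (Chinese and Western forms) that delimit
# statistic units; this set is an illustrative assumption.
DELIMITERS = "，。？；！…、：,.?;!:"

def statistic_units(text):
    """Split text into the semantic units between predetermined punctuation marks."""
    parts = re.split("[" + re.escape(DELIMITERS) + "]+", text)
    return [p.strip() for p in parts if p.strip()]

assert statistic_units("we need to control city haze, then plan the city.") == \
    ["we need to control city haze", "then plan the city"]
```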
The samples in the corpus may be text collected in advance from web pages by a server or a client, e.g., a large number of articles, comments, and the like on web pages; in these articles and comments, the text between every two adjacent predetermined punctuation marks forms a statistic unit. In one embodiment, such a statistic unit can serve as one sample. Alternatively, the samples may be collected by a server or a client from text previously entered by the user. Natural language processing includes normalization, word segmentation, and the like. For example, normalization converts the upper-case Latin letters contained in a sample uniformly to lower case, or converts traditional Chinese characters in a sample to simplified characters. Word segmentation splits a sample into multiple consecutive words; e.g., the sample "we need to control city haze" can be segmented into the consecutive words "we", "need", "control", "city", and "haze".
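The normalization and segmentation steps can be sketched as follows; the whitespace split is a toy stand-in for a real word segmenter, which the embodiment does not prescribe:

```python
def normalize(text):
    """Normalization pass: fold upper-case Latin letters to lower case.
    (A real system would also convert traditional characters to simplified.)"""
    return text.lower()

def segment(text):
    """Toy word segmentation: whitespace split. Chinese text would require a
    proper segmenter; this merely stands in for that step."""
    return normalize(text).split()

assert segment("We Need to Control City Haze") == \
    ["we", "need", "to", "control", "city", "haze"]
```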
Step B: look up, in the word/term-vector correspondence table, the term vectors corresponding to all the words in the sample other than the last word; for any word whose term vector is not recorded in the table, randomly assign a corresponding term vector to it.
In the present embodiment, all the words in the sample other than the last word are called training words, and the last word in the sample is called the verification word.
A term vector is a vector that characterizes a word. It can include several dimensions, each dimension corresponding to a numerical value in [-1, 1]. Each word corresponds uniquely to one term vector, and each term vector likewise corresponds uniquely to one word.
The word/term-vector correspondence table records multiple words and their corresponding term vectors; these words may be all or some of the words obtained by segmenting the samples in the above corpus. Initially, the term vectors of these words are empty, i.e., no term vectors are yet saved in the table; term vectors can then be randomly assigned to the words, e.g., by randomly selecting a value from [-1, 1] for each dimension of each word's term vector.
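The random assignment of a term vector to an unrecorded word can be sketched as:

```python
import random

def get_or_assign_vector(word, table, length=3):
    """Return the vector recorded for `word`; if none exists, randomly draw one
    value in [-1, 1] per dimension and record it (as described for step B)."""
    if word not in table:
        table[word] = [random.uniform(-1.0, 1.0) for _ in range(length)]
    return table[word]

table = {}
v = get_or_assign_vector("haze", table)
assert len(v) == 3 and all(-1.0 <= c <= 1.0 for c in v)
assert get_or_assign_vector("haze", table) is v  # already recorded, so reused
```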
The embodiment of the present application places no particular limitation on the dimensionality of the term vectors. More dimensions mean a longer term vector, and the longer the term vectors used during training, the more accurate the trained word-to-vector correspondences.
Continuing the previous example, natural language processing of one sample in the corpus yields the consecutive words "we", "need", "control", "city", and "haze". By the above definitions, the training words are "we", "need", "control", and "city", and the verification word is "haze".
Querying the word/term-vector correspondence table gives the term vectors of the training words:
The term vector of "we" is [C11, C12, C13, …, C1m];
the term vector of "need" is [C21, C22, C23, …, C2m];
the term vector of "control" is [C31, C32, C33, …, C3m];
the term vector of "city" is [C41, C42, C43, …, C4m].
Step C: perform a predetermined linear transformation or nonlinear transformation on the term vectors corresponding to all the words in the sample other than the last word to obtain a prediction term vector. That is, the prediction term vector is obtained from the term vectors of the training words, specifically by applying a predetermined linear or nonlinear transformation to them.
In the previous example, suppose the prediction term vector is [y1, y2, y3, …, ym]. One predetermined linear transformation is the dimension-wise sum: y1 = C11 + C21 + C31 + C41, and likewise y2 = C12 + C22 + C32 + C42, …, ym = C1m + C2m + C3m + C4m; alternatively, another predetermined combination of the corresponding components may be used, computed in the same manner for every dimension. When computing the prediction term vector from the training words' term vectors, the same algorithm is applied to every dimension of those vectors: if a linear transformation method is used, it is used for all dimensions, and if a nonlinear transformation method is used, it is likewise used for all dimensions.
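The dimension-wise sum used as the predetermined linear transformation in the example can be sketched as:

```python
def predict_vector(training_vectors):
    """Dimension-wise sum of the training-word vectors, i.e. the predetermined
    linear transformation y_j = C_1j + C_2j + ... from the example above."""
    return [sum(components) for components in zip(*training_vectors)]

vectors = [
    [0.1, 0.2, 0.3],   # "we"
    [0.0, -0.1, 0.4],  # "need"
    [0.2, 0.1, -0.2],  # "control"
    [-0.3, 0.0, 0.1],  # "city"
]
y = predict_vector(vectors)
assert [round(c, 6) for c in y] == [0.0, 0.2, 0.6]
```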
Step D: input the prediction term vector into a language training model and judge whether the output result is consistent with the last word, where the language training model is a machine learning model predefined to derive, from an input term vector, the word corresponding to that term vector. The function of this language training model is to determine the corresponding word from an input term vector. The embodiment of the present application places no particular limitation on the language training model applied; it may adopt a deep learning model such as word2vec or an RNN (Recurrent Neural Network).
After the prediction term vector is input into the language training model, judge whether the output word is the verification word; e.g., input the prediction term vector [y1, y2, y3, …, ym] into the language training model and judge whether the output word is "haze". If the output word is consistent with the verification word in the sample, the operation ends; otherwise step E is performed.
Step E: if they do not correspond, adjust the term vectors corresponding to all the words other than the last word until the output result is consistent with the last word. Specifically, if the output word does not correspond to the verification word in the sample, the prediction term vector is adjusted until the language training model, given the adjusted prediction term vector, outputs a word consistent with the verification word; the term vectors of the training words are then adjusted according to the adjusted prediction term vector. The specific adjustment magnitude is related to the predetermined linear or nonlinear transformation method used.
Step F: use the adjusted term vectors of all the words other than the last word to update the term vectors of those words in the word/term-vector correspondence table. That is, the adjusted term vectors of the training words are written back into the table; specifically, the correspondence between each training word and its adjusted term vector replaces the existing correspondence between that training word and its term vector in the table.
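The write-back of step F can be sketched as a simple replacement of the stored correspondences; the vector values below are illustrative:

```python
def update_table(table, adjusted):
    """Step F sketch: replace the stored vector of each training word with its
    adjusted vector, overwriting the previous word/vector correspondence."""
    for word, vector in adjusted.items():
        table[word] = vector
    return table

table = {"we": [0.1, 0.1], "need": [0.2, 0.2]}
adjusted = {"we": [0.15, 0.05]}  # as produced by the adjustment of step E
update_table(table, adjusted)
assert table["we"] == [0.15, 0.05]   # correspondence replaced
assert table["need"] == [0.2, 0.2]   # untouched words keep their vectors
```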
It can be understood that the above flow is the processing for one sample; the actual training process repeats this flow with a large number of samples. The more samples there are, the more accurate the word-to-vector correspondences in the trained table. The word/term-vector correspondence tables referred to below in the present application may all be tables trained using the above method.
Returning to step 110: a statistic unit is a semantic unit between predetermined punctuation marks in the natural language entered by the user, the predetermined punctuation marks being defined as in step A. That is, the embodiment of the present application can record the natural language the user has entered in the statistic unit in which the user's input operation is located (i.e., the current statistic unit).
Step 120: perform natural language processing on the natural language to obtain the words already entered.
Specifically, step 120 may include: performing natural language processing on the natural language to obtain the n words most recently entered by the user, where n is the smaller of the number of words the user has entered in the current statistic unit and a predetermined number N, with n and N being positive integers.
Here, natural language processing includes normalization, word segmentation, and the like. The natural language entered by the user is often nonstandard; e.g., it may contain both upper and lower case, or traditional as well as simplified characters. It therefore needs to be normalized so that the machine can recognize it. After normalization, word segmentation is performed to obtain the n words most recently entered by the user. The number of recorded entered words can be set to be at most the predetermined number N (N being a positive integer); the number n of entered words that need to be recorded (n being a positive integer) is then the smaller of the number of words most recently entered by the user in the current statistic unit and the predetermined number N.
For example, if N is 5 and, after natural language processing, the number of entered words is 4, then these 4 entered words are recorded; if the number of entered words is 6, only the 5 most recently entered words are recorded, i.e., the first recorded word of the current statistic unit is deleted. If the current statistic unit has ended, e.g., the user has entered a comma, the next statistic unit begins: recording starts from the first word of the next statistic unit, and the words recorded in the previous statistic unit are deleted.
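The record of the n ≤ N most recent words can be sketched with a bounded queue, a natural (though not prescribed) implementation:

```python
from collections import deque

N = 5  # predetermined maximum number of recent words to keep

window = deque(maxlen=N)  # holds the n <= N most recent words of the unit

for word in ["we", "need", "to", "control", "city", "haze"]:
    window.append(word)  # once full, the oldest word is dropped automatically

assert list(window) == ["need", "to", "control", "city", "haze"]

window.clear()  # a predetermined punctuation mark starts a new statistic unit
assert len(window) == 0
```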
Step 130: look up, in a word/term-vector correspondence table, the term vectors corresponding to the entered words, where the word/term-vector correspondence table records multiple words, including the entered words, and the term vectors corresponding to those words.
The term vectors corresponding to the words the user has entered in the recorded current statistic unit are obtained by querying the word/term-vector correspondence table, the multiple words recorded in the table being defined as in step B. It should be noted that the table here is obtained using the training method of steps A to F, so each of the multiple words recorded in the table corresponds uniquely to one term vector.
Step 140: predict the user's word to be entered according to the term vectors corresponding to the entered words and the term vectors corresponding to the multiple words.
Specifically, step 140 may include:
Step 1401: perform a predetermined linear transformation or nonlinear transformation on the term vectors corresponding to the entered words to obtain a target term vector. It should be noted that the transformation method used here (predetermined linear or nonlinear transformation) is consistent with the transformation method used when training the word/term-vector correspondence table.
Step 1402: compute the similarity between the term vector corresponding to each of the multiple words and the target term vector.
Step 1403: predict the user's word to be entered according to the similarities.
The embodiment of the present application places no particular limitation on how the similarity between the target term vector and the term vector of each of the multiple words in the table is computed; any existing method for computing the similarity between two vectors may be used.
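Cosine similarity is one such existing method; a minimal sketch:

```python
import math

def cosine_similarity(a, b):
    """One common choice of vector similarity; the embodiment permits any
    existing method for computing similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

assert cosine_similarity([1, 0], [1, 0]) == 1.0   # identical direction
assert cosine_similarity([1, 0], [0, 1]) == 0.0   # orthogonal
assert cosine_similarity([1, 0], [-1, 0]) == -1.0 # opposite direction
```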
A higher similarity indicates a higher hit rate, i.e., a higher probability that the user will enter that word. The method of predicting the user's word to be entered according to the similarities may therefore be: determining the words corresponding to one or more term vectors whose similarity exceeds a predetermined threshold as the user's words to be entered; and/or sorting the term vectors of the multiple words by similarity and determining the words corresponding to one or more top-ranked term vectors as the user's words to be entered. That is, all the words whose term vectors' similarity exceeds the predetermined threshold may be displayed to the user; or a specified number of words whose term vectors' similarity exceeds the predetermined threshold may be displayed; or, after the term vectors are sorted in descending order of similarity, the words corresponding to one or more top-ranked term vectors may be displayed to the user.
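The threshold-based and ranking-based selection described above can be sketched as follows; the similarity values are illustrative:

```python
def predict_candidates(similarities, threshold=None, top_k=None):
    """similarities: {word: similarity}. Keep words above `threshold` and/or
    the `top_k` highest, returned in descending order of similarity."""
    items = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    if threshold is not None:
        items = [(w, s) for w, s in items if s > threshold]
    if top_k is not None:
        items = items[:top_k]
    return [w for w, _ in items]

sims = {"haze": 0.92, "pollution": 0.81, "garbage": 0.60, "tree": 0.10}
assert predict_candidates(sims, threshold=0.5) == ["haze", "pollution", "garbage"]
assert predict_candidates(sims, top_k=2) == ["haze", "pollution"]
```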
Step 150: display the predicted words to be entered. When displaying them, the predicted words may be sorted in descending order of similarity.
The present embodiment predicts the user's word to be entered in combination with the context. Fig. 2 and Fig. 3 are two schematic diagrams of displayed predicted candidates. When the words the user has entered in a statistic unit are "we need to control city", the candidates predicted by the method of the present embodiment include "haze", "pollution", "garbage", "energy shortage", and so on; when the entered words are "we need to plan the city", the predicted candidates include "construction", "traffic", "street", "development direction", and so on. It can be seen that the predicted candidates are related to the context. Since not all predicted candidates can be displayed at once, the remaining predicted candidates can be shown by paging with the leftward or backward arrow in Fig. 2 and Fig. 3, or by scrolling left and right.
If the currently displayed candidates contain the word the user actually intends to enter, the user can select it directly without entering any further information. If they do not, the displayed candidates can be updated according to further information entered by the user. The further information consists of characters related to the word the user actually intends to enter, including part of that word's pinyin, its strokes, its kana, and so on. Updating the displayed candidates according to the further information entered by the user includes:
receiving the further information entered by the user and, according to it, updating the display order of the predicted candidates; and/or
receiving the further information entered by the user and, according to it, screening out target candidate words from the predicted candidates and displaying those target candidates.
Take as an example the case where the further information is the first pinyin initial of the word actually to be entered. Suppose the initially displayed predicted candidates are as shown in Fig. 2; when the user enters the pinyin letter "s", the updated candidates are as shown in Fig. 4. Because the first pinyin initial of "water pollution" among the displayed candidates is "s" while that of the other candidates is not, the word "water pollution" is moved forward, i.e., the candidates consistent with the further information entered by the user are ranked first. Alternatively, the candidates inconsistent with the entered further information can be hidden directly, and the consistent ones screened out as target candidates; e.g., "water pollution", "city appearance", "cityscape", and "trees" in Fig. 4 can be screened out as target candidates and displayed. Then, if the word the user actually intends to enter is "water pollution", the user can select it directly after entering the single pinyin initial "s", without entering the word's full pinyin, which increases the input speed.
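The reordering by pinyin initial can be sketched as follows; the candidate words and their initials are hypothetical stand-ins for a pronunciation dictionary:

```python
# Hypothetical pinyin-initial lookup for the displayed candidates; a real
# input method would derive these from a pronunciation dictionary.
PINYIN_INITIALS = {
    "water pollution": "s",  # shui... -> first initial "s"
    "haze": "w",
    "garbage": "l",
    "energy": "n",
}

def refine(candidates, typed_initial):
    """Move candidates whose first pinyin initial matches the typed letter to
    the front (equivalently, screen them out as target candidate words)."""
    match = [w for w in candidates if PINYIN_INITIALS.get(w) == typed_initial]
    rest = [w for w in candidates if PINYIN_INITIALS.get(w) != typed_initial]
    return match + rest

shown = ["haze", "water pollution", "garbage", "energy"]
assert refine(shown, "s") == ["water pollution", "haze", "garbage", "energy"]
```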
With the method for predicting a user's word to be entered provided by the present application, the user's word to be entered is predicted from the term vectors of the words the user has entered and the term vectors of the multiple words recorded in the word/term-vector correspondence table, and the predicted candidates are displayed, so that the user can directly select the intended word from them. Predicting the word to be entered in combination with the context improves the hit rate of the prediction and thereby effectively increases the input speed of the input method.
Corresponding to the above method for predicting a user's word to be entered, an embodiment of the present application also provides a device for predicting a user's word to be entered. As shown in Fig. 5, this device may be arranged in any existing input method system and, during input, predicts the user's word to be entered in combination with the words already entered. The device includes: a reading unit 501, a processing unit 502, a query unit 503, a prediction unit 504, and a display unit 505.
The reading unit 501 is configured to read the natural language entered by the user from the current statistic unit, where a statistic unit is a semantic unit between predetermined punctuation marks in the natural language entered by the user, the predetermined punctuation marks being defined as in step A. That is, the embodiment of the present application can record the natural language the user has entered in the statistic unit in which the user's input operation is located (i.e., the current statistic unit).
The processing unit 502 is configured to perform natural language processing on the natural language read by the reading unit 501 to obtain the words already entered. Specifically, the processing unit 502 performs natural language processing on the natural language to obtain the n words most recently entered by the user, where n is the smaller of the number of words the user has entered in the current statistic unit and a predetermined number N, with n and N being positive integers.
Here, natural language processing includes normalization, word segmentation, and the like. The natural language entered by the user is often nonstandard; e.g., it may contain both upper and lower case, or traditional as well as simplified characters. It therefore needs to be normalized so that the machine can recognize it. After normalization, word segmentation is performed to obtain the n words most recently entered by the user. The number of recorded entered words can be set to be at most the predetermined number N (N being a positive integer); the number n of entered words that need to be recorded (n being a positive integer) is then the smaller of the number of words most recently entered by the user in the current statistic unit and the predetermined number N.
For example, if N is 5 and, after natural language processing, the number of entered words is 4, then these 4 entered words are recorded; if the number of entered words is 6, only the 5 most recently entered words are recorded, i.e., the first recorded word of the current statistic unit is deleted. If the current statistic unit has ended, e.g., the user has entered a comma, the next statistic unit begins: recording starts from the first word of the next statistic unit, and the words recorded in the previous statistic unit are deleted.
The query unit 503 is configured to look up, in a word/term-vector correspondence table, the term vectors corresponding to the entered words obtained by the processing unit 502, where the table records multiple words, including the entered words, and the term vectors corresponding to those words. The term vectors of the words the user has entered in the recorded current statistic unit are obtained by querying the table, the multiple words recorded in it being defined as in step B. It should be noted that the table here is obtained using the training method of steps A to F, so each of the multiple words recorded in it corresponds uniquely to one term vector.
The prediction unit 504 is configured to predict the user's word to be entered according to the term vectors of the entered words found by the query unit 503 and the term vectors of the multiple words. Specifically, the prediction unit 504: performs a predetermined linear transformation or nonlinear transformation on the term vectors of the entered words to obtain a target term vector; computes the similarity between the term vector of each of the multiple words and the target term vector; and predicts the user's word to be entered according to the similarities.
It should be noted that the transformation method used here (predetermined linear or nonlinear transformation) is consistent with the transformation method used when training the word/term-vector correspondence table. The embodiment of the present application places no particular limitation on how the similarity between the target term vector and the term vector of each of the multiple words in the table is computed; any existing method for computing the similarity between two vectors may be used. A higher similarity indicates a higher hit rate, i.e., a higher probability that the user will enter that word.
Optionally, the prediction unit 504 is further specifically configured to: determine the words corresponding to one or more term vectors whose similarity exceeds a predetermined threshold as the user's words to be entered; and/or sort the term vectors of the multiple words by similarity and determine the words corresponding to one or more top-ranked term vectors as the user's words to be entered. That is, all the words whose term vectors' similarity exceeds the predetermined threshold may be displayed to the user; or a specified number of words whose term vectors' similarity exceeds the predetermined threshold may be displayed; or, after the term vectors are sorted in descending order of similarity, the words corresponding to one or more top-ranked term vectors may be displayed to the user.
The display unit 505 is configured to display the words to be entered predicted by the prediction unit 504. When displaying them, the predicted words may be sorted in descending order of similarity.
The present embodiment predicts the user's word to be entered in combination with the context. Fig. 2 and Fig. 3 are two schematic diagrams of displayed predicted candidates. When the words the user has entered in a statistic unit are "we need to control city", the candidates predicted by the method of the present embodiment include "haze", "pollution", "garbage", "energy shortage", and so on; when the entered words are "we need to plan the city", the predicted candidates include "construction", "traffic", "street", "development direction", and so on. It can be seen that the predicted candidates are related to the context. Since not all predicted candidates can be displayed at once, the remaining predicted candidates can be shown by paging with the leftward or backward arrow in Fig. 2 and Fig. 3, or by scrolling left and right.
If the currently displayed candidates contain the word the user actually intends to enter, the user can select it directly without entering any further information. If they do not, the displayed candidates can be updated according to further information entered by the user. The further information consists of characters related to the word the user actually intends to enter, including part of that word's pinyin, its strokes, its kana, and so on.
As shown in Fig. 6, the device in another embodiment may further include a receiving unit 506, configured to: receive the further information entered by the user and, according to it, update the display order of the predicted candidates; and/or receive the further information entered by the user and, according to it, screen out target candidate words from the predicted candidates and display those target candidates.
Take the case where the further information is the first pinyin letter of the intended word. If the initially displayed predicted candidates are as shown in Fig. 2, then after the user types the pinyin letter "s" the updated candidates are as shown in Fig. 4. Because the first pinyin letter of "water pollution" among the displayed candidates is "s" while the first pinyin letters of the other candidates are not, the word "water pollution" is moved forward; that is, the candidates consistent with the information the user entered are ranked first. Alternatively, the candidates inconsistent with the entered information can be hidden outright, and the consistent ones selected as target candidates: for example, "water pollution", "city appearance", "city looks" and "trees" in Fig. 4 can be selected as target candidates and displayed. If the word the user actually intends to enter is "water pollution", the user can then select it directly after typing only the single pinyin letter "s", without typing the word's full pinyin, which improves input speed.
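The re-ranking and filtering behaviour described above can be sketched as follows. This is an illustrative sketch only: the pinyin initials and the English glosses for the candidate words are assumptions for the example, not data from the embodiment.

```python
# Hypothetical sketch of candidate re-ranking/filtering by pinyin initial.
# The candidate words and their initials below are illustrative assumptions.

def update_candidates(candidates, initials, typed, hide_mismatches=False):
    """Move candidates whose pinyin initial matches the typed letters to
    the front; optionally hide the non-matching candidates entirely."""
    matched = [w for w in candidates if initials[w].startswith(typed)]
    rest = [w for w in candidates if not initials[w].startswith(typed)]
    return matched if hide_mismatches else matched + rest

candidates = ["traffic", "water pollution", "city appearance", "trees"]
initials = {"traffic": "j", "water pollution": "s",
            "city appearance": "s", "trees": "s"}

# Re-rank: every "s"-initial candidate moves ahead of "traffic".
print(update_candidates(candidates, initials, "s"))
# Filter: only the "s"-initial candidates remain as target candidates.
print(update_candidates(candidates, initials, "s", hide_mismatches=True))
```

Either behaviour (re-ordering or hiding) corresponds to one branch of receiving unit 506 above.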
As shown in Fig. 7, the device in another embodiment may further include a training unit 507, configured to repeatedly perform the following process:
select a sample from a corpus and apply natural language processing to it, obtaining a plurality of consecutive words;
query the word-to-vector correspondence table for the vectors of all words in the sample except the last; for any word not recorded in the table, randomly assign it a corresponding vector;
apply a predetermined linear or nonlinear transformation to the vectors of all words in the sample except the last, obtaining a predicted vector;
input the predicted vector into a language training model and judge whether its output is consistent with the last word, where the language training model is a machine learning model predetermined to output, for an input vector, the word corresponding to that vector;
if they are not consistent, adjust the vectors of all words except the last until the output is consistent with the last word;
use the adjusted vectors of all words except the last to update the corresponding vectors in the word-to-vector correspondence table.
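The training loop of unit 507 can be sketched as below, under loud simplifying assumptions: 3-dimensional vectors, the mean of the context vectors standing in for the predetermined linear transformation, a toy "language training model" that outputs the table word whose stored vector is nearest to the input vector, and a bounded number of adjustment steps. None of these choices come from the embodiment.

```python
import random

DIM = 3  # toy vector length; the embodiment's length is arbitrary

def nearest_word(word_table, vec):
    """Toy stand-in for the language training model: the word whose
    stored vector is closest to the input vector."""
    return min(word_table,
               key=lambda w: sum((a - b) ** 2
                                 for a, b in zip(word_table[w], vec)))

def train_step(word_table, sample, lr=0.5, max_iters=100):
    """One pass of the loop: the context is every word but the last."""
    *context, target = sample
    # Words missing from the table get a randomly assigned vector.
    for w in sample:
        word_table.setdefault(w, [random.uniform(-1, 1) for _ in range(DIM)])
    for _ in range(max_iters):
        # Predetermined linear transform: mean of the context vectors.
        pred = [sum(word_table[w][i] for w in context) / len(context)
                for i in range(DIM)]
        if nearest_word(word_table, pred) == target:
            break  # model output is consistent with the last word
        # Otherwise shift each context vector so the predicted vector
        # moves toward the target word's vector.
        for w in context:
            word_table[w] = [v + lr * (t - p)
                             for v, t, p in zip(word_table[w],
                                                word_table[target], pred)]
    return word_table

random.seed(0)
table = train_step({}, ["smog", "affects", "traffic"])
pred = [sum(table[w][i] for w in ["smog", "affects"]) / 2 for i in range(DIM)]
print(nearest_word(table, pred))
```

After training, the adjusted context vectors would be written back to the correspondence table, as the last step of the process above describes.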
The functions of the functional modules of the device in the embodiments of the present application can be realized by the corresponding steps of the method embodiments above; the specific working process of the device provided by the present application is therefore not repeated here.
In the device for predicting a user's word to be entered provided by the present application, the reading unit 501 reads the natural language entered by the user from the current statistical unit, where a statistical unit is the semantic unit between predetermined punctuation marks in the natural language entered by the user; the processing unit 502 applies natural language processing to that natural language, obtaining the words already entered; the query unit 503 queries the word-to-vector correspondence table for the vectors of the entered words, where the table records a plurality of words, including the entered words, together with the vectors of that plurality of words; the predicting unit 504 predicts the user's word to be entered according to the vectors of the entered words queried by the query unit 503 and the vectors of the plurality of words; and the display unit 505 displays the predicted words. The device thereby predicts the user's word to be entered from the surrounding context, which improves the hit rate of the predicted candidates and can effectively raise the input speed of the input method.
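The flow through units 501 to 505 can be sketched end to end as follows. This sketch makes two loud simplifications: it tokenises on whitespace, and it substitutes a tiny bigram count for the term-vector lookup and transformation, purely to keep the example self-contained. All names and data below are illustrative assumptions, not part of the embodiment.

```python
import re

# Predetermined punctuation marks that delimit a statistical unit.
PUNCT = re.compile(r"[.!?;]")

def read_unit(buffer):
    """Unit 501: the semantic unit after the last punctuation mark."""
    return PUNCT.split(buffer)[-1].strip()

def process_unit(segment):
    """Unit 502: toy natural language processing (tokenisation)."""
    return segment.split()

def predict_unit(tokens, word_table, bigrams):
    """Units 503-504: rank the table's words by how strongly the toy
    bigram model links them to the last entered word."""
    if not tokens:
        return []
    last = tokens[-1]
    return sorted(word_table, key=lambda w: -bigrams.get((last, w), 0))

# Toy word table and co-occurrence counts (assumed for illustration).
word_table = ["traffic", "street", "water pollution"]
bigrams = {("affects", "traffic"): 3, ("affects", "street"): 1}

tokens = process_unit(read_unit("It rained. smog affects"))
print(predict_unit(tokens, word_table, bigrams))  # unit 505: display
```

In the embodiment proper, `predict_unit` would instead look up the entered words' vectors in the correspondence table and apply the predetermined transformation before ranking.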
Those skilled in the art should further appreciate that the objects and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented with electronic hardware, computer software, or a combination of the two. To clearly illustrate this interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of their functions. Whether these functions are performed in hardware or in software depends on the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functions in different ways for each particular application, but such implementations should not be considered beyond the scope of the present application.
The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented with hardware, with a software module executed by a processor, or with a combination of the two. A software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The specific embodiments described above further explain the purpose, technical solutions and beneficial effects of the present application. It should be understood that the foregoing describes only specific embodiments of the present application and is not intended to limit its scope of protection; any modification, equivalent substitution, improvement and the like made within the spirit and principles of this application shall be included within the scope of protection of this application.