Embodiment
Embodiments of the invention will now be described in detail, examples of which are illustrated in the accompanying drawings, wherein like reference numerals denote like parts throughout. The embodiments are described below with reference to the drawings in order to explain the present invention.
Fig. 1 is a block diagram of an apparatus for handwritten character recognition using prediction according to an embodiment of the invention. Referring to Fig. 1, the apparatus comprises a handwriting input module 100, a handwritten character recognizer 110, a prediction character library 120, a prediction recognition module 130 and a display module 140.
The handwriting input module 100 is a device for collecting information about the user's handwriting, for example an electronic writing tablet with a stylus, or a touch screen. When the user handwrites a character on such a device with a stylus or a finger, the handwriting input module 100 obtains information about the character being entered, such as handwriting point coordinates, pressure and time parameters. The handwriting points of the character may be points sampled at a fixed time interval, the points contained in one or more strokes, or a preset number of points. This information is sent to the handwritten character recognizer 110 for processing and recognition.
The handwritten character recognizer 110 is a device for recognizing handwritten characters. It receives from the handwriting input module 100 the handwriting point coordinates and related information for the character entered by the user, and outputs a plurality of recognition result candidate characters. These candidates are generally sorted by their similarity to the handwritten character, the most similar first. Besides the candidate characters, the handwritten character recognizer 110 also outputs a distance value for each candidate. The distance value, referred to below as the similarity value, is a number expressing how similar a candidate character is to the handwritten character. The smaller the similarity value, the more similar the candidate is to the handwritten character. For example, when the user handwrites the character "word" on the handwriting input module 100, the handwritten character recognizer 110 outputs 10 possible recognition result candidate characters, "word space enjoy Zijia keep to inspire confidence in learn the comet ancestor", together with the 10 corresponding similarity values: 6324, 9915, 10527, 10597, 11008, 11111, 11263, 11392, 11421, 11460.
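The ordering rule above can be illustrated with a minimal Python sketch. It assumes the recognizer returns (candidate, similarity value) pairs; the function name `rank_candidates` and the labels "c1" to "c10" are hypothetical placeholders, and only the ten similarity values come from the text.

```python
from typing import List, Tuple


def rank_candidates(scored: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Order candidates so the smallest similarity (distance) value comes first."""
    return sorted(scored, key=lambda pair: pair[1])


# The ten similarity values quoted in the text, deliberately shuffled.
# "c1".."c10" are placeholder labels, not the actual candidate characters.
values = [11008, 6324, 11460, 9915, 11111, 10527, 11421, 10597, 11392, 11263]
labels = ["c5", "c1", "c10", "c2", "c6", "c3", "c9", "c4", "c8", "c7"]

ranked = rank_candidates(list(zip(labels, values)))
best_label, best_value = ranked[0]  # the first-choice recognition result
```

Because a smaller value means a closer match, an ascending sort puts the first-choice result at index 0.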
The prediction character library 120 is a character library that uses certain information about the character being entered to predict a set of characters associated with it. The character set comprises a plurality of characters associated with that information. Referring to Fig. 2, the prediction character library 120 can be divided into at least: an association prediction character library 121, a stroke prediction character library 122, a radical prediction character library 123 and a local (partial) prediction character library 124.
The association prediction character library 121 is a character library that, given a character the user has already entered, outputs the predicted character set associated with that character. The association prediction character library 121 contains some or all of the characters of the language used by the user, and each character is associated with a group of characters that have a context relation with it; this group constitutes the association predicted character set. The context relations used here include word relations, idiom relations, the compositional relations of words in alphabetic writing, and the adjacency of characters the user has previously entered. The association prediction character library 121 can be continually updated and corrected using the adjacency of characters the user has previously entered. As an example, Table 1 shows a Chinese association prediction character library.
Table 1
As can be seen from Table 1, each character is associated with an association predicted character set made up of several tens of characters (30 in Table 1). The character "zhang" at No. 5 in Table 1 forms words with the first 3 characters of its association predicted character set, namely "husband", "father-in-law" and "measuring", and forms no word with the remaining 27 characters. These 27 characters are high-frequency Chinese characters used to fill the association predicted character set up to its full size, because no other character can follow "zhang" to form a word. For the character "Ji" at No. 9 in Table 1, no character can follow it to form a word, so all of its associated characters are high-frequency Chinese characters. In addition, for applications on mobile phones, the association prediction character library 121 can be obtained directly from the language database of the T9 input method.
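The padding scheme described for Table 1 could be sketched as follows. This is an illustrative assumption, not the library's actual construction: `build_association_set` is a hypothetical helper, "f1" to "f40" stand in for a frequency-ordered list of common characters, and the quoted glosses are used as keys only because the table itself is not reproduced.

```python
from typing import Dict, List


def build_association_set(followers: List[str],
                          high_frequency: List[str],
                          size: int = 30) -> List[str]:
    """Put word-forming followers first, then pad with high-frequency characters."""
    result = list(followers[:size])
    for ch in high_frequency:
        if len(result) >= size:
            break
        if ch not in result:  # avoid listing a character twice
            result.append(ch)
    return result


# Placeholder frequency list (hypothetical data).
high_frequency = [f"f{i}" for i in range(1, 41)]

library: Dict[str, List[str]] = {
    # Three word-forming followers, then 27 padding characters.
    "zhang": build_association_set(["husband", "father-in-law", "measuring"], high_frequency),
    # No word-forming followers: the whole set is padding.
    "Ji": build_association_set([], high_frequency),
}
```

Keeping every set the same length means the prediction stage can always offer a full candidate list, even for characters that rarely begin a word.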
The stroke prediction character library 122 is a character library that, given one or more strokes entered by the user and their order, outputs the predicted character set associated with them. That is, given one or more strokes, the library provides all characters whose initial strokes are the same as those strokes. The strokes used in Chinese information processing systems can be divided into 5 types: horizontal, vertical, left-falling, right-falling and fold. As an example, Table 2 shows a Chinese stroke prediction character library. As can be seen from Table 2, each ordered group of strokes is associated with a predicted character set made up of a plurality of characters.
Table 2
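A stroke prediction library of this kind can be sketched as a prefix lookup over stroke sequences. The six characters below and the function name `predict_by_strokes` are illustrative assumptions and are not taken from Table 2; their stroke orders follow the five-type classification named above.

```python
from typing import Dict, List, Tuple

H, S, P, N, Z = "horizontal", "vertical", "left-falling", "right-falling", "fold"

# Illustrative stroke orders under the five-type classification (sample data only).
STROKE_ORDER: Dict[str, Tuple[str, ...]] = {
    "十": (H, S),
    "大": (H, P, N),
    "人": (P, N),
    "口": (S, Z, H),
    "日": (S, Z, H, H),
    "中": (S, Z, H, S),
}


def predict_by_strokes(prefix: List[str]) -> List[str]:
    """Return every character whose first strokes equal the given ordered strokes."""
    key = tuple(prefix)
    return [ch for ch, order in STROKE_ORDER.items() if order[:len(key)] == key]


after_one = predict_by_strokes([S])        # characters starting with a vertical stroke
after_two = predict_by_strokes([S, Z])     # vertical, then fold
```

Each additional stroke narrows the predicted set, which is why the candidates can improve after every pen lift.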
The radical prediction character library 123 is a character library that, given a radical entered by the user (a radical being a component of a character), outputs the predicted character set of characters that begin with that radical. For example, if the user enters the radical "Lv", the corresponding predicted character set is "skill Ai Jie is the Chinese herbaceous peony awns sesame bud that rues altogether ...". As an example, Table 3 shows a Chinese radical prediction character library. As can be seen from Table 3, each radical is associated with a predicted character set made up of a plurality of characters.
Table 3
The local (partial) prediction character library 124 is a character library that, given the part of a character the user has entered (which may be some strokes, or a combination of a radical and strokes), outputs the predicted character set of characters similar to the part entered. For example, if the user handwrites [handwriting image], the corresponding predicted character set is "state group is with being stranded the intercalation circle of child's order because of day".
Usually, the association prediction character library 121, the stroke prediction character library 122 and the radical prediction character library 123 are built in advance, whereas the local prediction character library 124 is supplied in real time by the character recognizer 110 from its recognition of the handwriting entered.
In addition, the present invention is applicable to recognizing characters of many languages, such as Chinese, numerals, English, Japanese and Korean. Correspondingly, there are multiple prediction character libraries, such as a Chinese prediction character library, an English prediction character library, a Japanese prediction character library and a Korean prediction character library. The prediction character library 120 contains some or all of the characters of the language used by the user.
The prediction recognition module 130 receives the plurality of recognition result candidate characters output by the handwritten character recognizer 110 and the predicted character sets output by the prediction character library 120, and performs predictive recognition on the handwriting points currently entered. Specifically, it uses the previous character entered by the user, the handwriting points entered so far for the current character, and the predicted character sets corresponding to the stroke, radical or local recognition result obtained from those points, to predict which character the user is writing, and outputs the prediction as candidate results. The prediction recognition module 130 then outputs the recognition result candidate characters obtained by predictive recognition.
The display module 140 displays the recognition result candidate characters output by the prediction recognition module 130 so that the user can select from them.
The method of handwritten character recognition using prediction according to the embodiment of the invention is described below with reference to Fig. 3.
Fig. 3 is a flowchart of the method of handwritten character recognition using prediction according to the embodiment of the invention. Referring to Fig. 3, in step 301 the user handwrites handwriting points of a character on the handwriting input module 100. In step 302, the handwriting points entered since the last recognition are counted; if they have reached a certain quantity, step 303 is performed, otherwise the method returns to step 301. For example, the system can be set to perform recognition once each time the user writes a stroke. In step 303, the handwritten character recognizer 110 recognizes some or all of the handwriting points the user has entered, and outputs a plurality of recognition result candidate characters and the similarity value of each candidate. In step 304, the stroke and radical currently entered are obtained from the recognition result of step 303, together with the local recognition result of the current character (that is, the recognition result for all handwriting points entered so far) and the previous character entered. In step 305, the predicted character sets corresponding to the recognized stroke, the recognized radical and the previous character are retrieved from the prediction character libraries already built. In step 306, the prediction recognition module 130 performs predictive recognition and obtains the preferred recognition result candidate characters for the handwriting points entered so far. The predictive recognition method is described further below. In step 307, the display module 140 displays the current recognition result candidate characters for the user to select from. In step 308, if the user finds and selects the required character among the current candidate characters, input of this character is complete and step 311 is performed; otherwise step 309 is performed. In step 309, if the pen-up waiting time preset by the system has elapsed, the user has finished writing all the handwriting of the current character and step 310 is performed; otherwise the user has not yet finished writing the current character, and the method returns to step 301 so that the user can continue writing. In step 310, the system obtains the complete handwriting of the current character and performs ordinary handwriting recognition. In step 311, if the user needs to handwrite a next character, the method returns to step 301; otherwise the input process ends.
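The control flow of steps 301 to 310 for a single character might be sketched as below. This is a simplified simulation under stated assumptions: recognition is triggered after every stroke (the example setting mentioned for step 302), the pen-up timeout is modelled as running out of strokes, and `input_character`, `predict` and `recognize_full` are hypothetical names standing in for the modules of Fig. 1.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[int, int]


def input_character(
    strokes: Sequence[List[Point]],
    predict: Callable[[List[Point]], List[str]],
    recognize_full: Callable[[List[Point]], List[str]],
    wanted: str,
) -> Tuple[str, Optional[int]]:
    """Return the chosen character and the stroke count at which it was chosen.

    The stroke count is None when the character only came from ordinary
    recognition after the pen-up timeout (step 310).
    """
    points: List[Point] = []
    for n, stroke in enumerate(strokes, start=1):
        points.extend(stroke)                 # step 301: collect handwriting points
        candidates = predict(points)          # steps 303-307: recognize, predict, display
        if wanted in candidates:              # step 308: user picks the character early
            return wanted, n
    final = recognize_full(points)            # steps 309-310: timeout, ordinary recognition
    return (final[0] if final else ""), None


# Stub recognizers (hypothetical): "y" is only predicted once 4 points exist.
strokes = [[(0, 0), (0, 1)], [(0, 1), (1, 1)], [(1, 1), (1, 0)]]
early = input_character(strokes, lambda pts: ["x", "y"] if len(pts) >= 4 else ["x"],
                        lambda pts: ["q"], wanted="y")
late = input_character(strokes, lambda pts: ["x"], lambda pts: ["q"], wanted="z")
```

In the first call the wanted character is offered after the second stroke, so the third stroke is never needed; in the second call the loop falls through to ordinary recognition.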
The methods of recognizing strokes and radicals, and the predictive recognition method, according to the present invention are described in detail below.
In step 303, the 5 strokes (horizontal, vertical, left-falling, right-falling and fold) are recognized as follows. The handwriting points between each pen-down and the following pen-up are input to the character recognizer 110 for recognition. If the first-choice recognition result is one of these 5 strokes and its similarity value is less than a threshold T1 (the smaller the similarity value, the higher the confidence of the recognition result), the handwriting points currently entered are taken to be the stroke given by the first-choice recognition result. Radicals are recognized as follows. All handwriting points the user has entered so far are input to the character recognizer 110 for recognition. If the first-choice recognition result is a radical and its similarity value is less than a threshold T2 (the smaller the similarity value, the higher the confidence of the candidate character), the handwriting points currently entered are taken to be the radical given by the first-choice recognition result.
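The acceptance test for strokes and radicals can be sketched as a single thresholded check. The helper name `accept` and the one-entry radical set are assumptions for illustration; the values 3200, 2011 and the thresholds of 5000 are those quoted in the worked example later in this description, which uses a separate threshold T2 for radicals.

```python
from typing import Optional, Set

STROKES: Set[str] = {"horizontal", "vertical", "left-falling", "right-falling", "fold"}


def accept(top_label: str, top_value: int, allowed: Set[str], threshold: int) -> Optional[str]:
    """Accept the first-choice result only if it is an allowed label and close enough."""
    if top_label in allowed and top_value < threshold:
        return top_label
    return None  # nothing identified; the corresponding predicted set stays empty


T1 = 5000  # stroke threshold, value taken from the worked example
T2 = 5000  # radical threshold, value taken from the worked example
RADICALS: Set[str] = {"mouth"}  # placeholder radical set (hypothetical)

stroke = accept("vertical", 3200, STROKES, T1)     # pen-down to pen-up points
radical = accept("mouth", 2011, RADICALS, T2)      # all points entered so far
rejected = accept("vertical", 6100, STROKES, T1)   # too dissimilar to trust
```

Returning None rather than a low-confidence guess keeps unreliable strokes or radicals from narrowing the candidate list incorrectly.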
A preferred predictive recognition method used in step 306 is described below. Let A be the stroke predicted character set retrieved in step 305, B the radical predicted character set, C the association predicted character set, and D the character set of the local recognition candidates from step 304. If no stroke was identified, A is empty; if no radical was identified, B is empty; if there is no previous character, C is empty; and if recognizing all the handwriting points written so far yields no candidate, D is empty. Ordered candidate sets E, F and G are also defined. The predictive recognition method can then be described as:
1. If A and B are both empty, then E is empty;
2. If A is non-empty and B is empty, then E = A;
3. If A is empty and B is non-empty, then E = B;
4. If A and B are both non-empty and have an intersection, then E = B, with the characters of the intersection ranked first in E;
5. If A and B are both non-empty and have no intersection, then E = B;
6. If E and C are both empty, then F is empty;
7. If E is non-empty and C is empty, then F = E;
8. If E is empty and C is non-empty, then F = C;
9. If E and C are both non-empty and have an intersection, then F = E, with the characters of the intersection ranked first in F;
10. If E and C are both non-empty and have no intersection, then F = E;
11. If F and D are both empty, then G is empty;
12. If F is non-empty and D is empty, then G = F;
13. If F is empty and D is non-empty, then G = D;
14. If F and D are both non-empty and have an intersection, then G = F, with the characters of the intersection ranked first in G;
15. If F and D are both non-empty and have no intersection, then G = F;
16. Output the predictive recognition result G.
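The sixteen rules above apply the same two-set combination three times, which the following sketch makes explicit. The names `merge` and `predict` are hypothetical, and one detail is an assumption: the text does not say how characters are ordered within the intersection, so the sketch keeps the order of the set that is retained.

```python
from typing import List


def merge(primary: List[str], secondary: List[str]) -> List[str]:
    """One stage of the rules: keep `primary` if non-empty, shared characters first."""
    if not primary:
        return list(secondary)          # covers "both empty" and "only secondary"
    if not secondary:
        return list(primary)
    shared = set(primary) & set(secondary)
    # Stable partition: intersection first, original order kept inside each part.
    return ([c for c in primary if c in shared] +
            [c for c in primary if c not in shared])


def predict(a: List[str], b: List[str], c: List[str], d: List[str]) -> List[str]:
    """Combine stroke set A, radical set B, association set C and local set D."""
    e = merge(primary=b, secondary=a)   # rules 1-5: B wins when both are non-empty
    f = merge(primary=e, secondary=c)   # rules 6-10: E wins
    g = merge(primary=f, secondary=d)   # rules 11-15: F wins
    return g                            # rule 16


result = predict(["a", "b", "c"], ["d", "b", "e"], ["e", "x"], [])
```

In this toy call, "b" is promoted within B because it is also in A, and "e" is then promoted because it is also in the association set C.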
The recognition process is described in detail below, taking as an example the character "not" in the word "identification" being entered by the user. Assume that the preceding character, "knowledge", has already been entered. The detailed recognition steps are as follows.
Step 1: The user handwrites the first stroke [handwriting image] on the handwriting input module 100.
Step 2: The current input is one stroke (one pen-down and pen-up), so step 303 is performed.
Step 3: The handwritten character recognizer 110 recognizes [handwriting image] and obtains the first-choice recognition result, the vertical stroke "Shu", with a similarity value of 3200.
Step 4: The similarity value 3200 is less than the preset threshold T1 = 5000, so the recognition result is the vertical stroke "Shu". The local recognition result is "Shu ' the 11i fore-telling! V w r", and the local recognition result has no corresponding radical.
Step 5: From the prediction character libraries already built, the predicted character set corresponding to "Shu" is retrieved, "only see little day in when going up state in being work as back herewith a little bright because of listening most the interior other water of four-hole ...", together with the association predicted character set corresponding to the previous character "knowledge", "malapropism must be seen broken to fractal boundary it ...". The local recognition result at this point is "Shu ' the 11i fore-telling! V w r".
Step 6: Predictive recognition is performed. The preferred recognition result candidate characters for the handwriting points entered so far are "only see little day in when going up state in not being work as back herewith a little bright because of listening most the interior water of four-hole ...".
Step 7: The display module 140 displays the above recognition result candidate characters for the user to select from.
Step 8: The user does not select from the candidate characters but continues to write the 2nd stroke: [handwriting image] (note: the user can already obtain the correct recognition result at this step).
Step 9: The handwritten character recognizer 110 recognizes the 2nd stroke [handwriting image] and obtains the first-choice recognition result, the fold stroke "Ya", with a similarity value of 3556. The handwritten character recognizer 110 also recognizes all handwriting points entered so far [handwriting image] and obtains the local recognition result "mouthful day several R2 say Z river district P", where the first-choice result "mouth" has a similarity value of 2011.
Step 10: The similarity value 3556 of the first-choice result, the fold stroke "Ya", is less than the preset threshold T1 = 5000, so the 2nd stroke is recognized as the fold stroke "Ya". The similarity value 2011 of the first-choice result "mouth" is less than the preset threshold T2 = 5000, so the part entered so far is the radical "mouth".
Step 11: From the prediction character libraries already built, the following are retrieved: the predicted character set corresponding to the first 2 strokes "Shu Ya", "when mouthful day being China in the order Tian Yue say and the dawn socket of the eye is looked up with eyes wide open eyeball and stared at that late at night drought is prosperous finely sees peaceful sleepy socket of the eye and take aim to hide and look sidelong at tool and look at ..."; the predicted character set corresponding to the radical "mouth", "a mouthful medium size sting porphin only a history brother sound of a bird chirping rebuke to sigh to hold in the month and do not cry ..."; and the association predicted character set corresponding to the previous character "knowledge", "malapropism must be seen brokenly to fractal boundary it ...". The local recognition result at this point is "mouthful day several R2 say Z river district P".
Step 12: Predictive recognition is performed. The preferred recognition result candidate characters for the handwriting points entered so far are "do not see that boundary's mouth medium size stings porphin a history brother sound of a bird chirping and rebuke to sigh to hold in the month and cry ...".
Step 13: The display module 140 displays the above recognition result candidate characters for the user to select from.
Step 14: The user does not select from the candidate characters but continues to write the 3rd stroke [handwriting image] (note: the user can obtain the correct recognition result at this step).
Step 15: The handwritten character recognizer 110 recognizes the 3rd stroke [handwriting image]; no stroke is identified. The handwritten character recognizer 110 also recognizes all handwriting points entered so far [handwriting image] and obtains the local recognition result "other merit cut to pieces cut before the letter an ancient unit of weight encourage draw row"; the local recognition result has no corresponding radical.
Step 16: From the prediction character libraries already built, the following are retrieved: the predicted character set corresponding to the first two strokes "Shu Ya", "when mouthful day being China in the order Tian Yue say and the dawn socket of the eye is looked up with eyes wide open eyeball and stared at that late at night drought is prosperous finely sees peaceful sleepy socket of the eye and take aim to hide and look sidelong at tool and look at ..."; the predicted character set corresponding to the radical "mouth", "a mouthful medium size sting porphin only a history brother sound of a bird chirping rebuke to sigh to hold in the month and do not cry ..."; and the association predicted character set corresponding to the previous character "knowledge", "malapropism must be seen brokenly to fractal boundary it ...". The local recognition result at this point is "other merit cut to pieces cut before the letter an ancient unit of weight encourage draw row".
Step 17: Predictive recognition is performed. The preferred recognition result candidate characters for the handwriting points entered so far are "other merit is cut a mouthful medium size to pieces and is stung porphin a history brother sound of a bird chirping and rebuke to sigh to hold in the month and cry ...".
Step 18: The display module 140 displays the above recognition result candidate characters for the user to select from.
Step 19: The user does not select from the candidate characters but continues to write the last stroke [handwriting image] (note: the user can obtain the correct recognition result at this step).
Step 20: Because neither a stroke nor a radical was identified when the previous stroke was written, strokes and radicals are no longer recognized at this step. The handwritten character recognizer 110 recognizes only the handwriting points entered so far [handwriting image], and the local recognition result is "do not cut to pieces declare ice-cold chaste tree row Yan encourage to cut cut".
Step 21: From the prediction character libraries already built, the following are retrieved: the predicted character set corresponding to the first two strokes "Shu Ya", "when mouthful day being China in the order Tian Yue say and the dawn socket of the eye is looked up with eyes wide open eyeball and stared at that late at night drought is prosperous finely sees peaceful sleepy socket of the eye and take aim to hide and look sidelong at tool and look at ..."; the predicted character set corresponding to the radical "mouth", "a mouthful medium size sting porphin only a history brother sound of a bird chirping rebuke to sigh to hold in the month and do not cry ..."; and the association predicted character set corresponding to the previous character "knowledge", "malapropism must be seen brokenly to fractal boundary it ...". The local recognition result at this point is "do not cut to pieces declare ice-cold chaste tree row Yan encourage to cut cut".
Step 22: Predictive recognition is performed. The preferred recognition result candidate characters for the handwriting points entered so far are "not cutting a mouthful medium size to pieces stings porphin a history brother sound of a bird chirping and rebukes to sigh to hold in the month and cry ...".
Step 23: The display module 140 displays the above recognition result candidate characters for the user to select from.
Step 24: The user does not select from the candidate characters but waits for the predetermined waiting time (the waiting time preset by the system to indicate that a character has been finished) to elapse (note: the user can obtain the correct recognition result at this step).
Step 25: After the predetermined waiting time has elapsed, the handwritten character recognizer 110 performs ordinary recognition on the handwriting points entered [handwriting image] and obtains the recognition result "do not cut to pieces declare ice-cold chaste tree row Yan encourage to cut cut".
Step 26: The user selects the first candidate, "not", completing the input of this character.
As can be seen from the above steps, with the method of handwritten character recognition using prediction according to the present invention, the user can obtain the required character "not" as early as step 8, whereas with an ordinary recognition method the user cannot obtain it until step 26.
The method and apparatus for handwritten character recognition using prediction according to the present invention have good extensibility and ease of use. They are suitable for devices such as Tablet PCs, PDAs and handwriting-enabled mobile phones, and are particularly suitable for terminals that, like mobile phones, have weak computing power and limited storage space. They can be widely applied to various mobile terminal devices capable of handwriting input, and also to devices with computing capability that have an external recording device, such as the PC system 400, PDA 401, mobile phone 402 and tablet computer 403 in Fig. 4.
With the method and apparatus for handwritten character recognition using prediction according to the present invention, a user performing handwriting input can obtain the correct recognition result for a character as soon as part of that character has been entered, and can then select it to complete the input of the character. Recognition therefore does not have to wait until the waiting time preset by the system has elapsed, which speeds up handwriting input.
Although the present invention has been particularly shown and described with reference to exemplary embodiments thereof, those skilled in the art will understand that various changes in form and detail may be made to these embodiments without departing from the spirit and scope of the invention as defined by the claims.