CN110858317A - Handwriting recognition method and device

Info

Publication number: CN110858317A
Application number: CN201810975834.2A
Authority: CN
Inventors: 辛晓哲
Original assignee: Beijing Sogou Technology Development Co Ltd; Sogou Hangzhou Intelligent Technology Co Ltd
Current assignee: Beijing Sogou Technology Development Co Ltd
Priority date: 2018-08-24
Filing date: 2018-08-24
Publication date: 2020-03-03

Abstract

The invention discloses a handwriting recognition method and a device, wherein the method comprises the following steps: acquiring character handwriting; acquiring a segmentation model corresponding to the language category to which the character handwriting belongs; segmenting the character handwriting by using the segmentation model to obtain a segmentation block sequence; and identifying the segmentation block sequence to obtain a character string corresponding to the character handwriting. By using the method and the device, relatively high identification accuracy can be achieved under the condition that data resources are relatively few.

Description

Handwriting recognition method and device

Technical Field

The invention relates to the field of handwriting recognition, in particular to a handwriting recognition method and device.

Background

Handwriting recognition is the process of converting the ordered trajectory information generated when writing on a handwriting device into internal character codes; in essence, it maps the coordinate sequence of a handwriting trajectory to internal character codes. It is one of the most natural and convenient means of human-computer interaction. With the spread of intelligent terminals such as smartphones and palmtop computers, handwriting recognition has entered an era of large-scale application.

Because the languages of different countries have different writing characteristics, existing handwriting recognition technology aimed at a single language cannot recognize handwritten text in other languages well. For this reason, several technical solutions for multilingual handwriting recognition have been proposed in the art, such as a sequence recognition method based on LSTM (Long Short-Term Memory network) + CTC (Connectionist Temporal Classification) and a rule-based HMM (Hidden Markov Model) method. The LSTM method requires a large amount of training data to converge well, and a large amount of computing resources and training time to reach good accuracy (otherwise the word accuracy drops considerably). The HMM-based method can effectively handle the joining of cursive strokes, but its feature extraction is very difficult, and feature extraction directly affects the final model performance.

Disclosure of Invention

The embodiment of the invention provides a handwriting recognition method and device, which can achieve relatively high recognition accuracy under the condition of relatively less data resources.

Therefore, the invention provides the following technical scheme:

a handwriting recognition method, the method comprising:

acquiring character handwriting;

acquiring a segmentation model corresponding to the language category to which the character handwriting belongs;

segmenting the character handwriting by using the segmentation model to obtain a segmentation block sequence;

and identifying the segmentation block sequence to obtain a character string corresponding to the character handwriting.

Optionally, the language categories include: a continuous-stroke character writing language and a non-continuous-stroke character writing language.

Optionally, the method further comprises: pre-constructing a segmentation model corresponding to the continuous stroke character writing language by the following method:

collecting continuous handwriting data as a training sample, and labeling segmentation points of the training sample;

determining characteristic information of each training sample; the characteristic information includes: a baseline of the training sample; the base line is the average line of the interval with the maximum number of projection coordinate points of the character handwriting on the Y axis or the X axis;

determining the intersection point of the training sample and the base line of the training sample, and taking the intersection point as pre-estimated segmentation information;

and training by using the pre-estimated segmentation information and the labeling information to obtain a segmentation model corresponding to the continuous-stroke character writing language.

Optionally, the segmentation model is a regression model or a convolutional neural network model.

Optionally, the method further comprises: before the segmentation block sequence is identified, judging whether the language category corresponding to the character handwriting is a language prone to over-segmentation; if so, combining the segmentation blocks in the segmentation block sequence; otherwise, executing the step of identifying the segmentation block sequence.

Optionally, the combining the segmentation blocks in the segmentation block sequence includes:

extracting the geometric features of each segmentation block;

judging whether adjacent segmentation blocks belong to the same character by using a pre-constructed multivariate geometric model and the geometric features of the adjacent segmentation blocks;

if so, merging the adjacent segmentation blocks.

Optionally, the method further comprises: pre-constructing the multivariate geometric model by:

collecting a large amount of segmentation block data as training samples, and labeling segmentation blocks belonging to the same character;

extracting geometric features of the training sample, wherein the geometric features comprise any one or more of the following: the length, width and position of each segmentation block; the distance and positional relationship between adjacent segmentation blocks; the difference between the length of each segmentation block and the average length, and the difference between the width of each segmentation block and the average width;

and training by using the geometric features and the labeling information to obtain the multivariate geometric model.

Optionally, the multivariate geometric model is a regression model or a classification model.

Optionally, the recognizing the segmentation block sequence to obtain a character string corresponding to the character script includes:

identifying the segmentation blocks in the segmentation block sequence to obtain characters corresponding to the segmentation blocks;

determining each candidate path;

calculating the score of each candidate path;

selecting an optimal candidate path;

and obtaining a character string corresponding to the character handwriting according to each character on the optimal candidate path.

Optionally, the calculating the score of each candidate path includes:

calculating the scores of all the characters on the candidate paths, and adding the scores of all the characters on the candidate paths to obtain a score sum;

and dividing the sum of the scores by the length of the candidate path to obtain the score of the candidate path.

Optionally, the calculating the score of each character on the candidate path includes:

calculating the score of each character on the candidate path according to any one or more of the following items: the classifier score, the language model score and the lexicon score of the character.

A handwriting recognition apparatus, said apparatus comprising:

the receiving module is used for acquiring character handwriting;

the segmentation model acquisition module is used for acquiring a segmentation model of the language category to which the character handwriting belongs;

the segmentation module is used for segmenting the character handwriting by using the segmentation model to obtain a segmentation block sequence;

and the recognition module is used for recognizing the segmentation block sequence obtained by the segmentation module to obtain a character string corresponding to the character handwriting.

Optionally, the apparatus further comprises: a segmentation model construction module for constructing a segmentation model in advance; for constructing the segmentation model corresponding to the continuous-stroke character writing language, the segmentation model construction module comprises:

the data acquisition unit is used for acquiring continuous handwriting data as a training sample and marking the segmentation points of the training sample;

the characteristic information determining unit is used for determining the characteristic information of each training sample; the characteristic information includes: a baseline of the training sample; the base line is the average line of the interval with the maximum number of projection coordinate points of the character handwriting on the Y axis or the X axis;

the pre-estimation unit is used for determining the intersection point of the training sample and the base line of the training sample, and taking the intersection point as pre-estimation segmentation information;

and the segmentation model training unit is used for training by using the pre-estimated segmentation information and the labeling information to obtain a segmentation model corresponding to the continuous-stroke character writing language.

Optionally, the apparatus further comprises: the combination judgment module and the combination processing module;

the combination judgment module is used for judging, before the recognition module identifies the segmentation block sequence, whether the language category corresponding to the character handwriting is a language prone to over-segmentation; and if so, triggering the combination processing module to combine the segmentation blocks in the segmentation block sequence.

Optionally, the combination processing module includes:

a geometric feature extraction unit for extracting the geometric features of each segmentation block;

the character judgment unit is used for judging whether adjacent segmentation blocks belong to the same character by using a pre-constructed multivariate geometric model and the geometric features of the adjacent segmentation blocks;

and the combining unit is used for merging the adjacent segmentation blocks after the character judgment unit judges that the adjacent segmentation blocks belong to the same character.

Optionally, the apparatus further comprises:

a geometric model construction module for constructing the multivariate geometric model; the geometric model construction module comprises:

the sample acquisition unit is used for acquiring a large amount of segmentation block data as training samples and marking segmentation blocks belonging to the same character;

a geometric feature extraction unit, configured to extract geometric features of the training sample, where the geometric features include any one or more of: the length, width and position of each segmentation block; the distance and positional relationship between adjacent segmentation blocks; the difference between the length of each segmentation block and the average length, and the difference between the width of each segmentation block and the average width;

and the geometric model training unit is used for training by utilizing the geometric characteristics and the labeling information to obtain the multivariate geometric model.

Optionally, the identification module comprises:

the character recognition unit is used for recognizing the segmentation blocks in the segmentation block sequence to obtain characters corresponding to the segmentation blocks;

a candidate path determination unit for determining each candidate path;

a path score calculation unit for calculating a score of each candidate path;

a selection unit for selecting an optimal candidate path;

and the output unit is used for obtaining a character string corresponding to the character handwriting according to each character on the optimal candidate path.

Optionally, the path score calculating unit includes:

the character score calculating unit is used for calculating the scores of all the characters on the candidate paths and adding the scores of all the characters on the candidate paths to obtain the sum of the scores;

and the normalization unit is used for dividing the total score sum by the length of the candidate path to obtain the score of the candidate path.

Optionally, the character score calculating unit calculates the score of each character on the candidate path according to any one or more of the following items: the classifier score, the language model score and the lexicon score of the character.

A computer device, comprising: one or more processors, memory;

the memory is configured to store computer-executable instructions and the processor is configured to execute the computer-executable instructions to implement the method described above.

A readable storage medium having stored thereon instructions which are executed to implement the foregoing method.

According to the handwriting recognition method and device provided by the embodiment of the invention, corresponding segmentation models are respectively utilized to segment aiming at the writing characteristics of different language categories to obtain segmentation block sequences, and character recognition results are obtained by identifying the segmentation block sequences. By using the scheme of the invention, relatively high identification accuracy can be achieved under the condition of relatively less data resources.

Furthermore, depending on whether the language category is prone to over-segmentation, the segmentation block sequence obtained by segmentation is either identified directly, or its segmentation blocks are first combined and then identified. This suits the segmentation characteristics of different languages and improves segmentation accuracy.

Furthermore, in the identification process, a penalty term for path length is added when calculating the score of each candidate path: the score of every candidate path is normalized by its path length, so that a score per unit length is obtained. This avoids the situation in which a long path always scores higher than a short path, and makes the candidate path scores more reasonable and effective.

Further, when calculating the candidate path score, the comprehensive score of each character on the candidate path, that is, the classifier score, the language model score and the lexicon score of the character are fully considered, so that a more accurate recognition result can be obtained.

Drawings

In order to more clearly illustrate the embodiments of the present application or technical solutions in the prior art, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present invention, and other drawings can be obtained by those skilled in the art according to the drawings.

FIG. 1 is a flow chart of a handwriting recognition method according to an embodiment of the present invention;

FIG. 2 is a flowchart of constructing a segmentation model corresponding to the continuous-stroke character writing language according to an embodiment of the present invention;

FIG. 3 is a flow chart of constructing a multivariate geometric model according to an embodiment of the invention;

FIG. 4 is a flow chart of a character string recognition process in a handwriting recognition method according to an embodiment of the invention;

FIG. 5 is a schematic diagram of a handwriting recognition apparatus according to an embodiment of the invention;

FIG. 6 is a schematic diagram of another embodiment of a handwriting recognition apparatus;

FIG. 7 is a schematic structural diagram of a geometric model building module according to an embodiment of the present invention;

FIG. 8 is a schematic diagram of an architecture of a combinatorial processing module in an embodiment of the invention;

FIG. 9 is a diagram illustrating an exemplary structure of an identification module according to an embodiment of the present invention;

FIG. 10 is a block diagram illustrating an apparatus of a method of handwriting recognition, according to an example embodiment;

FIG. 11 is a schematic structural diagram of a server in an embodiment of the present invention.

Detailed Description

In order to enable those skilled in the art to better understand the scheme of the embodiments of the invention, the embodiments are described in further detail below with reference to the drawings and implementations.

FIG. 1 is a flowchart of a handwriting recognition method according to an embodiment of the present invention, which includes the following steps:

Step 101, acquiring character handwriting.

The character handwriting can be acquired online in real time, or acquired offline through an image recognition technology, and the embodiment of the invention is not limited.

In addition, the character handwriting may be input row by row, as in Western languages, or column by column, as in Mongolian, which is written vertically from top to bottom in columns running from left to right, with all letters in each word written continuously to form a vertical trunk line.

It should be noted that character handwriting contains noise caused by stroke jitter, variations in writing speed and the like, which interferes with the subsequent segmentation. In practical applications, the data sequence corresponding to the character handwriting can therefore be denoised, for example with an existing denoising technique such as Gaussian filtering. In addition, to balance the distribution of sampling points and to allow for the user's free writing style, the data sequence can also be resampled, slant-corrected and so on using existing techniques; for example, resampling may use nearest-neighbor interpolation or bilinear interpolation, and slant correction may use the Hough transform.
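The preprocessing described above can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the function names `gaussian_smooth` and `resample`, the kernel radius and the `spacing` parameter are all assumptions, and the resampling uses linear interpolation along the arc length rather than the nearest-neighbor or bilinear interpolation named in the text.

```python
import math


def gaussian_smooth(points, sigma=1.0, radius=2):
    """Smooth a list of (x, y) points with a truncated Gaussian kernel."""
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in offsets]
    n = len(points)
    smoothed = []
    for i in range(n):
        sx = sy = sw = 0.0
        for k, w in zip(offsets, weights):
            j = min(max(i + k, 0), n - 1)  # clamp indices at the stroke ends
            sx += w * points[j][0]
            sy += w * points[j][1]
            sw += w
        smoothed.append((sx / sw, sy / sw))
    return smoothed


def resample(points, spacing):
    """Resample a polyline at equal arc-length spacing (linear interpolation)."""
    if len(points) < 2:
        return list(points)
    out = [points[0]]
    carried = 0.0  # distance travelled since the last emitted point
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0.0:
            continue  # skip duplicated sample points
        d = spacing - carried
        while d <= seg:
            t = d / seg
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            d += spacing
        carried = seg - (d - spacing)
    return out
```

Smoothing would typically be applied before resampling, so that jitter does not inflate the measured arc length.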

Step 102, acquiring a segmentation model corresponding to the language category to which the character handwriting belongs.

In the user input device, selection keys for different languages can be set, and can be physical keys or virtual keys, and for online handwriting recognition, the current input language category of the user can be determined according to the trigger signal of the corresponding key. For offline handwriting recognition, the language class to which the character script belongs may be determined manually.

In the embodiment of the invention, language categories are divided mainly according to the writing habits of different languages, into continuous-stroke character writing languages and non-continuous-stroke character writing languages. A continuous-stroke character writing language is one in which multiple characters within a word are written in a continuous stroke, for example Arabic, English, German, French and Russian; joined strokes between different strokes within a single character (as in Chinese characters) do not count. A non-continuous-stroke character writing language is one in which adjacent characters are not usually joined, for example Chinese, Japanese and Korean.

For a continuous-stroke character writing language, a segmentation model specific to that language category can be adopted; for a non-continuous-stroke character writing language, a conventional segmentation model can be adopted.

To address the scarcity of continuous-stroke data samples, in the embodiment of the invention a small amount of continuous-stroke handwriting data can be used to train the corresponding segmentation model. The segmentation model can be a regression model or a classification model, and its topology can be a neural network, such as a convolutional neural network.

The construction process of the segmentation model corresponding to the continuous stroke character writing language is shown in FIG. 2, and comprises the following steps:

Step 21, collecting continuous-stroke handwriting data as training samples, and labeling the segmentation points of the training samples.

Step 22, determining characteristic information of each training sample; the characteristic information includes the base line of the training sample, and may further include the lowest line and the highest line of the training sample.

The lowest line is a fitting straight line connecting local minimum coordinate points of Y values or X values in the character handwriting; the highest line is a fitting straight line connecting local maximum coordinate points of Y values or X values in the character handwriting; the base line is the average line of the interval with the maximum number of projection coordinate points of the character handwriting on the Y axis or the X axis.

As mentioned above, the character handwriting may be input continuously in rows or in columns.

For character handwriting input continuously in rows, the lowest line is a fitting straight line connecting local minimum coordinate points of Y values in the character handwriting; the highest line is a fitting straight line connecting local maximum coordinate points of Y values in the character handwriting; and the base line is the average line of the interval with the maximum number of the projection coordinate points of the character handwriting on the Y axis. When the baseline is determined, the character handwriting can be projected to the Y axis, a statistical histogram is obtained according to the projection, and the statistical histogram records the number of coordinate points in each interval of the Y axis; and then determining an interval with the maximum number of coordinate points on the Y axis according to the statistical histogram, and taking the mean line of the interval as the baseline of the character handwriting.

For character handwriting input continuously in columns, the lowest line is a fitted straight line connecting local minimum coordinate points of X values in the character handwriting; the highest line is a fitted straight line connecting local maximum coordinate points of X values in the character handwriting; and the base line is the mean line of the interval with the maximum number of projection coordinate points of the character handwriting on the X axis. When the base line is determined, the character handwriting can be projected onto the X axis and a statistical histogram obtained from the projection, which records the number of coordinate points in each interval of the X axis; the interval with the maximum number of coordinate points on the X axis is then determined from the statistical histogram, and the mean line of that interval is taken as the base line of the character handwriting.
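The histogram-based base line determination described above can be sketched as below. The function name `baseline`, the `num_bins` parameter and the use of equal-width intervals are assumptions for illustration; the "mean line" of the densest interval is taken here to be the midpoint of that interval, which is one possible reading of the text.

```python
def baseline(points, axis=1, num_bins=10):
    """Return the mean line of the densest projection interval.

    axis=1 projects onto the Y axis (handwriting input in rows);
    axis=0 projects onto the X axis (handwriting input in columns).
    """
    values = [p[axis] for p in points]
    lo, hi = min(values), max(values)
    if hi == lo:
        return float(lo)  # degenerate case: all points share one coordinate
    width = (hi - lo) / num_bins
    counts = [0] * num_bins  # the statistical histogram
    for v in values:
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    densest = counts.index(max(counts))
    # Assumption: the "mean line" is the midpoint of the densest interval.
    return lo + (densest + 0.5) * width
```

For example, a stroke whose points cluster around Y = 5 on a 0 to 10 range yields a base line inside the interval that contains 5.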

It should be noted that the writing track may contain a plurality of local lowest points and a plurality of local highest points, where "local" refers to a set step length along the X-axis or Y-axis direction. The highest line and the lowest line can be obtained by fitting straight lines through the local highest points and the local lowest points of the writing track, respectively.
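As a sketch of this fitting for row-wise handwriting, the code below takes one lowest and one highest point per window of width `step` along the X axis and fits a least-squares line through each set. The names `fit_line` and `extreme_lines`, the windowing scheme and the choice of ordinary least squares are assumptions; the text does not specify the fitting method.

```python
def fit_line(pts):
    """Least-squares fit of y = a * x + b through (x, y) points."""
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mx) ** 2 for p in pts)
    if sxx == 0.0:
        return 0.0, my  # all points share one x: fall back to a flat line
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts)
    a = sxy / sxx
    return a, my - a * mx


def extreme_lines(points, step):
    """Return ((a, b) of the lowest line, (a, b) of the highest line)."""
    x_start = min(p[0] for p in points)
    windows = {}
    for p in points:
        # "Local" means within one window of width `step` along the X axis.
        windows.setdefault(int((p[0] - x_start) // step), []).append(p)
    lows = [min(w, key=lambda p: p[1]) for w in windows.values()]
    highs = [max(w, key=lambda p: p[1]) for w in windows.values()]
    return fit_line(lows), fit_line(highs)
```

For column-wise handwriting the same idea applies with the roles of X and Y exchanged.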

In addition, when determining the projection interval of the character handwriting on the Y axis or the X axis, the position of the base line can be determined more accurately by means of the highest line and the lowest line.

Step 23, determining the intersection point of the training sample and the base line thereof, and taking the intersection point as pre-estimated segmentation information.

Step 24, training by using the pre-estimated segmentation information and the labeling information to obtain a segmentation model corresponding to the continuous-stroke character writing language.

Step 103, segmenting the character handwriting by using the segmentation model to obtain a segmentation block sequence.

Taking a regression model as the segmentation model for the continuous-stroke character writing language as an example, character handwriting of such a language is segmented as follows. First, the feature information of the character handwriting is determined; the feature information includes the base line of the character handwriting and may further include its lowest line and highest line. Next, the intersection points of the character handwriting with the base line are determined and taken as pre-estimated segmentation information. The coordinates of each point in the pre-estimated segmentation information are then input into the segmentation model in turn, and the probability of each point is obtained from the model output; if the probability is greater than a set threshold, the point is determined to be an actual segmentation point. Finally, the character handwriting is segmented at the actual segmentation points to obtain segmentation blocks, each corresponding to a single character.
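A minimal sketch of this procedure for row-wise handwriting is given below. It assumes a horizontal base line `base_y` and a hypothetical callable `model(x, y)` that returns the probability that a candidate point is a true segmentation point; neither the callable interface nor the default threshold of 0.5 comes from the text.

```python
def baseline_crossings(points, base_y):
    """Return (index, x, y) for each crossing of the stroke with y = base_y.

    `index` is the position of the first sample after the crossing.
    """
    crossings = []
    for i in range(len(points) - 1):
        (x0, y0), (x1, y1) = points[i], points[i + 1]
        d0, d1 = y0 - base_y, y1 - base_y
        if d0 * d1 < 0:  # strict sign change: the segment crosses the line
            t = d0 / (d0 - d1)
            crossings.append((i + 1, x0 + t * (x1 - x0), base_y))
    return crossings


def segment(points, base_y, model, threshold=0.5):
    """Split the handwriting at candidates the model accepts."""
    cut_indices = [
        i for (i, x, y) in baseline_crossings(points, base_y)
        if model(x, y) > threshold
    ]
    blocks, start = [], 0
    for i in cut_indices:
        blocks.append(points[start:i])
        start = i
    blocks.append(points[start:])
    return blocks
```

Splitting by sample index rather than by X position keeps the blocks in writing order even when a stroke loops back on itself.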

The specific process of segmenting the character handwriting of the non-continuous-stroke character writing language based on the segmentation model of the corresponding non-continuous-stroke character writing language is the same as that of the prior art, and is not repeated herein.

Step 104, identifying the segmentation block sequence to obtain a character string corresponding to the character handwriting.

The specific identification process will be described in detail later.

It should be noted that the handwriting recognition method of the embodiment of the present invention can perform segmentation and recognition for a plurality of different languages with different writing habits. For some character scripts which may generate over segmentation, in order to improve the accuracy of subsequent recognition, in another embodiment of the method of the present invention, before the segmentation block sequence is recognized, whether the language category corresponding to the character scripts is a language which is easy to generate over segmentation may be judged; if so, combining the segmentation blocks in the segmentation block sequence, and then identifying the segmentation block sequence after combination processing; otherwise, the above step 104 is directly performed.

Character handwriting that may be over-segmented includes, for example, Chinese, French and German, for which adjacent segmentation blocks need to be combined. Chinese characters have many strokes and a radical-based structure, and the strokes within one character do not all intersect, so over-segmentation occurs easily. In languages such as French and German, the diacritical marks on accented letters do not intersect the letters themselves, so over-segmentation also occurs easily.

When the segmentation blocks are combined, the geometric features of each segmentation block are first extracted, the geometric features comprising any one or more of the following: the length, width and position of each segmentation block; the distance and positional relationship between adjacent segmentation blocks; the difference between the length of each segmentation block and the average length, and the difference between the width of each segmentation block and the average width. Then, whether adjacent segmentation blocks belong to the same character is judged by using a pre-constructed multivariate geometric model and the geometric features of the adjacent segmentation blocks; if so, the adjacent segmentation blocks are merged.
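The combination step can be sketched as follows. The feature names, the bounding-box representation and the callable `same_character` (standing in for the pre-constructed multivariate geometric model) are hypothetical; "length" and "width" are mapped here to bounding-box height and width as an assumption.

```python
def bbox(block):
    """Bounding box (x_min, y_min, x_max, y_max) of a list of (x, y) points."""
    xs = [p[0] for p in block]
    ys = [p[1] for p in block]
    return min(xs), min(ys), max(xs), max(ys)


def pair_features(left, right, mean_width, mean_height):
    """Geometric features of two adjacent segmentation blocks."""
    lx0, ly0, lx1, ly1 = bbox(left)
    rx0, ry0, rx1, ry1 = bbox(right)
    return {
        "gap": rx0 - lx1,  # negative when the blocks overlap horizontally
        "left_width_diff": (lx1 - lx0) - mean_width,
        "right_width_diff": (rx1 - rx0) - mean_width,
        "left_height_diff": (ly1 - ly0) - mean_height,
        "right_height_diff": (ry1 - ry0) - mean_height,
        "vertical_offset": ry0 - ly0,
    }


def merge_blocks(blocks, same_character):
    """Merge adjacent blocks whenever same_character(features) is true."""
    if not blocks:
        return []
    boxes = [bbox(b) for b in blocks]
    mean_width = sum(b[2] - b[0] for b in boxes) / len(boxes)
    mean_height = sum(b[3] - b[1] for b in boxes) / len(boxes)
    merged = [list(blocks[0])]
    for block in blocks[1:]:
        feats = pair_features(merged[-1], block, mean_width, mean_height)
        if same_character(feats):
            merged[-1].extend(block)  # same character: grow the current block
        else:
            merged.append(list(block))
    return merged
```

In this sketch a diacritical mark that overlaps its letter horizontally produces a negative `gap`, which a trained model could learn to treat as evidence for merging.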

Specifically, for each language, a multivariate geometric model corresponding to the language can be constructed. The multivariate geometric model may employ a regression model or a classification model.

FIG. 3 is a flowchart of constructing a multivariate geometric model in the embodiment of the present invention, which includes the following steps:

Step 31, collecting a large amount of segmentation block data as training samples, and labeling segmentation blocks belonging to the same character.

Step 32, extracting the geometric features of the training samples.

The geometric features include any one or more of: the length, width and position of each segmentation block; the distance and positional relationship between adjacent segmentation blocks; the difference between the length of each segmentation block and the average length, and the difference between the width of each segmentation block and the average width.

It should be noted that, in practical applications, specific information included in the geometric features may be determined according to different structural features of each language, and is not limited to the above listed information, and other information for describing the geometric features may also be included, so that the embodiment of the present invention is not limited thereto.

Step 33, training by using the geometric features and the labeling information to obtain a multivariate geometric model.
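As one possible instance of step 33, the sketch below trains a logistic regression classifier by stochastic gradient descent on geometric feature vectors. Logistic regression is an assumed choice; the text only says the multivariate geometric model may be a regression model or a classification model. The function names, learning rate and epoch count are likewise hypothetical.

```python
import math


def train_logistic(features, labels, lr=0.1, epochs=500):
    """Fit weights and bias; labels are 1 (same character) or 0 (different)."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the log loss with respect to z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def predict_proba(model, x):
    """Probability that two adjacent blocks belong to the same character."""
    w, b = model
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))
```

A feature vector here could hold, for example, the horizontal gap between two adjacent blocks, with a negative gap labeled as belonging to the same character.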

It should be noted that, in practical applications, a unified multivariate geometric model can be constructed for various different languages. Accordingly, corresponding grammatical features need to be added when the multivariate geometric model is trained.

It should be noted that, when the segmentation block sequence is identified, each segmentation block may be identified first to obtain corresponding characters, and a character string corresponding to the character handwriting is obtained according to the characters.

For the recognition of each segmentation block, the prior art can be adopted, and the corresponding character of each segmentation block is obtained by using the corresponding language model.

Because several characters may be recognized for each segmentation block, each with a different confidence, in practical applications the character with the highest confidence may be taken as the character of that segmentation block, and the string formed by these characters may be taken as the character string corresponding to the character handwriting.

In order to further improve the accuracy of the finally obtained character string, in another embodiment of the method of the present invention, the comprehensive score of each character obtained by identifying the segmentation block may be comprehensively considered, so as to select an optimal candidate path, and obtain the character string corresponding to the character handwriting according to each character on the optimal candidate path.

As shown in fig. 4, it is a flowchart of a character string recognition process in a handwriting recognition method according to an embodiment of the present invention, and the method includes the following steps:

Step 401, recognizing the segmentation blocks in the segmentation block sequence to obtain each character corresponding to the segmentation blocks.

Step 402, determining candidate paths.

Step 403, calculating the score of each candidate path.

The scores of the characters on a candidate path may be added, and the sum may be used as the score of that candidate path.

Further, considering that different candidate paths may have different lengths, the scores of the candidate paths may be normalized in order to eliminate the imbalance between the scores of long paths and short paths. Specifically, the scores of the characters on the candidate path are calculated and added to obtain a total score, and the total score is divided by the length of the candidate path to obtain the score of the candidate path.

The score of each character on a candidate path may combine the following aspects: a classifier score, a language model score, and a lexicon score.

The classifier score and the language model score are the scores given to the character by the classifier and by the language model used in handwriting recognition, respectively. The lexicon score is the score, in a lexicon, of the character string formed by the characters on the candidate path; that is, paths that hit lexicon words are found in the candidate path network and are given higher weights.

Of course, in practical applications, the score of each character on the candidate path may be calculated according to any one or more of the following items: the classifier score, the language model score and the lexicon score of the character.

Step 404, selecting an optimal candidate path.

The candidate paths can be ranked according to the scores from high to low, and one or more candidate paths ranked in the top are selected as the optimal candidate paths.

Step 405, obtaining the character string corresponding to the character handwriting according to each character on the optimal candidate path.
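The path scoring and selection steps above can be sketched as follows. This is a minimal illustration under stated assumptions: a candidate path is represented as a list of (character, score) pairs, and the names `path_score` and `select_best` are hypothetical. How the candidate paths themselves are determined is not shown.

```python
from typing import List, Tuple

# A candidate path: one (character, score) pair per character on the path.
Path = List[Tuple[str, float]]


def path_score(path: Path) -> float:
    """Sum of the character scores divided by the path length."""
    return sum(score for _, score in path) / len(path)


def select_best(paths: List[Path], top_k: int = 1) -> List[str]:
    """Rank candidate paths by normalized score, highest first."""
    non_empty = [p for p in paths if p]  # an empty path has no defined score
    ranked = sorted(non_empty, key=path_score, reverse=True)
    # Join the characters on each selected path into a character string.
    return ["".join(ch for ch, _ in p) for p in ranked[:top_k]]


paths = [
    [("c", 0.9), ("l", 0.5), ("o", 0.9)],  # raw sum 2.3, normalized about 0.77
    [("d", 0.8), ("o", 0.9)],              # raw sum 1.7, normalized 0.85
]
best = select_best(paths)
```

In this example the longer path has the larger raw sum, but the shorter path is selected once the scores are divided by the path length.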

According to the handwriting recognition method provided by the embodiment of the present invention, character handwriting is segmented with the segmentation model corresponding to the writing characteristics of its language category to obtain a segmentation block sequence, and the character recognition result is obtained by recognizing the segmentation block sequence. With the scheme of the present invention, relatively high recognition accuracy can be achieved even when relatively few data resources are available.

Furthermore, depending on whether the language category is one that is prone to over-segmentation, the segmentation blocks obtained by segmentation may either be recognized directly, or be combined first and then recognized, so that the segmentation characteristics of different languages are accommodated and the segmentation accuracy is improved.

Correspondingly, an embodiment of the present invention further provides a handwriting recognition apparatus, as shown in fig. 5, which is a schematic structural diagram of the apparatus.

In this embodiment, the apparatus comprises:

the receiving module 501 is configured to acquire character handwriting, where the character handwriting may be input by a user and acquired online in real time, or may be acquired offline through an image recognition technology, and the embodiment of the present invention is not limited thereto; in addition, the character handwriting may be input continuously in rows or continuously in columns;

a segmentation model obtaining module 502, configured to obtain a segmentation model of a language category to which the character handwriting belongs;

a segmentation module 503, configured to segment the character handwriting by using the segmentation model to obtain a segmentation block sequence;

and the recognition module 506 is configured to recognize the segmentation block sequence obtained by the segmentation module to obtain a character string corresponding to the character handwriting.

In the embodiment of the present invention, the segmentation model may be pre-constructed by a corresponding segmentation model construction module, and the segmentation model construction module may be a part of the apparatus or may be independent of the apparatus, which is not limited in this embodiment of the present invention.

Because different languages have their own writing characteristics, languages can be divided, according to their writing habits, into continuous stroke character writing class languages and non-continuous stroke character writing class languages. Accordingly, a segmentation model for continuous stroke character writing class languages and a segmentation model for non-continuous stroke character writing class languages can be constructed separately. The segmentation model for non-continuous stroke character writing class languages may adopt a conventional segmentation model in the prior art, so only the segmentation model construction module for continuous stroke character writing class languages is described below.

In the embodiment of the invention, the segmentation model construction module for constructing the segmentation model corresponding to the continuous stroke character writing class language comprises the following units:

the characteristic information determining unit is used for determining the characteristic information of each training sample; the characteristic information includes the baseline of the training sample, and further includes the lowest line and the highest line of the training sample; the lowest line is a fitted straight line through the local minimum coordinate points of the Y values or X values in the character handwriting; the highest line is a fitted straight line through the local maximum coordinate points of the Y values or X values in the character handwriting; the baseline is the average line of the interval containing the largest number of projection coordinate points of the character handwriting on the Y axis or the X axis; for the manner of determining the lowest line, the highest line and the baseline, refer to the foregoing description, which is not repeated here;

Based on the above segmentation model corresponding to the continuous stroke character writing class language, when the segmentation module 503 segments character handwriting of such a language, it may first determine the feature information of the character handwriting, where the feature information includes the baseline of the character handwriting and further includes the lowest line and the highest line of the character handwriting. The intersection points of the character track and the baseline are then determined and used as pre-estimated segmentation information; the coordinates of each point in the pre-estimated segmentation information are input into the segmentation model in sequence, and whether each intersection point is an actual segmentation point is determined according to the output of the segmentation model. The character handwriting is then segmented at the actual segmentation points to obtain the segmentation blocks.
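The baseline estimation and the search for intersection points can be sketched as follows for handwriting written along a row. This is only an illustration under several assumptions: the track is a single polyline of (x, y) points, the "interval with the maximum number of projection coordinate points" is found with an equal-width histogram of Y values, the "average line" is taken as the mean Y value inside that interval, and the names `estimate_baseline`, `baseline_crossings` and `num_bins` are hypothetical. The trained segmentation model that decides which intersection points are actual segmentation points is not shown.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) coordinate of one track point


def estimate_baseline(points: List[Point], num_bins: int = 10) -> float:
    """Mean Y value of the densest interval of the Y-axis projection."""
    ys = [y for _, y in points]
    lo, hi = min(ys), max(ys)
    if hi == lo:
        return lo  # flat track: every point already lies on one line
    width = (hi - lo) / num_bins
    bins: List[List[float]] = [[] for _ in range(num_bins)]
    for y in ys:
        # Clamp so that the maximum Y value falls into the last interval.
        idx = min(int((y - lo) / width), num_bins - 1)
        bins[idx].append(y)
    densest = max(bins, key=len)
    return sum(densest) / len(densest)


def baseline_crossings(points: List[Point], baseline: float) -> List[Point]:
    """Points where consecutive track samples cross the baseline."""
    crossings = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d0, d1 = y0 - baseline, y1 - baseline
        if d0 * d1 < 0:  # the two samples lie on opposite sides
            t = d0 / (d0 - d1)  # linear interpolation factor in (0, 1)
            crossings.append((x0 + t * (x1 - x0), baseline))
    return crossings


zigzag = [(0, 0), (1, 2), (2, 0), (3, 2), (4, 0)]
candidates = baseline_crossings(zigzag, baseline=1.0)
```

The returned points correspond to the pre-estimated segmentation information; for handwriting written along a column, the roles of the X and Y coordinates would be exchanged.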

Based on the segmentation model of the corresponding non-continuous-stroke character writing language, the specific process of the segmentation module 503 for segmenting the character handwriting of the non-continuous-stroke character writing language is similar to the prior art, and is not described in detail herein.

The handwriting recognition device provided by the embodiment of the invention can segment and recognize a plurality of different languages with different writing habits. For character handwriting that may be over-segmented, in order to improve the accuracy of subsequent recognition, in another embodiment of the apparatus of the present invention, as shown in fig. 6, the apparatus may further include: a combination judgment module 504 and a combination processing module 505. Before the recognition module 506 recognizes the segmentation block sequence, the combination judgment module 504 judges whether the language category corresponding to the character handwriting is a language that is prone to over-segmentation; if so, the combination processing module 505 is triggered to combine the segmentation blocks in the segmentation block sequence to obtain a combined segmentation block sequence, which is then recognized by the recognition module 506.

When performing the combination processing, the combination processing module 505 may determine whether adjacent segmentation blocks belong to the same character by using a pre-constructed multivariate geometric model, and if so, merge the adjacent segmentation blocks.

The multivariate geometric model may be pre-constructed by a corresponding geometric model construction module, and the geometric model construction module may be a part of the apparatus or may be independent of the apparatus, which is not limited in the embodiments of the present invention.

Fig. 7 is a schematic structural diagram of a geometric model building module according to an embodiment of the present invention, where the geometric model building module includes the following units:

the sample acquisition unit 71 is used for acquiring a large amount of segmentation block data as training samples and marking segmentation blocks belonging to the same character;

a geometric feature extraction unit 72, configured to extract geometric features of the training samples, where the geometric features include any one or more of the following: the length, width and position of each segmentation block; the distance and positional relationship between adjacent segmentation blocks; the difference between the length of each segmentation block and the average length, and the difference between the width of each segmentation block and the average width;

and the geometric model training unit 73 is used for training by using the geometric features and the labeling information to obtain the multivariate geometric model.
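The geometric features listed above can be sketched as a feature vector for each pair of adjacent segmentation blocks. This is a minimal illustration only: representing a segmentation block by its bounding box, interpreting "length" and "width" as the horizontal and vertical extent of that box, and the names `Block` and `pair_features` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """Bounding box of one segmentation block (hypothetical representation)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def length(self) -> float:
        return self.x_max - self.x_min  # horizontal extent

    @property
    def width(self) -> float:
        return self.y_max - self.y_min  # vertical extent


def pair_features(blocks: List[Block]) -> List[List[float]]:
    """One feature vector per pair of horizontally adjacent blocks."""
    if not blocks:
        return []
    mean_len = sum(b.length for b in blocks) / len(blocks)
    mean_wid = sum(b.width for b in blocks) / len(blocks)
    features = []
    for left, right in zip(blocks, blocks[1:]):
        features.append([
            left.length, left.width, right.length, right.width,
            right.x_min - left.x_max,  # gap between the blocks (negative if overlapping)
            right.y_min - left.y_min,  # vertical offset between the blocks
            left.length - mean_len, right.length - mean_len,
            left.width - mean_wid, right.width - mean_wid,
        ])
    return features


blocks = [Block(0, 0, 2, 4), Block(3, 0, 4, 4)]
vectors = pair_features(blocks)
```

Vectors of this kind, together with the labels marking which blocks belong to the same character, could then be fed to whatever classifier is chosen for the multivariate geometric model.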

In addition, a respective multivariate geometric model can be constructed for each language; the unified multivariate geometric model can also be constructed aiming at various different languages, and correspondingly, corresponding grammatical features need to be added when the multivariate geometric model is trained.

Fig. 8 is a schematic structural diagram of a combined processing module according to an embodiment of the present invention.

The combined processing module comprises the following units:

a geometric feature extraction unit 551 for extracting geometric features of each segmentation block;

a character determining unit 552, configured to determine whether adjacent segmentation blocks belong to the same character by using a pre-constructed multivariate geometric model and the geometric features of the adjacent segmentation blocks; the adjacent segmentation blocks comprise vertically adjacent segmentation blocks and horizontally adjacent segmentation blocks;

a combining unit 553, configured to merge the adjacent segmentation blocks after the character determining unit 552 determines that the adjacent segmentation blocks belong to the same character.
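The cooperation of these units can be sketched as a left-to-right merge over the segmentation block sequence. This is an illustration under assumptions: blocks are represented by bounding boxes, the names `merge_adjacent`, `union` and `small_gap` are hypothetical, and the simple gap threshold in `small_gap` is only a stand-in for the decision of the trained multivariate geometric model.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def union(a: Box, b: Box) -> Box:
    """Smallest bounding box that contains both blocks."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_adjacent(blocks: List[Box],
                   same_character: Callable[[Box, Box], bool]) -> List[Box]:
    """Merge each block into its predecessor when both are judged one character."""
    merged: List[Box] = []
    for block in blocks:
        if merged and same_character(merged[-1], block):
            merged[-1] = union(merged[-1], block)
        else:
            merged.append(block)
    return merged


def small_gap(a: Box, b: Box, max_gap: float = 0.5) -> bool:
    """Stand-in decision rule: blocks separated by a small horizontal gap."""
    return b[0] - a[2] <= max_gap


boxes = [(0, 0, 1, 2), (1.2, 0, 2, 2), (4, 0, 5, 2)]
combined = merge_adjacent(boxes, small_gap)
```

Because the comparison is always made against the most recently merged block, a character that was split into more than two pieces can still be reassembled in a single pass.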

In practical application, when the recognition module 506 recognizes the sequence of the segmentation blocks, it may first recognize each segmentation block to obtain corresponding characters, and obtain a character string corresponding to the character handwriting according to the characters.

In order to further improve the accuracy of the finally obtained character string, the recognition module 506 may further consider together the scores of the characters obtained by recognizing the segmentation blocks, so as to select an optimal candidate path and obtain the character string corresponding to the character handwriting according to the characters on the optimal candidate path. Accordingly, a specific structure of the recognition module is shown in fig. 9.

In this embodiment, the recognition module includes the following units:

a character recognition unit 561, configured to recognize the segmentation block in the segmentation block sequence, so as to obtain each character corresponding to the segmentation block;

a candidate path determining unit 562 for determining each candidate path;

a path score calculation unit 563 for calculating a score of each candidate path;

the selecting unit 564 is configured to select an optimal candidate path, for example, the candidate paths may be ranked from high to low according to the score, and one or more candidate paths ranked in the top are selected as the optimal candidate path;

an output unit 565, configured to obtain, according to each character on the optimal candidate path, a character string corresponding to the character handwriting.

The path score calculation unit 563 may add the scores of the characters on a candidate path and use the sum as the score of that candidate path. Further, the sum may be normalized to eliminate the imbalance between the scores of long paths and short paths. Accordingly, one specific structure of the path score calculation unit 563 may include a character score calculating unit and a normalization unit: the character score calculating unit is used for calculating the scores of the characters on the candidate path and adding them to obtain a total score; the normalization unit is used for dividing the total score by the length of the candidate path to obtain the score of the candidate path.
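The normalization performed by this unit can be written compactly. The notation below is introduced here only for illustration: P denotes a candidate path, |P| its length in characters, c_i the i-th character on the path, and s(c_i) the score of that character.

```latex
S(P) = \frac{1}{|P|} \sum_{i=1}^{|P|} s(c_i)
```

Assuming non-negative character scores, omitting the factor 1/|P| would let a longer path outscore a shorter one merely because its sum contains more terms.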

In this embodiment of the present invention, the character score calculating unit may calculate the score of each character on the candidate path according to any one or more of the following items: the classifier score, the language model score and the lexicon score of the character.
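One possible way of combining the three score components is a weighted sum, sketched below. The embodiment does not specify how the components are combined, so the weighted sum, the weight values, and the names `CharScores` and `combined_score` are all assumptions made for illustration; setting a weight to zero corresponds to leaving that component out.

```python
from dataclasses import dataclass


@dataclass
class CharScores:
    """Score components of one character (hypothetical container)."""
    classifier: float
    language_model: float = 0.0
    lexicon: float = 0.0


def combined_score(scores: CharScores,
                   w_classifier: float = 1.0,
                   w_language_model: float = 0.0,
                   w_lexicon: float = 0.0) -> float:
    """Weighted sum of the components; a zero weight disables a component."""
    return (w_classifier * scores.classifier
            + w_language_model * scores.language_model
            + w_lexicon * scores.lexicon)


s = CharScores(classifier=0.8, language_model=0.5, lexicon=1.0)
classifier_only = combined_score(s)
all_three = combined_score(s, w_classifier=1.0, w_language_model=0.5, w_lexicon=0.25)
```

In practice the weights would have to be tuned on held-out data for each language.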

The handwriting recognition device provided by the embodiment of the present invention segments character handwriting with the segmentation model corresponding to the writing characteristics of its language category to obtain a segmentation block sequence, and obtains the character recognition result by recognizing the segmentation block sequence. With the scheme of the present invention, relatively high recognition accuracy can be achieved even when relatively few data resources are available.

Fig. 10 is a block diagram illustrating an apparatus 800 for a method of handwriting recognition, according to an example embodiment. For example, the apparatus 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, and the like.

Referring to fig. 10, the apparatus 800 may include one or more of the following components: processing component 802, memory 804, power component 806, multimedia component 808, audio component 810, input/output (I/O) interface 812, sensor component 814, and communication component 816.

The processing component 802 generally controls overall operation of the device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interaction between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.

The memory 804 is configured to store various types of data to support operation at the device 800. Examples of such data include instructions for any application or method operating on device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.

Power component 806 provides power to the various components of device 800. The power components 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device 800.

The multimedia component 808 includes a screen that provides an output interface between the device 800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front facing camera and/or a rear facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the device 800 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.

The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the apparatus 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 also includes a speaker for outputting audio signals.

The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.

The sensor component 814 includes one or more sensors for providing various aspects of state assessment for the device 800. For example, the sensor component 814 may detect the open/closed state of the device 800 and the relative positioning of components, such as the display and keypad of the device 800. The sensor component 814 may also detect a change in position of the device 800 or of a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and a change in temperature of the device 800. The sensor component 814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor component 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

The communication component 816 is configured to facilitate communications between the apparatus 800 and other devices in a wired or wireless manner. The device 800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast associated information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communications component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

In an exemplary embodiment, the apparatus 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.

In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium comprising instructions, such as the memory 804 comprising instructions, which are executable by the processor 820 of the device 800 to perform the handwriting recognition method described above. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

Embodiments of the present invention further provide a non-transitory computer-readable storage medium, where instructions in the storage medium, when executed by a processor of a mobile terminal, enable the mobile terminal to perform some or all of the steps in the foregoing method embodiments of the present invention.

Fig. 11 is a schematic structural diagram of a server in an embodiment of the present invention. The server 1900, whose configuration and performance may vary widely, may include one or more Central Processing Units (CPUs) 1922 (e.g., one or more processors), memory 1932, and one or more storage media 1930 (e.g., one or more mass storage devices) that store applications 1942 or data 1944. The memory 1932 and the storage medium 1930 may be transient or persistent storage. The program stored in the storage medium 1930 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Still further, the central processor 1922 may be configured to communicate with the storage medium 1930 and to execute, on the server 1900, the series of instruction operations in the storage medium 1930.

The server 1900 may also include one or more power supplies 1926, one or more wired or wireless network interfaces 1950, one or more input-output interfaces 1958, one or more keyboards 1956, and/or one or more operating systems 1941, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.

Embodiments of the present invention further provide a non-transitory computer-readable storage medium, where when instructions in the storage medium are executed by a processor of a device, the device is enabled to execute the handwriting recognition method described above.

Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.

It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is only limited by the appended claims.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims

1. A handwriting recognition method, comprising:

acquiring character handwriting;

2. The method of claim 1, further comprising: before the segmentation block sequence is recognized, judging whether the language category corresponding to the character handwriting is a language that is prone to over-segmentation; if so, combining the segmentation blocks in the segmentation block sequence; otherwise, executing the step of recognizing the segmentation block sequence.

3. The method of claim 2, wherein the combining the segmentation blocks in the segmentation block sequence comprises:

extracting the geometric features of each segmentation block;

if so, merging the adjacent segmentation blocks.

4. The method of claim 3, further comprising: pre-constructing the multivariate geometric model by:

5. The method according to any one of claims 1 to 4, wherein the recognizing the segmentation block sequence to obtain a character string corresponding to the character handwriting comprises:

determining each candidate path;

calculating the score of each candidate path;

selecting an optimal candidate path;

6. The method of claim 5, wherein the calculating the score for each candidate path comprises:

7. The method of claim 6, wherein the calculating the score for each character on the candidate path comprises:

8. A handwriting recognition apparatus, said apparatus comprising:

the receiving module is used for acquiring character handwriting;

9. A computer device, comprising: one or more processors, memory;

the memory is for storing computer-executable instructions, and the processor is for executing the computer-executable instructions to implement the method of any one of claims 1 to 7.

10. A readable storage medium having stored thereon instructions that are executed to implement the method of any one of claims 1 to 7.