Summary of the invention
In view of this, embodiments of the present invention provide a character recognition method and device, in order to solve the problem of poor robustness in prior-art character recognition.
The invention provides a character recognition method, the method comprising:
performing binarization on a picture containing the character information to be recognized, identifying the boundary of the character in the binarized picture, cropping the character region to be recognized from the picture according to the determined boundary, and binarizing and normalizing the cropped character region;
identifying transitions of pixel value in the normalized character region to be recognized and, according to the position of the white pixel at each transition, assigning 255 to the corresponding pixel position in a character edge information map and assigning another value to the positions of all other pixels, wherein the character edge information map is the same size as the normalized character region;
examining the pixel value of each pixel in the character edge information map and, when the pixel value of a pixel is 255, calculating the gradient value of that pixel and determining the direction to which it belongs, assigning that direction value to the corresponding position in an edge gradient array, and assigning -1 to all other positions of the array;
matching the edge gradient array against each saved template of each character, determining the matching distances, and taking the character corresponding to the minimum matching distance as the recognition result.
The invention provides a character recognition device, the device comprising:
a normalization module, configured to perform binarization on a picture containing the character information to be recognized, identify the boundary of the character to be recognized in the binarized picture, crop the character region to be recognized from the picture according to the determined boundary, and binarize and normalize the cropped character region;
an edge information determination module, configured to identify transitions of pixel value in the normalized character region to be recognized and, according to the position of the white pixel at each transition, assign 255 to the corresponding pixel position in a character edge information map and assign another value to the positions of all other pixels, wherein the character edge information map is the same size as the normalized character region;
a gradient direction determination module, configured to examine the pixel value of each pixel in the character edge information map and, when the pixel value of a pixel is 255, calculate the gradient value of that pixel and determine the direction to which it belongs, assign that direction value to the corresponding position in an edge gradient array, and assign -1 to all other positions of the array;
a matching and recognition module, configured to match the edge gradient array against each saved template, determine the matching distances, and take the character corresponding to the minimum matching distance as the recognition result.
The invention provides a character recognition method and device. Both when recognizing a character and when creating templates, the method performs normalization, determines the character edge information map, and determines the value at each position of the edge gradient array. Once the positions of the edge gradient array have been assigned, the edge gradient array of the character to be recognized is matched against each template of each character, the matching distances are determined, and the character is recognized according to the matching distance. Because the invention uses the gradient direction of pixels in the character as the corresponding values in the edge gradient array, and gradient direction is resistant to interference, the character recognition method is comparatively robust. In addition, because recognition matches against each of the templates of each character and takes the character corresponding to the minimum matching distance as the recognition result, the method avoids the poor robustness of single-template character matching and widens the range of application of the matching method.
Detailed description of the embodiments
In order to improve the efficiency and accuracy of character recognition, embodiments of the present invention provide a character recognition method and device.
The present invention is described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of the character recognition process provided by the invention. The process comprises the following steps:
S101: Perform binarization on the picture containing the character information to be recognized, identify the boundary of the character to be recognized in the binarized picture, crop the character region to be recognized from the picture according to the determined boundary, and binarize and normalize the cropped character region.
S102: In the normalized character region to be recognized, identify transitions of pixel value; according to the position of the white pixel at each transition, assign 255 to the corresponding pixel position in the character edge information map and assign another value to the positions of all other pixels, wherein the character edge information map is the same size as the normalized character region.
S103: Examine the pixel value of each pixel in the character edge information map; when the pixel value of a pixel is 255, calculate the gradient value of that pixel and determine the direction to which it belongs, assign that direction value to the corresponding position in the edge gradient array, and assign -1 to all other positions of the array.
S104: Match the edge gradient array against each saved template of each character, determine the matching distances, and take the character corresponding to the minimum matching distance as the recognition result.
Before character recognition, multiple templates are created for each character. Creating each template likewise requires normalization, determination of the character edge information map, and determination of the value at each position of the edge gradient array; the template creation process uses the same steps as the character recognition process. Once the positions of the edge gradient array corresponding to a template have been assigned, the same method is used to assign each position of the edge gradient array of the character to be recognized. The edge gradient array of the character to be recognized is then matched against the templates, the matching distances are determined, and the character is recognized according to the matching distance.
Because the invention uses the gradient direction of pixels in the character as the corresponding values in the edge gradient array, and gradient direction is resistant to interference, the character recognition method is comparatively robust. In addition, because recognition matches against each of the templates of each character and takes the character corresponding to the minimum matching distance as the recognition result, the method avoids the poor robustness of single-template character matching and widens the range of application of the matching method.
The character recognition process of the invention is described in detail below through specific embodiments.
To improve both the accuracy and the efficiency of character recognition, multiple templates are saved for each character; each template should be representative, and the templates of a character should differ substantially from one another. When a template of a character is created and saved, the character region is normalized and the features of the normalized character region are extracted.
Fig. 2 shows the normalization process used in character template creation provided by the invention. The process comprises the following steps:
S201: Perform binarization on the sample picture containing character information.
A sample picture containing character information is generally a color picture. Before it is binarized, the color picture is first converted to a grayscale picture, and a binarization algorithm is then applied to the grayscale picture. In the present invention, the Otsu binarization algorithm may be used to binarize the grayscale picture.
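Step S201 can be sketched as follows. This is a minimal illustration in plain Python, not the implementation of the invention: the function names, the list-of-lists image representation, and the luminance weights used for the grayscale conversion are assumptions, since the text names only the Otsu algorithm. White pixels are encoded as 1 and black pixels as 0.

```python
def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to 8-bit gray values.

    The weights are the common ITU-R BT.601 luminance weights; the text
    does not say which conversion is used, so this is an assumption.
    """
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
            for row in rgb]


def otsu_threshold(gray):
    """Return the threshold that maximizes the between-class variance."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(v * n for v, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of the values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray):
    """Binarize a gray image: 1 (white) above the Otsu threshold, else 0."""
    t = otsu_threshold(gray)
    return [[1 if v > t else 0 for v in row] for row in gray]
```

The sketch assumes a light character on a dark background, so that character pixels come out white; for dark characters on a light background the comparison would have to be inverted.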
S202: Starting from the four borders of the binarized picture, search toward the interior of the picture; when a white pixel is found in the picture, determine that this white pixel lies on the boundary of the character; determine the boundary of the character according to the positions of the white pixels found from each border direction.
In the present invention, to detect the region occupied by the character in the binarized picture, scanning starts from each of the four borders of the binarized picture, that is, from the top, bottom, left and right, and proceeds toward the interior. Specifically, when a white pixel is first encountered scanning from the top or from the bottom, the row containing that white pixel is taken as the row of the upper or lower boundary of the character; when a white pixel is first encountered scanning from the left or from the right, the column containing that white pixel is taken as the column of the left or right boundary of the character.
S203: According to the determined character boundary, crop the character region from the sample picture containing character information.
Once the rows and columns forming the character boundary have been determined, the picture of the character region can be cropped from the sample picture containing character information.
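The boundary search of S202 and the cropping of S203 can be sketched as below. The function names and the inclusive (top, bottom, left, right) tuple are illustrative assumptions; the binary image is assumed to be a list of rows with 1 for white and 0 for black.

```python
def find_char_bounds(binary):
    """Scan inward from the four borders and return the inclusive bounds
    (top, bottom, left, right) of the white pixels, or None if the
    picture contains no white pixel."""
    h, w = len(binary), len(binary[0])

    def row_has_white(i):
        return any(binary[i][j] == 1 for j in range(w))

    def col_has_white(j):
        return any(binary[i][j] == 1 for i in range(h))

    # First row containing a white pixel when scanning from the top.
    top = next((i for i in range(h) if row_has_white(i)), None)
    if top is None:
        return None
    # The same search from the bottom, the left and the right.
    bottom = next(i for i in range(h - 1, -1, -1) if row_has_white(i))
    left = next(j for j in range(w) if col_has_white(j))
    right = next(j for j in range(w - 1, -1, -1) if col_has_white(j))
    return top, bottom, left, right


def crop(image, bounds):
    """Cut the region given by inclusive bounds out of any row-major image."""
    top, bottom, left, right = bounds
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```

Because `crop` only slices rows, the same bounds can be applied to the original color picture, which is what S203 describes.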
S204: Perform binarization on the cropped character region, and normalize the binarized picture to the set size.
The character region cropped from the color picture is itself still a color picture. It is converted to a grayscale picture, and the Otsu binarization algorithm is applied to the grayscale picture. The binarized picture is then normalized to the set size, which in the present invention may be, for example, 24 wide and 48 high; the converted grayscale picture is normalized to the same set size. This yields a normalized binary image and a normalized grayscale image.
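The size normalization can be illustrated with a short sketch. The text gives only the target size (for example 24 wide and 48 high) and does not specify how pixels are resampled, so the nearest-neighbor sampling used here, like the function name, is an assumption.

```python
def resize_nearest(image, width=24, height=48):
    """Resize a row-major image to width x height by nearest-neighbor
    sampling. Works for both the binary image and the grayscale image."""
    src_h, src_w = len(image), len(image[0])
    return [[image[i * src_h // height][j * src_w // width]
             for j in range(width)]
            for i in range(height)]
```

Nearest-neighbor sampling keeps a binary image strictly binary, which is convenient for the transition detection that follows; a smoother interpolation could equally be used for the grayscale image.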
Fig. 3 shows the feature extraction process used in character template creation provided by the invention. The process comprises the following steps:
S301: In the normalized character region, identify transitions of pixel value.
The normalized character region is a binary character image of width W and height H. Transitions of pixel value are identified in this binary character image, that is, cases where the pixel values of two adjacent pixels change from 1 to 0 or from 0 to 1.
S302: According to the position of the white pixel at each transition of pixel value, assign 255 to the pixel at the corresponding position in the character edge information map, and assign another value otherwise, wherein the character edge information map is the same size as the normalized character region.
The character edge information map is the same size as the normalized character region, that is, the same size as the binary character image: the number of rows, the number of columns, and the number of pixels are all equal. To determine the pixel value of each pixel in the character edge information map, the transitions of pixel value in the binary character image are identified; for the white pixel at each transition, the pixel at the corresponding position in the character edge information map is assigned 255, and the positions of all other pixels in the character edge information map are assigned another value, for example 0.
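Steps S301 and S302 can be sketched as follows. The text speaks of transitions between two adjacent pixels without stating the scan direction, so this sketch assumes that both horizontal and vertical neighbors are checked and that the picture border itself does not count as a transition; the function name and the `other` parameter are likewise illustrative.

```python
def edge_map(binary, other=0):
    """Build the character edge information map of a binary image.

    A white pixel (1) that has a black pixel (0) among its horizontal or
    vertical neighbors sits on a 0/1 transition and is marked 255; every
    other position receives `other`.
    """
    h, w = len(binary), len(binary[0])
    edges = [[other] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            if binary[i][j] != 1:
                continue
            for ni, nj in ((i, j - 1), (i, j + 1), (i - 1, j), (i + 1, j)):
                if 0 <= ni < h and 0 <= nj < w and binary[ni][nj] == 0:
                    edges[i][j] = 255
                    break
    return edges
```

The result has the same number of rows and columns as the input, as the text requires.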
S303: Examine the pixel value of each pixel in the character edge information map; when the pixel value of a pixel is 255, calculate the gradient value of that pixel and determine the direction to which it belongs.
S304: In the template, assign the direction value of this pixel's gradient direction angle to the position corresponding to this pixel, and assign -1 to all other positions.
In the present invention, a template is created whose height and width equal those of the character edge information map; the template can also be regarded as a two-dimensional array with the same height and width as the character edge information map. To assign each position in the template, the character edge information map is scanned. When the scanned pixel has the other value, for example 0, the position in the template corresponding to this pixel is assigned -1. When the scanned pixel has the value 255, that is, when a white point is scanned, the gradient value of this pixel is calculated at the corresponding position in the normalized grayscale image according to the following formula:
Gradient=dy/dx
where dy = g(i, j+1) - g(i, j-1), dx = g(i+1, j) - g(i-1, j), g(i, j) is the gray value at the position corresponding to this pixel in the normalized grayscale image, i denotes the row of the pixel, j denotes the column of the pixel, and Gradient is the calculated gradient value of the pixel.
The angular range from 0 to 360 degrees is divided into 8 equal parts, each part corresponding to one gradient direction, and the parts are labeled 1 to 8. From the calculated gradient value of the pixel, the gradient direction angle of the pixel is calculated, and from this angle the direction to which the gradient direction angle belongs is determined.
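The gradient computation and the assignment of the template array can be sketched as below. Several details are assumptions rather than statements of the text: the angle is obtained with `atan2(dy, dx)`, because the ratio dy/dx alone is undefined for dx = 0 and cannot distinguish opposite directions over a 0 to 360 degree range; sector k (1 to 8) is taken to cover the angles from 45(k-1) up to, but not including, 45k degrees; neighbors outside the image are clamped to the nearest pixel; and a pixel with dy = dx = 0 falls into direction 1. The indexing g(i, j) follows the formula literally.

```python
import math


def direction_of(gray, i, j):
    """Return the direction label (1..8) of the gradient at position (i, j)."""
    h, w = len(gray), len(gray[0])

    def g(r, c):
        # Clamp out-of-range neighbors to the image (assumption).
        return gray[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]

    dy = g(i, j + 1) - g(i, j - 1)
    dx = g(i + 1, j) - g(i - 1, j)
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    # Eight equal sectors of 45 degrees, labeled 1 to 8.
    return min(int(angle // 45.0) + 1, 8)


def edge_gradient_array(edges, gray):
    """Direction label at every edge pixel (value 255), -1 everywhere else."""
    h, w = len(edges), len(edges[0])
    return [[direction_of(gray, i, j) if edges[i][j] == 255 else -1
             for j in range(w)]
            for i in range(h)]
```

The same function serves for template creation and for the character to be recognized, since the text states that both follow identical steps.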
After multiple templates have been created for each character, each template is saved at the position in the template library corresponding to its character, so that the template library holds multiple templates for each character.
When a character is to be recognized, once the picture containing the character to be recognized has been obtained, the picture is converted to a grayscale image as in the template creation process described above, and a binarization algorithm is applied to the converted picture. This binarization algorithm is the same as the one used in template creation.
In the binarized picture, scanning starts from each of the four sides of the picture toward the interior, and the position of the first white pixel in each direction is identified. The boundary of the character to be recognized is determined from the positions of the white pixels identified in each direction. According to the determined character boundary, the character region to be recognized is cropped from the color picture of the character to be recognized.
The cropped character region to be recognized is converted to a grayscale image, and a binarization algorithm is applied to this grayscale image. Both the grayscale image and the binarized character region are then normalized to the set size. The set size is the same as that used in template creation, for example 24 wide and 48 high, and the binarization algorithm used here is likewise the same as that used in the normalization process of template creation.
After the binarized character region to be recognized has been normalized, transitions of pixel value in the normalized region are identified. When the pixel value changes from 0 to 1 or from 1 to 0, the position of the corresponding pixel in the character edge information map is assigned 255 according to the position of the white pixel at the transition, and the positions of the remaining pixels are assigned 0.
The pixel value of each pixel in the assigned character edge information map is then examined. When a pixel with value 255 is found, its gradient value is calculated from the gray values of its neighboring pixels in the normalized grayscale image, and the gradient direction of the pixel is determined from the calculated gradient value.
According to the determined gradient direction of the pixel, the direction to which it belongs is determined among the 8 equal directions dividing the range from 0 to 360 degrees. That direction is used as the value at the position corresponding to the pixel in the edge gradient array of the character to be recognized, and all other positions of the array are assigned -1.
The edge gradient array is matched against each saved template of each character, and the matching distance to each template is determined. Specifically, the matching distance is determined according to the following formula:
where C(i, j) is the value in row i, column j of the edge gradient array of the character to be recognized, t(i, j) is the value in row i, column j of the template, H is the height of the normalized template, W is the width of the normalized template, and the value of S is determined from the number of positions not equal to -1 in the edge gradient array of the character to be recognized and the number of positions not equal to -1 in the template.
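The selection of the recognition result can be sketched independently of the distance formula, which is not reproduced in this text. In the sketch below the distance is therefore a caller-supplied function; `placeholder_distance`, which simply counts differing positions, is a hypothetical stand-in for demonstration only and is not the formula of the invention. The dictionary layout of the template library is also an assumption.

```python
def recognize(edge_array, template_library, distance):
    """Return (character, distance) for the template closest to edge_array.

    template_library maps each character to a list of template arrays;
    distance(c, t) is the matching distance between the edge gradient
    array c and a template t.
    """
    best_char, best_dist = None, None
    for char, templates in template_library.items():
        for template in templates:
            d = distance(edge_array, template)
            if best_dist is None or d < best_dist:
                best_char, best_dist = char, d
    return best_char, best_dist


def placeholder_distance(c, t):
    """Hypothetical stand-in: number of positions whose values differ."""
    return sum(1
               for row_c, row_t in zip(c, t)
               for a, b in zip(row_c, row_t)
               if a != b)


# Usage with a toy library holding two templates for "A" and one for "B".
library = {
    "A": [[[1, -1], [-1, 3]], [[1, -1], [-1, 2]]],
    "B": [[[5, 5], [5, 5]]],
}
result = recognize([[1, -1], [-1, 2]], library, placeholder_distance)
```

Keeping several templates per character means the minimum is taken over all templates of all characters, which is the behavior the text credits for avoiding the weakness of single-template matching.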
Fig. 4 is a schematic structural diagram of the character recognition device provided by the invention. The device comprises:
a normalization module 41, configured to perform binarization on a picture containing the character information to be recognized, identify the boundary of the character to be recognized in the binarized picture, crop the character region to be recognized from the picture according to the determined boundary, and binarize and normalize the cropped character region;
an edge information determination module 42, configured to identify transitions of pixel value in the normalized character region to be recognized and, according to the position of the white pixel at each transition, assign 255 to the corresponding pixel position in the character edge information map and assign another value to the positions of all other pixels, wherein the character edge information map is the same size as the normalized character region;
a gradient direction determination module 43, configured to examine the pixel value of each pixel in the character edge information map and, when the pixel value of a pixel is 255, calculate the gradient value of that pixel and determine the direction to which it belongs, assign that direction value to the corresponding position in the edge gradient array, and assign -1 to all other positions of the array;
a matching and recognition module 44, configured to match the edge gradient array against each saved template of each character, determine the matching distances, and take the character corresponding to the minimum matching distance as the recognition result.
The normalization module 41 is further configured, when a template is created, to perform binarization on the sample picture containing character information, identify the boundary of the character in the binarized picture, crop the character region from the sample picture containing character information according to the determined character boundary, and binarize and normalize the cropped character region;
The edge information determination module 42 is further configured, when a template is created, to identify transitions of pixel value in the normalized character region and, according to the position of the white pixel at each transition, assign 255 to the corresponding pixel position in the character edge information map and assign another value to the positions of all other pixels, wherein the character edge information map is the same size as the normalized character region;
The gradient direction determination module 43 is further configured, when a template is created, to examine the pixel value of each pixel in the character edge information map and, when the pixel value of a pixel is 255, calculate the gradient value of that pixel and determine the direction to which it belongs, assign the direction value to the position in the template corresponding to this pixel, and assign -1 to all other positions.
The normalization module 41 is specifically configured to search from the four borders of the binarized picture toward the interior of the picture; when a white pixel is found in the picture, to determine that this white pixel lies on the boundary of the character; and to determine the boundary of the character according to the positions of the white pixels found from each border direction.
The normalization module 41 is further configured to convert the cropped character region to a grayscale image and normalize it;
The gradient direction determination module 43 is configured to calculate the gradient value of the pixel, at the position corresponding to the pixel in the normalized grayscale image, according to the following formula:
Gradient=dy/dx
where dy = g(i, j+1) - g(i, j-1), dx = g(i+1, j) - g(i-1, j), g(i, j) is the gray value at the position corresponding to this pixel in the normalized grayscale image, i denotes the row of the pixel, j denotes the column of the pixel, and Gradient is the calculated gradient value of the pixel;
to calculate the gradient direction angle from the calculated gradient value;
and to determine the direction to which the gradient direction angle belongs according to this angle and the 8 directions into which the range from 0 to 360 degrees is divided.
The matching and recognition module 44 is specifically configured to determine the matching distance according to the following formula:
where C(i, j) is the value in row i, column j of the edge gradient array of the character to be recognized, t(i, j) is the value in row i, column j of the template, H is the height of the normalized template, W is the width of the normalized template, and the value of S is determined from the number of positions not equal to -1 in the edge gradient array of the character to be recognized and the number of positions not equal to -1 in the template.
The invention provides a character recognition method and device. Both when recognizing a character and when creating templates, the method performs normalization, determines the character edge information map, and determines the value at each position of the edge gradient array. Once the positions of the edge gradient array have been assigned, the edge gradient array of the character to be recognized is matched against each template of each character, the matching distances are determined, and the character is recognized according to the matching distance. Because the invention uses the gradient direction of pixels in the character as the corresponding values in the edge gradient array, and gradient direction is resistant to interference, the character recognition method is comparatively robust. In addition, because recognition matches against each of the templates of each character and takes the character corresponding to the minimum matching distance as the recognition result, the method avoids the poor robustness of single-template character matching and widens the range of application of the matching method.
The above description shows and describes a preferred embodiment of the present invention. As stated above, however, it should be understood that the invention is not limited to the form disclosed herein, and that this description should not be regarded as excluding other embodiments; the invention can be used in various other combinations, modifications and environments, and can be modified within the scope of the inventive concept described herein through the above teachings or the skill or knowledge of the related art. Modifications and variations made by those skilled in the art that do not depart from the spirit and scope of the invention shall all fall within the protection scope of the appended claims.