CN106127198A - Image character recognition method based on multi-classifier integration - Google Patents

Image character recognition method based on multi-classifier integration

Info

Publication number
CN106127198A
Authority
CN
China
Prior art keywords
image
word
character
gray level
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610442435.0A
Other languages
Chinese (zh)
Inventor
潘家辉
黄绍峰
罗笑玲
欧阳天优
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China Normal University
Original Assignee
South China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Normal University
Priority to CN201610442435.0A
Publication of CN106127198A
Legal status: Pending (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Character Discrimination (AREA)
  • Character Input (AREA)

Abstract

The present invention provides an image character recognition method based on multi-classifier integration. A color image to be recognized is converted into a grayscale image; the grayscale image is binarized and the image regions containing text information are segmented out; each Chinese character is segmented from the block of text; the grid features and direction features of each character are extracted; a minimum distance classifier with the stroke-density total-length feature performs the first-layer coarse classification; and a nearest neighbor classifier with peripheral features, grid features and direction features combined performs the second-layer classification and matching. The invention has the advantages that the character recognition is strongly resistant to interference, describes the local structure of characters well, and is little affected by stroke width; the complementary combination of a minimum distance classifier and a nearest neighbor classifier makes the system more reliable; and characters are recognized intelligently, improving the adaptability of the system and giving a high recognition rate.

Description

Image character recognition method based on multi-classifier integration
Technical field
The present invention relates to the field of image character recognition, and more particularly to an image character recognition method based on multi-classifier integration.
Background art
Society has entered the information age. With the expansion, deepening and socialization of practical activities, people need to recognize information of many kinds whose form and content are complex. They no longer rely only on their own ears and eyes to obtain this information directly, but use computers to input text automatically. As the level of science and technology keeps improving, all kinds of objects of study are being "imaged" and "digitized", and image-based multimedia information has rapidly become an important medium of information transmission. The text information in images carries rich high-level semantic information, and extracting this text is very helpful for high-level semantic understanding, indexing and retrieval of images.
Research on character image recognition technology currently faces several problems. First, the amount of image data is large: generally speaking, to obtain higher recognition accuracy the original image should have a relatively high resolution, at least greater than 64 × 64. Second, images become degraded: interference from the target environment, transmission errors, sensor errors, noise, ambient interference, deformation and so on can all degrade the image. Third, accuracy: just as with human vision, displacement, rotation, scale change and distortion occur between the target and the sensor, so the system must still be able to recognize the target correctly when the target is displaced, rotated, rescaled or distorted. Fourth, real-time performance: in military applications the system is usually required to recognize targets in real time, which demands extremely fast processing speed and recognition efficiency.
Given the current state of character recognition systems, improving the recognition rate of printed characters is still a research focus, and recognizing text in real-world scenes will be a direction in which character recognition systems develop. In addition, building character recognition systems with automatic layout analysis, strong fault tolerance, a high recognition rate, self-learning and self-correction of errors, and easy extensibility is the research goal of automated character recognition. Research on image character recognition technology is therefore particularly important.
Summary of the invention
The present invention aims to overcome at least one of the defects of the prior art described above by providing an automated image character recognition method based on multi-classifier integration with a high recognition rate.
To solve the above technical problem, the technical scheme of the present invention is as follows:
An image character recognition method based on multi-classifier integration, the method comprising the following steps:
S1: convert the color image to be recognized into a grayscale image; if the image to be recognized is already a grayscale image, omit this step;
S2: binarize the resulting grayscale image, and segment out the image regions containing text information;
S3: segment each Chinese character from the block of text in the image;
S4: extract the grid features and direction features of each Chinese character;
S5: use a minimum distance classifier with the stroke-density total-length feature to perform a first-layer coarse classification;
S6: use a nearest neighbor classifier with peripheral features, grid features and direction features combined to perform the second-layer classification and matching.
In a preferred scheme, in step S1, when the color image to be recognized is converted to a grayscale image, a weighted-average method is used for the grayscale conversion, i.e. the R, G and B values are averaged with weights: R = G = B = a*R + b*G + c*B, where R, G and B denote the red, green and blue components respectively, a, b and c are respectively the weights of R, G and B, and b > a > c.
In image character recognition the input image to be recognized is usually a color RGB image, which contains a large amount of color information, and processing it directly would reduce the execution speed of the system. In addition, an RGB image contains much color information that is irrelevant to character recognition and unhelpful for locating the text, whereas a grayscale image contains only luminance information and no color information, which benefits the subsequent processing of the image, improves the running speed, and helps the next step of text localization. Since the human eye is most sensitive to green, less sensitive to red and least sensitive to blue, a grayscale image that is comparatively easy to recognize is obtained under the condition b > a > c.
In a preferred scheme, in step S2, the OTSU algorithm (Otsu's method, also called the maximum between-class variance method) is used to binarize the grayscale image.
Binarizing an image means setting the gray value of every pixel to 0 or 255: each pixel whose gray value is greater than or equal to the threshold is judged to belong to the object of interest and is given the gray value 255; otherwise the gray value is set to 0, representing other object regions or the background. The processed image shows a clear black-and-white effect. Binarization takes a grayscale image with 256 gray levels and, once a suitable threshold has been chosen, divides the pixel gray levels into two levels. After binarization the character of the image depends only on the positions of the pixels whose gray value is 0 or 255 and no longer involves pixels of other gray levels, which simplifies further processing and reduces the amount of data to process, while the resulting binary image can still reflect the overall and local features of the image. To obtain a good binary image the choice of threshold is crucial: a suitable threshold not only removes noise effectively but also separates the image clearly into target regions and background, greatly reducing the amount of information and increasing the processing speed.
In a preferred scheme, in step S3, a character segmentation method is used to identify the individual characters in the image region, i.e. the peaks and valleys formed by the vertical projection, along the horizontal direction of the image, of the blank space between characters are used to segment out the individual characters.
In a preferred scheme, in step S3, to improve accuracy a regression-based character segmentation method is used to identify individual characters, i.e. since Chinese characters are square-shaped and roughly uniform in size, the character height obtained during line segmentation is used to estimate the character width and thereby predict the position of the next character.
In a preferred scheme, in step S4, the concrete method of extracting the grid features of a character is as follows:
1) Divide the character grid into 8 × 8 parts;
2) Count the number of black pixels in each part, denoted P11, P12, ..., P18, P21, ..., P88;
3) Compute the total number of black pixels of the character, P = P11 + P12 + ... + P18 + P21 + ... + P88;
4) Compute the percentage of the character's total black pixels accounted for by the black pixels of each part, Pij = Pij × 100 / P, where i and j are integers greater than or equal to 1 and less than or equal to 8; the feature vector (P11, P12, ..., P18, P21, ..., P88) is the grid feature of the character.
In a preferred scheme, in step S4, the concrete method of extracting the direction features of a character is as follows:
Binarize and normalize the character grid image and extract the contour information; assign to each point on the contour one or two direction attributes, the directions being horizontal, vertical and ±45°, four angles in total; divide the character grid into n × n cells; count the number of each of the 4 direction attributes contained in each cell, forming a 4-dimensional vector; and combine the features of all cells to form a feature vector of 4 × n × n dimensions, which is the direction feature.
In a preferred scheme, in step S5, the concrete method of building the minimum distance classifier is as follows:
1) Extract from the samples the stroke-density total-length feature vector of each character as the coarse classification feature. 2) Compute the features corresponding to the samples of each class; for each class, every dimension has a set of feature values from which a mean, i.e. the feature center, can be computed. 3) To eliminate the influence of different feature dimensions (units), normalize each feature dimension, or scale it to an interval such as (-1, 1), so that it becomes dimensionless. 4) Use the chosen distance criterion to decide the class of the samples to be classified.
In a preferred scheme, in step S6, the concrete method of building the nearest neighbor classifier is as follows:
1) Initialize the distance to the maximum value;
2) Compute the distance dist between the unknown sample and each training sample;
3) Find the maximum distance maxdist among the current K nearest samples;
4) If dist is less than maxdist, take this training sample as one of the K nearest samples;
5) Repeat steps 2, 3 and 4 until the distances between the unknown sample and all training samples have been computed;
6) Count the number of occurrences of each class label among the K nearest samples;
7) Select the class label with the highest frequency of occurrence as the class label of the unknown sample.
Compared with the prior art, the technical scheme of the present invention has the following benefits. The present invention provides an image character recognition method based on multi-classifier integration: the color image to be recognized is converted into a grayscale image; the grayscale image is binarized and the image regions containing text information are segmented out; each Chinese character is segmented from the block of text; the grid features and direction features of each character are extracted; a minimum distance classifier with the stroke-density total-length feature performs the first-layer coarse classification; and a nearest neighbor classifier with peripheral features, grid features and direction features combined performs the second-layer classification and matching. For feature extraction, combining grid features and direction features gives the character recognition both strong resistance to interference and a strong ability to describe the local structure of characters, while it is little affected by stroke width. Applying artificial-intelligence learning techniques in image character recognition improves the adaptability of the system and gives a high recognition rate. In the classifier design, the complementary combination of a minimum distance classifier and a nearest neighbor classifier makes the system more reliable.
Brief description of the drawings
Fig. 1 is the flow chart of the image character recognition method based on multi-classifier integration.
Fig. 2 is a schematic diagram of grayscale conversion and binarization.
Fig. 3 is a schematic diagram of the regression-based character segmentation method.
Fig. 4 is a schematic diagram of extracting the grid features of a character.
Fig. 5 is a schematic diagram of extracting the direction features of a character.
Fig. 6 is a schematic diagram of character recognition with multiple classifiers integrated.
Fig. 7 is a schematic diagram of segmenting a whole passage of text into single characters.
Fig. 8 is a schematic diagram of outputting the characters in the form of a text box.
Detailed description of the invention
The accompanying drawings are for illustrative purposes only and shall not be construed as limiting this patent;
In order to better illustrate the present embodiment, some parts of the drawings are omitted, enlarged or reduced, and do not represent the dimensions of the actual product;
Those skilled in the art will understand that some known structures in the drawings, and their descriptions, may be omitted.
The technical scheme of the present invention is further described below with reference to the accompanying drawings and embodiments.
Embodiment 1
As shown in Fig. 1, an image character recognition method based on multi-classifier integration comprises the following steps:
S1: convert the color image to be recognized into a grayscale image; if the image to be recognized is already a grayscale image, omit this step;
When the color image to be recognized is converted to a grayscale image, a weighted-average method is used for the grayscale conversion, i.e. the R, G and B values are averaged with weights: R = G = B = a*R + b*G + c*B, where R, G and B denote the red, green and blue components respectively, a, b and c are respectively the weights of R, G and B, and b > a > c.
In image character recognition the input image to be recognized is usually a color RGB image, which contains a large amount of color information, and processing it directly would reduce the execution speed of the system. In addition, an RGB image contains much color information that is irrelevant to character recognition and unhelpful for locating the text, whereas a grayscale image contains only luminance information and no color information, which benefits the subsequent processing of the image, improves the running speed, and helps the next step of text localization. Since the human eye is most sensitive to green, less sensitive to red and least sensitive to blue, a grayscale image that is comparatively easy to recognize is obtained under the condition b > a > c.
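For illustration only (not part of the disclosed embodiment), a minimal Python sketch of this weighted-average conversion is given below. It assumes a NumPy array in RGB channel order, and the weights a = 0.299, b = 0.587, c = 0.114 are example values chosen to satisfy b > a > c, not values fixed by the invention.

```python
import numpy as np

def to_grayscale(rgb_image, a=0.299, b=0.587, c=0.114):
    """Weighted-average grayscale conversion with weights b > a > c
    (green weighted most, blue least), as described in step S1."""
    rgb = rgb_image.astype(np.float64)
    r, g, b_ch = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    gray = a * r + b * g + c * b_ch          # R = G = B = a*R + b*G + c*B
    return np.clip(gray, 0, 255).astype(np.uint8)
```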
S2: binarize the resulting grayscale image, and segment out the image regions containing text information;
As shown in Fig. 2, in step S2 the OTSU algorithm is used to binarize the grayscale image. Binarizing an image means setting the gray value of every pixel to 0 or 255: each pixel whose gray value is greater than or equal to the threshold is judged to belong to the object of interest and is given the gray value 255; otherwise the gray value is set to 0, representing other object regions or the background. The processed image shows a clear black-and-white effect. Binarization takes a grayscale image with 256 gray levels and, once a suitable threshold has been chosen, divides the pixel gray levels into two levels. After binarization the character of the image depends only on the positions of the pixels whose gray value is 0 or 255 and no longer involves pixels of other gray levels, which simplifies further processing and reduces the amount of data to process, while the resulting binary image can still reflect the overall and local features of the image. To obtain a good binary image the choice of threshold is crucial: a suitable threshold not only removes noise effectively but also separates the image clearly into target regions and background, greatly reducing the amount of information and increasing the processing speed.
The OTSU algorithm uses the gray-level characteristics of the image to divide it into two parts, background and target. The larger the between-class variance between background and target, the larger the difference between the two parts making up the image; when part of the target is misclassified as background or part of the background is misclassified as target, the difference between the two parts decreases. Therefore the segmentation that maximizes the between-class variance minimizes the probability of misclassification.
The steps of the Otsu algorithm are as follows:
Suppose the image contains L gray levels (0, 1, ..., L-1), the number of pixels with gray value i is Ni, and the total number of pixels in the image is N = N0 + N1 + ... + N(L-1); the probability of a point having gray value i is P(i) = N(i) / N.
A threshold t divides the whole image into two classes, the dark region c1 and the bright region c2; the between-class variance σ is then a function of t: σ = a1 · a2 · (u1 - u2)^2, where aj is the ratio of the area of class cj to the total image area, a1 = P(0) + P(1) + ... + P(t), and a2 = 1 - a1.
uj is the mean of class cj: u1 = [0·P(0) + 1·P(1) + ... + t·P(t)] / a1 and u2 = [(t+1)·P(t+1) + ... + (L-1)·P(L-1)] / a2. The method selects the optimal threshold t* that maximizes the between-class variance, i.e. with Δu = u1 - u2, σb = max{ a1(t) · a2(t) · Δu^2 }.
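A minimal Python/NumPy sketch of this threshold search follows, for illustration; the exhaustive loop over all 256 candidate thresholds and the function names are assumptions of the sketch, not part of the claimed method.

```python
import numpy as np

def otsu_threshold(gray):
    """Search for the threshold t that maximizes the between-class
    variance a1(t) * a2(t) * (u1(t) - u2(t))^2."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()                          # P(i) = N(i) / N
    levels = np.arange(256, dtype=np.float64)
    best_t, best_sigma = 0, -1.0
    for t in range(256):
        a1 = p[:t + 1].sum()                       # area ratio of the dark class c1
        a2 = 1.0 - a1                              # area ratio of the bright class c2
        if a1 == 0.0 or a2 == 0.0:
            continue
        u1 = (levels[:t + 1] * p[:t + 1]).sum() / a1   # mean of c1
        u2 = (levels[t + 1:] * p[t + 1:]).sum() / a2   # mean of c2
        sigma = a1 * a2 * (u1 - u2) ** 2               # between-class variance
        if sigma > best_sigma:
            best_sigma, best_t = sigma, t
    return best_t

def binarize(gray):
    """Pixels >= threshold become 255 (object), the rest 0 (background)."""
    t = otsu_threshold(gray)
    return np.where(gray >= t, 255, 0).astype(np.uint8)
```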
S3: segment each Chinese character from the block of text in the image;
As shown in Fig. 3, in step S3 a character segmentation method is used to identify the individual characters in the image region, i.e. the peaks and valleys formed by the vertical projection, along the horizontal direction of the image, of the blank space between characters are used to segment out the individual characters. To improve accuracy, the regression-based character segmentation method is used: since Chinese characters are square-shaped and roughly uniform in size, the character height obtained during line segmentation is used to estimate the character width and thereby predict the position of the next character.
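The following Python sketch illustrates one possible form of the projection-based segmentation and the regression-style width prediction described above; it is an assumption using NumPy, and presumes a binarized text line in which the text pixels carry the value passed as `ink` (0 here, i.e. dark text on a light background).

```python
import numpy as np

def segment_characters(binary_line, ink=0):
    """Split one text line into characters using the valleys of the
    vertical (column-wise) projection of the ink pixels."""
    cols = (binary_line == ink).sum(axis=0)        # ink pixels per column
    in_char, start, boxes = False, 0, []
    for x, count in enumerate(cols):
        if count > 0 and not in_char:
            in_char, start = True, x               # entering a character
        elif count == 0 and in_char:
            in_char = False
            boxes.append((start, x))               # leaving a character
    if in_char:
        boxes.append((start, binary_line.shape[1]))
    return boxes                                   # list of (start_col, end_col)

def predict_next_box(boxes, line_height):
    """Regression-style prediction: Chinese characters are roughly square,
    so the line height estimates the character width, which gives the
    expected column span of the next character."""
    start = boxes[-1][1] if boxes else 0
    return (start, start + line_height)
```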
S4: extract the grid features and direction features of each Chinese character;
Recognizing Chinese characters from a single kind of feature makes the misclassification rate hard to reduce and the resistance to interference hard to improve, because the amount of character information exploited in that way is limited and cannot reflect the characteristics of Chinese characters comprehensively; for any single feature there inevitably exist recognition "dead angles", i.e. characters that are difficult to distinguish with that feature. From the viewpoint of pattern recognition, if the space formed by all the vectorized features of Chinese characters is denoted Ω (i = 1, 2, ...), then recognition using the information of the whole space Ω can greatly improve the resistance to interference, because the character information it provides is very rich. In practice, however, a trade-off has to be made among recognition accuracy, recognition speed (computational load) and system resources, so any practical OCR system uses only the information of some subspace; because of this incompleteness of information, the problem of recognition "dead angles" is unavoidable.
On the basis of these studies, the present invention selects the grid features and direction features of Chinese characters for recognition. These features have strong resistance to interference, describe the local structure of characters well, and are little affected by stroke width; they complement each other, so the "dead angles" of Chinese character recognition are greatly reduced and the recognition rate is improved.
As shown in Fig. 4, in step S4 the concrete method of extracting the grid features of a character is as follows:
1) Divide the character grid into m × m parts; in this embodiment it is divided into 8 × 8 parts.
2) Count the number of black pixels in each part, denoted P11, P12, ..., P18, P21, ..., P88.
3) Compute the total number of black pixels of the character, P = P11 + P12 + ... + P18 + P21 + ... + P88.
4) Compute the percentage of the character's total black pixels accounted for by the black pixels of each part, Pij = Pij × 100 / P, where i and j are integers greater than or equal to 1 and less than or equal to 8; the feature vector (P11, P12, ..., P18, P21, ..., P88) is the grid feature of the character.
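As an illustration of steps 1) to 4) above, a possible Python sketch of the 8 × 8 grid feature is shown below; the array layout and the `ink` convention (black pixels equal to 0) are assumptions for the sketch only.

```python
import numpy as np

def grid_features(char_img, m=8, ink=0):
    """m x m grid feature: the percentage of the character's black (ink)
    pixels that falls in each of the m*m cells."""
    h, w = char_img.shape
    counts = np.zeros((m, m), dtype=np.float64)
    for i in range(m):
        for j in range(m):
            cell = char_img[i * h // m:(i + 1) * h // m,
                            j * w // m:(j + 1) * w // m]
            counts[i, j] = (cell == ink).sum()     # Pij: black pixels in cell (i, j)
    total = counts.sum()                           # P: total black pixels
    if total == 0:
        return counts.ravel()
    return (counts * 100.0 / total).ravel()        # Pij = Pij * 100 / P
```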
As shown in Fig. 5, in step S4 the concrete method of extracting the direction features of a character is as follows:
Binarize and normalize the character grid image and extract the contour information; assign to each point on the contour one or two direction attributes, the directions being horizontal, vertical and ±45°, four angles in total; divide the character grid into n × n cells; count the number of each of the 4 direction attributes contained in each cell, forming a 4-dimensional vector; and combine the features of all cells to form a feature vector of 4 × n × n dimensions, which is the direction feature.
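The Python sketch below illustrates one way to realize such a direction feature; the way contour points are detected (ink pixels with at least one background 4-neighbor) and the mapping of the eight neighbor offsets to the four orientations are simplifying assumptions of the sketch, not details fixed by the patent.

```python
import numpy as np

# Offsets of the 8 neighbors, each mapped to one of the four orientations:
# 0 = horizontal, 1 = vertical, 2 = +45 degrees, 3 = -45 degrees.
NEIGHBOR_DIRS = {(0, 1): 0, (0, -1): 0, (1, 0): 1, (-1, 0): 1,
                 (-1, 1): 2, (1, -1): 2, (1, 1): 3, (-1, -1): 3}

def direction_features(char_img, n=8, ink=0):
    """4*n*n direction feature: contour pixels (ink pixels with at least one
    background 4-neighbor) receive direction attributes according to the
    directions of their neighboring ink pixels, and the attributes are
    accumulated per cell of an n x n grid."""
    h, w = char_img.shape
    is_ink = (char_img == ink)
    feat = np.zeros((n, n, 4), dtype=np.float64)
    for y in range(h):
        for x in range(w):
            if not is_ink[y, x]:
                continue
            nbrs4 = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            if all(0 <= yy < h and 0 <= xx < w and is_ink[yy, xx]
                   for yy, xx in nbrs4):
                continue                              # interior pixel, not contour
            gy, gx = min(y * n // h, n - 1), min(x * n // w, n - 1)
            for (dy, dx), d in NEIGHBOR_DIRS.items():
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w and is_ink[yy, xx]:
                    feat[gy, gx, d] += 1              # count this orientation
    return feat.ravel()                               # 4 * n * n dimensions
```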
S5: as shown in Fig. 6, use a minimum distance classifier with the stroke-density total-length feature to perform the first-layer coarse classification;
The minimum distance classifier uses the stroke-density total-length feature for the first-layer coarse classification. In this approach, the pattern to be recognized is assigned to the class of the samples it is closest to. Suppose the c classes are represented by the feature vectors R1, ..., Rc of their patterns, x is the feature vector of the pattern to be recognized, and |x - Ri| (i = 1, 2, ..., c) is the distance between x and Ri; if |x - Ri| is the minimum, then x is assigned to the i-th class.
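A minimal sketch of such a minimum distance classifier is given below, for illustration; the per-dimension standardization stands in for the normalization/scaling step described in the summary, and returning several nearest candidate classes (the `keep` parameter) is an assumption made here to reflect its role as a coarse, first-layer classifier.

```python
import numpy as np

class MinimumDistanceClassifier:
    """First-layer coarse classifier: each class is represented by the mean
    (feature center) of its normalized features, and a pattern is assigned
    to the classes whose centers are nearest."""

    def fit(self, features, labels):
        feats = np.asarray(features, dtype=np.float64)
        labels = np.asarray(labels)
        self.mean_ = feats.mean(axis=0)
        self.std_ = feats.std(axis=0) + 1e-12          # avoid division by zero
        normed = (feats - self.mean_) / self.std_      # remove the effect of units
        self.classes_ = np.unique(labels)
        self.centers_ = np.array([normed[labels == c].mean(axis=0)
                                  for c in self.classes_])
        return self

    def predict(self, x, keep=10):
        """Return the `keep` candidate classes nearest to x."""
        xn = (np.asarray(x, dtype=np.float64) - self.mean_) / self.std_
        dists = np.linalg.norm(self.centers_ - xn, axis=1)   # |x - Ri|
        order = np.argsort(dists)[:keep]
        return self.classes_[order]
```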
S6: use a nearest neighbor classifier with peripheral features, grid features and direction features combined to perform the second-layer classification and matching.
The nearest neighbor classifier uses the grid features and direction features in combination for the second-layer classification and matching. The nearest neighbor classifier is an extension of minimum distance classification: each sample in the training set serves as a reference for the decision, the training sample nearest to the sample to be classified is found, and classification is carried out on that basis.
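For illustration, the following Python sketch follows steps 1) to 7) of the nearest neighbor classifier given in the summary above; the Euclidean distance and the default K = 5 are assumptions of the sketch.

```python
import numpy as np
from collections import Counter

def knn_classify(unknown, train_features, train_labels, k=5):
    """Second-layer classification: keep the K training samples nearest to
    the unknown sample and return the most frequent class label among them."""
    unknown = np.asarray(unknown, dtype=np.float64)
    nearest = []                                   # list of (distance, label)
    maxdist = np.inf                               # 1) initialize distance to maximum
    for feat, label in zip(train_features, train_labels):   # 5) loop over all samples
        dist = np.linalg.norm(unknown - np.asarray(feat, dtype=np.float64))  # 2)
        if len(nearest) < k or dist < maxdist:     # 3)-4) keep it among the K nearest
            nearest.append((dist, label))
            nearest.sort(key=lambda t: t[0])
            nearest = nearest[:k]
            maxdist = nearest[-1][0]
    counts = Counter(label for _, label in nearest)    # 6) count class labels
    return counts.most_common(1)[0][0]                 # 7) most frequent label wins
```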
Repeated experiments and research show that the performance of the system cannot be fundamentally improved on the basis of a single recognizer; the results of multiple classifiers should be integrated instead. Multi-classifier integration uses several complementary classifiers to improve on the performance of a single classifier and obtain a recognition system of higher reliability. The present invention therefore integrates a minimum distance classifier and a nearest neighbor classifier and, through optimization of the classifier design, further improves the character recognition rate and accuracy.
To verify the effectiveness of the invention, related experiments were carried out: the present invention was tested on an original image containing 697 Chinese characters. The original image was first converted into a grayscale image for the next step of processing. The whole passage of text was segmented into single characters by the regression-based character segmentation method; the test result is shown in Fig. 7, and each Chinese character is segmented accurately. Finally, the segmented characters were recognized by the method of multi-feature extraction and multi-classifier integration and output in the form of a text box; the test result is shown in Fig. 8, and the recognition results are correct.
The multi-feature extraction method and the multi-classifier integration method make it possible to improve the recognition rate of characters in images, and their good recognition performance has attracted wide attention and has broad application prospects. The present invention realizes image character recognition based on the multi-classifier integration method, making the processing and extraction of text information in images more feasible.
Obviously, the above embodiment of the present invention is only an example given to illustrate the present invention clearly and is not a limitation on the embodiments of the present invention. For those of ordinary skill in the art, other changes in different forms can also be made on the basis of the above description. It is neither necessary nor possible to enumerate all embodiments here. Any modification, equivalent replacement and improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (7)

1. An image character recognition method based on multi-classifier integration, characterized in that the method comprises the following steps:
S1: convert the color image to be recognized into a grayscale image; if the image to be recognized is already a grayscale image, omit this step;
S2: binarize the resulting grayscale image, and segment out the image regions containing text information;
S3: segment each Chinese character from the block of text in the image;
S4: extract the grid features and direction features of each Chinese character;
S5: use a minimum distance classifier with the stroke-density total-length feature to perform a first-layer coarse classification;
S6: use a nearest neighbor classifier with peripheral features, grid features and direction features combined to perform the second-layer classification and matching.
2. The image character recognition method based on multi-classifier integration according to claim 1, characterized in that in step S1, when the color image to be recognized is converted to a grayscale image, a weighted-average method is used for the grayscale conversion, i.e. the R, G and B values are averaged with weights: R = G = B = a*R + b*G + c*B, where R, G and B denote the red, green and blue components respectively, a, b and c are respectively the weights of R, G and B, and b > a > c.
3. The image character recognition method based on multi-classifier integration according to claim 1, characterized in that in step S2, the OTSU algorithm is used to binarize the grayscale image; the OTSU algorithm uses the gray-level characteristics of the image to divide it into two parts, background and target; the larger the between-class variance between background and target, the larger the difference between the two parts making up the image, and when part of the target is misclassified as background or part of the background is misclassified as target the difference between the two parts decreases, so the segmentation that maximizes the between-class variance minimizes the probability of misclassification;
The steps of the Otsu algorithm are as follows:
Suppose the image contains L gray levels (0, 1, ..., L-1), the number of pixels with gray value i is Ni, and the total number of pixels in the image is N = N0 + N1 + ... + N(L-1); the probability of a point having gray value i is P(i) = N(i) / N.
A threshold t divides the whole image into two classes, the dark region c1 and the bright region c2; the between-class variance σ is then a function of t: σ = a1 · a2 · (u1 - u2)^2, where aj is the ratio of the area of class cj to the total image area, a1 = P(0) + P(1) + ... + P(t), and a2 = 1 - a1.
uj is the mean of class cj: u1 = [0·P(0) + 1·P(1) + ... + t·P(t)] / a1 and u2 = [(t+1)·P(t+1) + ... + (L-1)·P(L-1)] / a2. The method selects the optimal threshold t* that maximizes the between-class variance, i.e. with Δu = u1 - u2, σb = max{ a1(t) · a2(t) · Δu^2 }.
4. The image character recognition method based on multi-classifier integration according to claim 1, characterized in that in step S3, a character segmentation method is used to identify the individual characters in the image region, i.e. the peaks and valleys formed by the vertical projection, along the horizontal direction of the image, of the blank space between characters are used to segment out the individual characters.
5. The image character recognition method based on multi-classifier integration according to claim 4, characterized in that in step S3, a regression-based character segmentation method is used to identify individual characters, i.e. since Chinese characters are square-shaped and roughly uniform in size, the character height obtained during line segmentation is used to estimate the character width and thereby predict the position of the next character.
6. The image character recognition method based on multi-classifier integration according to claim 1, characterized in that in step S4, the concrete method of extracting the grid features of a character is as follows:
1) Divide the character grid into 8 × 8 parts;
2) Count the number of black pixels in each part, denoted P11, P12, ..., P18, P21, ..., P88;
3) Compute the total number of black pixels of the character, P = P11 + P12 + ... + P18 + P21 + ... + P88;
4) Compute the percentage of the character's total black pixels accounted for by the black pixels of each part, Pij = Pij × 100 / P, where i and j are integers greater than or equal to 1 and less than or equal to 8; the feature vector (P11, P12, ..., P18, P21, ..., P88) is the grid feature of the character.
7. The image character recognition method based on multi-classifier integration according to claim 1, characterized in that in step S4, the concrete method of extracting the direction features of a character is as follows:
Binarize and normalize the character grid image and extract the contour information; assign to each point on the contour one or two direction attributes, the directions being horizontal, vertical and ±45°, four angles in total; divide the character grid into n × n cells; count the number of each of the 4 direction attributes contained in each cell, forming a 4-dimensional vector; and combine the features of all cells to form a feature vector of 4 × n × n dimensions, which is the direction feature.
CN201610442435.0A 2016-06-20 2016-06-20 Image character recognition method based on multi-classifier integration Pending CN106127198A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610442435.0A CN106127198A (en) 2016-06-20 2016-06-20 Image character recognition method based on multi-classifier integration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610442435.0A CN106127198A (en) 2016-06-20 2016-06-20 Image character recognition method based on multi-classifier integration

Publications (1)

Publication Number Publication Date
CN106127198A (en) 2016-11-16

Family

ID=57470615

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610442435.0A Pending CN106127198A (en) Image character recognition method based on multi-classifier integration

Country Status (1)

Country Link
CN (1) CN106127198A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034166A (en) * 2017-06-08 2018-12-18 北京君正集成电路股份有限公司 Confusable character identification model training method and device
CN109389061A (en) * 2018-09-26 2019-02-26 苏州友教习亦教育科技有限公司 Paper recognition methods and system
CN110009065A (en) * 2019-01-14 2019-07-12 岭南师范学院 A kind of calligraphy comparison method based on image binaryzation
CN110879965A (en) * 2019-10-12 2020-03-13 中国平安财产保险股份有限公司 Automatic reading and amending method of test paper objective questions, electronic device, equipment and storage medium
CN110942085A (en) * 2019-10-25 2020-03-31 深圳猛犸电动科技有限公司 Image classification method, image classification device and terminal equipment
CN111104980A (en) * 2019-12-19 2020-05-05 腾讯科技(深圳)有限公司 Method, device, equipment and storage medium for determining classification result
CN111428069A (en) * 2020-03-11 2020-07-17 中交第二航务工程局有限公司 Construction data acquisition method for slot milling machine
CN112906686A (en) * 2021-03-11 2021-06-04 北京小米移动软件有限公司 Character recognition method and device, electronic equipment and storage medium
CN113420767A (en) * 2021-07-22 2021-09-21 凌云光技术股份有限公司 Method, system and device for extracting features for font classification
CN114627730A (en) * 2022-03-31 2022-06-14 北京科技大学 Braille electronic book
CN114937277A (en) * 2022-05-18 2022-08-23 北京百度网讯科技有限公司 Image-based text acquisition method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070133883A1 (en) * 2005-12-12 2007-06-14 Microsoft Corporation Logical structure and layout based offline character recognition
CN101630367A (en) * 2009-07-31 2010-01-20 北京科技大学 Rejection method for identifying handwritten character based on multiple classifiers
CN103996055A (en) * 2014-06-13 2014-08-20 上海珉智信息科技有限公司 Identification method based on classifiers in image document electronic material identification system

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070133883A1 (en) * 2005-12-12 2007-06-14 Microsoft Corporation Logical structure and layout based offline character recognition
CN101630367A (en) * 2009-07-31 2010-01-20 北京科技大学 Rejection method for identifying handwritten character based on multiple classifiers
CN103996055A (en) * 2014-06-13 2014-08-20 上海珉智信息科技有限公司 Identification method based on classifiers in image document electronic material identification system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
吴丹 et al.: "Binarization of text images based on histogram analysis and the OTSU algorithm", 《计算机与现代化》 (Computer and Modernization) *
罗笑玲 et al.: "Research on image character recognition technology based on multi-classifier integration and its application", 《软件》 (Software) *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034166B (en) * 2017-06-08 2021-09-24 北京君正集成电路股份有限公司 Confusable character recognition model training method and device
CN109034166A (en) * 2017-06-08 2018-12-18 北京君正集成电路股份有限公司 Confusable character identification model training method and device
CN109389061A (en) * 2018-09-26 2019-02-26 苏州友教习亦教育科技有限公司 Paper recognition methods and system
CN110009065A (en) * 2019-01-14 2019-07-12 岭南师范学院 A kind of calligraphy comparison method based on image binaryzation
CN110879965A (en) * 2019-10-12 2020-03-13 中国平安财产保险股份有限公司 Automatic reading and amending method of test paper objective questions, electronic device, equipment and storage medium
CN110942085A (en) * 2019-10-25 2020-03-31 深圳猛犸电动科技有限公司 Image classification method, image classification device and terminal equipment
CN110942085B (en) * 2019-10-25 2024-04-09 深圳猛犸电动科技有限公司 Image classification method, image classification device and terminal equipment
CN111104980A (en) * 2019-12-19 2020-05-05 腾讯科技(深圳)有限公司 Method, device, equipment and storage medium for determining classification result
CN111428069A (en) * 2020-03-11 2020-07-17 中交第二航务工程局有限公司 Construction data acquisition method for slot milling machine
CN112906686A (en) * 2021-03-11 2021-06-04 北京小米移动软件有限公司 Character recognition method and device, electronic equipment and storage medium
CN113420767A (en) * 2021-07-22 2021-09-21 凌云光技术股份有限公司 Method, system and device for extracting features for font classification
CN113420767B (en) * 2021-07-22 2024-04-26 凌云光技术股份有限公司 Feature extraction method, system and device for font classification
CN114627730A (en) * 2022-03-31 2022-06-14 北京科技大学 Braille electronic book
CN114937277A (en) * 2022-05-18 2022-08-23 北京百度网讯科技有限公司 Image-based text acquisition method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN106127198A (en) A kind of image character recognition method based on Multi-classifers integrated
CN111368886B (en) Sample screening-based label-free vehicle picture classification method
CN110135267A (en) A kind of subtle object detection method of large scene SAR image
CN110109060A (en) A kind of radar emitter signal method for separating and system based on deep learning network
CN104573669A (en) Image object detection method
CN109711448A (en) Based on the plant image fine grit classification method for differentiating key field and deep learning
CN101609509B (en) Image and object detection method and system based on pre-classifier
CN104504362A (en) Face detection method based on convolutional neural network
CN110991549A (en) Countermeasure sample generation method and system for image data
Cui et al. Locality preserving genetic algorithms for spatial-spectral hyperspectral image classification
Aditya et al. Batik classification using neural network with gray level co-occurence matrix and statistical color feature extraction
CN105184298A (en) Image classification method through fast and locality-constrained low-rank coding process
CN110991257B (en) Polarized SAR oil spill detection method based on feature fusion and SVM
CN108520215B (en) Single-sample face recognition method based on multi-scale joint feature encoder
CN105046259B (en) Coronal mass ejection detection method based on multi-feature fusion
CN104252625A (en) Sample adaptive multi-feature weighted remote sensing image method
CN114841972A (en) Power transmission line defect identification method based on saliency map and semantic embedded feature pyramid
CN110009030A (en) Sewage treatment method for diagnosing faults based on stacking meta learning strategy
CN110390347A (en) Conditions leading formula confrontation for deep neural network generates test method and system
CN110363230A (en) Stacking integrated sewage handling failure diagnostic method based on weighting base classifier
CN106599864A (en) Deep face recognition method based on extreme value theory
CN114255403A (en) Optical remote sensing image data processing method and system based on deep learning
Vil’kin et al. Algorithm for segmentation of documents based on texture features
CN106203482A (en) Characteristics of The Remote Sensing Images dimension reduction method based on mRMR and KPCA
WO2020119624A1 (en) Class-sensitive edge detection method based on deep learning

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20161116

RJ01 Rejection of invention patent application after publication