CN107292307B - Automatic identification method and system for inverted Chinese character verification code

Automatic identification method and system for inverted Chinese character verification code

Info

Publication number
CN107292307B
Authority
CN
China
Prior art keywords
image
character
verification code
sample image
label information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201710599718.0A
Other languages
Chinese (zh)
Other versions
CN107292307A (en)
Inventor
路松峰 (Lu Songfeng)
罗立志 (Luo Lizhi)
王同洋 (Wang Tongyang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN201710599718.0A priority Critical patent/CN107292307B/en
Publication of CN107292307A publication Critical patent/CN107292307A/en
Application granted granted Critical
Publication of CN107292307B publication Critical patent/CN107292307B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)
  • Character Discrimination (AREA)

Abstract

The invention discloses an automatic identification method and system for an inverted Chinese character verification code. The method comprises the following steps: processing the verification code image to be identified to obtain a target verification code image, and segmenting the target verification code image along the left and right boundaries of each character to obtain the single-character images; generating label information corresponding to the target verification code image, wherein the label information is represented in binary, 0 indicating that the character in a single-character image is normal and 1 indicating that it is inverted; passing the length of the label information as a parameter to a trained automatic inverted Chinese character identification model, which sets the number of neuron outputs to be consistent with the length of the label information; and using the sum of the cross entropies as the loss function of the trained automatic inverted Chinese character identification model, training until the loss function converges, and outputting the identification result. Automatic identification of inverted Chinese character verification codes can thereby be realized.

Description

Automatic identification method and system for inverted Chinese character verification code
Technical Field
The invention belongs to the technical field of automatic identification, and particularly relates to an automatic identification method and system for an inverted Chinese character verification code.
Background
With the rapid development of the Internet, the network has brought great convenience to people's lives, while at the same time network security problems have become increasingly prominent. The network verification code is a widely used verification means and plays an important role in network security.
At present, the identification of character-type verification codes is mainly the identification of characters, and the main flow is as follows: preparing the original picture material; preprocessing the picture, cutting out characters and normalizing the picture size; labeling the picture characters; extracting character and picture features; generating a training data set pairing features with labels; training on the feature-label data to generate a recognition model; and predicting a new set of unknown pictures using the recognition model.
However, conventional verification code recognition technology recognizes the characters appearing in the verification code, such as digits and letters, without considering how an inverted Chinese character is to be recognized.
Disclosure of Invention
In view of the above defects or improvement requirements of the prior art, the present invention provides an automatic identification method and system for an inverted Chinese character verification code, so as to solve the technical problem that the traditional verification code identification technology cannot identify inverted Chinese characters.
To achieve the above object, according to one aspect of the present invention, there is provided an automatic identification method of an inverted Chinese character verification code, comprising:
Processing a verification code image to be recognized to obtain a target verification code image, acquiring the left and right boundaries of each character in the target verification code image, and segmenting the target verification code image according to the left and right boundaries of each character to obtain single-character images;
Generating label information corresponding to the target verification code image, wherein the label information is represented in binary, 0 representing that the character in a single-character image is normal and 1 representing that the character in the single-character image is inverted;
Transmitting the length of the label information as a parameter to a trained automatic inverted Chinese character recognition model, which sets the number of neuron outputs to be consistent with the length of the label information, wherein each output is a two-class classifier;
And using the sum of the cross entropies as the loss function of the trained automatic inverted Chinese character recognition model, training until the loss function converges, and outputting a recognition result, wherein the number of cross entropies is the same as the number of outputs.
Preferably, the training method of the trained automatic inverted Chinese character recognition model comprises the following steps:
Respectively processing each sample image in the test sample images to obtain the left and right boundaries of each character in each processed sample image, and segmenting each sample image by the left and right boundaries of each character in each sample image to obtain a single character image corresponding to each sample image;
Respectively generating label information corresponding to each sample image, wherein the label information of each sample image is represented by binary, 0 represents that characters in a single-character image in the corresponding sample image are normal, and 1 represents that characters in the single-character image in the corresponding sample image are inverted;
Sequentially transmitting the length of the label information of each sample image as a parameter to a training model, and setting the number of neuron outputs by the training model according to the length of the input label information, wherein each output is a two-class classifier;
And training the training model by using the sum of the cross entropies as a loss function of the training model until the output accuracy of the training model meets the requirement of preset accuracy, and storing the training model.
Preferably, the training model is a 2-dimensional convolutional neural network model which adopts multiple convolutional layers plus multiple pooling layers, followed finally by a fully connected layer.
Preferably, the parameters of the 2-dimensional convolutional neural network model are as follows: the input is the number of characters in the verification code image, the number of convolution kernels is M (M-dimensional features are extracted for each input), the size of the convolution kernels is N × N, the channel value is L, the boundary processing mode (padding) value is SAME, and the pooling function adopts average pooling (avg_pool) with C × C pooling, i.e. each C × C block of pixels is reduced to a single value; the boundary processing mode (padding) value for pooling is likewise SAME.
Preferably, M is 32, N is 5, L is 1 and C is 2.
According to another aspect of the present invention, there is provided an automatic identification system of an inverted Chinese character verification code, comprising:
The character segmentation module is used for processing a verification code image to be identified to obtain a target verification code image, acquiring the left and right boundaries of each character in the target verification code image, and segmenting the target verification code image according to the left and right boundaries of each character to obtain single-character images;
The label information generating module is used for generating label information corresponding to the target verification code image, wherein the label information is represented in binary, 0 representing that the character in a single-character image is normal and 1 representing that the character in the single-character image is inverted;
The input module is used for transmitting the length of the label information as a parameter to a trained automatic inverted Chinese character recognition model, which sets the number of neuron outputs to be consistent with the length of the label information, wherein each output is a two-class classifier;
And the recognition module is used for using the sum of the cross entropies as the loss function of the trained automatic inverted Chinese character recognition model, training until the loss function converges, and outputting a recognition result, wherein the number of cross entropies is the same as the number of outputs.
Preferably, the character segmentation module is further configured to process each sample image in the test sample images, obtain the left and right boundaries of each character in each processed sample image, and segment each sample image by the left and right boundaries of each character in each sample image to obtain a single character image corresponding to each sample image;
The label information generating module is further used for respectively generating label information corresponding to each sample image, wherein the label information of each sample image is represented by binary, 0 represents that characters in a single-character image in the corresponding sample image are normal, and 1 represents that characters in the single-character image in the corresponding sample image are inverted;
The input module is further used for sequentially transmitting the length of the label information of each sample image as a parameter to a training model, and the training model sets the output number of the neurons according to the length of the input label information, wherein each output is a two-class classifier;
and the model training module is used for training the training model by using the sum of cross entropies as a loss function of the training model until the output accuracy of the training model meets the requirement of preset accuracy, and storing the training model.
Preferably, the training model is a 2-dimensional convolutional neural network model which adopts multiple convolutional layers plus multiple pooling layers, followed finally by a fully connected layer.
Preferably, the parameters of the 2-dimensional convolutional neural network model are as follows: the input is the number of characters in the verification code image, the number of convolution kernels is M (M-dimensional features are extracted for each input), the size of the convolution kernels is N × N, the channel value is L, the boundary processing mode (padding) value is SAME, and the pooling function adopts average pooling (avg_pool) with C × C pooling, i.e. each C × C block of pixels is reduced to a single value; the boundary processing mode (padding) value for pooling is likewise SAME.
Preferably, M is 32, N is 5, L is 1 and C is 2.
In general, compared with the prior art, the above technical solution contemplated by the present invention can achieve the following beneficial effects: the verification code image is segmented to obtain single-character images, and a binary sequence is then used to distinguish inverted characters from normal characters, so that the two can be distinguished accurately; the length of the generated label information, which captures the character features in the verification code image, is used as the input of the trained automatic inverted Chinese character identification model, so that automatic identification of inverted Chinese character verification codes can be realized; and the automatic inverted Chinese character identification model is obtained by training the constructed training model on the segmented test sample images, so inverted characters in a Chinese character verification code can be identified automatically, manual operation is reduced, and an automated process is realized.
Drawings
Fig. 1 is a schematic flow chart of an automatic identification method for an inverted Chinese character verification code according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Fig. 1 is a schematic flow chart of an automatic identification method for an inverted Chinese character verification code according to an embodiment of the present invention, and the method shown in Fig. 1 includes the following steps:
S1, processing the verification code image to be recognized to obtain a target verification code image, obtaining the left and right boundaries of each character in the target verification code image, and segmenting the target verification code image according to the left and right boundaries of each character to obtain single-character images;
As an alternative embodiment, in step S1, the following method may be adopted to process the verification code image to be recognized:
Preprocess the verification code image to be identified, mainly using techniques such as graying, binarization and noise suppression.
1. Graying of image
A color value in the RGB system consists of 3 components; such an image is called a color image, and RGB is called a color space model. Other common color space models include HSI, CMYK, etc. If the color space of an image is one-dimensional (a color value has only one color component), the image is a grayscale image. In a bitmap image, a grayscale image is generally stored with R = G = B.
The following three methods are commonly used for graying:
g(x,y)=[R(x,y)+G(x,y)+B(x,y)]/3 (2.1)
g(x,y)=0.11×R(x,y)+0.59×G(x,y)+0.3×B(x,y) (2.2)
g(x,y)=min(R(x,y),G(x,y),B(x,y)) (2.3)
Here, the method of formula (2.1) is derived from the formula for calculating the I component in the HSI color space, and formula (2.2) is derived from the formula for calculating the Y component in the NTSC color space. Formula (2.3) takes the approach of preserving the minimum brightness (black).
An RGB color image can be regarded as being composed of 3 monochromatic grayscale images, and a grayscale image can be obtained by directly taking any one of the RGB channels, e.g. g(x,y)=B(x,y), provided the luminance information of the target pixels in the image is mainly distributed on the B channel; otherwise the grayscale result will suffer a large loss of luminance information. A grayscale image is also called a luminance image: the luminance is represented by a normalized value, the maximum value representing white and the minimum value representing black.
Let P(x,y) denote a point in the image, where x and y are the abscissa and ordinate respectively, R(x,y) denotes the color component of the R channel, G(x,y) that of the G channel, and B(x,y) that of the B channel. The luminance value of the point P(x,y) is denoted L(x,y). The luminance of a color image has no single strict definition and calculation; it is generally calculated by formula (2.1), denoted L1(x,y). Similarly, the luminance value calculated by formula (2.2) is denoted L2(x,y), and that calculated by formula (2.3) is denoted L3(x,y). Since L1 and L2 are weighted averages of the three channel values with non-negative weights summing to 1, neither can be smaller than the channel minimum, so it can be demonstrated that:
L3(x,y)≤L1(x,y) (2.4)
L3(x,y)≤L2(x,y) (2.5)
Formula (2.1) takes the average of the RGB channels; the resulting image is relatively soft, but the average brightness difference between the target and the background is reduced, which does not facilitate subsequent threshold processing. Formula (2.2) assumes that the human eye is most sensitive to green, next to blue, and least to red. When processing green- and blue-toned verification code images, the effect of formula (2.2) is satisfactory, but when processing red-toned images, because the weight of red in the formula is small, the brightness difference between target and background pixels after graying is severely reduced and the effect is not as good as that of formula (2.1). Formula (2.3) is built on the premise of preserving the limited brightness information of the target pixels, which facilitates subsequent threshold segmentation.
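For illustration only (this code is not part of the patent), the three graying formulas (2.1) to (2.3) could be implemented as follows for an image stored as a NumPy array of shape (height, width, 3); the function names are my own:

import numpy as np

def to_gray_mean(rgb):
    # Formula (2.1): per-pixel mean of the R, G and B channels.
    return rgb.astype(np.float32).mean(axis=2)

def to_gray_weighted(rgb):
    # Formula (2.2): weighted sum with the weights given in the text above.
    r = rgb[..., 0].astype(np.float32)
    g = rgb[..., 1].astype(np.float32)
    b = rgb[..., 2].astype(np.float32)
    return 0.11 * r + 0.59 * g + 0.3 * b

def to_gray_min(rgb):
    # Formula (2.3): per-pixel minimum of the three channels.
    return rgb.astype(np.float32).min(axis=2)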
2. Binarization of images
The grayscale map of a typical 24-bit RGB image has 8-bit, 256-level gray scale; if this is reduced to a 1-bit, 2-level scale, a binary image is obtained in which all values are 0 or 1.
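As an illustrative sketch only, reducing a grayscale image to a 0/1 binary image with a fixed threshold might look as follows; the threshold value 127 is an assumption, since the patent does not state how the threshold is chosen:

import numpy as np

def binarize(gray, threshold=127):
    # Foreground (dark character strokes) mapped to 1, background to 0.
    return (gray < threshold).astype(np.uint8)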
As an alternative implementation, after the target verification code image is obtained by processing the verification code image to be recognized, the target verification code image needs to be segmented. Character segmentation consists of separating the character region from the verification code image and splitting that region into single characters. If statistical feature matching or neural network recognition is used, individual characters must be segmented first. Simple segmentation methods include equidistant segmentation, integral (projection) segmentation, intersection segmentation, connected-component analysis, and the like. The characters are cut into a group of single-character pictures to be detected, and a 0/1 sequence with the same length as the number of characters in the picture is output (0 indicating that the Chinese character is normal, 1 indicating that it is inverted). Processing the segmented picture amounts to scanning the left and right boundaries of each character, i.e. finding the left-most and right-most points of each character, and cutting along those boundaries; a sketch of the projection-based approach is given below.
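An illustrative sketch of the integral-projection variant follows; the patent does not prescribe a particular implementation, so the function and variable names below are assumptions:

import numpy as np

def segment_characters(binary_img):
    # binary_img: 2-D array of 0/1, foreground = 1.
    # Returns a list of (left, right) column boundaries, one pair per character.
    column_sums = binary_img.sum(axis=0)          # vertical projection
    in_char, boundaries, left = False, [], 0
    for x, s in enumerate(column_sums):
        if s > 0 and not in_char:                 # left-most point of a character
            in_char, left = True, x
        elif s == 0 and in_char:                  # right-most point just passed
            in_char = False
            boundaries.append((left, x - 1))
    if in_char:                                   # character touching the right edge
        boundaries.append((left, len(column_sums) - 1))
    return boundaries

Single-character images can then be cut out as binary_img[:, left:right + 1].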
S2, generating label information corresponding to the target verification code image, wherein the label information is represented by binary, 0 represents that the characters in the single-character image are normal, and 1 represents that the characters in the single-character image are inverted;
S3, transferring the length of the label information as a parameter to a trained automatic inverted Chinese character recognition model, which sets the number of neuron outputs to be consistent with the length of the label information, wherein each output is a two-class classifier;
And S4, using the sum of the cross entropies as the loss function of the trained inverted Chinese character automatic recognition model, training until the loss function converges, and outputting a recognition result, wherein the number of cross entropies is the same as the number of outputs.
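For illustration only, and assuming per-character two-class logits of shape (batch, num_chars, 2) with a 0/1 label sequence of shape (batch, num_chars), the sum of the cross entropies could be formed in TensorFlow roughly as follows (averaging over the batch is an assumption of this sketch, not stated in the patent):

import tensorflow as tf

def sum_of_cross_entropies(logits, labels):
    # One cross entropy per character (per two-class classifier)...
    per_char = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)
    # ...summed over the characters, then averaged over the batch.
    return tf.reduce_mean(tf.reduce_sum(per_char, axis=1))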
Here the recognition result takes a form such as 010100 (the number of binary digits equals the number of characters, a 1 indicating that the corresponding Chinese character is inverted).
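As a small illustration (an assumed output layout, not the patent's own code), per-character two-class scores of shape (num_chars, 2) could be turned into such a result string as follows:

import numpy as np

def decode_result(logits):
    # logits: array of shape (num_chars, 2); column 1 scores "inverted".
    flags = logits.argmax(axis=1)                 # 0 = normal, 1 = inverted
    return "".join(str(int(f)) for f in flags)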
The training method of the trained automatic identification model of the inverted Chinese characters comprises the following steps:
Respectively processing each sample image in the test sample images to obtain the left and right boundaries of each character in each processed sample image, and segmenting each sample image by the left and right boundaries of each character in each sample image to obtain a single character image corresponding to each sample image;
Respectively generating label information corresponding to each sample image, wherein the label information of each sample image is represented by binary, 0 represents that characters in a single-character image in the corresponding sample image are normal, and 1 represents that characters in the single-character image in the corresponding sample image are inverted;
Sequentially transmitting the length of the label information of each sample image as a parameter to a training model, and setting the number of neuron outputs by the training model according to the length of the input label information, wherein each output is a two-class classifier;
After parameterization in this way, the method can adapt to verification codes containing different numbers of characters, and thus achieves better generality.
and training the training model by using the sum of the cross entropies as a loss function of the training model until the output accuracy of the training model meets the requirement of preset accuracy, and storing the training model.
Preferably, a toolkit based on TensorFlow and TF-Slim can be adopted to train the training model, with Adam selected as the optimizer; the accuracy is finally calculated, the model is saved, and the preset accuracy requirement can be set as needed.
The training model is a 2-dimensional convolutional neural network model that adopts multiple convolutional layers, multiple pooling layers and a fully connected layer; the output is a 0/1 sequence whose length equals the number of single characters in the verification code image. Preferably, the 2-dimensional convolutional neural network provided by TensorFlow can be used.
The parameters of the 2-dimensional convolutional neural network model are as follows: the input is the number of characters in the verification code image, the number of convolution kernels is M (M-dimensional features are extracted for each input), the size of the convolution kernels is N × N, the channel value is L, the boundary processing mode (padding) value is SAME, and the pooling function adopts average pooling (avg_pool) with C × C pooling, i.e. each C × C block of pixels is reduced to a single value; the boundary processing mode (padding) value for pooling is likewise SAME. Preferably, M is 32, N is 5, L is 1 and C is 2.
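For illustration only, the following is a minimal sketch of a model with these parameters (M = 32 kernels of size 5 x 5, channel value L = 1, SAME padding, 2 x 2 average pooling, a fully connected layer, and one two-class output per character), trained with Adam as mentioned above. The use of tf.keras rather than TF-Slim, the number of convolution/pooling stages, the input image size and the width of the fully connected layer are assumptions of this sketch, not details taken from the patent.

import tensorflow as tf

def build_model(num_chars, height=60, width=160):
    # Input: grayscale verification code image, channel value L = 1.
    inputs = tf.keras.Input(shape=(height, width, 1))
    x = inputs
    for _ in range(2):  # "multiple" convolution + pooling stages (count assumed)
        x = tf.keras.layers.Conv2D(32, 5, padding="same", activation="relu")(x)   # M = 32, N = 5, padding SAME
        x = tf.keras.layers.AveragePooling2D(2, padding="same")(x)                # avg_pool, C = 2, padding SAME
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dense(512, activation="relu")(x)                          # fully connected layer (width assumed)
    logits = tf.keras.layers.Dense(num_chars * 2)(x)                              # one two-class classifier per character
    outputs = tf.keras.layers.Reshape((num_chars, 2))(logits)
    return tf.keras.Model(inputs, outputs)

model = build_model(num_chars=6)   # e.g. a 6-character verification code
# Sum of the per-character cross entropies as the loss (summed here over characters and batch), optimised with Adam.
loss = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction=tf.keras.losses.Reduction.SUM)
model.compile(optimizer="adam", loss=loss, metrics=["sparse_categorical_accuracy"])
# model.fit(images, labels, ...) with labels of shape (batch, num_chars) containing 0/1 flags.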
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (8)

1. An automatic identification method for an inverted Chinese character verification code is characterized by comprising the following steps:
processing a verification code image to be recognized to obtain a target verification code image, acquiring the left and right boundaries of each character in the target verification code image, and segmenting the target verification code image according to the left and right boundaries of each character to obtain single-character images;
generating label information corresponding to the target verification code image, wherein the label information is represented by binary, 0 represents that characters in a single-character image are normal, and 1 represents that characters in the single-character image are inverted;
transmitting the length of the label information as a parameter to a trained automatic inverted Chinese character recognition model, the trained automatic inverted Chinese character recognition model setting the number of neuron outputs to be consistent with the length of the label information, wherein each output is a two-class classifier; the automatic identification model of the inverted Chinese characters is a 2-dimensional convolutional neural network model, and the 2-dimensional convolutional neural network model adopts multiple convolutional layers, multiple pooling layers and a fully connected layer;
And using the sum of cross entropies as a loss function of the trained automatic inverted Chinese character recognition model, training the loss function until the loss function is converged, and outputting a recognition result, wherein the number of the cross entropies is the same as the number of outputs.
2. The method of claim 1, wherein the training method of the trained automatic inverted Chinese character recognition model comprises the following steps:
Respectively processing each sample image in the test sample image to obtain the left and right boundaries of each character in each processed sample image, and segmenting each sample image by the left and right boundaries of each character in each sample image to obtain a single character image corresponding to each sample image;
Respectively generating label information corresponding to each sample image, wherein the label information of each sample image is represented by binary, 0 represents that characters in a single-character image in the corresponding sample image are normal, and 1 represents that characters in the single-character image in the corresponding sample image are inverted;
Sequentially transmitting the length of the label information of each sample image as a parameter to a training model, and setting the number of neuron outputs by the training model according to the length of the input label information, wherein each output is a two-class classifier;
and training the training model by using the sum of the cross entropies as a loss function of the training model until the output accuracy of the training model meets the requirement of preset accuracy, and storing the training model.
3. The method of claim 2, wherein the parameters of the 2-dimensional convolutional neural network model are as follows: the input is the number of characters in the captcha image, the number of convolution kernels is M (M-dimensional features are extracted for each input), the size of the convolution kernels is N × N, the channel value is L, the boundary processing mode (padding) value is SAME, and the pooling function adopts average pooling (avg_pool) with C × C pooling, i.e. each C × C block of pixels is reduced to a single value; the boundary processing mode (padding) value for pooling is likewise SAME.
4. The method of claim 3, wherein M is 32, N is 5, L is 1, and C is 2.
5. An automatic identification system for an inverted Chinese character verification code is characterized by comprising:
the character segmentation module is used for processing a verification code image to be identified to obtain a target verification code image, acquiring the left and right boundaries of each character in the target verification code image, and segmenting the target verification code image according to the left and right boundaries of each character to obtain single-character images;
the tag information generating module is used for generating tag information corresponding to the target verification code image, wherein the tag information is represented by binary, 0 represents that characters in a single character image are normal, and 1 represents that characters in the single character image are inverted;
the input module is used for transmitting the length of the label information as a parameter to a trained automatic inverted Chinese character recognition model, and the trained automatic inverted Chinese character recognition model sets the number of neuron outputs to be consistent with the length of the label information, wherein each output is a two-class classifier; the automatic identification model of the inverted Chinese characters is a 2-dimensional convolutional neural network model, and the 2-dimensional convolutional neural network model adopts multiple convolutional layers, multiple pooling layers and a fully connected layer;
And the recognition module is used for training the loss function by using the sum of cross entropies as the loss function of the trained automatic inverted Chinese character recognition model until the loss function is converged and outputting a recognition result, wherein the number of the cross entropies is the same as the number of outputs.
6. the system of claim 5, wherein the character segmentation module is further configured to process each sample image in the test sample image, to obtain left and right boundaries of each character in each processed sample image, and to segment each sample image by the left and right boundaries of each character in each sample image to obtain a single character image corresponding to each sample image;
The label information generating module is further used for respectively generating label information corresponding to each sample image, wherein the label information of each sample image is represented by binary, 0 represents that characters in a single-character image in the corresponding sample image are normal, and 1 represents that characters in the single-character image in the corresponding sample image are inverted;
The input module is further used for sequentially transmitting the length of the label information of each sample image as a parameter to a training model, and the training model sets the output number of the neurons according to the length of the input label information, wherein each output is a two-class classifier;
and the model training module is used for training the training model by using the sum of cross entropies as a loss function of the training model until the output accuracy of the training model meets the requirement of preset accuracy, and storing the training model.
7. The system of claim 6, wherein the parameters of the 2-dimensional convolutional neural network model are as follows: the input is the number of characters in the captcha image, the number of convolution kernels is M (M-dimensional features are extracted for each input), the size of the convolution kernels is N × N, the channel value is L, the boundary processing mode (padding) value is SAME, and the pooling function adopts average pooling (avg_pool) with C × C pooling, i.e. each C × C block of pixels is reduced to a single value; the boundary processing mode (padding) value for pooling is likewise SAME.
8. the system of claim 7, wherein M is 32, N is 5, L is 1, and C is 2.
CN201710599718.0A 2017-07-21 2017-07-21 Automatic identification method and system for inverted Chinese character verification code Expired - Fee Related CN107292307B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710599718.0A CN107292307B (en) 2017-07-21 2017-07-21 Automatic identification method and system for inverted Chinese character verification code

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710599718.0A CN107292307B (en) 2017-07-21 2017-07-21 Automatic identification method and system for inverted Chinese character verification code

Publications (2)

Publication Number Publication Date
CN107292307A CN107292307A (en) 2017-10-24
CN107292307B true CN107292307B (en) 2019-12-17

Family

ID=60102053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710599718.0A Expired - Fee Related CN107292307B (en) 2017-07-21 2017-07-21 Automatic identification method and system for inverted Chinese character verification code

Country Status (1)

Country Link
CN (1) CN107292307B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107992527B (en) * 2017-11-09 2021-01-29 武汉极意网络科技有限公司 Data mark checking method, server and storage medium
CN107886128A (en) * 2017-11-10 2018-04-06 广东工业大学 A kind of shuttlecock recognition methods, system, medium and equipment
CN107967475B (en) * 2017-11-16 2020-04-14 广州探迹科技有限公司 Verification code identification method based on window sliding and convolutional neural network
CN108537115B (en) * 2018-03-02 2022-01-25 创新先进技术有限公司 Image recognition method and device and electronic equipment
CN108572593B (en) * 2018-04-27 2020-12-18 北京源码矩阵科技有限公司 Cross-platform convolutional neural network control system and method and information data processing terminal
CN108764242A (en) * 2018-05-21 2018-11-06 浙江工业大学 Off-line Chinese Character discrimination body recognition methods based on deep layer convolutional neural networks
CN111753575A (en) * 2019-03-26 2020-10-09 杭州海康威视数字技术股份有限公司 Text recognition method, device and equipment
CN111178162B (en) * 2019-12-12 2023-11-07 北京迈格威科技有限公司 Image recognition method, device, computer equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102419817A (en) * 2010-09-27 2012-04-18 贵州黔驰电力信息技术有限公司 Automatic document scanning, analyzing and processing system based on intelligent image identification
CN105335754A (en) * 2015-10-29 2016-02-17 小米科技有限责任公司 Character recognition method and device
CN106096602A (en) * 2016-06-21 2016-11-09 苏州大学 A kind of Chinese licence plate recognition method based on convolutional neural networks
CN106250756A (en) * 2016-07-29 2016-12-21 智者四海(北京)技术有限公司 Generation method, verification method and the related device of identifying code
CN106446897A (en) * 2016-09-09 2017-02-22 浪潮软件股份有限公司 Hollow verification code identification method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8774558B2 (en) * 2010-11-29 2014-07-08 Microsoft Corporation Rectification of characters and text as transform invariant low-rank textures

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102419817A (en) * 2010-09-27 2012-04-18 贵州黔驰电力信息技术有限公司 Automatic document scanning, analyzing and processing system based on intelligent image identification
CN105335754A (en) * 2015-10-29 2016-02-17 小米科技有限责任公司 Character recognition method and device
CN106096602A (en) * 2016-06-21 2016-11-09 苏州大学 A kind of Chinese licence plate recognition method based on convolutional neural networks
CN106250756A (en) * 2016-07-29 2016-12-21 智者四海(北京)技术有限公司 Generation method, verification method and the related device of identifying code
CN106446897A (en) * 2016-09-09 2017-02-22 浪潮软件股份有限公司 Hollow verification code identification method

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
A Low-cost Attack on a Microsoft CAPTCHA;Jeff Yan 等;《School of Computing Science》;20081231;第1-19页 *
A New Segmentation Method for Connected Characters in CAPTCHA;Pengpeng Lu 等;《2015 International Conference on Control, Automation and Information Sciences》;20151031;第128-131页 *
Fast detection algorithm for inverted Chinese text images; Zeng Fanfeng et al.; Computer Engineering and Design; 2012-09-30; Vol. 33, No. 9; pp. 3512-3516 *
Application and research of convolutional neural networks in verification code recognition; Liu Huan et al.; Computer Engineering and Applications; 2016-12-31; Vol. 52, No. 18; pp. 1-7 *
Verification code recognition for railway freight websites based on convolutional neural networks; Chen Chao et al.; Command Information System and Technology; 2016-08-31; Vol. 7, No. 4; pp. 91-96 *
Text image inversion judgment algorithm based on stroke features of Chinese characters; Wang Jingzhong et al.; Computer Technology and Development; 2014-05-31; Vol. 24, No. 5; pp. 129-133 *
Research on digit verification code recognition based on neural networks; Lü Gang et al.; Journal of Zhejiang University of Technology; 2010-08-31; Vol. 38, No. 4; pp. 433-436 *

Also Published As

Publication number Publication date
CN107292307A (en) 2017-10-24

Similar Documents

Publication Publication Date Title
CN107292307B (en) Automatic identification method and system for inverted Chinese character verification code
CN111401372B (en) Method for extracting and identifying image-text information of scanned document
CN107622274B (en) Neural network training method and device for image processing and computer equipment
CN106815560B (en) Face recognition method applied to self-adaptive driving seat
CN108280426B (en) Dark light source expression identification method and device based on transfer learning
CN106127817B (en) A kind of image binaryzation method based on channel
CN109740572A (en) A kind of human face in-vivo detection method based on partial color textural characteristics
CN109145964B (en) Method and system for realizing image color clustering
CN110930296A (en) Image processing method, device, equipment and storage medium
CN103530625A (en) Optical character recognition method based on digital image processing
CN108647696B (en) Picture color value determining method and device, electronic equipment and storage medium
CN110991434B (en) Self-service terminal certificate identification method and device
CN112633221A (en) Face direction detection method and related device
US7620246B2 (en) Method and apparatus for image processing
CN113723410B (en) Digital identification method and device for nixie tube
US10764471B1 (en) Customized grayscale conversion in color form processing for text recognition in OCR
CN109657544B (en) Face detection method and device
CN108491820B (en) Method, device and equipment for identifying limb representation information in image and storage medium
JPH11306325A (en) Method and device for object detection
CN109766860A (en) Method for detecting human face based on improved Adaboost algorithm
KR20030091471A (en) YCrCb color based human face location detection method
CN111754459B (en) Dyeing fake image detection method based on statistical depth characteristics and electronic device
CN110348530B (en) Method for identifying lipstick number
CN108764106B (en) Multi-scale color image face comparison method based on cascade structure
CN111242047A (en) Image processing method and apparatus, electronic device, and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
Granted publication date: 20191217
Termination date: 20200721