CN103218613A - Method and device for identifying handwritten form figures - Google Patents


Info

Publication number
CN103218613A
Authority
CN
China
Prior art keywords
image
partial derivative
digital
pixel
recognized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2013101230858A
Other languages
Chinese (zh)
Other versions
CN103218613B (en)
Inventor
张莉
周伟达
王晓乾
何书萍
王邦军
杨季文
李凡长
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University
Priority to CN201310123085.8A priority Critical patent/CN103218613B/en
Publication of CN103218613A publication Critical patent/CN103218613A/en
Application granted granted Critical
Publication of CN103218613B publication Critical patent/CN103218613B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a method and device for recognizing handwritten digits. The method includes the following steps: determining an image to be recognized; determining at least three pixel features of a specific pixel point in the image to be recognized according to the gray values of the pixel points; determining the corresponding covariances of the image to be recognized from the at least three pixel features, one covariance per feature; calculating the distance between each covariance of the image to be recognized and the Lie group mean corresponding to each class of digit category label contained in a preset training image set; for each covariance, determining the digit category label corresponding to the minimum of its distances as a candidate digit category label; and determining the most frequent label among the candidate labels as the digit category label of the image to be recognized. The method can thereby effectively improve the recognition accuracy of handwritten digits.

Description

Handwritten digit recognition method and device
Technical Field
The invention relates to the technical field of handwritten digit recognition, and in particular to a handwritten digit recognition method and device.
Background
The handwriting of Arabic numerals, a symbol common throughout the world, appears frequently in fields such as postal systems, bank checks, and industrial applications. With the rapid development of computer technology and digital image processing technology, handwritten digit recognition has been widely applied and brings great convenience to people's work.
Since numbers in these fields often represent precise values, and small errors can have unpredictable consequences, a simple, efficient, and highly accurate handwritten digit recognition method has long been an important research direction.
With the popularization of machine learning technology, many physicists and chemists have begun to apply Lie group theory to data in related fields; accordingly, in the field of handwritten digit recognition, Lie group structured data has been widely used because of its good mathematical structure.
The Lie group mean classifier (LieMeans) is a simple and effective classification method proposed by J. A. Hartigan et al. in the article "A K-Means Clustering Algorithm". However, it selects a single covariance feature to perform classification, so the solution found by the gradient descent method is only a local minimum, not necessarily a global minimum, and its performance is not good enough on multi-class problems.
Therefore, the existing handwritten digit recognition method based on the Lie group mean classifier, which classifies with a single covariance feature, cannot fully utilize the spatial information of the image to be recognized, resulting in low recognition accuracy.
Disclosure of Invention
In order to solve the above technical problems, embodiments of the present invention provide a method and an apparatus for recognizing handwritten digits so as to improve recognition accuracy. The technical scheme is as follows:
in one aspect, an embodiment of the present invention provides a handwritten digit recognition method, including:
determining an image to be recognized, wherein the image to be recognized comprises a digital category label to be recognized in a handwritten form;
determining at least three pixel characteristics of a specific pixel point in the image to be identified according to the gray value of the pixel point;
respectively determining corresponding covariance of the image to be identified according to at least three pixel characteristics of the specific pixel point, wherein each pixel characteristic uniquely corresponds to one covariance;
respectively calculating the distance between each covariance of the image to be identified and the corresponding lie group mean value of each class of digital class label in a preset training image set; each training image in the training image set comprises a handwritten form of digital class labels, the digital class labels contained in the training image set relate to all digital classes, each class of digital class label in the training image set corresponds to at least three lie group mean values, and each covariance of the image to be recognized corresponds to one lie group mean value of each class of digital class label;
respectively determining a digital category label corresponding to the minimum distance in the plurality of distances determined for each covariance of the image to be identified as a standby digital category label;
and determining the digital category label with the largest number in the standby digital category labels as the digital category label to be identified.
The formulas used to determine the three pixel features of a specific pixel point in the image to be recognized are:

$$\varphi_1(I,x,y)=\left(x,\ y,\ I(x,y),\ \left|\tfrac{\partial}{\partial x}I(x,y)\right|,\ \left|\tfrac{\partial}{\partial y}I(x,y)\right|\right)^T$$

$$\varphi_2(I,x,y)=\left(I(x,y),\ \left|\tfrac{\partial}{\partial x}I(x,y)\right|,\ \left|\tfrac{\partial}{\partial y}I(x,y)\right|,\ \left|\tfrac{\partial^2}{\partial x^2}I(x,y)\right|,\ \left|\tfrac{\partial^2}{\partial y^2}I(x,y)\right|\right)^T$$

$$\varphi_3(I,x,y)=\left(x,\ y,\ I(x,y),\ \left|\tfrac{\partial}{\partial x}I(x,y)\right|,\ \left|\tfrac{\partial}{\partial y}I(x,y)\right|,\ \sqrt{\left|\tfrac{\partial}{\partial x}I(x,y)\right|^2+\left|\tfrac{\partial}{\partial y}I(x,y)\right|^2},\ \left|\tfrac{\partial^2}{\partial x^2}I(x,y)\right|,\ \left|\tfrac{\partial^2}{\partial y^2}I(x,y)\right|,\ \arctan\frac{\left|\tfrac{\partial}{\partial x}I(x,y)\right|}{\left|\tfrac{\partial}{\partial y}I(x,y)\right|}\right)^T$$

where $\varphi_j(I,x,y)$ (j = 1, 2, 3) is the j-th pixel feature of pixel point (x, y) of the image to be recognized, $I(x,y)$ is the gray value at pixel point (x, y), $\tfrac{\partial}{\partial x}I(x,y)$ and $\tfrac{\partial}{\partial y}I(x,y)$ are the first-order partial derivatives in the x and y directions at (x, y), $\tfrac{\partial^2}{\partial x^2}I(x,y)$ and $\tfrac{\partial^2}{\partial y^2}I(x,y)$ are the second-order partial derivatives in the x and y directions, $1\le x\le m$ where m is the number of row pixels of the image to be recognized, $1\le y\le n$ where n is the number of column pixels, and T denotes matrix transposition.
The formula used to determine the covariance of the image to be recognized corresponding to each of the three pixel features is:

$$C_j=\frac{1}{mn}\sum_{x=1}^{m}\sum_{y=1}^{n}\left(\varphi_j(I,x,y)-\bar{\varphi}_j(I)\right)\left(\varphi_j(I,x,y)-\bar{\varphi}_j(I)\right)^T,\quad j=1,2,3$$

where $C_j$ is the covariance corresponding to the j-th pixel feature, $\bar{\varphi}_j(I)$ is the mean of the j-th pixel feature over the image to be recognized, and T denotes matrix transposition.
The formula used to calculate the distance between each covariance of the image to be recognized and the corresponding Lie group mean of each class of digit category label in the preset training image set is:

$$d_k^j\left(C_j,m_k^j\right)=\sum_{i=1}^{d_j}\ln^2(\lambda_i),\quad k=1,\dots,c,\ j=1,2,3$$

where $m_k^j$ is the j-th Lie group mean corresponding to the k-th class of digit category label, c is the number of digit category classes, $\lambda_i$ is the i-th generalized eigenvalue of $C_j$ and $m_k^j$, and $d_j$ is the number of rows (or columns) of the covariance feature matrix.
The determining method of the mean value of the lie group of each type of digital category label included in the preset training image set comprises the following steps:
determining three pixel characteristics of each training image in the preset training image set according to the gray value of the pixel point;
determining corresponding covariance for each training image according to three pixel characteristics of the specific pixel point, wherein each pixel characteristic uniquely corresponds to one covariance;
and inputting the covariance corresponding to all the training images and related to the same pixel feature into a corresponding lie group mean classifier so as to determine the lie group mean value of each class of digital class labels and related to the pixel feature.
Wherein, the specific pixel point in the image to be identified comprises:
all pixel points in the image to be identified;
or,
and partial pixel points in the image to be recognized are pixel points of a handwriting area in the image to be recognized, and the handwriting area is a partial image area in the image to be recognized.
In another aspect, an embodiment of the present invention provides a handwritten digit recognition apparatus, including:
the device comprises a to-be-recognized image determining module, a recognition module and a recognition module, wherein the to-be-recognized image determining module is used for determining an image to be recognized, and the image to be recognized comprises a to-be-recognized digital category label in a handwritten form;
the pixel characteristic determining module is used for determining at least three pixel characteristics of a specific pixel point in the image to be identified according to the gray value of the pixel point;
the covariance determination module is used for respectively determining corresponding covariance of the image to be identified according to at least three pixel characteristics of the specific pixel point, wherein each pixel characteristic uniquely corresponds to one covariance;
the distance determining module is used for respectively calculating the distance between each covariance of the image to be identified and the corresponding lie group mean value of each type of digital class label in a preset training image set; each training image in the training image set comprises a handwritten form of digital class labels, the digital class labels contained in the training image set relate to all digital classes, each class of digital class label in the training image set corresponds to at least three lie group mean values, and each covariance of the image to be recognized corresponds to one lie group mean value of each class of digital class label;
a standby label determining module, configured to determine, as standby digital category labels, digital category labels corresponding to minimum distances among the distances determined for each covariance of the image to be identified, respectively;
and the to-be-identified label determining module is used for determining the digital category label with the largest number in the standby digital category labels as the to-be-identified digital category label.
In the scheme, at least three covariance characteristics are determined by using at least three pixel characteristics of a specific pixel point of an image to be recognized, and the category determination of handwritten figures is realized by using the determined at least three covariance characteristics. Therefore, compared with the mode of realizing classification by adopting a single covariance characteristic in the prior art, the scheme fully utilizes the spatial information of the image to be recognized, and therefore, the recognition accuracy of the handwritten form number can be effectively improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a first flowchart of a handwritten digit recognition method according to an embodiment of the present invention;
FIG. 2 is a second flowchart of a handwritten digit recognition method according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a handwritten digit recognition apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to improve the identification accuracy of handwritten numbers, the embodiment of the invention provides a method and a device for identifying handwritten numbers.
First, a method for recognizing handwritten numbers according to an embodiment of the present invention will be described.
As shown in fig. 1, a method of handwritten digit recognition may include:
s101, determining an image to be identified;
when handwritten numbers need to be recognized, an image to be recognized, which contains a digital category label to be recognized in a handwritten form, is determined first, and then subsequent processing is performed on the basis of the image to be recognized.
It should be noted that a digit category label is a specific number, and the labels fall into 10 classes: 0-9. The digit category label to be recognized may be any number from 0 to 9, and its handwritten form is not limited to a single style.
S102, determining at least three pixel characteristics of a specific pixel point in the image to be identified according to the gray value of the pixel point;
the stroke positions are beneficial to identifying numbers, and the gray value of the pixel point at the position with the stroke is different from other positions, so that the gray value of the pixel point can be used as important space information of the handwritten numbers to be identified. Therefore, after the image to be recognized is determined, in order to fully utilize the spatial information of the image to be recognized, at least three pixel characteristics of a specific pixel point in the image to be recognized can be determined according to the gray value of the pixel point.
It should be noted that, in order to ensure higher accuracy, a specific pixel point in the image to be recognized may include: and all the pixel points in the image to be identified. Further, in order to improve the processing efficiency on the premise of ensuring higher accuracy, the specific pixel points in the image to be recognized may include: and partial pixel points in the image to be recognized are pixel points of a handwriting area in the image to be recognized, and the handwriting area is a partial image area in the image to be recognized.
S103, respectively determining the corresponding covariance of the image to be identified according to at least three pixel characteristics of the specific pixel point;
wherein each pixel feature uniquely corresponds to one covariance.
After at least three pixel characteristics of a specific pixel point of the image to be recognized are determined, the covariance of the image to be recognized with respect to each pixel characteristic can be determined, and then the determined at least three covariances are utilized for subsequent processing.
S104, respectively calculating the distance between each covariance of the image to be recognized and the corresponding lie group mean value of each class of digital class label in a preset training image set;
each training image in the training image set comprises a handwritten digital class label, the digital class labels contained in the training image set relate to all digital classes, each class of digital class label in the training image set corresponds to at least three lie group mean values, and each covariance of the image to be recognized corresponds to one lie group mean value of each class of digital class label.
It will be appreciated that in practical applications, the number class labels included in the training image set relate to 10 numbers: 0 to 9; moreover, the number of the training images corresponding to each type of digital category label included in the preset training image set may be different or the same.
Further, it should be noted that the determining manner of the at least three lie group mean values corresponding to each class of digital class label in the training image set may include:
a. determining at least three pixel characteristics of each training image in the preset training image set according to the gray value of the pixel point;
b. determining corresponding covariance for each training image according to at least three pixel characteristics of the specific pixel point, wherein each pixel characteristic uniquely corresponds to one covariance;
c. and inputting the covariance corresponding to all the training images and related to the same pixel feature into a corresponding lie group mean classifier so as to determine the lie group mean value of each class of digital class labels and related to the pixel feature.
By the above method, each class of digital class label in the training image set corresponds to at least three lie group mean values.
S105, respectively determining a digital category label corresponding to the minimum distance in the plurality of distances determined for each covariance of the image to be identified as a standby digital category label;
and S106, determining the digital category label with the largest number in the spare digital category labels as the digital category label to be identified.
The smaller the distance between a covariance of the image to be recognized and the corresponding Lie group mean of a digit category label, the greater the probability that the label to be recognized is that label. Therefore, after the distances between each covariance of the image to be recognized and the corresponding Lie group means of every class of digit category label in the preset training image set are determined, the digit category label corresponding to the minimum distance among the distances determined for each covariance can be taken as a candidate digit category label, and the most frequent label among the candidates is then determined as the digit category label to be recognized.
In the scheme, at least three covariance characteristics are determined by using at least three pixel characteristics of a specific pixel point of an image to be recognized, and the category determination of handwritten figures is realized by using the determined at least three covariance characteristics. Therefore, compared with the mode of realizing classification by adopting a single covariance characteristic in the prior art, the scheme fully utilizes the spatial information of the image to be recognized, and therefore, the recognition accuracy of the handwritten form number can be effectively improved.
The method for recognizing handwritten numbers provided by the embodiment of the invention is described below by taking three covariance characteristics as an example.
As shown in fig. 2, a method of handwritten digit recognition may include:
s201, determining an image to be identified;
when handwritten numbers need to be recognized, an image to be recognized, which contains a digital category label to be recognized in a handwritten form, is determined first, and then subsequent processing is performed on the basis of the image to be recognized.
It should be noted that a digit category label is a specific number, and the labels fall into 10 classes: 0-9. The digit category label to be recognized may be any number from 0 to 9, and its handwritten form is not limited to a single style.
S202, determining three pixel characteristics of a specific pixel point in the image to be identified according to the gray value of the pixel point;
the stroke positions are beneficial to identifying numbers, and the gray value of the pixel point at the position with the stroke is different from other positions, so that the gray value of the pixel point can be used as important space information of the handwritten numbers to be identified. Therefore, after the image to be recognized is determined, in order to fully utilize the spatial information of the image to be recognized, three pixel characteristics of a specific pixel point in the image to be recognized can be determined according to the gray value of the pixel point.
It should be noted that, in order to ensure higher accuracy, a specific pixel point in the image to be recognized may include: and all the pixel points in the image to be identified. Further, in order to improve the processing efficiency on the premise of ensuring higher accuracy, the specific pixel points in the image to be recognized may include: and partial pixel points in the image to be recognized are pixel points of a handwriting area in the image to be recognized, and the handwriting area is a partial image area in the image to be recognized.
When determining the three pixel features of a specific pixel point in the image to be recognized, the formulas used are:

$$\varphi_1(I,x,y)=\left(x,\ y,\ I(x,y),\ \left|\tfrac{\partial}{\partial x}I(x,y)\right|,\ \left|\tfrac{\partial}{\partial y}I(x,y)\right|\right)^T\qquad(1)$$

$$\varphi_2(I,x,y)=\left(I(x,y),\ \left|\tfrac{\partial}{\partial x}I(x,y)\right|,\ \left|\tfrac{\partial}{\partial y}I(x,y)\right|,\ \left|\tfrac{\partial^2}{\partial x^2}I(x,y)\right|,\ \left|\tfrac{\partial^2}{\partial y^2}I(x,y)\right|\right)^T\qquad(2)$$

$$\varphi_3(I,x,y)=\left(x,\ y,\ I(x,y),\ \left|\tfrac{\partial}{\partial x}I(x,y)\right|,\ \left|\tfrac{\partial}{\partial y}I(x,y)\right|,\ \sqrt{\left|\tfrac{\partial}{\partial x}I(x,y)\right|^2+\left|\tfrac{\partial}{\partial y}I(x,y)\right|^2},\ \left|\tfrac{\partial^2}{\partial x^2}I(x,y)\right|,\ \left|\tfrac{\partial^2}{\partial y^2}I(x,y)\right|,\ \arctan\frac{\left|\tfrac{\partial}{\partial x}I(x,y)\right|}{\left|\tfrac{\partial}{\partial y}I(x,y)\right|}\right)^T\qquad(3)$$

where $\varphi_j(I,x,y)$ (j = 1, 2, 3) is the j-th pixel feature of pixel point (x, y) of the image to be recognized, $I(x,y)$ is the gray value at pixel point (x, y), $\tfrac{\partial}{\partial x}I(x,y)$ and $\tfrac{\partial}{\partial y}I(x,y)$ are the first-order partial derivatives in the x and y directions at (x, y), $\tfrac{\partial^2}{\partial x^2}I(x,y)$ and $\tfrac{\partial^2}{\partial y^2}I(x,y)$ are the second-order partial derivatives in the x and y directions, $1\le x\le m$ where m is the number of row pixels of the image to be recognized, $1\le y\le n$ where n is the number of column pixels, and T denotes matrix transposition.
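As a concrete illustration, formulas (1)-(3) can be computed with NumPy. This is a sketch under stated assumptions: the patent fixes no derivative discretization, so `np.gradient` is used; a small `eps` term (not in the original) guards the arctan ratio against division by zero; and the function name is illustrative.

```python
import numpy as np

def pixel_features(I):
    """Per-pixel features phi_1, phi_2, phi_3 of formulas (1)-(3).

    x indexes rows (1..m) and y indexes columns (1..n), matching the
    patent's convention. Returns arrays of shape (m*n, 5), (m*n, 5),
    (m*n, 9), one row per pixel point.
    """
    I = np.asarray(I, dtype=float)
    m, n = I.shape
    Ix = np.gradient(I, axis=0)      # first-order partial in x (rows)
    Iy = np.gradient(I, axis=1)      # first-order partial in y (columns)
    Ixx = np.gradient(Ix, axis=0)    # second-order partial in x
    Iyy = np.gradient(Iy, axis=1)    # second-order partial in y
    X, Y = np.meshgrid(np.arange(1, m + 1), np.arange(1, n + 1),
                       indexing="ij")
    aIx, aIy, eps = np.abs(Ix), np.abs(Iy), 1e-12
    phi1 = np.stack([X, Y, I, aIx, aIy], axis=-1).reshape(-1, 5)
    phi2 = np.stack([I, aIx, aIy, np.abs(Ixx), np.abs(Iyy)],
                    axis=-1).reshape(-1, 5)
    phi3 = np.stack([X, Y, I, aIx, aIy,
                     np.sqrt(aIx ** 2 + aIy ** 2),
                     np.abs(Ixx), np.abs(Iyy),
                     np.arctan(aIx / (aIy + eps))],
                    axis=-1).reshape(-1, 9)
    return phi1, phi2, phi3
```

For a 28 x 28 MNIST-sized image this yields 784 feature vectors per feature type, ready for the covariance computation of the next step.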
S203, respectively determining the corresponding covariance of the image to be identified according to the three pixel characteristics of the specific pixel point;
wherein each pixel feature uniquely corresponds to one covariance.
After three pixel characteristics of a specific pixel point of the image to be recognized are determined, the covariance of the image to be recognized about each pixel characteristic can be determined, and then the determined three covariances are utilized for subsequent processing.
The formula used to determine the covariance of the image to be recognized corresponding to each of the three pixel features is:

$$C_j=\frac{1}{mn}\sum_{x=1}^{m}\sum_{y=1}^{n}\left(\varphi_j(I,x,y)-\bar{\varphi}_j(I)\right)\left(\varphi_j(I,x,y)-\bar{\varphi}_j(I)\right)^T,\quad j=1,2,3\qquad(4)$$

where $C_j$ is the covariance corresponding to the j-th pixel feature, $\bar{\varphi}_j(I)$ is the mean of the j-th pixel feature over the image to be recognized, and T denotes matrix transposition.
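Formula (4) is the biased sample covariance of the per-pixel feature vectors; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def covariance_feature(phi):
    # phi: (m*n, d) array whose rows are phi_j(I, x, y) for every
    # specific pixel point. Implements formula (4):
    # C_j = (1/mn) * sum (phi - phi_bar)(phi - phi_bar)^T
    mu = phi.mean(axis=0)
    D = phi - mu
    return D.T @ D / phi.shape[0]
```

For a feature matrix `phi` of shape (m·n, d) this is identical to `np.cov(phi, rowvar=False, bias=True)`.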
S204, respectively calculating the distance between each covariance of the image to be recognized and the corresponding lie group mean value of each class of digital class label in a preset training image set;
each training image in the training image set comprises a handwritten digital class label, the digital class labels contained in the training image set relate to all digital classes, each class of digital class label in the training image set corresponds to three lie group mean values, and each covariance of the image to be recognized corresponds to one lie group mean value of each class of digital class label.
It will be appreciated that in practical applications, the number class labels included in the training image set relate to 10 numbers: 0 to 9; moreover, the number of the training images corresponding to each type of digital category label included in the preset training image set may be different or the same.
The formula used to calculate the distance between each covariance of the image to be recognized and the corresponding Lie group mean of each class of digit category label in the preset training image set is:

$$d_k^j\left(C_j,m_k^j\right)=\sum_{i=1}^{d_j}\ln^2(\lambda_i),\quad k=1,\dots,c,\ j=1,2,3\qquad(5)$$

where $m_k^j$ is the j-th Lie group mean corresponding to the k-th class of digit category label, c is the number of digit category classes, $\lambda_i$ is the i-th generalized eigenvalue of $C_j$ and $m_k^j$, and $d_j$ is the number of rows (or columns) of the covariance feature matrix.
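Formula (5) can be evaluated from the generalized eigenvalues of the pair of symmetric positive-definite matrices. The sketch below assumes the sum-of-squared-logs reading of the formula and adds a small ridge term (not in the patent) for numerical safety:

```python
import numpy as np
from scipy.linalg import eigh

def lie_distance(C, M, ridge=1e-6):
    """Distance of formula (5) between covariance C and Lie group
    mean M: the sum of squared logarithms of the generalized
    eigenvalues of (C, M). The small ridge keeps both matrices
    strictly positive definite."""
    d = C.shape[0]
    lam = eigh(C + ridge * np.eye(d), M + ridge * np.eye(d),
               eigvals_only=True)
    return float(np.sum(np.log(lam) ** 2))
```

Because the generalized eigenvalues of (M, C) are the reciprocals of those of (C, M), this quantity is symmetric in its arguments and vanishes when C = M.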
Further, the determining method of the mean of the lie groups of each type of digital category label included in the preset training image set may include:
a. determining three pixel characteristics of each training image in the preset training image set according to the gray value of the pixel point;
b. determining corresponding covariance for each training image according to three pixel characteristics of the specific pixel point, wherein each pixel characteristic uniquely corresponds to one covariance;
c. and inputting the covariance corresponding to all the training images and related to the same pixel feature into a corresponding lie group mean classifier so as to determine the lie group mean value of each class of digital class labels and related to the pixel feature.
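The steps above delegate the mean computation to a Lie group mean classifier without giving its formula in this excerpt. As a hedged stand-in, the log-Euclidean mean exp(mean_i log C_i), a common way to average symmetric positive-definite matrices, can be sketched as:

```python
import numpy as np

def _spd_log(C):
    # matrix logarithm of a symmetric positive-definite matrix
    w, V = np.linalg.eigh(C)
    return (V * np.log(w)) @ V.T

def _spd_exp(S):
    # matrix exponential of a symmetric matrix
    w, V = np.linalg.eigh(S)
    return (V * np.exp(w)) @ V.T

def lie_group_mean(covs):
    # Log-Euclidean mean of SPD covariance matrices -- one common
    # realization, used here only as a sketch of the unspecified
    # "Lie group mean classifier" averaging step.
    return _spd_exp(np.mean([_spd_log(C) for C in covs], axis=0))
```

For commuting matrices this reduces to the element-wise geometric mean of the eigenvalues, e.g. the mean of diag(1, 4) and diag(4, 1) is diag(2, 2).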
It should be noted that, for the training image, the calculation formulas for determining the three pixel features of the specific pixel point are calculation formulas (1), (2) and (3), and the calculation formula for determining the three covariances of the training image is calculation formula (4).
S205, respectively determining a digital category label corresponding to the minimum distance in the plurality of distances determined for each covariance of the image to be identified as a standby digital category label;
and S206, determining the digital category label with the largest number in the spare digital category labels as the digital category label to be identified.
The smaller the distance between a covariance of the image to be recognized and the corresponding Lie group mean of a digit category label, the greater the probability that the label to be recognized is that label. Therefore, after the distances between each covariance of the image to be recognized and the corresponding Lie group means of every class of digit category label in the preset training image set are determined, the digit category label corresponding to the minimum distance among the distances determined for each covariance can be taken as a candidate digit category label, and the most frequent label among the candidates is then determined as the digit category label to be recognized.
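Steps S204-S206 together amount to one nearest-mean vote per covariance followed by a majority decision. A self-contained sketch (the data layout and function names are assumptions, and the distance follows the sum-of-squared-logs reading of formula (5)):

```python
import numpy as np
from collections import Counter
from scipy.linalg import eigh

def lie_distance(C, M, ridge=1e-6):
    # formula (5): sum of squared logs of the generalized
    # eigenvalues of (C, M); ridge added for numerical safety
    d = C.shape[0]
    lam = eigh(C + ridge * np.eye(d), M + ridge * np.eye(d),
               eigvals_only=True)
    return float(np.sum(np.log(lam) ** 2))

def classify(covs, means):
    """covs: the (at least three) covariances C_j of the image to be
    recognized; means[k][j]: the j-th Lie group mean of digit class k.
    Each C_j votes for its nearest class (S204/S205); the majority
    label wins (S206), ties going to the earliest vote."""
    votes = []
    for j, C in enumerate(covs):
        dists = [lie_distance(C, means[k][j]) for k in range(len(means))]
        votes.append(int(np.argmin(dists)))
    return Counter(votes).most_common(1)[0][0]
```

With three covariances and ten classes, the final label is simply the class collecting at least two of the three votes, or the first-encountered class when all three votes differ.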
In the scheme, three covariance characteristics are determined by using three pixel characteristics of a specific pixel point of an image to be recognized, and the category determination of the handwritten form number is realized by using the determined three covariance characteristics. Therefore, compared with the mode of realizing classification by adopting a single covariance characteristic in the prior art, the scheme fully utilizes the spatial information of the image to be recognized, and therefore, the recognition accuracy of the handwritten form number can be effectively improved.
It should be noted that the preset training image set may be constructed by randomly drawing a number of training images for each digit category label from the MNIST handwritten digit data set. MNIST, a subset of the well-known U.S. data set NIST, is a common experimental data set for pattern recognition; it has a training set of 60,000 training images and a test set of 10,000 test images.
The training process corresponding to the handwritten digit recognition method based on three covariance characteristics provided by the embodiment of the invention is described as follows:
(1) Training image processing:
1) Determine a training image set $\{(I_i,l_i)\}_{i=1}^{N}$, where $I_i\in R^{m\times n}$ is the i-th training image, m and n are its row and column pixel counts, $l_i\in\{1,\dots,c\}$ is the digit category label of $I_i$ (i.e., which number $I_i$ represents), N is the total number of training images, and c is the number of digit category classes. Let m = n = 28 and N = 100c, and initialize i = 1.
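The sampling of N = 100c training images can be sketched as follows, assuming a pre-loaded MNIST-style `(images, labels)` array pair; the function name and seed are illustrative:

```python
import numpy as np

def sample_training_set(images, labels, per_class=100, seed=0):
    """Assemble the training set {(I_i, l_i)} with N = 100c images by
    drawing `per_class` images for each digit class 0..9, matching
    the setting m = n = 28, N = 100c above."""
    rng = np.random.default_rng(seed)
    idx = []
    for k in range(10):
        pool = np.flatnonzero(labels == k)   # indices of class k
        idx.extend(rng.choice(pool, size=per_class, replace=False))
    idx = np.asarray(idx)
    return images[idx], labels[idx]
```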
2) For the pixel point at position (x, y) in the training image I_i, extract the following three pixel features:

φ_1(I_i, x, y) = ( x, y, I_i(x, y), |∂I_i(x, y)/∂x|, |∂I_i(x, y)/∂y| )^T

φ_2(I_i, x, y) = ( I_i(x, y), |∂I_i(x, y)/∂x|, |∂I_i(x, y)/∂y|, |∂²I_i(x, y)/∂x²|, |∂²I_i(x, y)/∂y²| )^T

φ_3(I_i, x, y) = ( x, y, I_i(x, y), |∂I_i(x, y)/∂x|, |∂I_i(x, y)/∂y|, √(|∂I_i(x, y)/∂x|² + |∂I_i(x, y)/∂y|²), |∂²I_i(x, y)/∂x²|, |∂²I_i(x, y)/∂y²|, arctan(|∂I_i(x, y)/∂x| / |∂I_i(x, y)/∂y|) )^T

where φ_j(I_i, x, y) (j = 1, 2, 3) is the j-th pixel feature of the pixel point (x, y) of the training image I_i; I_i(x, y) is the gray value at the pixel point (x, y); ∂I_i(x, y)/∂x and ∂I_i(x, y)/∂y are the first-order partial derivatives in the x and y directions at the pixel point (x, y); ∂²I_i(x, y)/∂x² and ∂²I_i(x, y)/∂y² are the second-order partial derivatives in the x and y directions at the pixel point (x, y); 1 ≤ x ≤ 28, where m is the number of row pixels of the training image I_i; 1 ≤ y ≤ 28, where n is the number of column pixels of the training image I_i; and T denotes matrix transposition.
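The three per-pixel feature maps above can be sketched in Python/NumPy as follows. This is a sketch only: the patent does not fix the discrete derivative operator, so the use of central differences via `np.gradient` is an assumption, as is the small `eps` guarding the arctan against division by zero.

```python
import numpy as np

def pixel_features(img):
    """Per-pixel feature maps phi_1, phi_2, phi_3 for a grayscale image.

    Returns arrays of shape (m, n, 5), (m, n, 5), (m, n, 9).
    """
    m, n = img.shape
    Ix = np.abs(np.gradient(img, axis=0))   # |dI/dx|, x = row index
    Iy = np.abs(np.gradient(img, axis=1))   # |dI/dy|, y = column index
    Ixx = np.abs(np.gradient(np.gradient(img, axis=0), axis=0))  # |d2I/dx2|
    Iyy = np.abs(np.gradient(np.gradient(img, axis=1), axis=1))  # |d2I/dy2|
    # Pixel coordinates, 1-based as in the patent's 1 <= x <= m, 1 <= y <= n.
    xs, ys = np.meshgrid(np.arange(1, m + 1), np.arange(1, n + 1), indexing="ij")
    eps = 1e-12                              # assumption: avoid 0/0 in arctan
    phi1 = np.stack([xs, ys, img, Ix, Iy], axis=-1)
    phi2 = np.stack([img, Ix, Iy, Ixx, Iyy], axis=-1)
    phi3 = np.stack([xs, ys, img, Ix, Iy,
                     np.sqrt(Ix**2 + Iy**2), Ixx, Iyy,
                     np.arctan(Ix / (Iy + eps))], axis=-1)
    return phi1, phi2, phi3

f1, f2, f3 = pixel_features(np.random.default_rng(0).random((28, 28)))
```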
3) Determine the three covariances of the training image I_i from the three extracted pixel features according to the following formula:

C_i^j = (1/(mn)) Σ_{x=1}^{m} Σ_{y=1}^{n} ( φ_j(I_i, x, y) − φ̄_j(I_i) ) ( φ_j(I_i, x, y) − φ̄_j(I_i) )^T,  j = 1, 2, 3

where C_i^j is the covariance corresponding to the j-th pixel feature of the training image I_i, φ̄_j(I_i) is the mean value of the j-th pixel feature over the training image I_i, and T denotes matrix transposition.
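The covariance descriptor above is a direct outer-product average over all pixels; a minimal NumPy sketch (the function name is illustrative):

```python
import numpy as np

def covariance_descriptor(phi):
    """Covariance C^j of a per-pixel feature map, normalised by m*n.

    `phi` has shape (m, n, d); the result is the d x d covariance of the
    feature vectors over all m*n pixels.
    """
    m, n, d = phi.shape
    F = phi.reshape(m * n, d)
    mu = F.mean(axis=0)          # phi-bar: mean feature vector over the image
    D = F - mu
    return D.T @ D / (m * n)     # sum of outer products, divided by m*n

C = covariance_descriptor(np.random.default_rng(0).random((28, 28, 5)))
```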
4) If i = N, stop; otherwise set i = i + 1 and repeat steps 2) and 3).
5) Input the covariances C_i^j of all the training images into the j-th Lie group mean classifier to obtain the Lie group mean m_k^j of each class of digital class label, k = 1, …, c, j = 1, 2, 3.
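The patent does not spell out how the Lie group mean classifier computes m_k^j. One standard intrinsic mean for SPD covariance matrices is the log-Euclidean mean exp(mean(log C_i)); the sketch below implements that choice with NumPy only (the helper names are illustrative, and this is an assumption, not necessarily the patent's exact algorithm):

```python
import numpy as np

def _logm_spd(C):
    # Matrix logarithm of a symmetric positive-definite matrix via eigh.
    w, V = np.linalg.eigh(C)
    return (V * np.log(w)) @ V.T

def _expm_sym(S):
    # Matrix exponential of a symmetric matrix via eigh.
    w, V = np.linalg.eigh(S)
    return (V * np.exp(w)) @ V.T

def lie_group_mean(covs):
    """Log-Euclidean mean of SPD matrices: exp(mean(log(C_i)))."""
    return _expm_sym(np.mean([_logm_spd(C) for C in covs], axis=0))

# Mean of random SPD matrices of the form A A^T + I.
rng = np.random.default_rng(0)
covs = [(lambda A: A @ A.T + np.eye(5))(rng.random((5, 5))) for _ in range(10)]
m5 = lie_group_mean(covs)
```

The mean of identical matrices is recovered exactly, e.g. `lie_group_mean([2 * np.eye(3)] * 4)` gives 2·I.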
(2) Test image processing process:
1) Determine a test image I, where I ∈ R^{m×n};
2) Extract the pixel features φ_j(I, x, y), j = 1, 2, 3, from the pixel points (x, y) of the test image I, using the same formulas as for extracting the three pixel features from a training image;
3) Determine the three covariances C^j of the test image I from the extracted pixel features, using the same formula as for determining the three covariances of a training image.
(3) The recognition process comprises the following steps:
1) Calculate the distance between each covariance C^j of the test image and the Lie group mean m_k^j of each class of digital class label in the training image set, i.e.

d_k^j(C^j, m_k^j) = √( Σ_{i=1}^{d_j} ln²(λ_i) ),  k = 1, …, c, j = 1, 2, 3

where λ_i are the generalized eigenvalues of C^j and m_k^j, and d_j is the number of rows (or columns) of the j-th covariance matrix.
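This distance — the square root of the summed squared logarithms of the generalized eigenvalues of the two SPD matrices — can be computed with NumPy only via a Cholesky whitening of the mean matrix (the function name is illustrative):

```python
import numpy as np

def lie_group_distance(C, M):
    """sqrt(sum_i ln^2(lambda_i)), lambda_i the generalized eigenvalues of (C, M).

    For SPD M = L L^T, the generalized eigenvalues of (C, M) equal the
    ordinary eigenvalues of L^{-1} C L^{-T}.
    """
    L = np.linalg.cholesky(M)
    Linv = np.linalg.inv(L)
    lam = np.linalg.eigvalsh(Linv @ C @ Linv.T)  # generalized eigenvalues
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

d0 = lie_group_distance(np.eye(4), np.eye(4))        # identical matrices -> 0
d1 = lie_group_distance(2.0 * np.eye(4), np.eye(4))  # lambda_i = 2 for all i
```

With C = 2I and M = I, all four generalized eigenvalues are 2, so the distance is √(4·ln²2) = 2 ln 2.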
2) The classification result of the j-th Lie group mean classifier on the test image is q_j, i.e.

q_j = argmin_{k=1,…,c} d_k^j(C^j, m_k^j)
3) Combine the results of the three Lie group mean classifiers according to the majority voting criterion, and output the category of the test image.
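The majority-voting step over the three per-classifier labels q_1, q_2, q_3 can be sketched as follows. The tie-breaking rule (first label reaching the top count wins) is an assumption the text leaves open:

```python
from collections import Counter

def majority_vote(votes):
    """Return the most frequent label among the classifier outputs.

    Ties are broken in favour of the first label reaching the top count
    (an assumption; the patent does not specify tie-breaking).
    """
    counts = Counter(votes)
    best = max(counts.values())
    for v in votes:
        if counts[v] == best:
            return v

label = majority_vote([1, 9, 1])  # two classifiers say 1, one says 9
```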
The effect of the invention can be verified by the following experiment:
Randomly select 100 training images for each type of digital class label from the training set, and randomly select 200 test images from the test set; repeat this sampling process 10 times, the final output being the average over the 10 runs. The comparison methods in the experiment are the Lie group mean classifier and the present invention.
The whole experimental procedure included three groups of experiments: binary classification of the digits (1, 9), three-class classification of the digits (1, 7, 9), and four-class classification of the digits (1, 2, 7, 9).
The experimental results are shown in Table 1. The Lie group mean classifier yields four results: one for each of the three covariances, and their average. As can be seen from Table 1, for the binary, three-class, and four-class tasks, the misrecognition rate of the present invention is smaller than that of the Lie group mean classifier, so the present invention has a better recognition effect.
TABLE 1
Through the above description of the method embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods according to the embodiments of the present invention. And the aforementioned storage medium includes: various media that can store program codes, such as Read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and so on.
Corresponding to the above method embodiment, an embodiment of the present invention further provides a handwritten numeral recognition apparatus, as shown in fig. 3, which may include:
the to-be-recognized image determining module 110, configured to determine an image to be recognized, where the image to be recognized contains a handwritten digital category label to be recognized;
the pixel feature determining module 120, configured to determine at least three pixel features of specific pixel points in the image to be recognized according to the gray values of the pixel points;
the covariance determining module 130, configured to determine the corresponding covariances of the image to be recognized according to the at least three pixel features of the specific pixel points, where each pixel feature uniquely corresponds to one covariance;
the distance determining module 140, configured to calculate the distance between each covariance of the image to be recognized and the corresponding Lie group mean of each class of digital class label included in a preset training image set, where each training image in the training image set contains a handwritten digital class label, the digital class labels contained in the training image set cover all digital classes, each class of digital class label in the training image set corresponds to at least three Lie group means, and each covariance of the image to be recognized corresponds to one Lie group mean of each class of digital class label;
the spare label determining module 150, configured to determine, as a spare digital class label, the digital class label corresponding to the minimum distance among the distances determined for each covariance of the image to be recognized; and
the to-be-recognized label determining module 160, configured to determine the digital class label occurring most often among the spare digital class labels as the digital class label to be recognized.
In this scheme, at least three covariance features are determined from at least three pixel features of the specific pixel points of the image to be recognized, and the category of the handwritten digit is determined using these covariance features. Compared with the prior-art approach of classifying with a single covariance feature, the scheme makes fuller use of the spatial information of the image to be recognized, so the recognition accuracy of handwritten digits can be effectively improved.
The calculation formula according to which the pixel feature determining module 120 determines three pixel features of a specific pixel point in the image to be recognized may include:
φ_1(I, x, y) = ( x, y, I(x, y), |∂I(x, y)/∂x|, |∂I(x, y)/∂y| )^T

φ_2(I, x, y) = ( I(x, y), |∂I(x, y)/∂x|, |∂I(x, y)/∂y|, |∂²I(x, y)/∂x²|, |∂²I(x, y)/∂y²| )^T

φ_3(I, x, y) = ( x, y, I(x, y), |∂I(x, y)/∂x|, |∂I(x, y)/∂y|, √(|∂I(x, y)/∂x|² + |∂I(x, y)/∂y|²), |∂²I(x, y)/∂x²|, |∂²I(x, y)/∂y²|, arctan(|∂I(x, y)/∂x| / |∂I(x, y)/∂y|) )^T

where φ_j(I, x, y) (j = 1, 2, 3) is the j-th pixel feature of the pixel point (x, y) of the image to be recognized; I(x, y) is the gray value at the pixel point (x, y); ∂I(x, y)/∂x and ∂I(x, y)/∂y are the first-order partial derivatives in the x and y directions at the pixel point (x, y); ∂²I(x, y)/∂x² and ∂²I(x, y)/∂y² are the second-order partial derivatives in the x and y directions at the pixel point (x, y); 1 ≤ x ≤ m, where m is the number of row pixels of the image to be recognized; 1 ≤ y ≤ n, where n is the number of column pixels of the image to be recognized; and T denotes matrix transposition.
Correspondingly, the calculation formula according to which the covariance determination module 130 determines the respective covariance of the image to be identified according to the three pixel characteristics of the specific pixel point may include:
C^j = (1/(mn)) Σ_{x=1}^{m} Σ_{y=1}^{n} ( φ_j(I, x, y) − φ̄_j(I) ) ( φ_j(I, x, y) − φ̄_j(I) )^T,  j = 1, 2, 3

where C^j is the covariance corresponding to the j-th pixel feature, φ̄_j(I) is the mean value of the j-th pixel feature over the image to be recognized, and T denotes matrix transposition.
The calculation formula according to which the distance determining module 140 calculates the distance between each covariance of the image to be recognized and the corresponding Lie group mean of each class of digital class label included in the preset training image set may include:

d_k^j(C^j, m_k^j) = √( Σ_{i=1}^{d_j} ln²(λ_i) ),  k = 1, …, c, j = 1, 2, 3

where m_k^j is the j-th Lie group mean corresponding to the k-th class of digital class label, c is the number of classes of digital class labels, λ_i is a generalized eigenvalue of C^j and m_k^j, and d_j is the number of rows (or columns) of the covariance feature matrix.
For device or system embodiments, as they correspond substantially to method embodiments, reference may be made to the method embodiments for some of their descriptions. The above-described embodiments of the apparatus or system are merely illustrative, and the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways without departing from the spirit and scope of the present application. The embodiments are exemplary only and should not be taken as limiting the scope of the application. For example, the division into units or sub-units is only a logical functional division; in actual implementation there may be other divisions, for example, multiple units or sub-units may be combined. In addition, various elements or components may be combined or integrated into another system, or some features may be omitted or not implemented.
Additionally, the systems, apparatus, and methods described, as well as the illustrations of various embodiments, may be combined or integrated with other systems, modules, techniques, or methods without departing from the scope of the present application. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The foregoing is directed to embodiments of the present invention, and it is understood that various modifications and improvements can be made by those skilled in the art without departing from the spirit of the invention.

Claims (7)

1. A method for handwritten digit recognition, comprising:
determining an image to be recognized, wherein the image to be recognized comprises a digital category label to be recognized in a handwritten form;
determining at least three pixel characteristics of a specific pixel point in the image to be identified according to the gray value of the pixel point;
respectively determining corresponding covariance of the image to be identified according to at least three pixel characteristics of the specific pixel point, wherein each pixel characteristic uniquely corresponds to one covariance;
respectively calculating the distance between each covariance of the image to be recognized and the corresponding Lie group mean of each class of digital class label included in a preset training image set; where each training image in the training image set contains a handwritten digital class label, the digital class labels contained in the training image set cover all digital classes, each class of digital class label in the training image set corresponds to at least three Lie group means, and each covariance of the image to be recognized corresponds to one Lie group mean of each class of digital class label;
respectively determining a digital category label corresponding to the minimum distance in the plurality of distances determined for each covariance of the image to be identified as a standby digital category label;
and determining the digital category label with the largest number in the standby digital category labels as the digital category label to be identified.
2. The method according to claim 1, wherein the calculation formula according to which the three pixel characteristics of the specific pixel point in the image to be recognized are determined comprises:
φ_1(I, x, y) = ( x, y, I(x, y), |∂I(x, y)/∂x|, |∂I(x, y)/∂y| )^T

φ_2(I, x, y) = ( I(x, y), |∂I(x, y)/∂x|, |∂I(x, y)/∂y|, |∂²I(x, y)/∂x²|, |∂²I(x, y)/∂y²| )^T

φ_3(I, x, y) = ( x, y, I(x, y), |∂I(x, y)/∂x|, |∂I(x, y)/∂y|, √(|∂I(x, y)/∂x|² + |∂I(x, y)/∂y|²), |∂²I(x, y)/∂x²|, |∂²I(x, y)/∂y²|, arctan(|∂I(x, y)/∂x| / |∂I(x, y)/∂y|) )^T

where φ_j(I, x, y) (j = 1, 2, 3) is the j-th pixel feature of the pixel point (x, y) of the image to be recognized; I(x, y) is the gray value at the pixel point (x, y); ∂I(x, y)/∂x and ∂I(x, y)/∂y are the first-order partial derivatives in the x and y directions at the pixel point (x, y); ∂²I(x, y)/∂x² and ∂²I(x, y)/∂y² are the second-order partial derivatives in the x and y directions at the pixel point (x, y); 1 ≤ x ≤ m, where m is the number of row pixels of the image to be recognized; 1 ≤ y ≤ n, where n is the number of column pixels of the image to be recognized; and T denotes matrix transposition.
3. The method of claim 2, wherein the calculation formula according to which the respective covariances of the image to be recognized are respectively determined according to the three pixel characteristics of the specific pixel point comprises:
C^j = (1/(mn)) Σ_{x=1}^{m} Σ_{y=1}^{n} ( φ_j(I, x, y) − φ̄_j(I) ) ( φ_j(I, x, y) − φ̄_j(I) )^T,  j = 1, 2, 3

where C^j is the covariance corresponding to the j-th pixel feature, φ̄_j(I) is the mean value of the j-th pixel feature over the image to be recognized, and T denotes matrix transposition.
4. The method of claim 3, wherein the calculation formula for calculating the distance between each covariance of the image to be recognized and the corresponding Lie group mean of each class of digital class label included in the preset training image set comprises:

d_k^j(C^j, m_k^j) = √( Σ_{i=1}^{d_j} ln²(λ_i) ),  k = 1, …, c, j = 1, 2, 3

where m_k^j is the j-th Lie group mean corresponding to the k-th class of digital class label, c is the number of classes of digital class labels, λ_i is a generalized eigenvalue of C^j and m_k^j, and d_j is the number of rows (or columns) of the covariance feature matrix.
5. The method of claim 4, wherein determining the Lie group mean of each class of digital class label included in the preset training image set comprises:
determining three pixel characteristics of each training image in the preset training image set according to the gray value of the pixel point;
determining corresponding covariance for each training image according to three pixel characteristics of the specific pixel point, wherein each pixel characteristic uniquely corresponds to one covariance;
and inputting the covariances of all the training images that relate to the same pixel feature into the corresponding Lie group mean classifier, so as to determine, for that pixel feature, the Lie group mean of each class of digital class label.
6. The method of claim 1, wherein the specific pixel points in the image to be recognized comprise:
all pixel points in the image to be identified;
or,
or some of the pixel points in the image to be recognized, namely the pixel points of a handwriting area in the image to be recognized, where the handwriting area is a partial image region of the image to be recognized.
7. A handwritten number recognition device, comprising:
the device comprises a to-be-recognized image determining module, a recognition module and a recognition module, wherein the to-be-recognized image determining module is used for determining an image to be recognized, and the image to be recognized comprises a to-be-recognized digital category label in a handwritten form;
the pixel characteristic determining module is used for determining at least three pixel characteristics of a specific pixel point in the image to be identified according to the gray value of the pixel point;
the covariance determination module is used for respectively determining corresponding covariance of the image to be identified according to at least three pixel characteristics of the specific pixel point, wherein each pixel characteristic uniquely corresponds to one covariance;
the distance determining module, configured to calculate the distance between each covariance of the image to be recognized and the corresponding Lie group mean of each class of digital class label included in a preset training image set; where each training image in the training image set contains a handwritten digital class label, the digital class labels contained in the training image set cover all digital classes, each class of digital class label in the training image set corresponds to at least three Lie group means, and each covariance of the image to be recognized corresponds to one Lie group mean of each class of digital class label;
a standby label determining module, configured to determine, as standby digital category labels, digital category labels corresponding to minimum distances among the distances determined for each covariance of the image to be identified, respectively;
and the to-be-identified label determining module is used for determining the digital category label with the largest number in the standby digital category labels as the to-be-identified digital category label.
CN201310123085.8A 2013-04-10 2013-04-10 Handwritten Numeral Recognition Method and device Active CN103218613B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310123085.8A CN103218613B (en) 2013-04-10 2013-04-10 Handwritten Numeral Recognition Method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310123085.8A CN103218613B (en) 2013-04-10 2013-04-10 Handwritten Numeral Recognition Method and device

Publications (2)

Publication Number Publication Date
CN103218613A true CN103218613A (en) 2013-07-24
CN103218613B CN103218613B (en) 2016-04-20

Family

ID=48816382

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310123085.8A Active CN103218613B (en) 2013-04-10 2013-04-10 Handwritten Numeral Recognition Method and device

Country Status (1)

Country Link
CN (1) CN103218613B (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001059766A (en) * 1999-08-24 2001-03-06 Nippon Sharyo Seizo Kaisha Ltd Liquid-level meter of tank lorry
CN102722713A (en) * 2012-02-22 2012-10-10 苏州大学 Handwritten numeral recognition method based on lie group structure data and system thereof


Non-Patent Citations (1)

Title
SONG Yue-cong (宋曰聪) et al.: "A New Feature Extraction Scheme in Handwritten Digit Recognition Systems", Computer Science (《计算机科学》) *

Cited By (3)

Publication number Priority date Publication date Assignee Title
CN103400161A (en) * 2013-07-18 2013-11-20 苏州大学 Handwritten numeral recognition method and system
CN104866867A (en) * 2015-05-15 2015-08-26 浙江大学 Multi-national banknote serial number character identification method based on sorter
CN104866867B (en) * 2015-05-15 2017-12-05 浙江大学 A kind of multinational paper money sequence number character identifying method based on cleaning-sorting machine

Also Published As

Publication number Publication date
CN103218613B (en) 2016-04-20

Similar Documents

Publication Publication Date Title
CN103164701B (en) Handwritten Numeral Recognition Method and device
Lin et al. Masked face detection via a modified LeNet
Goodfellow et al. Multi-digit number recognition from street view imagery using deep convolutional neural networks
EP2808827B1 (en) System and method for OCR output verification
CN102156871B (en) Image classification method based on category correlated codebook and classifier voting strategy
Wang Pattern recognition, machine intelligence and biometrics
CN106570513A (en) Fault diagnosis method and apparatus for big data network system
CN106845358B (en) Method and system for recognizing image features of handwritten characters
CN102982349A (en) Image recognition method and device
CN102291392A (en) Hybrid intrusion detection method based on bagging algorithm
Chaabouni et al. Fractal and multi-fractal for arabic offline writer identification
Cruz et al. Feature representation selection based on classifier projection space and oracle analysis
Nasrollahi et al. Printed persian subword recognition using wavelet packet descriptors
CN106056074A (en) Single training sample face identification method based on area sparse
Roy et al. CNN based recognition of handwritten multilingual city names
Gou et al. Representation-based classification methods with enhanced linear reconstruction measures for face recognition
Camastra et al. Combining neural gas and learning vector quantization for cursive character recognition
Khalid et al. Tropical wood species recognition system based on multi-feature extractors and classifiers
CN103218613B (en) Handwritten Numeral Recognition Method and device
Wu et al. Handwritten digit classification using the mnist data set
Shayegan et al. A New Dataset Size Reduction Approach for PCA‐Based Classification in OCR Application
CN106033546A (en) Behavior classification method based on top-down learning
Fursov et al. Sequence embeddings help to identify fraudulent cases in healthcare insurance
Zapranis et al. Identification of the head-and-shoulders technical analysis pattern with neural networks
EP4290481A1 (en) Methods and systems for performing data capture

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant