CN117671704A

CN117671704A - Handwritten digit recognition method, handwritten digit recognition device and computer storage medium

Info

Publication number: CN117671704A
Application number: CN202410130100.XA
Authority: CN
Inventors: 薛杨涛; 白文涛; 司亚利; 钟珊; 龚声蓉
Original assignee: Changshu Institute of Technology
Current assignee: Changshu Institute of Technology
Priority date: 2024-01-31
Filing date: 2024-01-31
Publication date: 2024-03-08
Anticipated expiration: 2044-01-31
Also published as: CN117671704B

Abstract

The invention discloses a handwritten digit recognition method comprising the following steps: normalizing the collected samples to obtain training data, the training data comprising labeled data and unlabeled data; calculating an intra-class scatter matrix and inter-class scatter matrices from the labeled data; constructing a neighbor graph from the labeled data and the unlabeled data to calculate a manifold regularization term; learning an optimal projection matrix from the training data by a Laplacian adaptive weight discriminant analysis method, the optimization problem being solved by an iterative optimization method; and normalizing a sample to be recognized, projecting it with the optimal projection matrix, and obtaining its recognition label with a nearest neighbor classifier. The invention also discloses a device based on the method and a computer storage medium. The method and the device address multi-class classification with few labeled data, improve data utilization, and improve classification performance.

Description

Handwritten digit recognition method, handwritten digit recognition device and computer storage medium

Technical Field

The present invention relates to the field of image recognition technology, and in particular to a handwritten digit recognition method, a handwritten digit recognition device, and a computer storage medium.

Background

Linear discriminant analysis (LDA) is a classical supervised learning algorithm used mainly for dimensionality reduction and classification. Its main idea is to project the data into a new space in which samples of the same class are as close as possible and samples of different classes are as far apart as possible. It can be applied to image classification problems such as handwritten digit recognition, where projecting the data onto the optimal linear discriminant directions can improve classification accuracy. However, LDA is not an optimal choice for multi-class problems. Under the homoscedastic Gaussian assumption, the LDA projection is obtained by maximizing the weighted arithmetic mean of the Kullback-Leibler (KL) divergences between the different class pairs. The projection directions are therefore dominated by class pairs with large KL divergence, so class pairs with small KL divergence overlap in the projection space and classification accuracy degrades significantly. To address this class separation problem of LDA in multi-class settings, researchers have proposed various weighting schemes to improve LDA. These supervised discriminant analysis methods fall into two main categories: one replaces the arithmetic mean of the KL divergences between classes and assigns different weights to class pairs with different KL divergences; the other focuses on separating similar class pairs, emphasizing the class pairs with small KL divergence. However, all of these methods are supervised: they require enough labeled data to train the model and are prone to over-fitting.

With the development of science and technology, the techniques and tools for collecting data keep advancing and large amounts of data are available, but labeling the data requires substantial manpower and material resources. How to use unlabeled data to improve the performance of existing algorithms has therefore become a current research focus. Semi-supervised learning uses a large amount of unlabeled data to assist a small amount of labeled data, yielding a learning model with stronger generalization ability. Extending supervised discriminant analysis to semi-supervised learning to obtain a more effective classification model is thus one of the tasks that urgently needs to be solved.

Disclosure of Invention

To address the defects of the prior art, the invention provides a handwritten digit recognition method that extends a supervised discriminant analysis method to semi-supervised learning and solves both the scarcity of labeled data in multi-class tasks and the class separation problem of traditional discriminant analysis. Another object of the invention is to provide a handwritten digit recognition device and a corresponding computer storage medium.

The technical scheme of the invention is as follows: a handwritten digit recognition method, comprising the steps of:

step S1, normalizing the collected samples to obtain training data, wherein the training data comprise labeled data and unlabeled data;

step S2, calculating an intra-class scatter matrix and inter-class scatter matrices from the labeled data in the training data;

step S3, constructing a neighbor graph from the labeled data and the unlabeled data in the training data to calculate a manifold regularization term;

step S4, learning an optimal projection matrix from the training data by the Laplacian adaptive weight discriminant analysis method, which comprises setting the optimization objective of the Laplacian adaptive weight discriminant analysis method as

，

where $S_w$ is the intra-class scatter matrix, $S_b^{ij}$ is the inter-class scatter matrix of classes $i$ and $j$, $\|P\|_{2,1}$ is the $L_{2,1}$ norm of the projection matrix $P$, $m$ is the number of features, $d$ is the dimension of the projection space, the objective further includes the manifold regularization term and trade-off parameters, $I$ is the identity matrix, and $c$ is the number of classes of label information in the training data; the projection matrix $P$ and the weight vector are solved by an iterative optimization method to obtain the optimal projection matrix;

step S5, normalizing the sample to be recognized, obtaining projected data through the optimal projection matrix, and obtaining the recognition label with a nearest neighbor classifier.

The invention also provides a handwritten digit recognition device, which comprises:

a preprocessing module: normalizing the collected samples to obtain training data, wherein the training data comprise labeled data and unlabeled data;

a first calculation module: calculating an intra-class scatter matrix and inter-class scatter matrices from the labeled data in the training data;

a second calculation module: constructing a neighbor graph from the labeled data and the unlabeled data in the training data to calculate a manifold regularization term;

an optimal projection matrix solving module: learning an optimal projection matrix from the training data by the Laplacian adaptive weight discriminant analysis method, which comprises setting the optimization objective of the Laplacian adaptive weight discriminant analysis method as

，

a recognition module: normalizing the sample to be recognized, obtaining projected data through the optimal projection matrix, and obtaining the recognition label with a nearest neighbor classifier.

Further, step S3 and the second calculation module comprise the following calculation steps:

step S3.1, constructing a neighbor graph from the training data $X = [X_l, X_u]$ to obtain a neighbor matrix $W$, where $X_l$ is the labeled data, $X_u$ is the unlabeled data, the number of labeled samples is $n_l$, and the number of unlabeled samples is $n_u$; the neighbor matrix $W$ is constructed as follows:

，

where $N_k(x_j)$ denotes the $k$-nearest-neighbor set of $x_j$;

step S3.2, calculating the Laplacian matrix $L = D - W$ in the manifold regularization term, where $D$ is a diagonal matrix with diagonal elements $D_{ii} = \sum_j W_{ij}$; the resulting manifold regularization term in the projection space is

，

where $\|\cdot\|_2$ is the $L_2$ norm and $y_i = P^{T} x_i$ denotes the image of $x_i$ in the low-dimensional projection space.

Further, solving the projection matrix $P$ and the weight vector by the iterative optimization method to obtain the optimal projection matrix comprises the steps of:

step S4.1, initializing the weights and solving for the projection matrix $P$; the optimization function of LapAWDA is transformed into

，

where the transformed function contains a constant term and a trade-off coefficient greater than 0.

First, the matrix is calculated, which gives the optimization objective

，

The optimization problem is then converted into an eigendecomposition problem by the Lagrange multiplier method:

，

where the diagonal matrix has diagonal elements computed from $p^i$, the $i$-th row vector of $P$, and $\lambda$ is an eigenvalue; the optimal projection matrix $P$ is composed of the corresponding $d$ eigenvectors;

Step S4.2, fixing the projection matrix $P$ and solving for the weight vector; the objective function of LapAWDA then becomes

，

By the Cauchy inequality, the solution for the weight vector is obtained as

；

Step S4.3, updating the weight vector and solving for the projection matrix $P$ again according to step S4.1. After the optimal projection matrix of the current round is obtained, the next round of alternating iteration begins: the projection matrix $P$ is fixed and the weight vector is updated according to step S4.2. Step S4.3 is repeated until the stopping condition is met, giving the optimal projection matrix $P$.

Because the optimization problem of the Laplacian adaptive weight discriminant analysis method is not a classical quadratic optimization problem, a fast and effective iterative optimization algorithm is adopted to solve it, and the algorithm can be theoretically proven to converge.

Further, the stopping condition in the step S4.3 is that。

Further, the intra-class scatter matrix $S_w$ is calculated as follows:

，

where $x_j^i$ denotes the $j$-th sample of class $i$ and $\mu_i$ denotes the mean vector of class $i$;

the inter-class scatter matrix is calculated as follows:

。

The present invention also provides a computer storage medium having stored thereon a computer program which, when executed by a processor, implements the above handwritten digit recognition method.

Compared with the prior art, the technical scheme provided by the invention has the following advantages:

The method introduces the structural information of the unlabeled data through the manifold regularization term and uses adaptive weights to balance the KL divergences among the class pairs, so that class pairs with small KL divergence do not vanish in the projection space. In addition, an $L_{2,1}$ norm constraint is applied to the projection matrix to obtain a sparse, discriminative projection matrix, which further improves classification accuracy and makes the method more suitable for multi-class tasks. The optimal solution of the LapAWDA optimization problem extracts useful information that benefits the subsequent classification task.

By combining the discriminant analysis method with the manifold regularization term, the semi-supervised feature extraction method is combined with nearest neighbor classification to obtain a multi-class model. The method can therefore handle semi-supervised multi-class problems with few labeled data, improving data utilization and classification performance.

Drawings

Fig. 1 is a flow chart of the handwritten digit recognition method.

Fig. 2 is a flow chart of the Laplacian adaptive weight discriminant analysis method.

Fig. 3 shows samples from the MNIST dataset.

Fig. 4 shows the average accuracy of the 10 methods on the MNIST dataset as a function of dimension with 10 labeled samples per class.

Fig. 5 shows the average accuracy of the 10 methods on the MNIST dataset as a function of dimension with 20 labeled samples per class.

Fig. 6 shows the average accuracy of the 10 methods on the MNIST dataset as a function of dimension with 30 labeled samples per class.

Detailed Description

The present invention is further described below with reference to examples. The examples are merely illustrative of the invention and do not limit its scope; after reading this description, various equivalent modifications of the invention made by those skilled in the art fall within the scope defined by the appended claims.

The handwritten digit recognition device according to the present embodiment includes:

an optimal projection matrix solving module: learning an optimal projection matrix from the training data by the Laplacian adaptive weight discriminant analysis method;

Specifically, referring to Figs. 1 and 2, the handwritten digit recognition method adopted by the device comprises the following steps:

Step S1, normalizing the collected samples to obtain training data, where the normalization divides each pixel value of an image by 255 to map it to the range [0, 1]. The training data comprise labeled data $X_l$ and unlabeled data $X_u$, so the training data are $X = [X_l, X_u]$. The labeled data carry a label vector that holds the label information of each labeled sample. The number of labeled samples is $n_l$, the number of unlabeled samples is $n_u$, and the total number of training samples is $n = n_l + n_u$.
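The normalization in step S1 can be sketched as follows. This is a minimal illustration, not the applicant's code: the function name `normalize_images`, the use of NumPy, and the flattening of each image into a row vector are assumptions made here; the source only states that pixel values are divided by 255.

```python
import numpy as np

def normalize_images(images):
    """Flatten 8-bit grayscale images and scale pixel values to [0, 1]."""
    arr = np.asarray(images, dtype=np.float64)
    # Assumption: one row per sample, so a 28 x 28 image becomes 784 features.
    flat = arr.reshape(arr.shape[0], -1)
    return flat / 255.0

# Two synthetic 28 x 28 images stand in for collected samples.
demo = np.zeros((2, 28, 28), dtype=np.uint8)
demo[1] = 255
X_demo = normalize_images(demo)
```

Labeled and unlabeled samples would go through the same function, so that both parts of the training data share one scale.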

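Step S2, calculating the intra-class scatter matrix and the inter-class scatter matrices from the labeled data in the training data. For the $c$ classes of data, the total intra-class scatter matrix $S_w$ is calculated as follows: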

where $x_j^i$ denotes the $j$-th sample of class $i$ and $\mu_i$ denotes the mean vector of class $i$.

Any two of the $c$ classes form a class pair, so there are $c(c-1)/2$ class pairs and $c(c-1)/2$ inter-class scatter matrices; the inter-class scatter matrix $S_b^{ij}$ of class $i$ and class $j$ is calculated as follows:

。
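The formulas for the intra-class and inter-class scatter (divergence) matrices are not reproduced in this text, so the sketch below falls back on the standard LDA definitions as an assumption: the intra-class matrix sums $(x - \mu_i)(x - \mu_i)^T$ over the labeled samples of each class, and the matrix for a class pair is $(\mu_i - \mu_j)(\mu_i - \mu_j)^T$. Any scaling or class-prior weighting used in the patent is not modeled, and the function names are hypothetical.

```python
import numpy as np
from itertools import combinations

def within_class_scatter(X, y):
    """Sum over classes of (x - mu_c)(x - mu_c)^T; rows of X are labeled samples."""
    m = X.shape[1]
    S_w = np.zeros((m, m))
    for c in np.unique(y):
        Xc = X[y == c]
        centered = Xc - Xc.mean(axis=0)
        S_w += centered.T @ centered
    return S_w

def pairwise_between_class_scatter(X, y):
    """Map each class pair (i, j), i < j, to (mu_i - mu_j)(mu_i - mu_j)^T."""
    # Assumption: class labels are integers.
    means = {int(c): X[y == c].mean(axis=0) for c in np.unique(y)}
    scatter = {}
    for i, j in combinations(sorted(means), 2):
        diff = means[i] - means[j]
        scatter[(i, j)] = np.outer(diff, diff)
    return scatter

# Toy data: two classes with two 2-D samples each.
X_toy = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 4.0], [2.0, 4.0]])
y_toy = np.array([0, 0, 1, 1])
S_w_toy = within_class_scatter(X_toy, y_toy)
S_b_toy = pairwise_between_class_scatter(X_toy, y_toy)
```

For `c` classes the dictionary holds `c * (c - 1) / 2` matrices, one per class pair, which matches the pair count stated in the text.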

Step S3, constructing a neighbor graph from the labeled data and the unlabeled data in the training data to calculate the manifold regularization term, which specifically comprises the following steps:

Step S3.1, constructing a neighbor graph from the training data $X$ to obtain a neighbor matrix $W$; the neighbor matrix $W$ is constructed as follows:

，

where $N_k(x_j)$ denotes the $k$-nearest-neighbor set of $x_j$.

Step S3.2, calculating the Laplacian matrix $L = D - W$ in the manifold regularization term, where $D$ is a diagonal matrix with diagonal elements $D_{ii} = \sum_j W_{ij}$; the resulting manifold regularization term in the projection space is

，

where $\|\cdot\|_2$ is the $L_2$ norm, $y_i = P^{T} x_i$ denotes the image of $x_i$ in the low-dimensional projection space, and $d$ is the dimension of the projection space.
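Steps S3.1 and S3.2 can be illustrated with the sketch below. The exact neighbor-matrix formula is not reproduced in this text, so the code assumes a common choice: a symmetric 0/1 matrix in which two samples are connected when either one is among the other's `k` nearest neighbors. The penalty is written as a trace, which equals half the weighted sum of squared distances between projected samples. Samples are stored as rows here, so the projection is `X @ P`. All function names are hypothetical.

```python
import numpy as np

def knn_adjacency(X, k):
    """Symmetric 0/1 neighbor matrix from Euclidean k-nearest neighbors."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbor
    idx = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)  # connect i and j if either lists the other

def graph_laplacian(W):
    """Unnormalized Laplacian L = D - W with D_ii = sum_j W_ij."""
    return np.diag(W.sum(axis=1)) - W

def manifold_penalty(P, X, L):
    """tr(Y^T L Y) with Y = X P, i.e. 0.5 * sum_ij W_ij * ||y_i - y_j||^2."""
    Y = X @ P
    return float(np.trace(Y.T @ L @ Y))

# Four 1-D samples; with k = 1 the graph is the chain 0-1-2-3.
X_line = np.array([[0.0], [1.0], [3.0], [10.0]])
W_line = knn_adjacency(X_line, k=1)
L_line = graph_laplacian(W_line)
penalty = manifold_penalty(np.array([[1.0]]), X_line, L_line)
```

The experiments in this document report a neighbor number of 5; `k=1` is used above only to keep the toy graph easy to check by hand.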

Step S4: the expression of the manifold regularization term shows that it depends on the projection vectors, so the manifold regularization term is introduced into the Laplacian adaptive weight discriminant analysis method (LapAWDA), which must solve for the projection vectors during optimization. Because the weight vector is not predefined but learned from the low-dimensional projection space, and because the LapAWDA optimization objective is non-smooth, the projection vectors and the weight vector cannot be solved directly and simultaneously. An approximately optimal solution can nevertheless be obtained by an iterative optimization algorithm. The LapAWDA optimization problem is therefore solved by alternating two steps, 1) fixing the weight vector and updating the projection vectors, and 2) fixing the projection vectors and updating the weight vector, until a stopping condition is met and an approximately optimal solution is obtained.

Specifically, the optimization objective of the Laplacian adaptive weight discriminant analysis method is set to

，

where $S_w$ is the intra-class scatter matrix, $S_b^{ij}$ is the inter-class scatter matrix of classes $i$ and $j$, $\|P\|_{2,1}$ is the $L_{2,1}$ norm of the projection matrix $P$, $m$ is the number of features, the objective further includes trade-off parameters, and $I$ is the identity matrix.
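The $L_{2,1}$ norm named above is the sum of the Euclidean norms of the rows of the projection matrix, which is why penalizing it pushes whole rows (features) toward zero. The sketch below computes it, together with the diagonal reweighting matrix that is commonly used to handle this non-smooth term inside iterative solvers. That reweighting formula, `1 / (2 * ||p_i||)`, is an assumption taken from standard practice, since the patent's own diagonal-element formula is not reproduced in this text; the function names and the `eps` guard are likewise hypothetical.

```python
import numpy as np

def l21_norm(P):
    """Sum of the Euclidean norms of the rows of P."""
    return float(np.sum(np.linalg.norm(P, axis=1)))

def l21_reweighting_diagonal(P, eps=1e-8):
    """Diagonal matrix with entries 1 / (2 * ||p_i||_2); eps guards zero rows."""
    row_norms = np.linalg.norm(P, axis=1)
    return np.diag(1.0 / (2.0 * np.maximum(row_norms, eps)))

P_toy = np.array([[3.0, 4.0], [0.0, 0.0], [5.0, 12.0]])
norm_toy = l21_norm(P_toy)                  # row norms are 5, 0 and 13
G_toy = l21_reweighting_diagonal(P_toy)
surrogate = float(np.trace(P_toy.T @ G_toy @ P_toy))
```

With the diagonal held fixed, the smooth surrogate `tr(P^T G P)` equals half the $L_{2,1}$ norm at the current `P`, which is what lets each update of the projection matrix reduce to an eigendecomposition-style problem in reweighting schemes of this kind.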

The iterative optimization steps are as follows:

，

First, the matrix is calculated, which gives the optimization objective

，

By the Cauchy inequality, the solution for the weight vector is obtained as

；

Step S4.3, updating the weight vector and solving for the projection matrix $P$ again according to step S4.1. After the optimal projection matrix of the current round is obtained, the next round of alternating iteration begins: the projection matrix $P$ is fixed and the weight vector is updated according to step S4.2. Step S4.3 is repeated until the stopping condition is met, giving the optimal projection matrix $P$.

Step S5, normalizing the sample to be recognized and projecting it with the optimal projection matrix: the normalized sample $x$ has the image $y = P^{T} x$ after projection, and its recognition label is then obtained with a nearest neighbor classifier.
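Step S5 can be sketched as a projection followed by a 1-nearest-neighbor lookup. The sketch assumes that the reference set is the projected labeled training data and that distances are Euclidean; neither detail is stated in this text. Samples are stored as rows, so `X @ P` plays the role of the projection, and `predict_1nn` is a hypothetical name.

```python
import numpy as np

def predict_1nn(P, X_labeled, y_labeled, X_test):
    """Project both sets with P, then copy the label of the closest labeled sample."""
    Z_train = X_labeled @ P
    Z_test = X_test @ P
    # Squared Euclidean distances between every test and labeled projection.
    d2 = ((Z_test[:, None, :] - Z_train[None, :, :]) ** 2).sum(axis=2)
    return np.asarray(y_labeled)[np.argmin(d2, axis=1)]

# Toy projection that keeps only the first of two features.
P_demo = np.array([[1.0], [0.0]])
X_lab = np.array([[0.0, 5.0], [1.0, -5.0]])
y_lab = np.array([3, 7])
X_new = np.array([[0.1, 100.0], [0.9, -100.0]])
pred = predict_1nn(P_demo, X_lab, y_lab, X_new)
```

In the toy data the second feature is discarded by the projection, so only the first feature decides which labeled sample is nearest.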

The demonstration experiment of the invention uses the MNIST handwritten digit image dataset.

The MNIST dataset is a classical dataset in machine learning, consisting of 60000 training samples and 10000 test samples, each a 28 × 28 pixel grayscale image of a handwritten digit, as shown in Fig. 3. The training set in the experiment consists of 100 images randomly drawn from the training samples of each handwritten digit 0-9, covering 10 classes in total, and the test set consists of the 10000 test samples. To verify the effectiveness of the invention in semi-supervised learning, different numbers of labeled data are used for training, and the mean accuracy on the test set over 10 runs is taken as the evaluation index. Because the invention is a feature extraction method, a nearest neighbor classifier is used to show its classification performance on the MNIST dataset.

Experimental hardware environment: a MacBook Pro with an Intel Core i5 (2.7 GHz) processor and 8 GB of memory. Code execution environment: MATLAB (R2015b). The experimental results are as follows:

To verify the validity and superiority of the invention, the experiments compare it with 5 supervised discriminant analysis methods (LDA, LFDA, aPAC, LADA and MDAAWS) and 4 semi-supervised discriminant analysis methods (SLDA, SMMC, SSDR and SELF). The neighbor number was set to 5, and the regularization parameter of each method was selected by grid search over a parameter range. Table 1 records the classification accuracy of the invention and the 9 comparison methods for different numbers of labels, with a feature dimension of 20. The table shows that the semi-supervised discriminant analysis methods generally achieve higher classification accuracy than the corresponding supervised methods, indicating that the unlabeled data provide useful information. As the number of labeled samples increases, the classification performance of some supervised discriminant analysis methods drops because of over-fitting, whereas the semi-supervised methods do not show this behavior. Introducing information from unlabeled data can therefore improve the generalization ability of an algorithm. The LapAWDA method of the invention achieves the highest classification accuracy with 10, 20, or 30 labeled data and is clearly superior to the other discriminant analysis methods.

Table 1. Average classification accuracy (%) of the 10 methods for different numbers of labels

To investigate the effect of the number of features and the number of labeled samples on the projection matrix obtained by LapAWDA, 10, 20 and 30 labeled samples were randomly selected from each class of the training data, and the remaining training data were treated as unlabeled samples. Figs. 4 to 6 show the accuracy of the discriminant analysis methods on the MNIST dataset for dimensions from 5 to 50. The method of the invention achieves the highest accuracy in every dimension, and in the first 20 dimensions in particular its classification performance is far better than that of the other methods. Its classification accuracy also increases as the number of labeled samples increases. The results show that the projection matrix extracts more discriminative information from the training data for the classification task, while also improving the utilization of the labeled data.
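The labeled/unlabeled split described above can be sketched as follows. The function name, the fixed seed, and the use of NumPy's random generator are assumptions for illustration; the experiments in this document were run in MATLAB and the original sampling code is not given.

```python
import numpy as np

def split_labeled_unlabeled(y, n_labeled_per_class, seed=0):
    """Pick n_labeled_per_class indices per class; the rest count as unlabeled."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    labeled = []
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        labeled.extend(rng.choice(idx, size=n_labeled_per_class, replace=False))
    labeled = np.sort(np.array(labeled))
    unlabeled = np.setdiff1d(np.arange(y.size), labeled)
    return labeled, unlabeled

# Stand-in labels: 10 digit classes with 100 training images each.
y_train = np.repeat(np.arange(10), 100)
lab_idx, unlab_idx = split_labeled_unlabeled(y_train, n_labeled_per_class=10)
```

Repeating the split with different seeds and averaging the test accuracy would mirror the 10-run average reported in the experiments, under the assumption that each run redraws the labeled subset.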

It should be noted that the specific methods of the above embodiments may form a computer program product, and the computer program product may therefore be stored on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.).

Claims

1. A handwritten digit recognition method, characterized by comprising the steps of:

，

where $S_w$ is the intra-class scatter matrix, $S_b^{ij}$ is the inter-class scatter matrix of classes $i$ and $j$, $\|P\|_{2,1}$ is the $L_{2,1}$ norm of the projection matrix $P$, $m$ is the number of features, $d$ is the dimension of the projection space, the objective further includes the manifold regularization term and trade-off parameters, $I$ is the identity matrix, and $c$ is the number of classes of label information in the training data; the projection matrix $P$ and the weight vector are solved by an iterative optimization method to obtain the optimal projection matrix;

2. The handwritten digit recognition method according to claim 1, wherein said step S3 comprises the steps of:

step S3.1, constructing a neighbor graph from the training data $X = [X_l, X_u]$ to obtain a neighbor matrix $W$, where $X_l$ is the labeled data, $X_u$ is the unlabeled data, the number of labeled samples is $n_l$, and the number of unlabeled samples is $n_u$; the neighbor matrix $W$ is constructed as follows:

，

where $N_k(x_j)$ denotes the $k$-nearest-neighbor set of $x_j$;

，

3. The handwritten digit recognition method according to claim 1, wherein solving the projection matrix $P$ and the weight vector by the iterative optimization method to obtain the optimal projection matrix comprises the steps of:

，

First, the matrix is calculated, which gives the optimization objective

，

By the Cauchy inequality, the solution for the weight vector is obtained as

；

Step S4.3, updating the weight vector and solving for the projection matrix $P$ again according to step S4.1; after the optimal projection matrix of the current round is obtained, performing the next round of alternating iteration, that is, fixing the projection matrix $P$ and updating the weight vector according to step S4.2; and repeating step S4.3 until the stopping condition is met, to obtain the optimal projection matrix $P$.

4. A handwriting recognition method according to claim 3 and wherein said stop condition in step S4.3 is。

5. The handwritten digit recognition method according to claim 1, wherein said intra-class scatter matrix $S_w$ is calculated as follows:

，

the inter-class scatter matrix is calculated as follows:

。

6. A handwritten digit recognition device, comprising:

，

where $S_w$ is the intra-class scatter matrix, $S_b^{ij}$ is the inter-class scatter matrix of classes $i$ and $j$, $\|P\|_{2,1}$ is the $L_{2,1}$ norm of the projection matrix $P$, $m$ is the number of features, $d$ is the dimension of the projection space, the objective further includes the manifold regularization term and trade-off parameters, $I$ is the identity matrix, and $c$ is the number of classes of label information in the training data; the projection matrix $P$ and the weight vector are solved by an iterative optimization method to obtain the optimal projection matrix;

7. The handwritten digit recognition device according to claim 6, wherein the calculation of the manifold regularization term by said second calculation module comprises:

constructing a neighbor graph from the training data $X = [X_l, X_u]$ to obtain a neighbor matrix $W$, where $X_l$ is the labeled data, $X_u$ is the unlabeled data, the number of labeled samples is $n_l$, and the number of unlabeled samples is $n_u$; the neighbor matrix $W$ is constructed as follows:

，

where $N_k(x_j)$ denotes the $k$-nearest-neighbor set of $x_j$;

calculating the Laplacian matrix $L = D - W$ in the manifold regularization term, where $D$ is a diagonal matrix with diagonal elements $D_{ii} = \sum_j W_{ij}$; the resulting manifold regularization term in the projection space is

，

8. The handwritten digit recognition device according to claim 6, wherein said optimal projection matrix solving module solves the projection matrix $P$ and the weight vector by an iterative optimization method to obtain the optimal projection matrix, comprising the steps of:

initializing the weights and solving for the projection matrix $P$; the optimization function of LapAWDA is transformed into

，

First, the matrix is calculated, which gives the optimization objective

，

fixing the projection matrix $P$ and solving for the weight vector; the objective function of LapAWDA then becomes

，

By the Cauchy inequality, the solution for the weight vector is obtained as

；

updating the weight vector and continuing to solve for the projection matrix $P$; after the optimal projection matrix of the current round is obtained, performing the next round of alternating iteration, that is, fixing the projection matrix $P$ and updating the weight vector again; and repeatedly solving for the projection matrix $P$ until the stopping condition is met, to obtain the optimal projection matrix $P$.

9. The handwriting recognition device of claim 8, wherein the stop condition is。

10. A computer storage medium having stored thereon a computer program which, when executed by a processor, implements the handwritten digit recognition method of any one of claims 1 to 5.