CN117671704B

CN117671704B - Handwritten digit recognition method, handwritten digit recognition device and computer storage medium

Info

Publication number: CN117671704B
Application number: CN202410130100.XA
Authority: CN
Inventors: 薛杨涛; 白文涛; 司亚利; 钟珊; 龚声蓉
Original assignee: Changshu Institute of Technology
Current assignee: Changshu Institute of Technology
Priority date: 2024-01-31
Filing date: 2024-01-31
Publication date: 2024-04-26
Anticipated expiration: 2044-01-31
Also published as: CN117671704A

Abstract

The invention discloses a handwritten digit recognition method comprising the following steps: normalizing collected samples to obtain training data, the training data comprising labeled data and unlabeled data; calculating an intra-class divergence matrix and inter-class divergence matrices from the labeled data; constructing a neighbor graph from the labeled data and the unlabeled data to calculate a manifold regularization term; learning an optimal projection matrix from the training data by a Laplacian adaptive weight discriminant analysis method, the optimization problem being solved by an iterative optimization method; and normalizing a sample to be recognized, projecting it with the optimal projection matrix, and obtaining its recognition label with a nearest neighbor classifier. The invention also discloses a device based on the method and a computer storage medium. The method and the device address multi-class classification with little labeled data, make better use of the available data, and improve classification performance.

Description

Handwritten digit recognition method, handwritten digit recognition device and computer storage medium

Technical Field

The present invention relates to the field of image recognition technologies, and in particular, to a handwriting digital recognition method, device, and computer storage medium.

Background

Linear discriminant analysis (LDA) is a classical supervised learning algorithm used mainly for dimensionality reduction and classification. Its main idea is to project the data into a new space in which samples of the same class lie as close together as possible and samples of different classes lie as far apart as possible. The method can be applied to image classification problems such as handwritten digit recognition, and it improves classification accuracy by projecting the data onto the optimal linear discriminant directions. However, LDA is not the best choice for multi-class problems. Under the homoscedastic Gaussian assumption, the LDA projection is obtained by maximizing the weighted arithmetic mean of the Kullback-Leibler (KL) divergences between different classes. The projection direction is therefore dominated by class pairs with large KL divergence, so class pairs with small KL divergence overlap in the projection space and classification accuracy degrades significantly. To address this class separation problem of LDA in multi-class tasks, many researchers have proposed schemes that construct weights to optimize LDA. These supervised discriminant analysis methods fall into two main categories: the first replaces the arithmetic mean of the KL divergences between classes and assigns different weights to class pairs with different KL divergences; the second focuses on separating similar class pairs, emphasizing the pairs with small KL divergence. However, all of these methods are supervised: they require enough labeled data to train the model and are prone to over-fitting.

As science and technology develop, the techniques and tools for collecting data keep advancing and large amounts of data become available. Labeling that data, however, still requires a great deal of manpower and material resources, so how to use unlabeled data to improve the performance of existing algorithms has become a current research focus. Semi-supervised learning uses a large amount of unlabeled data to assist a small amount of labeled data, improving learning performance and yielding a model with stronger generalization ability. Extending supervised discriminant analysis to semi-supervised learning in order to obtain a more effective classification model has therefore become a pressing task.

Disclosure of Invention

To address the shortcomings of the prior art, the invention provides a handwritten digit recognition method that extends supervised discriminant analysis to semi-supervised learning. It solves both the shortage of labeled data in multi-class tasks and the class separation problem of traditional discriminant analysis. Another object of the invention is to provide a handwritten digit recognition device and a corresponding computer storage medium.

The technical scheme of the invention is as follows: a handwritten digit recognition method, comprising the steps of:

Step S1, normalizing the collected samples to obtain training data, wherein the training data comprises labeled data and unlabeled data;

Step S2, calculating an intra-class divergence matrix and inter-class divergence matrices from the labeled data in the training data;

Step S3, constructing a neighbor graph from the labeled data and the unlabeled data in the training data to calculate a manifold regularization term;

Step S4, learning from the training data by a Laplacian adaptive weight discriminant analysis method to obtain an optimal projection matrix, which comprises setting an optimization objective for the Laplacian adaptive weight discriminant analysis method. The objective combines the intra-class divergence matrix, the inter-class divergence matrices, the L2,1 norm of the projection matrix and the manifold regularization term, together with balance parameters and an identity matrix; m is the number of features, d is the dimension of the projection space, and the number of classes is the number of categories of label information in the training data. The projection matrix and the weight vector are solved by an iterative optimization method to obtain the optimal projection matrix;

Step S5, normalizing the sample to be recognized, obtaining the projected data through the optimal projection matrix, and obtaining the recognition label with a nearest neighbor classifier.

The invention also provides a handwritten digit recognition device, which comprises:

a preprocessing module: normalizing the collected samples to obtain training data, wherein the training data comprises labeled data and unlabeled data;

a first calculation module: calculating an intra-class divergence matrix and inter-class divergence matrices from the labeled data in the training data;

a second calculation module: constructing a neighbor graph from the labeled data and the unlabeled data in the training data to calculate a manifold regularization term;

an optimal projection matrix solving module: learning from the training data by the Laplacian adaptive weight discriminant analysis method to obtain an optimal projection matrix, which comprises setting the optimization objective of the Laplacian adaptive weight discriminant analysis method as in step S4;

a recognition module: normalizing the sample to be recognized, obtaining the projected data through the optimal projection matrix, and obtaining the recognition label with a nearest neighbor classifier.

Further, step S3 and the second calculation module comprise the following calculation steps:

Step S3.1, constructing a neighbor graph from the training data, which consist of the labeled data and the unlabeled data, to obtain a neighbor matrix; the neighbor matrix is defined in terms of the nearest-neighbor set of each sample;

Step S3.2, calculating the Laplacian matrix of the manifold regularization term from the neighbor matrix and a diagonal matrix, and obtaining from it the manifold regularization term in the projection space; this term is expressed with the L₂ norm of the differences between the images of the samples in the low-dimensional projection space.

Further, solving the projection matrix and the weight vector by the iterative optimization method to obtain the optimal projection matrix comprises the following steps:

Step S4.1, initializing the weights and solving for the projection matrix; with the weights fixed, the LapAWDA optimization function becomes a problem in the projection matrix alone, containing a constant and a trade-off coefficient. An intermediate matrix is calculated first to simplify the optimization objective, and the optimization problem is then converted into an eigendecomposition problem by the Lagrange multiplier method. The eigendecomposition problem involves a diagonal matrix whose diagonal elements are computed from the row vectors of the projection matrix, and the optimal projection matrix is composed of the d eigenvectors corresponding to the largest eigenvalues;
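The diagonal matrix built from the row vectors of the projection matrix resembles the reweighting commonly used to handle an L2,1 norm term. The following is a minimal Python sketch of that common technique, not the patent's own formula, which is not reproduced in this text. It assumes the standard form in which the i-th diagonal element is 1 / (2‖wⁱ‖₂), where wⁱ is the i-th row of the projection matrix; the names `l21_norm`, `l21_reweighting` and `eps` are illustrative.

```python
import numpy as np

def l21_norm(W):
    """L2,1 norm: the sum of the Euclidean norms of the rows of W."""
    return np.sum(np.linalg.norm(W, axis=1))

def l21_reweighting(W, eps=1e-8):
    """Diagonal matrix G with G_ii = 1 / (2 * ||w^i||_2).

    Assumed standard form; eps guards against division by zero
    when a row of W is exactly zero.
    """
    row_norms = np.linalg.norm(W, axis=1)
    return np.diag(1.0 / (2.0 * np.maximum(row_norms, eps)))
```

Under this assumption, tr(WᵀGW) evaluated at the current iterate equals half the L2,1 norm of W, which is why such a reweighting can replace the non-smooth norm by a quadratic term inside an eigenproblem.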

Step S4.2, fixing the projection matrix and solving for the weight vector; the LapAWDA objective function then becomes a problem in the weight vector alone, and a solution of the weight vector is obtained from the Cauchy inequality;
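As a generic illustration of how a Cauchy inequality argument can produce a weight vector in closed form, consider the following worked example. It is a sketch only: the positive quantities b_k, the weights α_k and the form of the sub-problem are assumptions chosen for illustration, and the patent's own weight formula is not reproduced in this text.

```latex
% Hypothetical weight sub-problem with fixed positive quantities b_k
\min_{\alpha}\ \sum_{k=1}^{K} \frac{b_k^{2}}{\alpha_k}
\quad \text{s.t.} \quad \sum_{k=1}^{K} \alpha_k = 1,\ \alpha_k > 0 .

% Cauchy (Cauchy-Schwarz) inequality applied to b_k / \sqrt{\alpha_k} and \sqrt{\alpha_k}
\left( \sum_{k=1}^{K} \frac{b_k^{2}}{\alpha_k} \right)
\left( \sum_{k=1}^{K} \alpha_k \right)
\ \ge\ \left( \sum_{k=1}^{K} b_k \right)^{2}.

% Equality holds when b_k / \alpha_k is the same for all k, which gives the minimizer
\alpha_k = \frac{b_k}{\sum_{j=1}^{K} b_j}.
```

Because the constraint fixes the sum of the weights to 1, the lower bound is attained exactly at these weights, so no iterative inner solver is needed for a sub-problem of this shape.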

Step S4.3, updating the weight vector and solving for the projection matrix again according to step S4.1; after the optimal projection matrix of the current round is obtained, the next round of the alternating iteration begins, that is, the projection matrix is fixed and the weight vector is updated according to step S4.2. Step S4.3 is repeated until the stopping condition is met, which yields the optimal projection matrix.

Because the optimization problem of the Laplacian adaptive weight discriminant analysis method is not a classical quadratic optimization problem, a fast and effective iterative optimization algorithm is used to solve it, and the algorithm can be proved theoretically to converge.

Further, the stopping condition in the step S4.3 is that。

Further, the intra-class divergence matrix is calculated from the samples of each class and the mean vector of that class, and the inter-class divergence matrices are calculated for pairs of classes.

The present invention also provides a computer storage medium having stored thereon a computer program which, when executed by a processor, implements the above-described handwritten number recognition method.

Compared with the prior art, the technical scheme provided by the invention has the advantages that:

The method introduces the structural information of the unlabeled data through the manifold regularization term and uses adaptive weights to balance the KL divergences of the class pairs, which prevents class pairs with small KL divergence from overlapping in the projection space. In addition, an L2,1 norm constraint is applied to the projection matrix, yielding a sparse and discriminative projection matrix; this further improves classification accuracy and makes the method better suited to multi-class tasks. The optimal solution of the LapAWDA optimization problem extracts useful information that facilitates the subsequent classification task.

Combining the discriminant analysis method with the manifold regularization term gives a semi-supervised feature extraction method, and combining that method with nearest neighbor classification gives a multi-class model. The method can therefore handle semi-supervised multi-class problems with little labeled data, makes better use of the available data, and improves classification performance.

Drawings

Fig. 1 is a flow chart of the handwritten digit recognition method.

Fig. 2 is a flow chart of the Laplacian adaptive weight discriminant analysis method.

Fig. 3 shows samples from the MNIST dataset.

Fig. 4 shows the average accuracy of the 10 methods on the MNIST dataset as a function of dimension with 10 labeled samples.

Fig. 5 shows the average accuracy of the 10 methods on the MNIST dataset as a function of dimension with 20 labeled samples.

Fig. 6 shows the average accuracy of the 10 methods on the MNIST dataset as a function of dimension with 30 labeled samples.

Detailed Description

The present application is further described below with reference to examples, which are to be construed as merely illustrative of the present application and not limiting of its scope, and various modifications to the equivalent arrangements of the present application will become apparent to those skilled in the art upon reading the present description, which are within the scope of the application as defined in the appended claims.

The handwriting digital recognition device according to the present embodiment includes:

and an optimal projection matrix solving module: learning by using training data through a Laplace self-adaptive weight discriminant analysis method to obtain an optimal projection matrix;

Specifically, please refer to fig. 1 and 2, the handwriting digital recognition method adopted by the device includes the following steps:

Step S1, normalizing the collected samples to obtain training data, wherein the normalization divides each pixel value of the image by 255, mapping it into the range [0, 1]. The training data comprise labeled data and unlabeled data; each labeled sample carries label information indicating one of the classes, and the total number of training samples is the number of labeled samples plus the number of unlabeled samples.
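The normalization in step S1 can be sketched in a few lines of Python. This is a minimal illustration assuming 8-bit grayscale images stored as a NumPy array with one image per leading index; the function name `normalize_images` and the flattening of each image into a row vector are assumptions made for the example.

```python
import numpy as np

def normalize_images(images):
    """Flatten 8-bit grayscale images and scale pixel values into [0, 1]."""
    x = np.asarray(images, dtype=np.float64)
    x = x.reshape(x.shape[0], -1)  # one row vector per image
    return x / 255.0               # divide each pixel value by 255
```

For MNIST-sized input, an array of shape (n, 28, 28) would become an (n, 784) matrix of values between 0 and 1.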

Step S2: for the multi-class labeled data, the total intra-class divergence matrix is calculated from each sample of each class and the mean vector of that class. In addition, any two classes form a class pair, and one inter-class divergence matrix is calculated for each pair of classes.
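The intra-class and pairwise inter-class divergence matrices can be sketched as follows. The patent's formulas are not reproduced in this text, so the sketch assumes the textbook definitions: the intra-class matrix sums the outer products of the centered samples of every class, and the matrix for a class pair is the outer product of the difference of the two class means. Any scaling or prior weighting the patent may apply is not modeled, and the name `scatter_matrices` is illustrative.

```python
import numpy as np
from itertools import combinations

def scatter_matrices(X, y):
    """X: (n, m) labeled samples as rows; y: (n,) class labels.

    Returns the intra-class matrix S_w (m, m) and a dict mapping each
    class pair (a, b) to its inter-class matrix (assumed textbook forms).
    """
    classes = np.unique(y).tolist()
    m = X.shape[1]
    means = {c: X[y == c].mean(axis=0) for c in classes}

    S_w = np.zeros((m, m))
    for c in classes:
        Xc = X[y == c] - means[c]       # center the samples of class c
        S_w += Xc.T @ Xc

    S_b = {}
    for a, b in combinations(classes, 2):  # every unordered class pair
        diff = (means[a] - means[b])[:, None]
        S_b[(a, b)] = diff @ diff.T
    return S_w, S_b
```

For 10 digit classes this yields 45 pairwise matrices, one per class pair, which is the set the adaptive weights would then balance.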

Step S3, constructing a neighbor graph from the labeled data and the unlabeled data in the training data to calculate a manifold regularization term; the step specifically comprises:

Step S3.1, constructing a neighbor graph from the training data to obtain a neighbor matrix, which is defined in terms of the nearest-neighbor set of each sample.

Step S3.2, calculating the Laplacian matrix of the manifold regularization term from the neighbor matrix and a diagonal matrix, and obtaining from it the manifold regularization term in the projection space. This term is expressed with the L₂ norm of the differences between the images of the samples in the low-dimensional projection space, where d is the dimension of the projection space.
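Steps S3.1 and S3.2 can be illustrated with the sketch below. Since the patent's formulas are not reproduced in this text, the sketch makes three assumptions: the neighbor matrix is binary and symmetric (entry 1 when either sample is among the k nearest neighbors of the other), the diagonal matrix holds the row sums of the neighbor matrix, and the Laplacian is the diagonal matrix minus the neighbor matrix. The function names are illustrative.

```python
import numpy as np

def knn_adjacency(X, k):
    """Symmetric 0/1 k-nearest-neighbor matrix for samples in the rows of X."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * X @ X.T  # squared distances
    np.fill_diagonal(dist, np.inf)                    # a sample is not its own neighbor
    idx = np.argsort(dist, axis=1)[:, :k]             # k nearest neighbors per row
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(A, A.T)                         # connect if either is a neighbor

def graph_laplacian(A):
    """Laplacian L = D - A, with D the diagonal matrix of row sums of A."""
    return np.diag(A.sum(axis=1)) - A

def manifold_term(W, X, L):
    """tr(Y^T L Y) for the projected samples Y = X W, with W of shape (m, d)."""
    Y = X @ W
    return np.trace(Y.T @ L @ Y)
```

Under these assumptions the trace form equals half the sum, over all connected pairs, of the squared L₂ distance between projected samples, so a small value means neighboring samples stay close after projection.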

Step S4: the expression of the manifold regularization term depends on the projection vectors, so the manifold regularization term is introduced into the Laplacian adaptive weight discriminant analysis method (LapAWDA), which solves for the projection vectors during optimization. Because the weight vector is not predefined but learned from the low-dimensional projection space, and the LapAWDA optimization objective is non-smooth, the projection vectors and the weight vector cannot be solved for directly. An iterative optimization algorithm can nevertheless obtain an approximately optimal solution. The LapAWDA optimization problem is therefore solved by alternating two steps until a stopping condition is met: 1) fix the weight vector and update the projection vectors; 2) fix the projection vectors and update the weight vector.

Specifically, the optimization objective of the Laplacian adaptive weight discriminant analysis method combines the intra-class divergence matrix, the inter-class divergence matrices, the L2,1 norm of the projection matrix and the manifold regularization term, together with balance parameters and an identity matrix; m is the number of features.

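The iterative optimization steps are as follows: with the weight vector fixed, an intermediate matrix is calculated first to simplify the optimization objective, and the projection matrix is solved for according to step S4.1; with the projection matrix fixed, a solution of the weight vector is obtained from the Cauchy inequality according to step S4.2;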

Step S4.3, updating the weight vector and solving for the projection matrix again according to step S4.1; after the optimal projection matrix of the current round is obtained, the next round of the alternating iteration begins, that is, the projection matrix is fixed and the weight vector is updated according to step S4.2. Step S4.3 is repeated until the stopping condition is met, which yields the optimal projection matrix.

Step S5, normalizing the sample to be recognized and projecting the normalized sample with the optimal projection matrix to obtain its image in the low-dimensional projection space; the recognition label is then obtained with a nearest neighbor classifier.
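Step S5 can be sketched as a projection followed by a 1-nearest-neighbor lookup. The sketch assumes samples are stored as rows, that the labeled training samples serve as the reference set, and that distances are Euclidean in the projected space; the function name `nearest_neighbor_predict` is illustrative.

```python
import numpy as np

def nearest_neighbor_predict(W, X_train, y_train, X_test):
    """Project with W (m, d), then label each test sample by its
    nearest labeled training sample in the projected space."""
    Z_train = X_train @ W
    Z_test = X_test @ W
    # Squared Euclidean distances between every test and training sample
    d2 = ((Z_test[:, None, :] - Z_train[None, :, :]) ** 2).sum(axis=2)
    return y_train[np.argmin(d2, axis=1)]
```

Because the comparison happens after projection, only the d retained dimensions influence the decision; directions discarded by the projection matrix have no effect on the predicted label.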

The demonstration experiment of the invention uses the MNIST handwritten digit image dataset.

The MNIST dataset is a classical dataset in the machine learning field. It consists of 60000 training samples and 10000 test samples, each of which is a 28 x 28 pixel grayscale image of a handwritten digit, as shown in fig. 3. The training set in the experiment consists of 100 images randomly drawn from the MNIST training images of each of the 10 digit classes 0-9, and the 10000 test samples form the test set. To verify the effectiveness of the invention in semi-supervised learning, different amounts of labeled data are used for training, and the mean accuracy on the test set over 10 runs is taken as the evaluation index. Because the invention is a feature extraction method, a nearest neighbor classifier is used to show its classification performance on the MNIST dataset.

Experimental hardware environment: a MacBook Pro with an Intel Core i5 (2.7 GHz) processor and 8 GB of memory. Code execution environment: MATLAB (R2015b). The experimental results are as follows:

To verify the validity and superiority of the invention, the experiments compare it with 5 supervised discriminant analysis methods (LDA, LFDA, aPAC, LADA and MDAAWS) and 4 semi-supervised discriminant analysis methods (SLDA, SMMC, SSDR and SELF). The number of neighbors is set to 5, and the regularization parameters of each method are selected by grid search over a parameter range. Table 1 records the classification accuracy of the invention and the 9 comparison methods for different numbers of labels, with the feature dimension set to 20. The table shows that the semi-supervised discriminant analysis methods are generally more accurate than the corresponding supervised methods, which indicates that the unlabeled data provide useful information. As the number of labeled samples increases, the classification performance of some supervised discriminant analysis methods drops because of over-fitting, whereas none of the semi-supervised discriminant analysis methods shows this behavior. Introducing the information of the unlabeled data therefore improves the generalization ability of the algorithm. The proposed LapAWDA method achieves the highest classification accuracy with 10, 20 and 30 labeled samples alike and clearly outperforms the other discriminant analysis methods.

Table 1. Average classification accuracy (%) of the 10 methods for different numbers of labels

To investigate how the number of features and the number of labeled samples affect the projection matrix obtained by LapAWDA, 10, 20 and 30 labeled samples were randomly selected from each class of the training data, and the remaining training data were treated as unlabeled samples. Figs. 4 to 6 show the accuracy of the discriminant analysis methods on the MNIST dataset for dimensions from 5 to 50. The method of the invention achieves the highest accuracy in every dimension, and in the first 20 dimensions in particular its classification performance is far better than that of the other methods. The classification accuracy of the invention also increases with the number of labeled samples. These results show that the projection matrix extracts more discriminative information from the training data for the classification task while making better use of the labeled data.
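The per-class selection of labeled samples described above can be sketched as follows. This is an illustrative helper, not the patent's experimental code (which ran in MATLAB): the function name `split_labeled_unlabeled`, the `seed` parameter and the use of index arrays are assumptions.

```python
import numpy as np

def split_labeled_unlabeled(y, n_labeled_per_class, seed=0):
    """Pick n_labeled_per_class random indices from each class as labeled;
    all remaining training indices are treated as unlabeled."""
    rng = np.random.default_rng(seed)
    labeled = []
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)  # indices of the samples of class c
        labeled.extend(rng.choice(idx, size=n_labeled_per_class, replace=False))
    labeled = np.sort(np.array(labeled, dtype=int))
    unlabeled = np.setdiff1d(np.arange(len(y)), labeled)
    return labeled, unlabeled
```

With 100 training images per digit and 10 labeled samples per class, this gives 100 labeled and 900 unlabeled training samples; repeating the split with different seeds corresponds to averaging over repeated runs.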

It should be noted that the particular methods of the above-described embodiments may form a computer program product, and that the computer program product in which the present application is implemented may therefore be stored on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.).

Claims

1. A handwriting digital recognition method, characterized by comprising the steps of:

，

wherein the objective combines the intra-class divergence matrix, the inter-class divergence matrices, the L2,1 norm of the projection matrix and the manifold regularization term, together with balance parameters and an identity matrix; m is the number of features, d is the dimension of the projection space, and the number of classes is the number of categories of label information in the training data; the projection matrix and the weight vector are solved by an iterative optimization method to obtain the optimal projection matrix;

Step S5, normalizing a sample to be recognized, obtaining projected data through the optimal projection matrix, and obtaining a recognition label with a nearest neighbor classifier; the step S3 includes the steps of:

，

wherein the neighbor matrix is defined in terms of the nearest-neighbor set of each sample;

，

2. The handwritten digit recognition method of claim 1, wherein solving the projection matrix and the weight vector by the iterative optimization method to obtain the optimal projection matrix comprises: calculating an intermediate matrix first to obtain a simplified optimization objective for the projection matrix; and obtaining a solution of the weight vector from the Cauchy inequality;

3. The method according to claim 2, wherein the stop condition in step S4.3 is that。

4. The handwritten digit recognition method according to claim 1, wherein the intra-class divergence matrix is calculated from the samples of each class and the mean vector of that class, and the inter-class divergence matrices are calculated for pairs of classes.

5. A handwriting digital recognition device, comprising:

，

and an identification module: carrying out normalization processing on a sample to be identified, obtaining projected data through an optimal projection matrix, and obtaining an identification tag by adopting a nearest neighbor classifier;

the calculation of the manifold regularization term by the second calculation module includes:

constructing a neighbor graph from the training data, which consist of the labeled data and the unlabeled data, to obtain a neighbor matrix, wherein the neighbor matrix is defined in terms of the nearest-neighbor set of each sample;

calculating the Laplacian matrix of the manifold regularization term from the neighbor matrix and a diagonal matrix, and obtaining from it the manifold regularization term in the projection space;

6. The handwritten digit recognition device of claim 5, wherein the optimal projection matrix solving module solves the projection matrix and the weight vector by an iterative optimization method to obtain the optimal projection matrix, comprising the steps of:

initializing the weights and solving for the projection matrix, whereby the LapAWDA optimization function becomes a problem in the projection matrix alone, an intermediate matrix being calculated first to simplify the optimization objective;

fixing the projection matrix and solving for the weight vector, whereby the LapAWDA objective function becomes a problem in the weight vector alone, a solution of the weight vector being obtained from the Cauchy inequality;

updating the weight vector and solving for the projection matrix again; after the optimal projection matrix of the current round is obtained, the next round of the alternating iteration begins, that is, the projection matrix is fixed and the weight vector is updated again; the projection matrix is solved repeatedly in this way until the stopping condition is met, which yields the optimal projection matrix.

7. The handwriting recognition device of claim 6, wherein the stop condition is。

8. A computer storage medium having stored thereon a computer program which, when executed by a processor, implements the handwriting digital recognition method of any one of claims 1 to 4.