CN102722713A - Handwritten numeral recognition method based on lie group structure data and system thereof - Google Patents

Publication number
CN102722713A
Authority
CN
China
Prior art keywords
lie group
structured data
training
group structured
data
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012100411160A
Other languages
Chinese (zh)
Other versions
CN102722713B (en)
Inventor
张莉
王晓乾
杨季文
何书萍
李凡长
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Application filed by Suzhou University
Priority to CN201210041116.0A
Publication of CN102722713A
Application granted
Publication of CN102722713B
Legal status: Expired - Fee Related

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a handwritten numeral recognition method and system based on Lie group structured data. The method extracts Lie group structured data from original handwritten numeral image data; constructs a matrix Gaussian kernel function and uses a support vector machine algorithm to train classifier models; and inputs the Lie group structured data corresponding to the handwritten numeral image data under test into each trained classifier model to obtain the corresponding digit class. In this way the nonlinear features of the Lie group structured data are captured and handwritten numeral recognition is improved.

Description

A handwritten numeral recognition method and system based on Lie group structured data
Technical field
The present invention relates to the technical field of handwritten numeral recognition and, more particularly, to a handwritten numeral recognition method and system based on Lie group structured data.
Background
With the rapid development of computer technology and digital image processing in recent years, handwritten numeral recognition has been widely applied in large-scale data statistics, mail sorting, finance, and taxation. At the same time, with the spread of machine learning techniques, many physicists and chemists have begun to use Lie group theory widely to study data in their fields. Accordingly, Lie group structured data, with its good mathematical structure, is widely used in the field of handwritten numeral recognition.
At present, handwritten numeral recognition based on Lie group structured data generally builds a classifier model with a classification algorithm, classifies the Lie group structured data of the handwritten numeral image to obtain the classifier output, and derives the recognition result from that output. The classification algorithm commonly used in the prior art is the Lie group Fisher algorithm. It applies a linear projection to the original Lie group structured data so that data of the same class are projected as close together as possible and data of different classes as far apart as possible. Although the projected data are well separated, the Lie group Fisher algorithm is a linear classification method and cannot capture the nonlinear features of Lie group structured data, which is a defect when such features must be handled.
Summary of the invention
In view of this, the present invention provides a handwritten numeral recognition method and system based on Lie group structured data, to overcome the defect of existing handwritten numeral recognition techniques in handling the nonlinear features of Lie group structured data and to achieve nonlinear processing of such data.
To achieve the above object, the present invention provides the following technical solution:
A handwritten numeral recognition method based on Lie group structured data, comprising the steps of:
A. extracting a corresponding number of Lie group structured data from original handwritten numeral image data;
B. taking each Lie group structured datum together with the class label of its corresponding handwritten numeral image data as a training sample, obtaining a training sample set corresponding to said number of Lie group structured data, and constructing a matrix Gaussian kernel function for processing Lie group structured data:
$$k(z_a, z_b) = e^{-p \times \|z_a - z_b\|_F^2},$$
where $z_a$ and $z_b$ denote any two Lie group structured data, $p > 0$ is the kernel parameter, and $\|\cdot\|_F$ is the matrix (Frobenius) norm;
C. using a support vector machine algorithm with said matrix Gaussian kernel function as the kernel, inputting the training samples, and training to obtain classifier models;
D. inputting the Lie group structured data corresponding to the handwritten numeral image data under test into each trained classifier model to obtain the corresponding digit class.
The present invention also provides a handwritten numeral recognition system based on Lie group structured data, comprising:
a Lie group structured data extraction module, configured to extract a corresponding number of Lie group structured data from original handwritten numeral image data;
a preprocessing module, configured to take each Lie group structured datum together with the class label of its corresponding handwritten numeral image data as a training sample, obtain a training sample set corresponding to said number of Lie group structured data, and construct a matrix Gaussian kernel function for processing Lie group structured data:
$$k(z_a, z_b) = e^{-p \times \|z_a - z_b\|_F^2},$$
where $z_a$ and $z_b$ denote any two Lie group structured data with $a \ne b$, $p > 0$ is the kernel parameter, and $\|\cdot\|_F$ is the matrix (Frobenius) norm;
a model training module, configured to use a support vector machine algorithm with said matrix Gaussian kernel function as the kernel, input the training samples, and train to obtain classifier models;
a classification module, configured to input the Lie group structured data corresponding to the handwritten numeral image data under test into each trained classifier model to obtain the corresponding digit class.
Based on the above technical solution, the embodiments of the invention construct a matrix Gaussian kernel function and use a support vector machine algorithm to process Lie group structured data. Drawing on the strengths of support vector machines in small-sample, nonlinear, and high-dimensional pattern recognition, they achieve nonlinear processing of Lie group structured data.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below are merely some embodiments of the invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a handwritten numeral recognition method based on Lie group structured data according to the invention;
Fig. 2 compares the classification performance of the Lie group mean algorithm, the Lie group Fisher algorithm, and the method of the invention on the digits 1 and 7;
Fig. 3 compares the classification performance of the Lie group mean algorithm and the method of the invention on the digits 1, 7 and 9;
Fig. 4 compares the classification performance of the Lie group mean algorithm and the method of the invention on the digits 1, 2, 7 and 9;
Fig. 5 is a structural block diagram of a handwritten numeral recognition system based on Lie group structured data according to the invention;
Fig. 6 is a structural block diagram of the model training module of the invention;
Fig. 7 is a structural block diagram of the classification module of the invention.
Detailed description
The inventors found through research that the support vector machine algorithm offers structural risk minimization and good generalization ability. Using it for handwritten numeral recognition based on Lie group structured data can address the recognition of handwritten numerals in small-sample, nonlinear, and high-dimensional settings, and thus overcome the defect of existing techniques in capturing the nonlinear features of Lie group structured data. The inventors also found, however, that Lie group structured data are matrix data rather than vector data, and that the standard support vector machine algorithm does not support matrix data, so the standard support vector machine method cannot process Lie group structured data directly.
After further research, the inventors found that by constructing a matrix Gaussian kernel function and using the support vector machine algorithm, corresponding classifier models can be built to classify Lie group structured data, thereby achieving the object of the invention.
In line with the above idea, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.
Fig. 1 is a flowchart of a handwritten numeral recognition method based on Lie group structured data according to the invention. Referring to Fig. 1, the method may comprise:
Step S100: extract a corresponding number of Lie group structured data from original handwritten numeral image data;
For ease of description, in the embodiments of the invention $x$ denotes handwritten numeral image data and $z$ denotes Lie group structured data. Let the number of original handwritten numeral images be $l$ (an integer, $l \ge 1$); the original handwritten numeral image data are then $x_1, \dots, x_l$, and the corresponding Lie group structured data are $z_1, \dots, z_l$.
Take the extraction of Lie group structured data from a single handwritten numeral image as an example. Let $rv$ be a reference vector, and take $k$ points at random on the stroke region of the image $x$ to form $k$ vectors $v_i = r_i n_i$, $i = 1, \dots, k$, where $r_i$ is the modulus (length) of the vector and $n_i$ is the direction of $v_i$, with $\|n_i\| = 1$ and $n_i = M_i \times rv$. Here $\|\cdot\|$ denotes the modulus of a vector, and $M_i$ is defined in terms of $e^{A_i}$ and $\theta_i$:
[Formula images BDA0000137587660000043 and BDA0000137587660000044 of the original are not reproduced in this text.]
The Lie group structured data $z$ corresponding to the handwritten numeral image sample $x$ is then obtained as: [formula not reproduced in this text]
Step S200: take each Lie group structured datum together with the class label of its corresponding handwritten numeral image data as a training sample, obtain a training sample set corresponding to said number of Lie group structured data, and construct a matrix Gaussian kernel function for processing Lie group structured data;
Let $y$ be the class label of handwritten numeral image data $x$, $y \in \{1, \dots, c\}$, where $c$ is the number of classes of handwritten numeral image data. Combining the extracted Lie group structured data $z_1, \dots, z_l$ with the class labels $y_1, \dots, y_l$ of the corresponding handwritten numeral image data yields the training sample set $\{(z_1, y_1), \dots, (z_l, y_l)\}$, which records the correspondence between $x$ and $y$;
The standard support vector machine algorithm does not currently support Lie group structured data, so the kernel functions of existing support vector machine algorithms are not compatible with the invention. To apply the support vector machine algorithm to Lie group structured data, a matrix Gaussian kernel function can be constructed so that Lie group structured data become compatible with the algorithm. The matrix Gaussian kernel function is:
$$k(z_a, z_b) = e^{-p \times \|z_a - z_b\|_F^2},$$
where $z_a$ and $z_b$ denote any two Lie group structured data, $a$ and $b$ are integers with $a, b \in \{1, \dots, l\}$ and $a \ne b$, $p > 0$ is the kernel parameter, and $\|\cdot\|_F$ is the matrix (Frobenius) norm.
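The kernel above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming each Lie group structured datum is stored as a nested list of floats and that the norm with subscript F is the Frobenius norm; the function names and the sample matrices are hypothetical and not taken from the patent.

```python
import math

def frobenius_sq(za, zb):
    """Squared Frobenius norm of the difference of two equally shaped matrices."""
    if len(za) != len(zb) or any(len(ra) != len(rb) for ra, rb in zip(za, zb)):
        raise ValueError("matrices must have the same shape")
    return sum((a - b) ** 2 for ra, rb in zip(za, zb) for a, b in zip(ra, rb))

def matrix_gaussian_kernel(za, zb, p):
    """k(za, zb) = exp(-p * ||za - zb||_F^2), with kernel parameter p > 0."""
    if p <= 0:
        raise ValueError("kernel parameter p must be positive")
    return math.exp(-p * frobenius_sq(za, zb))

def gram_matrix(samples, p):
    """Kernel value for every pair of matrices in the list."""
    return [[matrix_gaussian_kernel(a, b, p) for b in samples] for a in samples]

# Illustrative 2x2 matrices (made-up values, not extracted from real digit images).
z1 = [[1.0, 0.0], [0.0, 1.0]]
z2 = [[0.0, 1.0], [1.0, 0.0]]
K = gram_matrix([z1, z2], p=0.5)
```

A table of kernel values like `K` is what a support vector machine solver that accepts user-supplied kernel values would consume; which solver to use is left open here.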
Step S300: use a support vector machine algorithm with said matrix Gaussian kernel function as the kernel, input the training samples, and train to obtain classifier models;
Those skilled in the art know the conventional way of training with a support vector machine algorithm. The embodiment of the invention additionally provides a training method that supports multi-class problems on Lie group structured data. Specifically, each Lie group structured datum is input into each of the $\binom{c}{2}$ classifier models (one per combination of 2 classes out of $c$), which yields $\binom{c}{2}$ classifier outputs per datum. From these outputs, a value is accumulated for each of the $c$ classes indicating how strongly the datum is assigned to that class, the maximum of these values is found, and the class attaining the maximum is taken as the digit class of the handwritten numeral image data corresponding to that Lie group structured datum.
To make the training method of the embodiment easier to follow, the concrete training procedure is given below.
The training sample set $\{(z_1, y_1), \dots, (z_l, y_l)\}$ has $c$ classes. Take from it the samples of any two class labels, so that the extracted samples carry only these two labels. Treating the samples of two classes as one combination gives $\binom{c}{2}$ combinations. For convenience, the training of a classifier model is described using one such combination, the samples of classes $i$ and $j$ ($i, j \in \{1, \dots, c\}$, $i \ne j$). The training procedure is as follows:
After the samples of classes $i$ and $j$ are extracted from the training sample set $\{(z_1, y_1), \dots, (z_l, y_l)\}$, they are rewritten in the following form: let the two-class training set be
$$\{(z_m^{ij}, \bar{y}_m^{ij})\}_{m=1}^{l_{ij}},$$
where the superscript $ij$ marks data related to classes $i$ and $j$, the subscript $m$ is an index, $z_m^{ij}$ denotes the Lie group structured data of classes $i$ and $j$, $l_{ij}$ is the total number of samples of classes $i$ and $j$, and $\bar{y}_m^{ij}$ is the class label corresponding to $z_m^{ij}$: when $y_m^{ij} = i$, $\bar{y}_m^{ij} = -1$; when $y_m^{ij} = j$, $\bar{y}_m^{ij} = +1$;
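The pairing and relabelling just described can be sketched as follows. The function name `pair_subsets` and the string placeholders standing in for Lie group matrices are illustrative assumptions; only the rule "class i becomes -1, class j becomes +1" comes from the text.

```python
from itertools import combinations

def pair_subsets(samples, labels, c):
    """Build one two-class subset per unordered class pair (i, j), i < j.

    Within the subset for (i, j), samples of class i are relabelled -1
    and samples of class j are relabelled +1.
    """
    subsets = {}
    for i, j in combinations(range(1, c + 1), 2):
        subsets[(i, j)] = [(z, -1 if y == i else +1)
                           for z, y in zip(samples, labels) if y in (i, j)]
    return subsets

# Placeholder samples: strings stand in for Lie group matrices.
samples = ["z1", "z2", "z3", "z4", "z5", "z6"]
labels = [1, 2, 3, 1, 2, 3]
subsets = pair_subsets(samples, labels, c=3)
```

For c = 3 this yields the three subsets keyed (1, 2), (1, 3) and (2, 3); for the four-digit experiment described later it would yield six.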
The invention uses the support vector machine method on Lie group structured data to recognize handwritten numeral image data. When the support vector machine method is applied to the two-class problem for classes $i$ and $j$, the following optimization problem must be solved:
$$\max \; \sum_{m=1}^{l_{ij}} \beta_m^{ij} - \frac{1}{2} \sum_{m=1}^{l_{ij}} \sum_{n=1}^{l_{ij}} \bar{y}_m^{ij} \bar{y}_n^{ij} \beta_m^{ij} \beta_n^{ij} \, k(z_m^{ij}, z_n^{ij})$$
$$\text{s.t.} \quad \sum_{m=1}^{l_{ij}} \bar{y}_m^{ij} \beta_m^{ij} = 0, \qquad 0 \le \beta_m^{ij} \le S,$$
where $m$ and $n$ are integer indices with $m, n \in \{1, \dots, l_{ij}\}$, $\bar{y}_n^{ij}$ is the class label corresponding to $z_n^{ij}$, $\beta_m^{ij}$ are the model coefficients produced by support vector machine training, and $S$ is the regularization parameter required for training. Solving this optimization problem yields the following classifier model:
$$f_{ij}(z) = \operatorname{sgn}\left\{ \sum_{m=1}^{l_{ij}} \beta_m^{ij} \bar{y}_m^{ij} \, k(z, z_m^{ij}) + b_{ij} \right\}, \quad i, j = 1, \dots, c, \; i \ne j;$$
Here $\operatorname{sgn}(\cdot)$ is the sign function and $b_{ij}$ is the model threshold, computed as
$$b_{ij} = \bar{y}_{sv} - \sum_{m=1}^{l_{ij}} \beta_m^{ij} \bar{y}_m^{ij} \, k(z_{sv}, z_m^{ij}),$$
where $z_{sv}$ is a training sample whose coefficient satisfies $0 < \beta_{sv}^{ij} < S$.
The above gives the classifier model for the samples of classes $i$ and $j$. The classifier models for the remaining combinations drawn from the training samples are trained in the same way and are not described again here.
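To make the decision function and threshold concrete, the sketch below evaluates them for a toy problem with one sample per class. It does not implement a general solver: for two samples the dual problem reduces to a single variable with the closed-form maximizer 1 / (1 - k12), which is used directly and is only valid when the regularization parameter S is at least that large. The function names, the sample matrices, and the choice to map a score of exactly zero to +1 are assumptions of this sketch.

```python
import math

def kernel(za, zb, p):
    """Matrix Gaussian kernel exp(-p * ||za - zb||_F^2)."""
    sq = sum((a - b) ** 2 for ra, rb in zip(za, zb) for a, b in zip(ra, rb))
    return math.exp(-p * sq)

def threshold(train, beta, sv_index, p):
    """b_ij = ybar_sv - sum_m beta_m * ybar_m * k(z_sv, z_m)."""
    z_sv, y_sv = train[sv_index]
    return y_sv - sum(bm * y * kernel(z_sv, z, p) for bm, (z, y) in zip(beta, train))

def decide(z, train, beta, b, p):
    """f_ij(z): sign of sum_m beta_m * ybar_m * k(z, z_m) + b_ij, as -1 or +1."""
    score = sum(bm * y * kernel(z, zm, p) for bm, (zm, y) in zip(beta, train)) + b
    return 1 if score >= 0 else -1  # a score of exactly 0 is mapped to +1 (assumption)

# Toy two-class set: one made-up matrix per class, labels already coded as -1 / +1.
p = 0.5
train = [([[1.0, 0.0], [0.0, 1.0]], -1), ([[0.0, 1.0], [1.0, 0.0]], +1)]
k12 = kernel(train[0][0], train[1][0], p)
beta = [1.0 / (1.0 - k12)] * 2  # closed-form dual solution for two samples
b = threshold(train, beta, sv_index=0, p=p)
out_i = decide(train[0][0], train, beta, b, p)
out_j = decide(train[1][0], train, beta, b, p)
```

In this symmetric toy case the threshold comes out at zero up to rounding, and each training matrix is assigned to its own side of the decision function.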
Step S400: input the Lie group structured data corresponding to the handwritten numeral image data under test into each trained classifier model to obtain the corresponding digit class.
After the classifier models are obtained in step S300, the corresponding Lie group structured data can be extracted from the handwritten numeral image data under test so that the data can be classified and the corresponding digit class obtained. Note that the original handwritten numeral image data in step S100 serve to train the classifier models and can be regarded as a large database of handwritten numeral images, whereas the handwritten numeral image data under test in step S400 are the objects to be recognized by the method of the invention, that is, the images of the handwritten numerals that need to be recognized.
The classifier models trained in the embodiment of the invention can handle multi-class problems on Lie group structured data. Classification proceeds as follows: the Lie group structured data corresponding to the handwritten numeral image data under test are input into each of the $\binom{c}{2}$ classifier models, so that each Lie group structured datum yields $\binom{c}{2}$ classifier outputs. From these outputs, a value is accumulated for each of the $c$ classes indicating how strongly the datum is assigned to that class, the maximum of these values is found, and the class attaining the maximum is taken as the digit class of the handwritten numeral image data corresponding to that Lie group structured datum;
The classifier outputs can be written $f_{ij}(z)$, $i, j = 1, \dots, c$, $i \ne j$. The value with which a Lie group structured datum is assigned to class $i$ is computed as
$$\sum_{j=1,\, j \ne i}^{c} f_{ij}(z), \quad i = 1, \dots, c.$$
This gives $c$ values, one for each class $i$. The maximum of these $c$ values is found by
$$\max_{i \in \{1, \dots, c\}} \sum_{j=1,\, j \ne i}^{c} f_{ij}(z),$$
and the digit class corresponding to the maximum is the class of the handwritten numeral image data corresponding to this Lie group structured datum.
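The combination step can be sketched as a vote over the pairwise classifiers. Since the label coding above maps class i to -1, this sketch assumes that an output of -1 from the classifier for the pair (i, j) favours class i and +1 favours class j, and it counts votes per class instead of summing signed outputs. That sign convention, the tie-breaking rule, and the stand-in classifiers are all assumptions for illustration.

```python
def classify_one_vs_one(z, classifiers, c):
    """Combine pairwise outputs into a single class label.

    classifiers maps each pair (i, j), i < j, to a function of z that
    returns -1 (favouring class i) or +1 (favouring class j).
    """
    votes = {label: 0 for label in range(1, c + 1)}
    for (i, j), f_ij in classifiers.items():
        winner = i if f_ij(z) < 0 else j
        votes[winner] += 1
    # Ties are broken in favour of the smallest class label (an assumption).
    best = max(votes, key=lambda label: (votes[label], -label))
    return best, votes

# Hypothetical stand-in classifiers for c = 3; real ones would be trained models.
classifiers = {
    (1, 2): lambda z: +1,  # favours class 2
    (1, 3): lambda z: -1,  # favours class 1
    (2, 3): lambda z: -1,  # favours class 2
}
predicted, votes = classify_one_vs_one(None, classifiers, c=3)
```

With these stand-ins class 2 collects two of the three votes and is returned as the prediction.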
By constructing a matrix Gaussian kernel function and using a support vector machine algorithm to process the Lie group structured data corresponding to handwritten numeral image data, the embodiment of the invention draws on the strengths of support vector machines in small-sample, nonlinear, and high-dimensional pattern recognition and captures the nonlinear features of Lie group structured data;
In addition, the multi-class problem on Lie group structured data is reduced to several two-class problems, each handled by the support vector machine algorithm. This achieves multi-class processing of Lie group structured data and thereby better handwritten numeral recognition.
The beneficial effects of the invention are verified by the following experiment:
A handwritten digit database typically holds 10 classes of handwritten numeral image data. Four of them, the digits 1, 2, 7 and 9, were selected for the experiment. For each class, the first 200 samples were taken from the training set and from the test set, so each class has 200 training samples and 200 test samples. Parameters were selected by 10-fold cross-validation on the training samples, with the regularization factor ranging over $\{2^{-1}, 2^{0}, \dots, 2^{4}\}$ and the matrix Gaussian kernel parameter over $\{2^{-10}, 2^{-9}, \dots, 2^{-6}\}$. A model was then retrained with the selected parameters and its recognition rate was measured on the test set. The influence of the number of points in the point cloud on the recognition rate was also considered, with the number of points taking values in {5, 10, 15, 20, 25, 30, 35, 40, 50}. Because the point cloud is generated at random, each experiment can be repeated 5 times and the average result reported. Figs. 2 to 4 show the experimental results obtained with the Lie group mean algorithm, the Lie group Fisher algorithm, and the technical solution of the invention.
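The parameter search described above can be laid out as follows. This sketch only enumerates the candidate parameter pairs and the fold indices; the strided fold split, the variable names, and the assumption that the 800 training samples (4 classes of 200) have already been shuffled are illustrative choices, not details given in the text.

```python
from itertools import product

# Candidate values quoted in the experiment description.
S_grid = [2.0 ** e for e in range(-1, 5)]    # regularization factor: 2^-1 ... 2^4
p_grid = [2.0 ** e for e in range(-10, -5)]  # kernel parameter: 2^-10 ... 2^-6
param_grid = list(product(S_grid, p_grid))   # every (S, p) pair to be validated

def k_fold_indices(n_samples, n_folds=10):
    """Yield (train_indices, validation_indices) per fold using a strided split."""
    for fold in range(n_folds):
        val = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        yield train, val

folds = list(k_fold_indices(800))  # 4 classes x 200 training samples
```

Each of the 30 parameter pairs would be scored by training on the nine training folds and measuring accuracy on the held-out fold, and the pair with the best average would be used to retrain on all training samples.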
Fig. 2 compares the classification performance of the Lie group mean algorithm, the Lie group Fisher algorithm, and the method of the invention on the digits 1 and 7. Fig. 2 shows that the classification performance of the invention is clearly better than that of the Lie group mean algorithm and the Lie group Fisher algorithm, and that the recognition rate tends to increase with the number of points taken per sample. Fig. 3 compares the classification performance of the Lie group mean algorithm and the method of the invention on the digits 1, 7 and 9, and Fig. 4 does the same for the digits 1, 2, 7 and 9. Figs. 3 and 4 show that the multi-class performance of the invention is clearly better than that of the Lie group mean algorithm.
Fig. 5 is a structural block diagram of a handwritten numeral recognition system based on Lie group structured data according to the invention. Referring to Fig. 5, the system may comprise:
a Lie group structured data extraction module 100, configured to extract a corresponding number of Lie group structured data from original handwritten numeral image data;
a preprocessing module 200, configured to take each Lie group structured datum together with the class label of its corresponding handwritten numeral image data as a training sample, obtain a training sample set corresponding to said number of Lie group structured data, and construct a matrix Gaussian kernel function for processing Lie group structured data:
$$k(z_a, z_b) = e^{-p \times \|z_a - z_b\|_F^2},$$
where $z_a$ and $z_b$ denote any two Lie group structured data with $a \ne b$, $p > 0$ is the kernel parameter, and $\|\cdot\|_F$ is the matrix (Frobenius) norm;
a model training module 300, configured to use a support vector machine algorithm with said matrix Gaussian kernel function as the kernel, input the training samples, and train to obtain classifier models;
a classification module 400, configured to input the Lie group structured data corresponding to the handwritten numeral image data under test into each trained classifier model to obtain the corresponding digit class.
The structure of the model training module 300 may be as shown in Fig. 6 and comprise:
a combination acquisition unit 301, configured to take the samples of any two class labels from said training sample set, obtaining $\binom{c}{2}$ combinations, where $c$ is the number of classes of handwritten numeral image data;
a loop training unit 302, configured to take each combination as a unit and, for each, use a support vector machine algorithm with said matrix Gaussian kernel function as the kernel, input the samples of that combination, and train, obtaining $\binom{c}{2}$ classifier models.
Further, the loop training unit 302 may comprise:
a training subunit (not shown), configured to extract the combination containing the samples of classes $i$ and $j$, with $i, j \in \{1, \dots, c\}$ and $i \ne j$, and carry out the classifier-model training procedure: let the two-class training set be
$$\{(z_m^{ij}, \bar{y}_m^{ij})\}_{m=1}^{l_{ij}},$$
where $l$ denotes the number of handwritten numeral image data, $x$ denotes handwritten numeral image data, $z$ denotes Lie group structured data, $y$ is the class label of handwritten numeral image data $x$ with $y \in \{1, \dots, c\}$, the superscript $ij$ marks data related to classes $i$ and $j$, the subscript $m$ is an index, $z_m^{ij}$ denotes the Lie group structured data of classes $i$ and $j$, $l_{ij}$ is the total number of samples of classes $i$ and $j$, and $\bar{y}_m^{ij}$ is the class label corresponding to $z_m^{ij}$: when $y_m^{ij} = i$, $\bar{y}_m^{ij} = -1$; when $y_m^{ij} = j$, $\bar{y}_m^{ij} = +1$; and solve
$$\max \; \sum_{m=1}^{l_{ij}} \beta_m^{ij} - \frac{1}{2} \sum_{m=1}^{l_{ij}} \sum_{n=1}^{l_{ij}} \bar{y}_m^{ij} \bar{y}_n^{ij} \beta_m^{ij} \beta_n^{ij} \, k(z_m^{ij}, z_n^{ij})$$
$$\text{s.t.} \quad \sum_{m=1}^{l_{ij}} \bar{y}_m^{ij} \beta_m^{ij} = 0, \qquad 0 \le \beta_m^{ij} \le S,$$
where $m$ and $n$ are integer indices with $m, n \in \{1, \dots, l_{ij}\}$, $\beta_m^{ij}$ are the model coefficients produced by support vector machine training, and $S$ is the regularization parameter required for training; the solution yields the classifier model
$$f_{ij}(z) = \operatorname{sgn}\left\{ \sum_{m=1}^{l_{ij}} \beta_m^{ij} \bar{y}_m^{ij} \, k(z, z_m^{ij}) + b_{ij} \right\},$$
where $\operatorname{sgn}(\cdot)$ is the sign function and $b_{ij}$ is the model threshold;
a loop subunit (not shown), configured to extract another combination after said training subunit completes the above classifier-model training procedure and to run the procedure again, until $\binom{c}{2}$ classifier models have been obtained.
The structure of the classification module 400 may be as shown in Fig. 7 and comprise:
a computing unit 401, configured to input the Lie group structured data corresponding to the handwritten numeral image data under test into each of the $\binom{c}{2}$ classifier models, so that each Lie group structured datum yields $\binom{c}{2}$ classifier outputs;
a statistics unit 402, configured to accumulate from said outputs, for each of the $c$ classes, the value with which the Lie group structured datum is assigned to that class, and to find the maximum of these values;
a determination unit 403, configured to take the class attaining the maximum found by said statistics unit as the digit class of the handwritten numeral image data corresponding to the Lie group structured datum.
Further, the statistics unit 402 may comprise:
a class value statistics subunit (not shown), configured to compute, according to the formula
$$\sum_{j=1,\, j \ne i}^{c} f_{ij}(z), \quad i \in \{1, \dots, c\},$$
the value with which the Lie group structured datum is assigned to class $i$ in said outputs, where class $i$ is any one of the $c$ classes to be counted;
a maximum search subunit (not shown), configured to find, according to the formula
$$\max_{i \in \{1, \dots, c\}} \sum_{j=1,\, j \ne i}^{c} f_{ij}(z),$$
the maximum among the values computed by said statistics subunit.
The handwritten numeral recognition system based on Lie group structured data corresponds to the handwritten numeral recognition method based on Lie group structured data. For the specific functions of the system, refer to the corresponding method; they are not repeated here.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To make the interchangeability of hardware and software clear, the composition and steps of each example have been described above in general terms of function. Whether these functions are carried out in hardware or in software depends on the particular application and the design constraints of the technical solution. Skilled practitioners may implement the described functions in different ways for each particular application, but such implementations should not be considered to go beyond the scope of the invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above description of the disclosed embodiments enables those skilled in the art to make or use the invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein can be implemented in other embodiments without departing from the spirit or scope of the invention. The invention is therefore not limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A handwritten numeral recognition method based on Lie group structured data, characterized in that it comprises the steps of:
A. extracting a corresponding quantity of Lie group structured data from original handwritten digit image data;
B. taking the correspondence between each item of Lie group structured data and the class label of its corresponding handwritten digit image data as a training sample, thereby obtaining a training sample set corresponding to said quantity of Lie group structured data, and at the same time constructing a matrix Gaussian kernel function for processing Lie group structured data:
$k(z_a, z_b) = e^{-p \|z_a - z_b\|_F^2}$, where $z_a$ and $z_b$ denote any two items of Lie group structured data, $p > 0$ is the kernel parameter, and $\|\cdot\|_F$ is the matrix (Frobenius) norm;
C. using a support vector machine algorithm with said matrix Gaussian kernel function as the kernel function, inputting the training samples, and training to obtain classifier models;
D. inputting the Lie group structured data corresponding to the handwritten digit image data to be tested into each of the trained classifier models to obtain the corresponding digit class.
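The matrix Gaussian kernel of step B can be illustrated with a minimal Python sketch. It assumes each item of Lie group structured data is a real matrix stored as a list of rows and that the norm is the Frobenius norm; the function names and the default value of `p` are hypothetical and do not come from the patent.

```python
import math


def frobenius_sq(za, zb):
    """Squared Frobenius norm of (za - zb) for two equally shaped matrices."""
    if len(za) != len(zb) or any(len(ra) != len(rb) for ra, rb in zip(za, zb)):
        raise ValueError("matrices must have the same shape")
    return sum((a - b) ** 2 for ra, rb in zip(za, zb) for a, b in zip(ra, rb))


def matrix_gaussian_kernel(za, zb, p=0.5):
    """k(za, zb) = exp(-p * ||za - zb||_F^2), with kernel parameter p > 0."""
    if p <= 0:
        raise ValueError("p must be positive")
    return math.exp(-p * frobenius_sq(za, zb))


def gram_matrix(samples, p=0.5):
    """Kernel values for every pair of samples, as a support vector machine
    solver that accepts a precomputed kernel would typically expect."""
    return [[matrix_gaussian_kernel(a, b, p) for b in samples] for a in samples]
```

The kernel equals 1 when the two matrices coincide and decays toward 0 as their Frobenius distance grows, with `p` controlling how quickly.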
2. The method according to claim 1, characterized in that said step C specifically comprises:
selecting from said training sample set the samples corresponding to any two class labels, thereby obtaining $\binom{c}{2}$ combinations (the number of ways of choosing 2 classes from $c$), where $c$ is the number of classes of handwritten digit image data and each combination contains the samples of two class labels; and, taking each combination as a unit, using the support vector machine algorithm with said matrix Gaussian kernel function as the kernel function, inputting the samples of each combination, and training to obtain $\binom{c}{2}$ classifier models.
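The pairing described in claim 2 can be sketched as follows. This is only an illustration of how the training set might be split into two-class subsets; the function name `pairwise_subsets` and the data layout (parallel lists of samples and labels) are assumptions, and the training of each classifier is left out.

```python
from itertools import combinations


def pairwise_subsets(samples, labels):
    """Group training samples into one subset per unordered pair of classes.

    Returns a dict mapping (i, j), with i < j, to the list of (sample, label)
    pairs whose label is i or j. With c distinct labels there are
    c * (c - 1) / 2 such subsets, one per classifier to be trained.
    """
    classes = sorted(set(labels))
    subsets = {}
    for i, j in combinations(classes, 2):
        subsets[(i, j)] = [(z, y) for z, y in zip(samples, labels) if y in (i, j)]
    return subsets
```

For the ten digit classes this yields 45 subsets, so 45 two-class classifiers would be trained.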
3. The method according to claim 2, characterized in that said step C comprises:
C1. selecting from said training sample set the samples corresponding to any two class labels, thereby obtaining $\binom{c}{2}$ combinations;
C2. extracting the combination containing the samples of the two classes $i$ and $j$, where $i, j \in \{1, \dots, c\}$ and $i \neq j$, and executing the classifier model training procedure: let
[formula image FDA0000137587650000012]
[formula image FDA0000137587650000013]
where $l$ denotes the number of handwritten digit images, $z$ denotes Lie group structured data, $y$ is the class label of the handwritten digit image data, $y \in \{1, \dots, c\}$, the index pair $ij$ denotes data information relating to the two classes $i$ and $j$, and the subscript $m$ denotes an index; [formula image FDA0000137587650000014] denotes the Lie group structured data relating to the two classes $i$ and $j$; $l_{ij}$ denotes the total number of samples in the two classes $i$ and $j$; [formula image FDA0000137587650000015] is the class label corresponding to [formula image FDA0000137587650000016]; when [formula image FDA0000137587650000017], then [formula image FDA0000137587650000018], and when [formula image FDA0000137587650000019], then $\bar{y}_m^{ij} = +1$; and solving
$\max \sum_{m=1}^{l_{ij}} \beta_m^{ij} - \frac{1}{2} \sum_{m=1}^{l_{ij}} \sum_{n=1}^{l_{ij}} \bar{y}_m^{ij} \bar{y}_n^{ij} \beta_m^{ij} \beta_n^{ij} k(z_m^{ij}, z_n^{ij})$
where [formula image FDA0000137587650000022] is the class label corresponding to [formula image FDA0000137587650000024]; $m$ and $n$ each denote an index; [formula image FDA0000137587650000025]; $m$ and $n$ are integers with $m, n \in \{1, \dots, l_{ij}\}$; [formula image FDA0000137587650000026] are the model coefficients produced by support vector machine training; and $S$ is the regularization parameter required for support vector machine training; the classifier model obtained from the above solution is
$f_{ij}(z) = \operatorname{sgn}\left\{ \sum_{m=1}^{l_{ij}} \beta_m^{ij} \bar{y}_m^{ij} k(z, z_m^{ij}) + b_{ij} \right\}$, where $\operatorname{sgn}(\cdot)$ denotes the sign function and $b_{ij}$ is the model threshold;
C3. extracting another combination and executing the above classifier model training procedure, until $\binom{c}{2}$ classifier models have been obtained.
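Once the coefficients $\beta_m^{ij}$ and the threshold $b_{ij}$ of one two-class model are known, the decision function $f_{ij}(z)$ can be evaluated as in the sketch below. The training step itself (solving the maximization problem) is not shown. The function names, the list-of-rows matrix layout, and the convention that a sum of exactly zero maps to +1 are assumptions made for illustration.

```python
import math


def matrix_gaussian_kernel(za, zb, p):
    """k(za, zb) = exp(-p * ||za - zb||_F^2) for equally shaped matrices."""
    dist_sq = sum((a - b) ** 2 for ra, rb in zip(za, zb) for a, b in zip(ra, rb))
    return math.exp(-p * dist_sq)


def pairwise_decision(z, train_z, betas, ybars, b, p):
    """Evaluate sgn( sum_m beta_m * ybar_m * k(z, z_m) + b ).

    train_z : training matrices of the two classes i and j
    betas   : model coefficients, one per training matrix
    ybars   : two-class labels (+1 or -1), one per training matrix
    b       : model threshold
    p       : kernel parameter, p > 0
    """
    total = b
    for zm, beta, ybar in zip(train_z, betas, ybars):
        total += beta * ybar * matrix_gaussian_kernel(z, zm, p)
    # Assumption: a value of exactly zero is mapped to +1.
    return 1 if total >= 0 else -1
```

Training samples whose coefficient is zero contribute nothing to the sum, so in practice only the support vectors need to be stored.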
4. The method according to claim 2 or 3, characterized in that said step D specifically comprises:
inputting the Lie group structured data corresponding to the handwritten digit image data to be tested into each of said $\binom{c}{2}$ classifier models, so that one item of Lie group structured data yields $\binom{c}{2}$ corresponding classifier output results; counting, from said output results, the value with which this Lie group structured data is assigned to each of the $c$ classes; and finding the maximum among these values, said maximum determining the digit class of the handwritten digit image data corresponding to this Lie group structured data.
5. The method according to claim 4, characterized in that said counting, from said output results, of the value with which this Lie group structured data is assigned to one of the $c$ classes specifically comprises:
counting, according to the formula
[formula image FDA0000137587650000028]
$i \in \{1, \dots, c\}$, the value with which this Lie group structured data is assigned to class $i$ in said output results;
and said finding of the maximum specifically comprises:
finding the maximum according to the formula $f(z) = \max_{i = 1, \dots, c} \sum_{i = 1, i \neq j}^{c} f_{ij}(z)$.
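The combination of the pairwise outputs in claims 4 and 5 can be sketched as a simple vote count. Because the per-class counting formula is not reproduced in this text, the sketch makes two assumptions that are not taken from the patent: an output of +1 from the classifier for the pair (i, j) counts as one vote for class j and an output of -1 as one vote for class i, and ties are broken in favour of the class listed first. The function name `predict_digit` is hypothetical.

```python
def predict_digit(pair_outputs, classes):
    """Pick the class with the most votes among all pairwise classifier outputs.

    pair_outputs : dict mapping (i, j) to the output (+1 or -1) of the
                   classifier trained on classes i and j
    classes      : list of all class labels
    Returns (winning class, dict of votes per class).
    """
    votes = {c: 0 for c in classes}
    for (i, j), output in pair_outputs.items():
        # Assumption: +1 is a vote for j, -1 is a vote for i.
        winner = j if output > 0 else i
        votes[winner] += 1
    # max() keeps the first class encountered when several share the top count.
    best = max(classes, key=lambda c: votes[c])
    return best, votes
```

With ten digit classes the dictionary would hold 45 outputs, and a class can collect at most 9 votes.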
6. A handwritten numeral recognition system based on Lie group structured data, characterized in that it comprises:
a Lie group structured data extraction module, configured to extract a corresponding quantity of Lie group structured data from original handwritten digit image data;
a preprocessing module, configured to take the correspondence between each item of Lie group structured data and the class label of its corresponding handwritten digit image data as a training sample, thereby obtaining a training sample set corresponding to said quantity of Lie group structured data, and at the same time to construct a matrix Gaussian kernel function for processing Lie group structured data:
$k(z_a, z_b) = e^{-p \|z_a - z_b\|_F^2}$, where $z_a$ and $z_b$ denote any two items of Lie group structured data with $a \neq b$, $p > 0$ is the kernel parameter, and $\|\cdot\|_F$ is the matrix (Frobenius) norm;
a model training module, configured to use a support vector machine algorithm with said matrix Gaussian kernel function as the kernel function, input the training samples, and train to obtain classifier models;
a classification module, configured to input the Lie group structured data corresponding to the handwritten digit image data to be tested into each of the trained classifier models to obtain the corresponding digit class.
7. The system according to claim 6, characterized in that said model training module comprises:
a combination acquisition unit, configured to select from said training sample set the samples corresponding to any two class labels, thereby obtaining $\binom{c}{2}$ combinations (the number of ways of choosing 2 classes from $c$), where $c$ is the number of classes of handwritten digit image data;
a loop training unit, configured to take each combination as a unit, use the support vector machine algorithm with said matrix Gaussian kernel function as the kernel function, input the samples of each combination, and train to obtain $\binom{c}{2}$ classifier models.
8. The system according to claim 7, characterized in that said loop training unit comprises:
a training subunit, configured to extract the combination containing the samples of the two classes $i$ and $j$, where $i, j \in \{1, \dots, c\}$ and $i \neq j$, and to execute the classifier model training procedure: let
[formula image FDA0000137587650000032]
[formula image FDA0000137587650000033]
where $l$ denotes the number of handwritten digit images, $z$ denotes Lie group structured data, $y$ is the class label of the handwritten digit image data, $y \in \{1, \dots, c\}$, the index pair $ij$ denotes data information relating to the two classes $i$ and $j$, and the subscript $m$ denotes an index; [symbol not reproduced] denotes the Lie group structured data relating to the two classes $i$ and $j$; $l_{ij}$ denotes the total number of samples in the two classes $i$ and $j$; [formula image FDA0000137587650000035] is the class label corresponding to [formula image FDA0000137587650000036]; when [formula image FDA0000137587650000037], then [formula not reproduced], and when [formula image FDA0000137587650000039], then $\bar{y}_m^{ij} = +1$; and solving
$\max \sum_{m=1}^{l_{ij}} \beta_m^{ij} - \frac{1}{2} \sum_{m=1}^{l_{ij}} \sum_{n=1}^{l_{ij}} \bar{y}_m^{ij} \bar{y}_n^{ij} \beta_m^{ij} \beta_n^{ij} k(z_m^{ij}, z_n^{ij})$
where [formula image FDA00001375876500000313] is the class label corresponding to [formula image FDA00001375876500000314]; $m$ and $n$ each denote an index; $m$ and $n$ are integers with $m, n \in \{1, \dots, l_{ij}\}$; [formula image FDA00001375876500000316] are the model coefficients produced by support vector machine training; and $S$ is the regularization parameter required for support vector machine training; the classifier model obtained from the above solution is
$f_{ij}(z) = \operatorname{sgn}\left\{ \sum_{m=1}^{l_{ij}} \beta_m^{ij} \bar{y}_m^{ij} k(z, z_m^{ij}) + b_{ij} \right\}$, where $\operatorname{sgn}(\cdot)$ denotes the sign function and $b_{ij}$ is the model threshold;
a loop subunit, configured to extract another combination after said training subunit has completed the above classifier model training procedure and to execute the procedure again, until $\binom{c}{2}$ classifier models have been obtained.
9. The system according to claim 7 or 8, characterized in that said classification module comprises:
a computing unit, configured to input the Lie group structured data corresponding to the handwritten digit image data to be tested into each of said $\binom{c}{2}$ classifier models, so that one item of Lie group structured data yields $\binom{c}{2}$ corresponding classifier output results;
a statistics unit, configured to count, from said output results, the value with which this Lie group structured data is assigned to each of the $c$ classes, and to find the maximum among these values;
a determining unit, configured to take the maximum found by said statistics unit as determining the digit class of the handwritten digit image data corresponding to this Lie group structured data.
10. The system according to claim 9, characterized in that said statistics unit comprises:
a class value statistics subunit, configured to count, according to the formula [formula not reproduced], $i \in \{1, \dots, c\}$, the value with which this Lie group structured data is assigned to class $i$ in said output results, class $i$ being whichever of the $c$ classes is currently being counted;
a maximum search subunit, configured to find, according to the formula $f(z) = \max_{i = 1, \dots, c} \sum_{i = 1, i \neq j}^{c} f_{ij}(z)$, the maximum among the values counted by said class value statistics subunit.
CN201210041116.0A 2012-02-22 2012-02-22 Handwritten numeral recognition method based on lie group structure data and system thereof Expired - Fee Related CN102722713B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210041116.0A CN102722713B (en) 2012-02-22 2012-02-22 Handwritten numeral recognition method based on lie group structure data and system thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210041116.0A CN102722713B (en) 2012-02-22 2012-02-22 Handwritten numeral recognition method based on lie group structure data and system thereof

Publications (2)

Publication Number Publication Date
CN102722713A true CN102722713A (en) 2012-10-10
CN102722713B CN102722713B (en) 2014-07-16

Family

ID=46948463

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210041116.0A Expired - Fee Related CN102722713B (en) 2012-02-22 2012-02-22 Handwritten numeral recognition method based on lie group structure data and system thereof

Country Status (1)

Country Link
CN (1) CN102722713B (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102982343A (en) * 2012-11-12 2013-03-20 信阳师范学院 Handwritten number recognition and incremental type obscure support vector machine method
CN103164701A (en) * 2013-04-10 2013-06-19 苏州大学 Method and device for recognizing handwritten numbers
CN103218613A (en) * 2013-04-10 2013-07-24 苏州大学 Method and device for identifying handwritten form figures
CN103258211A (en) * 2013-05-31 2013-08-21 苏州大学 Handwriting digital recognition method and system
CN103310217A (en) * 2013-06-20 2013-09-18 苏州大学 Handwritten digit recognition method and device on basis of image covariance characteristics
CN103310237A (en) * 2013-07-09 2013-09-18 苏州大学 Handwritten digit recognition method and system
CN103400161A (en) * 2013-07-18 2013-11-20 苏州大学 Handwritten numeral recognition method and system
CN108647670A (en) * 2018-05-22 2018-10-12 哈尔滨理工大学 A kind of characteristic recognition method of the lateral vehicle image based on support vector machines
CN109978064A (en) * 2019-03-29 2019-07-05 苏州大学 Lie group dictionary learning classification method based on image set
WO2019232861A1 (en) * 2018-06-04 2019-12-12 平安科技(深圳)有限公司 Handwriting model training method and apparatus, text recognition method and apparatus, and device and medium
CN111026897A (en) * 2019-11-19 2020-04-17 武汉大学 Scene classification method and system based on Lie-Fisher remote sensing image
CN111062417A (en) * 2019-11-19 2020-04-24 武汉大学 Lie-Mean-based flat shell defect detection method and system
CN111191618A (en) * 2020-01-02 2020-05-22 武汉大学 KNN scene classification method and system based on matrix group

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1624712A (en) * 2004-12-09 2005-06-08 上海交通大学 Hand writing number identification method based on kernel function

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
TOBIAS GLASMACHERS ET AL.: "Gradient-Based Adaptation of General Gaussian Kernels", 《NEURAL COMPUTATION》 *
李凡长: "基于Lie群的机器学习理论框架", 《云南民族大学学报(自然科学版)》 *
王晓乾等: "一种新的李群分类器在手写体数字中的应用", 《计算机工程与科学》 *
陈明等: "一种李群机器学习线性分类算法研究", 《微电子学与计算机》 *
高聪等: "李群核学习算法研究", 《计算机科学与探索》 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102982343B (en) * 2012-11-12 2015-03-25 信阳师范学院 Handwritten number recognition and incremental type obscure support vector machine method
CN102982343A (en) * 2012-11-12 2013-03-20 信阳师范学院 Handwritten number recognition and incremental type obscure support vector machine method
CN103164701A (en) * 2013-04-10 2013-06-19 苏州大学 Method and device for recognizing handwritten numbers
CN103218613A (en) * 2013-04-10 2013-07-24 苏州大学 Method and device for identifying handwritten form figures
CN103164701B (en) * 2013-04-10 2016-06-01 苏州大学 Handwritten Numeral Recognition Method and device
CN103218613B (en) * 2013-04-10 2016-04-20 苏州大学 Handwritten Numeral Recognition Method and device
CN103258211A (en) * 2013-05-31 2013-08-21 苏州大学 Handwriting digital recognition method and system
CN103310217A (en) * 2013-06-20 2013-09-18 苏州大学 Handwritten digit recognition method and device on basis of image covariance characteristics
CN103310217B (en) * 2013-06-20 2016-06-01 苏州大学 Based on Handwritten Numeral Recognition Method and the device of image covariance feature
CN103310237A (en) * 2013-07-09 2013-09-18 苏州大学 Handwritten digit recognition method and system
CN103310237B (en) * 2013-07-09 2016-08-24 苏州大学 Handwritten Numeral Recognition Method and system
CN103400161A (en) * 2013-07-18 2013-11-20 苏州大学 Handwritten numeral recognition method and system
CN108647670A (en) * 2018-05-22 2018-10-12 哈尔滨理工大学 A kind of characteristic recognition method of the lateral vehicle image based on support vector machines
WO2019232861A1 (en) * 2018-06-04 2019-12-12 平安科技(深圳)有限公司 Handwriting model training method and apparatus, text recognition method and apparatus, and device and medium
CN109978064A (en) * 2019-03-29 2019-07-05 苏州大学 Lie group dictionary learning classification method based on image set
CN111026897A (en) * 2019-11-19 2020-04-17 武汉大学 Scene classification method and system based on Lie-Fisher remote sensing image
CN111062417A (en) * 2019-11-19 2020-04-24 武汉大学 Lie-Mean-based flat shell defect detection method and system
CN111191618A (en) * 2020-01-02 2020-05-22 武汉大学 KNN scene classification method and system based on matrix group

Also Published As

Publication number Publication date
CN102722713B (en) 2014-07-16

Similar Documents

Publication Publication Date Title
CN102722713A (en) Handwritten numeral recognition method based on lie group structure data and system thereof
CN109388712A (en) A kind of trade classification method and terminal device based on machine learning
CN102982349B (en) A kind of image-recognizing method and device
CN103258217A (en) Pedestrian detection method based on incremental learning
CN103164701B (en) Handwritten Numeral Recognition Method and device
CN103177265B (en) High-definition image classification method based on kernel function Yu sparse coding
CN103617435A (en) Image sorting method and system for active learning
CN104951791A (en) Data classification method and apparatus
CN110533018A (en) A kind of classification method and device of image
CN103473275A (en) Automatic image labeling method and automatic image labeling system by means of multi-feature fusion
CN104615730A (en) Method and device for classifying multiple labels
CN102411592B (en) Text classification method and device
Pramanik et al. A study on the effect of CNN-based transfer learning on handwritten Indic and mixed numeral recognition
CN103971136A (en) Large-scale data-oriented parallel structured support vector machine classification method
CN114663002A (en) Method and equipment for automatically matching performance assessment indexes
CN108170697B (en) International trade file processing method and system and server
CN112434884A (en) Method and device for establishing supplier classified portrait
CN104966109A (en) Medical laboratory report image classification method and apparatus
CN112508000B (en) Method and equipment for generating OCR image recognition model training data
CN111553442B (en) Optimization method and system for classifier chain tag sequence
CN111553361B (en) Pathological section label identification method
US20230394865A1 (en) Methods and systems for performing data capture
CN107688744A (en) Malicious file sorting technique and device based on Image Feature Matching
CN109670162A (en) The determination method, apparatus and terminal device of title
CN104573101B (en) A kind of data flow real-time grading method and system of rule-based route

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CB03 Change of inventor or designer information

Inventor after: Zhang Li

Inventor after: Wang Xiaoqian

Inventor after: Yang Jiwen

Inventor after: He Shuping

Inventor after: Li Fanchang

Inventor after: Zhang Zhao

Inventor before: Zhang Li

Inventor before: Wang Xiaoqian

Inventor before: Yang Jiwen

Inventor before: He Shuping

Inventor before: Li Fanchang

COR Change of bibliographic data
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140716