CN101984455B - Method for solving linear discrimination vector in matrix rank spaces of between-class scatter and total scattering - Google Patents


Info

Publication number
CN101984455B
CN101984455B (application CN201010568119A)
Authority
CN
China
Prior art keywords
sample
matrix
vector
row
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201010568119
Other languages
Chinese (zh)
Other versions
CN101984455A (en)
Inventor
贺云辉 (He Yunhui)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing China Spacenet Telecom Co Ltd
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology filed Critical Nanjing University of Information Science and Technology
Priority to CN 201010568119 priority Critical patent/CN101984455B/en
Publication of CN101984455A publication Critical patent/CN101984455A/en
Application granted granted Critical
Publication of CN101984455B publication Critical patent/CN101984455B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Landscapes

  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for solving a linear discrimination vector in matrix rank spaces of a between-class scatter and total scattering. The method comprises the following steps: constructing three matrixes by utilizing a training sample and type information thereof; calculating a new matrix by utilizing the constructed three matrixes and a sample matrix; orthogonalizing the column vector of the new matrix to acquire an mutually orthogonal discrimination vector; projecting a sample characteristic and a characteristic to be identified to the calculated linear discrimination vector in an identifying phase to acquire an optimal discrimination characteristic; and calculating the distance between the characteristic to be identified and the sample characteristic, and classifying the sample to be identified to the face type corresponding to the minimum distance. The linear discrimination vector is uncorrelated to the characteristics, so as to eliminate redundancy among linear discrimination characteristics and improve the discriminating capability of the discrimination characteristic.

Description

Method for solving linear discriminant vectors in the rank spaces of the between-class scatter matrix and the total scatter matrix
Technical field
The invention relates to a method for solving linear discriminant vectors in the rank spaces of the between-class scatter matrix and the total scatter matrix, and specifically to linear discriminant feature extraction and recognition for high-dimensional samples under small-sample-size conditions. The invention can be used in machine learning and pattern recognition, for feature extraction and recognition of all kinds of high-dimensional data under small-sample-size conditions.
Background technology
Feature extraction is an important topic in pattern recognition and is usually divided into two broad classes: supervised and unsupervised. Unsupervised feature extraction methods include principal component analysis and independent component analysis; because they do not use the class labels of the training samples, they have difficulty producing discriminant features useful for classification. Supervised discriminant feature extraction exploits the class label of each sample, an important piece of information, and can therefore obtain features that favor classification.
Commonly used feature extraction methods are based on some criterion: a transformation matrix is obtained by optimizing a criterion function, and the original high-dimensional sample features are reduced to a low-dimensional subspace, so that the features in the low-dimensional subspace are more compact and better separable. These methods are therefore also called subspace methods.
In the recognition phase, a classifier suited to the extracted features partitions the sample feature space into regions, and a sample to be identified is assigned to the class of the region its features fall into. After the feature extraction stage obtains the discriminant features of the high-dimensional samples, a nearest-neighbor classifier is commonly used for classification.
Among subspace-based linear feature extraction methods, linear discriminant analysis based on the Fisher criterion (FLDA) is the most common. By optimizing the Fisher criterion, FLDA finds discriminant vectors such that, after dimensionality reduction, the between-class scatter of the low-dimensional features is maximal and the within-class scatter is minimal, so the reduced features have the best class separability. However, when the sample dimension exceeds the rank of the within-class scatter matrix, solving for the optimal discriminant features becomes an ill-posed singular problem, also known as the ill-posed singularity under small-sample-size conditions. Among existing remedies, regularization adds a small perturbation matrix to the singular within-class scatter matrix to make it invertible, and the generalized-inverse method replaces the inverse with the pseudo-inverse of the within-class scatter matrix; for high-dimensional small-sample problems, both are computationally expensive and hard to use in practice. Null-space methods must compute the null space of the within-class scatter matrix, which is also expensive and hard to apply when the number of samples is large. Methods combining principal component analysis with the null-space approach first apply PCA to reduce the sample dimension and then use the null-space method to obtain the linear discriminant features; although this reduces the computation, applying PCA to a large number of high-dimensional samples is itself computationally heavy and numerically unstable. In FLDA, computing the optimal discriminant vectors is equivalent to solving a generalized eigenvalue problem, and solving for the eigenvectors of high-dimensional data is expensive and numerically unstable. The discriminative common vector (DCV) method solves for discriminant vectors that optimize the between-class scatter within the null space of the within-class scatter matrix; it overcomes the small-sample problem of FLDA with low computational cost and good numerical stability. Its computation proceeds in two steps: first, one sample chosen from each class is projected into the null space of the within-class scatter matrix to obtain that class's common vector; then the common vectors are used to optimize the between-class scatter and obtain the optimal Fisher discriminant vectors. To reduce computation and improve numerical stability, an orthogonalization procedure can further replace the eigenvalue problem. The improved DCV method, however, still requires one dimensionality reduction of the high-dimensional data and two orthogonalization procedures, which again increases the computational complexity. Moreover, DCV only searches for optimal discriminant vectors in the null space of the within-class scatter matrix, so it has difficulty finding the optimal discriminant vectors when that null space has small dimension.
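For reference, the Fisher criterion that FLDA optimizes can be stated explicitly in its standard textbook form, with S_b and S_w the between-class and within-class scatter matrices, m_i the mean of class i, and m the global mean:

```latex
J(w) = \frac{w^{T} S_b\, w}{w^{T} S_w\, w},
\qquad
S_b = \sum_{i=1}^{C} N_i\, (m_i - m)(m_i - m)^{T},
\qquad
S_w = \sum_{i=1}^{C} \sum_{j=1}^{N_i} (x_j^i - m_i)(x_j^i - m_i)^{T}.
```

Maximizing J(w) leads to the generalized eigenvalue problem S_b w = λ S_w w; when the sample dimension exceeds the rank of S_w, S_w is singular and this problem is ill-posed, which is exactly the small-sample difficulty described above.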
Summary of the invention
The purpose of the present invention is to provide a linear discriminant feature extraction method that remedies a defect of existing linear discriminant feature extraction techniques: the difficulty of finding all the optimal discriminant vectors.

To achieve the above purpose, the present invention adopts the following technical scheme:

A method for solving linear discriminant vectors in the rank spaces of the between-class scatter matrix and the total scatter matrix, characterized by comprising the following steps:
(1) Construct matrices A, B and D
There are C classes of samples; class i has N_i training samples, i = 1, 2, ..., C, and the total number of samples is N = \sum_{i=1}^{C} N_i. The collected samples are represented as vectors, with x_j^i denoting the j-th sample in class i;
The matrix A with N rows and N-1 columns is constructed as:

A = \begin{bmatrix} -\mathbf{1}_{1\times(N-1)} \\ I_{(N-1)\times(N-1)} \end{bmatrix},

where -\mathbf{1}_{1\times(N-1)} denotes the row vector with 1 row and N-1 columns whose entries are all -1, and I_{(N-1)\times(N-1)} denotes the identity matrix with N-1 rows and N-1 columns;
The matrix B with N-1 rows and C columns is constructed as:

[the explicit form of B is given as an image in the original and did not survive extraction]
The matrix D with C rows and C-1 columns is constructed as:

D = \begin{bmatrix} -\frac{1}{N_1} & -\frac{1}{N_1} & \cdots & -\frac{1}{N_1} \\ \frac{1}{N_2} & 0 & \cdots & 0 \\ 0 & \frac{1}{N_3} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{N_C} \end{bmatrix},
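As a rough sketch of step (1) in NumPy: the forms of A and D follow the formulas above, while the class-indicator form used for B is an assumption on my part (its explicit form is an image in the original that did not survive extraction, but this choice is consistent with the structure of A and D). The helper name `build_ABD` is mine, not the patent's.

```python
import numpy as np

def build_ABD(class_sizes):
    """class_sizes: list [N_1, ..., N_C] of training samples per class."""
    N, C = sum(class_sizes), len(class_sizes)
    # A: N x (N-1); first row all -1, identity matrix below.
    A = np.vstack([-np.ones((1, N - 1)), np.eye(N - 1)])
    # B (assumed): (N-1) x C class indicator; entry (j, i) = 1 iff
    # global sample j+2 (1-based) belongs to class i.
    labels = np.repeat(np.arange(C), class_sizes)[1:]  # labels of samples 2..N
    B = np.zeros((N - 1, C))
    B[np.arange(N - 1), labels] = 1.0
    # D: C x (C-1); first row -1/N_1, shifted diagonal 1/N_2, ..., 1/N_C.
    D = np.zeros((C, C - 1))
    D[0, :] = -1.0 / class_sizes[0]
    for k in range(1, C):
        D[k, k - 1] = 1.0 / class_sizes[k]
    return A, B, D

# Toy usage: C = 3 classes with 3, 2 and 4 training samples (N = 9).
A, B, D = build_ABD([3, 2, 4])
assert A.shape == (9, 8) and B.shape == (8, 3) and D.shape == (3, 2)
```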
(2) Compute the matrix V

V = \left[ x_1^1, x_2^1, \ldots, x_1^2, \ldots, x_1^C, \ldots, x_{N_C}^C \right] ABD,

where x_j^i denotes the j-th sample in class i, j = 1, 2, ..., N_i;
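As a sketch of step (2), V can be computed and sanity-checked on random toy data (NumPy; the class-indicator form of B is an assumption, since its explicit form is an image in the original). Under that assumption, column k of V works out to the class-mean difference m_{k+1} − m_1, so the columns of V span the range of the between-class scatter matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [3, 2, 4]                       # N_1..N_C for a toy example
N, C, d = sum(sizes), len(sizes), 5
X = rng.normal(size=(d, N))             # samples as columns, ordered by class
labels = np.repeat(np.arange(C), sizes)

# A: first row all -1, identity below (as in the patent).
A = np.vstack([-np.ones((1, N - 1)), np.eye(N - 1)])
# B: assumed (N-1) x C class-indicator matrix.
B = np.zeros((N - 1, C))
B[np.arange(N - 1), labels[1:]] = 1.0
# D: first row -1/N_1, shifted diagonal 1/N_2, ..., 1/N_C.
D = np.zeros((C, C - 1))
D[0, :] = -1.0 / sizes[0]
for k in range(1, C):
    D[k, k - 1] = 1.0 / sizes[k]

V = X @ A @ B @ D                       # d x (C-1)

# Check: column k of V equals the class-mean difference m_{k+1} - m_1.
means = [X[:, labels == i].mean(axis=1) for i in range(C)]
for k in range(C - 1):
    assert np.allclose(V[:, k], means[k + 1] - means[0])
```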
(3) Apply an orthogonalization procedure to the column vectors of V to obtain C-1 mutually orthogonal discriminant vectors w_1, w_2, ..., w_{C-1}, which form the transformation matrix W = [w_1, w_2, ..., w_{C-1}];
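Step (3) can be sketched with a reduced QR factorization, a numerically stable equivalent of Gram-Schmidt on the columns of V (the patent does not fix a particular orthogonalization algorithm, so QR here is my choice):

```python
import numpy as np

def discriminant_vectors(V):
    """Orthogonalize the columns of V; the columns of W are the
    mutually orthogonal discriminant vectors w_1, ..., w_{C-1}."""
    W, _ = np.linalg.qr(V)          # reduced QR: W has orthonormal columns
    return W

# Toy usage: a 3 x 2 matrix V with linearly independent columns.
V = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [0.0, 0.0]])
W = discriminant_vectors(V)
assert np.allclose(W.T @ W, np.eye(2))  # mutually orthogonal, unit norm
```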
(4) Project the training samples and the sample to be identified onto the linear discriminant vectors to obtain the optimal linear discriminant features of the samples:

y_j^i = W^T x_j^i,

where x_j^i denotes the j-th training sample in class i, y_j^i is its optimal linear discriminant feature, and W^T denotes the transpose of the matrix W;
Projecting the sample x to be identified onto the C-1 optimal linear discriminant vectors gives its discriminant feature y = W^T x;
(5) Compute the distances between the discriminant features of the training samples and the discriminant feature of the sample to be identified, and assign the sample to be identified to the class of the training sample at minimum distance:

Compute the minimum distance between the y_j^i and y:

\min_{i,j} d(y_j^i, y), \quad i = 1, 2, \ldots, C, \; j = 1, 2, \ldots, N_i,

where d(y_j^i, y) denotes the Euclidean distance between y_j^i and y, and min denotes taking the minimum; the recognition criterion is that the sample x to be identified is assigned to the class of the training sample achieving the minimum distance.
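Steps (4) and (5) together amount to projecting onto W and running a nearest-neighbour rule on the projected features; a minimal sketch (the helper name `classify` is mine, not the patent's):

```python
import numpy as np

def classify(W, X_train, labels, x):
    """Project training samples and the query onto the discriminant
    vectors and return the label of the nearest training feature."""
    Y = W.T @ X_train                            # y_j^i = W^T x_j^i, per column
    y = W.T @ x                                  # feature of the query sample
    d = np.linalg.norm(Y - y[:, None], axis=0)   # Euclidean distances
    return labels[np.argmin(d)]

# Toy usage: two well-separated classes in 3-D; W projects onto the
# first two axes, discarding the uninformative third coordinate.
W = np.eye(3)[:, :2]
X_train = np.array([[0.0, 0.1, 5.0, 5.1],
                    [0.0, 0.1, 5.0, 5.1],
                    [0.0, 0.0, 0.0, 0.0]])
labels = np.array([0, 0, 1, 1])
assert classify(W, X_train, labels, np.array([5.0, 5.0, 9.9])) == 1
```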
The advantages of the present invention are: (1) No generalized eigenvalue problem needs to be solved; the linear discriminant vectors are obtained by a single orthogonalization procedure, which avoids the ill-posed singular generalized eigenvalue problem faced when solving for linear discriminant vectors under small-sample-size conditions. The computation only requires constructing three matrices from the training samples and their class labels, computing a matrix V from them and the sample matrix, and orthogonalizing the column vectors of V to obtain the optimal linear discriminant vectors. (2) The method solves for discriminant vectors that optimize the between-class scatter matrix within the rank space of the total scatter matrix; when the null space of the within-class scatter matrix has small dimension, the method can still search a larger space for the optimal linear discriminant vectors. (3) Because the linear discriminant vectors are computed by an orthogonalization procedure, the resulting discriminant features are mutually uncorrelated, which eliminates the redundancy among the linear discriminant features and improves their discriminating power.
Description of drawings
Fig. 1 is the flow chart of the implementation steps of the present invention.
Embodiment
The technical scheme of the present invention is described in detail below with reference to Fig. 1:
(1) Construct matrices A, B and D;
(2) Compute the matrix V;
(3) Apply an orthogonalization procedure to the column vectors of V to obtain C-1 discriminant vectors w_1, w_2, ..., w_{C-1}, which form the transformation matrix W = [w_1, w_2, ..., w_{C-1}];
(4) Project the training samples and the sample to be identified onto the linear discriminant vectors to obtain the optimal linear discriminant features of the samples;
(5) Compute the distances between the discriminant features of the training samples and the discriminant feature of the sample to be identified, and assign the sample to be identified to the class of the training sample at minimum distance.
Embodiment:
The public AT&T standard face image database is used. The AT&T database contains 40 face classes; each face class has 10 face images with different poses, expressions and facial details, and the image size is 112 × 92.
Data preprocessing: the 112 × 92 image matrix is down-sampled, reducing its size to 28 × 23; the result is stretched column by column into a 644-dimensional column vector, and the pixel values of the image are normalized to [0, 1]. The face samples of each class are randomly split into two parts, one part used as training samples and the other as test samples.
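The preprocessing step can be sketched as follows. Simple stride-4 subsampling is assumed (the patent does not specify the down-sampling filter), as are 8-bit pixel values for the normalization:

```python
import numpy as np

def preprocess(img):
    """img: 112 x 92 uint8 face image -> 644-dim vector in [0, 1]."""
    small = img[::4, ::4].astype(float)   # down-sample 112 x 92 -> 28 x 23
    vec = small.flatten(order="F")        # stack columns -> 644 dimensions
    return vec / 255.0                    # normalize pixel values to [0, 1]

x = preprocess(np.zeros((112, 92), dtype=np.uint8))
assert x.shape == (644,)                  # 28 * 23 = 644
```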
There are C = 40 classes of samples; class i has N_i training samples, with N_i taking values in {2, 3, 4, 5, 6, 7, 8, 9}, where i = 1, 2, ..., 40, and the total number of samples is N = \sum_{i=1}^{40} N_i.
First construct the matrices A, B and D:

The matrix A with N rows and N-1 columns is

A = \begin{bmatrix} -\mathbf{1}_{1\times(N-1)} \\ I_{(N-1)\times(N-1)} \end{bmatrix}

where -\mathbf{1}_{1\times(N-1)} denotes the row vector with 1 row and N-1 columns whose entries are all -1, and I_{(N-1)\times(N-1)} denotes the identity matrix with N-1 rows and N-1 columns;

The matrix B with N-1 rows and 40 columns is

[the explicit form of B is given as an image in the original and did not survive extraction]

The matrix D with 40 rows and 39 columns is

D = \begin{bmatrix} -\frac{1}{N_1} & -\frac{1}{N_1} & \cdots & -\frac{1}{N_1} \\ \frac{1}{N_2} & 0 & \cdots & 0 \\ 0 & \frac{1}{N_3} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{N_C} \end{bmatrix}
Then compute the matrix V:

V = \left[ x_1^1, x_2^1, \ldots, x_1^2, \ldots, x_1^{40}, \ldots, x_{N_{40}}^{40} \right] ABD

where x_j^i denotes the j-th sample in class i, i = 1, 2, ..., 40, j = 1, 2, ..., N_i.
Apply an orthogonalization procedure to the column vectors of V to obtain 39 discriminant vectors w_1, w_2, ..., w_{39}, which form the transformation matrix W = [w_1, w_2, ..., w_{39}].
Project the training samples and the sample to be identified onto the linear discriminant vectors to obtain the optimal linear discriminant features of the samples:

y_j^i = W^T x_j^i,

where x_j^i denotes the j-th training sample in class i, y_j^i is its optimal linear discriminant feature, and W^T denotes the transpose of the matrix W. Projecting the sample x to be identified onto the optimal linear discriminant vectors gives its discriminant feature y = W^T x.
Finally, compute the distances between the discriminant features of the training samples and the discriminant feature of the sample to be identified, and assign the sample to be identified to the class of the training sample at minimum distance:

Compute the minimum distance between the y_j^i and y: \min_{i,j} d(y_j^i, y), i = 1, 2, ..., 40, j = 1, 2, ..., N_i, where d(y_j^i, y) denotes the Euclidean distance between y_j^i and y, and min denotes taking the minimum; the recognition criterion is that the sample x to be identified is assigned to the class of the training sample achieving the minimum distance.
Table 1 reports the results of tests on the AT&T face database; 8 experiments were carried out in total. For each face class, each experiment randomly selects 2 to 9 samples as training samples, with the remaining samples used as test samples; each experiment is repeated 10 times and the average recognition rate is computed. The method of the present invention is compared with the Fisherface method, with both methods using identical training and test samples in every trial; the method of the present invention outperforms the Fisherface method in all cases.
Table 1. Average recognition rate (%)

[the table is given as an image in the original and its data did not survive extraction]

Claims (1)

1. A method for solving linear discriminant vectors in the rank spaces of the between-class scatter matrix and the total scatter matrix, characterized in that: the public AT&T standard face image database is used; the AT&T database contains 40 face classes, each face class has 10 face images with different poses, expressions and facial details, and the image size is 112 × 92; data preprocessing is then carried out: the 112 × 92 image matrix is down-sampled, reducing its size to 28 × 23; the result is stretched column by column into a 644-dimensional column vector, and the pixel values of the image are normalized to [0, 1]; the face samples of each class are randomly split into two parts, one part used as training samples and the other as test samples; the method comprises the following steps:

(1) Construct matrices A, B and D

There are C = 40 classes of samples; class i has N_i training samples, with N_i taking values in {2, 3, 4, 5, 6, 7, 8, 9}, where i = 1, 2, ..., 40, and the total number of samples is N = \sum_{i=1}^{C} N_i. The matrix A with N rows and N-1 columns is:

A = \begin{bmatrix} -\mathbf{1}_{1\times(N-1)} \\ I_{(N-1)\times(N-1)} \end{bmatrix},

where -\mathbf{1}_{1\times(N-1)} denotes the row vector with 1 row and N-1 columns whose entries are all -1, and I_{(N-1)\times(N-1)} denotes the identity matrix with N-1 rows and N-1 columns;

The matrix B with N-1 rows and C columns is:

[the explicit form of B is given as an image in the original and did not survive extraction]

The matrix D with C rows and C-1 columns is:

D = \begin{bmatrix} -\frac{1}{N_1} & -\frac{1}{N_1} & \cdots & -\frac{1}{N_1} \\ \frac{1}{N_2} & 0 & \cdots & 0 \\ 0 & \frac{1}{N_3} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{1}{N_C} \end{bmatrix};

(2) Compute the matrix V

V = \left[ x_1^1, x_2^1, \ldots, x_1^2, \ldots, x_1^C, \ldots, x_{N_C}^C \right] ABD,

where x_j^i denotes the j-th sample in class i, i = 1, 2, ..., 40, j = 1, 2, ..., N_i;

(3) Apply an orthogonalization procedure to the column vectors of V to obtain C-1 mutually orthogonal discriminant vectors w_1, w_2, ..., w_{C-1}, which form the transformation matrix W = [w_1, w_2, ..., w_{C-1}];

(4) Project the training samples and the sample to be identified onto the linear discriminant vectors to obtain the optimal linear discriminant features of the samples:

y_j^i = W^T x_j^i,

where x_j^i denotes the j-th training sample in class i, y_j^i is its optimal linear discriminant feature, and W^T denotes the transpose of the matrix W; projecting the sample x to be identified onto the C-1 optimal linear discriminant vectors gives the discriminant feature y = W^T x;

(5) Compute the distances between the discriminant features of the training samples and the discriminant feature of the sample to be identified, and assign the sample to be identified to the class of the training sample at minimum distance:

compute the minimum distance between the y_j^i and y, \min_{i,j} d(y_j^i, y), i = 1, 2, ..., C, j = 1, 2, ..., N_i, where d(y_j^i, y) denotes the Euclidean distance between y_j^i and y, and min denotes taking the minimum; the recognition criterion is that the sample x to be identified is assigned to the class of the training sample achieving the minimum distance.
CN 201010568119 2010-12-01 2010-12-01 Method for solving linear discrimination vector in matrix rank spaces of between-class scatter and total scattering Active CN101984455B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010568119 CN101984455B (en) 2010-12-01 2010-12-01 Method for solving linear discrimination vector in matrix rank spaces of between-class scatter and total scattering


Publications (2)

Publication Number Publication Date
CN101984455A CN101984455A (en) 2011-03-09
CN101984455B true CN101984455B (en) 2013-05-08

Family

ID=43641625

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010568119 Active CN101984455B (en) 2010-12-01 2010-12-01 Method for solving linear discrimination vector in matrix rank spaces of between-class scatter and total scattering

Country Status (1)

Country Link
CN (1) CN101984455B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102254166A (en) * 2011-08-15 2011-11-23 无锡中星微电子有限公司 Face recognition method

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101877065A (en) * 2009-11-26 2010-11-03 南京信息工程大学 Extraction and identification method of non-linear authentication characteristic of facial image under small sample condition

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SG171858A1 (en) * 2008-11-28 2011-07-28 Agency Science Tech & Res A method for updating a 2 dimensional linear discriminant analysis (2dlda) classifier engine

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101877065A (en) * 2009-11-26 2010-11-03 南京信息工程大学 Extraction and identification method of non-linear authentication characteristic of facial image under small sample condition

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
Yunhui He et al., "Modified Generalized Discriminant Analysis Using Orthogonalization in Feature Space and Difference Space", 2008 International Conference on Computational Intelligence and Security, 2008, pp. 12-15. *
He Yunhui et al., "A KCCA-based method for small-sample face image recognition", Journal of Applied Sciences (应用科学学报), March 2006, vol. 24, no. 2, pp. 140-144. *
He Yunhui et al., "A face recognition method based on KPCA and optimal discriminant independent components", Journal of Applied Sciences (应用科学学报), November 2005, vol. 23, no. 6, pp. 551-556. *
He Yunhui et al., "A small-sample face recognition method based on kernel discriminative common vectors", Journal of Electronics & Information Technology (电子与信息学报), December 2006, vol. 28, no. 12, pp. 2296-2300. *

Also Published As

Publication number Publication date
CN101984455A (en) 2011-03-09

Similar Documents

Publication Publication Date Title
CN107273845B (en) Facial expression recognition method based on confidence region and multi-feature weighted fusion
Litman et al. Learning spectral descriptors for deformable shape correspondence
Kokiopoulou et al. Orthogonal neighborhood preserving projections: A projection-based dimensionality reduction technique
CN101763503B (en) Face recognition method of attitude robust
CN103413117B (en) A kind of incremental learning face identification method keeping Non-negative Matrix Factorization based on local
CN105138972A (en) Face authentication method and device
CN105631433B (en) A kind of face identification method of bidimensional linear discriminant analysis
Fukui et al. A framework for 3D object recognition using the kernel constrained mutual subspace method
Mashhoori et al. Block-wise two-directional 2DPCA with ensemble learning for face recognition
CN104573672B (en) A kind of discriminating kept based on neighborhood is embedded in face identification method
Bronstein Spectral descriptors for deformable shapes
Zhao et al. Linear Laplacian discrimination for feature extraction
CN104616000A (en) Human face recognition method and apparatus
CN109284781A (en) Image classification algorithms and system based on manifold learning
CN103839066A (en) Feature extraction method from biological vision
Tao et al. Quotient vs. difference: comparison between the two discriminant criteria
CN101877065B (en) Extraction and identification method of non-linear authentication characteristic of facial image under small sample condition
Xu et al. Complete two-dimensional PCA for face recognition
CN103632145A (en) Fuzzy two-dimensional uncorrelated discriminant transformation based face recognition method
CN105868711A (en) Method for identifying human body behaviors based on sparse and low rank
CN104463234A (en) Face recognition method
CN101984455B (en) Method for solving linear discrimination vector in matrix rank spaces of between-class scatter and total scattering
Wang et al. Facial expression recognition based on tensor local linear discriminant analysis
Inoue et al. Non-iterative two-dimensional linear discriminant analysis
Ragul et al. Comparative study of statistical models and classifiers in face recognition

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20161101

Address after: 210061 No. two, 9 hi tech Road, Nanjing hi tech Zone, Jiangsu

Patentee after: Nanjing China-Spacenet Telecom Co., Ltd.

Address before: 210044 Nanjing Ning Road, Jiangsu, No. six, No. 219

Patentee before: Nanjing University of Information Science and Technology