CN105389560B - Graph optimization dimensionality reduction method based on local constraint - Google Patents

Graph optimization dimensionality reduction method based on local constraint

Info

Publication number
CN105389560B
Authority
CN
China
Prior art keywords
matrix
graph
sample
local
dimensional
Prior art date
Legal status
Active
Application number
CN201510777140.4A
Other languages
Chinese (zh)
Other versions
CN105389560A (en)
Inventor
齐妙
王建中
孔俊
易玉根
Current Assignee
Northeast Normal University
Original Assignee
Northeast Normal University
Priority date
Filing date
Publication date
Application filed by Northeast Normal University filed Critical Northeast Normal University
Priority to CN201510777140.4A priority Critical patent/CN105389560B/en
Publication of CN105389560A publication Critical patent/CN105389560A/en
Application granted granted Critical
Publication of CN105389560B publication Critical patent/CN105389560B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G — PHYSICS; G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V 40/161 — Image or video recognition: human faces; detection, localisation, normalisation
    • G06V 40/172 — Image or video recognition: human faces; classification, e.g. identification
    • G06F 18/23213 — Pattern recognition: clustering with a fixed number of clusters, e.g. K-means clustering
    • G06F 18/24133 — Pattern recognition: classification based on distances to prototypes
    • G06F 18/24147 — Pattern recognition: distances to closest patterns, e.g. nearest-neighbour classification


Abstract

The present invention relates to a graph optimization dimensionality reduction method based on local constraints, belonging to the field of image processing. First, graph optimization and projection matrix learning are integrated into a unified framework, so that the graph is updated adaptively during dimensionality reduction. Second, by introducing local constraints, the local information of high-dimensional data can be effectively mined and preserved. An effective and fast update strategy is also proposed to solve the resulting algorithm. Extensive experiments and comparisons show that the present invention performs well, outperforms existing related methods, and is suitable for object recognition, data clustering and data visualization.

Description

Graph optimization dimension reduction method based on local constraint
Technical Field
The present invention belongs to the field of digital image processing.
Background
Much real-world data is high-dimensional. Although high-dimensional data carry more information, operating on them directly in practical applications causes problems such as the curse of dimensionality, the empty-space phenomenon, and the concentration of distances. Dimensionality reduction is an effective remedy for these problems: it eliminates irrelevant and redundant features, improves the efficiency of mining tasks, reveals the intrinsic regularity of the data, and improves prediction performance. Dimensionality reduction is therefore an important step in many practical applications and has significant practical research value.
Existing dimensionality reduction methods fall into two categories: linear and nonlinear. Linear methods map data points from a high-dimensional space into a low-dimensional space by learning a linear transformation; representative examples include Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Although such methods yield low-dimensional representations and are simple to apply, they cannot mine the intrinsic nonlinear structure of high-dimensional data. Researchers have therefore proposed many nonlinear manifold methods, such as Isometric Feature Mapping (ISOMAP), Locally Linear Embedding (LLE), and Laplacian Eigenmaps (LE). Although these methods work well on some benchmark artificial datasets, they only produce a low-dimensional representation of the training samples and do not give an explicit mapping function from the high-dimensional to the low-dimensional space, i.e., they cannot produce a low-dimensional representation of test samples. Consequently, these nonlinear methods are unsuitable for classification and recognition tasks. To overcome this limitation, a series of linearized manifold learning methods has been proposed, such as Neighborhood Preserving Embedding (NPE) and Locality Preserving Projections (LPP); these are dimensionality reduction methods based on graph construction that learn a projection matrix mapping high-dimensional data to a low-dimensional subspace. Specifically, a graph is first constructed from the input data or prior knowledge, and a low-dimensional representation is then obtained from the constructed graph. The construction of the graph is therefore central to such dimension reduction methods.
In practice, however, it is often difficult to construct a high-quality graph. The k-nearest-neighbor graph and the ε-ball graph are two simple and widely used constructions, adopted for example by the ISOMAP, LLE, NPE, and LPP methods. In these methods the values of k and ε usually must be set empirically, and if the two parameters are set improperly, the intrinsic structure of the high-dimensional data is not mined well. Moreover, both constructions typically use the same neighborhood parameter for all samples and ignore the different local structure of each sample, which degrades performance. To address this problem, Yang and Chen et al. proposed a sample-correlation method for graph construction that adaptively determines the neighborhood of each sample from the similarity between sample pairs. Based on sparse representation theory, Qiao et al. proposed the sparsity preserving projections method (SPP), which constructs a graph automatically using L1-norm regularized least squares. Liu et al. proposed the low-rank representation method for graph construction (LRR), which jointly obtains low-dimensional representations of all high-dimensional data by forcing the representation coefficients to be low-rank, thereby preserving the global structure of the data. Although such methods overcome the limitations of the k-nearest-neighbor and ε-ball graphs, their main drawback is that graph construction is independent of the subsequent dimension reduction task. That is, the graph stays fixed during dimension reduction, which degrades the performance of the dimension reduction task.
Recently, graph optimization has become a research hotspot. Zhang et al. proposed graph-optimized locality preserving projections (GoLPP), which integrate graph construction and dimension reduction into a unified framework. Qiao et al. proposed an adaptive graph dimension reduction method (DRAG) that jointly constructs a graph and learns a projection matrix. Zhang et al. proposed a graph optimization algorithm with sparsity constraints for dimensionality reduction (GODRSC); by adding an L1 regularization term, this method obtains a more flexible sparse graph. These three methods combine graph construction with dimensionality reduction, automatically update the graph while learning the projection matrix, and achieve good performance. However, they ignore the local information of the original high-dimensional data, i.e., they do not consider the similarity of the original high-dimensional samples, and thus cannot guarantee that similar high-dimensional samples remain similar in the low-dimensional space, so the dimension reduction effect is not ideal.
Disclosure of Invention
The invention provides a graph optimization dimension reduction method based on local constraint to solve the problem that the dimension reduction effect is not ideal.
The technical scheme adopted by the invention comprises the following steps:
1. Reading the high-dimensional data $X=[x_1,x_2,\dots,x_n]\in\mathbb{R}^{D\times n}$, where $x_i$ is the i-th sample, D is the sample dimension, and n is the number of samples;
2. constructing local constraint based on a neighbor reconstruction relation;
In order for the low-dimensional representation of the high-dimensional data to preserve the local relationships of the original data, a graph matrix S must be constructed while the projection matrix is learned. Reconstructing each sample from its neighbors effectively captures the local information of the data, so the following constraint is considered when constructing the graph matrix S:

$$\min_{S}\ \sum_{i=1}^{n}\left\|R_i\odot S_{\cdot i}\right\|_2^2 \qquad (1)$$

where $\odot$ denotes the element-wise (Hadamard) product, $R_i=[r_{i,1},r_{i,2},\dots,r_{i,n}]^T$ is the indicator vector of sample $x_i$ with $r_{i,j}=\exp(\|x_i-x_j\|^2/\sigma)$, $S=[S_{ij}]_{n\times n}$ is the graph matrix, and $S_{\cdot i}$ is the i-th column of S;
3. constructing local constraints based on sample similarity;
Considering that close samples in the high-dimensional space should have similar reconstruction coefficients, the following constraint is considered when constructing the graph matrix S:

$$\min_{S}\ \frac{1}{2}\sum_{i,j=1}^{n} w_{ij}\left\|S_{\cdot i}-S_{\cdot j}\right\|_2^2 \qquad (2)$$

where $S_{\cdot i}$ and $S_{\cdot j}$, the i-th and j-th columns of the graph matrix S, are the reconstruction coefficients of samples $x_i$ and $x_j$, and $w_{ij}=\exp(-\|x_i-x_j\|^2/\sigma)$ is a heat kernel function;
4. Constructing the dimensionality reduction objective function based on the two local constraints:

$$\min_{P,S}\ \frac{\sum_{i=1}^{n}\left\|P^T x_i-P^T X S_{\cdot i}\right\|_2^2+\lambda\Big(\sum_{i=1}^{n}\left\|R_i\odot S_{\cdot i}\right\|_2^2+\frac{1}{2}\sum_{i,j=1}^{n} w_{ij}\left\|S_{\cdot i}-S_{\cdot j}\right\|_2^2\Big)}{\operatorname{tr}\!\left(P^T X X^T P\right)} \qquad (3)$$

$$\text{s.t.}\quad P^T P=I$$

where $P\in\mathbb{R}^{D\times d}$ is the projection matrix, $(x_i-XS_{\cdot i})$ is the error of reconstructing sample $x_i$ from the other samples in X, $P^T$ denotes the transpose of P, I is the identity matrix, and $\lambda>0$ is a trade-off parameter;
5. Optimizing the objective function by an iterative strategy: first fix the projection matrix P and update the graph matrix S; then fix the graph matrix S and update the projection matrix P; finally, the optimized projection matrix P and graph matrix S are obtained after N (N ≤ 15) iterations;
6. For subsequent tasks such as recognition and clustering, the high-dimensional data X is projected with the matrix P to obtain its low-dimensional representation, achieving dimensionality reduction:

$$X_{low}=P^T X \qquad (13)$$

where $X_{low}\in\mathbb{R}^{d\times n}$ is the low-dimensional representation of the high-dimensional data X, i.e., each sample changes from the original D dimensions to the d dimensions of the low-dimensional space.
The invention unifies projection matrix learning and graph construction in one framework, so that the graph is updated automatically during dimension reduction, and it effectively mines and preserves the local information of high-dimensional data by establishing two local constraint relations: a neighbor reconstruction relation and sample similarity. In particular, an algorithm based on an iterative update strategy is provided to solve for the projection matrix and the graph, so that high-dimensional data can be reduced effectively. Extensive experiments and comparisons show that the method performs well, outperforms existing representative dimension reduction methods, and is suitable for object recognition, clustering, data visualization, and the like.
The invention carries out experimental comparison and analysis aiming at 3 standard face databases and 3 standard clustering data sets, and quantitatively evaluates the effectiveness and superiority of the proposed method. A large number of comparison experiment results show that the method provided by the invention not only can effectively perform face recognition and automatic data clustering, but also has better stability.
The invention has the following beneficial effects:
(1) The invention is an effective dimension reduction method aiming at high-dimensional data;
(2) Two local constraints are established instead of a single constraint for dimensionality reduction, so that the local relation of high-dimensional data can be better maintained by low-dimensional data;
(3) The projection matrix learning and graph optimization processes are unified under one framework, so that the constructed graph can be updated in a self-adaptive manner, and the performance of dimensionality reduction is improved;
(4) An effective and rapid iteration updating solving method is provided, so that the objective function can be converged within a few iterations;
(5) The method can be widely applied to dimensionality reduction of high-dimensional data, and is beneficial to tasks such as subsequent identification, clustering and data visualization.
Drawings
FIG. 1 (a) is a partial face image in the face database Yale used in the present invention;
FIG. 1 (b) is a partial face image in Extended YaleB face database used in the present invention;
fig. 1 (c) is a partial face image in the face database CMU PIE used in the present invention;
FIG. 2 is a partial image of a COIL 20 dataset used in the present invention;
FIG. 3 (a) is the comparison result of different methods on the face database Yale under different dimensions;
FIG. 3 (b) is the comparison result of different methods in different dimensions on the Extended YaleB face database;
FIG. 3 (c) is a comparison result of different methods on the CMU PIE of the face database under different dimensions;
FIG. 4 (a) is a convergence curve of the LC-GODR method on the face database Yale;
FIG. 4 (b) is the convergence curve of the LC-GODR method on Extended YaleB face database;
FIG. 4 (c) is a convergence curve of the LC-GODR method on the face database CMU PIE;
FIG. 5 (a) is a convergence curve of the LC-GODR method on the clustered data set Glass;
FIG. 5 (b) is a convergence curve of the LC-GODR method on the clustering data set Sonar;
fig. 5 (c) is a convergence curve of the LC-GODR method on the clustering data set COIL 20.
Detailed Description
The method comprises the following steps:
1. Reading the high-dimensional data $X=[x_1,x_2,\dots,x_n]\in\mathbb{R}^{D\times n}$, where $x_i$ is the i-th sample, D is the sample dimension, and n is the number of samples;
2. constructing local constraint based on a neighbor reconstruction relation;
In order for the low-dimensional representation of the high-dimensional data to preserve the local relationships of the original data, a graph matrix S is constructed while solving for the projection matrix. In general, if sample $x_i$ and sample $x_j$ are neighbors, $x_i$ can be effectively reconstructed by $x_j$; in the constructed graph matrix S there is an edge between vertices $x_i$ and $x_j$, and the reconstruction coefficient is assigned as the weight of that edge. That is, reconstructing a sample from its neighboring samples effectively captures the local information of the data. Based on the above analysis, the following local constraint based on neighbor reconstruction is established:
$$\min_{S}\ \sum_{i=1}^{n}\left\|R_i\odot S_{\cdot i}\right\|_2^2 \qquad (1)$$

where $\odot$ denotes the element-wise (Hadamard) product, $R_i=[r_{i,1},r_{i,2},\dots,r_{i,n}]^T$ is the indicator vector of sample $x_i$, $r_{i,j}=\exp(\|x_i-x_j\|^2/\sigma)$ measures the proximity of samples $x_i$ and $x_j$, $S=[S_{ij}]_{n\times n}$ is the graph matrix, and $S_{\cdot i}$ is the i-th column of S;
As can be seen from the indicator vector, the closer samples $x_i$ and $x_j$ are, the smaller the value of $r_{i,j}$; minimizing equation (1) therefore assigns a larger reconstruction coefficient to the corresponding edge in the graph matrix S;
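For illustration only (this is not code from the patent; the names `X`, `S`, and `sigma` are assumptions), the indicator vectors and the neighbor-reconstruction penalty of equation (1) might be sketched in NumPy as follows, with samples stored as columns of `X`:

```python
import numpy as np

def indicator_matrix(X, sigma=1.0):
    """R[i, j] = exp(||x_i - x_j||^2 / sigma); columns of X are samples.
    Row i of R is the indicator vector R_i of sample x_i."""
    sq = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)  # n x n squared distances
    return np.exp(sq / sigma)

def neighbor_reconstruction_penalty(X, S, sigma=1.0):
    """Equation (1): sum_i || R_i (element-wise product) S_{.i} ||_2^2.
    Entry (R_i)_j multiplies S[j, i], the j-th entry of column S_{.i}."""
    R = indicator_matrix(X, sigma)
    return float(((R.T * S) ** 2).sum())
```

Since distant pairs get exponentially large indicator values, minimizing this penalty drives the coefficients of far-away samples toward zero, which is the locality effect the text describes.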
3. constructing local constraints based on sample similarity;
it has been proved that similar samples in the high-dimensional space should have similar reconstruction coefficients, which is an important condition for mining local information, and therefore, the following local constraint based on the sample similarity is established:
$$\min_{S}\ \frac{1}{2}\sum_{i,j=1}^{n} w_{ij}\left\|S_{\cdot i}-S_{\cdot j}\right\|_2^2 \qquad (2)$$

where $S_{\cdot i}$ and $S_{\cdot j}$, the i-th and j-th columns of the graph matrix S, are the reconstruction coefficients of samples $x_i$ and $x_j$, and $w_{ij}=\exp(-\|x_i-x_j\|^2/\sigma)$ is a heat kernel function characterizing the similarity between samples $x_i$ and $x_j$; by minimizing equation (2), similar samples obtain similar reconstruction coefficients;
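As an illustrative sketch (names are assumptions, not from the patent), the heat-kernel similarities and the pairwise penalty of equation (2) can be computed compactly through a graph Laplacian:

```python
import numpy as np

def heat_kernel(X, sigma=1.0):
    """W[i, j] = exp(-||x_i - x_j||^2 / sigma); columns of X are samples."""
    sq = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)
    return np.exp(-sq / sigma)

def similarity_penalty(S, W):
    """Equation (2): (1/2) sum_{i,j} w_ij ||S_{.i} - S_{.j}||^2,
    evaluated via the identity with the graph Laplacian L = D - W
    as tr(S L S^T), where D is the diagonal matrix of row sums of W."""
    Lap = np.diag(W.sum(axis=1)) - W
    return float(np.trace(S @ Lap @ S.T))
```

The Laplacian form avoids the explicit double loop over sample pairs and is the standard way such penalties are implemented.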
4. Unifying projection matrix learning and graph construction in a single framework, the dimension reduction objective function based on the two local constraints is established:

$$\min_{P,S}\ \frac{\sum_{i=1}^{n}\left\|P^T x_i-P^T X S_{\cdot i}\right\|_2^2+\lambda\Big(\sum_{i=1}^{n}\left\|R_i\odot S_{\cdot i}\right\|_2^2+\frac{1}{2}\sum_{i,j=1}^{n} w_{ij}\left\|S_{\cdot i}-S_{\cdot j}\right\|_2^2\Big)}{\operatorname{tr}\!\left(P^T X X^T P\right)} \qquad (3)$$

$$\text{s.t.}\quad P^T P=I$$

where $P\in\mathbb{R}^{D\times d}$ is the projection matrix, $(x_i-XS_{\cdot i})$ is the error of reconstructing sample $x_i$ from the other samples in X, $P^T$ denotes the transpose of P, I is the identity matrix, and $\lambda>0$ is a trade-off parameter;
Using algebraic transformations, equation (3) is rewritten as:

$$\min_{P,S}\ \frac{\operatorname{tr}\!\left(P^T X(I-S)(I-S)^T X^T P\right)+\lambda\sum_{i=1}^{n} S_{\cdot i}^T E_i S_{\cdot i}+\lambda\operatorname{tr}\!\left(S^T L S\right)}{\operatorname{tr}\!\left(P^T X X^T P\right)} \qquad (4)$$

where $S^T$ denotes the transpose of the matrix S, $E_i=\operatorname{diag}(r_{i,1}^2,\dots,r_{i,n}^2)$ is the diagonal matrix formed from the local indicator vector $R_i$, $L=D-W$ is the Laplacian matrix, $W=[w_{ij}]_{n\times n}$ is the symmetric sample similarity matrix, D is the diagonal matrix whose diagonal elements are the row sums of W, and tr(·) denotes the trace of a matrix;
5. Similar to the GoLPP and DRAG methods, the objective function is optimized by an alternating iterative strategy:
(1) First, the projection matrix P is fixed and the graph matrix S is updated; equation (4) can then be written as:

$$\min_{S}\ \left\|Y-YS\right\|_F^2+\lambda\sum_{i=1}^{n} S_{\cdot i}^T E_i S_{\cdot i}+\lambda\operatorname{tr}\!\left(S^T L S\right) \qquad (5)$$

where $y_i=P^T x_i$ and $Y=P^T X$; the last term of equation (5) can be rewritten as:

$$\operatorname{tr}\!\left(S^T L S\right)=\sum_{i=1}^{n} S_{\cdot i}^T L S_{\cdot i} \qquad (6)$$
By algebraic transformation, equation (5) can be rewritten as:

$$\min_{S}\ \sum_{i=1}^{n}\Big(\left\|y_i-YS_{\cdot i}\right\|_2^2+\lambda S_{\cdot i}^T E_i S_{\cdot i}+\lambda S_{\cdot i}^T L S_{\cdot i}\Big) \qquad (7)$$

The graph matrix S can thus be updated column by column; for the i-th column $S_{\cdot i}$ of S, the objective function is:

$$\min_{S_{\cdot i}}\ \left\|y_i-YS_{\cdot i}\right\|_2^2+\lambda S_{\cdot i}^T E_i S_{\cdot i}+\lambda S_{\cdot i}^T L S_{\cdot i} \qquad (8)$$
Taking the partial derivative of equation (8) with respect to $S_{\cdot i}$ and setting it to zero gives:

$$S_{\cdot i}=\left(Y^T Y+\lambda E_i+\lambda L\right)^{-1} Y^T y_i \qquad (10)$$
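The closed-form column update of equation (10) is a single linear solve per column; a hedged NumPy sketch (argument names are illustrative, not from the patent) is:

```python
import numpy as np

def update_graph_column(Y, i, E_i, Lap, lam):
    """Equation (10): S_{.i} = (Y^T Y + lam*E_i + lam*L)^{-1} Y^T y_i,
    with Y = P^T X of shape d x n, E_i the diagonal indicator matrix of
    sample i, and Lap the graph Laplacian L = D - W."""
    A = Y.T @ Y + lam * E_i + lam * Lap
    # Solve the linear system instead of forming the inverse explicitly.
    return np.linalg.solve(A, Y.T @ Y[:, i])
```

Because $E_i$ has strictly positive diagonal and the Laplacian is positive semidefinite, the system matrix is positive definite for any $\lambda>0$, so the solve is well posed.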
(2) Next, the graph matrix S is fixed and the projection matrix P is updated; removing the terms that do not depend on P, the optimization problem of equation (3) with respect to P is:

$$\min_{P}\ \frac{\operatorname{tr}\!\left(P^T X(I-S)(I-S)^T X^T P\right)}{\operatorname{tr}\!\left(P^T X X^T P\right)} \qquad (11)$$

$$\text{s.t.}\quad P^T P=I$$
Let $M=X(I-S)(I-S)^T X^T=X\left(I-S-S^T+SS^T\right)X^T$ and $C=XX^T$; equation (11) can be written as:

$$\min_{P}\ \frac{\operatorname{tr}\!\left(P^T M P\right)}{\operatorname{tr}\!\left(P^T C P\right)} \qquad (12)$$

$$\text{s.t.}\quad P^T P=I$$
Clearly, equation (12) is a trace-ratio optimization problem, which can be solved by the iterative trace-ratio method (ITR) or the decomposed Newton method (DNR);
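A minimal ITR-style solver for a trace-ratio problem of the form (12) can be sketched as follows; the stopping tolerance, iteration cap, and the assumption that C is positive definite are ours, not the patent's:

```python
import numpy as np

def trace_ratio_min(M, C, d, n_iter=50, tol=1e-10):
    """ITR-style sketch for min_P tr(P^T M P) / tr(P^T C P), s.t. P^T P = I.
    Each step takes the d eigenvectors of M - lam*C with the smallest
    eigenvalues (eigh returns them in ascending order), then updates lam."""
    Dim = M.shape[0]
    P = np.eye(Dim)[:, :d]                                  # arbitrary orthonormal start
    lam = np.trace(P.T @ M @ P) / np.trace(P.T @ C @ P)
    for _ in range(n_iter):
        _, vecs = np.linalg.eigh(M - lam * C)
        P = vecs[:, :d]
        new_lam = np.trace(P.T @ M @ P) / np.trace(P.T @ C @ P)
        if abs(lam - new_lam) < tol:
            break
        lam = new_lam
    return P, lam
```

Each step cannot increase the ratio: the chosen P makes tr(Pᵀ(M − λC)P) ≤ 0, which forces the next λ to be no larger than the current one.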
Finally, the optimized projection matrix P and the optimized graph matrix S are obtained after N (N ≤ 15) iterations;
to demonstrate the convergence of the proposed algorithm, the proposed algorithm can be decomposed into two sub-problems, such as equation (5) and equation (11). For the first sub-problem, a closed-form solution of the graph matrix S can be obtained by equation (10), and it is obvious that the objective function value of us is in a descending trend in the process of iteratively solving S. For the second subproblem, it has been demonstrated previously that the ITR and DNR methods can yield a globally optimal solution to the trace ratio problem. Therefore, solving equation (12) with ITRs or DNRs also tends to decrease our objective function values. Finally, since all terms in equation (3) are not less than 0, our objective function has a lower bound. Therefore, according to the cauchy convergence rule, the algorithm we propose is convergent;
6. Projecting the high-dimensional data X with the matrix P yields the low-dimensional representation, achieving dimensionality reduction:

$$X_{low}=P^T X \qquad (13)$$

where $X_{low}\in\mathbb{R}^{d\times n}$ is the low-dimensional representation of the high-dimensional data X, i.e., each sample changes from the original D dimensions to the d dimensions of the low-dimensional space.
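The alternating scheme above (steps 1-6) can be sketched end to end. This is an illustrative reading of the procedure, not the patent's reference implementation; the initialization, hyperparameter values, and the single trace-ratio step per outer iteration are simplifying assumptions:

```python
import numpy as np

def lc_godr(X, d, lam=0.1, sigma=1.0, n_iter=10):
    """Sketch of the alternating optimization: fix P, update S column by
    column via Eq. (10); fix S, take one trace-ratio step for P via Eq. (12).
    X is D x n with samples as columns; returns P (D x d) and S (n x n)."""
    D_dim, n = X.shape
    sq = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)
    R = np.exp(sq / sigma)                       # row i = indicator vector R_i
    W = np.exp(-sq / sigma)                      # heat-kernel similarities
    Lap = np.diag(W.sum(axis=1)) - W             # Laplacian L = D - W
    S = np.zeros((n, n))
    P = np.linalg.qr(np.random.default_rng(0).standard_normal((D_dim, d)))[0]
    C = X @ X.T
    for _ in range(n_iter):
        Y = P.T @ X
        G = Y.T @ Y
        for i in range(n):                       # Eq. (10): column-wise S update
            E_i = np.diag(R[i] ** 2)
            S[:, i] = np.linalg.solve(G + lam * E_i + lam * Lap, Y.T @ Y[:, i])
        E = np.eye(n) - S                        # reconstruction residual operator
        M = X @ E @ E.T @ X.T                    # numerator matrix of Eq. (12)
        ratio = np.trace(P.T @ M @ P) / np.trace(P.T @ C @ P)
        P = np.linalg.eigh(M - ratio * C)[1][:, :d]
    return P, S

def embed(X, P):
    """Eq. (13): X_low = P^T X."""
    return P.T @ X
```

After training, `embed` projects both training and unseen test samples, which is exactly the property the background section identifies as missing from the purely nonlinear methods.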
Experimental example: the beneficial effects of the invention are further illustrated by the analysis and comparison of specific experimental results.
The invention provides an effective graph optimization dimension reduction method based on local constraints. To evaluate the proposed method efficiently and systematically, we performed a large number of classification/recognition and clustering experiments on 3 standard face databases (Yale, Extended YaleB and CMU PIE) and 3 standard UCI datasets (Glass, Sonar and COIL 20), where Yale, Extended YaleB, CMU PIE and COIL 20 are image datasets and Glass and Sonar are vector datasets; when dimension reduction is performed with the proposed method, each image is represented as a high-dimensional vector. Figs. 1(a)-(c) and Fig. 2 show partial images from the face databases and the COIL 20 dataset, respectively. Tables 1 and 2 give detailed information on the 3 face databases and the 3 UCI datasets, respectively. In addition, the performance of the proposed method (LC-GODR for short) is compared quantitatively with several representative methods, including LPP, NPE, SGLPP, LSR-NPE, LRR-NPE, SPP, GoLPP, DRAG and GODRSC.
For the face recognition task, images of each face are randomly selected as training samples to obtain the projection matrix P, the remaining t images are used as test samples, and recognition uses the simple and effective Euclidean distance with a nearest neighbor (NN) classifier. To verify the stability of the proposed method, this process is repeated 10 times, and the average of the 10 recognition results is taken as the final recognition rate.
The recognition rate is defined as:

$$\text{recognition rate}=\frac{T}{N}\times 100\% \qquad (14)$$

where T is the number of correctly recognized samples and N is the total number of recognized samples.
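The recognition protocol (nearest neighbor under Euclidean distance, recognition rate as in equation (14)) can be sketched as follows; the function and variable names are illustrative assumptions:

```python
import numpy as np

def nn_recognition_rate(Y_train, labels_train, Y_test, labels_test):
    """Eq. (14): rate = T / N * 100, with a 1-NN classifier under
    Euclidean distance. Columns of Y_train / Y_test are (projected) samples."""
    correct = 0
    for j in range(Y_test.shape[1]):
        dists = np.linalg.norm(Y_train - Y_test[:, [j]], axis=0)
        if labels_train[int(np.argmin(dists))] == labels_test[j]:
            correct += 1
    return 100.0 * correct / Y_test.shape[1]
```

In the experiments described here, `Y_train` and `Y_test` would be the low-dimensional embeddings $P^T X$ of the training and test images.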
Clearly, the parameter λ and the low-dimensional representation dimension d are the two important parameters affecting the face recognition result; Table 3 shows the influence of the parameter λ on the recognition rate. For the Yale database, the proposed algorithm achieves the highest recognition rate when λ takes a small value. In contrast, it performs better on Extended YaleB and CMU PIE when λ takes a larger value. This is because, compared with Extended YaleB and CMU PIE, the Yale database has few training samples and large intra-class variance, and samples of the same class may not be adjacent in feature space, so the reconstruction relation of the samples should be strengthened. Conversely, since Extended YaleB and CMU PIE contain more training samples and smaller head pose and facial expression changes, with smaller intra-class variance relative to Yale, the local constraints should be given greater weight.
Figs. 3(a)-(c) show the effect of the low-dimensional subspace dimension d on the recognition rate. The proposed LC-GODR method performs worse than some other methods when the subspace dimension is low; however, as the dimension increases, its recognition rate improves significantly. Table 4 compares the highest recognition rates and standard deviations of the different methods, where the values in brackets are the low-dimensional subspace dimension d at which the highest recognition rate is obtained. The standard deviation is over the 10 repetitions, and a smaller value indicates better stability. The proposed method achieves the highest recognition rate on all three databases, 89.86%, 90.59% and 93.75% respectively, and has better stability. The DRAG and GODRSC methods also perform well, but remain inferior to the LC-GODR method.
Figs. 4(a)-(c) show the convergence curves of the LC-GODR method on the 3 face databases. The horizontal axis is the number of iterations and the vertical axis is the objective function value; the proposed iterative update strategy converges quickly, i.e., the objective function converges within 20 iterations.
For the clustering task, the low-dimensional representations are clustered automatically with the K-means algorithm, and clustering performance is evaluated by the clustering accuracy (AC), defined as:

$$AC=\frac{\sum_{i=1}^{n}\delta\big(l_i,\,map(c_i)\big)}{n} \qquad (15)$$

where $\delta(a,b)=1$ if $a=b$ and 0 otherwise, n is the total number of samples, $l_i$ is the true label of sample $x_i$, $c_i$ is the cluster label obtained for $x_i$, and map(·) is the optimal mapping function, computed by the Kuhn-Munkres algorithm, that maps each cluster label to an equivalent true label.
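The AC measure of equation (15) requires an optimal label mapping. The sketch below brute-forces the mapping over permutations, which matches the Kuhn-Munkres result for a small number of clusters (the Kuhn-Munkres algorithm used in the experiments scales better); the function name is an illustrative assumption:

```python
from itertools import permutations

def clustering_accuracy(true_labels, cluster_labels):
    """Eq. (15): AC = (1/n) * sum_i delta(l_i, map(c_i)), with map chosen
    to maximize the number of matches (brute force over permutations here)."""
    n = len(true_labels)
    clus_ids = sorted(set(cluster_labels))
    true_ids = sorted(set(true_labels))
    best = 0
    for perm in permutations(true_ids, len(clus_ids)):
        mapping = dict(zip(clus_ids, perm))
        hits = sum(1 for l, c in zip(true_labels, cluster_labels) if mapping[c] == l)
        best = max(best, hits)
    return best / n
```

Note that a clustering whose labels are merely a permutation of the true labels scores AC = 1, which is the intended invariance of the measure.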
As can be seen from equation (15), the AC value lies in the interval [0,1], and a larger AC value indicates better clustering performance. Since K-means is sensitive to the initial cluster centers, we randomly initialize the class centers 10 times and take the average clustering accuracy of the 10 runs as the final clustering result. The parameter λ and the low-dimensional dimension d are set in the same way as for face recognition; Table 5 compares the clustering results of the different methods. Since LPP and NPE build their graphs with the k-nearest-neighbor method, their performance is lower than that of the other methods in most cases. In the SGLPP method the number of neighbors of each sample is adaptive, and its performance is better than LPP on both UCI datasets. Because the LSR-NPE, LRR-NPE, SPP, GoLPP, DRAG and GODRSC methods use more advanced techniques to construct the graph, they achieve higher clustering accuracy than the LPP, NPE and SGLPP methods. Clearly, owing to the two local constraints, the LC-GODR method shows the best performance, with clustering accuracies of 0.5754, 0.6730 and 0.6337, respectively.
Figs. 5(a)-(c) show the convergence curves of the LC-GODR method on the 3 clustering datasets. Similar to Fig. 4, the curves flatten within 20 iterations, showing that the proposed method converges quickly.
TABLE 1 Details of the 3 face databases
TABLE 2 Details of the 3 UCI data sets
TABLE 3 Influence of the parameter λ on the recognition rate
TABLE 4 Comparison of the highest recognition rate and standard deviation of different methods
TABLE 5 Comparison of the highest clustering accuracy and standard deviation of different methods
The invention integrates graph optimization and projection matrix learning into a unified framework, so that the graph is updated adaptively during dimension reduction. Second, by introducing local constraints, the local information of high-dimensional data is effectively mined and preserved. In particular, an efficient update strategy is proposed to solve the resulting problem. A large number of experimental and comparative results show that the invention performs well and outperforms existing related methods. The proposed method is suitable for object recognition, data clustering and data visualization.
In view of this, the invention provides a graph optimization dimension reduction method based on local constraints, which considers two local constraints simultaneously during dimension reduction, so that the low-dimensional representation of the high-dimensional samples preserves the local relationships of the original high-dimensional data well. Specifically, recognition and clustering experiments were performed on 3 international standard face databases and 3 standard datasets from the University of California, Irvine (the 3 face databases and 3 UCI datasets are detailed in Tables 1 and 2), and the comparative experiments demonstrate that the proposed method performs well.
The above is only a preferred embodiment of the present invention; the scope of protection of the present invention is not limited to this embodiment, and all technical solutions falling under the principle of the present invention belong to the scope of protection of the present invention.

Claims (4)

1. A graph optimization dimensionality reduction method based on local constraints, characterized by comprising the following steps:
1) Reading high-dimensional data X = [x_1, x_2, ..., x_n] ∈ R^(D×n), wherein x_i is the i-th sample, D is the dimension of each sample, and n is the number of samples;
2) Constructing a local constraint based on the neighbor reconstruction relationship;
In order for the low-dimensional representation of the high-dimensional data to preserve the local relationships of the original data, a graph matrix S needs to be constructed while solving for the projection matrix; since the local information of the data can be effectively captured by reconstructing each sample from its neighbors, the following constraint is considered when constructing the graph matrix S:

min_S Σ_{i=1}^{n} ||R_i ⊙ S_·i||_2^2   (1)

wherein ⊙ represents the element-wise multiplication of corresponding elements, R_i ∈ R^n is the locality indicator vector of the sample x_i, S ∈ R^(n×n) is the graph matrix, and S_·i is the i-th column of the graph matrix S;
3) Constructing a local constraint based on sample similarity;
Considering that samples that are close in the high-dimensional space should have similar reconstruction coefficients, the following constraint is considered when constructing the graph matrix S:

min_S Σ_{i,j=1}^{n} w_ij ||S_·i − S_·j||_2^2   (2)

wherein S_·i and S_·j are the i-th and j-th columns of the graph matrix S, representing the reconstruction coefficients of the samples x_i and x_j respectively, and w_ij = exp(−||x_i − x_j||^2 / σ) is the heat kernel function;
4) Constructing a dimensionality reduction objective function based on the two local constraints:

min_{P,S} [ Σ_{i=1}^{n} ||P^T (x_i − X S_·i)||_2^2 + λ Σ_{i=1}^{n} ||R_i ⊙ S_·i||_2^2 + λ Σ_{i,j=1}^{n} w_ij ||S_·i − S_·j||_2^2 ] / tr(P^T X X^T P),  s.t. P^T P = I   (3)

wherein P ∈ R^(D×d) is the projection matrix, D ≫ d, (x_i − X S_·i) represents the error of the sample x_i reconstructed by the other samples in X, P^T denotes the transpose of the matrix P, I is the identity matrix, and λ > 0 is a trade-off parameter;
5) Optimizing the objective function through an iterative strategy: first fixing the projection matrix P and updating the graph matrix S; then fixing the graph matrix S and updating the projection matrix P; finally, obtaining the optimized projection matrix P and graph matrix S after N iterations, wherein N ≤ 15;
6) For subsequent recognition and clustering tasks, projecting the high-dimensional data X with the matrix P to obtain the low-dimensional representation of the high-dimensional data, thereby achieving dimensionality reduction:

X_low = P^T X   (13)

wherein X_low ∈ R^(d×n) is the low-dimensional representation of the high-dimensional data X, i.e., each sample is mapped from the original D-dimensional space to the d-dimensional low-dimensional space.
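The steps above can be sketched end-to-end as follows. This is an illustrative assumption-laden implementation, not the patent's reference code: the function name is invented, the locality indicator R_i is taken to be the vector of squared Euclidean distances to x_i, and the P-step uses a whitened eigenproblem as a simple surrogate for the trace-ratio solvers (ITR/DNR) named in the claims. The S-step is the column-wise closed form of eq. (10) and the final projection is eq. (13).

```python
import numpy as np

def fit_logo_dr(X, d, lam=0.1, sigma=1.0, n_iter=5, seed=0):
    # Illustrative sketch of the claimed alternating scheme.
    D, n = X.shape
    sq = np.sum(X ** 2, axis=0)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X.T @ X), 0.0)
    W = np.exp(-d2 / sigma)                       # heat-kernel similarities
    Lap = np.diag(W.sum(axis=1)) - W              # graph Laplacian L = D - W
    S = np.zeros((n, n))
    P = np.linalg.qr(np.random.default_rng(seed).standard_normal((D, d)))[0]
    ridge = 1e-8 * np.eye(n)                      # numerical safeguard
    for _ in range(n_iter):
        Y = P.T @ X                               # low-dimensional samples
        G = Y.T @ Y
        for i in range(n):                        # eq. (10), column by column
            E_i = np.diag(d2[:, i])               # E_i = diag(R_i ⊙ R_i), R_i assumed
            S[:, i] = np.linalg.solve(G + lam * E_i + lam * Lap + ridge,
                                      Y.T @ Y[:, i])
        # P-step: whitened eigenproblem as a surrogate for the trace-ratio
        # problem (12); the claims use ITR or DNR instead.
        A = np.eye(n) - S
        M = X @ A @ A.T @ X.T
        C = X @ X.T + 1e-8 * np.eye(D)
        w, U = np.linalg.eigh(C)
        C_ih = U @ np.diag(w ** -0.5) @ U.T       # C^{-1/2}
        _, V = np.linalg.eigh(C_ih @ M @ C_ih)
        P = C_ih @ V[:, :d]                       # d smallest directions
    return P, S

rng = np.random.default_rng(1)
X = rng.standard_normal((10, 20))                 # D = 10, n = 20
P, S = fit_logo_dr(X, d=3, n_iter=3)
X_low = P.T @ X                                   # eq. (13)
```

The resulting X_low has one d-dimensional column per original sample and can be fed to any recognition or clustering routine.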
2. The graph optimization dimensionality reduction method based on local constraints according to claim 1, characterized in that: in step 4), using algebraic transformations, formula (3) is rewritten as:

min_{P,S} [ ||P^T X (I − S)||_F^2 + λ Σ_{i=1}^{n} S_·i^T E_i S_·i + λ tr(S^T L S) ] / tr(P^T X X^T P),  s.t. P^T P = I   (4)

wherein S^T represents the transpose of the matrix S, E_i = diag(R_i ⊙ R_i) is the diagonal matrix formed from the locality indicator vector R_i, L = D − W is the graph Laplacian matrix, W = [w_ij]_{n×n} is the symmetric similarity matrix, D is a diagonal matrix whose diagonal elements are the row sums of W, and tr(·) denotes the trace of a matrix.
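The rewrite above rests on two standard identities: ||R_i ⊙ S_·i||² = S_·i^T E_i S_·i with E_i = diag(R_i ⊙ R_i), and the Laplacian identity Σ_{i,j} w_ij ||S_·i − S_·j||² = 2 tr(S L S^T). A quick numerical check with arbitrary stand-in matrices (nothing here is patent-specific data):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
S = rng.standard_normal((n, n))        # arbitrary "graph matrix"
R_i = rng.random(n)                    # arbitrary locality indicator vector
W0 = rng.random((n, n))
W = (W0 + W0.T) / 2.0                  # symmetric similarity matrix
Dm = np.diag(W.sum(axis=1))            # diagonal matrix of row sums of W
L = Dm - W                             # graph Laplacian

# Identity 1: ||R_i (.) S_.i||^2 = S_.i^T E_i S_.i with E_i = diag(R_i * R_i)
lhs1 = np.sum((R_i * S[:, 0]) ** 2)
rhs1 = S[:, 0] @ np.diag(R_i ** 2) @ S[:, 0]

# Identity 2: sum_ij w_ij ||S_.i - S_.j||^2 = 2 tr(S L S^T)
lhs2 = sum(W[i, j] * np.sum((S[:, i] - S[:, j]) ** 2)
           for i in range(n) for j in range(n))
rhs2 = 2.0 * np.trace(S @ L @ S.T)
```

Both pairs agree to machine precision, confirming that the pairwise similarity penalty is exactly a quadratic form in the graph Laplacian.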
3. The graph optimization dimensionality reduction method based on local constraints according to claim 2, characterized in that: in step 5), the projection matrix P is first fixed and the graph matrix S is updated; with P fixed, equation (4) can be written as:

min_S ||Y − Y S||_F^2 + λ Σ_{i=1}^{n} S_·i^T E_i S_·i + λ tr(S^T L S)   (5)

wherein y_i = P^T x_i and Y = P^T X; the last term of equation (5) may be rewritten column-wise as:

λ tr(S^T L S) = λ Σ_{i=1}^{n} S_·i^T L S_·i   (6)

so that, by algebraic transformation, equation (5) can be rewritten as:

min_S Σ_{i=1}^{n} [ ||y_i − Y S_·i||_2^2 + λ S_·i^T E_i S_·i + λ S_·i^T L S_·i ]   (7)

The graph matrix S can therefore be updated column by column; for the i-th column S_·i of S, the objective function is:

min_{S_·i} ||y_i − Y S_·i||_2^2 + λ S_·i^T E_i S_·i + λ S_·i^T L S_·i   (8)

Taking the partial derivative of equation (8) with respect to S_·i and setting it equal to zero,

−2 Y^T (y_i − Y S_·i) + 2λ E_i S_·i + 2λ L S_·i = 0   (9)

yields the closed-form update:

S_·i = (Y^T Y + λ E_i + λ L)^{-1} Y^T y_i   (10).
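The closed form (10) can be sanity-checked by confirming that it zeroes the gradient of the column subproblem (8). The sketch below uses random stand-in values for Y, R_i and W (none of these are from the patent):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, lam = 3, 7, 0.5
Y = rng.standard_normal((d, n))          # stand-in for Y = P^T X
y_i = Y[:, 0]
R_i = rng.random(n)                      # stand-in locality indicator vector
E_i = np.diag(R_i ** 2)                  # E_i = diag(R_i ⊙ R_i)
W0 = rng.random((n, n))
W = (W0 + W0.T) / 2.0                    # symmetric similarity matrix
Lap = np.diag(W.sum(axis=1)) - W         # graph Laplacian

# eq. (10): closed-form minimizer of the i-th column subproblem (8)
s = np.linalg.solve(Y.T @ Y + lam * E_i + lam * Lap, Y.T @ y_i)

# gradient of eq. (8) at s, which should vanish at the minimizer
grad = -2.0 * Y.T @ (y_i - Y @ s) + 2.0 * lam * (E_i @ s) + 2.0 * lam * (Lap @ s)
```

The gradient comes out numerically zero, so the linear solve indeed recovers the stationary point of the strictly convex per-column objective.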
4. The graph optimization dimensionality reduction method based on local constraints according to claim 1, characterized in that: in step 5), the graph matrix S is fixed and the projection matrix P is updated; by removing the terms irrelevant to P, the optimization problem of formula (3) with respect to P is:

min_P tr(P^T X (I − S)(I − S)^T X^T P) / tr(P^T X X^T P),  s.t. P^T P = I   (11)

Let M = X (I − S − S^T + S S^T) X^T = X (I − S)(I − S)^T X^T and C = X X^T; then equation (11) can be written as:

min_P tr(P^T M P) / tr(P^T C P),  s.t. P^T P = I   (12)

Equation (12) is solved by the iterative trace ratio (ITR) method or the Newton decomposition (DNR) method.
CN201510777140.4A 2015-11-13 2015-11-13 Graph optimization dimensionality reduction method based on local constraint Active CN105389560B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510777140.4A CN105389560B (en) 2015-11-13 2015-11-13 Graph optimization dimensionality reduction method based on local constraint

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510777140.4A CN105389560B (en) 2015-11-13 2015-11-13 Graph optimization dimensionality reduction method based on local constraint

Publications (2)

Publication Number Publication Date
CN105389560A CN105389560A (en) 2016-03-09
CN105389560B true CN105389560B (en) 2018-05-11

Family

ID=55421832

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510777140.4A Active CN105389560B (en) Graph optimization dimensionality reduction method based on local constraint

Country Status (1)

Country Link
CN (1) CN105389560B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105825215B (en) * 2016-03-15 2019-07-19 Yunnan University Instrument localization method based on local neighbor embedded kernel function and carrier usage
CN108805179B (en) * 2018-05-24 2022-03-29 华南理工大学 Face local constraint coding based calibration and recognition method
CN109815440B (en) * 2019-01-16 2023-06-23 江西师范大学 Dimension reduction method combining graph optimization and projection learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103218617A (en) * 2013-05-13 2013-07-24 山东大学 Multi-linear large space feature extraction method
CN103226699A (en) * 2013-04-16 2013-07-31 哈尔滨工程大学 Face recognition method based on separation degree difference supervised locality preserving projection
CN103605889A (en) * 2013-11-13 2014-02-26 浙江工业大学 Data dimension reduction method based on data global-local structure preserving projections

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100543707B1 (en) * 2003-12-04 2006-01-20 삼성전자주식회사 Face recognition method and apparatus using PCA learning per subgroup

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103226699A (en) * 2013-04-16 2013-07-31 哈尔滨工程大学 Face recognition method based on separation degree difference supervised locality preserving projection
CN103218617A (en) * 2013-05-13 2013-07-24 山东大学 Multi-linear large space feature extraction method
CN103605889A (en) * 2013-11-13 2014-02-26 浙江工业大学 Data dimension reduction method based on data global-local structure preserving projections

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Face recognition method based on supervised dimensionality reduction; Yao Minghai et al.; Computer Engineering; 2014-06-25; Vol. 40, No. 5; pp. 228-233 *
Locality preserving manifold learning algorithm based on feature subspace neighborhoods; Wang Na et al.; Application Research of Computers; 2012-06-26; Vol. 29, No. 4; pp. 1318-1321 *

Also Published As

Publication number Publication date
CN105389560A (en) 2016-03-09

Similar Documents

Publication Publication Date Title
Xie et al. Hyper-Laplacian regularized multilinear multiview self-representations for clustering and semisupervised learning
Zhao et al. On similarity preserving feature selection
CN107203787B (en) Unsupervised regularization matrix decomposition feature selection method
CN109993208B (en) Clustering processing method for noisy images
Wang et al. Minimum error entropy based sparse representation for robust subspace clustering
CN105389560B (en) Graph optimization dimensionality reduction method based on local constraint
CN109063555B (en) Multi-pose face recognition method based on low-rank decomposition and sparse representation residual error comparison
Zhang et al. Enabling in-situ data analysis for large protein-folding trajectory datasets
CN114299362A (en) Small sample image classification method based on k-means clustering
Ma et al. The BYY annealing learning algorithm for Gaussian mixture with automated model selection
CN109815440B (en) Dimension reduction method combining graph optimization and projection learning
CN109657693B (en) Classification method based on correlation entropy and transfer learning
CN111027582A (en) Semi-supervised feature subspace learning method and device based on low-rank graph learning
CN108388918B (en) Data feature selection method with structure retention characteristics
Zhao et al. A novel multi-view clustering method via low-rank and matrix-induced regularization
CN108121964B (en) Matrix-based joint sparse local preserving projection face recognition method
Su et al. Graph regularized low-rank tensor representation for feature selection
Ubaru et al. UoI-NMF cluster: a robust nonnegative matrix factorization algorithm for improved parts-based decomposition and reconstruction of noisy data
Wei et al. Self-regularized fixed-rank representation for subspace segmentation
CN109614581B (en) Non-negative matrix factorization clustering method based on dual local learning
Chen et al. A general model for robust tensor factorization with unknown noise
Lv et al. A robust mixed error coding method based on nonconvex sparse representation
Meng et al. Robust discriminant projection via joint margin and locality structure preservation
Yang et al. Robust landmark graph-based clustering for high-dimensional data
Qu et al. A Fast Sparse NMF Optimization Algorithm for Hyperspectral Unmixing

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant