CN105389560B - Graph optimization dimensionality reduction method based on local constraint - Google Patents

Graph optimization dimensionality reduction method based on local constraint

Info

Publication number
CN105389560B
Authority
CN
China
Prior art keywords
matrix
graph
sample
local
dimensional
Prior art date
Legal status
Active
Application number
CN201510777140.4A
Other languages
Chinese (zh)
Other versions
CN105389560A (en)
Inventor
齐妙
王建中
孔俊
易玉根
Current Assignee
Northeast Normal University
Original Assignee
Northeast Normal University
Priority date
Filing date
Publication date
Application filed by Northeast Normal University filed Critical Northeast Normal University
Priority to CN201510777140.4A priority Critical patent/CN105389560B/en
Publication of CN105389560A publication Critical patent/CN105389560A/en
Application granted granted Critical
Publication of CN105389560B publication Critical patent/CN105389560B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G — PHYSICS; G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V 40/161 — Image or video recognition: human faces; detection, localisation, normalisation
    • G06V 40/172 — Image or video recognition: human faces; classification, e.g. identification
    • G06F 18/23213 — Pattern recognition: clustering with a fixed number of clusters, e.g. K-means clustering
    • G06F 18/24133 — Pattern recognition: classification based on distances to prototypes
    • G06F 18/24147 — Pattern recognition: distances to closest patterns, e.g. nearest-neighbour classification


Abstract

The present invention relates to a graph optimization dimensionality reduction method based on local constraints, belonging to the field of image processing. First, graph optimization and projection matrix learning are integrated into a unified framework, so that the graph is updated adaptively during dimensionality reduction. Second, by introducing local constraints, the local information of high-dimensional data can be effectively mined and preserved. An effective and fast update strategy is also proposed to solve the resulting algorithm. Extensive experiments and comparisons show that the present invention performs well, outperforms existing related methods, and is suitable for object recognition, data clustering and data visualization.

Description

Graph optimization dimension reduction method based on local constraint
Technical Field
The present invention belongs to the field of digital image processing.
Background
Much real-world data is high-dimensional. Although high-dimensional data carry more information, operating on them directly in practical applications causes problems such as the curse of dimensionality, the empty-space phenomenon, and the concentration of distances. Dimensionality reduction is an effective remedy for these problems: it eliminates irrelevant and redundant features, improves the efficiency of mining tasks, reveals the intrinsic regularity of the data, and improves prediction performance. Dimensionality reduction is therefore an important step in many practical applications and has significant practical research value.
Existing dimensionality reduction methods fall into two categories: linear and nonlinear. Linear methods map data points from a high-dimensional space into a low-dimensional space by learning a linear transformation; representative examples include Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Although such methods yield low-dimensional representations and are simple to apply, they cannot mine the intrinsic nonlinear structure of high-dimensional data. Researchers have therefore proposed many nonlinear manifold methods, such as Isometric Feature Mapping (ISOMAP), Locally Linear Embedding (LLE), and Laplacian Eigenmaps (LE). Although these methods work well on some benchmark artificial datasets, they only produce a low-dimensional representation of the training samples and do not give an explicit mapping function from the high-dimensional to the low-dimensional space, i.e., they cannot produce a low-dimensional representation of test samples. Consequently, these nonlinear methods are unsuitable for classification and recognition tasks. To overcome this limitation, a series of linearized manifold learning methods has been proposed, such as Neighborhood Preserving Embedding (NPE) and Locality Preserving Projections (LPP); these are dimensionality reduction methods based on graph construction that learn a projection matrix mapping high-dimensional data to a low-dimensional subspace. Specifically, a graph is first constructed from the input data or prior knowledge, and a low-dimensional representation is then obtained from the constructed graph. The construction of the graph is therefore central to such dimension reduction methods.
In practice, however, it is often difficult to construct a high-quality graph. The k-nearest-neighbor graph and the ε-ball graph are two simple and widely used constructions, adopted for example by the ISOMAP, LLE, NPE, and LPP methods. In these methods the values of k and ε usually must be set empirically, and if the two parameters are set improperly, the intrinsic structure of the high-dimensional data is not mined well. Moreover, both constructions typically use the same neighborhood parameter for all samples and ignore the different local structure of each sample, which degrades performance. To address this problem, Yang and Chen et al. proposed a sample-correlation method for graph construction that adaptively determines the neighborhood of each sample from the similarity between sample pairs. Based on sparse representation theory, Qiao et al. proposed the sparsity preserving projections method (SPP), which constructs a graph automatically using L1-norm regularized least squares. Liu et al. proposed the low-rank representation method for graph construction (LRR), which jointly obtains low-dimensional representations of all high-dimensional data by forcing the representation coefficients to be low-rank, thereby preserving the global structure of the data. Although such methods overcome the limitations of the k-nearest-neighbor and ε-ball graphs, their main drawback is that graph construction is independent of the subsequent dimension reduction task. That is, the graph stays fixed during dimension reduction, which degrades the performance of the dimension reduction task.
Recently, graph optimization has become a research hotspot. Zhang et al. proposed graph-optimized locality preserving projections (GoLPP), which integrate graph construction and dimension reduction into a unified framework. Qiao et al. proposed an adaptive graph dimension reduction method (DRAG) that jointly constructs a graph and learns a projection matrix. Zhang et al. proposed a graph optimization algorithm with sparsity constraints for dimensionality reduction (GODRSC); by adding an L1 regularization term, this method obtains a more flexible sparse graph. These three methods combine graph construction with dimensionality reduction, automatically update the graph while learning the projection matrix, and achieve good performance. However, they ignore the local information of the original high-dimensional data, i.e., they do not consider the similarity of the original high-dimensional samples, and thus cannot guarantee that similar high-dimensional samples remain similar in the low-dimensional space, so the dimension reduction effect is not ideal.
Disclosure of Invention
The invention provides a graph optimization dimension reduction method based on local constraint to solve the problem that the dimension reduction effect is not ideal.
The technical scheme adopted by the invention comprises the following steps:
1. Reading the high-dimensional data $X=[x_1,x_2,\dots,x_n]\in\mathbb{R}^{D\times n}$, where $x_i$ is the i-th sample, D is the sample dimension, and n is the number of samples;
2. constructing local constraint based on a neighbor reconstruction relation;
In order for the low-dimensional representation of the high-dimensional data to preserve the local relationships of the original data, a graph matrix S must be constructed while the projection matrix is learned. Reconstructing each sample from its neighbors effectively captures the local information of the data, so the following constraint is considered when constructing the graph matrix S:

$$\min_{S}\ \sum_{i=1}^{n}\left\|R_i\odot S_{\cdot i}\right\|_2^2 \qquad (1)$$

where $\odot$ denotes the element-wise (Hadamard) product, $R_i=[r_{i,1},r_{i,2},\dots,r_{i,n}]^T$ is the indicator vector of sample $x_i$ with $r_{i,j}=\exp(\|x_i-x_j\|^2/\sigma)$, $S=[S_{ij}]_{n\times n}$ is the graph matrix, and $S_{\cdot i}$ is the i-th column of S;
3. constructing local constraints based on sample similarity;
Considering that close samples in the high-dimensional space should have similar reconstruction coefficients, the following constraint is considered when constructing the graph matrix S:

$$\min_{S}\ \frac{1}{2}\sum_{i,j=1}^{n} w_{ij}\left\|S_{\cdot i}-S_{\cdot j}\right\|_2^2 \qquad (2)$$

where $S_{\cdot i}$ and $S_{\cdot j}$, the i-th and j-th columns of the graph matrix S, are the reconstruction coefficients of samples $x_i$ and $x_j$, and $w_{ij}=\exp(-\|x_i-x_j\|^2/\sigma)$ is a heat kernel function;
4. Constructing the dimensionality reduction objective function based on the two local constraints:

$$\min_{P,S}\ \frac{\sum_{i=1}^{n}\left\|P^T x_i-P^T X S_{\cdot i}\right\|_2^2+\lambda\Big(\sum_{i=1}^{n}\left\|R_i\odot S_{\cdot i}\right\|_2^2+\frac{1}{2}\sum_{i,j=1}^{n} w_{ij}\left\|S_{\cdot i}-S_{\cdot j}\right\|_2^2\Big)}{\operatorname{tr}\!\left(P^T X X^T P\right)} \qquad (3)$$

$$\text{s.t.}\quad P^T P=I$$

where $P\in\mathbb{R}^{D\times d}$ is the projection matrix, $(x_i-XS_{\cdot i})$ is the error of reconstructing sample $x_i$ from the other samples in X, $P^T$ denotes the transpose of P, I is the identity matrix, and $\lambda>0$ is a trade-off parameter;
5. Optimizing the objective function by an iterative strategy: first fix the projection matrix P and update the graph matrix S; then fix the graph matrix S and update the projection matrix P; finally, the optimized projection matrix P and graph matrix S are obtained after N (N ≤ 15) iterations;
6. For subsequent tasks such as recognition and clustering, the high-dimensional data X is projected with the matrix P to obtain its low-dimensional representation, achieving dimensionality reduction:

$$X_{low}=P^T X \qquad (13)$$

where $X_{low}\in\mathbb{R}^{d\times n}$ is the low-dimensional representation of the high-dimensional data X, i.e., each sample changes from the original D dimensions to the d dimensions of the low-dimensional space.
The invention unifies projection matrix learning and graph construction in one framework, so that the graph is updated automatically during dimension reduction, and it effectively mines and preserves the local information of high-dimensional data by establishing two local constraint relations: a neighbor reconstruction relation and sample similarity. In particular, an algorithm based on an iterative update strategy is provided to solve for the projection matrix and the graph, so that high-dimensional data can be reduced effectively. Extensive experiments and comparisons show that the method performs well, outperforms existing representative dimension reduction methods, and is suitable for object recognition, clustering, data visualization, and the like.
The invention carries out experimental comparison and analysis aiming at 3 standard face databases and 3 standard clustering data sets, and quantitatively evaluates the effectiveness and superiority of the proposed method. A large number of comparison experiment results show that the method provided by the invention not only can effectively perform face recognition and automatic data clustering, but also has better stability.
The invention has the following beneficial effects:
(1) The invention is an effective dimension reduction method aiming at high-dimensional data;
(2) Two local constraints are established instead of a single constraint for dimensionality reduction, so that the local relation of high-dimensional data can be better maintained by low-dimensional data;
(3) The projection matrix learning and graph optimization processes are unified under one framework, so that the constructed graph can be updated in a self-adaptive manner, and the performance of dimensionality reduction is improved;
(4) An effective and rapid iteration updating solving method is provided, so that the objective function can be converged within a few iterations;
(5) The method can be widely applied to dimensionality reduction of high-dimensional data, and is beneficial to tasks such as subsequent identification, clustering and data visualization.
Drawings
FIG. 1 (a) is a partial face image in the face database Yale used in the present invention;
FIG. 1 (b) is a partial face image in Extended YaleB face database used in the present invention;
fig. 1 (c) is a partial face image in the face database CMU PIE used in the present invention;
FIG. 2 is a partial image of a COIL 20 dataset used in the present invention;
FIG. 3 (a) is the comparison result of different methods on the face database Yale under different dimensions;
FIG. 3 (b) is the comparison result of different methods in different dimensions on the Extended YaleB face database;
FIG. 3 (c) is a comparison result of different methods on the CMU PIE of the face database under different dimensions;
FIG. 4 (a) is a convergence curve of the LC-GODR method on the face database Yale;
FIG. 4 (b) is the convergence curve of the LC-GODR method on Extended YaleB face database;
FIG. 4 (c) is a convergence curve of the LC-GODR method on the face database CMU PIE;
FIG. 5 (a) is a convergence curve of the LC-GODR method on the clustered data set Glass;
FIG. 5 (b) is a convergence curve of the LC-GODR method on the clustering data set Sonar;
fig. 5 (c) is a convergence curve of the LC-GODR method on the clustering data set COIL 20.
Detailed Description
The method comprises the following steps:
1. Reading the high-dimensional data $X=[x_1,x_2,\dots,x_n]\in\mathbb{R}^{D\times n}$, where $x_i$ is the i-th sample, D is the sample dimension, and n is the number of samples;
2. constructing local constraint based on a neighbor reconstruction relation;
In order for the low-dimensional representation of the high-dimensional data to preserve the local relationships of the original data, a graph matrix S is constructed while solving for the projection matrix. In general, if sample $x_i$ and sample $x_j$ are neighbors, $x_i$ can be effectively reconstructed by $x_j$; in the constructed graph matrix S there is an edge between vertices $x_i$ and $x_j$, and the reconstruction coefficient is assigned as the weight of that edge. That is, reconstructing a sample from its neighboring samples effectively captures the local information of the data. Based on the above analysis, the following local constraint based on neighbor reconstruction is established:
$$\min_{S}\ \sum_{i=1}^{n}\left\|R_i\odot S_{\cdot i}\right\|_2^2 \qquad (1)$$

where $\odot$ denotes the element-wise (Hadamard) product, $R_i=[r_{i,1},r_{i,2},\dots,r_{i,n}]^T$ is the indicator vector of sample $x_i$, $r_{i,j}=\exp(\|x_i-x_j\|^2/\sigma)$ measures the proximity of samples $x_i$ and $x_j$, $S=[S_{ij}]_{n\times n}$ is the graph matrix, and $S_{\cdot i}$ is the i-th column of S;
As can be seen from the indicator vector, the closer samples $x_i$ and $x_j$ are, the smaller the value of $r_{i,j}$; minimizing equation (1) therefore assigns a larger reconstruction coefficient to the corresponding edge in the graph matrix S;
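For illustration only (this is not code from the patent; the names `X`, `S`, and `sigma` are assumptions), the indicator vectors and the neighbor-reconstruction penalty of equation (1) might be sketched in NumPy as follows, with samples stored as columns of `X`:

```python
import numpy as np

def indicator_matrix(X, sigma=1.0):
    """R[i, j] = exp(||x_i - x_j||^2 / sigma); columns of X are samples.
    Row i of R is the indicator vector R_i of sample x_i."""
    sq = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)  # n x n squared distances
    return np.exp(sq / sigma)

def neighbor_reconstruction_penalty(X, S, sigma=1.0):
    """Equation (1): sum_i || R_i (element-wise product) S_{.i} ||_2^2.
    Entry (R_i)_j multiplies S[j, i], the j-th entry of column S_{.i}."""
    R = indicator_matrix(X, sigma)
    return float(((R.T * S) ** 2).sum())
```

Since distant pairs get exponentially large indicator values, minimizing this penalty drives the coefficients of far-away samples toward zero, which is the locality effect the text describes.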
3. constructing local constraints based on sample similarity;
it has been proved that similar samples in the high-dimensional space should have similar reconstruction coefficients, which is an important condition for mining local information, and therefore, the following local constraint based on the sample similarity is established:
$$\min_{S}\ \frac{1}{2}\sum_{i,j=1}^{n} w_{ij}\left\|S_{\cdot i}-S_{\cdot j}\right\|_2^2 \qquad (2)$$

where $S_{\cdot i}$ and $S_{\cdot j}$, the i-th and j-th columns of the graph matrix S, are the reconstruction coefficients of samples $x_i$ and $x_j$, and $w_{ij}=\exp(-\|x_i-x_j\|^2/\sigma)$ is a heat kernel function characterizing the similarity between samples $x_i$ and $x_j$; by minimizing equation (2), similar samples obtain similar reconstruction coefficients;
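As an illustrative sketch (names are assumptions, not from the patent), the heat-kernel similarities and the pairwise penalty of equation (2) can be computed compactly through a graph Laplacian:

```python
import numpy as np

def heat_kernel(X, sigma=1.0):
    """W[i, j] = exp(-||x_i - x_j||^2 / sigma); columns of X are samples."""
    sq = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)
    return np.exp(-sq / sigma)

def similarity_penalty(S, W):
    """Equation (2): (1/2) sum_{i,j} w_ij ||S_{.i} - S_{.j}||^2,
    evaluated via the identity with the graph Laplacian L = D - W
    as tr(S L S^T), where D is the diagonal matrix of row sums of W."""
    Lap = np.diag(W.sum(axis=1)) - W
    return float(np.trace(S @ Lap @ S.T))
```

The Laplacian form avoids the explicit double loop over sample pairs and is the standard way such penalties are implemented.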
4. Unifying projection matrix learning and graph construction in a single framework, the dimension reduction objective function based on the two local constraints is established:

$$\min_{P,S}\ \frac{\sum_{i=1}^{n}\left\|P^T x_i-P^T X S_{\cdot i}\right\|_2^2+\lambda\Big(\sum_{i=1}^{n}\left\|R_i\odot S_{\cdot i}\right\|_2^2+\frac{1}{2}\sum_{i,j=1}^{n} w_{ij}\left\|S_{\cdot i}-S_{\cdot j}\right\|_2^2\Big)}{\operatorname{tr}\!\left(P^T X X^T P\right)} \qquad (3)$$

$$\text{s.t.}\quad P^T P=I$$

where $P\in\mathbb{R}^{D\times d}$ is the projection matrix, $(x_i-XS_{\cdot i})$ is the error of reconstructing sample $x_i$ from the other samples in X, $P^T$ denotes the transpose of P, I is the identity matrix, and $\lambda>0$ is a trade-off parameter;
Using algebraic transformations, equation (3) is rewritten as:

$$\min_{P,S}\ \frac{\operatorname{tr}\!\left(P^T X(I-S)(I-S)^T X^T P\right)+\lambda\sum_{i=1}^{n} S_{\cdot i}^T E_i S_{\cdot i}+\lambda\operatorname{tr}\!\left(S^T L S\right)}{\operatorname{tr}\!\left(P^T X X^T P\right)} \qquad (4)$$

where $S^T$ denotes the transpose of the matrix S, $E_i=\operatorname{diag}(r_{i,1}^2,\dots,r_{i,n}^2)$ is the diagonal matrix formed from the local indicator vector $R_i$, $L=D-W$ is the Laplacian matrix, $W=[w_{ij}]_{n\times n}$ is the symmetric sample similarity matrix, D is the diagonal matrix whose diagonal elements are the row sums of W, and tr(·) denotes the trace of a matrix;
5. Similar to the GoLPP and DRAG methods, the objective function is optimized by an alternating iterative strategy:
(1) First, the projection matrix P is fixed and the graph matrix S is updated; equation (4) can then be written as:

$$\min_{S}\ \left\|Y-YS\right\|_F^2+\lambda\sum_{i=1}^{n} S_{\cdot i}^T E_i S_{\cdot i}+\lambda\operatorname{tr}\!\left(S^T L S\right) \qquad (5)$$

where $y_i=P^T x_i$ and $Y=P^T X$; the last term of equation (5) can be rewritten as:

$$\operatorname{tr}\!\left(S^T L S\right)=\sum_{i=1}^{n} S_{\cdot i}^T L S_{\cdot i} \qquad (6)$$
By algebraic transformation, equation (5) can be rewritten as:

$$\min_{S}\ \sum_{i=1}^{n}\Big(\left\|y_i-YS_{\cdot i}\right\|_2^2+\lambda S_{\cdot i}^T E_i S_{\cdot i}+\lambda S_{\cdot i}^T L S_{\cdot i}\Big) \qquad (7)$$

The graph matrix S can thus be updated column by column; for the i-th column $S_{\cdot i}$ of S, the objective function is:

$$\min_{S_{\cdot i}}\ \left\|y_i-YS_{\cdot i}\right\|_2^2+\lambda S_{\cdot i}^T E_i S_{\cdot i}+\lambda S_{\cdot i}^T L S_{\cdot i} \qquad (8)$$
Taking the partial derivative of equation (8) with respect to $S_{\cdot i}$ and setting it to zero gives:

$$S_{\cdot i}=\left(Y^T Y+\lambda E_i+\lambda L\right)^{-1} Y^T y_i \qquad (10)$$
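The closed-form column update of equation (10) is a single linear solve per column; a hedged NumPy sketch (argument names are illustrative, not from the patent) is:

```python
import numpy as np

def update_graph_column(Y, i, E_i, Lap, lam):
    """Equation (10): S_{.i} = (Y^T Y + lam*E_i + lam*L)^{-1} Y^T y_i,
    with Y = P^T X of shape d x n, E_i the diagonal indicator matrix of
    sample i, and Lap the graph Laplacian L = D - W."""
    A = Y.T @ Y + lam * E_i + lam * Lap
    # Solve the linear system instead of forming the inverse explicitly.
    return np.linalg.solve(A, Y.T @ Y[:, i])
```

Because $E_i$ has strictly positive diagonal and the Laplacian is positive semidefinite, the system matrix is positive definite for any $\lambda>0$, so the solve is well posed.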
(2) Next, the graph matrix S is fixed and the projection matrix P is updated; removing the terms that do not depend on P, the optimization problem of equation (3) with respect to P is:

$$\min_{P}\ \frac{\operatorname{tr}\!\left(P^T X(I-S)(I-S)^T X^T P\right)}{\operatorname{tr}\!\left(P^T X X^T P\right)} \qquad (11)$$

$$\text{s.t.}\quad P^T P=I$$
Let $M=X(I-S)(I-S)^T X^T=X\left(I-S-S^T+SS^T\right)X^T$ and $C=XX^T$; equation (11) can be written as:

$$\min_{P}\ \frac{\operatorname{tr}\!\left(P^T M P\right)}{\operatorname{tr}\!\left(P^T C P\right)} \qquad (12)$$

$$\text{s.t.}\quad P^T P=I$$
Clearly, equation (12) is a trace-ratio optimization problem, which can be solved by the iterative trace-ratio method (ITR) or the decomposed Newton method (DNR);
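A minimal ITR-style solver for a trace-ratio problem of the form (12) can be sketched as follows; the stopping tolerance, iteration cap, and the assumption that C is positive definite are ours, not the patent's:

```python
import numpy as np

def trace_ratio_min(M, C, d, n_iter=50, tol=1e-10):
    """ITR-style sketch for min_P tr(P^T M P) / tr(P^T C P), s.t. P^T P = I.
    Each step takes the d eigenvectors of M - lam*C with the smallest
    eigenvalues (eigh returns them in ascending order), then updates lam."""
    Dim = M.shape[0]
    P = np.eye(Dim)[:, :d]                                  # arbitrary orthonormal start
    lam = np.trace(P.T @ M @ P) / np.trace(P.T @ C @ P)
    for _ in range(n_iter):
        _, vecs = np.linalg.eigh(M - lam * C)
        P = vecs[:, :d]
        new_lam = np.trace(P.T @ M @ P) / np.trace(P.T @ C @ P)
        if abs(lam - new_lam) < tol:
            break
        lam = new_lam
    return P, lam
```

Each step cannot increase the ratio: the chosen P makes tr(Pᵀ(M − λC)P) ≤ 0, which forces the next λ to be no larger than the current one.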
Finally, the optimized projection matrix P and the optimized graph matrix S are obtained after N (N ≤ 15) iterations;
to demonstrate the convergence of the proposed algorithm, the proposed algorithm can be decomposed into two sub-problems, such as equation (5) and equation (11). For the first sub-problem, a closed-form solution of the graph matrix S can be obtained by equation (10), and it is obvious that the objective function value of us is in a descending trend in the process of iteratively solving S. For the second subproblem, it has been demonstrated previously that the ITR and DNR methods can yield a globally optimal solution to the trace ratio problem. Therefore, solving equation (12) with ITRs or DNRs also tends to decrease our objective function values. Finally, since all terms in equation (3) are not less than 0, our objective function has a lower bound. Therefore, according to the cauchy convergence rule, the algorithm we propose is convergent;
6. Projecting the high-dimensional data X with the matrix P yields the low-dimensional representation, achieving dimensionality reduction:

$$X_{low}=P^T X \qquad (13)$$

where $X_{low}\in\mathbb{R}^{d\times n}$ is the low-dimensional representation of the high-dimensional data X, i.e., each sample changes from the original D dimensions to the d dimensions of the low-dimensional space.
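The alternating scheme above (steps 1-6) can be sketched end to end. This is an illustrative reading of the procedure, not the patent's reference implementation; the initialization, hyperparameter values, and the single trace-ratio step per outer iteration are simplifying assumptions:

```python
import numpy as np

def lc_godr(X, d, lam=0.1, sigma=1.0, n_iter=10):
    """Sketch of the alternating optimization: fix P, update S column by
    column via Eq. (10); fix S, take one trace-ratio step for P via Eq. (12).
    X is D x n with samples as columns; returns P (D x d) and S (n x n)."""
    D_dim, n = X.shape
    sq = ((X[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)
    R = np.exp(sq / sigma)                       # row i = indicator vector R_i
    W = np.exp(-sq / sigma)                      # heat-kernel similarities
    Lap = np.diag(W.sum(axis=1)) - W             # Laplacian L = D - W
    S = np.zeros((n, n))
    P = np.linalg.qr(np.random.default_rng(0).standard_normal((D_dim, d)))[0]
    C = X @ X.T
    for _ in range(n_iter):
        Y = P.T @ X
        G = Y.T @ Y
        for i in range(n):                       # Eq. (10): column-wise S update
            E_i = np.diag(R[i] ** 2)
            S[:, i] = np.linalg.solve(G + lam * E_i + lam * Lap, Y.T @ Y[:, i])
        E = np.eye(n) - S                        # reconstruction residual operator
        M = X @ E @ E.T @ X.T                    # numerator matrix of Eq. (12)
        ratio = np.trace(P.T @ M @ P) / np.trace(P.T @ C @ P)
        P = np.linalg.eigh(M - ratio * C)[1][:, :d]
    return P, S

def embed(X, P):
    """Eq. (13): X_low = P^T X."""
    return P.T @ X
```

After training, `embed` projects both training and unseen test samples, which is exactly the property the background section identifies as missing from the purely nonlinear methods.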
Experimental example: the beneficial effects of the invention are further illustrated by the analysis and comparison of specific experimental results.
The invention provides an effective graph optimization dimension reduction method based on local constraints. To evaluate the proposed method efficiently and systematically, we performed a large number of classification/recognition and clustering experiments on 3 standard face databases (Yale, Extended YaleB and CMU PIE) and 3 standard UCI datasets (Glass, Sonar and COIL 20), where Yale, Extended YaleB, CMU PIE and COIL 20 are image datasets and Glass and Sonar are vector datasets; when dimension reduction is performed with the proposed method, each image is represented as a high-dimensional vector. Figs. 1(a)-(c) and Fig. 2 show partial images from the face databases and the COIL 20 dataset, respectively. Tables 1 and 2 give detailed information on the 3 face databases and the 3 UCI datasets, respectively. In addition, the performance of the proposed method (LC-GODR for short) is compared quantitatively with several representative methods, including LPP, NPE, SGLPP, LSR-NPE, LRR-NPE, SPP, GoLPP, DRAG and GODRSC.
For the face recognition task, images of each face are randomly selected as training samples to obtain the projection matrix P, the remaining t images are used as test samples, and recognition uses the simple and effective Euclidean distance with a nearest neighbor (NN) classifier. To verify the stability of the proposed method, this process is repeated 10 times, and the average of the 10 recognition results is taken as the final recognition rate.
The recognition rate is defined as:

$$\text{recognition rate}=\frac{T}{N}\times 100\% \qquad (14)$$

where T is the number of correctly recognized samples and N is the total number of recognized samples.
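The recognition protocol (nearest neighbor under Euclidean distance, recognition rate as in equation (14)) can be sketched as follows; the function and variable names are illustrative assumptions:

```python
import numpy as np

def nn_recognition_rate(Y_train, labels_train, Y_test, labels_test):
    """Eq. (14): rate = T / N * 100, with a 1-NN classifier under
    Euclidean distance. Columns of Y_train / Y_test are (projected) samples."""
    correct = 0
    for j in range(Y_test.shape[1]):
        dists = np.linalg.norm(Y_train - Y_test[:, [j]], axis=0)
        if labels_train[int(np.argmin(dists))] == labels_test[j]:
            correct += 1
    return 100.0 * correct / Y_test.shape[1]
```

In the experiments described here, `Y_train` and `Y_test` would be the low-dimensional embeddings $P^T X$ of the training and test images.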
Clearly, the parameter λ and the low-dimensional representation dimension d are the two important parameters affecting the face recognition result; Table 3 shows the influence of the parameter λ on the recognition rate. For the Yale database, the proposed algorithm achieves the highest recognition rate when λ takes a small value. In contrast, it performs better on Extended YaleB and CMU PIE when λ takes a larger value. This is because, compared with Extended YaleB and CMU PIE, the Yale database has few training samples and large intra-class variance, and samples of the same class may not be adjacent in feature space, so the reconstruction relation of the samples should be strengthened. Conversely, since Extended YaleB and CMU PIE contain more training samples and smaller head pose and facial expression changes, with smaller intra-class variance relative to Yale, the local constraints should be given greater weight.
Figs. 3(a)-(c) show the effect of the low-dimensional subspace dimension d on the recognition rate. The proposed LC-GODR method performs worse than some other methods when the subspace dimension is low; however, as the dimension increases, its recognition rate improves significantly. Table 4 compares the highest recognition rates and standard deviations of the different methods, where the values in brackets are the low-dimensional subspace dimension d at which the highest recognition rate is obtained. The standard deviation is over the 10 repetitions, and a smaller value indicates better stability. The proposed method achieves the highest recognition rate on all three databases, 89.86%, 90.59% and 93.75% respectively, and has better stability. The DRAG and GODRSC methods also perform well, but remain inferior to the LC-GODR method.
Figs. 4(a)-(c) show the convergence curves of the LC-GODR method on the 3 face databases. The horizontal axis is the number of iterations and the vertical axis is the objective function value; the proposed iterative update strategy converges quickly, i.e., the objective function converges within 20 iterations.
For the clustering task, the low-dimensional representations are clustered automatically with the K-means algorithm, and clustering performance is evaluated by the clustering accuracy (AC), defined as:

$$AC=\frac{\sum_{i=1}^{n}\delta\big(l_i,\,map(c_i)\big)}{n} \qquad (15)$$

where $\delta(a,b)=1$ if $a=b$ and 0 otherwise, n is the total number of samples, $l_i$ is the true label of sample $x_i$, $c_i$ is the cluster label obtained for $x_i$, and map(·) is the optimal mapping function, computed by the Kuhn-Munkres algorithm, that maps each cluster label to an equivalent true label.
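The AC measure of equation (15) requires an optimal label mapping. The sketch below brute-forces the mapping over permutations, which matches the Kuhn-Munkres result for a small number of clusters (the Kuhn-Munkres algorithm used in the experiments scales better); the function name is an illustrative assumption:

```python
from itertools import permutations

def clustering_accuracy(true_labels, cluster_labels):
    """Eq. (15): AC = (1/n) * sum_i delta(l_i, map(c_i)), with map chosen
    to maximize the number of matches (brute force over permutations here)."""
    n = len(true_labels)
    clus_ids = sorted(set(cluster_labels))
    true_ids = sorted(set(true_labels))
    best = 0
    for perm in permutations(true_ids, len(clus_ids)):
        mapping = dict(zip(clus_ids, perm))
        hits = sum(1 for l, c in zip(true_labels, cluster_labels) if mapping[c] == l)
        best = max(best, hits)
    return best / n
```

Note that a clustering whose labels are merely a permutation of the true labels scores AC = 1, which is the intended invariance of the measure.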
As can be seen from equation (15), the AC value lies in the interval [0,1], and a larger AC value indicates better clustering performance. Since K-means is sensitive to the initial cluster centers, we randomly initialize the class centers 10 times and take the average clustering accuracy of the 10 runs as the final clustering result. The parameter λ and the low-dimensional dimension d are set in the same way as for face recognition; Table 5 compares the clustering results of the different methods. Since LPP and NPE build their graphs with the k-nearest-neighbor method, their performance is lower than that of the other methods in most cases. In the SGLPP method the number of neighbors of each sample is adaptive, and its performance is better than LPP on both UCI datasets. Because the LSR-NPE, LRR-NPE, SPP, GoLPP, DRAG and GODRSC methods use more advanced techniques to construct the graph, they achieve higher clustering accuracy than the LPP, NPE and SGLPP methods. Clearly, owing to the two local constraints, the LC-GODR method shows the best performance, with clustering accuracies of 0.5754, 0.6730 and 0.6337, respectively.
Figs. 5(a)-(c) show the convergence curves of the LC-GODR method on the 3 clustering datasets. Similar to Fig. 4, the curves flatten within 20 iterations, showing that the proposed method converges quickly.
TABLE 1 Details of the 3 face databases
TABLE 2 Details of the 3 UCI data sets
TABLE 3 Influence of the parameter λ on the recognition rate
TABLE 4 Comparison of the highest recognition rate and standard deviation of different methods
TABLE 5 Comparison of the highest clustering accuracy and standard deviation of different methods
The invention integrates graph optimization and projection matrix learning into a unified framework, so that the graph is updated adaptively during dimension reduction. Second, by introducing local constraints, the local information of high-dimensional data is effectively mined and preserved. In particular, an efficient update strategy is proposed to solve the resulting problem. A large number of experimental and comparative results show that the invention performs well and outperforms existing related methods. The proposed method is suitable for object recognition, data clustering and data visualization.
In view of this, the invention provides a graph optimization dimension reduction method based on local constraints, which considers two local constraints simultaneously during dimension reduction, so that the low-dimensional representation of the high-dimensional samples preserves the local relationships of the original high-dimensional data well. Specifically, recognition and clustering experiments were performed on 3 international standard face databases and 3 standard datasets from the University of California, Irvine (the 3 face databases and 3 UCI datasets are detailed in Tables 1 and 2), and the comparative experiments demonstrate that the proposed method performs well.
The above is only a preferred embodiment of the present invention; the scope of protection of the present invention is not limited to this embodiment, and all technical solutions falling under the principle of the present invention belong to the scope of protection of the present invention.

Claims (4)

1. A graph optimization dimensionality reduction method based on local constraints, characterized by comprising the following steps:
1) Reading high-dimensional data X = [x_1, x_2, ..., x_n] ∈ R^(D×n), wherein x_i is the i-th sample, D is the dimension of each sample, and n is the number of samples;
2) Constructing a local constraint based on the neighbor reconstruction relationship;
In order for the low-dimensional representation of the high-dimensional data to preserve the local relationships of the original data, a graph matrix S needs to be constructed while solving for the projection matrix; since the local information of the data can be effectively captured by reconstructing each sample from its neighbors, the following constraint is considered when constructing the graph matrix S:

min_S Σ_{i=1}^{n} ||R_i ⊙ S_·i||_2^2   (1)

wherein ⊙ represents the element-wise multiplication of corresponding elements, R_i ∈ R^n is the locality indicator vector of the sample x_i, S ∈ R^(n×n) is the graph matrix, and S_·i is the i-th column of the graph matrix S;
3) Constructing a local constraint based on sample similarity;
Considering that samples that are close in the high-dimensional space should have similar reconstruction coefficients, the following constraint is considered when constructing the graph matrix S:

min_S Σ_{i,j=1}^{n} w_ij ||S_·i − S_·j||_2^2   (2)

wherein S_·i and S_·j are the i-th and j-th columns of the graph matrix S, representing the reconstruction coefficients of the samples x_i and x_j respectively, and w_ij = exp(−||x_i − x_j||^2 / σ) is the heat kernel function;
4) Constructing a dimensionality reduction objective function based on the two local constraints:

min_{P,S} [ Σ_{i=1}^{n} ||P^T (x_i − X S_·i)||_2^2 + λ Σ_{i=1}^{n} ||R_i ⊙ S_·i||_2^2 + λ Σ_{i,j=1}^{n} w_ij ||S_·i − S_·j||_2^2 ] / tr(P^T X X^T P),  s.t. P^T P = I   (3)

wherein P ∈ R^(D×d) is the projection matrix, D ≫ d, (x_i − X S_·i) represents the error of the sample x_i reconstructed by the other samples in X, P^T denotes the transpose of the matrix P, I is the identity matrix, and λ > 0 is a trade-off parameter;
5) Optimizing the objective function through an iterative strategy: first fixing the projection matrix P and updating the graph matrix S; then fixing the graph matrix S and updating the projection matrix P; finally, obtaining the optimized projection matrix P and graph matrix S after N iterations, wherein N ≤ 15;
6) For subsequent recognition and clustering tasks, projecting the high-dimensional data X with the matrix P to obtain the low-dimensional representation of the high-dimensional data, thereby achieving dimensionality reduction:

X_low = P^T X   (13)

wherein X_low ∈ R^(d×n) is the low-dimensional representation of the high-dimensional data X, i.e., each sample is mapped from the original D-dimensional space to the d-dimensional low-dimensional space.
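The steps above can be sketched end-to-end as follows. This is an illustrative assumption-laden implementation, not the patent's reference code: the function name is invented, the locality indicator R_i is taken to be the vector of squared Euclidean distances to x_i, and the P-step uses a whitened eigenproblem as a simple surrogate for the trace-ratio solvers (ITR/DNR) named in the claims. The S-step is the column-wise closed form of eq. (10) and the final projection is eq. (13).

```python
import numpy as np

def fit_logo_dr(X, d, lam=0.1, sigma=1.0, n_iter=5, seed=0):
    # Illustrative sketch of the claimed alternating scheme.
    D, n = X.shape
    sq = np.sum(X ** 2, axis=0)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X.T @ X), 0.0)
    W = np.exp(-d2 / sigma)                       # heat-kernel similarities
    Lap = np.diag(W.sum(axis=1)) - W              # graph Laplacian L = D - W
    S = np.zeros((n, n))
    P = np.linalg.qr(np.random.default_rng(seed).standard_normal((D, d)))[0]
    ridge = 1e-8 * np.eye(n)                      # numerical safeguard
    for _ in range(n_iter):
        Y = P.T @ X                               # low-dimensional samples
        G = Y.T @ Y
        for i in range(n):                        # eq. (10), column by column
            E_i = np.diag(d2[:, i])               # E_i = diag(R_i ⊙ R_i), R_i assumed
            S[:, i] = np.linalg.solve(G + lam * E_i + lam * Lap + ridge,
                                      Y.T @ Y[:, i])
        # P-step: whitened eigenproblem as a surrogate for the trace-ratio
        # problem (12); the claims use ITR or DNR instead.
        A = np.eye(n) - S
        M = X @ A @ A.T @ X.T
        C = X @ X.T + 1e-8 * np.eye(D)
        w, U = np.linalg.eigh(C)
        C_ih = U @ np.diag(w ** -0.5) @ U.T       # C^{-1/2}
        _, V = np.linalg.eigh(C_ih @ M @ C_ih)
        P = C_ih @ V[:, :d]                       # d smallest directions
    return P, S

rng = np.random.default_rng(1)
X = rng.standard_normal((10, 20))                 # D = 10, n = 20
P, S = fit_logo_dr(X, d=3, n_iter=3)
X_low = P.T @ X                                   # eq. (13)
```

The resulting X_low has one d-dimensional column per original sample and can be fed to any recognition or clustering routine.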
2. The graph optimization dimensionality reduction method based on local constraints according to claim 1, characterized in that: in step 4), using algebraic transformations, formula (3) is rewritten as:

min_{P,S} [ ||P^T X (I − S)||_F^2 + λ Σ_{i=1}^{n} S_·i^T E_i S_·i + λ tr(S^T L S) ] / tr(P^T X X^T P),  s.t. P^T P = I   (4)

wherein S^T represents the transpose of the matrix S, E_i = diag(R_i ⊙ R_i) is the diagonal matrix formed from the locality indicator vector R_i, L = D − W is the graph Laplacian matrix, W = [w_ij]_{n×n} is the symmetric similarity matrix, D is a diagonal matrix whose diagonal elements are the row sums of W, and tr(·) denotes the trace of a matrix.
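The rewrite above rests on two standard identities: ||R_i ⊙ S_·i||² = S_·i^T E_i S_·i with E_i = diag(R_i ⊙ R_i), and the Laplacian identity Σ_{i,j} w_ij ||S_·i − S_·j||² = 2 tr(S L S^T). A quick numerical check with arbitrary stand-in matrices (nothing here is patent-specific data):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
S = rng.standard_normal((n, n))        # arbitrary "graph matrix"
R_i = rng.random(n)                    # arbitrary locality indicator vector
W0 = rng.random((n, n))
W = (W0 + W0.T) / 2.0                  # symmetric similarity matrix
Dm = np.diag(W.sum(axis=1))            # diagonal matrix of row sums of W
L = Dm - W                             # graph Laplacian

# Identity 1: ||R_i (.) S_.i||^2 = S_.i^T E_i S_.i with E_i = diag(R_i * R_i)
lhs1 = np.sum((R_i * S[:, 0]) ** 2)
rhs1 = S[:, 0] @ np.diag(R_i ** 2) @ S[:, 0]

# Identity 2: sum_ij w_ij ||S_.i - S_.j||^2 = 2 tr(S L S^T)
lhs2 = sum(W[i, j] * np.sum((S[:, i] - S[:, j]) ** 2)
           for i in range(n) for j in range(n))
rhs2 = 2.0 * np.trace(S @ L @ S.T)
```

Both pairs agree to machine precision, confirming that the pairwise similarity penalty is exactly a quadratic form in the graph Laplacian.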
3. The graph optimization dimensionality reduction method based on local constraints according to claim 2, characterized in that: in step 5), the projection matrix P is first fixed and the graph matrix S is updated; with P fixed, equation (4) can be written as:

min_S ||Y − Y S||_F^2 + λ Σ_{i=1}^{n} S_·i^T E_i S_·i + λ tr(S^T L S)   (5)

wherein y_i = P^T x_i and Y = P^T X; the last term of equation (5) may be rewritten column-wise as:

λ tr(S^T L S) = λ Σ_{i=1}^{n} S_·i^T L S_·i   (6)

so that, by algebraic transformation, equation (5) can be rewritten as:

min_S Σ_{i=1}^{n} [ ||y_i − Y S_·i||_2^2 + λ S_·i^T E_i S_·i + λ S_·i^T L S_·i ]   (7)

The graph matrix S can therefore be updated column by column; for the i-th column S_·i of S, the objective function is:

min_{S_·i} ||y_i − Y S_·i||_2^2 + λ S_·i^T E_i S_·i + λ S_·i^T L S_·i   (8)

Taking the partial derivative of equation (8) with respect to S_·i and setting it equal to zero,

−2 Y^T (y_i − Y S_·i) + 2λ E_i S_·i + 2λ L S_·i = 0   (9)

yields the closed-form update:

S_·i = (Y^T Y + λ E_i + λ L)^{-1} Y^T y_i   (10).
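The closed form (10) can be sanity-checked by confirming that it zeroes the gradient of the column subproblem (8). The sketch below uses random stand-in values for Y, R_i and W (none of these are from the patent):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, lam = 3, 7, 0.5
Y = rng.standard_normal((d, n))          # stand-in for Y = P^T X
y_i = Y[:, 0]
R_i = rng.random(n)                      # stand-in locality indicator vector
E_i = np.diag(R_i ** 2)                  # E_i = diag(R_i ⊙ R_i)
W0 = rng.random((n, n))
W = (W0 + W0.T) / 2.0                    # symmetric similarity matrix
Lap = np.diag(W.sum(axis=1)) - W         # graph Laplacian

# eq. (10): closed-form minimizer of the i-th column subproblem (8)
s = np.linalg.solve(Y.T @ Y + lam * E_i + lam * Lap, Y.T @ y_i)

# gradient of eq. (8) at s, which should vanish at the minimizer
grad = -2.0 * Y.T @ (y_i - Y @ s) + 2.0 * lam * (E_i @ s) + 2.0 * lam * (Lap @ s)
```

The gradient comes out numerically zero, so the linear solve indeed recovers the stationary point of the strictly convex per-column objective.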
4. The graph optimization dimensionality reduction method based on local constraints according to claim 1, characterized in that: in step 5), the graph matrix S is fixed and the projection matrix P is updated; by removing the terms irrelevant to P, the optimization problem of formula (3) with respect to P is:

min_P tr(P^T X (I − S)(I − S)^T X^T P) / tr(P^T X X^T P),  s.t. P^T P = I   (11)

Let M = X (I − S − S^T + S S^T) X^T = X (I − S)(I − S)^T X^T and C = X X^T; then equation (11) can be written as:

min_P tr(P^T M P) / tr(P^T C P),  s.t. P^T P = I   (12)

Equation (12) is solved by the iterative trace ratio (ITR) method or the Newton decomposition (DNR) method.
CN201510777140.4A 2015-11-13 2015-11-13 Graph optimization dimensionality reduction method based on local constraint Active CN105389560B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510777140.4A CN105389560B (en) 2015-11-13 2015-11-13 Graph optimization dimensionality reduction method based on local constraint

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510777140.4A CN105389560B (en) 2015-11-13 2015-11-13 Graph optimization dimensionality reduction method based on local constraint

Publications (2)

Publication Number Publication Date
CN105389560A CN105389560A (en) 2016-03-09
CN105389560B true CN105389560B (en) 2018-05-11

Family

ID=55421832

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510777140.4A Active CN105389560B (en) Graph optimization dimensionality reduction method based on local constraint

Country Status (1)

Country Link
CN (1) CN105389560B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105825215B (en) * 2016-03-15 2019-07-19 Yunnan University Instrument localization method based on local neighbor embedded kernel function and carrier usage
CN108805179B (en) * 2018-05-24 2022-03-29 华南理工大学 Face local constraint coding based calibration and recognition method
CN109815440B (en) * 2019-01-16 2023-06-23 江西师范大学 Dimension reduction method combining graph optimization and projection learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103218617A (en) * 2013-05-13 2013-07-24 山东大学 Multi-linear large space feature extraction method
CN103226699A (en) * 2013-04-16 2013-07-31 哈尔滨工程大学 Face recognition method based on separation degree difference supervised locality preserving projection
CN103605889A (en) * 2013-11-13 2014-02-26 浙江工业大学 Data dimension reduction method based on data global-local structure preserving projections

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100543707B1 (en) * 2003-12-04 2006-01-20 삼성전자주식회사 Face recognition method and apparatus using PCA learning per subgroup

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103226699A (en) * 2013-04-16 2013-07-31 哈尔滨工程大学 Face recognition method based on separation degree difference supervised locality preserving projection
CN103218617A (en) * 2013-05-13 2013-07-24 山东大学 Multi-linear large space feature extraction method
CN103605889A (en) * 2013-11-13 2014-02-26 浙江工业大学 Data dimension reduction method based on data global-local structure preserving projections

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Face recognition method based on supervised dimensionality reduction; Yao Minghai et al.; Computer Engineering; 2014-06-25; Vol. 40, No. 5; pp. 228-233 *
Locality preserving manifold learning algorithm based on feature subspace neighborhoods; Wang Na et al.; Application Research of Computers; 2012-06-26; Vol. 29, No. 4; pp. 1318-1321 *

Also Published As

Publication number Publication date
CN105389560A (en) 2016-03-09

Similar Documents

Publication Publication Date Title
Xie et al. Hyper-Laplacian regularized multilinear multiview self-representations for clustering and semisupervised learning
Zhao et al. On similarity preserving feature selection
CN107203787B (en) Unsupervised regularization matrix decomposition feature selection method
CN109993208B (en) Clustering processing method for noisy images
Wang et al. Minimum error entropy based sparse representation for robust subspace clustering
CN105389560B (en) Graph optimization dimensionality reduction method based on local constraint
CN109063555B (en) Multi-pose face recognition method based on low-rank decomposition and sparse representation residual error comparison
Zhang et al. Enabling in-situ data analysis for large protein-folding trajectory datasets
CN114299362A (en) Small sample image classification method based on k-means clustering
Ma et al. The BYY annealing learning algorithm for Gaussian mixture with automated model selection
CN109815440B (en) Dimension reduction method combining graph optimization and projection learning
CN109657693B (en) Classification method based on correlation entropy and transfer learning
CN111027582A (en) Semi-supervised feature subspace learning method and device based on low-rank graph learning
CN108388918B (en) Data feature selection method with structure retention characteristics
Zhao et al. A novel multi-view clustering method via low-rank and matrix-induced regularization
CN108121964B (en) Matrix-based joint sparse local preserving projection face recognition method
Su et al. Graph regularized low-rank tensor representation for feature selection
Ubaru et al. UoI-NMF cluster: a robust nonnegative matrix factorization algorithm for improved parts-based decomposition and reconstruction of noisy data
Wei et al. Self-regularized fixed-rank representation for subspace segmentation
CN109614581B (en) Non-negative matrix factorization clustering method based on dual local learning
Chen et al. A general model for robust tensor factorization with unknown noise
Lv et al. A robust mixed error coding method based on nonconvex sparse representation
Meng et al. Robust discriminant projection via joint margin and locality structure preservation
Yang et al. Robust landmark graph-based clustering for high-dimensional data
Qu et al. A Fast Sparse NMF Optimization Algorithm for Hyperspectral Unmixing

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant