WO2022178977A1 - Unsupervised data dimensionality reduction method based on adaptive nearest neighbor graph embedding - Google Patents
Unsupervised data dimensionality reduction method based on adaptive nearest neighbor graph embedding
- Publication number
- WO2022178977A1 (application PCT/CN2021/090827)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- matrix
- nearest neighbor
- dimensionality reduction
- dimension
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2135—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
Definitions
- the invention relates to an unsupervised data dimension reduction method based on self-adaptive nearest neighbor graph embedding, and belongs to the fields of image recognition, classification and pattern recognition.
- Data dimensionality reduction is an important research topic in image classification and pattern recognition. In the era of big data, the volume of raw data acquired directly in practical application scenarios is huge. The high dimensionality and high redundancy of such data make storage and processing very difficult and raise the requirements on the hardware platforms used for storage and processing. Dimensionality reduction performs feature extraction and dimension reduction on the original high-dimensional data: while ensuring that the reduced data still retain most of the information contained in the original data, it lowers the dimension as far as possible, which improves storage and processing efficiency and relaxes the requirements on hardware and on subsequent data processing algorithms. Because it reduces the data dimension and the storage space required, saves model training and computation time, and improves the accuracy of the algorithms applied afterwards, dimensionality reduction has been widely used in pattern recognition, computer vision, hyperspectral image processing and other fields. After dimensionality reduction the amount of data drops substantially, which can improve the speed and accuracy of subsequent data classification.
- the dimensionality reduction method based on graph embedding regards sample points as graph nodes, and the weight value between nodes represents the distance between nodes; dimensionality reduction is performed on the samples after the nearest neighbor graph has been constructed.
- the traditional dimensionality reduction method based on graph embedding needs to construct the nearest neighbor graph in advance. The quality of the nearest neighbor graph construction is directly related to the effect after dimensionality reduction.
- Dimensionality reduction is a technique that can convert multi-dimensional indicators into a small number of comprehensive indicators, and is generally used as a preprocessing step.
- the commonly used dimensionality reduction technology is principal component analysis technology, namely PCA. Through PCA technology, useful information can be extracted and feature dimensionality reduction can be performed to obtain a low-dimensional feature space.
- In the low-dimensional feature space, the speed and accuracy of classification and recognition are effectively improved, but because PCA only maximizes the variance of the data after dimensionality reduction, that is, it considers only the global information of the data and not the local structure among the data, the classification accuracy is limited.
- the present invention proposes a method based on adaptive neighbor graph embedding.
- An unsupervised data dimensionality reduction method based on adaptive nearest neighbor graph embedding characterized in that the steps are as follows:
- Step 1 Data Preprocessing
- Step 2 Build the nearest neighbor graph and initialize it
- the nearest neighbor graph G (X,S), where G represents the constructed nearest neighbor graph, X represents the node set in the nearest neighbor graph, and S represents the distance relationship between nodes.
- the elements S ij represent the distance between the ith node and the jth node; the weight matrix S is obtained by minimizing the following problem:
- Step 3 Alternately iteratively optimize the objective function
- the projection matrix is $W \in \mathbb{R}^{d\times d_1}$, where $d_1$ is the dimension of the low-dimensional space
- the projection matrix $W$ maps the data from the $d$-dimensional space to a $d_1$-dimensional space with $d_1 \ll d$
- the denoising matrix $F \in \mathbb{R}^{n\times d_1}$ is the noise-removed approximation in the subspace, $f_i \in \mathbb{R}^{d_1\times 1}$ is the transpose of the $i$-th row vector of $F$, and $\lambda$ is the regularization parameter;
- the data in step 1 is a face image or a hyperspectral image.
- in step 2, $r$ is taken as 1.1.
- a face recognition method based on a data dimension reduction method characterized in that the dimension reduction method is used to reduce the dimension of a face image to obtain a projection matrix and low-dimensional data, and an unsupervised clustering algorithm is used to cluster the low-dimensional data.
- when the camera collects a new face image, the obtained projection matrix is used to reduce the dimension of the new image and obtain its low-dimensional projection coordinates; the Euclidean distance between these coordinates and each cluster center is computed, and the category of the cluster center with the smallest Euclidean distance is taken as the category of the new face image.
- a new method for constructing the nearest neighbor graph is proposed through the invention step 1, which avoids the sensitivity to noise when the traditional k-nearest neighbor graph is constructed.
- This method of constructing the nearest neighbor graph can not only be used in the data dimensionality reduction algorithm, but also can be extended to other algorithms that need to construct the nearest neighbor graph, such as clustering.
- step 3 the learning of the neighbor graph and the learning of the projection matrix in the data dimensionality reduction are combined into a framework, and the construction of the neighbor graph is continuously updated in the subspace, and finally a reasonable neighbor graph can be obtained.
- the construction method can adaptively find a reasonable neighbor graph and is suitable for different types of data sets.
- the present invention proposes a face recognition method based on graph embedding dimension reduction.
- the optimal neighbor graph is constructed through continuous updating of the neighbor graph, which better preserves the local structure of the data while also taking the global information of the data into account, yielding low-dimensional data that contain more effective features.
- Performing face recognition in low-dimensional space can reduce the amount of data storage, reduce the amount of data calculation, improve computing efficiency, and ultimately improve the real-time performance and recognition accuracy of face recognition technology.
- FIG. 2 Flow chart of face recognition method based on dimensionality reduction method
- the present invention is based on an unsupervised data dimensionality reduction method embedded in an adaptive nearest neighbor graph, and its basic flowchart is shown in Figure 1, and its specific steps are as follows:
- Step 1 Data preprocessing.
- the original data matrix is $X' \in \mathbb{R}^{d'\times n}$, where $n$ is the number of sample points and $d'$ is the dimension of the sample points.
- PCA (Principal Component Analysis)
- the obtained data matrix is $X \in \mathbb{R}^{d\times n}$, where $d$ is the dimension of the sample points after PCA processing.
- Step 2 Build the nearest neighbor graph and initialize it.
- G represents the constructed nearest neighbor graph
- X represents the node set in the nearest neighbor graph
- S represents the distance relationship between nodes.
- the elements S ij represent the distance between the ith node and the jth node.
- the weight matrix S is created by minimizing the following problem:
- the power exponent factor r is used to adjust the size of the weight
- $x_i$ is the $i$-th column vector of $X$
- This formula shows that the weight matrix is determined by the distances between sample points in the high-dimensional space: the smaller the distance between two sample points, the larger the corresponding element of the weight matrix, that is, the more likely the two sample points are neighbors; conversely, the larger the distance, the smaller the element.
- Step 3 Alternately iteratively optimize the objective function.
- let the projection matrix be $W \in \mathbb{R}^{d\times d_1}$, where $d_1$ is the dimension of the low-dimensional space
- the projection matrix $W$ maps the data from the $d$-dimensional space to a $d_1$-dimensional space with $d_1 \ll d$.
- $X \in \mathbb{R}^{d\times n}$ is the data matrix
- $I \in \mathbb{R}^{n\times n}$ is the identity matrix
- $\mathbf{1} \in \mathbb{R}^{n\times 1}$ is the matrix whose elements are all 1.
- the denoising matrix $F \in \mathbb{R}^{n\times d_1}$ is the noise-removed approximation in the subspace, $f_i \in \mathbb{R}^{d_1\times 1}$ is the transpose of the $i$-th row vector of $F$, and $\lambda$ is the regularization parameter, which generally takes a large value.
- the objective function is solved by alternate iteration method. Fix S, solve F and W, then fix F and W, solve S, and use the obtained S as the initial value S 0 to iterate again.
- the solution steps are as follows:
- Step 3.1 Fix S, solve F and W.
- $M = X(I - P)X^T$, which is a positive definite real symmetric matrix, and the objective function is the following formula
- Step 3.2 Fix W, find S from F.
- $\mathbf{1} \in \mathbb{R}^{n\times 1}$ is a vector whose elements are all 1.
- ⁇ is the Lagrange multiplier
- ⁇ should take a positive value.
- ⁇ i ⁇ 0, ⁇ i s i 0.
- the present invention proposes a face recognition method based on dimensionality reduction data, comprising the following steps:
- Step 1 Build a face database, collect face images for recognition, and perform data preprocessing. Assuming there are $n$ face images of size 32×32, each image is flattened into a 1024-dimensional vector of its gray values, and the original data are preprocessed by PCA so that 95% of the energy of the original data is retained, giving a dimension of 273; the data matrix is then $X \in \mathbb{R}^{273\times n}$.
- Step 2 Build the nearest neighbor graph and initialize it.
- G represents the constructed nearest neighbor graph
- X represents the node set in the nearest neighbor graph
- S represents the distance relationship between nodes.
- the elements S ij represent the distance between the ith node and the jth node.
- the initial weight matrix S is obtained by minimizing the following problem:
- the objective function is as follows:
- the denoising matrix $F \in \mathbb{R}^{n\times 30}$ is the noise-removed approximation in the subspace
- $f_i \in \mathbb{R}^{30\times 1}$ is the transpose of the $i$-th row vector of $F$
- $\lambda$ is the regularization parameter, which generally takes a large value.
- the solution method of the objective function is the alternate iteration method.
- the solution steps are as follows:
- Step 3.1 Fix S, solve for F and W.
- $M = X(I - P)X^T$, which is a positive definite real symmetric matrix, and the objective function is the following formula
- Step 3.2 Fix W, find S from F.
- the Lagrangian function is as follows
- ⁇ is the Lagrange multiplier
- ⁇ should take a positive value.
- ⁇ i ⁇ 0, ⁇ i s i 0.
- Step 4 After the camera collects a new face image, use the obtained projection matrix $W_{opt}$ to reduce the dimension of the new image and obtain its low-dimensional projection coordinates, then compute the Euclidean distance between these coordinates and each cluster center; the category of the cluster center with the smallest Euclidean distance is the category of the new face image.
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The present invention relates to the fields of image recognition and classification and pattern recognition, and relates to an unsupervised data dimensionality reduction method based on adaptive nearest neighbor graph embedding. The method comprises: preprocessing data; constructing a nearest neighbor graph and initializing same; and optimizing an objective function by means of alternating iteration. The present invention further provides a face recognition method based on the data dimensionality reduction method, comprising: performing dimensionality reduction on a face image to obtain a projection matrix and low-dimensional data, and clustering the low-dimensional data by using an unsupervised clustering algorithm to obtain clustering centers of categories; and taking, according to the Euclidean distances between an image to be classified and the clustering centers, a clustering center having the minimum Euclidean distance, the category of the clustering center being the category of a new face image. Face recognition is performed in a low-dimensional space, so that the amount of data storage can be reduced, the amount of data calculation can be reduced, the calculation efficiency can be improved, and finally, the real-time performance and recognition precision of a face recognition technology can be improved.
Description
The invention relates to an unsupervised data dimensionality reduction method based on adaptive nearest neighbor graph embedding, and belongs to the fields of image recognition and classification and of pattern recognition.
Data dimensionality reduction is an important research topic in image classification and pattern recognition. In the era of big data, the volume of raw data acquired directly in practical application scenarios is huge. The high dimensionality and high redundancy of such data make storage and processing very difficult and raise the requirements on the hardware platforms used for storage and processing. Dimensionality reduction performs feature extraction and dimension reduction on the original high-dimensional data: while ensuring that the reduced data still retain most of the information contained in the original data, it lowers the dimension as far as possible, which improves storage and processing efficiency and relaxes the requirements on hardware and on subsequent data processing algorithms. Because it reduces the data dimension and the storage space required, saves model training and computation time, and improves the accuracy of the algorithms applied afterwards, dimensionality reduction has been widely used in pattern recognition, computer vision, hyperspectral image processing and other fields. After dimensionality reduction the amount of data drops substantially, which can improve the speed and accuracy of subsequent data classification.
Recently, unsupervised dimensionality reduction methods based on graph embedding have attracted attention. Such methods treat sample points as graph nodes, with the weight between two nodes representing the distance between them, and reduce the dimension of the samples after the nearest neighbor graph has been constructed. However, traditional graph-embedding methods need the nearest neighbor graph to be constructed in advance, and the quality of that graph directly affects the result of the dimensionality reduction. Handling graph construction separately from the dimensionality reduction algorithm therefore leaves the algorithm with unremarkable results.
Zhan Shanhua et al. ("Robust sparse locality preserving projection with adaptive graph embedding", Computer Engineering and Design, 2020, 41(08): 2296-2301) proposed a robust sparse locality preserving projection method with adaptive embedding, which integrates graph learning and dimensionality reduction learning into a joint learning framework. Although the proposed model takes comprehensive factors such as sparsity and robustness into account, it has too many parameters and is redundant, so the trade-off between parameters and performance cannot be balanced. Because parameter selection strongly affects model performance, the model is difficult to use in practical applications.
At present, in the field of image recognition, the high dimensionality of the data makes the recognition or classification process difficult and slow. Dimensionality reduction is a technique that converts multi-dimensional indicators into a small number of comprehensive indicators, and it is generally used as a preprocessing step. For face recognition systems, the commonly used dimensionality reduction technique is principal component analysis (PCA), which extracts useful information and reduces the feature dimension to obtain a low-dimensional feature space. In the low-dimensional feature space, the speed and accuracy of classification and recognition are effectively improved, but because PCA only maximizes the variance of the data after dimensionality reduction, that is, it considers only the global information of the data and not the local structure among the data, the classification accuracy is limited.
Summary of the invention
Technical problem to be solved
To address the defect that current neighbor graph construction is separated from the steps of the dimensionality reduction algorithm, which makes the dimensionality reduction ineffective and in turn leads to low efficiency and low accuracy in subsequent face recognition, the present invention proposes an unsupervised data dimensionality reduction method based on adaptive nearest neighbor graph embedding and a face recognition method based on the dimension-reduced data.
Technical solution
An unsupervised data dimensionality reduction method based on adaptive nearest neighbor graph embedding, characterized in that the steps are as follows:
Step 1: Data preprocessing
Principal component analysis (PCA) is used to preprocess the raw data $X' \in \mathbb{R}^{d'\times n}$ to obtain the data matrix $X \in \mathbb{R}^{d\times n}$, where $n$ is the number of sample points, $d'$ is the dimension of the raw sample points, and $d$ is the dimension of the sample points after PCA processing;
Step 2: Construct and initialize the nearest neighbor graph
From the data matrix $X \in \mathbb{R}^{d\times n}$, construct the nearest neighbor graph $G = (X, S)$, where $G$ denotes the constructed graph, $X$ is the set of nodes in the graph, and $S$ describes how near or far connected nodes are from one another, each element $S_{ij}$ being the weight that encodes the distance between the $i$-th node and the $j$-th node; the weight matrix $S$ is obtained by minimizing the following problem:
where the power exponent factor $r$ adjusts the magnitude of the weights, and $x_i \in \mathbb{R}^{d\times 1}$, $i = 1, 2, \ldots, n$, is the $i$-th column vector of the matrix $X$, that is, the coordinates of the $i$-th sample point;
Step 3: Optimize the objective function by alternating iteration
Let the projection matrix be $W \in \mathbb{R}^{d\times d_1}$, where $d_1$ is the dimension of the low-dimensional space; $W$ maps the data from the $d$-dimensional space to a $d_1$-dimensional space with $d_1 \ll d$. To ensure that the projected data are statistically uncorrelated, the constraint $W^T S_t W = I$ is added, where $S_t \in \mathbb{R}^{d\times d}$ is the global scatter matrix, $X \in \mathbb{R}^{d\times n}$ is the data matrix, $I$ is the identity matrix, and $\mathbf{1} \in \mathbb{R}^{n\times 1}$ is the matrix whose elements are all 1; the objective function is as follows:
where the denoising matrix $F \in \mathbb{R}^{n\times d_1}$ is the noise-removed approximation of the data in the subspace, $f_i \in \mathbb{R}^{d_1\times 1}$ is the transpose of the $i$-th row vector of $F$, and $\lambda$ is the regularization parameter;
The first term of the objective function can be simplified as follows:
where the Laplacian matrix has dimension $n\times n$; the degree matrix is a diagonal matrix of dimension $n\times n$; and $S^r$ is the similarity matrix of dimension $n\times n$, each element of which is the $r$-th power of the corresponding element of the weight matrix $S$. The objective function can then be simplified to the following form:
The above objective function is solved by alternating iteration to obtain the projection matrix $W$; the dimension-reduced data matrix is then $Y = W^T X$.
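To make the role of the constraint $W^T S_t W = I$ and of the projection $Y = W^T X$ concrete, the following sketch may help. It assumes the usual definition of the global scatter matrix, $S_t = X H X^T$ with centering matrix $H = I - \frac{1}{n}\mathbf{1}\mathbf{1}^T$; the text above does not spell this definition out, so it is an assumption. The matrix `W` built here is merely one feasible matrix satisfying the constraint, not the optimized projection produced by the method.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, d1 = 6, 40, 2                      # illustrative sizes only
X = rng.normal(size=(d, n))              # columns are sample points

# Assumed definition: S_t = X H X^T with the centering matrix H.
H = np.eye(n) - np.ones((n, n)) / n
S_t = X @ H @ X.T                        # shape (d, d)

# One feasible W (NOT the method's optimum): scaled eigenvectors of S_t,
# chosen so that W^T S_t W = I holds exactly.
eigvals, U = np.linalg.eigh(S_t)
W = U[:, -d1:] / np.sqrt(eigvals[-d1:])  # shape (d, d1)

Y = W.T @ X                              # reduced data, shape (d1, n)

print(Y.shape)
print(np.allclose(W.T @ S_t @ W, np.eye(d1)))  # constraint satisfied
print(np.allclose(Y @ H @ Y.T, np.eye(d1)))    # projected scatter is the identity
```

Under this assumption the constraint says that the scatter matrix of the projected data is the identity, which is the sense in which the projected features are uncorrelated.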
Preferably, the data in step 1 are face images or hyperspectral images.
Preferably, $r$ in step 2 is taken as 1.1.
A face recognition method based on the data dimensionality reduction method, characterized in that: the dimensionality reduction method is applied to face images to obtain a projection matrix and low-dimensional data; an unsupervised clustering algorithm is applied to the low-dimensional data to obtain the cluster center of each category; when the camera captures a new face image, the obtained projection matrix is used to reduce the dimension of the new image and obtain its low-dimensional projection coordinates; the Euclidean distance between these coordinates and each cluster center is computed, and the category of the cluster center with the smallest Euclidean distance is taken as the category of the new face image.
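The recognition procedure just described (cluster the low-dimensional data, then assign a new image to the nearest cluster center) can be sketched as follows. The text only says "an unsupervised clustering algorithm"; plain k-means is used here as a stand-in, and the function names `kmeans` and `classify` are hypothetical. The new sample is assumed to have already gone through the same preprocessing as the training data.

```python
import numpy as np

def kmeans(Y, k, n_iter=50, seed=0):
    """Plain Lloyd k-means on the columns of Y (shape (d1, n)).

    Stand-in for the unspecified unsupervised clustering algorithm.
    Returns (centers of shape (d1, k), labels of shape (n,)).
    """
    rng = np.random.default_rng(seed)
    centers = Y[:, rng.choice(Y.shape[1], size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every sample to every center: shape (n, k).
        dists = ((Y[:, :, None] - centers[:, None, :]) ** 2).sum(axis=0)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):          # leave empty clusters unchanged
                centers[:, c] = Y[:, labels == c].mean(axis=1)
    # Final assignment against the final centers.
    dists = ((Y[:, :, None] - centers[:, None, :]) ** 2).sum(axis=0)
    return centers, dists.argmin(axis=1)

def classify(x_new, W, centers):
    """Project a preprocessed sample with W and return the index of the nearest center."""
    y = W.T @ x_new                          # low-dimensional projection coordinates
    return int(np.linalg.norm(centers - y[:, None], axis=0).argmin())
```

A typical use would be `centers, labels = kmeans(W.T @ X, k)` on the training data followed by `classify(x_new, W, centers)` for each newly captured image.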
The unsupervised data dimensionality reduction method based on adaptive nearest neighbor graph embedding proposed by the present invention has the following beneficial effects:
(1) Step 1 of the invention proposes a new method for constructing the nearest neighbor graph, which avoids the sensitivity to noise of the traditional k-nearest-neighbor graph construction. This construction method can be used not only in dimensionality reduction algorithms but also in other algorithms that need a nearest neighbor graph, such as clustering.
(2) Step 3 of the invention merges the learning of the neighbor graph and the learning of the projection matrix into a single framework. The neighbor graph is continuously updated in the subspace, so that a reasonable neighbor graph is finally obtained. The construction method adaptively finds a reasonable neighbor graph and is suitable for different types of data sets.
(3) The present invention proposes a face recognition method based on graph-embedding dimensionality reduction. In the dimensionality reduction step, the optimal neighbor graph is constructed through continuous updating of the neighbor graph, which better preserves the local structure of the data while also taking the global information of the data into account, yielding low-dimensional data that contain more effective features. Performing face recognition in the low-dimensional space reduces the amount of data to be stored and computed, improves computing efficiency, and ultimately improves the real-time performance and recognition accuracy of face recognition.
Figure 1: Flowchart of the dimensionality reduction method
Figure 2: Flowchart of the face recognition method based on the dimensionality reduction method
The present invention is now further described with reference to the embodiments and the accompanying drawings:
The present invention is an unsupervised data dimensionality reduction method based on adaptive nearest neighbor graph embedding. Its basic flowchart is shown in Figure 1, and its specific steps are as follows:
Step 1: Data preprocessing. The original data matrix is $X' \in \mathbb{R}^{d'\times n}$, where $n$ is the number of sample points and $d'$ is the dimension of the sample points. Because a null space inevitably exists in the original space, the raw data are first preprocessed with principal component analysis (PCA). PCA performs an eigenvalue decomposition of the covariance matrix of the data; the larger an eigenvalue, the more useful information is retained when the corresponding eigenvector is chosen for the projection matrix. The eigenvectors corresponding to the $d$ largest eigenvalues are selected such that the share of these eigenvalues in the sum of all eigenvalues lies between 95% and 99%, that is, 95%-99% of the energy of the raw data is retained, which makes the subsequent algorithm faster. The resulting data matrix is $X \in \mathbb{R}^{d\times n}$, where $d$ is the dimension of the sample points after PCA processing.
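As an illustration of this preprocessing step, the following is a minimal NumPy sketch that keeps the smallest number of principal components whose eigenvalues account for a given share of the total. The function name `pca_preprocess`, the explicit mean-centering, and the use of the cumulative eigenvalue ratio as the "energy" criterion are assumptions made for the sketch; they follow the standard PCA recipe rather than anything stated verbatim in the text.

```python
import numpy as np

def pca_preprocess(X_raw, energy=0.95):
    """Reduce X_raw (shape (d', n), columns are samples) with PCA.

    Keeps the smallest number d of leading components whose eigenvalues
    sum to at least `energy` of the total. Returns (X, basis, mean) with
    X of shape (d, n) and basis of shape (d', d).
    """
    mean = X_raw.mean(axis=1, keepdims=True)
    Xc = X_raw - mean                                # center the data
    cov = Xc @ Xc.T / X_raw.shape[1]                 # covariance matrix (d', d')
    eigvals, eigvecs = np.linalg.eigh(cov)           # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                # sort descending
    eigvals = np.clip(eigvals[order], 0.0, None)     # guard against tiny negatives
    eigvecs = eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()       # cumulative energy share
    d = min(int(np.searchsorted(ratio, energy)) + 1, len(eigvals))
    basis = eigvecs[:, :d]
    return basis.T @ Xc, basis, mean
```

For example, `X, basis, mean = pca_preprocess(X_raw, energy=0.95)` returns the reduced matrix together with the basis and mean needed to preprocess new samples in the same way.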
Step 2: Construct and initialize the nearest neighbor graph. From the data matrix $X \in \mathbb{R}^{d\times n}$, construct the nearest neighbor graph $G = (X, S)$, where $G$ denotes the constructed graph, $X$ is the set of nodes in the graph, and $S$ describes how near or far connected nodes are from one another, each element $S_{ij}$ being the weight that encodes the distance between the $i$-th node and the $j$-th node. The weight matrix $S$ is obtained by minimizing the following problem:
where the power exponent factor $r$ adjusts the magnitude of the weights and has an empirical value of 1.1, and $x_i \in \mathbb{R}^{d\times 1}$, $i = 1, 2, \ldots, n$, is the $i$-th column vector of the matrix $X$, that is, the coordinates of the $i$-th sample point. This formula shows that the weight matrix is determined by the distances between sample points in the high-dimensional space: the smaller the distance between two sample points, the larger the corresponding element of the weight matrix, that is, the more likely the two sample points are neighbors; conversely, the larger the distance, the smaller the element.
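The relationship between distance and weight can be illustrated with a small sketch. The minimization problem itself is not reproduced in this text, so the code assumes a common form: for each sample $i$, minimize $\sum_j \lVert x_i - x_j\rVert_2^2\, s_{ij}^r$ subject to $\sum_j s_{ij} = 1$ and $s_{ij} \ge 0$, with $r > 1$. Under that assumption the Lagrangian conditions give $s_{ij} \propto \lVert x_i - x_j\rVert_2^{-2/(r-1)}$. The function name `init_weights` and the small constant `eps` are hypothetical, and the patent's actual formula may differ.

```python
import numpy as np

def init_weights(X, r=1.1, eps=1e-12):
    """Initial weight matrix S for data X of shape (d, n), under the ASSUMED problem
    min_S sum_ij ||x_i - x_j||^2 * s_ij^r  s.t.  each row of S sums to 1, s_ij >= 0.

    Closed form per row: s_ij proportional to dist_ij ** (-1 / (r - 1)),
    where dist_ij is the squared Euclidean distance.
    """
    sq_norms = (X ** 2).sum(axis=0)
    dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X.T @ X)
    dist = np.maximum(dist, 0.0)                 # clip round-off negatives
    # Work in log space: with r = 1.1 the exponent is -10 and would overflow.
    log_w = -np.log(dist + eps) / (r - 1.0)
    np.fill_diagonal(log_w, -np.inf)             # no self-loops
    log_w -= log_w.max(axis=1, keepdims=True)    # stabilise before exponentiating
    S = np.exp(log_w)
    return S / S.sum(axis=1, keepdims=True)      # rows sum to 1
```

With $r = 1.1$ the exponent is $-10$, so under this assumed form almost all of a row's weight concentrates on the closest samples, which matches the statement that nearer points receive larger weights.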
Step 3: Optimize the objective function by alternating iteration. Let the projection matrix be $W \in \mathbb{R}^{d\times d_1}$, where $d_1$ is the dimension of the low-dimensional space; $W$ maps the data from the $d$-dimensional space to a $d_1$-dimensional space with $d_1 \ll d$. To ensure that the projected data are statistically uncorrelated, the constraint $W^T S_t W = I$ is added, where $S_t \in \mathbb{R}^{d\times d}$ is the global scatter matrix, $X \in \mathbb{R}^{d\times n}$ is the data matrix, $I \in \mathbb{R}^{n\times n}$ is the identity matrix, and $\mathbf{1} \in \mathbb{R}^{n\times 1}$ is the matrix whose elements are all 1. The objective function is as follows:
Here the denoising matrix $F \in \mathbb{R}^{n \times d_1}$ is an approximate, noise-removed representation of the data in the subspace, $f_i \in \mathbb{R}^{d_1 \times 1}$ is the transpose of the $i$-th row vector of $F$, and $\lambda$ is the regularization parameter, whose value is generally large.
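The objective function itself is not reproduced in this text. A plausible reconstruction from the surrounding definitions is given below. The exact form, the centering matrix $H$, and the expression for $S_t$ are assumptions based on how graph-embedding objectives of this kind are usually written, not a quotation of the source.

```latex
\begin{aligned}
\min_{W,\,F,\,S}\quad & \sum_{i=1}^{n}\sum_{j=1}^{n} \lVert f_i - f_j \rVert_2^2\, s_{ij}^{\,r}
  \;+\; \lambda\, \lVert X^T W - F \rVert_F^2 \\
\text{s.t.}\quad & W^T S_t W = I,\qquad s_i^T \mathbf{1} = 1,\qquad s_{ij} \ge 0, \\
\text{with}\quad & S_t = X H X^T,\qquad H = I - \tfrac{1}{n}\,\mathbf{1}\mathbf{1}^T \in \mathbb{R}^{n \times n}.
\end{aligned}
```

Under this reading the first term pulls together the low-dimensional representations of points that share a large weight, while the second term keeps $F$ close to the projected data $X^T W$. The $n \times n$ identity and the all-ones vector would then enter only through $H$, and the identity in the constraint is $d_1 \times d_1$.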
The first term of the objective function can be simplified as follows,
where the Laplacian matrix has dimension $n \times n$; the degree matrix is a diagonal matrix, also of dimension $n \times n$; and $S_r$ is the similarity matrix of dimension $n \times n$, each element of which is the corresponding element of the weight matrix $S$ raised to the power $r$. The objective function can then be simplified to:
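The formulas of this simplification are not reproduced in this text, but it rests on the standard graph Laplacian identity. The sketch below checks that identity numerically, assuming the Laplacian is built from the symmetrized similarity matrix as $L = D - (S_r + S_r^T)/2$, with $D$ the diagonal matrix of its row sums. The symbols `L` and `D` and the factor of 2 are assumptions about the missing formulas.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d1, r = 6, 2, 1.1

S = rng.random((n, n))
np.fill_diagonal(S, 0.0)
S /= S.sum(axis=1, keepdims=True)        # rows sum to 1, like the weight matrix
F = rng.normal(size=(n, d1))             # row i is f_i^T

Sr = S ** r                              # element-wise power of the weights
A = (Sr + Sr.T) / 2.0                    # symmetrized similarity
D = np.diag(A.sum(axis=1))               # degree matrix (diagonal)
L = D - A                                # graph Laplacian

# Left side: explicit double sum over all pairs (i, j).
diff = F[:, None, :] - F[None, :, :]
lhs = np.sum(np.sum(diff ** 2, axis=2) * Sr)
# Right side: trace form of the same quantity.
rhs = 2.0 * np.trace(F.T @ L @ F)
```

Here `lhs` and `rhs` agree up to round-off, which is what allows the pairwise sum to be replaced by a trace expression in $F$.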
The objective function is solved by alternating iteration: fix $S$ and solve for $F$ and $W$; then fix $F$ and $W$ and solve for $S$; then use the resulting $S$ as the initial value $S_0$ for the next iteration. The steps are as follows:
Step 3.1: Fix $S$ and solve for $F$ and $W$.
With $S$ fixed, the objective function places no constraint on $F$, so its partial derivative with respect to $F$ must vanish at the optimum. The optimization problem at this point is
Taking the partial derivative of the above expression with respect to $F$ and setting it to zero yields the following equation,
which gives
$$F = P X^T W \tag{7}$$
where $P$ is a positive definite real symmetric matrix. When $\lambda$ is large, the $\lambda$-dependent term in $P$ is close to 0, so $F \approx X^T W$ is the noise-removed data matrix in the subspace. Substituting $F$ into the objective function gives the following expression, which is used to solve for $W$.
where $M = X(I - P)X^T$ is a positive definite real symmetric matrix. The objective function then becomes
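The intermediate formulas of step 3.1 are not reproduced in this text. One derivation that is consistent with (7) and with the definition of $M$ is sketched below. It assumes that, with $S$ fixed, the problem is $\min_{F, W} \operatorname{tr}(F^T L F) + \lambda \lVert X^T W - F \rVert_F^2$ subject to $W^T S_t W = I$, where $L$ denotes the symmetric Laplacian with any constant factor from the simplification absorbed into it. The symbol $L$, this scaling, and the resulting expression for $P$ are assumptions.

```latex
\begin{aligned}
\frac{\partial}{\partial F}\Big[\operatorname{tr}(F^T L F) + \lambda\,\lVert X^T W - F\rVert_F^2\Big]
  &= 2LF + 2\lambda\,(F - X^T W) = 0 \\
\Longrightarrow\quad F &= \Big(I + \tfrac{1}{\lambda} L\Big)^{-1} X^T W = P X^T W,
  \qquad P = \Big(I + \tfrac{1}{\lambda} L\Big)^{-1}, \\
\operatorname{tr}(F^T L F) + \lambda\,\lVert X^T W - F\rVert_F^2
  &= \lambda\,\operatorname{tr}\!\big(W^T X (I - P) X^T W\big)
   = \lambda\,\operatorname{tr}(W^T M W), \\
\min_{W}\ \operatorname{tr}(W^T M W) &\quad \text{s.t.}\quad W^T S_t W = I .
\end{aligned}
```

The third line uses $L = \lambda (P^{-1} - I)$, so that $P L P + \lambda (I - P)^2 = \lambda (I - P)$. Under these assumptions $P$ is symmetric positive definite, and $L/\lambda \to 0$ as $\lambda$ grows, so $P \to I$ and $F \to X^T W$.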
Solving the above problem with the Lagrange multiplier method gives $W$ as the matrix whose columns are the eigenvectors of $S_t^{-1} M$ corresponding to its $d_1$ smallest eigenvalues. Once the optimal $W$ has been found, substituting it into (7) gives the optimal $F$.
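One way to carry out this eigenvector step numerically is sketched below with NumPy only. It assumes $S_t$ is symmetric positive definite (for example after PCA preprocessing), so that a Cholesky factor turns the problem into an ordinary symmetric eigenproblem. The function name `solve_projection` and the toy matrices are illustrative.

```python
import numpy as np

def solve_projection(M, St, d1):
    """Columns of W are generalized eigenvectors of (M, St) for the d1 smallest
    eigenvalues, scaled so that W.T @ St @ W = I. St must be positive definite."""
    C = np.linalg.cholesky(St)                   # St = C @ C.T
    # Transform to the ordinary symmetric problem C^{-1} M C^{-T} v = mu v.
    Cinv_M = np.linalg.solve(C, M)
    B = np.linalg.solve(C, Cinv_M.T)
    B = (B + B.T) / 2.0                          # remove round-off asymmetry
    mu, V = np.linalg.eigh(B)                    # eigenvalues in ascending order
    W = np.linalg.solve(C.T, V[:, :d1])          # back-transform: W = C^{-T} V
    return W, mu[:d1]

rng = np.random.default_rng(2)
d, d1 = 6, 2
A = rng.normal(size=(d, d))
M = A @ A.T                                      # toy symmetric matrix
G = rng.normal(size=(d, 20))
St = G @ G.T                                     # toy positive definite matrix
W, mu = solve_projection(M, St, d1)
```

The returned `W` satisfies the constraint $W^T S_t W = I$ by construction, and each column solves $M w = \mu S_t w$, which is equivalent to being an eigenvector of $S_t^{-1} M$.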
Step 3.2: Fix $W$ and $F$ and solve for $S$.
After step 3.1 the optimal $W$ and $F$ are available. With these two variables fixed, the objective function becomes
This expression is independent across the samples $i = 1, 2, \ldots, n$, so for each $i$ the objective function can be written as follows,
where $\mathbf{1} \in \mathbb{R}^{n \times 1}$ is the vector whose elements are all 1.
Solving this problem with the Lagrangian method,
where $\eta$ is a Lagrange multiplier, gives the optimal solution for $s_{ij}$ ($i \neq j$) as
From this expression it can be seen that $\eta$ should be positive. According to the KKT conditions, $\beta_i \geq 0$ and $\beta_i s_i = 0$. In the weight matrix $S$, $s_{ii}$ is defined to be 0. Therefore, when $i \neq j$, $\beta_{ij} = 0$ and $s_{ij}$ can be computed from (13); when $i = j$, $s_{ij} = 0$.
In addition, since $s_i^T \mathbf{1} = 1$, the following holds,
so one may take
Once $\eta$ is determined, equation (15) shows that the smaller the distance between two sample points, the larger the value of $s_{ij}$, and vice versa, which is consistent with the basic assumption stated above.
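The formulas of step 3.2 are not reproduced in this text. A possible reconstruction of the per-sample problem and its solution is sketched below, assuming the sub-problem is $\min_{s_i} \sum_j d_{ij}\, s_{ij}^r$ subject to $s_i^T \mathbf{1} = 1$ and $s_{ij} \geq 0$. The shorthand $d_{ij} = \lVert f_i - f_j \rVert_2^2$ is introduced here for brevity and is not from the source.

```latex
\begin{aligned}
\mathcal{L}(s_i, \eta, \beta_i)
  &= \sum_{j} d_{ij}\, s_{ij}^{\,r} - \eta\,\big(s_i^T \mathbf{1} - 1\big) - \beta_i^T s_i, \\
\frac{\partial \mathcal{L}}{\partial s_{ij}}
  &= r\, d_{ij}\, s_{ij}^{\,r-1} - \eta - \beta_{ij} = 0
  \;\;\Longrightarrow\;\;
  s_{ij} = \Big(\frac{\eta}{r\, d_{ij}}\Big)^{\frac{1}{r-1}} \quad (\beta_{ij} = 0,\; i \neq j), \\
\sum_{j \neq i} s_{ij} = 1
  &\;\;\Longrightarrow\;\;
  s_{ij} = \frac{d_{ij}^{-\frac{1}{r-1}}}{\sum_{k \neq i} d_{ik}^{-\frac{1}{r-1}}}, \qquad s_{ii} = 0 .
\end{aligned}
```

In this form $\eta > 0$ is needed for $s_{ij}$ to be real and positive, and a smaller $d_{ij}$ yields a larger $s_{ij}$, which agrees with the two observations made in the text. With $r = 1.1$ the exponent $1/(r-1)$ equals 10, so the weights concentrate sharply on the nearest neighbors.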
This completes the update of $S$. The next iteration then begins, and the process repeats until the algorithm converges. Once the solution is complete, the projection matrix $W$ is obtained, and the dimension-reduced data matrix is $Y = W^T X$.
A specific embodiment of the invention is described below with reference to FIG. 3, using a practical face recognition method as an example; the technical content of the invention is not limited to the scope described.
The invention provides a face recognition method based on dimension-reduced data, comprising the following steps:
Step 1: Build a face database, collect the face images to be recognized, and preprocess the data. Suppose there are $n$ face images, each of size 32×32. Using its gray values, each image is flattened into a vector of dimension 1024. The raw data are then preprocessed with PCA, retaining 95% of the energy of the raw data, which gives a dimension of 273, so the data matrix is $X \in \mathbb{R}^{273 \times n}$.
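A minimal sketch of this preprocessing follows. It assumes that "energy" means the fraction of total variance (the sum of covariance eigenvalues) retained after mean-centering. Random arrays stand in for the face images, so the resulting dimension will differ from the 273 reported for the real data, and `pca_energy` is an illustrative name.

```python
import numpy as np

def pca_energy(X_raw, energy=0.95):
    """Project the columns of X_raw (d_raw x n) onto the fewest principal
    components whose eigenvalues sum to at least `energy` of the total."""
    mean = X_raw.mean(axis=1, keepdims=True)
    Xc = X_raw - mean
    # SVD of the centered data: squared singular values are proportional
    # to the eigenvalues of the covariance matrix.
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = int(np.searchsorted(ratio, energy) + 1)   # smallest k reaching the target
    return U[:, :k].T @ Xc, U[:, :k], mean

rng = np.random.default_rng(3)
images = rng.random((40, 32, 32))                 # stand-in for n = 40 gray images
X_raw = images.reshape(len(images), -1).T         # 1024 x n, one image per column
X, components, mean = pca_energy(X_raw)
```

The returned `components` and `mean` would need to be kept, since any new image has to pass through the same centering and projection before the learned projection matrix is applied.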
Step 2: Construct the nearest neighbor graph and initialize it. From the data matrix $X \in \mathbb{R}^{d \times n}$, construct the nearest neighbor graph $G = (X, S)$, where $G$ denotes the constructed graph, $X$ is its set of nodes, and $S$ describes how near or far connected nodes are from one another; each element $s_{ij}$ characterizes the distance relationship between the $i$-th node and the $j$-th node. The initial weight matrix $S$ is obtained by solving the following minimization problem:
Step 3: Optimize the objective function by alternating iteration. Let the dimension of the low-dimensional space be 30, so that the projection matrix is $W \in \mathbb{R}^{273 \times 30}$. To ensure that the projected data are statistically uncorrelated, add the constraint $W^T S_t W = I$, where $S_t \in \mathbb{R}^{273 \times 273}$ is the global scatter matrix, which contains the overall information of the data, $X \in \mathbb{R}^{273 \times n}$ is the data matrix, $I \in \mathbb{R}^{n \times n}$ is the identity matrix, and $\mathbf{1} \in \mathbb{R}^{n \times 1}$ is the vector whose elements are all 1. The objective function is as follows:
Here the denoising matrix $F \in \mathbb{R}^{n \times 30}$ is an approximate, noise-removed representation of the data in the subspace, $f_i \in \mathbb{R}^{30 \times 1}$ is the transpose of the $i$-th row vector of $F$, and $\lambda$ is the regularization parameter, whose value is generally large.
The first term of the objective function can be simplified as follows,
where the Laplacian matrix has dimension $n \times n$; the degree matrix is a diagonal matrix, also of dimension $n \times n$; and $S_r$ is the similarity matrix of dimension $n \times n$, each element of which is the corresponding element of the weight matrix $S$ raised to the power $r$. The objective function can then be simplified to:
The objective function is solved by alternating iteration. $S_0$ was obtained in Step 2, so first fix $S = S_0$ and solve for $F$ and $W$; then fix $F$ and $W$ and solve for $S$; the resulting $S$ is then used as the initial value $S_0$ for the next iteration. The steps are as follows:
Step 3.1: Fix $S$ and solve for $F$ and $W$.
With $S$ fixed, the objective function places no constraint on $F$, so its partial derivative with respect to $F$ must vanish at the optimum. The optimization problem at this point is
Taking the partial derivative of the above expression with respect to $F$ and setting it to zero yields the following equation.
Expressing $F$ in terms of $W$ gives
$$F = P X^T W \tag{7}$$
where $P$ is a positive definite real symmetric matrix. When $\lambda$ is large, the $\lambda$-dependent term in $P$ is close to 0, so $F \approx X^T W$ is the dimension-reduced data matrix with the noise removed. Substituting $F$ into the objective function gives the following expression, which is used to solve for $W$.
where $M = X(I - P)X^T$ is a positive definite real symmetric matrix. The objective function then becomes
Solving the above problem with the Lagrange multiplier method gives $W$ as the matrix whose columns are the eigenvectors of $S_t^{-1} M$ corresponding to its 30 smallest eigenvalues. Once the optimal $W$ has been found, substituting it into (7) gives the optimal $F$.
Step 3.2: Fix $W$ and $F$ and solve for $S$.
With $W$ and $F$ fixed, the objective function becomes
This expression is independent across the samples $i = 1, 2, \ldots, n$, so for each $i$ the objective function can be written as follows.
The problem is solved with the Lagrangian method; the Lagrangian function is as follows,
where $\eta$ is a Lagrange multiplier. The optimal solution for $s_{ij}$ ($i \neq j$) is
From this expression it can be seen that $\eta$ should be positive. According to the KKT conditions, $\beta_i \geq 0$ and $\beta_i s_i = 0$. In the weight matrix $S$, $s_{ii}$ is defined to be 0. Therefore, when $i \neq j$, $\beta_{ij} = 0$ and $s_{ij}$ can be computed from (13); when $i = j$, $s_{ij} = 0$.
In addition, since $s_i^T \mathbf{1} = 1$, the following holds,
so one may take
Once $\eta$ is determined, equation (15) shows that the smaller the distance between two sample points, the larger the value of $s_{ij}$, and vice versa, which is consistent with the basic assumption above. This completes the update of $S$, and the value of the objective function is then computed. If the absolute difference between the objective function values of two successive iterations meets a given precision (for example $10^{-6}$), the iteration stops and yields the final $W_{opt}$ and $S_{opt}$. The resulting projection matrix is $W_{opt} \in \mathbb{R}^{273 \times 30}$, and the dimension-reduced data matrix is $Y = W_{opt}^T X \in \mathbb{R}^{30 \times n}$. The low-dimensional data are then clustered with an unsupervised clustering algorithm to obtain the cluster center of each class.
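Putting the steps together, the following is a compact sketch of the alternating scheme with the stopping rule described above (change in the objective below $10^{-6}$). It relies on forms that are not reproduced in this text and are therefore assumptions: $S_t = X H X^T$ with a centering matrix $H$, a Laplacian scaled so that the first term equals $\operatorname{tr}(F^T L F)$, $P = (I + L/\lambda)^{-1}$, and the closed-form weight update $s_{ij} \propto \lVert f_i - f_j \rVert_2^{-2/(r-1)}$. Function names, `lam=1e3`, and the toy data are illustrative.

```python
import numpy as np

def pairwise_sq_dists(Z):
    """Squared Euclidean distances between the rows of Z."""
    sq = np.sum(Z * Z, axis=1)
    return np.maximum(sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T), 0.0)

def update_weights(dist, r, eps=1e-12):
    """Row-wise closed form s_ij ~ dist_ij ** (-1 / (r - 1)), with s_ii = 0."""
    dist = np.maximum(dist, eps)                  # new array; input is left untouched
    np.fill_diagonal(dist, np.inf)
    ratio = dist.min(axis=1, keepdims=True) / dist
    S = ratio ** (1.0 / (r - 1.0))
    return S / S.sum(axis=1, keepdims=True)

def laplacian(S, r):
    """Laplacian scaled so that sum_ij ||f_i - f_j||^2 s_ij^r = tr(F.T @ L @ F)."""
    A = S ** r + (S ** r).T
    return np.diag(A.sum(axis=1)) - A

def fit(X, d1, lam=1e3, r=1.1, tol=1e-6, max_iter=50):
    d, n = X.shape
    Xc = X - X.mean(axis=1, keepdims=True)
    St = Xc @ Xc.T                                # assumed S_t = X H X^T
    C = np.linalg.cholesky(St)                    # requires St positive definite
    S = update_weights(pairwise_sq_dists(X.T), r) # Step 2: initial graph
    history = []
    for _ in range(max_iter):
        # Step 3.1: fix S, solve for W, then F = P X^T W.
        P = np.linalg.inv(np.eye(n) + laplacian(S, r) / lam)
        M = X @ (np.eye(n) - P) @ X.T
        B = np.linalg.solve(C, np.linalg.solve(C, M).T)
        _, V = np.linalg.eigh((B + B.T) / 2.0)
        W = np.linalg.solve(C.T, V[:, :d1])       # d1 smallest eigenvalues
        F = P @ X.T @ W
        # Step 3.2: fix W and F, update S from distances between rows of F.
        dist = pairwise_sq_dists(F)
        S = update_weights(dist, r)
        obj = np.sum(dist * S ** r) + lam * np.sum((X.T @ W - F) ** 2)
        converged = bool(history) and abs(history[-1] - obj) < tol
        history.append(obj)
        if converged:
            break
    return W, S, history

rng = np.random.default_rng(4)
X = rng.normal(size=(8, 40))                      # toy data: d = 8, n = 40
W, S, history = fit(X, d1=2)
Y = W.T @ X                                       # reduced data, 2 x 40
```

Because each half-step minimizes the same objective exactly with the other variables held fixed, the recorded objective values should not increase from one iteration to the next, which is what makes the difference-based stopping rule meaningful.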
Step 4: When the camera captures a new face image, use the projection matrix $W_{opt}$ to reduce the dimension of the new image and obtain its low-dimensional projection coordinates. Compute the Euclidean distance between these coordinates and each cluster center and take the cluster center with the smallest Euclidean distance; the class of that cluster center is the class of the new face image.
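The assignment rule of this step can be sketched as follows. The input `x_new` is assumed to have already been flattened and passed through the same PCA preprocessing as the training data; the function name `classify` and the toy projection and centers are hypothetical.

```python
import numpy as np

def classify(x_new, W_opt, centers):
    """Return the index of the cluster center closest (in Euclidean distance)
    to the projection of x_new. centers has shape (k, d1): one center per row."""
    y = W_opt.T @ x_new                           # low-dimensional coordinates
    dists = np.linalg.norm(centers - y, axis=1)
    return int(np.argmin(dists))

# Toy example with a hypothetical 3 -> 2 projection and two cluster centers.
W_opt = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.0, 0.0]])
centers = np.array([[0.0, 0.0],
                    [5.0, 5.0]])
label = classify(np.array([4.0, 6.0, 9.0]), W_opt, centers)
```

In this toy case the projected point is (4, 6), which lies closer to the second center, so `label` is 1.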
The above is only a specific embodiment of the invention, and the scope of protection of the invention is not limited to it. Any person skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed by the invention, and all such modifications or substitutions shall fall within the scope of protection of the invention.
Claims (4)
- An unsupervised data dimensionality reduction method based on adaptive nearest neighbor graph embedding, characterized by the following steps:
Step 1: Data preprocessing. Principal component analysis (PCA) is used to preprocess the raw data and obtain the data matrix $X \in \mathbb{R}^{d \times n}$, where $n$ is the number of sample points, $d'$ is the dimension of the raw sample points, and $d$ is the dimension of the sample points after PCA processing;
Step 2: Construct the nearest neighbor graph and initialize it. From the data matrix $X \in \mathbb{R}^{d \times n}$, construct the nearest neighbor graph $G = (X, S)$, where $G$ denotes the constructed graph, $X$ is its set of nodes, and $S$ describes how near or far connected nodes are from one another, each element $s_{ij}$ characterizing the distance relationship between the $i$-th node and the $j$-th node; the weight matrix $S$ is obtained by solving the following minimization problem:
where the power exponent $r$ adjusts the size of the weights, and $x_i \in \mathbb{R}^{d \times 1}$, $i = 1, 2, \ldots, n$, is the $i$-th column vector of $X$, that is, the coordinates of the $i$-th sample point;
Step 3: Optimize the objective function by alternating iteration. Let the projection matrix be $W \in \mathbb{R}^{d \times d_1}$, where $d_1$ is the dimension of the low-dimensional space; $W$ maps the data from the $d$-dimensional space to a $d_1$-dimensional space with $d_1 \ll d$; to ensure that the projected data are statistically uncorrelated, add the constraint $W^T S_t W = I$, where $S_t$ is the global scatter matrix, $X$ is the data matrix, $I$ is the identity matrix, and $\mathbf{1}$ is the vector whose elements are all 1; the objective function is as follows:
where the denoising matrix $F$ is an approximate, noise-removed representation of the data in the subspace, $f_i$ is the transpose of the $i$-th row vector of $F$, and $\lambda$ is the regularization parameter;
the first term of the objective function can be simplified as follows:
where the Laplacian matrix has dimension $n \times n$; the degree matrix is a diagonal matrix, also of dimension $n \times n$; and $S_r$ is the similarity matrix of dimension $n \times n$, each element of which is the corresponding element of the weight matrix $S$ raised to the power $r$; the objective function can then be simplified to:
the above objective function is solved by alternating iteration to obtain the projection matrix $W$, and the dimension-reduced data matrix is $Y = W^T X$.
- The unsupervised data dimensionality reduction method based on adaptive nearest neighbor graph embedding according to claim 1, characterized in that the data in step 1 are face images or hyperspectral images.
- The unsupervised data dimensionality reduction method based on adaptive nearest neighbor graph embedding according to claim 1, characterized in that $r$ is set to 1.1 in step 2.
- A face recognition method based on the data dimensionality reduction method according to claim 1, characterized in that the dimensionality reduction method according to claim 1 is applied to face images to obtain a projection matrix and low-dimensional data; the low-dimensional data are clustered with an unsupervised clustering algorithm to obtain the cluster center of each class; when the camera captures a new face image, the projection matrix is used to reduce the dimension of the new image and obtain its low-dimensional projection coordinates; the Euclidean distance between these coordinates and each cluster center is computed and the cluster center with the smallest Euclidean distance is taken, the class of that cluster center being the class of the new face image.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110216073.4A CN112836672B (en) | 2021-02-26 | 2021-02-26 | Unsupervised data dimension reduction method based on self-adaptive neighbor graph embedding |
CN202110216073.4 | 2021-02-26 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022178977A1 true WO2022178977A1 (en) | 2022-09-01 |
Family
ID=75933743
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/090827 WO2022178977A1 (en) | 2021-02-26 | 2021-04-29 | Unsupervised data dimensionality reduction method based on adaptive nearest neighbor graph embedding |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112836672B (en) |
WO (1) | WO2022178977A1 (en) |
2021
- 2021-02-26: CN application CN202110216073.4A, patent CN112836672B, status: Active
- 2021-04-29: WO application PCT/CN2021/090827, publication WO2022178977A1, status: Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN112836672A (en) | 2021-05-25 |
CN112836672B (en) | 2023-09-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21927410; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 21927410; Country of ref document: EP; Kind code of ref document: A1 |
 | 32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 21.02.2024) |