CN113255490A - Unsupervised pedestrian re-identification method based on k-means clustering and merging

Unsupervised pedestrian re-identification method based on k-means clustering and merging

Info

Publication number
CN113255490A
Authority
CN
China
Prior art keywords
pedestrian
identification
features
network
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110530514.8A
Other languages
Chinese (zh)
Inventor
何建军
蔡华鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu University of Technology
Original Assignee
Chengdu University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu University of Technology filed Critical Chengdu University of Technology
Priority to CN202110530514.8A priority Critical patent/CN113255490A/en
Publication of CN113255490A publication Critical patent/CN113255490A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211Selection of the most significant subset of features
    • G06F18/2113Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the field of pedestrian re-identification in deep learning, and in particular to an unsupervised pedestrian re-identification method based on K-means clustering and merging, in which an unsupervised K-means clustering-and-merging algorithm is used for image processing and a network model based on SE-ResNet50 is used for pedestrian re-identification. The method comprises the following steps: selecting a pedestrian re-identification data set; constructing a pedestrian re-identification model comprising a feature extraction network and a connection prediction network; clustering and merging the extracted feature images using the K-means algorithm; and finally training the fused network on the clustered images to obtain the pedestrian re-identification result. The method obtains good results on the two pedestrian re-identification data sets Market-1501 and DukeMTMC-reID, and is helpful for pedestrian re-identification in the field of unsupervised learning.

Description

Unsupervised pedestrian re-identification method based on k-means clustering and merging
Technical Field
The invention relates to pedestrian re-identification in deep learning: SE-ResNet50 is used to extract image features, the features are clustered with k-means, and training is then carried out on the clustered results. The invention belongs to the field of computer vision.
Background
Pedestrian re-identification has become a popular research direction in computer vision in recent years; it is a technique that uses computer vision to search for and identify a specific pedestrian across cameras and across scenes. Pedestrian re-identification has valuable applications in criminal investigation, medical imaging, security, and smart living.
With the improvement of computer hardware and the rapid growth of available computing power, deep learning has entered a golden period of development and produced many excellent results. The same is true in the pedestrian re-identification direction, where research is flourishing: deep neural networks extract basic low-level features and build abstract deep features from them, thereby discovering the internal regularities of the data. Mainstream deep learning approaches to pedestrian re-identification use networks such as GoogLeNet and ResNet, first extracting pedestrian image features and then performing re-identification.
However, although conventional pedestrian re-identification methods achieve good results, they are supervised learning methods based on labelled classification data, and progress on cross-domain unsupervised or semi-supervised pedestrian re-identification remains slow. The main problems are: 1) a model trained under supervised conditions suffers a large performance drop when transferred directly to a target domain for testing; 2) domain adaptation is very difficult when the target-domain data carry no identity labels.
To address these difficulties, an improved unsupervised pedestrian re-identification method based on K-means clustering and merging is provided. Its difference from the original bottom-up clustering is that an SE module is added to the convolutional neural network to extract pedestrian features; the features extracted by the network are then clustered and merged with K-means so that pedestrian images with similar features are grouped into one class, and the model is retrained on these groups until it converges, at which point training stops.
Disclosure of Invention
The invention uses SE-ResNet-50 to extract image features, then performs K-means clustering and merging, and finally carries out pedestrian re-identification; the overall network is shown in FIG. 2.
The first step: SE-ResNet-50 extracts abstract image features.
We denote the size of the input feature map as H × W × C, where H, W, and C are the height, width, and number of channels of the input feature map, respectively, and u_c is the value at the corresponding position of channel c. The input feature map first passes through a 3×3 convolution followed by max pooling.
The last two layers of the convolutional neural network are combined with the SE module to form an SE-based pedestrian re-identification network structure. This makes full use of the SE module: it helps the convolutional neural network improve its ability to discriminate information, makes the importance of each channel easier to identify, strengthens the important features, and suppresses the secondary ones.
Under the fully connected receptive field over the spatial domain, the channel weight coefficients become a 1 × 1 × C vector; multiplying these coefficients with the corresponding channels recalibrates the final features, and the small number of additional parameters allows the network to better identify spatial features.
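The following is a minimal, illustrative PyTorch sketch of such an SE block (squeeze by global average pooling, excitation by two fully connected layers, then channel-wise rescaling). The class name SEBlock and the reduction ratio of 16 are assumptions made for illustration and are not specified by the invention.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation block: recalibrates the channels of a B x C x H x W feature map."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)              # squeeze: global average pooling
        self.fc = nn.Sequential(                         # excitation: bottleneck of two FC layers
            nn.Linear(channels, channels // reduction, bias=False),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.pool(x).view(b, c)                      # B x C channel descriptors
        w = self.fc(w).view(b, c, 1, 1)                  # 1 x 1 x C channel weight coefficients
        return x * w                                     # scale: recalibrate the input features
```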
The second step: the extracted feature images are clustered with the K-means algorithm.
(1) K elements are randomly selected from the extracted original feature data as the center points of K clusters.
(2) The distance from each remaining element to each of the K cluster centers is computed, and every element is assigned to the cluster whose center is closest; the formula is as follows.
$$E=\sum_{i=1}^{K}\sum_{x\in C_i}\left\|x-\mu_i\right\|_2^2$$
where E accumulates the distance between each feature point x and the selected center point $\mu_i$ of its cluster.
(3) The center point of each cluster is recalculated as the mean of its members; the formula is as follows.
$$\mu_i=\frac{1}{\left|C_i\right|}\sum_{x\in C_i}x$$
where $C_i$ denotes the i-th cluster; minimizing E over the K clusters optimizes the K-means result.
(4) Steps (2) and (3) are repeated until the iteration converges (a minimal sketch of the whole procedure follows this list).
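As an illustration only, the four steps above can be sketched in a few lines of NumPy. The feature matrix feats (one row per extracted pedestrian feature), the cluster count k, and the stopping tolerance are assumed inputs; none of them is fixed by the invention.

```python
import numpy as np

def kmeans(feats: np.ndarray, k: int, iters: int = 100, tol: float = 1e-4):
    """Cluster N x D feature vectors into k groups following steps (1)-(4) above."""
    rng = np.random.default_rng(0)
    centers = feats[rng.choice(len(feats), size=k, replace=False)]      # (1) random initial centers
    for _ in range(iters):
        dists = np.linalg.norm(feats[:, None] - centers[None], axis=2)  # (2) element-to-center distances
        labels = dists.argmin(axis=1)                                    #     assign to the nearest center
        new_centers = np.stack([
            feats[labels == i].mean(axis=0) if np.any(labels == i) else centers[i]
            for i in range(k)
        ])                                                               # (3) re-average each cluster
        if np.linalg.norm(new_centers - centers) < tol:                  # (4) stop when converged
            break
        centers = new_centers
    return labels, centers
```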
The third step: the fused network is trained on the unsupervised clustered pedestrian images.
Because the clustering-based pedestrian re-identification task is more complex, a deeper network is required for training; in theory, a convolutional neural network improves the generalization ability of the model as the number of layers increases [18]. In practice, however, a convolutional neural network suffers from model degradation as the number of layers grows, the root cause being factors such as the weakening of gradients as they propagate between layers. To better improve the model's recognition of input pictures, a SENet module is therefore introduced into the convolutional neural network.
The fourth step: the performance of the model is evaluated on the test set.
The method of the invention is evaluated using the Cumulative Matching Curve (CMC) and the mean average precision (mAP). The CMC is a common evaluation criterion in pedestrian re-identification. The N searched pedestrian pictures are placed in a candidate set; each query is compared with the pictures in the set, the pedestrian images are sorted by distance, the sorted positions are indexed (1, 2, ..., N), and the statistics over all queries yield the CMC curve. The earlier the searched pedestrian appears in the ranking, the better the algorithm performs.
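A minimal sketch of how rank-k accuracies for the CMC curve could be computed is given below. The query-to-gallery distance matrix dist and the identity label arrays q_ids and g_ids are assumed inputs, and same-camera filtering, which standard Market-1501 evaluation applies, is omitted for brevity.

```python
import numpy as np

def cmc_curve(dist: np.ndarray, q_ids: np.ndarray, g_ids: np.ndarray, topk: int = 10) -> np.ndarray:
    """Return rank-1 .. rank-topk matching rates from a query x gallery distance matrix."""
    hits = np.zeros(topk)
    n_valid = 0
    for qi in range(dist.shape[0]):
        order = np.argsort(dist[qi])                 # gallery sorted by distance to this query
        match = g_ids[order] == q_ids[qi]
        if not match.any():
            continue                                  # no correct gallery image for this query
        n_valid += 1
        first = int(np.argmax(match))                 # rank of the first correct match
        if first < topk:
            hits[first:] += 1                         # a hit at rank r counts for all ranks >= r
    return hits / max(n_valid, 1)
```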
Drawings
FIG. 1 is a network flow diagram
FIG. 2 is a diagram of a network architecture based on SE modules
FIG. 3 basic framework diagram for unsupervised learning
FIG. 4 Cluster merge graph
FIG. 5 is a graph showing the results of the experiment
FIG. 6 is a graph showing the results of the experiment
Detailed Description
An SE-ResNet50 network is built with the PyTorch deep learning framework under a CentOS 6.0 environment, with an NVIDIA GTX 1080Ti as the hardware configuration and the Market-1501 and DukeMTMC-reID pedestrian re-identification data sets as the experimental data. In the first stage, training is run with epochs = 20, batch_size = 16, drop = 0.5, mp = 0.05, and lambda = 0.005. SGD optimization is then used with momentum 0.9, a learning rate of 0.1 for the first 15 epochs, and a learning rate of 0.01 for the last 5 epochs. With the NVIDIA GTX 1080Ti, we tested on the Market-1501 and DukeMTMC data sets, and training took 39 hours to complete.
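The stated optimizer settings translate directly into a short PyTorch sketch. The stand-in model below and the MultiStepLR scheduler are illustrative assumptions; only the momentum, learning rates, and epoch counts come from the description above.

```python
import torch
import torch.nn as nn

model = nn.Linear(2048, 751)   # stand-in for the SE-ResNet50 network; 751 classes as in Market-1501 training

optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)                   # momentum = 0.9, lr = 0.1
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[15], gamma=0.1)

for epoch in range(20):                      # epochs = 20 in total
    # ... one training pass over the clustered data would run here ...
    scheduler.step()                         # learning rate drops from 0.1 to 0.01 after epoch 15
```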
1. Feature extraction stage
SE-ResNet50 helps the convolutional neural network improve its ability to recognize information. Each feature channel is compressed into a single real number that has a global receptive field; after all feature channels have been compressed, a set of compressed real numbers is obtained in which each real number represents the global response of its feature channel, so the number of real numbers in the set equals the total number of feature channels.
An excitation function is then introduced. Global average pooling of the C × H × W feature map produces a 1 × 1 × C feature map, which at this stage has a global receptive field. The excitation operation then applies a nonlinear transformation to this result through a fully connected neural network, and the output is multiplied onto the input features as channel weights.
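The tensor shapes involved in the squeeze, excitation, and rescaling steps can be traced with the short, self-contained sketch below. The batch size of 8, the 2048-channel 7 × 7 feature map (typical of the last ResNet50 stage), and the bottleneck width of 128 are illustrative assumptions.

```python
import torch
import torch.nn as nn

x = torch.randn(8, 2048, 7, 7)                         # B x C x H x W input feature map
s = nn.AdaptiveAvgPool2d(1)(x).flatten(1)              # squeeze: global average pooling -> B x C
fc = nn.Sequential(nn.Linear(2048, 128), nn.ReLU(),
                   nn.Linear(128, 2048), nn.Sigmoid()) # excitation: nonlinear FC transform
w = fc(s).view(8, 2048, 1, 1)                          # 1 x 1 x C channel weights
y = x * w                                              # multiply the weights onto the input features
print(y.shape)                                         # torch.Size([8, 2048, 7, 7]); shape unchanged
```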
2. Image feature clustering stage
As shown in FIG. 4, k feature points are randomly selected as initial centers; the distance from every feature to each selected center is computed in turn, and each feature point is assigned to the closest, i.e. most similar, center, finally yielding k clusters. The pedestrian features are grouped into spherical clusters so as to maximize the diversity of the pedestrian categories. In (b), clusters are merged so that the features embedded in the same spherical cluster become closer and closer. In (c), the upper half of the sphere shows the cluster merge result without diversity regularization: (point 1, point 3) and (point 4, point 8) have the shortest distances and are merged into one cluster. The lower half shows the merge result with diversity regularization: although the distance between the yellow and green clusters is the shortest, those two clusters are too large to be merged, and point 6 and point 7 are merged instead.
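A hedged sketch of one merging step with such a diversity (cluster-size) penalty is shown below. The exact merge criterion and the weight lam are assumptions made for illustration; the idea is simply that the penalty term discourages merging two already-large clusters even when their centers are close.

```python
import numpy as np

def merge_once(feats: np.ndarray, labels: np.ndarray, lam: float = 0.005) -> np.ndarray:
    """Merge the pair of clusters with the smallest distance-plus-size score."""
    ids = np.unique(labels)
    centers = {i: feats[labels == i].mean(axis=0) for i in ids}
    sizes = {i: int((labels == i).sum()) for i in ids}
    best_pair, best_score = None, np.inf
    for ai in range(len(ids)):
        for bi in range(ai + 1, len(ids)):
            a, b = ids[ai], ids[bi]
            # center distance plus a diversity penalty on the combined cluster size
            score = np.linalg.norm(centers[a] - centers[b]) + lam * (sizes[a] + sizes[b])
            if score < best_score:
                best_pair, best_score = (a, b), score
    merged = labels.copy()
    merged[merged == best_pair[1]] = best_pair[0]      # relabel cluster b as cluster a
    return merged
```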
3. Network training phase
The clustered and merged images are put into the pedestrian re-identification network for training; the network chosen is SE-ResNet50, and training ends once it converges.
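One training round over the pseudo-labels produced by clustering could look like the sketch below. It assumes a model whose classifier head is sized to the current number of clusters and a data loader that yields images paired with their cluster indices; both are illustrative, not part of the claimed method.

```python
import torch
import torch.nn as nn

def train_round(model, loader, optimizer, device: str = "cuda"):
    """Train the re-identification network for one pass over the cluster pseudo-labels."""
    criterion = nn.CrossEntropyLoss()
    model.train()
    for images, pseudo_labels in loader:               # pseudo-labels come from the k-means clusters
        images = images.to(device)
        pseudo_labels = pseudo_labels.to(device)
        logits = model(images)                         # SE-ResNet50 features + classifier head
        loss = criterion(logits, pseudo_labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```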
4. Comparison of Experimental results
Compared with existing unsupervised methods, the improved method achieves a clearly better final result. The Market-1501 and DukeMTMC-reID pedestrian re-identification data sets are used in the experiments, and the cumulative matching curve (CMC) and the mean average precision (mAP) are used to evaluate the method. The experimental results are shown in Tables 1 and 2.
Table 1: comparison of the presented method with current state-of-the-art methods on the two image-based data sets
Table 2: comparison with representative methods on the two data sets

Claims (2)

1. An unsupervised K-means-based clustering and merging algorithm for image processing, characterized by comprising the following steps:
1) selecting a pedestrian re-identification data set;
2) designing an unsupervised learning basic framework, wherein a network for extracting image features comprises a convolutional layer, a maximum pooling layer and an activation layer;
3) in order to obtain groups of images with the most similar features, clustering is carried out using the K-means algorithm, and the distance between features is calculated according to the following formula:
$$E=\sum_{i=1}^{K}\sum_{x\in C_i}\left\|x-\mu_i\right\|_2^2$$
2. A network model based on SE-ResNet50 for pedestrian re-identification, characterized by comprising the following steps:
1) the feature network model adds an SE module to the convolutional neural network to avoid errors in the clustering process; in the SE network, the SE module is the core component: it compresses the large number of recognized features so that only the important feature information is processed, finally achieving effective feature extraction;
2) performing a 'compression' (squeeze) operation on the features along the spatial dimensions, replacing each input feature channel with a single real number so that each real number has a global receptive field and the inputs and outputs correspond one to one;
3) performing an excitation operation on the features and calibrating a weight for every feature channel so as to reflect the importance of each channel;
4) according to the channel weights calibrated by the excitation operation, weighting each channel of the initial features by multiplication, thereby recalibrating the original features channel by channel.
CN202110530514.8A 2021-05-15 2021-05-15 Unsupervised pedestrian re-identification method based on k-means clustering and merging Pending CN113255490A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110530514.8A CN113255490A (en) 2021-05-15 2021-05-15 Unsupervised pedestrian re-identification method based on k-means clustering and merging

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110530514.8A CN113255490A (en) 2021-05-15 2021-05-15 Unsupervised pedestrian re-identification method based on k-means clustering and merging

Publications (1)

Publication Number Publication Date
CN113255490A true CN113255490A (en) 2021-08-13

Family

ID=77182031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110530514.8A Pending CN113255490A (en) 2021-05-15 2021-05-15 Unsupervised pedestrian re-identification method based on k-means clustering and merging

Country Status (1)

Country Link
CN (1) CN113255490A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110400349A (en) * 2019-07-03 2019-11-01 成都理工大学 Robot navigation tracks restoration methods under small scene based on random forest
CN111860678A (en) * 2020-07-29 2020-10-30 中国矿业大学 Unsupervised cross-domain pedestrian re-identification method based on clustering
CN112560604A (en) * 2020-12-04 2021-03-26 中南大学 Pedestrian re-identification method based on local feature relationship fusion
CN112766237A (en) * 2021-03-12 2021-05-07 东北林业大学 Unsupervised pedestrian re-identification method based on cluster feature point clustering

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
邬可 et al.: "Pedestrian Re-identification Based on Squeeze-and-Excitation Residual Network and Feature Fusion", Laser & Optoelectronics Progress *

Similar Documents

Publication Publication Date Title
CN108830296B (en) Improved high-resolution remote sensing image classification method based on deep learning
CN110738146B (en) Target re-recognition neural network and construction method and application thereof
CN109165566B (en) Face recognition convolutional neural network training method based on novel loss function
CN113378632B (en) Pseudo-label optimization-based unsupervised domain adaptive pedestrian re-identification method
CN113516012B (en) Pedestrian re-identification method and system based on multi-level feature fusion
CN110348399B (en) Hyperspectral intelligent classification method based on prototype learning mechanism and multidimensional residual error network
CN108256482B (en) Face age estimation method for distributed learning based on convolutional neural network
CN109828251A (en) Radar target identification method based on feature pyramid light weight convolutional neural networks
CN112580590A (en) Finger vein identification method based on multi-semantic feature fusion network
CN109063649B (en) Pedestrian re-identification method based on twin pedestrian alignment residual error network
CN113688894B (en) Fine granularity image classification method integrating multiple granularity features
CN112200123B (en) Hyperspectral open set classification method combining dense connection network and sample distribution
CN114299542A (en) Video pedestrian re-identification method based on multi-scale feature fusion
CN114463812B (en) Low-resolution face recognition method based on double-channel multi-branch fusion feature distillation
CN112329536A (en) Single-sample face recognition method based on alternative pair anti-migration learning
CN111598004A (en) Progressive-enhancement self-learning unsupervised cross-domain pedestrian re-identification method
CN115909052A (en) Hyperspectral remote sensing image classification method based on hybrid convolutional neural network
CN112070010B (en) Pedestrian re-recognition method for enhancing local feature learning by combining multiple-loss dynamic training strategies
CN112364809A (en) High-accuracy face recognition improved algorithm
CN115965864A (en) Lightweight attention mechanism network for crop disease identification
CN116052218A (en) Pedestrian re-identification method
CN115116139A (en) Multi-granularity human body action classification method based on graph convolution network
CN109145950B (en) Hyperspectral image active learning method based on image signal sampling
Wang et al. Bit-plane and correlation spatial attention modules for plant disease classification
CN113033345B (en) V2V video face recognition method based on public feature subspace

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210813