CN111881757B - Pedestrian re-identification method, device, equipment and medium - Google Patents

Pedestrian re-identification method, device, equipment and medium

Info

Publication number
CN111881757B
CN111881757B CN202010605966.3A
Authority
CN
China
Prior art keywords
pedestrian
sample
training set
recognition
samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010605966.3A
Other languages
Chinese (zh)
Other versions
CN111881757A (en)
Inventor
金良
尹云峰
范宝余
张润泽
郭振华
姜金哲
梁玲燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd filed Critical Inspur Electronic Information Industry Co Ltd
Priority to CN202010605966.3A priority Critical patent/CN111881757B/en
Publication of CN111881757A publication Critical patent/CN111881757A/en
Priority to PCT/CN2021/076976 priority patent/WO2022001137A1/en
Application granted granted Critical
Publication of CN111881757B publication Critical patent/CN111881757B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a pedestrian re-identification method, device, equipment and medium, comprising the following steps: extracting features of an original training set by using a first pedestrian re-recognition model, the original training set comprising pedestrian sample images and corresponding label information; clustering according to the feature space distribution characteristics of the original data set; screening out difficult samples according to the clustering result; adding the difficult samples to the original training set to obtain a target training set; training the first pedestrian re-recognition model with the target training set to obtain a second pedestrian re-recognition model; and, when a pedestrian image to be identified is acquired, outputting the corresponding recognition result with the second pedestrian re-recognition model. In this way, difficult samples are mined and the spatial distribution of samples in the original data set is changed by the mined difficult samples, which increases the attention paid to difficult samples and improves the accuracy of pedestrian re-identification.

Description

Pedestrian re-identification method, device, equipment and medium
Technical Field
The present application relates to the field of computer vision technologies, and in particular, to a pedestrian re-recognition method, device, apparatus, and medium.
Background
Pedestrian re-recognition is a technique that uses computer vision to judge whether a specific pedestrian is present in an image or a video sequence, and can be understood as a sub-problem of image retrieval. Given the video sequences captured by several cameras in an area, a pedestrian of interest is designated under one camera and all relevant images of that pedestrian under the other cameras are quickly retrieved. How to improve the performance of pedestrian re-recognition algorithms is a hot issue in the related technical field.
At present, for a pedestrian re-recognition algorithm, the loss tends to a stable state during training because the number of difficult samples gradually decreases, which becomes a bottleneck for further improving the algorithm performance.
Disclosure of Invention
Accordingly, the present application is directed to a pedestrian re-recognition method, device, apparatus and medium, which can increase the attention to difficult samples, thereby improving the accuracy of pedestrian re-recognition. The specific scheme is as follows:
in a first aspect, the application discloses a pedestrian re-recognition method, which comprises the following steps:
extracting features of an original training set by using a first pedestrian re-recognition model; the original training set comprises pedestrian sample images and corresponding label information;
clustering according to the feature space distribution characteristics of the original data set;
screening out difficult samples according to the clustering result;
adding the difficult sample to the original training set to obtain a target training set;
training the first pedestrian re-recognition model by using the target training set to obtain a second pedestrian re-recognition model;
and when the pedestrian image to be identified is acquired, outputting a corresponding identification result by using the second pedestrian re-identification model.
Optionally, before extracting the features of the original training set by using the first pedestrian re-recognition model, the method further includes:
respectively training different initial models by using the same preset training set to obtain a plurality of trained models; different initial models are based on different pedestrian re-recognition algorithms;
and evaluating all the trained models based on preset evaluation indexes, determining the trained model with the highest pedestrian re-recognition accuracy, and obtaining the first pedestrian re-recognition model.
Optionally, the clustering according to the feature space distribution characteristics of the original dataset includes:
and clustering by using a Kmeans algorithm according to the feature space distribution characteristics of the original data set.
Optionally, the screening the difficult sample according to the clustering result includes:
and screening samples with clustering results different from the label information to obtain Hard negotives samples.
Optionally, the screening the difficult sample according to the clustering result includes:
screening out semi-Hard negative samples based on the BvSB criterion.
Optionally, the screening out semi-Hard negative samples based on the BvSB criterion includes:
calculating a first distance between any sample and each cluster center;
converting the first distance into a corresponding probability value;
determining a maximum probability value and a next-largest probability value from all the probability values of any sample;
judging whether the difference value between the maximum probability value and the next-largest probability value corresponding to the current sample is smaller than a preset threshold value;
and if the difference value between the maximum probability value and the next-largest probability value corresponding to the current sample is smaller than the preset threshold value, judging that the current sample is a semi-Hard negative sample.
Optionally, the screening the difficult sample according to the clustering result includes:
calculating a second distance between each sample in the same cluster and the current cluster center;
screening out a maximum distance and a minimum distance from the second distances;
determining a distance threshold by using the maximum distance and the minimum distance;
judging whether any of the second distances is larger than the distance threshold;
and if the second distance is greater than the distance threshold, determining the corresponding sample as a semi-Hard negative sample.
In a second aspect, the present application discloses a pedestrian re-recognition apparatus, comprising:
the feature extraction module is used for extracting features of the original training set by using the first pedestrian re-recognition model; the original training set comprises pedestrian sample images and corresponding label information;
the clustering module is used for clustering according to the feature space distribution characteristics of the original data set;
the difficult sample screening module is used for screening out difficult samples according to the clustering result;
the target training set acquisition module is used for adding the difficult sample to the original training set to obtain a target training set;
the model training module is used for training the first pedestrian re-recognition model by utilizing the target training set to obtain a second pedestrian re-recognition model;
and the image recognition module is used for outputting a corresponding recognition result by using the second pedestrian re-recognition model when the pedestrian image to be recognized is acquired.
In a third aspect, the application discloses a pedestrian re-recognition device comprising a processor and a memory; wherein,
the memory is used for storing a computer program;
the processor is used for executing the computer program to realize the pedestrian re-recognition method.
In a fourth aspect, the present application discloses a computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the pedestrian re-recognition method described above.
Therefore, the application extracts features of the original training set by using the first pedestrian re-recognition model, where the original training set comprises pedestrian sample images and corresponding label information; clusters according to the feature space distribution characteristics of the original data set; screens out difficult samples according to the clustering result; adds the difficult samples to the original training set to obtain a target training set; trains the first pedestrian re-recognition model with the target training set to obtain a second pedestrian re-recognition model; and, when a pedestrian image to be identified is acquired, outputs the corresponding recognition result with the second pedestrian re-recognition model. In this way, difficult samples are mined and the spatial distribution of samples in the original data set is changed by the mined difficult samples, which increases the attention paid to difficult samples and improves the accuracy of pedestrian re-identification.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a pedestrian re-recognition application scenario provided by the application;
FIG. 2 is a schematic diagram of a triple loss generation according to the present application;
FIG. 3 is a schematic diagram of a negative distribution provided by the present application;
FIG. 4 is a flow chart of a pedestrian re-recognition method disclosed by the application;
FIG. 5 is a diagram of a NAIC ReID dataset provided by the present application;
FIG. 6 is a diagram showing a clustering result of NAIC ReID data according to the present application;
FIG. 7a is a schematic diagram of sample probability based on BvSB criteria according to the present disclosure;
FIG. 7b is a schematic diagram of sample probability based on BvSB criteria according to the present disclosure;
FIG. 8 is a flowchart of a specific pedestrian re-recognition method disclosed in the present application;
FIG. 9 is a flowchart of pedestrian re-recognition based on the AlignedReID++ algorithm provided by the application;
FIG. 10 is a flow chart of pedestrian re-recognition based on the ReID-strong-baseline algorithm provided by the application;
FIG. 11 is a flow chart of pedestrian re-recognition based on ReID-MGN algorithm provided by the application;
FIG. 12 is a flow chart of pedestrian re-identification based on OSNet algorithm provided by the application;
FIG. 13 is a sub-flowchart of pedestrian re-recognition in accordance with the present disclosure;
FIG. 14 is a schematic view of a pedestrian re-recognition device according to the present disclosure;
fig. 15 is a block diagram of a pedestrian re-recognition apparatus according to the present disclosure.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Pedestrian re-recognition is a technique that uses computer vision to judge whether a specific pedestrian is present in an image or a video sequence, and can be understood as a sub-problem of image retrieval. Referring to fig. 1, fig. 1 is a schematic diagram of an application scenario for pedestrian re-recognition provided by the application. Video sequences captured by several cameras in an area are acquired; by designating a pedestrian of interest under one camera, all relevant images of that pedestrian under the other cameras are quickly retrieved.
Pedestrian re-recognition belongs to metric learning and aims to learn the similarity of two images through an algorithm, i.e. different images of the same pedestrian should have a larger similarity while images of different pedestrians should have a smaller similarity. A dataset is therefore typically divided into three parts: train, query and gallery. The train part is used to train a model, and the query and gallery parts, whose data are completely different from the train part, are used for testing: the trained model extracts features of the query and gallery images, and the similarity between the current query image and all gallery images is computed in feature space. Because the categories seen in training and testing are completely different, and testing requires comparing two unknown pedestrians and judging whether they are the same person according to their similarity, the cross-entropy softmax loss function with a fixed number of classes is completely inapplicable. The triplet loss function was therefore introduced; it mainly solves the following problem: in the feature space, samples with the same label should be close to each other while samples with different labels should be far from each other. That is,
L_triplet(a, p, n) = max(d(a, p) - d(a, n) + margin, 0);
where a is the anchor, p is a positive sample of the same category as the anchor, n is a negative sample of a different category from the anchor, margin is a non-zero constant, and d(a, p) = ||f(a) - f(p)||_2, where f maps an image into feature space and d denotes a distance such as the Euclidean distance. Referring to fig. 2, fig. 2 is a schematic diagram of triplet loss generation according to the present application. As can be seen from the equation, minimizing L_triplet(a, p, n) forces d(a, p) towards 0 and forces d(a, n) to be greater than d(a, p) + margin. Based on this loss definition, triplets can be classified into three categories, as shown in FIG. 3, which is a schematic diagram of the negative distribution provided by the present application: (1) easy triplets: when d(a, p) + margin < d(a, n), the loss value is 0 and n is an easy negative; (2) hard triplets: when d(a, n) < d(a, p), the loss value is not 0 and n is a hard negative; (3) semi-hard triplets: when d(a, p) < d(a, n) < d(a, p) + margin, the loss value is not 0 and n is a semi-hard negative. Since the three categories of triplets are determined by their negatives, easy, hard and semi-hard triplets correspond directly to easy, hard and semi-hard negatives.
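For illustration, the following is a minimal sketch in Python of the triplet loss defined above; the function name, the NumPy implementation and the example margin value are illustrative choices rather than part of the present application.

```python
import numpy as np

def triplet_loss(f_a, f_p, f_n, margin=0.3):
    """L_triplet(a, p, n) = max(d(a, p) - d(a, n) + margin, 0).

    f_a, f_p, f_n are feature vectors of the anchor, a positive sample
    (same identity as the anchor) and a negative sample (different
    identity); d is the Euclidean distance in feature space.
    """
    d_ap = np.linalg.norm(f_a - f_p)   # intra-class distance d(a, p)
    d_an = np.linalg.norm(f_a - f_n)   # inter-class distance d(a, n)
    return max(d_ap - d_an + margin, 0.0)

# Category of the negative n, following the definitions above:
#   d_ap + margin < d_an          -> easy negative   (loss = 0)
#   d_an < d_ap                   -> hard negative
#   d_ap < d_an < d_ap + margin   -> semi-hard negative
```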
At present, the performance of a pedestrian re-identification algorithm can only be improved by improving its ability to identify difficult samples. Examining the triplet loss function, it can be seen that the loss makes the algorithm pay more attention to hard negatives and semi-hard negatives; however, during training the number of difficult samples gradually decreases, the loss tends to a stable state, and the algorithm performance can no longer improve. To further improve performance, the algorithm's attention to difficult samples therefore needs to be increased. The pedestrian re-recognition scheme provided by the application increases the attention paid to difficult samples, thereby improving the accuracy of pedestrian re-recognition.
Referring to fig. 4, the embodiment of the application discloses a pedestrian re-identification method, which comprises the following steps:
step S11: extracting features of an original training set by using a first pedestrian re-recognition model; the original training set comprises pedestrian sample images and corresponding label information.
The original training set is, for example, a NAIC ReID (Person Re-identification) data set; see fig. 5, which is a schematic diagram of a NAIC ReID data set provided by an embodiment of the present application. To protect privacy and prevent cheating, the pedestrian images in this data set have been specially processed, so that useful information is difficult to obtain from a picture alone.
It should be noted that in this embodiment the original training set includes, but is not limited to, the NAIC ReID data set; in some embodiments, for example, Market-1501, DukeMTMC, and the like may also be used.
Step S12: and clustering according to the feature space distribution characteristics of the original data set.
In a specific implementation, this embodiment can use the Kmeans (k-means clustering) algorithm to perform clustering according to the feature space distribution characteristics of the original data set.
It should be noted that clustering algorithms used in embodiments of the present application include, but are not limited to, the Kmeans algorithm.
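As a concrete illustration of this step, the following sketch clusters a saved feature matrix with scikit-learn's KMeans; the feature/label file names and the choice of one cluster per identity are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Assumed inputs: a (num_samples, 2048) feature matrix extracted by the
# first re-identification model and the integer identity label of each
# sample; the file names are hypothetical.
features = np.load("train_features.npy")
labels = np.load("train_labels.npy")

n_clusters = len(np.unique(labels))            # one cluster per identity
kmeans = KMeans(n_clusters=n_clusters, random_state=0).fit(features)

cluster_ids = kmeans.labels_                   # cluster index per sample
centers = kmeans.cluster_centers_              # (n_clusters, 2048) centers
```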
Step S13: and screening out difficult samples according to the clustering result.
In a specific embodiment, samples whose clustering results differ from the label information may be screened out to obtain Hard negative samples.
That is, comparing the clustering result with the real labels of the data set screens out the samples that are misclassified in feature space, namely the difficult samples (Hard negatives).
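A minimal sketch of this screening step follows. Since raw k-means cluster indices are arbitrary, the sketch first maps each cluster to the majority ground-truth label of its members before comparing; that mapping is an assumption of the example, not a step stated above.

```python
import numpy as np

def mine_hard_negatives(cluster_ids, labels):
    """Return indices of samples whose cluster disagrees with their label.

    Assumption: each cluster is first mapped to the majority ground-truth
    label of its members so that cluster indices and identity labels
    become comparable (labels are non-negative integers).
    """
    cluster_to_label = {}
    for c in np.unique(cluster_ids):
        members = labels[cluster_ids == c]
        cluster_to_label[c] = np.bincount(members).argmax()

    predicted = np.array([cluster_to_label[c] for c in cluster_ids])
    return np.where(predicted != labels)[0]     # misclustered samples
```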
In addition, semi-Hard negative samples are screened out based on the BvSB (Best vs Second-Best) criterion. Specifically: calculating a first distance between any sample and each cluster center; converting the first distances into corresponding probability values; determining a maximum probability value and a next-largest probability value from all the probability values of the sample; judging whether the difference between the maximum probability value and the next-largest probability value corresponding to the current sample is smaller than a preset threshold value; and, if the difference between the maximum probability value and the next-largest probability value corresponding to the current sample is smaller than the preset threshold value, judging that the current sample is a semi-Hard negative sample.
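The following sketch illustrates the BvSB screening described above. The exact distance-to-probability conversion is not reproduced in this text, so a normalized inverse distance with a small eps is used purely as a stand-in; thr and eps follow the example values given later (0.2 and 0.0001).

```python
import numpy as np

def mine_semi_hard_bvsb(features, centers, thr=0.2, eps=1e-4):
    """BvSB screening: small gap between best and second-best probability.

    Note: converting center distances into probabilities via a normalized
    inverse distance is only an illustrative stand-in for the conversion
    used in this embodiment.
    """
    semi_hard = []
    for i, f in enumerate(features):
        dists = np.linalg.norm(centers - f, axis=1)   # distance to each center
        probs = 1.0 / (dists + eps)
        probs = probs / probs.sum()                   # illustrative probabilities
        p1, p2 = np.sort(probs)[-2:][::-1]            # best and second-best
        if p1 - p2 <= thr:                            # ambiguous assignment
            semi_hard.append(i)
    return np.array(semi_hard, dtype=int)
```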
Further, this embodiment may also calculate a second distance between each sample in the same cluster and the current cluster center; screen out a maximum distance and a minimum distance from the second distances; determine a distance threshold by using the maximum distance and the minimum distance; judge whether any of the second distances is larger than the distance threshold; and, if a second distance is greater than the distance threshold, determine the corresponding sample as a semi-Hard negative sample.
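A minimal sketch of this distance-threshold screening, assuming cluster assignments and centers from the earlier k-means step; the ratio used to set the threshold follows the value 0.5 given in the detailed example below.

```python
import numpy as np

def mine_semi_hard_by_distance(features, cluster_ids, centers, ratio=0.5):
    """Within each cluster, flag samples that lie far from their own center."""
    semi_hard = []
    for c in np.unique(cluster_ids):
        idx = np.where(cluster_ids == c)[0]
        dists = np.linalg.norm(features[idx] - centers[c], axis=1)
        d_max, d_min = dists.max(), dists.min()
        d_thr = (d_max - d_min) * ratio + d_min       # distance threshold
        semi_hard.extend(idx[dists > d_thr])          # far-from-center samples
    return np.array(semi_hard, dtype=int)
```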
It should be noted that, as can be seen from fig. 2 and fig. 3, only Hard negative and semi-Hard negative samples help to improve the algorithm performance; in feature space these correspond to misclassified samples and samples that are easily misclassified. As shown in fig. 6, a schematic diagram of a NAIC ReID data clustering result disclosed in the embodiment of the present application, 4 classes of the NAIC ReID data set are randomly selected and the feature space is reduced from 2048 dimensions to 2 dimensions by PCA (principal component analysis). In fig. 6, the solid arrows point to misclassified samples (Hard negatives): the algorithm has wrongly clustered class 579 samples into class 366 and class 191 samples into class 579. The dotted arrows point to easily misclassified samples (semi-Hard negatives), i.e. samples that are far from the cluster center of their own class. In order to select these samples in the feature space, firstly, samples whose clustering result differs from the real label are selected as Hard negatives. Secondly, semi-hard negatives are selected based on BvSB: the distance between each sample and each cluster center is calculated and converted into a probability, the maximum and second-largest probabilities are obtained and subtracted, and if the result is smaller than a specified threshold the sample is a semi-hard negative. As shown in fig. 7a, a schematic diagram of sample probability based on the BvSB criterion, the probabilities of the sample belonging to category 4 and category 5 differ little, so the sample is easily misclassified. Referring to fig. 7b, another schematic diagram of sample probability based on the BvSB criterion, the probability of the sample belonging to category 3 differs greatly from the probability of it belonging to category 4, so it is not prone to error. Then, semi-hard negatives are selected according to the distribution of the sample set in the feature space: according to the clustering result, for each class the distance between each sample and the class center is calculated, the maximum and minimum distances are obtained, a threshold is computed, and samples whose distance exceeds the threshold are selected as semi-hard negatives.
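For reference, a 2-D visualization like fig. 6 can be produced by reducing the saved features with PCA, as in the sketch below; the file names and the fourth class ID are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

features = np.load("train_features.npy")    # hypothetical file names, as above
labels = np.load("train_labels.npy")

chosen_ids = [579, 366, 191, 100]            # class IDs; the fourth is illustrative
mask = np.isin(labels, chosen_ids)

xy = PCA(n_components=2).fit_transform(features[mask])   # 2048-d -> 2-d
for cid in chosen_ids:
    m = labels[mask] == cid
    plt.scatter(xy[m, 0], xy[m, 1], s=8, label=f"ID {cid}")
plt.legend()
plt.show()
```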
For example, the present embodiment may extract the dataset features using the first pedestrian re-recognition model and save the features; that is, an inference pass is run on the data set with the first pedestrian re-identification model, and the corresponding features are extracted and saved. Clustering is then performed with the kmeans algorithm based on the feature space distribution characteristics of the data set, and hard negatives and semi-hard negatives are screened out. The method comprises the following steps: (1) loading the feature files and annotation files corresponding to the data set; (2) screening out the sample IDs whose number of images is greater than n, for example n >= 10; (3) selecting the corresponding features select_features according to the result of step (2); (4) clustering select_features with the kmeans algorithm, the number of clusters being the number of categories corresponding to the samples selected in step (2); that is, in this embodiment, after the features of the original training set are extracted by the first pedestrian re-recognition model, the features corresponding to samples whose per-category image count exceeds the preset number may be screened out and the clustering operation performed on the screened features; (5) comparing the clustering result with the real sample labels, and adding the samples that differ to the hard negatives; (6) screening semi-hard negatives based on the BvSB criterion: a) calculating the distances between each sample and all cluster centers; b) converting the distances into probabilities probs, where ε is a small number, for example ε = 0.0001 in this embodiment; c) selecting the maximum probability p1 and the second-largest probability p2 from probs; d) if p1 - p2 <= thr, adding the sample to the semi-hard negatives, where thr = 0.2; (7) screening semi-hard negatives based on the distribution of the samples in feature space: a) selecting the samples of each class based on the clustering result; b) calculating the distances dists between each sample and the cluster center; c) selecting the maximum distance d_max and the minimum distance d_min from dists; d) calculating the threshold d_thr = (d_max - d_min) * ratio + d_min, where this embodiment may take ratio = 0.5; e) adding the samples in dists that are greater than d_thr to the semi-hard negatives.
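Putting the steps together, the following sketch reuses the helper functions from the earlier sketches and the example values n >= 10, thr = 0.2 and ratio = 0.5; as before, the distance-to-probability conversion inside the BvSB helper is a stand-in rather than the exact formula of this embodiment.

```python
import numpy as np
from sklearn.cluster import KMeans

features = np.load("train_features.npy")     # hypothetical file names
labels = np.load("train_labels.npy")

# (2)-(3) keep only identities with at least n = 10 images
counts = np.bincount(labels)
keep = np.isin(labels, np.where(counts >= 10)[0])
select_features, select_labels = features[keep], labels[keep]

# (4) one cluster per remaining identity
kmeans = KMeans(n_clusters=len(np.unique(select_labels)),
                random_state=0).fit(select_features)

# (5)-(7) reuse the helper sketches given earlier
hard = mine_hard_negatives(kmeans.labels_, select_labels)
semi_bvsb = mine_semi_hard_bvsb(select_features, kmeans.cluster_centers_, thr=0.2)
semi_dist = mine_semi_hard_by_distance(select_features, kmeans.labels_,
                                       kmeans.cluster_centers_, ratio=0.5)

mined = np.unique(np.concatenate([hard, semi_bvsb, semi_dist]))
# The images at indices `mined` are the samples appended to the original
# training set to form the target training set.
```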
Step S14: and adding the difficult sample to the original training set to obtain a target training set.
Step S15: and training the first pedestrian re-recognition model by using the target training set to obtain a second pedestrian re-recognition model.
It can be appreciated that the semi-hard negatives and hard negatives are added to the original data set to generate a new data set, which is then used for training. These samples effectively strengthen the "tight within classes, loose between classes" property pursued by a pedestrian re-recognition algorithm, i.e. a smaller intra-class distance and a larger inter-class distance; therefore, adding these samples to the original data set can improve the performance of the pedestrian re-recognition algorithm.
Step S16: and when the pedestrian image to be identified is acquired, outputting a corresponding identification result by using the second pedestrian re-identification model.
Therefore, the embodiment of the application first extracts features of the original training set by using the first pedestrian re-recognition model, where the original training set comprises pedestrian sample images and corresponding label information; clusters according to the feature space distribution characteristics of the original data set; screens out difficult samples according to the clustering result; adds the difficult samples to the original training set to obtain a target training set; trains the first pedestrian re-recognition model with the target training set to obtain a second pedestrian re-recognition model; and, when a pedestrian image to be identified is acquired, outputs the corresponding recognition result with the second pedestrian re-recognition model. In this way, difficult samples are mined and the spatial distribution of samples in the original data set is changed by the mined difficult samples, which increases the attention paid to difficult samples and improves the accuracy of pedestrian re-identification.
Referring to fig. 8, the embodiment of the application discloses a specific pedestrian re-recognition method, which comprises the following steps:
step S21: respectively training different initial models by using the same preset training set to obtain a plurality of trained models; different ones of the initial models are based on different pedestrian re-recognition algorithms.
In a specific embodiment, this example trains initial models corresponding to four different pedestrian re-recognition algorithms: AlignedReID++, ReID-strong-baseline, ReID-MGN, and OSNet. The initial models based on these four algorithms are trained multiple times with the NAIC ReID data set, and the best result of each algorithm is selected to obtain the corresponding trained model. Referring to fig. 9, fig. 9 is a flowchart of pedestrian re-recognition based on the AlignedReID++ algorithm according to an embodiment of the present application. Referring to fig. 10, fig. 10 is a flowchart of pedestrian re-recognition based on the ReID-strong-baseline algorithm according to an embodiment of the present application. Referring to fig. 11, fig. 11 is a flowchart of pedestrian re-recognition based on the ReID-MGN algorithm according to an embodiment of the present application. Referring to fig. 12, fig. 12 is a flowchart of pedestrian re-recognition based on the OSNet algorithm according to an embodiment of the present application.
Step S22: and evaluating all the trained models based on preset evaluation indexes, determining the trained model with the highest pedestrian re-recognition accuracy, and obtaining the first pedestrian re-recognition model.
In a specific embodiment, all the trained models may be evaluated based on Rank-1 and mAP@200, and the trained model with the highest pedestrian re-recognition accuracy is determined, so as to obtain the first pedestrian re-recognition model.
Specifically, the evaluation formula is:
score = λ*rank1 + (1-λ)*mAP;
where λ = 0.5 and rank1 denotes the average accuracy of the first returned result for each query image, i.e.
rank1 = (1 / |Q|) * Σ_{q ∈ Q} I(l_q, l_1^q);
where Q denotes the whole set of query images, l_i^q is the label of the i-th returned result from the corresponding image library (gallery) for query image q, and the indicator function I(l_q, l_i) indicates whether the labels of image q and image i are identical, i.e. I(l_q, l_i) = 1 if l_q = l_i and 0 otherwise.
And mAP@200 denotes the mean of the average precision over the first 200 returned results, i.e.
mAP@200 = (1 / |Q|) * Σ_{q ∈ Q} AP@200(q).
When calculating mAP, the following quantities are computed in turn: 1) P (precision): for a given probe picture, the proportion of the first k returned results whose ID (i.e. category) is the same as that of the query picture. 2) AP@n (average precision): the average precision over the first n returned results, counting precision only at the positions where the returned result is correct; here n_q is the number of correct returns among the first 200 results for query picture q. 3) mAP@n: the AP is computed for every probe picture and the results are averaged.
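As an illustration, the sketch below computes rank1 and mAP@n from per-query lists of returned gallery labels, following one common reading of the definitions above (precision counted only at correct positions within the first n returns); the function names and data layout are assumptions of the example.

```python
import numpy as np

def rank1(ranked_labels, query_labels):
    """Fraction of queries whose top-1 returned gallery label matches the
    query label, i.e. the indicator I(l_q, l_1) averaged over all queries."""
    hits = [int(r[0] == q) for r, q in zip(ranked_labels, query_labels)]
    return float(np.mean(hits))

def ap_at_n(ranked, q_label, n=200):
    """Average precision over the first n returns, counting precision only
    at the positions where the returned label is correct."""
    correct, precisions = 0, []
    for k, lbl in enumerate(ranked[:n], start=1):
        if lbl == q_label:
            correct += 1
            precisions.append(correct / k)
    return float(np.mean(precisions)) if precisions else 0.0

def map_at_n(ranked_labels, query_labels, n=200):
    """mAP@n: the AP@n averaged over all query images."""
    return float(np.mean([ap_at_n(r, q, n)
                          for r, q in zip(ranked_labels, query_labels)]))

# Final score as in the formula above: score = 0.5 * rank1 + 0.5 * mAP@200.
```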
That is, in this embodiment, the best model for the NAIC ReID dataset, i.e. the model with the highest accuracy, may be selected from the best results corresponding to the four algorithms according to the Rank-1 and mAP@200 evaluation indexes.
Step S23: extracting features of an original training set by using a first pedestrian re-recognition model; the original training set comprises pedestrian sample images and corresponding label information.
That is, the embodiment can obtain the model with highest accuracy of pedestrian re-identification for the original dataset, and perform an inference operation on the dataset by using the model to extract the features corresponding to the dataset.
Step S24: and clustering according to the feature space distribution characteristics of the original data set.
Step S25: and screening out difficult samples according to the clustering result.
Step S26: and adding the difficult sample to the original training set to obtain a target training set.
Step S27: and training the first pedestrian re-recognition model by using the target training set to obtain a second pedestrian re-recognition model.
Step S28: and when the pedestrian image to be identified is acquired, outputting a corresponding identification result by using the second pedestrian re-identification model.
For example, referring to fig. 13, fig. 13 is a sub-flowchart of pedestrian re-recognition according to an embodiment of the present application: selecting an optimal model for the data set according to the data set and the pedestrian re-identification algorithms; extracting the features of the data set by using the selected model and saving the features; clustering by using the kmeans algorithm based on the feature space distribution characteristics of the data set, and screening out hard negatives and semi-hard negatives; adding the semi-hard negatives and hard negatives to the original data set, generating a new data set, and training.
Referring to fig. 14, an embodiment of the present application discloses a pedestrian re-recognition apparatus, including:
the feature extraction module 11 is used for extracting features of the original training set by using the first pedestrian re-recognition model; the original training set comprises pedestrian sample images and corresponding label information;
a clustering module 12, configured to cluster according to the feature spatial distribution characteristics of the original dataset;
a difficult sample screening module 13, configured to screen out a difficult sample according to the clustering result;
a target training set acquisition module 14, configured to add the difficult sample to the original training set, to obtain a target training set;
the model training module 15 is configured to train the first pedestrian re-recognition model by using the target training set to obtain a second pedestrian re-recognition model;
and the image recognition module 16 is configured to output a corresponding recognition result by using the second pedestrian re-recognition model when the pedestrian image to be recognized is acquired.
Therefore, the application extracts features of the original training set by using the first pedestrian re-recognition model, where the original training set comprises pedestrian sample images and corresponding label information; clusters according to the feature space distribution characteristics of the original data set; screens out difficult samples according to the clustering result; adds the difficult samples to the original training set to obtain a target training set; trains the first pedestrian re-recognition model with the target training set to obtain a second pedestrian re-recognition model; and, when a pedestrian image to be identified is acquired, outputs the corresponding recognition result with the second pedestrian re-recognition model. In this way, difficult samples are mined and the spatial distribution of samples in the original data set is changed by the mined difficult samples, which increases the attention paid to difficult samples and improves the accuracy of pedestrian re-identification.
The device further comprises an initial model training module, wherein the initial model training module is used for respectively training different initial models by using the same preset training set to obtain a plurality of trained models; different ones of the initial models are based on different pedestrian re-recognition algorithms.
The device further comprises a training model evaluation module, wherein the training model evaluation module is used for evaluating all the trained models based on preset evaluation indexes, determining the trained model with the highest pedestrian re-recognition accuracy, and obtaining the first pedestrian re-recognition model.
The clustering module 12 is specifically configured to perform clustering according to the feature space distribution characteristics of the original dataset by using a Kmeans algorithm;
a difficult sample screening module 13 comprising a first screening sub-module, a second screening sub-module, and a third screening sub-module; wherein,
the first screening submodule is specifically used for screening samples with clustering results different from the tag information to obtain Hard negotives samples.
The second screening submodule is specifically used for screening out semi-Hard negative samples based on the BvSB criterion. In a specific embodiment, the second screening submodule is specifically configured to calculate a first distance between any sample and each cluster center; converting the first distance into a corresponding probability value; determining a maximum probability value and a next-largest probability value from all the probability values of any sample; judging whether the difference value between the maximum probability value and the next-largest probability value corresponding to the current sample is smaller than a preset threshold value; and if the difference value between the maximum probability value and the next-largest probability value corresponding to the current sample is smaller than the preset threshold value, judging that the current sample is a semi-Hard negative sample.
The third screening submodule is specifically used for calculating a second distance between each sample in the same cluster and the current cluster center; screening out a maximum distance and a minimum distance from the second distances; determining a distance threshold by using the maximum distance and the minimum distance; judging whether any of the second distances is larger than the distance threshold; and if the second distance is greater than the distance threshold, determining the corresponding sample as a semi-Hard negative sample.
Referring to fig. 15, an embodiment of the present application discloses a pedestrian re-recognition apparatus including a processor 21 and a memory 22; wherein the memory 22 is used for storing a computer program; the processor 21 is configured to execute the computer program to implement the pedestrian re-recognition method disclosed in the foregoing embodiment.
For the specific process of the pedestrian re-recognition method, reference may be made to the corresponding content disclosed in the foregoing embodiment, and no further description is given here.
Further, the embodiment of the application also discloses a computer readable storage medium for storing a computer program, wherein the computer program is executed by a processor to realize the pedestrian re-recognition method disclosed in the previous embodiment.
For the specific process of the pedestrian re-recognition method, reference may be made to the corresponding content disclosed in the foregoing embodiment, and no further description is given here.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, so that the same or similar parts between the embodiments are referred to each other. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software modules may be disposed in Random Access Memory (RAM), memory, read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above detailed description of the pedestrian re-recognition method, device, apparatus and medium provided by the present application applies specific examples to illustrate the principles and embodiments of the present application, and the above examples are only used to help understand the method and core ideas of the present application; meanwhile, as those skilled in the art will have variations in the specific embodiments and application scope in accordance with the ideas of the present application, the present description should not be construed as limiting the present application in view of the above.

Claims (8)

1. A pedestrian re-recognition method, characterized by comprising:
extracting features of an original training set by using a first pedestrian re-recognition model; the original training set comprises pedestrian sample images and corresponding label information;
clustering according to the feature space distribution characteristics of the original data set;
screening out difficult samples according to the clustering result;
adding the difficult sample to the original training set to obtain a target training set;
training the first pedestrian re-recognition model by using the target training set to obtain a second pedestrian re-recognition model;
when the pedestrian image to be identified is acquired, outputting a corresponding identification result by using the second pedestrian re-identification model;
wherein, the screening the difficult sample according to the clustering result includes:
calculating a second distance between each sample in the same cluster and the current cluster center;
screening out a maximum distance and a minimum distance from the second distances;
determining a distance threshold by using the maximum distance and the minimum distance;
judging whether any of the second distances is larger than the distance threshold;
if the second distance is greater than the distance threshold, determining the corresponding sample as a semi-Hard negative sample;
and after the features of the original training set are extracted by using the first pedestrian re-recognition model, screening out the features corresponding to samples whose number of images of the same category is larger than a preset number in the original training set, performing a clustering operation on the screened features, and screening out samples with clustering results different from the label information to obtain Hard negative samples.
2. The pedestrian re-recognition method of claim 1, wherein prior to extracting features of the original training set using the first pedestrian re-recognition model, further comprising:
respectively training different initial models by using the same preset training set to obtain a plurality of trained models; different initial models are based on different pedestrian re-recognition algorithms;
and evaluating all the trained models based on preset evaluation indexes, determining the trained model with the highest pedestrian re-recognition accuracy, and obtaining the first pedestrian re-recognition model.
3. The pedestrian re-recognition method of claim 1, wherein the clustering according to the feature space distribution characteristics of the original dataset comprises:
and clustering by using a Kmeans algorithm according to the feature space distribution characteristics of the original data set.
4. The pedestrian re-recognition method according to claim 1, wherein the screening out the difficult sample according to the clustering result includes:
screening out semi-Hard negative samples based on the BvSB criterion.
5. The pedestrian re-recognition method of claim 4, wherein the screening out semi-Hard negative samples based on the BvSB criterion comprises:
calculating a first distance between any sample and each cluster center;
converting the first distance into a corresponding probability value;
determining a maximum probability value and a next-largest probability value from all the probability values of any sample;
judging whether the difference value between the maximum probability value and the next-largest probability value corresponding to the current sample is smaller than a preset threshold value;
and if the difference value between the maximum probability value and the next-largest probability value corresponding to the current sample is smaller than the preset threshold value, judging that the current sample is a semi-Hard negative sample.
6. A pedestrian re-recognition device, characterized by comprising:
the feature extraction module is used for extracting features of the original training set by using the first pedestrian re-recognition model; the original training set comprises pedestrian sample images and corresponding label information;
the clustering module is used for clustering according to the feature space distribution characteristics of the original data set;
the difficult sample screening module is used for screening out difficult samples according to the clustering result;
the target training set acquisition module is used for adding the difficult sample to the original training set to obtain a target training set;
the model training module is used for training the first pedestrian re-recognition model by utilizing the target training set to obtain a second pedestrian re-recognition model;
the image recognition module is used for outputting a corresponding recognition result by using the second pedestrian re-recognition model when the pedestrian image to be recognized is acquired;
wherein the difficult sample screening module comprises a third screening sub-module; the third screening submodule is specifically used for calculating a second distance between each sample in the same cluster and the current cluster center; screening out a maximum distance and a minimum distance from the second distances; determining a distance threshold by using the maximum distance and the minimum distance; judging whether any of the second distances is larger than the distance threshold; if the second distance is greater than the distance threshold, determining the corresponding sample as a semi-Hard negative sample;
and the device is further used for screening out the features corresponding to samples whose number of images of the same category is larger than a preset number in the original training set after the features of the original training set are extracted by using the first pedestrian re-recognition model, carrying out a clustering operation on the screened features, and screening out samples with clustering results different from the label information to obtain Hard negative samples.
7. A pedestrian re-recognition device comprising a processor and a memory; wherein,
the memory is used for storing a computer program;
the processor for executing the computer program to implement the pedestrian re-recognition method as claimed in any one of claims 1 to 5.
8. A computer readable storage medium for storing a computer program, wherein the computer program when executed by a processor implements the pedestrian re-recognition method of any one of claims 1 to 5.
CN202010605966.3A 2020-06-29 2020-06-29 Pedestrian re-identification method, device, equipment and medium Active CN111881757B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010605966.3A CN111881757B (en) 2020-06-29 2020-06-29 Pedestrian re-identification method, device, equipment and medium
PCT/CN2021/076976 WO2022001137A1 (en) 2020-06-29 2021-02-20 Pedestrian re-identification method, apparatus and device and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010605966.3A CN111881757B (en) 2020-06-29 2020-06-29 Pedestrian re-identification method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN111881757A CN111881757A (en) 2020-11-03
CN111881757B true CN111881757B (en) 2023-09-01

Family

ID=73157324

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010605966.3A Active CN111881757B (en) 2020-06-29 2020-06-29 Pedestrian re-identification method, device, equipment and medium

Country Status (2)

Country Link
CN (1) CN111881757B (en)
WO (1) WO2022001137A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111881757B (en) * 2020-06-29 2023-09-01 浪潮电子信息产业股份有限公司 Pedestrian re-identification method, device, equipment and medium
CN112508130A (en) * 2020-12-25 2021-03-16 商汤集团有限公司 Clustering method and device, electronic equipment and storage medium
CN112884040B (en) * 2021-02-19 2024-04-30 北京小米松果电子有限公司 Training sample data optimization method, system, storage medium and electronic equipment
CN113095174B (en) * 2021-03-29 2024-07-23 深圳力维智联技术有限公司 Re-identification model training method, device, equipment and readable storage medium
CN114724090B (en) * 2022-05-23 2022-08-30 北京百度网讯科技有限公司 Training method of pedestrian re-identification model, and pedestrian re-identification method and device
CN116188919B (en) * 2023-04-25 2023-07-14 之江实验室 Test method and device, readable storage medium and electronic equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107194994A (en) * 2017-06-16 2017-09-22 广东工业大学 A kind of method and device without the demarcation surface points cloud data reconstruction face of cylinder
CN109145766A (en) * 2018-07-27 2019-01-04 北京旷视科技有限公司 Model training method, device, recognition methods, electronic equipment and storage medium
CN109445844A (en) * 2018-11-05 2019-03-08 浙江网新恒天软件有限公司 Code Clones detection method based on cryptographic Hash, electronic equipment, storage medium
CN109871461A (en) * 2019-02-13 2019-06-11 华南理工大学 The large-scale image sub-block search method to be reordered based on depth Hash network and sub-block
CN109993179A (en) * 2017-12-29 2019-07-09 北京京东尚科信息技术有限公司 The method and apparatus that a kind of pair of data are clustered
CN110956255A (en) * 2019-11-26 2020-04-03 中国医学科学院肿瘤医院 Difficult sample mining method and device, electronic equipment and computer readable storage medium
WO2020098121A1 (en) * 2018-11-13 2020-05-22 平安科技(深圳)有限公司 Method and device for training fast model, computer apparatus, and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11062179B2 (en) * 2017-11-02 2021-07-13 Royal Bank Of Canada Method and device for generative adversarial network training
CN109829441B (en) * 2019-02-19 2020-08-21 山东大学 Facial expression recognition method and device based on course learning
CN111027442A (en) * 2019-12-03 2020-04-17 腾讯科技(深圳)有限公司 Model training method, recognition method, device and medium for pedestrian re-recognition
CN111881757B (en) * 2020-06-29 2023-09-01 浪潮电子信息产业股份有限公司 Pedestrian re-identification method, device, equipment and medium

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107194994A (en) * 2017-06-16 2017-09-22 广东工业大学 A kind of method and device without the demarcation surface points cloud data reconstruction face of cylinder
CN109993179A (en) * 2017-12-29 2019-07-09 北京京东尚科信息技术有限公司 The method and apparatus that a kind of pair of data are clustered
CN109145766A (en) * 2018-07-27 2019-01-04 北京旷视科技有限公司 Model training method, device, recognition methods, electronic equipment and storage medium
CN109445844A (en) * 2018-11-05 2019-03-08 浙江网新恒天软件有限公司 Code Clones detection method based on cryptographic Hash, electronic equipment, storage medium
WO2020098121A1 (en) * 2018-11-13 2020-05-22 平安科技(深圳)有限公司 Method and device for training fast model, computer apparatus, and storage medium
CN109871461A (en) * 2019-02-13 2019-06-11 华南理工大学 The large-scale image sub-block search method to be reordered based on depth Hash network and sub-block
CN110956255A (en) * 2019-11-26 2020-04-03 中国医学科学院肿瘤医院 Difficult sample mining method and device, electronic equipment and computer readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
IMPROVING PERSON RE-IDENTIFICATION BY ADAPTIVE HARD SAMPLE MINING; Kezhou Chen et al.; ICIP 2018; full text *

Also Published As

Publication number Publication date
WO2022001137A1 (en) 2022-01-06
CN111881757A (en) 2020-11-03

Similar Documents

Publication Publication Date Title
CN111881757B (en) Pedestrian re-identification method, device, equipment and medium
US10650237B2 (en) Recognition process of an object in a query image
EP2701098A2 (en) Region refocusing for data-driven object localization
Bi et al. Person re-identification using multiple experts with random subspaces
US9165184B2 (en) Identifying matching images
JP2013541119A (en) System and method for improving feature generation in object recognition
CN103440508B (en) The Remote Sensing Target recognition methods of view-based access control model word bag model
Sumbul et al. Informative and representative triplet selection for multilabel remote sensing image retrieval
CN111914921A (en) Similarity image retrieval method and system based on multi-feature fusion
US8488873B2 (en) Method of computing global-to-local metrics for recognition
US11977607B2 (en) CAM-based weakly supervised learning object localization device and method
US20090279792A1 (en) Image search method and device
Farhangi et al. Improvement the bag of words image representation using spatial information
US20140270541A1 (en) Apparatus and method for processing image based on feature point
CN110704667A (en) Semantic information-based rapid similarity graph detection algorithm
Yamamoto et al. A proposal for the global and collaborative PBL learning environment where all global members on different campuses are
Sanin et al. K-tangent spaces on Riemannian manifolds for improved pedestrian detection
Viitaniemi et al. Keyword-detection approach to automatic image annotation
Kanjanawattana et al. Extraction and identification of bar graph components by automatic epsilon estimation
CN113657180A (en) Vehicle identification method, server and computer readable storage medium
Schlegel et al. Adding cues to binary feature descriptors for visual place recognition
Tan et al. Dense invariant feature based support vector ranking for person re-identification
Bhowmik et al. OCR performance prediction using a bag of allographs and support vector regression
CN118038282B (en) Tunnel defect detection method and equipment
CN117422890B (en) Optimized deployment method, system and medium for visual deep learning model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant