CN112434599B - Pedestrian re-identification method based on random occlusion recovery of noise channel - Google Patents

Pedestrian re-identification method based on random occlusion recovery of a noise channel

Info

Publication number
CN112434599B
Authority
CN
China
Prior art keywords
network
pedestrian
noise
data
noise channel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011321451.7A
Other languages
Chinese (zh)
Other versions
CN112434599A (en)
Inventor
黄德双
张焜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University filed Critical Tongji University
Priority to CN202011321451.7A priority Critical patent/CN112434599B/en
Publication of CN112434599A publication Critical patent/CN112434599A/en
Priority to JP2021087114A priority patent/JP7136500B2/en
Application granted granted Critical
Publication of CN112434599B publication Critical patent/CN112434599B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/30Noise filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a pedestrian re-identification method based on random occlusion recovery with a noise channel, which comprises the following steps. Step 1: after data division and preprocessing of a reference data set, construct a CAN network structure, use it to perform data expansion on the training set obtained from the data division and preprocessing, and train a basic network main body feature extraction structure with the expanded training set to obtain a trained basic network main body feature extraction structure. Step 2: construct a noise channel structure for the label errors. Step 3: build the pedestrian re-identification network based on random occlusion recovery of a noise channel from the trained basic network main body feature extraction structure, the noise channel structure and the CAN network structure. Step 4: identify the actual original image to be detected with this pedestrian re-identification network. Compared with the prior art, the method has the advantages of good network robustness, high accuracy and low error.

Description

Pedestrian re-identification method based on random occlusion recovery of noise channel
Technical Field
The invention relates to the technical field of computer vision, in particular to a pedestrian re-identification method based on random occlusion recovery of a noise channel.
Background
The basic task of a distributed multi-camera surveillance system is to associate people across camera views at different locations and different times. This is called the pedestrian re-identification problem; more specifically, pedestrian re-identification mainly answers the questions "where has a target pedestrian appeared before" and "where did a target pedestrian go after being captured in the monitoring network". It supports many critical applications such as long-term multi-camera tracking and forensic search. In practice, each camera shoots under different illumination conditions, degrees of occlusion, and different static and dynamic backgrounds, from different angles and distances. This presents a number of significant challenges for the pedestrian re-identification task. Meanwhile, re-identification techniques that rely on traditional biometrics such as face recognition are neither feasible nor reliable, since pedestrians observed by cameras at unknown distances are subject to limitations such as crowded backgrounds and low resolution.
The traditional pedestrian re-identification technology mainly comprises two aspects: feature expression and similarity measurement. Common features mainly include color features, texture features, shape features, higher-level attribute features, behavior semantic features and the like. For the similarity measurement, the Euclidean distance was used first, and later some supervised similarity discrimination methods were also proposed.
With the development of deep learning, methods based on deep models now dominate the field of pedestrian re-identification. At the present stage, deep models for pedestrian re-identification are mainly divided into three types: the Identification model, the Verification model and the Triplet model. The Identification model is the same as a classification model on other tasks: given a picture, it outputs the label of the picture, so it can fully utilize the annotation information of a single image. The Verification model takes two pictures as input and outputs whether they show the same pedestrian; it uses only a weak label (the relationship between two pedestrian images) and does not use the annotation information of a single picture. Similarly, the Triplet model takes three pictures as input and pulls the anchor and the positive sample closer while pushing the negative sample away, but the label information of a single picture is likewise not used.
In the aspect of feature extraction, deep models abandon the traditional way of manually designing features; a network model and structural modules are designed based on a convolutional neural network to learn the features automatically. Classical network structures include GoogLeNet, ResNet and DenseNet, among others. Common feature extraction structures include the abstraction structure, the feature pyramid, the attention structure and the like.
Against this background, the invention designs a network model for random occlusion recovery based on a noise channel. Discriminative features (both global and local) and strengthened spatial-relationship learning can be extracted through multi-scale representation learning, while the random batch mask strategy employs a random masking and attention mechanism to mitigate the suppression of local detail features.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a pedestrian re-identification method based on random occlusion recovery of a noise channel.
The purpose of the invention can be realized by the following technical scheme:
A pedestrian re-identification method based on random occlusion recovery of a noise channel comprises the following steps:
Step 1: after data division and preprocessing are carried out on a reference data set, a CAN network structure for occlusion recovery is constructed, data expansion is carried out with the CAN network structure on the training set obtained from the data division and preprocessing, and the basic network main body feature extraction structure is trained with the expanded training set to obtain a trained basic network main body feature extraction structure;
Step 2: a noise channel structure for reducing the label errors caused by the data expansion is constructed;
Step 3: a pedestrian re-identification network based on random occlusion recovery of a noise channel is built from the trained basic network main body feature extraction structure, the noise channel structure and the CAN network structure for occlusion recovery;
Step 4: the actual original image to be detected is identified with the pedestrian re-identification network based on random occlusion recovery of a noise channel.
Further, the step 1 comprises the following sub-steps:
Step 101: dividing a reference data set into a training set and a test set, then randomly extracting picture data from the training set and carrying out the preprocessing operation;
Step 102: constructing a CAN network structure for occlusion recovery and further performing data expansion on the training set by using the CAN network structure;
Step 103: setting the parameters and corresponding formulas required for training the network model;
Step 104: after the setting is finished, inputting the picture data obtained from the preprocessing operation and the data expansion into the basic network main body feature extraction structure for training, to obtain the trained basic network main body feature extraction structure.
Further, the reference data set in step 101 is the Market1501 data set; the preprocessing operation in step 101 comprises horizontal flipping, noise addition or random erasing; the basic network main body feature extraction structure in step 104 is the ResNet50 network structure.
Further, in step 104, while the picture data obtained after the preprocessing operation and the data expansion are input into the basic network main body feature extraction structure for training, the parameters are adjusted automatically with the Adam optimization method, the Dropout strategy is used to avoid over-fitting, and Batch Normalization is used to accelerate the convergence of the network.
Further, step 103 specifically includes: setting the total training period (epoch) to 150, the weight decay parameter to 0.0005 and the batch size to 180, and setting the learning rate update mode, whose description formula is as follows:
(learning rate update formula, rendered as an image in the original document)
in the formula, lr is a learning rate.
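For readers implementing this training step, the following is a minimal PyTorch sketch of the setup described above (Adam optimization with weight decay 0.0005, Dropout, Batch Normalization inside the backbone, batch size 180 and 150 epochs). The exact learning rate schedule only appears as an image in the original document, so the step decay, the initial learning rate and the train_loader/model definitions below are illustrative assumptions, not the patent's specification.

import torch
import torch.nn as nn
from torchvision import models

# Backbone: ResNet50 (BatchNorm built in) with a Dropout layer before the classifier head.
num_ids = 751  # number of pedestrian identities in the Market1501 training set
backbone = models.resnet50(pretrained=True)
backbone.fc = nn.Sequential(nn.Dropout(p=0.5), nn.Linear(2048, num_ids))

criterion = nn.CrossEntropyLoss()
# Adam with the weight decay quoted in the patent (0.0005); the lr value is an assumption.
optimizer = torch.optim.Adam(backbone.parameters(), lr=3.5e-4, weight_decay=5e-4)
# Placeholder schedule: the patent's exact lr update rule is an image, so a step decay is assumed.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=40, gamma=0.1)

def train(model, train_loader, epochs=150, device="cuda"):
    model.to(device).train()
    for epoch in range(epochs):
        for images, labels in train_loader:      # batches of size 180 in the patent's setting
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()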
Further, the CAN network structure for occlusion recovery in step 1 is composed of a generator network for learning the original data set and generating an image and a discriminator network for determining whether the input image is real, i.e. whether the input image belongs to the original data set or is generated by the generator network, and the corresponding mathematical description formula is as follows:
\min_G \max_D \; \mathbb{E}_{x,y}\left[\log D(x, y)\right] + \mathbb{E}_{x}\left[\log\left(1 - D(x, G(x))\right)\right]
where x is the occlusion image, y is the target image, and D and G represent the discriminator network and the generator network, respectively.
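To illustrate this adversarial objective, a minimal PyTorch training-step sketch follows. The generator and discriminator modules are assumed to be any image-to-image generator and a conditional discriminator that takes the (occluded image, candidate image) pair; their architectures are not specified in this section, so this is a generic sketch rather than the patent's implementation.

import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()

def cgan_step(generator, discriminator, g_opt, d_opt, x_occluded, y_target):
    # One optimization step of the conditional GAN objective:
    # D tries to tell (x, y_real) from (x, G(x)); G tries to fool D.
    d_opt.zero_grad()
    real_logits = discriminator(x_occluded, y_target)
    fake_image = generator(x_occluded).detach()
    fake_logits = discriminator(x_occluded, fake_image)
    d_loss = bce(real_logits, torch.ones_like(real_logits)) + \
             bce(fake_logits, torch.zeros_like(fake_logits))
    d_loss.backward()
    d_opt.step()

    g_opt.zero_grad()
    fake_image = generator(x_occluded)
    g_loss = bce(discriminator(x_occluded, fake_image),
                 torch.ones_like(real_logits))
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()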
Further, the process of using the noise channel structure in step 2 to reduce the label errors caused by the data expansion specifically includes:
Step 201: using the noise channel structure, defining a distribution over the transition probability between the original label of the generated image data and the observed noise label;
Step 202: solving the distribution with an EM algorithm to obtain the implicit parameters, and using the implicit parameters to reduce the label errors caused by the data expansion.
Further, the distribution in step 201 is described by the formula:
p(z_t = j \mid x_t; \theta, w) = \begin{cases} p(y_t = j \mid x_t; w), & t \in C \\ \sum_{i=1}^{k} \theta(i, j)\, p(y_t = i \mid x_t; w), & t \in N \end{cases}
in the formula, z is a noise label, N is a noise label set, theta and w are implicit parameters, C is a clean label set, k is the number of categories, and p is the predicted label probability.
Further, the process of obtaining the hidden parameters by using the EM algorithm for the distribution in step 202 includes fixing the hidden parameters θ and w and estimating a transition probability in step E, and updating the parameter θ in step M, where the estimated transition probability corresponds to a description formula:
c_{ti} = p(y_t = i \mid x_t, z_t; \theta, w) = \frac{p(z_t \mid y_t = i; \theta)\, p(y_t = i \mid x_t; w)}{\sum_{j=1}^{k} p(z_t \mid y_t = j; \theta)\, p(y_t = j \mid x_t; w)}
in the formula, c_{ti} is the estimated probability that the true label of the t-th sample is i, y_t is the true label information of the t-th sample, x_t is the t-th input sample, and z_t is the noise label of the t-th input sample;
the corresponding description formula of the update parameter θ is:
\theta(i, j) = \frac{\sum_{t} c_{ti}\, \mathbb{1}[z_t = j]}{\sum_{t} c_{ti}}
in the equation, θ (i, j) is the true transition probability from tag i to tag j.
Further, the objective function adopted in the EM algorithm has a corresponding description formula as follows:
S(w) = \sum_{t \in N} \sum_{i=1}^{k} c_{ti} \log p(y_t = i \mid x_t; w) + \sum_{t \in C} \log p(y_t = z_t \mid x_t; w)
in the formula, S (w) represents an objective function employed in the EM algorithm.
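To make the E step and M step concrete, the NumPy sketch below shows one EM round under the formulation reconstructed above. The inputs pred (the classifier's predicted label probabilities p(y|x; w)), the observed noisy labels z and the transition matrix theta are assumed quantities for illustration, and the update of the classifier parameters w is left to ordinary gradient training on S(w).

import numpy as np

def em_round(pred, z, theta):
    # pred : (n, k) array, p(y = i | x_t; w) from the current classifier
    # z    : (n,)  integer array, observed (possibly noisy) labels of the generated samples
    # theta: (k, k) array, theta[i, j] approximates p(z = j | y = i)
    # Returns the posterior c[t, i] = p(y_t = i | x_t, z_t) and the updated theta.
    n, k = pred.shape
    # E step: posterior over the true label given the observed noisy label.
    c = pred * theta[:, z].T            # c[t, i] proportional to p(z_t | y=i) * p(y=i | x_t)
    c /= c.sum(axis=1, keepdims=True)
    # M step: re-estimate the transition probabilities theta(i, j).
    new_theta = np.zeros_like(theta)
    for j in range(k):
        new_theta[:, j] = c[z == j].sum(axis=0)
    new_theta /= new_theta.sum(axis=1, keepdims=True) + 1e-12
    return c, new_theta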
Compared with the prior art, the invention has the following advantages:
(1) The method uses deep learning. A training-set picture is first preprocessed by operations such as flipping and cropping, features are then extracted with a basic network model (ResNet50), and a random batch mask training strategy and multi-scale representation learning are applied to the high-dimensional features extracted by the ResNet50 network. This yields feature information that is more discriminative, more detailed and contains the spatial relevance of pedestrians, and the network is then trained jointly with a fusion of multiple loss functions.
(2) The invention uses the restored occlusion images to expand the data set and introduces a label noise channel, thereby mitigating the errors brought by the expanded data and improving the robustness of the network.
Drawings
Fig. 1 is a network overall block diagram of a pedestrian re-identification technology based on random occlusion recovery of a noise channel according to an embodiment of the present invention.
Fig. 2 is a network training flowchart of a pedestrian re-identification technology based on random occlusion recovery of a noise channel according to an embodiment of the present invention.
Fig. 3 is a flowchart illustrating evaluation of a result of a pedestrian re-identification technique based on random occlusion recovery of a noise channel according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.
The invention relates to a pedestrian re-identification technique based on random occlusion recovery with a noise channel, which achieves a more accurate and efficient pedestrian re-identification task on several reference data sets. The pedestrian re-identification task establishes relationships between pedestrian images or video samples collected by different cameras with non-overlapping fields of view, i.e., it identifies whether pedestrians captured by cameras at different positions and at different moments are the same person. Traditional pedestrian re-identification mainly comprises two steps: pedestrian feature expression and pedestrian similarity judgment.
Building on pedestrian re-identification algorithms based on deep learning, the invention provides a person re-identification method based on random occlusion recovery with a noise channel. A random occlusion block is added to the original image, the image is repaired with a GAN model, and the repaired images are then used to expand the original training set. The baseline model is trained with the enhanced data set, and the label errors of the augmented images are mitigated by the noise channel.
Practical embodiments
1. Basic technical scheme
The invention relates to a pedestrian re-identification technique based on random occlusion recovery of a noise channel; as shown in figure 1, its main implementation depends on the following parts:
1) Dividing a training set and a test set for an original data set;
2) A basic network main body feature extraction structure;
3) A noise channel structure;
4) A GAN network structure for occlusion recovery;
5) Network hyper-parameter adjustment, including the iteration step size adjustment method, the initial iteration step size, the choice of learning functions and the like;
6) Selecting a loss function, wherein different loss functions are used for different structures;
7) The whole technical method is written based on PyTorch, python and some auxiliary libraries.
Step 1) of the above 7 steps specifically includes: dividing the reference data set into a training set and a test set. Taking the Market1501 data set as an example, 12,936 pictures of 751 pedestrian IDs are used as the training set, and 19,732 pictures of the other 750 pedestrian IDs plus some background pictures are used as the test set.
On this basis, the data set is processed further: part of the training set is split off as a validation set, in order to monitor the training process and effectively obtain the optimal state. The test set is divided into two parts, query and gallery.
The trained network is used to extract features from the pictures in the query set and the candidate (gallery) set, the pairwise Euclidean distances between the extracted features are computed, and the results are sorted by distance, yielding the candidate-set pictures closest to each query picture.
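A minimal sketch of this retrieval step is given below; extract_features is a hypothetical helper that runs the trained backbone over a set of images and returns one feature vector per image.

import torch

def rank_gallery(query_feats, gallery_feats):
    # query_feats: (q, d) tensor, gallery_feats: (g, d) tensor.
    # Returns, for every query, gallery indices sorted by Euclidean distance.
    dist = torch.cdist(query_feats, gallery_feats, p=2)   # pairwise Euclidean distances
    return dist.argsort(dim=1)                            # closest gallery images first

# usage (hypothetical helpers):
# query_feats = extract_features(model, query_images)
# gallery_feats = extract_features(model, gallery_images)
# ranking = rank_gallery(query_feats, gallery_feats)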
Step 2) of the above 7 steps specifically includes: selecting a mature, well-performing network for the experiments and for exploring and comparing results. The ResNet50 network structure is adopted; ResNet learns residuals through shortcut connections, which alleviates the degradation problem that appears as the network depth increases.
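To illustrate the shortcut connection mentioned here, a minimal residual block in PyTorch is sketched below; it is a generic basic block, not the exact bottleneck block used inside ResNet50.

import torch.nn as nn

class ResidualBlock(nn.Module):
    # y = F(x) + x: the block only has to learn the residual F(x).
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + x)   # shortcut (identity) connection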
Step 3) of the above 7 steps specifically includes: for a generated image, the original label cannot be directly regarded as a genuine label. For the observed noise label, the transition probabilities between the noise label and the real label need to be learned; among all training images, the labels of the original data are considered clean, while the labels of the generated data are noisy. For the observed labels, a distribution is given and the hidden parameters are solved with the EM algorithm;
step 4) in the above 7 steps, specifically including: the generation countermeasure network (GAN) adopts the idea of two-person zero-sum game, and is composed of two parts, namely a generation network and a discrimination network. The GAN is used to learn the original data set and generate an image, while the discriminator network is used to determine whether the input image is authentic (original data set) or counterfeit (generated by the generator network). Both networks are trained simultaneously. The purpose is to make the discrimination model unable to distinguish the authenticity of the generated image. In the technical scheme of the invention, using the condition GAN [15], the mathematical expression of the optimization target is as follows:
\min_G \max_D \; \mathbb{E}_{x,y}\left[\log D(x, y)\right] + \mathbb{E}_{x}\left[\log\left(1 - D(x, G(x))\right)\right]
where x is the occlusion image, y is the target image, and D and G represent the discriminator network and the generator network, respectively.
In the technical scheme of the invention, aiming at a ResNet50 network structure, in order to solve the difficulty of SGD parameter selection, an Adam optimization method is used for automatically adjusting parameters. The strategy of Dropout is used to avoid the occurrence of the over-fitting condition, and Batch Normalization is used to accelerate the convergence speed of the network.
The method is characterized in that a total training period (epoch) is set to be 150, a weight decay parameter (weight decay) is 0.0005, a batch size (batch size) is 180, and a learning rate updating mode is as follows:
(learning rate update formula, rendered as an image in the original document)
in the formula, lr is a learning rate.
Step 7) of the above 7 steps specifically includes: PyTorch uses dynamic computation graphs, which makes it easy to implement the proposed network construction.
2. Practical implementation
The embodiment of the invention is realized as follows: a pedestrian re-identification technique based on random occlusion recovery of a noise channel comprises the following steps:
the reference data set needs data preprocessing for data expansion, and the following data processing modes are used
1) Randomly extracting a plurality of pictures in a data set and adding Gaussian noise processing
2) Randomly extracting a plurality of pictures from the data set and adding a rectangular occlusion block at a random position on each picture, with the length and width of the block randomly chosen between 2 cm and 5 cm. To make the rectangle occlude the pedestrian as much as possible, the image is divided into three columns from left to right and the center of the rectangle is chosen randomly within the middle column. The pixel values of the R, G and B channels of the occlusion block are chosen among 0, 255 and the mean values of the data set; on the Market-1501 data set the pixel means are 89.3, 102.5 and 98.7. The occlusion images are restored by CycleGAN.
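The occlusion step can be sketched as follows. The block size range in pixels and the per-channel fill convention are assumptions for illustration, since the patent specifies the block size in centimetres and only states that the fill value is chosen among 0, 255 and the dataset means.

import random
import numpy as np

MARKET1501_MEAN = (89.3, 102.5, 98.7)   # per-channel pixel means quoted for Market-1501

def add_random_occlusion(img, min_size=20, max_size=50):
    # img: HxWx3 uint8 array. Places a rectangular occlusion block whose centre
    # falls in the middle third of the image, filled with 0, 255 or the dataset mean.
    h, w, _ = img.shape
    bh = random.randint(min_size, max_size)
    bw = random.randint(min_size, max_size)
    cx = random.randint(w // 3, 2 * w // 3)          # centre restricted to the middle column
    cy = random.randint(bh // 2, h - bh // 2)
    fill = random.choice([(0, 0, 0), (255, 255, 255), MARKET1501_MEAN])
    y0, y1 = max(cy - bh // 2, 0), min(cy + bh // 2, h)
    x0, x1 = max(cx - bw // 2, 0), min(cx + bw // 2, w)
    out = img.copy()
    out[y0:y1, x0:x1] = np.array(fill, dtype=img.dtype)
    return out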
A number of pictures are also randomly extracted from the training data for horizontal flipping, noise addition, random erasing and other processing. Meanwhile, for the 6 cameras of the Market1501 data set, CycleGAN is used to transfer the camera style between images from different cameras, which multiplies the size of the data set.
After the data set has been organized and processed as described above, the images are input into a convolutional neural network (ResNet50) for feature extraction. ResNet50 is used as the reference network model for reasons of parameter count and time, and since Market1501 is a pedestrian data set with a relatively large data volume, a network model pre-trained on ImageNet is used for the extraction.
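As a sketch of this feature-extraction stage, the classifier head of an ImageNet-pretrained ResNet50 can be removed so that the network outputs a 2048-dimensional feature per image; the input resolution and normalization constants below follow common ImageNet preprocessing and are assumptions, not values given in the patent.

import torch
import torch.nn as nn
from torchvision import models, transforms

# ImageNet-pretrained ResNet50 with the final classifier removed -> 2048-d features.
resnet50 = models.resnet50(pretrained=True)
feature_extractor = nn.Sequential(*list(resnet50.children())[:-1])
feature_extractor.eval()

preprocess = transforms.Compose([
    transforms.Resize((256, 128)),              # common pedestrian aspect ratio (assumption)
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def extract_feature(pil_image):
    x = preprocess(pil_image).unsqueeze(0)       # (1, 3, H, W)
    return feature_extractor(x).flatten(1)       # (1, 2048)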
For the whole network, joint training is performed by fusing an identification loss with a weighted list loss, and the whole model contains a feature learning structure with three branches. A feature map of the picture is extracted through each branch, and network training and weight updating are then performed through the combined loss.
For the label noise channel, the original label of a generated image cannot be directly regarded as a genuine label; for the observed noise label, the transition probabilities between the noise label and the real label need to be learned;
the tag of the original data is considered clean, while the tag of the generated data is noisy. For the observation tags, the following distributions are defined:
p(z_t = j \mid x_t; \theta, w) = \begin{cases} p(y_t = j \mid x_t; w), & t \in C \\ \sum_{i=1}^{k} \theta(i, j)\, p(y_t = i \mid x_t; w), & t \in N \end{cases}
in the formula, z is a noise label, N is a noise label set, theta and w are implicit parameters, C is a clean label set, k is the number of categories, and p is the predicted label probability.
Given the distribution, the implicit parameters are computed with an EM algorithm. In the E step, the parameters are fixed and the transition probabilities are estimated:
c_{ti} = p(y_t = i \mid x_t, z_t; \theta, w) = \frac{p(z_t \mid y_t = i; \theta)\, p(y_t = i \mid x_t; w)}{\sum_{j=1}^{k} p(z_t \mid y_t = j; \theta)\, p(y_t = j \mid x_t; w)}
in the formula, c_{ti} is the estimated probability that the true label of the t-th sample is i, y_t is the true label information of the t-th sample, x_t is the t-th input sample, and z_t is the noise label of the t-th input sample;
in step M, updating parameters:
\theta(i, j) = \frac{\sum_{t} c_{ti}\, \mathbb{1}[z_t = j]}{\sum_{t} c_{ti}}
in the equation, θ (i, j) is the true transition probability from tag i to tag j.
Finally, the objective function can be expressed as:
S(w) = \sum_{t \in N} \sum_{i=1}^{k} c_{ti} \log p(y_t = i \mid x_t; w) + \sum_{t \in C} \log p(y_t = z_t \mid x_t; w)
in the formula, S (w) represents an objective function employed in the EM algorithm.
The invention achieves the best recognition results at the current stage on the Market-1501 data set; the results on the Market-1501 data set are shown in Table 1.
TABLE 1 comparison of experiments on Market-1501 data set
(Table 1 is rendered as an image in the original document.)
As shown in FIG. 3, the evaluation shows that the pedestrian re-identification technique based on random occlusion recovery of a noise channel proposed by the invention reaches an mAP of 70.1, a rank-1 accuracy of 86.6 and a rank-5 accuracy of 94.6 on the Market1501 data set (without re-ranking). Good experimental results are also obtained on other data sets.
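For reference, rank-k accuracy and mAP can be computed from a distance-sorted gallery ranking as in the simplified sketch below; it ignores the same-camera and junk-image filtering normally applied in the official Market1501 evaluation protocol, which is an assumption made for brevity.

import numpy as np

def evaluate(ranking, query_ids, gallery_ids, ks=(1, 5)):
    # ranking: (q, g) array of gallery indices sorted by ascending distance.
    cmc = {k: 0.0 for k in ks}
    aps = []
    for q, order in enumerate(ranking):
        matches = (gallery_ids[order] == query_ids[q]).astype(np.float32)
        for k in ks:
            cmc[k] += float(matches[:k].any())
        hits = np.where(matches)[0]
        if len(hits) == 0:
            continue
        precisions = np.arange(1, len(hits) + 1) / (hits + 1)   # precision at each correct match
        aps.append(precisions.mean())
    n = len(ranking)
    return {f"rank-{k}": cmc[k] / n for k in ks}, float(np.mean(aps))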
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (6)

1. A pedestrian re-identification method based on random occlusion recovery of a noise channel, characterized by comprising the following steps:
Step 1: after data division and preprocessing are carried out on a reference data set, a CAN network structure for occlusion recovery is constructed, data expansion is carried out with the CAN network structure on the training set obtained from the data division and preprocessing, and the basic network main body feature extraction structure is trained with the expanded training set to obtain a trained basic network main body feature extraction structure;
Step 2: a noise channel structure for reducing the label errors caused by the data expansion is constructed;
Step 3: a pedestrian re-identification network based on random occlusion recovery of a noise channel is built from the trained basic network main body feature extraction structure, the noise channel structure and the CAN network structure for occlusion recovery;
Step 4: the actual original image to be detected is identified with the pedestrian re-identification network based on random occlusion recovery of a noise channel;
the process of using the noise channel structure in step 2 to reduce the label errors caused by the data expansion specifically comprises:
Step 201: using the noise channel structure, defining a distribution over the transition probability between the original label of the generated image data and the observed noise label;
Step 202: obtaining the hidden parameters for the distribution with an EM algorithm, and using the hidden parameters to reduce the label errors caused by the data expansion;
the distribution in step 201 is described by the formula:
p(z_t = j \mid x_t; \theta, w) = \begin{cases} p(y_t = j \mid x_t; w), & t \in C \\ \sum_{i=1}^{k} \theta(i, j)\, p(y_t = i \mid x_t; w), & t \in N \end{cases}
in the formula, z is a noise label, N is a noise label set, theta and w are implicit parameters, C is a clean label set, k is the number of categories, and p is the probability of a predicted label;
in the step 202, the process of obtaining the hidden parameters by using the EM algorithm for the distribution includes fixing the hidden parameters θ and w and estimating the transition probability in step E, and updating the parameter θ in step M, where the estimated transition probability corresponds to a description formula:
c_{ti} = p(y_t = i \mid x_t, z_t; \theta, w) = \frac{p(z_t \mid y_t = i; \theta)\, p(y_t = i \mid x_t; w)}{\sum_{j=1}^{k} p(z_t \mid y_t = j; \theta)\, p(y_t = j \mid x_t; w)}
in the formula, c_{ti} is the estimated probability that the true label of the t-th sample is i, y_t is the true label information of the t-th input sample, x_t is the t-th input sample, and z_t is the noise label of the t-th input sample;
the corresponding description formula of the update parameter θ is:
\theta(i, j) = \frac{\sum_{t} c_{ti}\, \mathbb{1}[z_t = j]}{\sum_{t} c_{ti}}
where θ (i, j) is the true transition probability from tag i to tag j;
the objective function adopted in the EM algorithm has a corresponding description formula as follows:
S(w) = \sum_{t \in N} \sum_{i=1}^{k} c_{ti} \log p(y_t = i \mid x_t; w) + \sum_{t \in C} \log p(y_t = z_t \mid x_t; w)
in the formula, S (w) represents an objective function employed in the EM algorithm.
2. The pedestrian re-identification method based on random occlusion recovery of a noise channel as claimed in claim 1, wherein the step 1 comprises the sub-steps of:
Step 101: dividing a reference data set into a training set and a test set, then randomly extracting picture data from the training set and carrying out the preprocessing operation;
Step 102: constructing a CAN network structure for occlusion recovery and further performing data expansion on the training set by using the CAN network structure;
Step 103: setting the parameters and corresponding formulas required for training the network model;
Step 104: after the setting is finished, inputting the picture data obtained from the preprocessing operation and the data expansion into the basic network main body feature extraction structure for training, to obtain the trained basic network main body feature extraction structure.
3. The pedestrian re-identification method based on random occlusion recovery of a noise channel as claimed in claim 2, wherein the reference data set in step 101 is the Market1501 data set; the preprocessing operation in step 101 comprises horizontal flipping, noise addition or random erasing; and the basic network main body feature extraction structure in step 104 is the ResNet50 network structure.
4. The pedestrian re-identification method based on random occlusion recovery of the noise channel as claimed in claim 2, wherein in the step 104, the picture data after the preprocessing operation and the data expansion is input to the basic network main body feature extraction structure for training, parameters are automatically adjusted by using an Adam optimization method, the Dropout strategy is used to avoid the occurrence of the over-fitting condition, and the Batch Normalization is used to accelerate the convergence speed of the network.
5. The method according to claim 2, wherein step 103 specifically comprises: setting the total training period (epoch) to 150, the weight decay parameter to 0.0005 and the batch size to 180, and setting the learning rate update mode, whose description formula is:
(learning rate update formula, rendered as an image in the original document)
in the formula, lr is a learning rate.
6. The method according to claim 1, wherein the CAN network structure for occlusion recovery in step 1 comprises a generator network for learning an original data set and generating an image, and a discriminator network for determining whether an input image is real, that is, whether the input image belongs to the original data set or is generated by the generator network, and the corresponding mathematical description formula is as follows:
\min_G \max_D \; \mathbb{E}_{x,y}\left[\log D(x, y)\right] + \mathbb{E}_{x}\left[\log\left(1 - D(x, G(x))\right)\right]
where x is the occlusion image, y is the target image, and D and G represent the discriminator network and the generator network, respectively.
CN202011321451.7A 2020-11-23 2020-11-23 Pedestrian re-identification method based on random occlusion recovery of noise channel Active CN112434599B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011321451.7A CN112434599B (en) 2020-11-23 2020-11-23 Pedestrian re-identification method based on random occlusion recovery of noise channel
JP2021087114A JP7136500B2 (en) 2020-11-23 2021-05-24 Pedestrian Re-identification Method for Random Occlusion Recovery Based on Noise Channel

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011321451.7A CN112434599B (en) 2020-11-23 2020-11-23 Pedestrian re-identification method based on random occlusion recovery of noise channel

Publications (2)

Publication Number Publication Date
CN112434599A CN112434599A (en) 2021-03-02
CN112434599B true CN112434599B (en) 2022-11-18

Family

ID=74693648

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011321451.7A Active CN112434599B (en) 2020-11-23 2020-11-23 Pedestrian re-identification method based on random occlusion recovery of noise channel

Country Status (2)

Country Link
JP (1) JP7136500B2 (en)
CN (1) CN112434599B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113239782B (en) * 2021-05-11 2023-04-28 广西科学院 Pedestrian re-recognition system and method integrating multi-scale GAN and tag learning
TWI779760B (en) * 2021-08-04 2022-10-01 瑞昱半導體股份有限公司 Method of data augmentation and non-transitory computer-readable medium
CN113742775B (en) * 2021-09-08 2023-07-28 哈尔滨工业大学(深圳) Image data security detection method, system and storage medium
CN115909464B (en) * 2022-12-26 2024-03-26 淮阴工学院 Self-adaptive weak supervision tag marking method for pedestrian re-identification

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102693723A (en) * 2012-04-01 2012-09-26 北京安慧音通科技有限责任公司 Method and device for recognizing speaker-independent isolated word based on subspace

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108334848B (en) * 2018-02-06 2020-12-25 哈尔滨工业大学 Tiny face recognition method based on generation countermeasure network
KR101941994B1 (en) * 2018-08-24 2019-01-24 전북대학교산학협력단 System for pedestrian detection and attribute extraction based on a joint deep network
CN110008842A (en) * 2019-03-09 2019-07-12 同济大学 A kind of pedestrian's recognition methods again for more losing Fusion Model based on depth
CN109977841A (en) * 2019-03-20 2019-07-05 中南大学 A kind of face identification method based on confrontation deep learning network
CN110135366B (en) * 2019-05-20 2021-04-13 厦门大学 Shielded pedestrian re-identification method based on multi-scale generation countermeasure network
CN110443203B (en) * 2019-08-07 2021-10-15 中新国际联合研究院 Confrontation sample generation method of face spoofing detection system based on confrontation generation network
CN111126360B (en) * 2019-11-15 2023-03-24 西安电子科技大学 Cross-domain pedestrian re-identification method based on unsupervised combined multi-loss model
CN110929679B (en) * 2019-12-05 2023-06-16 杭州电子科技大学 GAN-based unsupervised self-adaptive pedestrian re-identification method
CN111666800A (en) * 2019-12-23 2020-09-15 珠海大横琴科技发展有限公司 Pedestrian re-recognition model training method and pedestrian re-recognition method
CN111259850B (en) * 2020-01-23 2022-12-16 同济大学 Pedestrian re-identification method integrating random batch mask and multi-scale representation learning
CN111310728B (en) * 2020-03-16 2022-07-15 中国科学技术大学 Pedestrian re-identification system based on monitoring camera and wireless positioning

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102693723A (en) * 2012-04-01 2012-09-26 北京安慧音通科技有限责任公司 Method and device for recognizing speaker-independent isolated word based on subspace

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Semi-supervised learning methods; Liu Jianwei; Chinese Journal of Computers; 2015-08-31; pp. 1832-1842 *
Structured data table generation model based on generative adversarial networks; Song Kehui et al.; Journal of Computer Research and Development; 2019-12-31; pp. 1593-1618 *

Also Published As

Publication number Publication date
JP2022082493A (en) 2022-06-02
CN112434599A (en) 2021-03-02
JP7136500B2 (en) 2022-09-13

Similar Documents

Publication Publication Date Title
US11195051B2 (en) Method for person re-identification based on deep model with multi-loss fusion training strategy
CN112434599B (en) Pedestrian re-identification method based on random occlusion recovery of noise channel
CN111709311B (en) Pedestrian re-identification method based on multi-scale convolution feature fusion
CN109598268A (en) A kind of RGB-D well-marked target detection method based on single flow depth degree network
CN111611905A (en) Visible light and infrared fused target identification method
CN113673510B (en) Target detection method combining feature point and anchor frame joint prediction and regression
CN112668483B (en) Single-target person tracking method integrating pedestrian re-identification and face detection
CN110909741A (en) Vehicle re-identification method based on background segmentation
Zhang et al. License plate localization in unconstrained scenes using a two-stage CNN-RNN
CN113011357A (en) Depth fake face video positioning method based on space-time fusion
CN110390308B (en) Video behavior identification method based on space-time confrontation generation network
CN112801019B (en) Method and system for eliminating re-identification deviation of unsupervised vehicle based on synthetic data
CN112926522B (en) Behavior recognition method based on skeleton gesture and space-time diagram convolution network
CN113963399A (en) Personnel trajectory retrieval method and device based on multi-algorithm fusion application
Han et al. A method based on multi-convolution layers joint and generative adversarial networks for vehicle detection
CN111462173B (en) Visual tracking method based on twin network discrimination feature learning
CN114782997A (en) Pedestrian re-identification method and system based on multi-loss attention adaptive network
Anwer et al. Accident vehicle types classification: a comparative study between different deep learning models
CN116824641B (en) Gesture classification method, device, equipment and computer storage medium
Zhang [Retracted] Sports Action Recognition Based on Particle Swarm Optimization Neural Networks
KR20010050988A (en) Scale and Rotation Invariant Intelligent Face Detection
CN115393788B (en) Multi-scale monitoring pedestrian re-identification method based on global information attention enhancement
Lai et al. Robust text line detection in equipment nameplate images
CN113627380A (en) Cross-vision-field pedestrian re-identification method and system for intelligent security and early warning
CN113052875A (en) Target tracking algorithm based on state perception template updating

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant