CN114495269A - Pedestrian re-identification method - Google Patents


Info

Publication number
CN114495269A
CN114495269A
Authority
CN
China
Prior art keywords
pedestrian
features
local
global
identification method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210034867.3A
Other languages
Chinese (zh)
Inventor
张索非
吴晓富
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications filed Critical Nanjing University of Posts and Telecommunications
Priority to CN202210034867.3A priority Critical patent/CN114495269A/en
Publication of CN114495269A publication Critical patent/CN114495269A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a pedestrian re-identification method comprising the following steps: inputting an image of the pedestrian to be identified into a pre-trained pedestrian re-identification model and extracting pedestrian features; matching the extracted pedestrian features against the features corresponding to each image in the gallery and outputting an identification result. The pedestrian re-identification model is constructed on an asymmetric branch network comprising 1 backbone network, 1 global branch network and 1 asymmetric local branch network. Building the model on an asymmetric branch network improves the diversity of the extracted features and thereby the identification accuracy.

Description

Pedestrian re-identification method
Technical Field
The invention relates to a pedestrian re-identification method, and belongs to the technical field of computer vision.
Background
With the wide application of deep learning in image processing and computer vision, using a pre-trained network model as the feature-extraction module has become a common means of solving this problem. In recent years, much work on pedestrian re-identification has shown that a multi-branch network is an effective feature-extraction strategy: the features extracted by different branches complement one another and greatly improve re-identification performance.
However, most existing multi-branch methods adopt a symmetric network structure and apply explicit constraints between the branch networks to guarantee the diversity of the extracted features. This makes training the pedestrian re-identification model computationally expensive and model construction inefficient.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provides a pedestrian re-identification method, which comprises the following steps:
inputting the image of the pedestrian to be recognized into a pre-trained pedestrian re-recognition model, and extracting the characteristics of the pedestrian;
matching the extracted pedestrian features with the features corresponding to the images in the gallery, and outputting a recognition result;
the pedestrian re-identification model is constructed based on an asymmetric branch network, and the asymmetric branch network comprises 1 trunk network, 1 global branch network and 1 asymmetric local branch network.
Further, the backbone network is ResNet50.
Further, the global branch network includes a convolution layer, a down-sampling layer, a BN layer, a residual structure and a residual module, and the step size of the convolution kernel of the down-sampling layer is 1.
Furthermore, the local branch network comprises a convolution layer, a down-sampling layer, a BN layer and a residual structure, wherein the step length of a convolution kernel of the down-sampling layer is 1, and the network weight of the local branch network is not shared.
Further, the extraction of the pedestrian features comprises the following steps:
obtaining 1 global feature from the output feature map of the global branch network through 1 global average operation;
the global features pass through a batch normalization layer to obtain normalized global features;
the output characteristic diagram of the local branch network obtains a plurality of local characteristics through 1 group of local average operations;
and the normalized global features and the plurality of local features are connected in series end to serve as the extracted pedestrian features.
Further, the normalized global features are trained by adopting a cross entropy loss function.
Further, the multiple local features are subjected to dimensionality reduction to obtain multiple shorter local features, and the multiple shorter local features are trained by adopting a cross entropy loss function.
Further, the extracted pedestrian features are trained by adopting a triple loss function.
Further, the pedestrian re-identification model comprises a lightweight attention module, and the lightweight attention module comprises a space attention sub-module and a channel attention sub-module.
Further, the spatial attention submodule employs 1 one-dimensional convolution to reduce the number of parameters.
Compared with the prior art, the invention has the following beneficial effects. The pedestrian re-identification model is constructed on an asymmetric branch network, which improves the diversity of the extracted features and thereby the identification accuracy. The stride of the down-sampling convolution kernel in both the global and local branch networks is adjusted from 2 to 1, doubling each spatial dimension of the output feature map and further improving accuracy. The backbone network additionally adopts a lightweight attention module that models the relationship between a feature and its surrounding features by comparing features at different positions and in different channels, so the extracted features become more representative and the identification accuracy improves further. Here "lightweight" means that one-dimensional convolutions are used inside the attention module to reduce the amount of computation, so that only a very small increase in computational complexity is traded for a significant performance improvement.
Drawings
FIG. 1 is a flowchart of a pedestrian re-identification method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an asymmetric branch network architecture according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a lightweight spatial attention submodule according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described in detail below, and examples thereof are illustrated in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the drawings are exemplary only and are not to be construed as limiting the invention.
In the description of the present invention, "several" means one or more and "a plurality" means two or more; "greater than", "less than", "exceeding" and the like are understood to exclude the stated number, while "above", "below" and the like are understood to include it. Where "first" and "second" are used, they serve only to distinguish technical features and are not to be understood as indicating or implying relative importance, the number of the indicated technical features, or their precedence.
In the description of the present invention, unless otherwise explicitly limited, terms such as "arrangement", "installation" and "connection" should be understood in a broad sense; those skilled in the art can reasonably determine their specific meanings in combination with the specific contents of the technical solutions.
In the description of the present invention, reference to "one embodiment," "some embodiments," "an illustrative embodiment," "an example," "a specific example," or "some examples" means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. Such terms do not necessarily refer to the same embodiment or example, and the described features, structures, materials, or characteristics may be combined in any suitable manner in one or more embodiments or examples.
The present invention provides a pedestrian re-identification method, which is further described with reference to the accompanying drawings and embodiments.
As shown in fig. 1, the present embodiment provides a pedestrian re-identification method comprising: inputting the image of the pedestrian to be identified into a pedestrian re-identification model, extracting pedestrian features, matching the extracted pedestrian features against the corresponding features of each image in the gallery, and outputting an identification result.
In this embodiment, the pedestrian re-identification model is built with a convolutional neural network as follows: obtain historical pedestrian images, construct a pedestrian re-identification training data set, and train the convolutional neural network on this data set to obtain the pedestrian re-identification model.
As shown in fig. 2, the convolutional neural network in this embodiment is an asymmetric branch network comprising 1 backbone network, 1 global branch network, and 1 asymmetric local branch network.
The backbone network adopts ResNet50 for feature extraction. ResNet50 comprises layer 0 through layer 3, and the output feature map of layer 3 is fed simultaneously into the global branch network and the asymmetric local branch network.
The global branch network comprises a convolution layer, a down-sampling layer, a BN layer, a residual structure and a residual module; the stride of the convolution kernel of its down-sampling layer is set to 1, and its residual module is composed of the down-sampling layer and the convolution layer. The output feature map of the global branch network is reduced to 1 global feature through 1 global average operation; this global feature is passed through a batch normalization layer to obtain the normalized global feature, which is trained with a cross-entropy loss function.
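The effect of setting the down-sampling stride to 1 can be checked with the standard convolution output-size formula. A minimal sketch (the 3×3 kernel, padding of 1, and the 16×8 input map for a typical 256×128 pedestrian crop are assumptions for illustration):

```python
def conv_out(size, kernel=3, stride=2, pad=1):
    """Output size of a convolution along one spatial dimension."""
    return (size + 2 * pad - kernel) // stride + 1

h, w = 16, 8  # assumed input to the branch's down-sampling layer
print(conv_out(h), conv_out(w))                      # 8 4  -> stride 2 halves the map
print(conv_out(h, stride=1), conv_out(w, stride=1))  # 16 8 -> stride 1 keeps full resolution
```

With stride 1 each spatial dimension of the output feature map is twice what stride 2 would give, which is the resolution gain the invention exploits.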
The local branch network comprises a convolution layer, a down-sampling layer, a BN layer and a residual structure; the stride of the convolution kernel of its down-sampling layer is set to 1, and the network weights of the local branch network are not shared. The output feature map of the local branch network yields a plurality of local features through 1 group of local average operations; these comprise an upper local feature, a middle local feature and a lower local feature.
The normalized global features and the plurality of local features are connected in series end to end and serve as extracted pedestrian features, and the extracted pedestrian features are trained by adopting a triple loss function.
The plurality of local features are passed through a dimensionality-reduction layer to obtain a plurality of shorter local features, which are then trained with a cross-entropy loss function.
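The two-branch feature-extraction pipeline just described can be sketched in NumPy as follows. The toy shapes, the stripe-wise form of the local average operations, and the inference-style normalization are assumptions for illustration; a real batch-normalization layer would use learned per-channel statistics.

```python
import numpy as np

def extract_features(global_map, local_map, n_parts=3, eps=1e-5):
    """Sketch: global_map and local_map are (C, H, W) branch output feature maps."""
    # 1 global average operation -> 1 global feature of length C
    f_g = global_map.mean(axis=(1, 2))
    # batch-normalization layer (simplified inference form; assumption)
    f_b = (f_g - f_g.mean()) / np.sqrt(f_g.var() + eps)
    # 1 group of local average operations: upper / middle / lower stripes along H
    stripes = np.array_split(local_map, n_parts, axis=1)
    f_p = [s.mean(axis=(1, 2)) for s in stripes]  # n_parts vectors of length C
    # concatenate normalized global feature and local features end to end
    return np.concatenate([f_b] + f_p)

feat = extract_features(np.random.rand(8, 12, 4), np.random.rand(8, 12, 4))
print(feat.shape)  # (32,) = (1 global + 3 local) x 8 channels
```

The concatenated vector is what the triplet loss is trained on, while the cross-entropy losses act on the global and (reduced) local parts separately.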
The backbone network is also provided with a lightweight attention module comprising a spatial attention submodule (i.e. a position attention module) and a channel attention submodule (i.e. a channel attention module), connected in series to form the complete attention module. As shown in fig. 3, the spatial attention submodule reduces the number of parameters by using a specially designed one-dimensional convolution, as follows. The submodule first straightens the input feature X from dimension C×H×W to dimension C×S, where C is the number of feature-map channels, H the height of the feature map, W its width, and S = H×W the size of the straightened feature. The straightened feature X is fed into a single-channel one-dimensional convolution filter with kernel width 5 and stride 1 to obtain the convolved feature Q, and into another single-channel one-dimensional convolution filter with kernel width 5 and stride 1 to obtain the convolved feature K. The feature K is transposed and multiplied by the feature Q, and a softmax function yields the attention map A:
A = softmax(K^T Q)    (1)
where ^T denotes the matrix transpose.
The feature X is also fed into a two-dimensional convolution filter with kernel width 1 and stride 1, whose numbers of input and output channels are both C, to obtain the convolved feature V. Multiplying the feature V by the attention map A gives the attention-reweighted feature map, which is added to the original feature through a residual structure and output.
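A minimal NumPy sketch of this spatial attention submodule (the random stand-in weights, the softmax axis, and the "same" padding of the 1-D convolutions are assumptions for illustration):

```python
import numpy as np

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def conv1d_single(x, w):
    """Single-channel 1-D convolution (kernel width 5, stride 1),
    applied row-wise to the straightened (C, S) feature."""
    pad = len(w) // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    return np.stack([np.convolve(row, w, mode="valid") for row in xp])

def spatial_attention(X, wq, wk, Wv):
    C, H, W = X.shape
    Xs = X.reshape(C, H * W)            # straighten: C x S, S = H*W
    Q = conv1d_single(Xs, wq)           # C x S
    K = conv1d_single(Xs, wk)           # C x S
    A = softmax(K.T @ Q, axis=0)        # S x S attention map, eq. (1)
    V = Wv @ Xs                         # 1x1 conv = channel-mixing matmul
    return (V @ A).reshape(C, H, W) + X # re-weight, then residual add

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 6, 3))
Y = spatial_attention(X, rng.standard_normal(5), rng.standard_normal(5),
                      rng.standard_normal((4, 4)))
print(Y.shape)  # (4, 6, 3)
```

Because Q and K each need only a width-5 one-dimensional kernel instead of full C-channel projections, the parameter count of the submodule stays very small.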
The loss function used by the asymmetric branch network during model training is expressed as follows:
L = L_ce(W_b f_b, y) + Σ_{n=1}^{3} L_ce(W_s^(n) f_s^(n), y) + λ L_t(f_g ⊙ f_p)    (2)
where L_ce denotes the cross-entropy loss function and L_t the triplet loss function; W_b and W_s^(n) denote the corresponding fully connected layer parameters for the cross-entropy losses; n is the block index of the local branch, generally taking values 1 to 3; y is the identity label of the pedestrian sample, and i and j denote the serial numbers of different samples; λ is a weight constant, which can be set directly to 1; and ⊙ denotes the concatenation of feature vectors. For the global branch, in the training phase the global feature in the triplet loss is taken from the global feature f_g before normalization, while the global feature in the cross-entropy loss is taken from the normalized global feature f_b; in the testing phase, the normalized f_b replaces f_g in the concatenated model output (the BNNeck structure in fig. 1). For the local branches, f_p denotes the local features obtained after average pooling and f_s the shortened local features obtained after the dimensionality-reduction layer.
The pedestrian feature finally output by the recognition model is f_b ⊙ f_p. As shown in Table 1, the asymmetric network structure provided by the invention effectively improves the diversity of the features extracted by the different branches, thereby improving the overall identification accuracy of the system.
TABLE 1 Performance comparison of the latest approach on four common pedestrian re-identification datasets
Table 1 compares different pedestrian re-identification methods on the four most common standard data sets. Measured by the two standard evaluation indices of the pedestrian re-identification task, mAP and Rank-1, the proposed method shows a clear advantage over the second-ranked method on every data set, except for a slight lag in the Rank-1 index on the Market1501 data set.
As will be appreciated by one skilled in the art, the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects, and may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) having computer-usable program code embodied therein.
While the present application has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application, it will be understood that each flowchart illustration and/or block diagram block and combination of flowchart illustrations and/or block diagram blocks can be implemented by computer program instructions, which may be provided to a processor of a general purpose computer, special purpose computer, embedded processor or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart illustration, flow or blocks, and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.

Claims (10)

1. A pedestrian re-identification method, characterized in that the method comprises the steps of:
inputting the image of the pedestrian to be recognized into a pre-trained pedestrian re-recognition model, and extracting the characteristics of the pedestrian;
matching the extracted pedestrian features with features corresponding to the images in the image library, and outputting an identification result;
the pedestrian re-identification model is constructed based on an asymmetric branch network, and the asymmetric branch network comprises 1 trunk network, 1 global branch network and 1 asymmetric local branch network.
2. The pedestrian re-identification method according to claim 1, wherein the backbone network is ResNet50.
3. The pedestrian re-identification method of claim 1, wherein the global branching network comprises a convolution layer, a downsampling layer, a BN layer, a residual structure, and a residual module, wherein the downsampling layer convolution kernel step size is 1.
4. The pedestrian re-identification method according to claim 1, wherein the local branch network includes a convolution layer, a down-sampling layer, a BN layer and a residual structure, the step size of the convolution kernel of the down-sampling layer is 1, and the network weights of the local branch network are not shared.
5. The pedestrian re-identification method according to claim 1, wherein the extraction of the pedestrian feature includes the steps of:
obtaining 1 global feature from the output feature map of the global branch network through 1 global average operation;
the global features pass through a batch normalization layer to obtain normalized global features;
the output feature map of the local branch network yields a plurality of local features through 1 group of local average operations;
and the normalized global features and the plurality of local features are connected in series end to serve as the extracted pedestrian features.
6. The pedestrian re-identification method of claim 5, wherein the normalized global features are trained using a cross entropy loss function.
7. The pedestrian re-identification method according to claim 5, wherein the plurality of local features are passed through a dimensionality reduction layer to obtain a plurality of shorter local features, and the plurality of shorter local features are trained using a cross entropy loss function.
8. The pedestrian re-identification method of claim 5, wherein the extracted pedestrian features are trained using a triple loss function.
9. The pedestrian re-identification method according to claim 1, wherein the pedestrian re-identification model comprises a lightweight attention module including a spatial attention sub-module and a channel attention sub-module.
10. The pedestrian re-identification method of claim 9, wherein the spatial attention sub-module employs 1 one-dimensional convolution to reduce the number of parameters.
CN202210034867.3A 2022-01-13 2022-01-13 Pedestrian re-identification method Pending CN114495269A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210034867.3A CN114495269A (en) 2022-01-13 2022-01-13 Pedestrian re-identification method


Publications (1)

Publication Number Publication Date
CN114495269A true CN114495269A (en) 2022-05-13

Family

ID=81511359

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210034867.3A Pending CN114495269A (en) 2022-01-13 2022-01-13 Pedestrian re-identification method

Country Status (1)

Country Link
CN (1) CN114495269A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114792315A (en) * 2022-06-22 2022-07-26 浙江太美医疗科技股份有限公司 Medical image visual model training method and device, electronic equipment and storage medium


Similar Documents

Publication Publication Date Title
CN112257794B (en) YOLO-based lightweight target detection method
CN111882040B (en) Convolutional neural network compression method based on channel number search
CN110046550B (en) Pedestrian attribute identification system and method based on multilayer feature learning
CN112561027A (en) Neural network architecture searching method, image processing method, device and storage medium
CN110738146A (en) target re-recognition neural network and construction method and application thereof
CN111310598B (en) Hyperspectral remote sensing image classification method based on 3-dimensional and 2-dimensional mixed convolution
CN111860683B (en) Target detection method based on feature fusion
Li et al. Data-driven neuron allocation for scale aggregation networks
CN112926641A (en) Three-stage feature fusion rotating machine fault diagnosis method based on multi-modal data
CN113066065B (en) No-reference image quality detection method, system, terminal and medium
CN107679539B (en) Single convolution neural network local information and global information integration method based on local perception field
CN112580480B (en) Hyperspectral remote sensing image classification method and device
CN113569672A (en) Lightweight target detection and fault identification method, device and system
CN108363962B (en) Face detection method and system based on multi-level feature deep learning
CN109919084A (en) A kind of pedestrian's recognition methods again more indexing Hash based on depth
CN111160378A (en) Depth estimation system based on single image multitask enhancement
CN116958687A (en) Unmanned aerial vehicle-oriented small target detection method and device based on improved DETR
CN115937693A (en) Road identification method and system based on remote sensing image
CN108537235A (en) A kind of method of low complex degree scale pyramid extraction characteristics of image
CN115713755A (en) Efficient and accurate image identification method for Spodoptera frugiperda
CN111898614B (en) Neural network system and image signal and data processing method
CN114495269A (en) Pedestrian re-identification method
CN116977747B (en) Small sample hyperspectral classification method based on multipath multi-scale feature twin network
CN113705394A (en) Behavior identification method combining long and short time domain features
CN116777842A (en) Light texture surface defect detection method and system based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination