CN114792430A

CN114792430A - Pedestrian re-identification method, system and related equipment based on polarization self-attention

Info

Publication number: CN114792430A
Application number: CN202210462489.9A
Authority: CN
Inventors: 闫潇宁; 陈晓艳; 杨坤志; 张东洋
Original assignee: Shenzhen Anruan Huishi Technology Co ltd
Current assignee: Shenzhen Anruan Huishi Technology Co ltd
Priority date: 2022-04-24
Filing date: 2022-04-24
Publication date: 2022-07-26

Abstract

The invention is applicable to the field of computer vision and provides a pedestrian re-identification method, system and related equipment based on polarized self-attention. The method comprises: acquiring a shooting data set containing pedestrian pictures, and preprocessing the shooting data set to obtain a data set to be divided; dividing the data set to be divided into a training set and a test set, wherein each pedestrian picture has a real label; constructing a pedestrian re-identification model comprising a double-branch structure and a polarized self-attention mechanism structure; training the pedestrian re-identification model with the Adam optimization algorithm, taking the training set as input and the real labels corresponding to the training set as the reference for parameter tuning, to obtain training parameter weights; and taking the training parameter weights as test weights and the test set as the input of the pedestrian re-identification model, to obtain a pedestrian re-identification result for the test set. The invention reduces information loss during feature extraction and improves the accuracy of pedestrian re-identification by combining global and local feature extraction.

Description

Pedestrian re-identification method and system based on polarization self-attention and related equipment

Technical Field

The invention belongs to the field of computer vision, and particularly relates to a pedestrian re-identification method and system based on polarization self-attention and related equipment.

Background

In recent years, with growing demand in intelligent security and video surveillance, pedestrian re-identification has received increasingly wide research attention. In video surveillance, factors such as low camera resolution, insufficient light, poor shooting angles and occlusion by objects make it difficult to capture clear face information of pedestrians, and therefore difficult to determine a pedestrian's identity from the face. Pedestrian re-identification can be regarded as an image retrieval task that uses computer vision techniques to judge whether a specific pedestrian is present in a given picture or video sequence, so it does not depend on clear face information and avoids these difficulties.

Pedestrian re-identification is generally implemented with deep learning: a convolutional neural network extracts the features of pedestrians, and pedestrians with the same features are then matched across different pictures. Existing pedestrian re-identification methods mainly consider the coarse-grained features of the whole pedestrian picture, namely the overall features of the pedestrian, and pay little attention to fine-grained features such as hairstyle, clothing color and whether a backpack is carried. As a result, they extract pedestrian features insufficiently and their accuracy is low.

Disclosure of Invention

The embodiment of the invention provides a pedestrian re-identification method, system and related equipment based on polarized self-attention, and aims to solve the problem that traditional pedestrian re-identification methods extract pedestrian features of different granularity insufficiently, which results in low accuracy.

In a first aspect, an embodiment of the present invention provides a method for re-identifying a pedestrian based on polarized self-attention, where the method includes:

acquiring a shooting data set with a pedestrian picture, and preprocessing the shooting data set to obtain a data set to be divided;

dividing the data set to be divided into a training set and a test set, wherein each pedestrian picture in the training set and the test set has a real label;

constructing a pedestrian re-identification model comprising a double-branch structure and a polarization self-attention mechanism structure;

taking the training set as the input of the pedestrian re-recognition model, taking the real label corresponding to the training set as the reference of parameter debugging, and training the pedestrian re-recognition model by using an Adam optimization algorithm to obtain the training parameter weight;

and taking the training parameter weight as the testing weight of the pedestrian re-recognition model, and taking the testing set as the input of the pedestrian re-recognition model to obtain the pedestrian re-recognition result of the testing set.

Further, in the step of obtaining a shot data set with a pedestrian picture, and preprocessing the shot data set to obtain a data set to be divided, the preprocessing specifically includes:

normalizing the size of each picture in the shooting data set, and applying data augmentation by flipping, random cropping and random erasing to each picture in the shooting data set.

Furthermore, the pedestrian re-identification model takes a convolutional neural network as a feature extraction network, the convolutional neural network comprises an input layer, a convolutional layer, a feature extraction layer and an output layer, wherein the convolutional layer comprises a plurality of layers, the double-branch structure comprises a global branch and a local branch, the double-branch structure is positioned between the convolutional layer and the output layer, the polarization self-attention mechanism structure comprises a channel self-attention branch and a space self-attention branch, and the polarization self-attention mechanism structure is positioned after each convolutional layer.

Further, for the polarized self-attention mechanism structure, the feature matrix of the pedestrian picture output by the convolutional layer is defined as $X$, and the weight of the spatial self-attention branch is defined as $A^{sp}(X)$; the weight $A^{sp}(X)$ of the spatial self-attention branch then satisfies relation (1):

$$A^{sp}(X) = F_{SG}\left[\sigma_3\left(F_{SM}\left(\sigma_1\left(F_{GP}\left(W_q(X)\right)\right)\right) \times \sigma_2\left(W_v(X)\right)\right)\right] \tag{1}$$

the weight of the channel self-attention branch is defined as $A^{ch}(X)$, and the weight $A^{ch}(X)$ of the channel self-attention branch satisfies relation (2):

[relation (2) is missing from this text]

in relations (1) and (2), $W_q$, $W_v$ and a third operator whose symbol is missing from this text are all 1×1 convolution operations, $\sigma_1$, $\sigma_2$ and $\sigma_3$ are all tensor reshape operations, $F_{SM}$ is the softmax operation, $F_{GP}$ is the global average pooling operation, $F_{SG}$ is the Sigmoid function, and the × operator is the matrix dot product;

the parallel fusion result of the weight $A^{ch}(X)$ of the channel self-attention branch and the weight $A^{sp}(X)$ of the spatial self-attention branch is defined as $PSA_p(X)$, and the parallel fusion result $PSA_p(X)$ satisfies relation (3):

$$PSA_p(X) = Z^{ch} + Z^{sp} = A^{ch}(X) \odot^{ch} X + A^{sp}(X) \odot^{sp} X \tag{3}$$

in relation (3), $PSA_p(X)$ is the output of the polarized self-attention mechanism structure, $\odot^{ch}$ is the channel-wise multiplication operation, $\odot^{sp}$ is the spatial-wise multiplication operation, $Z^{sp}$ is the output of the spatial self-attention branch, and $Z^{ch}$ is the output of the channel self-attention branch.
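Relation (3) can be illustrated with a minimal pure-Python sketch. It assumes that X is stored as nested lists of shape C × H × W, that the channel weight holds one value per channel and that the spatial weight holds one value per pixel. The helper name `psa_parallel_fuse` and these shapes are illustrative assumptions and are not taken from the patent text.

```python
def psa_parallel_fuse(x, a_ch, a_sp):
    """Return Z_ch + Z_sp for a feature map x of shape [C][H][W].

    a_ch: list of C channel weights (assumed shape, one per channel).
    a_sp: H x W nested list of spatial weights (assumed shape, one per pixel).
    """
    fused = []
    for c, channel in enumerate(x):
        rows = []
        for h, row in enumerate(channel):
            # Channel-wise product plus spatial-wise product, as in relation (3).
            rows.append([a_ch[c] * v + a_sp[h][w] * v for w, v in enumerate(row)])
        fused.append(rows)
    return fused


x = [[[1.0, 2.0], [3.0, 4.0]],      # channel 0
     [[10.0, 20.0], [30.0, 40.0]]]  # channel 1
a_ch = [0.5, 0.1]
a_sp = [[1.0, 0.0], [0.0, 1.0]]
z = psa_parallel_fuse(x, a_ch, a_sp)
```

The output keeps the shape of X, which is what allows the structure to be placed after a convolutional layer without changing the layers that follow.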

Furthermore, for the double-branch structure, the global branch has the same structure as the feature extraction layer of the convolutional neural network, and is configured to take the parallel fusion result output by the polarized self-attention mechanism structure as input and extract the global feature corresponding to the pedestrian picture; the local branch is arranged in parallel with the global branch at the same level, and is configured to take the same parallel fusion result as input and extract the local feature corresponding to the pedestrian picture.

Further, the output layer of the pedestrian re-identification model takes the fusion result of the global feature and the local feature as an output result.

Still further, the loss function used by the pedestrian re-identification model in the training process comprises at least one of a triplet loss, a classification loss and a center loss.

In a second aspect, an embodiment of the present invention further provides a pedestrian re-identification system based on polarized self-attention, including:

the preprocessing module is used for acquiring a shooting data set with a pedestrian picture and preprocessing the shooting data set to obtain a data set to be divided;

the data dividing module is used for dividing the data set to be divided into a training set and a test set, wherein each pedestrian picture in the training set and the test set has a real label;

the model building module is used for building a pedestrian re-identification model comprising a double-branch structure and a polarization self-attention mechanism structure;

the model training module is used for taking the training set as the input of the pedestrian re-recognition model, taking the real label corresponding to the training set as the reference of parameter debugging, and training the pedestrian re-recognition model by using an Adam optimization algorithm to obtain the training parameter weight;

and the data identification module is used for taking the training parameter weight as the testing weight of the pedestrian re-identification model and taking the test set as the input of the pedestrian re-identification model to obtain a pedestrian re-identification result of the test set.

In a third aspect, an embodiment of the present invention further provides a computer device, including: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method for pedestrian re-identification based on polarized self-attention as described in any one of the above embodiments when executing the computer program.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored, and the computer program, when executed by a processor, implements the steps in the method for pedestrian re-identification based on polarized self-attention as described in any one of the above embodiments.

The beneficial effect of the invention is that, because a polarized self-attention mechanism structure with two dimensions (channel and spatial) is adopted, information from different feature spaces can be learned adaptively and the information loss caused by dimension reduction is reduced; at the same time, the global features and the local features are extracted separately by the double-branch network, which further improves the identification accuracy of the pedestrian re-identification model.

Drawings

FIG. 1 is a block flow diagram illustrating steps of a method for re-identifying a pedestrian based on polarized self-attention according to an embodiment of the present invention;

FIG. 2 is a schematic structural diagram of a pedestrian re-identification model provided by an embodiment of the invention;

FIG. 3 is a schematic diagram of a polarization self-attention mechanism configuration provided by an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a pedestrian re-identification system 200 based on polarized self-attention according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a computer device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.

Referring to fig. 1, fig. 1 is a flow chart of steps of a pedestrian re-identification method based on polarized self-attention according to an embodiment of the present invention, which specifically includes the following steps:

S101, acquiring a shooting data set with a pedestrian picture, and preprocessing the shooting data set to obtain a data set to be divided.

In this step, the preprocessing specifically comprises normalizing the size of each picture in the shooting data set and applying data augmentation by flipping, random cropping and random erasing to each picture.

For example, the shooting data set selected in the embodiment of the present invention may be the public Market1501 data set, which contains 1501 pedestrian targets captured by cameras, each pedestrian target having a plurality of pedestrian pictures in the data set. In step S101, the pedestrian pictures in the Market1501 data set are preprocessed by normalization, flipping, random cropping, random erasing and the like, so as to obtain the data set to be divided.
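The preprocessing operations named in step S101 can be sketched in pure Python on a toy single-channel picture stored as nested lists. The target size, crop size, erase size, flip probability and fill value below are illustrative assumptions, since the patent text does not specify them, and nearest-neighbour resizing stands in for whatever size normalization is actually used.

```python
import random


def resize_nearest(img, out_h, out_w):
    """Normalize the picture size with nearest-neighbour sampling."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]


def hflip(img):
    """Horizontal flip."""
    return [row[::-1] for row in img]


def random_crop(img, out_h, out_w, rng):
    """Cut a random out_h x out_w window out of the picture."""
    top = rng.randint(0, len(img) - out_h)
    left = rng.randint(0, len(img[0]) - out_w)
    return [row[left:left + out_w] for row in img[top:top + out_h]]


def random_erase(img, erase_h, erase_w, rng, fill=0):
    """Overwrite a random erase_h x erase_w rectangle with a fill value."""
    top = rng.randint(0, len(img) - erase_h)
    left = rng.randint(0, len(img[0]) - erase_w)
    out = [row[:] for row in img]
    for r in range(top, top + erase_h):
        for c in range(left, left + erase_w):
            out[r][c] = fill
    return out


def preprocess(img, rng, size=(8, 4), crop=(6, 3), erase=(2, 2), flip_prob=0.5):
    """Resize, then flip with probability flip_prob, then crop and erase."""
    img = resize_nearest(img, size[0], size[1])
    if rng.random() < flip_prob:
        img = hflip(img)
    img = random_crop(img, crop[0], crop[1], rng)
    return random_erase(img, erase[0], erase[1], rng)


rng = random.Random(0)
picture = [[r * 10 + c + 1 for c in range(5)] for r in range(9)]  # toy 9x5 picture
sample = preprocess(picture, rng)
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible, which is convenient when comparing training runs.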

S102, dividing the data set to be divided into a training set and a test set, wherein each pedestrian picture in the training set and the test set has a real label.

Specifically, the data set to be divided is divided into the training set and the test set. The Market1501 data set used in the embodiment of the present invention is divided, according to its existing partition, into a training set and a test set containing different pedestrian targets: the training set contains 751 pedestrian targets with 12936 pedestrian pictures in total, and the test set contains 750 pedestrian targets with 19732 pedestrian pictures in total. Each pedestrian picture has the real label, which marks a feature expressed in the pedestrian picture.

Preferably, the training set is divided into N batches, each batch contains P different pedestrian targets, and each pedestrian target corresponds to K pedestrian pictures; that is, each batch contains B = P × K pedestrian pictures as training samples.
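A minimal sketch of how batches of B = P × K pictures could be assembled is given below. The function name `make_pk_batches`, the handling of identities with fewer than K pictures (sampling with replacement) and the dropping of a final incomplete group of identities are assumptions for illustration; the patent text does not describe these details.

```python
import random
from collections import defaultdict


def make_pk_batches(labels, p, k, rng):
    """Group sample indices into batches of P identities x K pictures each.

    labels: list where labels[i] is the identity of picture i.
    Identities with fewer than K pictures are sampled with replacement
    (an assumption; the patent text does not say how this case is handled).
    """
    by_id = defaultdict(list)
    for idx, pid in enumerate(labels):
        by_id[pid].append(idx)
    pids = list(by_id)
    rng.shuffle(pids)
    batches = []
    # Drop the last incomplete group so every batch has exactly B = P * K samples.
    for start in range(0, len(pids) - p + 1, p):
        batch = []
        for pid in pids[start:start + p]:
            pool = by_id[pid]
            if len(pool) >= k:
                batch.extend(rng.sample(pool, k))
            else:
                batch.extend(rng.choices(pool, k=k))
        batches.append(batch)
    return batches


labels = [i // 5 for i in range(40)]          # 8 toy identities, 5 pictures each
batches = make_pk_batches(labels, p=4, k=3, rng=random.Random(0))
```

Guaranteeing several pictures per identity in every batch is what makes a triplet-style loss computable inside a single batch.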

S103, constructing a pedestrian re-identification model comprising a double-branch structure and a polarized self-attention mechanism structure.

Specifically, referring to fig. 2, which is a schematic structural diagram of the pedestrian re-identification model provided in an embodiment of the present invention, the pedestrian re-identification model uses a convolutional neural network (CNN) as the feature extraction network. The convolutional neural network includes an input layer, convolutional layers, a feature extraction layer and an output layer, where there are a plurality of convolutional layers. On the basis of this convolutional neural network, the double-branch structure is arranged at the position of the feature extraction layer and includes a global branch and a local branch, so that it lies between the convolutional layers and the output layer of the overall network.

In the embodiment of the invention, the global branch has the same structure as the original feature extraction layer of the convolutional neural network, that is, it performs the original whole-picture feature extraction: after two groups of unified convolution operations, global max pooling is used to obtain output data with the same size as the original input data. The local branch may be regarded as a structure additionally added between the convolutional layers and the output layer. Unlike the global branch, the local branch does not contain the global max pooling structure; instead, after two groups of convolution operations, it divides the image data obtained by convolution into smaller local images and then extracts features from these local images, so as to obtain fine-grained features that global feature extraction does not easily capture. Preferably, the local branch may include a plurality of structures for dividing local pictures and extracting local features. When the pedestrian picture is large enough, or the pictures contain many pedestrian targets in a complicated environment, the local features extracted by the plurality of local branches are concatenated to obtain output data with the same size as the original input data, and the loss is calculated with the same loss function as for the output of the global branch, which gives a better final feature extraction effect. The double-branch structure of the embodiment of the present invention is used only for illustration, and it should be understood that multi-branch neural network structures based on the embodiment of the present invention also fall within the protection scope of the present invention.
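The difference between the two branches can be sketched on a toy feature map of shape C × H × W stored as nested lists. Splitting the map into equal horizontal stripes, pooling each stripe with a max operation and fusing by concatenation are assumptions made for illustration; the patent text only states that the local branch divides the convolved data into smaller local images and extracts features from them.

```python
def global_max_pool(fmap):
    """Global branch: one max value per channel over the whole H x W map."""
    return [max(max(row) for row in channel) for channel in fmap]


def local_stripe_features(fmap, num_parts):
    """Local branch: split the map into horizontal stripes and pool each one."""
    height = len(fmap[0])
    step = height // num_parts
    feats = []
    for part in range(num_parts):
        stripe = [channel[part * step:(part + 1) * step] for channel in fmap]
        feats.extend(global_max_pool(stripe))
    return feats


fmap = [[[1, 2], [3, 4], [5, 6], [7, 8]],     # channel 0, H=4, W=2
        [[8, 7], [6, 5], [4, 3], [2, 1]]]     # channel 1
global_feat = global_max_pool(fmap)
local_feat = local_stripe_features(fmap, num_parts=2)
fused = global_feat + local_feat              # concatenation of both branches
```

In this toy case the global feature only records that both channels peak at 8, while the stripe features also record in which half of the picture each channel peaks, which is the kind of finer-grained information the local branch is meant to add.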

In an embodiment of the present invention, the polarized self-attention mechanism structure comprises a channel self-attention branch and a spatial self-attention branch, and is located after each convolution layer of the convolutional neural network.

Specifically, referring to fig. 3, which is a schematic view of the polarized self-attention mechanism structure provided in an embodiment of the present invention, C, H and W in fig. 3 respectively correspond to the number of channels, the height and the width of the pedestrian picture. For the polarized self-attention mechanism structure, the feature matrix of the pedestrian picture output by the convolutional layer is defined as $X$, and the weight of the spatial self-attention branch is defined as $A^{sp}(X)$; the weight $A^{sp}(X)$ of the spatial self-attention branch then satisfies relation (1):

$$A^{sp}(X) = F_{SG}\left[\sigma_3\left(F_{SM}\left(\sigma_1\left(F_{GP}\left(W_q(X)\right)\right)\right) \times \sigma_2\left(W_v(X)\right)\right)\right] \tag{1}$$

the weight of the channel self-attention branch is defined as $A^{ch}(X)$, and the weight $A^{ch}(X)$ of the channel self-attention branch satisfies relation (2):

[relation (2) is missing from this text]

in relations (1) and (2), $W_q$, $W_v$ and a third operator whose symbol is missing from this text are all 1×1 convolution operations, $\sigma_1$, $\sigma_2$ and $\sigma_3$ are all tensor reshape operations, $F_{SM}$ is the softmax operation, $F_{GP}$ is the global average pooling operation, and $F_{SG}$ is the Sigmoid function;

the parallel fusion result of the weight $A^{ch}(X)$ of the channel self-attention branch and the weight $A^{sp}(X)$ of the spatial self-attention branch is defined as $PSA_p(X)$, and the parallel fusion result $PSA_p(X)$ satisfies relation (3):

$$PSA_p(X) = Z^{ch} + Z^{sp} = A^{ch}(X) \odot^{ch} X + A^{sp}(X) \odot^{sp} X \tag{3}$$

in relation (3), $PSA_p(X)$ is the output of the polarized self-attention mechanism structure, $\odot^{ch}$ is the channel-wise multiplication operation, $\odot^{sp}$ is the spatial-wise multiplication operation, $Z^{sp}$ is the output of the spatial self-attention branch, and $Z^{ch}$ is the output of the channel self-attention branch.
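The chain of operations in relation (1) can be traced with a small pure-Python sketch. It treats each 1×1 convolution as a bias-free channel-mixing matrix and leaves the number of output channels of W_q and W_v free; both choices, as well as the function names, are assumptions for illustration rather than details stated in the patent text.

```python
import math


def conv1x1(x, weight):
    """1x1 convolution without bias: mixes channels independently at each pixel."""
    height, width = len(x[0]), len(x[0][0])
    return [[[sum(coef * x[c][i][j] for c, coef in enumerate(w_row))
              for j in range(width)] for i in range(height)]
            for w_row in weight]


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def spatial_attention_weight(x, w_q, w_v):
    """Relation (1): returns an H x W map of weights in (0, 1)."""
    height, width = len(x[0]), len(x[0][0])
    q = conv1x1(x, w_q)
    # F_GP then sigma_1: one average per channel, viewed as a 1 x C' row vector.
    pooled = [sum(sum(row) for row in ch) / (height * width) for ch in q]
    q_soft = softmax(pooled)                                   # F_SM
    # sigma_2: flatten each channel of W_v(X) into a row of length H*W.
    v = [[val for row in ch for val in row] for ch in conv1x1(x, w_v)]
    # Matrix product (1 x C') x (C' x HW) -> 1 x HW.
    scores = [sum(q_soft[c] * v[c][k] for c in range(len(v)))
              for k in range(height * width)]
    # sigma_3 reshapes to H x W; F_SG squashes every entry into (0, 1).
    return [[1.0 / (1.0 + math.exp(-scores[i * width + j])) for j in range(width)]
            for i in range(height)]


x = [[[1.0, 2.0], [3.0, 4.0]],
     [[0.0, 1.0], [-1.0, 0.0]]]
a_sp = spatial_attention_weight(x, w_q=[[1.0, 0.0]], w_v=[[0.0, 1.0]])
```

With a single output channel the softmax is trivially 1, so in this toy call the result is simply the sigmoid of the second input channel; with more output channels the softmax decides how strongly each channel of W_v(X) contributes to the spatial map.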

For the double-branch structure, the global branch has the same structure as the feature extraction layer of the convolutional neural network, and is configured to take the parallel fusion result output by the polarized self-attention mechanism structure as input and extract the global feature corresponding to the pedestrian picture; the local branch is arranged in parallel with the global branch at the same level, and is configured to take the same parallel fusion result as input and extract the local feature corresponding to the pedestrian picture.

Specifically, in the pedestrian re-identification model constructed in the embodiment of the present invention, relative to the original structure of the convolutional neural network, the polarized self-attention mechanism structure acts on the convolutional layers of the original network, and the double-branch structure acts on its feature extraction layer. The output of the convolutional layers, after being processed by the polarized self-attention mechanism structure, serves as the input of the double-branch structure as a whole. Within the double-branch structure, there is no structure that passes information directly between the global branch and the local branch; at the data input level, both branches receive the same input and apply different feature extraction methods to it.

And the output layer of the pedestrian re-identification model takes the fusion result of the global features and the local features as an output result.

It should be noted that the convolutional neural network refers to a neural network containing a convolutional structure. When a commonly used feature extraction network, for example ResNet (deep residual network) or OSNet (Omni-Scale Network), is used as the feature extraction network, the polarized self-attention mechanism structure described in the embodiment of the present invention may be used within its convolutional structure, and the double-branch structure described in the embodiment of the present invention may be used at the end of its convolutional structure. Both structures take part in the convolution computation, and feature fusion of the different branches is realized through the output layer, so the same technical effect as in the embodiment of the present invention can be achieved. Therefore, the embodiment of the present invention does not limit the type of the underlying convolutional neural network, and any neural network model that is based on feature extraction and built for tasks such as object classification and scene segmentation may use the structures of the embodiment of the present invention and falls within the protection scope of the present invention.

And S104, taking the training set as the input of the pedestrian re-recognition model, taking the real label corresponding to the training set as the reference of parameter debugging, and training the pedestrian re-recognition model by using an Adam optimization algorithm to obtain the training parameter weight.

Specifically, taking the real labels corresponding to the training set as the reference for parameter tuning means that, during training of the pedestrian re-identification model, the real labels serve as the evaluation reference for the re-identification effect of the current model. In the embodiment of the present invention, the real label refers to a feature expressed by the pedestrian target in the pedestrian picture, and how well the real label is re-identified reflects how well the overall features of the pedestrian target are re-identified.

Preferably, the loss function used by the pedestrian re-identification model during training includes at least one of triplet loss, classification loss and center loss. In an embodiment of the present invention, the loss function includes all three, and the loss is computed on the data output after the features of the global branch and the local branch are fused. The training parameter weight refers to the weight $A^{sp}(X)$ of the spatial self-attention branch, the weight $A^{ch}(X)$ of the channel self-attention branch, and the parallel fusion result $PSA_p(X)$ in the above embodiment.
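The three loss terms can be sketched for a single sample as follows. The margin, the Euclidean distance, the softmax cross-entropy form of the classification loss and the weighting of the three terms are common choices assumed here for illustration; the patent text names the losses but does not give their formulas or weights.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Pull same-identity features together, push other identities apart."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def classification_loss(logits, label):
    """Softmax cross-entropy on identity logits."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum - logits[label]


def center_loss(feature, center):
    """Half the squared distance between a feature and its identity centre."""
    return 0.5 * sum((x - c) ** 2 for x, c in zip(feature, center))


def total_loss(anchor, positive, negative, logits, label, center,
               w_triplet=1.0, w_cls=1.0, w_center=0.0005):
    # The weights are illustrative assumptions, not values from the patent.
    return (w_triplet * triplet_loss(anchor, positive, negative)
            + w_cls * classification_loss(logits, label)
            + w_center * center_loss(anchor, center))
```

Here `anchor` stands for the fused global and local feature of one picture, which is the quantity on which the text says the loss is computed.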

The number of training iterations of the pedestrian re-identification model is related to the number of batches into which the training set is divided. In the embodiment of the invention, the training set is divided into N batches, and the pedestrian re-identification model is trained iteratively at least N times, so that the parts of the training set corresponding to all the batches are used for a complete training. During training, the parameters of the pedestrian re-identification model are optimized with the Adam optimizer.
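For reference, the Adam update rule can be written out for a single scalar parameter. The sketch below uses the standard Adam formulas with the usual default values for `beta1`, `beta2` and `eps`; the learning rate and the toy objective are assumptions, since the patent text does not state the hyperparameters used.

```python
import math


def adam_step(param, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter."""
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    m_hat = state["m"] / (1 - beta1 ** state["t"])   # bias-corrected first moment
    v_hat = state["v"] / (1 - beta2 ** state["t"])   # bias-corrected second moment
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)


# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3).
w = 0.0
state = {"t": 0, "m": 0.0, "v": 0.0}
for _ in range(500):
    w = adam_step(w, 2.0 * (w - 3.0), state, lr=0.1)
```

In an actual training run the same update is applied to every trainable weight of the model once per batch, which is why iterating over all N batches gives one complete pass over the training set.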

And S105, taking the training parameter weight as a test weight of the pedestrian re-identification model, and taking the test set as the input of the pedestrian re-identification model to obtain a pedestrian re-identification result of the test set.

In the embodiment of the present invention, the training parameter weight refers to the weight parameters in the polarized self-attention mechanism structure. During training, through continuous iterative optimization, these weight parameters gradually approach the values that give the best re-identification effect. The test weight takes the parameter values obtained when training is completed as the values actually used; the test set is then used as the input of the pedestrian re-identification model trained in step S104, and the pedestrian re-identification results corresponding to the pedestrian targets in the test set are obtained.
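Since pedestrian re-identification is treated as a retrieval task, the result for one query picture is typically a ranking of gallery pictures by feature similarity. The following sketch ranks by Euclidean distance between feature vectors; the distance measure and the function name `rank_gallery` are assumptions, as the patent text does not state how features are compared at test time.

```python
import math


def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices sorted from most to least similar to the query."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return sorted(range(len(gallery_feats)),
                  key=lambda i: dist(query_feat, gallery_feats[i]))


gallery = [[0.9, 0.1], [0.0, 1.0], [1.0, 0.0]]
order = rank_gallery([1.0, 0.05], gallery)
```

The first index in `order` is the gallery picture that the model considers most likely to show the same pedestrian as the query.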

For example, please refer to table 1, where table 1 is a comparison result between the pedestrian re-identification method provided by the embodiment of the present invention and other disclosed algorithms.

TABLE 1 comparison of pedestrian re-identification method with other public algorithms

mAP and Rank-1 are common evaluation indexes in the field of pedestrian re-identification, where mAP (mean average precision) represents the identification accuracy and Rank-1 represents the first-place hit rate. The data in Table 1 show that, compared with existing pedestrian re-identification algorithms, the identification accuracy and the first-place hit rate achieved by the pedestrian re-identification model based on the polarized self-attention mechanism structure and the double-branch structure provided by the embodiment of the present invention are greatly improved.
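Both indexes can be computed from ranked retrieval results, as in the minimal sketch below. Each query is represented by a list of flags in ranked order, where a flag is true when the gallery picture at that rank shows the same pedestrian. This is a simplified illustration: benchmark protocols usually also filter out certain gallery pictures before scoring, which is omitted here, and the function names are hypothetical.

```python
def average_precision(ranked_matches):
    """ranked_matches[i] is True when the i-th ranked gallery picture is a correct match."""
    hits, precision_sum = 0, 0.0
    for rank, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def evaluate(all_ranked_matches):
    """Return (mAP, Rank-1) over a list of per-query ranked match flags."""
    aps = [average_precision(m) for m in all_ranked_matches]
    rank1 = sum(1 for m in all_ranked_matches if m and m[0])
    n = len(all_ranked_matches)
    return sum(aps) / n, rank1 / n


queries = [
    [True, False, True],    # AP = (1/1 + 2/3) / 2
    [False, True, False],   # AP = 1/2
]
mean_ap, rank1 = evaluate(queries)
```

Rank-1 only looks at the top result of each query, whereas mAP rewards placing every correct match near the top, so the two indexes are usually reported together.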

The beneficial effect of the invention is that, because a polarized self-attention mechanism structure with two dimensions (channel and spatial) is adopted, information from different feature spaces can be learned adaptively and the information loss caused by dimension reduction is reduced; at the same time, the global features and the local features are extracted separately by the double-branch network, which further improves the identification accuracy of the pedestrian re-identification model.

Referring to fig. 4, fig. 4 is a schematic structural diagram of a pedestrian re-identification system 200 based on polarized self-attention according to an embodiment of the present invention, where the pedestrian re-identification system 200 includes:

a preprocessing module 201, configured to acquire a shooting data set with a pedestrian picture and preprocess the shooting data set to obtain a data set to be divided;

a data dividing module 202, configured to divide the data set to be divided into a training set and a test set, where each pedestrian picture in the training set and the test set has a real label;

the model construction module 203 is used for constructing a pedestrian re-identification model comprising a double-branch structure and a polarized self-attention mechanism structure;

the model training module 204 is used for taking the training set as the input of the pedestrian re-recognition model, taking the real label corresponding to the training set as the reference of parameter debugging, and training the pedestrian re-recognition model by using an Adam optimization algorithm to obtain the training parameter weight;

the data identification module 205 is configured to use the training parameter weight as a test weight of the pedestrian re-identification model, and use the test set as an input of the pedestrian re-identification model, so as to obtain a result of the pedestrian re-identification for the test set.

The pedestrian re-identification system 200 based on polarized self-attention can implement the steps of the pedestrian re-identification method based on polarized self-attention in the foregoing embodiments and achieve the same technical effects; refer to the description of the foregoing embodiments, which is not repeated here.

Referring to fig. 5, fig. 5 is a schematic structural diagram of a computer device provided in an embodiment of the present invention, where the computer device 300 includes: a memory 302, a processor 301, and a computer program stored on the memory 302 and executable on the processor 301.

The processor 301 calls the computer program stored in the memory 302 to execute the steps of the pedestrian re-identification method provided by the embodiment of the present invention, which, with reference to fig. 1, are steps S101 to S105 described above.

Furthermore, in the step of acquiring a shooting data set with a pedestrian picture and preprocessing the shooting data set to obtain a data set to be divided, the preprocessing specifically includes normalizing the size of each picture in the shooting data set and applying data augmentation by flipping, random cropping and random erasing to each picture in the shooting data set.

Further, for the polarized self-attention mechanism structure, the feature matrix of the pedestrian picture output by the convolutional layer is defined as $X$, and the weight of the spatial self-attention branch is defined as $A^{sp}(X)$; the weight $A^{sp}(X)$ of the spatial self-attention branch then satisfies relation (1):

$$A^{sp}(X) = F_{SG}\left[\sigma_3\left(F_{SM}\left(\sigma_1\left(F_{GP}\left(W_q(X)\right)\right)\right) \times \sigma_2\left(W_v(X)\right)\right)\right] \tag{1}$$

the weight of the channel self-attention branch is defined as $A^{ch}(X)$, and the weight $A^{ch}(X)$ of the channel self-attention branch satisfies relation (2):

[relation (2) is missing from this text]

in relations (1) and (2), $W_q$, $W_v$ and a third operator whose symbol is missing from this text are all 1×1 convolution operations, $\sigma_1$, $\sigma_2$ and $\sigma_3$ are all tensor reshape operations, $F_{SM}$ is the softmax operation, $F_{GP}$ is the global average pooling operation, and $F_{SG}$ is the Sigmoid function;

the parallel fusion result of the weight $A^{ch}(X)$ of the channel self-attention branch and the weight $A^{sp}(X)$ of the spatial self-attention branch is defined as $PSA_p(X)$, and the parallel fusion result $PSA_p(X)$ satisfies relation (3):

$$PSA_p(X) = Z^{ch} + Z^{sp} = A^{ch}(X) \odot^{ch} X + A^{sp}(X) \odot^{sp} X \tag{3}$$

in relation (3), $PSA_p(X)$ is the output of the polarized self-attention mechanism structure, $\odot^{ch}$ is the channel-wise multiplication operation, $\odot^{sp}$ is the spatial-wise multiplication operation, $Z^{sp}$ is the output of the spatial self-attention branch, and $Z^{ch}$ is the output of the channel self-attention branch.

Illustratively, when the computer device 300 used in the embodiment of the present invention executes the pedestrian re-identification method provided in the embodiment of the present invention and obtains the data in Table 1 of the above embodiment, the hardware environment is based on an Intel Xeon E5-2630 v4 central processing unit from Intel Corporation, with 128 GB of memory and an NVIDIA GeForce GTX 1080 Ti graphics processing unit; the software environment of the computer program is based on Ubuntu 20.04, the programming language is Python 3.6, the deep learning framework is PyTorch 1.8, and the CUDA version is 11.4. In addition, the computer device 300 provided by the embodiment of the present invention may also be implemented on hardware based on platforms such as Jetson Nano and HiSilicon Hi3559.

The computer device 300 provided in the embodiment of the present invention can implement the steps of the pedestrian re-identification method based on polarized self-attention in the foregoing embodiment and can achieve the same technical effects. For details, refer to the description in the foregoing embodiment; they are not repeated here.

The embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process and step in the pedestrian re-identification method based on polarization self-attention provided in the embodiment of the present invention, and can implement the same technical effects, and in order to avoid repetition, the detailed description is omitted here.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element identified by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

Through the description of the foregoing embodiments, it is clear to those skilled in the art that the method of the foregoing embodiments may be implemented by software plus a necessary general hardware platform, and certainly may also be implemented by hardware, but in many cases, the former is a better implementation. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.

While the present invention has been described in connection with the preferred embodiments of the present invention, as illustrated and described in the accompanying drawings, it is to be understood that the invention is not limited to the disclosed embodiments, but is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims

1. A method for re-identifying a pedestrian based on polarized self-attention, the method comprising:

2. The pedestrian re-identification method based on polarized self-attention as claimed in claim 1, wherein in the step of acquiring a shooting data set with a pedestrian picture and preprocessing the shooting data set to obtain a data set to be divided, the preprocessing specifically comprises:

normalizing the size of each picture in the shooting data set, and performing data augmentation by flipping, random cropping and erasing on each picture in the shooting data set.

3. The pedestrian re-identification method based on the polarized self-attention as claimed in claim 1, wherein the pedestrian re-identification model takes a convolutional neural network as a feature extraction network, the convolutional neural network comprises an input layer, a convolutional layer, a feature extraction layer and an output layer, and the convolutional layer comprises a plurality of layers; the dual-branch structure comprises a global branch and a local branch, and is positioned between the convolutional layer and the output layer; the polarized self-attention mechanism structure includes a channel self-attention branch and a spatial self-attention branch, the polarized self-attention mechanism structure being located after each of the convolutional layers.

4. The pedestrian re-identification method based on polarized self-attention as claimed in claim 3, wherein, for the polarized self-attention mechanism structure, the feature matrix of the pedestrian picture output by the convolutional layer is defined as X and the weight of the channel self-attention branch is defined as A^ch(X); the weight A^ch(X) of the channel self-attention branch then satisfies relation (1):

in relations (1) and (2) above, W_q, W_v,

PSA_p(X) = Z^ch + Z^sp = A^ch(X) ⊙^ch X + A^sp(X) ⊙^sp X    (3)

in relation (3) above, PSA_p(X) is the output of the polarized self-attention mechanism structure, ⊙^ch is the channel-wise multiplication operation, ⊙^sp is the spatial-wise multiplication operation, Z^sp is the output of the spatial self-attention branch, and Z^ch is the output of the channel self-attention branch.

5. The pedestrian re-identification method based on polarized self-attention as claimed in claim 4, wherein, for the dual-branch structure: the global branch has the same structure as the feature extraction layer of the convolutional neural network and is used for taking the parallel fusion result output by the polarized self-attention mechanism structure as input and extracting the global features corresponding to the pedestrian picture; the local branch is parallel to the global branch and is used for taking the parallel fusion result output by the polarized self-attention mechanism structure as input and extracting the local features corresponding to the pedestrian picture.

6. The pedestrian re-identification method based on polarized self-attention as claimed in claim 5, wherein the output layer of the pedestrian re-identification model takes the fusion result of the global features and the local features as the output result.

7. The pedestrian re-identification method based on polarization self-attention as claimed in claim 1, wherein the loss function used by the pedestrian re-identification model in the training process comprises at least one of a triplet loss, a classification loss and a center loss.

8. A polarized self-attention based pedestrian re-identification system, comprising:

and a data identification module, used for taking the training parameter weight as the test weight of the pedestrian re-identification model and taking the test set as the input of the pedestrian re-identification model to obtain the pedestrian re-identification result of the test set.

9. A computer device, comprising: memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps in the method for pedestrian re-identification based on polarized self-attention as claimed in any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium, characterized in that a computer program is stored thereon, which computer program, when being executed by a processor, carries out the steps of the method for pedestrian re-identification based on polarized self-attention as claimed in any one of claims 1 to 7.