CN113095229B

CN113095229B - Unsupervised domain adaptive pedestrian re-identification system and method

Info

Publication number: CN113095229B
Application number: CN202110399589.7A
Authority: CN
Inventors: 程德强; 寇旗旗; 李佳函; 李云龙; 张皓翔; 韩成功; 徐进洋; 江曼; 刘瑞航
Original assignee: China University of Mining and Technology CUMT
Current assignee: China University of Mining and Technology CUMT
Priority date: 2021-04-14
Filing date: 2021-04-14
Publication date: 2024-04-12
Anticipated expiration: 2041-04-14
Also published as: CN113095229A

Abstract

The invention relates to an unsupervised domain adaptive pedestrian re-identification system and method, belongs to the technical field of pedestrian re-identification, and addresses the high difficulty and low recognition accuracy of existing unsupervised domain adaptive pedestrian re-identification. The system comprises: a data acquisition module for acquiring a plurality of source domain sample subsets and a plurality of target domain sample subsets; a network model training module for obtaining a classification loss function and a sample invariance loss function of the pedestrian re-identification network model, sorting and layering the pedestrian pictures in the target domain sample subset according to the similarity between each pedestrian picture in the source domain sample subset and each pedestrian picture in the target domain sample subset to obtain a layered loss function, and then iteratively optimizing the pedestrian re-identification network model; and a re-identification module for re-identifying pedestrians with the optimized pedestrian re-identification network model to obtain images identical or similar to the image of the pedestrian to be identified. The system can reduce the migration loss of the network and improve the accuracy of pedestrian re-identification.

Description

Unsupervised domain adaptive pedestrian re-identification system and method

Technical Field

The invention relates to the technical field of pedestrian re-recognition, in particular to an unsupervised domain self-adaptive pedestrian re-recognition system and method.

Background

In traditional unsupervised domain adaptive learning, most methods are designed for closed-set scenarios in which the sample categories of the source domain and the target domain are consistent. However, conventional unsupervised domain adaptive algorithms cannot be applied directly to unsupervised domain adaptive pedestrian re-identification: owing to how the images are acquired, the pedestrian categories of the source domain and the target domain are almost entirely different, and a network trained on source domain images incurs a large migration loss when it is migrated to the target domain for recognition. This large migration loss makes unsupervised domain adaptive pedestrian re-identification more challenging than most unsupervised learning tasks.

In the prior art, unsupervised domain adaptive pedestrian re-identification distinguishes the unlabeled dataset mainly through self-clustering of the target domain. Three kinds of invariance loss based on the target domain have also been proposed to reduce the migration loss of the network. However, the labeled data is not fully utilized and the deep semantic information of the unlabeled pictures is not fully mined, so the migration loss of the network cannot be effectively reduced and the accuracy of pedestrian re-identification still needs to be improved.

The prior art therefore has at least the following defects: the source domain picture information and the target domain picture information cannot be fully utilized, and the network cannot learn to distinguish negative samples of different similarity according to the similarity between source domain pictures and target domain pictures. As a result, the migration loss of the pedestrian re-identification network is large and the accuracy of the pedestrian re-identification result is low.

Disclosure of Invention

In view of the above analysis, the embodiment of the invention aims to provide an unsupervised domain self-adaptive pedestrian re-recognition system and method, which are used for solving the problems of large migration loss of a pedestrian re-recognition network and low accuracy of a pedestrian re-recognition result in the prior art.

In one aspect, the present invention provides an unsupervised domain adaptive pedestrian re-recognition system, comprising:

a data acquisition module for acquiring a source domain sample set comprising a plurality of source domain sample subsets and a target domain sample set comprising a plurality of target domain sample subsets;

the network model training module is used for obtaining a classification loss function and a sample invariance loss function of the pedestrian re-identification network model, sorting and layering pedestrian pictures in the target domain sample subset according to the similarity of each pedestrian picture in the source domain sample subset and each pedestrian picture in the target domain sample subset, and obtaining a layering loss function based on the characteristics of the layered pedestrian pictures and the corresponding layering weights; performing iterative optimization on the pedestrian re-recognition network model by traversing each source domain sample subset and each target domain sample subset based on the classification loss function, the sample invariance loss function and the hierarchical loss function;

and the re-recognition module is used for recognizing the pedestrian image to be recognized by utilizing the optimized pedestrian re-recognition network model, and obtaining an image identical or similar to the pedestrian image to be recognized.

Further, the pedestrian re-identification network model comprises a residual network structure; a fully connected layer and a Softmax normalization layer which are sequentially connected and correspond to the classification loss function; and a target domain memory and an L₂ normalization layer which are sequentially connected and correspond to the sample invariance loss function. The residual network structure is connected to the fully connected layer, the target domain memory, the source domain memory and the similarity measurement axis network structure, respectively.

Further, the network model training module obtains the layered loss function specifically by:

respectively inputting a source domain sample subset and a target domain sample subset into a residual network structure of a pedestrian re-identification network model to extract image features so as to obtain the features of each pedestrian picture in the source domain sample subset and the features of each pedestrian picture in the target domain sample subset, respectively storing the features of each pedestrian picture in the source domain sample subset and the features of each pedestrian picture in the target domain sample subset, and multiplying the features of each pedestrian picture in the source domain sample subset by the features of each pedestrian picture in the target domain sample subset to obtain corresponding similarity;

sorting pedestrian pictures in the target domain sample subset in a descending order based on the similarity, sequentially selecting a first preset number of pictures as a first layer of pictures according to the order, selecting a second preset number of pictures as a second layer of pictures, selecting the rest pictures as a third layer of pictures, and respectively setting layer weights of the three layers of pictures;

obtaining a layered loss function based on the features of each pedestrian picture in the target domain sample subset and the layer weight of the layer to which each picture belongs:

wherein L_SL denotes the layered loss function; n_t denotes the number of pedestrian pictures in the target domain sample subset; w_{t,r} denotes the layer weight; r denotes the rank of a pedestrian picture in the target domain sample subset when sorted by similarity; x_{t,i} denotes the pedestrian picture with input order i when the target domain sample subset is input into the pedestrian re-identification network model; f(x_{t,i}) denotes the feature of pedestrian picture x_{t,i}; p(m|x_{t,i}) denotes the probability that pedestrian picture x_{t,i} is similar to the pedestrian pictures of all categories in all stored source domain sample subsets; R₁[m] denotes the features of the pedestrian pictures of all categories in all stored source domain sample subsets; R₁[j] denotes the feature of the class-j pedestrian pictures in all stored source domain sample subsets; N_s denotes the number of categories of pedestrian pictures in all stored source domain sample subsets; and β is a temperature coefficient.

Further, the set layer weights of the three layers of pictures are as follows:

wherein b denotes the number of first-layer pictures; k₁ denotes the sum of the numbers of first-layer and second-layer pictures; p(l|i) denotes the probability that pedestrian picture x_{t,i} belongs to the class-l pedestrian pictures in all stored target domain sample subsets; R₁[l] denotes the feature of the class-l pedestrian pictures in all stored target domain sample subsets; and N_t denotes the number of categories of pedestrian pictures in all stored target domain sample subsets.

Further, the network model training module obtains the classification loss function specifically by:

inputting the source domain sample subset into a residual network structure of a pedestrian re-recognition network model to extract image features so as to obtain the features of each pedestrian picture in the source domain sample subset;

sequentially inputting the characteristics of each pedestrian picture into a full-connection layer and a softmax regression layer of the pedestrian re-recognition network model, and carrying out characteristic dimension conversion and characteristic normalization;

obtaining the classification loss function from the features of each pedestrian picture after dimension conversion and normalization, using the following formula:

wherein L_src denotes the classification loss function; n_s denotes the number of pedestrian pictures in the source domain sample subset; x_{s,m} denotes the m-th pedestrian picture; f(x_{s,m}) denotes the feature of pedestrian picture x_{s,m}; y_{s,m} denotes the category label of pedestrian picture x_{s,m} in the source domain sample subset; p(y_{s,m}|x_{s,m}) denotes the probability that pedestrian picture x_{s,m} belongs to category y_{s,m}; f(x_{s,j}) denotes the feature of the j-th pedestrian picture in the source domain sample subset; and N_s denotes the number of categories of pedestrian pictures in all stored source domain sample subsets.

Further, the network model training module obtains the sample invariance loss function by:

inputting the target domain sample subset into the residual network structure of the pedestrian re-identification network model to extract image features, so as to obtain the features of each pedestrian picture in the target domain sample subset, and applying L₂ normalization to the features of each pedestrian picture;

the sample invariance loss function is obtained according to the characteristics of each pedestrian picture after the normalization processing through the following formula:

wherein L_T denotes the sample invariance loss function; n_t denotes the number of pedestrian pictures in the target domain sample subset; x_{t,i} denotes the pedestrian picture with input order i when the target domain sample subset is input into the pedestrian re-identification network model; f(x_{t,i}) denotes the feature of pedestrian picture x_{t,i}; w_{i,l} denotes the weight with which pedestrian picture x_{t,i} belongs to the class-l pedestrian pictures; p(l|x_{t,i}) denotes the probability that pedestrian picture x_{t,i} belongs to the class-l pedestrian pictures; N_t denotes the number of categories of pedestrian pictures in all stored target domain sample subsets; and β is a temperature coefficient.

Further, the network model training module performs iterative optimization on the pedestrian re-recognition network model specifically by the following manner:

obtaining a total loss function according to the classification loss function, the sample invariance loss function and the layering loss function:

L = λ₁ L_src + λ₂ L_T + λ₃ L_SL,

wherein L denotes the total loss function; L_src denotes the classification loss function; λ₁ denotes the weight of the classification loss function; L_T denotes the sample invariance loss function; λ₂ denotes the weight of the sample invariance loss function; L_SL denotes the layered loss function; and λ₃ denotes the weight of the layered loss function;

and traversing each source domain sample subset and each target domain sample subset, and iteratively updating the total loss function until the variation of the total loss function value is smaller than a preset value, so as to finish the optimization of the pedestrian re-identification network model.

On the other hand, the invention provides an unsupervised domain self-adaptive pedestrian re-identification method, which comprises the following steps:

obtaining a source domain sample set comprising a plurality of source domain sample subsets and a target domain sample set comprising a plurality of target domain sample subsets;

respectively obtaining a classification loss function and a sample invariance loss function of a pedestrian re-identification network model based on the source domain sample subset and the target domain sample subset, sorting and layering pedestrian pictures in the target domain sample subset according to the similarity of each pedestrian picture in the source domain sample subset and each pedestrian picture in the target domain sample subset, and obtaining a layering loss function based on the characteristics of the layered pedestrian pictures and the corresponding layering weights; performing iterative optimization on the pedestrian re-recognition network model by traversing each source domain sample subset and each target domain sample subset based on the classification loss function, the sample invariance loss function and the hierarchical loss function;

and identifying the pedestrian image to be identified by using the optimized pedestrian re-identification network model, and obtaining the image which is the same as or similar to the pedestrian image to be identified.

Further, the layered loss function is obtained specifically by:

wherein L_SL denotes the layered loss function; n_t denotes the number of pedestrian pictures in the target domain sample subset; w_{t,r} denotes the layer weight; r denotes the rank of a pedestrian picture in the target domain sample subset when sorted by similarity; x_{t,i} denotes the pedestrian picture with input order i when the target domain sample subset is input into the pedestrian re-identification network model; f(x_{t,i}) denotes the feature of pedestrian picture x_{t,i}; p(m|x_{t,i}) denotes the probability that pedestrian picture x_{t,i} is similar to the pedestrian pictures of all categories in all stored source domain sample subsets; R₁[m] denotes the features of the pedestrian pictures of all categories in all stored source domain sample subsets; R₁[j] denotes the feature of the j-th pedestrian picture in all stored source domain sample subsets; N_s denotes the number of categories of pedestrian pictures in all stored source domain sample subsets; and β is a temperature coefficient.

Further, the set layer weights of the three layers of pictures are as follows:

Compared with the prior art, the invention has at least one of the following beneficial effects:

1. According to the unsupervised domain adaptive pedestrian re-identification system and method provided by the invention, supervised and unsupervised pedestrian re-identification are effectively combined by using labeled source domain samples together with unlabeled target domain samples, and, for the first time, the target domain samples are layered according to the similarity between source domain samples and target domain samples. This improves the ability of the pedestrian re-identification network to discriminate target domain samples, reduces the migration loss of the network, and improves the accuracy of the pedestrian re-identification result.

2. According to the unsupervised domain adaptive pedestrian re-identification system and method provided by the invention, the negative samples are layered and weighted using the similarity between the features of source domain samples and target domain samples to obtain the layered loss function. During training of the pedestrian re-identification network model, this loss function lets the network adaptively reduce its learning of negative-sample features, thereby reducing the migration loss of the network. On this basis, the classification loss function and the sample invariance loss function are combined to iteratively optimize the pedestrian re-identification network, which can greatly improve the accuracy of the pedestrian re-identification result.

In the invention, the technical schemes can be mutually combined to realize more preferable combination schemes. Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and drawings.

Drawings

The drawings are only for purposes of illustrating particular embodiments and are not to be construed as limiting the invention, like reference numerals being used to refer to like parts throughout the several views.

FIG. 1 is a schematic diagram of an unsupervised domain adaptive pedestrian re-identification system in accordance with an embodiment of the present invention;

FIG. 2 is a schematic diagram of a pedestrian re-recognition network model according to an embodiment of the present invention;

FIG. 3 is a flowchart of an unsupervised domain adaptive pedestrian re-identification method according to an embodiment of the present invention.

Reference numerals:

110-a data acquisition module; 120-a network model training module; 130-re-identification module.

Detailed Description

Preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings, which form a part hereof, and together with the description serve to explain the principles of the invention, and are not intended to limit the scope of the invention.

System embodiment

An embodiment of the invention discloses an unsupervised domain self-adaptive pedestrian re-identification system. As shown in fig. 1, the system includes:

A data acquisition module 110 for acquiring a source domain sample set comprising a plurality of source domain sample subsets and a target domain sample set comprising a plurality of target domain sample subsets. Specifically, a large number of pedestrian pictures are randomly collected from different angles, each picture containing a pedestrian. The collected pictures are divided into a training set and a test set: the training set is used to train the pedestrian re-identification network model, and the test set is used to test the trained model so as to ensure its recognition accuracy. Part of the pictures in the training set are randomly selected and labeled, with the same label added to multiple pictures of the same pedestrian and different labels representing different pedestrians; the labeled pictures are divided into a plurality of source domain sample subsets. The remaining unlabeled pictures in the training set form the target domain sample set; a number is added to each of these pictures, and the numbered pictures are randomly divided into a plurality of target domain sample subsets.
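The partitioning described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name `split_domains`, the `labeled_fraction` and `subset_size` parameters, and the dictionary of pedestrian identities are all hypothetical.

```python
import random

def split_domains(pictures, identities, labeled_fraction=0.5, subset_size=4, seed=0):
    """Split training pictures into labeled source subsets and numbered target subsets.

    pictures: list of picture ids; identities: dict picture id -> pedestrian identity.
    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    shuffled = pictures[:]
    rng.shuffle(shuffled)
    n_labeled = int(len(shuffled) * labeled_fraction)

    # Source domain: randomly selected pictures that receive an identity label.
    source = [(pic, identities[pic]) for pic in shuffled[:n_labeled]]
    # Target domain: the remaining unlabeled pictures, which receive only a number.
    target = [(number, pic) for number, pic in enumerate(shuffled[n_labeled:])]

    def chunk(items):
        return [items[i:i + subset_size] for i in range(0, len(items), subset_size)]

    return chunk(source), chunk(target)
```

Each source entry pairs a picture with its label, while each target entry pairs a running number with a picture, mirroring the distinction between labels and numbers in the text.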

The network model training module 120 is configured to obtain a classification loss function and a sample invariance loss function of the pedestrian re-identification network model, sort and layer the pedestrian pictures in the target domain sample subset according to the similarity between each pedestrian picture in the source domain sample subset and each pedestrian picture in the target domain sample subset, and obtain a layered loss function based on the features of the layered pedestrian pictures and the corresponding layer weights. The pedestrian re-identification network model is then iteratively optimized by traversing each source domain sample subset and each target domain sample subset based on the classification loss function, the sample invariance loss function and the layered loss function.

The re-identification module 130 is configured to identify the pedestrian image to be identified using the optimized pedestrian re-identification network model and to obtain images identical or similar to it. Specifically, the pedestrian picture to be identified is input into the trained pedestrian re-identification network model, which outputs the pictures most similar to it. The model is set to output the numbers of the three pictures with the highest similarity to the pedestrian picture to be identified; these three pictures are pictures in the target domain sample set. The pedestrian picture to be identified is then manually compared with these three pictures to determine its category.
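The retrieval step can be illustrated with a short sketch. It assumes that similarity is the cosine similarity between feature vectors, which the text does not specify; the names `top_k_similar` and `gallery` are hypothetical.

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def top_k_similar(query, gallery, k=3):
    """Return the numbers of the k gallery pictures most similar to the query.

    gallery: dict picture number -> feature vector. Cosine similarity is an
    assumption; the text only requires "highest similarity".
    """
    q = l2_normalize(query)
    scores = {
        number: sum(a * b for a, b in zip(q, l2_normalize(feat)))
        for number, feat in gallery.items()
    }
    # Sort picture numbers by descending similarity and keep the first k.
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

With `k=3` the function returns the three picture numbers that a person would then compare against the query picture.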

Preferably, as shown in FIG. 2, the pedestrian re-identification network model comprises a residual network structure; a fully connected layer and a Softmax normalization layer which are sequentially connected and correspond to the classification loss function; and a target domain memory and an L₂ normalization layer which are sequentially connected and correspond to the sample invariance loss function. The residual network structure is connected to the fully connected layer, the target domain memory, the source domain memory and the similarity measurement axis network structure, respectively. Specifically, the target domain memory and the source domain memory are both key-value storage structures: the key stores the feature of a pedestrian picture, and the value stores the number or label corresponding to that picture. The residual network structure is, illustratively, ResNet50.
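A key-value memory of this kind might look like the sketch below. The class name `FeatureMemory`, its methods, and the choice to overwrite an entry that already has the same number or label are assumptions made for illustration only.

```python
from dataclasses import dataclass, field

@dataclass
class FeatureMemory:
    """Key-value store: the key slot holds a feature vector, the value slot its number or label."""
    keys: list = field(default_factory=list)    # feature vectors
    values: list = field(default_factory=list)  # picture numbers (target) or labels (source)

    def store(self, feature, tag):
        # Assumption: an entry with the same number or label is overwritten.
        if tag in self.values:
            self.keys[self.values.index(tag)] = list(feature)
        else:
            self.keys.append(list(feature))
            self.values.append(tag)

    def lookup(self, tag):
        # Return the feature stored for the given number or label.
        return self.keys[self.values.index(tag)]
```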

Preferably, the network model training module obtains the classification loss function, the sample invariance loss function, and the layering loss function specifically by:

Step 1, respectively inputting a source domain sample subset and a target domain sample subset into the residual network structure of the pedestrian re-identification network model to extract image features, so as to obtain the features of each pedestrian picture in the source domain sample subset and the features of each pedestrian picture in the target domain sample subset.

The features of each pedestrian picture in the target domain sample subset are stored in the target domain memory, and the features of each pedestrian picture in the source domain sample subset are stored in the source domain memory.

Step 2, specifically, obtaining a classification loss function by the following method:

The features of each pedestrian picture in the source domain sample subset are sequentially input into the fully connected layer and the softmax regression layer of the pedestrian re-identification network model for feature dimension conversion and feature normalization, which strengthens the nonlinearity of the model during learning.
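As a rough illustration of this step, the sketch below passes a feature through a fully connected layer and a softmax, then averages the negative log-probability of the true label. It assumes the standard softmax cross-entropy form; the function names and the layout of `weights` (one row per identity class) are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fully_connected(feature, weights, bias):
    # weights has one row per identity class, so the output has one logit per class.
    return [sum(w * x for w, x in zip(row, feature)) + b for row, b in zip(weights, bias)]

def classification_loss(features, labels, weights, bias):
    """Mean negative log-probability of the true label over a source subset (assumed form)."""
    losses = []
    for feature, label in zip(features, labels):
        probs = softmax(fully_connected(feature, weights, bias))
        losses.append(-math.log(probs[label]))
    return sum(losses) / len(losses)
```

A feature that the layer maps strongly toward its true class yields a loss near zero, while an uninformative feature yields the logarithm of the number of classes.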

Step 3, specifically, obtaining a sample invariance loss function by the following method:

L₂ normalization is applied to the features of each pedestrian picture in the current target domain sample subset stored in the target domain memory.

wherein L_T denotes the sample invariance loss function; n_t denotes the number of pedestrian pictures in the current target domain sample subset; x_{t,i} denotes the pedestrian picture with input order i when the current target domain sample subset is input into the pedestrian re-identification network model; f(x_{t,i}) denotes the feature of pedestrian picture x_{t,i}; w_{i,l} denotes the weight with which pedestrian picture x_{t,i} belongs to the class-l pedestrian pictures; p(l|x_{t,i}) denotes the probability that pedestrian picture x_{t,i} belongs to the class-l pedestrian pictures; N_t denotes the number of categories of pedestrian pictures in all stored target domain sample subsets; and β is a temperature coefficient.
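One way to read these definitions is sketched below. It assumes that p(l|x_{t,i}) is a softmax over dot products between the L₂-normalized feature and the target memory slots, scaled by the temperature β, and that the loss is a w_{i,l}-weighted negative log-likelihood averaged over the subset. The exact formula may differ; the function names and the default `beta=0.05` are assumptions.

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def memory_log_probabilities(feature, memory, beta=0.05):
    """Log-softmax over similarities between one feature and every memory slot."""
    f = l2_normalize(feature)
    logits = [sum(a * b for a, b in zip(slot, f)) / beta for slot in memory]
    m = max(logits)
    log_total = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_total for z in logits]

def invariance_loss(features, memory, weights, beta=0.05):
    """Weighted negative log-likelihood averaged over the target subset (assumed form).

    weights[i][l] plays the role of w_{i,l}; how it is chosen is left to the caller.
    """
    total = 0.0
    for feature, w_row in zip(features, weights):
        log_probs = memory_log_probabilities(feature, memory, beta)
        total -= sum(w * lp for w, lp in zip(w_row, log_probs))
    return total / len(features)
```

A smaller β sharpens the softmax, so a picture is pulled more strongly toward the memory slots that carry nonzero weight.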

Step 4, specifically, obtaining the layered loss function by the following method:

multiplying the characteristics of each pedestrian picture in the current source domain sample subset with the characteristics of each pedestrian picture in the current target domain sample subset based on the similarity measurement axis network structure to obtain corresponding similarity;

and sorting pedestrian pictures in the target domain sample subset in a descending order based on the similarity, sequentially selecting a first preset number of pictures as a first layer of pictures according to the order, selecting a second preset number of pictures as a second layer of pictures, selecting the rest pictures as a third layer of pictures, and respectively setting the layer weights of the three layers of pictures. Illustratively, the number of first layer pictures is 3 and the number of second layer pictures is 147.

wherein L_SL denotes the layered loss function; n_t denotes the number of pedestrian pictures in the target domain sample subset; w_{t,r} denotes the layer weight; r denotes the rank of a pedestrian picture in the target domain sample subset when sorted by similarity; x_{t,i} denotes the pedestrian picture with input order i when the target domain sample subset is input into the pedestrian re-identification network model; f(x_{t,i}) denotes the feature of pedestrian picture x_{t,i}; p(m|x_{t,i}) denotes the probability that pedestrian picture x_{t,i} is similar to the pedestrian pictures of all categories in all stored source domain sample subsets; R₁[m] denotes the features of the pedestrian pictures of all categories in all stored source domain sample subsets; R₁[j] denotes the feature of the j-th pedestrian picture in all stored source domain sample subsets; N_s denotes the number of categories of pedestrian pictures in all stored source domain sample subsets; and β is a temperature coefficient.
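The sorting and layering in step 4 can be sketched as follows. The text obtains similarities by multiplying source and target features but does not say how they are reduced to one score per target picture; the sketch assumes the highest dot product with any source picture. The function name `layer_targets` is hypothetical, and the defaults 3 and 147 are the illustrative layer sizes given above.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def layer_targets(source_feats, target_feats, first=3, second=147):
    """Rank target pictures by similarity to the source subset and split them into three layers.

    Returns (order, layers): order lists target indices from most to least similar;
    layers[i] is 1, 2 or 3 for target index i.
    """
    # Assumption: a target picture's score is its highest dot product with any source picture.
    scores = [max(dot(t, s) for s in source_feats) for t in target_feats]
    order = sorted(range(len(target_feats)), key=lambda i: scores[i], reverse=True)
    layers = [3] * len(target_feats)  # everything not in the first two layers
    for rank, idx in enumerate(order):
        if rank < first:
            layers[idx] = 1
        elif rank < first + second:
            layers[idx] = 2
    return order, layers
```

Each picture's layer then selects the layer weight w_{t,r} that enters the layered loss.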

Preferably, the layer weights of the three layers of pictures are set as follows:

Specifically, the steps 2 to 4 may be performed simultaneously without separating the steps from each other.

Step 5, iteratively optimizing the pedestrian re-identification network model in the following manner:

obtaining a total loss function from the classification loss function, the sample invariance loss function and the hierarchical loss function:

L = λ₁ L_src + λ₂ L_T + λ₃ L_SL,

wherein L denotes the total loss function; L_src denotes the classification loss function; λ₁ denotes the weight of the classification loss function; L_T denotes the sample invariance loss function; λ₂ denotes the weight of the sample invariance loss function; L_SL denotes the layered loss function; and λ₃ denotes the weight of the layered loss function. Illustratively, λ₁ is 0.7, λ₂ is 0.3, and λ₃ is 0.2.
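The weighted sum is simple enough to state directly in code. The function name `total_loss` is hypothetical; the default weights are the illustrative values 0.7, 0.3 and 0.2 from the text.

```python
def total_loss(l_src, l_t, l_sl, lambdas=(0.7, 0.3, 0.2)):
    """Weighted sum L = λ1·L_src + λ2·L_T + λ3·L_SL."""
    lam1, lam2, lam3 = lambdas
    return lam1 * l_src + lam2 * l_t + lam3 * l_sl
```

For example, loss values of 1.0, 2.0 and 3.0 combine to 0.7 + 0.6 + 0.6 = 1.9.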

Each source domain sample subset and each target domain sample subset is traversed, steps 1 to 5 are repeated, and the total loss function is iteratively updated until the variation of the total loss function value is smaller than a preset value, which completes the optimization of the pedestrian re-identification network model.
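The stopping rule can be sketched as a loop that ends once the change in the total loss falls below a preset value. The pairing of source and target subsets, the caller-supplied `step` function, and the `tolerance` and `max_epochs` defaults are assumptions made for illustration.

```python
def train_until_converged(subset_pairs, step, tolerance=1e-4, max_epochs=100):
    """Repeat passes over (source subset, target subset) pairs until the loss stabilizes.

    step(source_subset, target_subset) is a caller-supplied function that runs steps 1 to 5
    on one pair, updates the model, and returns the total loss value.
    """
    previous = None
    for epoch in range(max_epochs):
        for source_subset, target_subset in subset_pairs:
            loss = step(source_subset, target_subset)
            # Stop as soon as the loss changes by less than the preset value.
            if previous is not None and abs(previous - loss) < tolerance:
                return loss, epoch
            previous = loss
    return previous, max_epochs
```

The `max_epochs` cap is a safeguard added in this sketch so that the loop terminates even if the loss never stabilizes.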

Specifically, in the iterative updating process, the pedestrian picture characteristics of the target domain sample subset stored in the target domain memory are updated in real time by the following modes:

wherein x_{t,i} denotes the pedestrian picture with input order i when the target domain sample subset is input into the pedestrian re-identification network model; f(x_{t,i}) denotes the L₂-normalized feature of pedestrian picture x_{t,i}; R_i denotes the feature of target domain pedestrian picture x_{t,i} stored in the target domain memory; and the remaining coefficient is a hyper-parameter that controls the update rate of the features.
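A common way to realize such an update is an exponential moving average followed by re-normalization, sketched below. This form, the name `alpha` for the update-rate hyper-parameter, and its default value are assumptions and are not taken from the text.

```python
import math

def update_memory_slot(stored, feature, alpha=0.01):
    """Blend the stored feature R_i with the new L2-normalized feature f(x_{t,i}).

    Assumed form: R_i <- alpha * R_i + (1 - alpha) * f(x_{t,i}), then re-normalize to unit length.
    """
    blended = [alpha * r + (1.0 - alpha) * f for r, f in zip(stored, feature)]
    norm = math.sqrt(sum(x * x for x in blended)) or 1.0
    return [x / norm for x in blended]
```

In this assumed form, a larger `alpha` keeps more of the stored feature, so the memory changes more slowly between iterations.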

Method embodiment

Another embodiment of the present invention discloses an unsupervised domain adaptive pedestrian re-identification method. Since the method embodiment and the system embodiment are based on the same principle, the details are not repeated here; for the repeated parts, reference may be made to the system embodiment above.

Specifically, as shown in fig. 3, the method includes:

S110, acquiring a source domain sample set comprising a plurality of source domain sample subsets and a target domain sample set comprising a plurality of target domain sample subsets.

S120, respectively obtaining a classification loss function and a sample invariance loss function of a pedestrian re-identification network model based on the source domain sample subset and the target domain sample subset, sorting and layering pedestrian pictures in the target domain sample subset according to the similarity of each pedestrian picture in the source domain sample subset and each pedestrian picture in the target domain sample subset, and obtaining a layering loss function based on the layered pedestrian picture characteristics and the corresponding layering weights; and performing iterative optimization on the pedestrian re-recognition network model by traversing each source domain sample subset and each target domain sample subset based on the classification loss function, the sample invariance loss function and the hierarchical loss function.

S130, identifying the pedestrian image to be identified by using the optimized pedestrian re-identification network model, and obtaining images identical or similar to the pedestrian image to be identified.

Preferably, the layered loss function is obtained in particular by:

wherein L_SL denotes the layered loss function; n_t denotes the number of pedestrian pictures in the target domain sample subset; w_{t,r} denotes the layer weight; r denotes the rank of a pedestrian picture in the target domain sample subset when sorted by similarity; x_{t,i} denotes the pedestrian picture with input order i when the target domain sample subset is input into the pedestrian re-identification network model; f(x_{t,i}) denotes the feature of pedestrian picture x_{t,i}; p(m|x_{t,i}) denotes the probability that pedestrian picture x_{t,i} is similar to the pedestrian pictures of all categories in all stored source domain sample subsets; R₁[m] denotes the features of the pedestrian pictures of all categories in all stored source domain sample subsets; R₁[j] denotes the feature of the j-th pedestrian picture in all stored source domain sample subsets; N_s denotes the number of categories of pedestrian pictures in all stored source domain sample subsets; and β is a temperature coefficient.

Preferably, the set layer weights of the three layers of pictures are as follows:

wherein b represents the number of pictures in the first layer, k_1 represents the sum of the numbers of pictures in the first layer and the second layer, p(l|i) represents the probability that pedestrian picture x_{t,i} belongs to the l-th class of pedestrian pictures in all stored target domain sample subsets, R_1[l] represents the feature of the l-th class of pedestrian pictures in all stored target domain sample subsets, and N_t represents the number of categories of pedestrian pictures in all stored target domain sample subsets.
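The definitions of b and k_1 imply that the sorted pictures are split into three layers by rank: the first b pictures, the next k_1 - b pictures, and the remainder. The sketch below shows only that partition. The weight formula is not reproduced in this text, so the weight values, the descending sort order, and the function name are placeholders assumed for illustration.

```python
def assign_layers(similarities, b, k1):
    """Sort pictures by similarity (descending, an assumption) and return each picture's layer."""
    order = sorted(range(len(similarities)), key=lambda i: similarities[i], reverse=True)
    layers = [0] * len(similarities)
    for rank, idx in enumerate(order, start=1):
        if rank <= b:
            layers[idx] = 1   # first layer: ranks 1..b
        elif rank <= k1:
            layers[idx] = 2   # second layer: ranks b+1..k1
        else:
            layers[idx] = 3   # third layer: all remaining ranks
    return layers

sims = [0.2, 0.9, 0.5, 0.7, 0.1]  # hypothetical similarities of five target pictures
layers = assign_layers(sims, b=1, k1=3)

# Placeholder weights only; the actual layer weights are given by the omitted formula.
PLACEHOLDER_WEIGHTS = {1: 1.0, 2: 0.5, 3: 0.1}
weights = [PLACEHOLDER_WEIGHTS[layer] for layer in layers]
```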

Compared with the prior art, the unsupervised domain self-adaptive pedestrian re-recognition system and method provided by this embodiment combine supervised and unsupervised pedestrian re-recognition by using labeled source domain samples together with unlabeled target domain samples. On the one hand, the target domain samples are, for the first time, layered according to the similarity between the source domain samples and the target domain samples, which improves the ability of the pedestrian re-recognition network to discriminate target domain samples, reduces the migration loss of the network, and thereby improves the accuracy of the re-recognition results. On the other hand, the negative samples are layered and weighted according to the similarity between the features of the source domain samples and the target domain samples to obtain a layered loss function; during training of the pedestrian re-recognition network model, this loss function lets the network adaptively reduce its learning of negative-sample features, which reduces the migration loss of the network. On this basis, the classification loss function and the sample invariance loss function are combined with the layered loss function to iteratively optimize the pedestrian re-recognition network, which can greatly improve the accuracy of its re-recognition results.

Those skilled in the art will appreciate that all or part of the flow of the methods of the embodiments described above may be implemented by a computer program instructing the relevant hardware, and that the program may be stored in a computer-readable storage medium. The computer-readable storage medium is a magnetic disk, an optical disk, a read-only memory, a random access memory, or the like.

The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention.

Claims

1. An unsupervised domain adaptive pedestrian re-recognition system, comprising:

the network model training module is used for obtaining a classification loss function and a sample invariance loss function of the pedestrian re-identification network model, sorting and layering the pedestrian pictures in the target domain sample subset according to the similarity between each pedestrian picture in the source domain sample subset and each pedestrian picture in the target domain sample subset, and obtaining a layered loss function based on the features of the layered pedestrian pictures and the corresponding layer weights; performing iterative optimization on the pedestrian re-recognition network model by traversing each source domain sample subset and each target domain sample subset based on the classification loss function, the sample invariance loss function and the layered loss function; the pedestrian re-recognition network model comprises a residual network structure, a full connection layer and a Softmax normalization layer which are sequentially connected and correspond to the classification loss function, a target domain memory and an L_2 normalization layer which are sequentially connected and correspond to the sample invariance loss function, and a source domain memory and a similarity measurement shaft network structure which are sequentially connected and correspond to the layered loss function, and the residual network structure is respectively connected with the full connection layer, the target domain memory, the source domain memory and the similarity measurement shaft network structure;

the network model training module obtains a layered loss function specifically by the following method:

wherein L_SL represents the layered loss function, n_t represents the number of pedestrian pictures in a target domain sample subset, w_{t,r} represents the layer weight, r represents the rank of a pedestrian picture in the target domain sample subset when the pictures are sorted by similarity, x_{t,i} represents the pedestrian picture with input order i when the target domain sample subset is input into the pedestrian re-recognition network model, f(x_{t,i}) represents the feature of pedestrian picture x_{t,i}, p(m|x_{t,i}) represents the probability that pedestrian picture x_{t,i} is similar to the pedestrian pictures of all categories in all stored source domain sample subsets, R_1[m] represents the features of all categories of pedestrian pictures in all stored source domain sample subsets, R_1[j] represents the feature of the j-th pedestrian picture in all stored source domain sample subsets, N_s represents the number of categories of pedestrian pictures in all stored source domain sample subsets, and β is a temperature coefficient;

the set layer weights of the three layers of pictures are as follows:

wherein b represents the number of pictures in the first layer, k_1 represents the sum of the numbers of pictures in the first layer and the second layer, p(l|i) represents the probability that pedestrian picture x_{t,i} belongs to the l-th class of pedestrian pictures in all stored target domain sample subsets, R_1[l] represents the feature of the l-th class of pedestrian pictures in all stored target domain sample subsets, and N_t represents the number of categories of pedestrian pictures in all stored target domain sample subsets;

2. The unsupervised domain adaptive pedestrian re-recognition system according to claim 1, wherein the network model training module obtains the classification loss function by:

wherein L_src represents the classification loss function, n_s represents the number of pedestrian pictures in a source domain sample subset, x_{s,m} represents the m-th pedestrian picture in the source domain sample subset, f(x_{s,m}) represents the feature of pedestrian picture x_{s,m}, y_{s,m} represents the category label of pedestrian picture x_{s,m} in the source domain sample subset, p(y_{s,m}|x_{s,m}) represents the probability that pedestrian picture x_{s,m} belongs to category y_{s,m}, f(x_{s,j}) represents the feature of the j-th pedestrian picture in the source domain sample subset, and N_s represents the number of categories of pedestrian pictures in all stored source domain sample subsets.
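For illustration only, and without limiting the claim: the formula is not reproduced in this text, but the symbols above (a per-picture probability of the labeled category, averaged over n_s pictures) match the usual cross-entropy form. The sketch below assumes that form; the function name and the toy probabilities are hypothetical.

```python
import math

def classification_loss(probabilities, labels):
    """Assumed form: mean negative log-probability of the labeled category."""
    total = 0.0
    for probs, label in zip(probabilities, labels):
        total -= math.log(probs[label])  # -log p(y_{s,m} | x_{s,m})
    return total / len(labels)           # average over the n_s pictures

# Hypothetical Softmax outputs for two source pictures over three categories.
batch_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
batch_labels = [0, 1]
loss_src = classification_loss(batch_probs, batch_labels)
```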

3. The unsupervised domain adaptive pedestrian re-recognition system according to claim 1, wherein the network model training module obtains the sample invariance loss function by:

wherein L_T represents the sample invariance loss function, n_t represents the number of pedestrian pictures in a target domain sample subset, x_{t,i} represents the pedestrian picture with input order i when the target domain sample subset is input into the pedestrian re-recognition network model, f(x_{t,i}) represents the feature of pedestrian picture x_{t,i}, w_{i,l} represents the weight with which pedestrian picture x_{t,i} belongs to the l-th class of pedestrian pictures, p(l|x_{t,i}) represents the probability that pedestrian picture x_{t,i} belongs to the l-th class of pedestrian pictures, N_t represents the number of categories of pedestrian pictures in all stored target domain sample subsets, and β is a temperature coefficient.
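For illustration only, and without limiting the claim: the formula is not reproduced in this text. Given per-class weights w_{i,l} and probabilities p(l|x_{t,i}), one plausible reading is a weighted negative log-likelihood averaged over the n_t target pictures. The sketch below assumes that reading; the function name and all numbers are hypothetical.

```python
import math

def invariance_loss(probabilities, weights):
    """Assumed form: -(1/n_t) * sum_i sum_l w[i][l] * log p(l | x_{t,i})."""
    total = 0.0
    for probs, w_row in zip(probabilities, weights):
        for p, w in zip(probs, w_row):
            if w > 0.0:  # classes with zero weight contribute nothing
                total -= w * math.log(p)
    return total / len(probabilities)

# Hypothetical probabilities of two target pictures over three stored classes.
target_probs = [[0.5, 0.25, 0.25], [0.1, 0.6, 0.3]]
target_weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.5]]
loss_t = invariance_loss(target_probs, target_weights)
```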

4. The unsupervised domain adaptive pedestrian re-recognition system according to claim 1, wherein the network model training module performs iterative optimization on the pedestrian re-recognition network model by:

L = λ_1 L_src + λ_2 L_T + λ_3 L_SL,
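For illustration only: the combined objective is a weighted sum of the three losses. The sketch below evaluates that sum for hypothetical loss values; the λ values shown are placeholders, since this text does not state them.

```python
def total_loss(l_src, l_t, l_sl, lambdas=(1.0, 1.0, 1.0)):
    """Weighted sum of the classification, sample invariance, and layered losses."""
    lam1, lam2, lam3 = lambdas
    return lam1 * l_src + lam2 * l_t + lam3 * l_sl

# Hypothetical loss values and weights for one training iteration.
loss = total_loss(0.5, 0.2, 0.1, lambdas=(1.0, 0.5, 0.25))
```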

5. An unsupervised domain adaptive pedestrian re-recognition method, comprising:

the pedestrian re-recognition network model comprises a residual network structure, a full connection layer and a Softmax normalization layer which are sequentially connected and correspond to the classification loss function, a target domain memory and an L_2 normalization layer which are sequentially connected and correspond to the sample invariance loss function, and a source domain memory and a similarity measurement shaft network structure which are sequentially connected and correspond to the layered loss function, and the residual network structure is respectively connected with the full connection layer, the target domain memory, the source domain memory and the similarity measurement shaft network structure;

the layered loss function is obtained specifically by:

the set layer weights of the three layers of pictures are as follows: