Pedestrian re-identification method and system based on a view-adaptive subspace learning algorithm
Technical field
The present invention relates to a method in the field of computer vision, and in particular to a pedestrian re-identification method and system based on a view-adaptive subspace learning algorithm.
Background art
With the development of information technology, intelligent processing terminals have become ubiquitous and multimedia data have become ever easier to collect. How to analyse this mass of multimedia data intelligently and put it to use in the service of society has become an important research topic in computer vision. Object detection, object tracking and object recognition have all advanced considerably, and pedestrian detection, tracking and recognition have attracted many researchers because of their practical value. In fields such as security and home-based elderly care, the long-term tracking of a specific pedestrian is often of interest, which involves techniques such as pedestrian detection and pedestrian tracking. When a pedestrian disappears from one camera and reappears under another, the system should recognise the pedestrian and continue tracking; this is the task of pedestrian re-identification. The goal of pedestrian re-identification is to associate the targets detected in two non-overlapping cameras so as to achieve relay tracking across cameras. However, because cameras differ in configuration, placement and scene, pedestrian images under different cameras exhibit varying degrees of color and geometric change. In addition, pedestrians occlude one another to varying degrees in complex surveillance scenes, which makes re-identification across cameras even harder.
Current pedestrian re-identification mainly matches still pictures and does not yet exploit video information. Mainstream algorithms fall into two broad classes: appearance feature matching algorithms based on low-level image feature extraction, and feature matching algorithms based on metric learning. The first class aims to extract more robust and more discriminative pedestrian features in order to improve the accuracy of appearance matching. The second class aims to learn a more suitable feature space in order to reduce the feature differences that changes in posture, viewing angle and the like cause for the same pedestrian. The first class needs no training samples and is therefore easy to deploy, but it requires elaborate feature design, and the factors that change pedestrian appearance in practice are too complex for a universal discriminative feature to be found easily. The present invention belongs to the second class: its goal is to use training data to obtain a better feature subspace in which features of the same pedestrian lie closer together and features of different pedestrians lie farther apart.
An extensive literature search shows that existing metric learning algorithms are mainly variants of the Mahalanobis distance: the goal is to learn a feature transformation matrix such that the transformed features better match the desired feature distribution (features of the same pedestrian closer together, features of different pedestrians farther apart). Alexis Mignon et al., in "PCCA: A New Approach for Distance Learning from Sparse Pairwise Constraints" (International Conference on Computer Vision and Pattern Recognition, 2012), propose using training data to learn a low-dimensional subspace in which the labeled training sample pairs satisfy the desired feature distribution (the distance between feature samples of the same pedestrian is below a threshold, and the distance between feature samples of different pedestrians is above it). The method is suited to high-dimensional feature spaces and works well even when training samples are few. Wei-Shi Zheng et al., in "Person Re-identification by Probabilistic Relative Distance Comparison" (International Conference on Computer Vision and Pattern Recognition, 2011), propose a metric learning algorithm with triplet input, whose goal is to maximize the likelihood that the distance between a pair of feature samples of the same pedestrian is smaller than the distance between a pair belonging to different pedestrians. However, this method places additional restrictions on the input data (triplets) and is slow when the input features are high-dimensional.
Chinese patent document CN103500345A, published on 2014.01.08, discloses a pedestrian re-identification algorithm based on metric learning. That invention performs pedestrian re-verification with a newly designed smooth-regularized distance metric model, takes the bias of the covariance matrix in the model fully into account, and has the advantage of not requiring a complicated iterative optimization process. However, the method does not consider that pedestrian images under different camera views correspond to different illumination, viewing angles and so on, so the metric it obtains is not optimal.
Summary of the invention
In view of the defects in the prior art, the object of the present invention is to provide a pedestrian re-identification method and system based on a view-adaptive subspace learning algorithm. The method fully exploits the influence of different cameras on pedestrian features and learns a dedicated transformation for each camera, so that the influence of the camera on pedestrian appearance features is minimized. The feature matching stage of re-identification can then focus solely on differences in pedestrian appearance, which greatly improves re-identification accuracy.
According to one aspect of the present invention, a pedestrian re-identification method based on a view-adaptive subspace learning algorithm is provided. The method takes as input either a rectangular image containing only a single pedestrian or a target rectangle cropped from the original video frame according to tracking results. Feature vectors are extracted from the input images and the data set is divided into a training data set and a test data set. On the training data set, transformation matrices are learned with the view-adaptive subspace learning algorithm; on the test data set, the learned transformation matrices are used for distance computation and pedestrian re-identification.
The method comprises the following steps:
Step 1): apply a feature extraction algorithm to the input images to obtain a set of feature vectors, and divide the set into a training data set and a test data set;
Step 2): on the training data set, learn a view-adaptive subspace transformation matrix for each camera; the mapping matrices are learned by optimizing a loss function;
Step 3): on the test data set, first map the feature vectors of all test images into the corresponding subspaces to obtain the mapped feature vectors, and perform pedestrian re-identification on this basis.
Further, in step 2), the loss function is as shown in formula (1):
where $L_a$ and $L_b$ are the mapping matrices to be learned: $L_a$ compensates for the change that camera A introduces into the pedestrian appearance feature vectors captured under it, and $L_b$ does the same for camera B. All training samples occur in pairs. The feature vectors under camera A are $\{x_i,\ i = 1, 2, \ldots, N_{train}\}$ and those under camera B are $\{y_i,\ i = 1, 2, \ldots, N_{train}\}$; features at the same position in the two sets correspond to the same pedestrian under different cameras, that is, $x_i$ and $y_i$ belong to the same pedestrian. $|S|$ and $|D|$ denote the number of positive sample pairs (feature pairs of the same pedestrian) and the number of negative sample pairs, respectively; $\lambda$, $\mu_a$ and $\mu_b$ are parameters that set the relative importance of the terms of the loss function; $\|\cdot\|_F$ denotes the Frobenius norm of a matrix;
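The role of the two camera-specific matrices can be illustrated with a minimal sketch. It assumes, as an illustration only, that the distance between a feature under camera A and a feature under camera B is the squared Euclidean distance between their projections; the dimensions and random data are hypothetical. When both cameras share one matrix, the distance reduces to the Mahalanobis-type form used by conventional metric learning.

```python
import numpy as np

def view_adaptive_distance(x, y, L_a, L_b):
    """Squared Euclidean distance after projecting each view with its own matrix.

    Assumed form for illustration; the loss of formula (1) is not implemented here.
    """
    diff = L_a @ x - L_b @ y
    return float(diff @ diff)

rng = np.random.default_rng(0)
d, r = 6, 3                        # hypothetical feature and subspace dimensions
L_a = rng.standard_normal((r, d))  # mapping matrix for camera A
L_b = rng.standard_normal((r, d))  # mapping matrix for camera B
x = rng.standard_normal(d)         # feature of a pedestrian under camera A
y = rng.standard_normal(d)         # feature of a pedestrian under camera B

dist = view_adaptive_distance(x, y, L_a, L_b)

# With a shared matrix L the distance becomes (x - y)^T M (x - y), M = L^T L,
# which is the Mahalanobis-type distance of conventional metric learning.
shared = view_adaptive_distance(x, y, L_a, L_a)
M = L_a.T @ L_a
mahalanobis = float((x - y) @ M @ (x - y))
```

Using two matrices instead of one is what lets each camera receive its own compensation.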
The loss function in formula (1) works well when the changes in illumination, viewing angle and the like between cameras are not particularly complex, but it can only apply a linear transformation to the feature vectors under each camera. To adapt better to complex real scenes, a nonlinear transformation is introduced through a kernel function, which makes the model more flexible and better able to recover the intrinsic appearance features of the pedestrian. The kernel function is introduced as follows:
The distance between feature vectors in the kernel space is computed by the following formula:
where $\phi(x_i)$ and $\phi(y_j)$ are the feature vectors in the kernel space, and the remaining matrices are the mapping matrices of the corresponding kernel space;
On the basis of formula (2), the loss function in formula (1) is generalized to the kernel space. Because the dimension of the kernel space is very high, the kernel-space mapping matrices cannot be learned directly; transformation matrices $Q_a$ and $Q_b$ are therefore introduced to represent them. The relationship is as follows:
where $\phi(A)$ and $\phi(B)$ are the matrices formed by the kernel-space feature vectors;
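The benefit of parameterizing the kernel-space mapping through $Q_a$ can be checked numerically. The sketch below assumes the relationship has the form "kernel-space mapping matrix = $Q_a \phi(A)^T$", which is consistent with $K_a = \phi(A)^T \phi(A)$ but is an assumption here. It uses the kernel $k(x, y) = (x^T y)^2$ only because its feature map is small enough to write out explicitly; the names and sizes are hypothetical.

```python
import numpy as np

def phi(x):
    """Explicit feature map of the kernel k(x, y) = (x . y)**2."""
    return np.outer(x, x).ravel()

def kernel(A, x):
    """Kernel values between every column of A and the vector x."""
    return (A.T @ x) ** 2

rng = np.random.default_rng(0)
d, n, r = 4, 5, 3
A = rng.standard_normal((d, n))     # training features of camera A, one per column
Q_a = rng.standard_normal((r, n))   # low-dimensional parameter matrix
x = rng.standard_normal(d)          # a feature to be mapped

phi_A = np.stack([phi(a) for a in A.T], axis=1)   # shape (d*d, n)
L_tilde = Q_a @ phi_A.T             # explicit kernel-space mapping matrix
explicit = L_tilde @ phi(x)         # mapping computed in the kernel space
implicit = Q_a @ kernel(A, x)       # same mapping using kernel values only
```

The two results agree, so only kernel evaluations against the training set are needed and the high-dimensional matrix never has to be formed.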
The loss function in the kernel space is therefore as follows:
where the function on the left-hand side is the loss function in the kernel space, expressed in terms of its parameters; $K_a = \phi(A)^T \phi(A)$ and $K_b = \phi(B)^T \phi(B)$; $\mathrm{tr}(\cdot)$ denotes the trace of a matrix; $T$ denotes matrix transposition; and $X$ denotes a square matrix in which all elements other than the diagonal entries are zero. It can be shown that formula (4) is a convex function of $Q_a$ and $Q_b$, so simple gradient descent converges to the optimal solution. Formula (4) is optimized by gradient descent as follows:
First, the derivatives with respect to $Q_a$ and $Q_b$ are computed, giving the following result:
where $l$ is the loss function, and $K_a$, $K_b$, $Q_a$, $Q_b$ and $X$ are as defined for formula (4);
On this basis, $Q_a$ and $Q_b$ are updated iteratively according to the following rule:
where $l$ is the loss function; $\eta_a$ and $\eta_b$ are the step sizes of the iterative update, obtained by cross-validation; and $t$ is the iteration number.
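As a sketch of the update, assuming it is the standard gradient-descent step with a separate step size per matrix and both gradients evaluated at the current iterate (an assumption, not a statement of formula (6) itself):

```latex
Q_a^{(t+1)} = Q_a^{(t)} - \eta_a \left.\frac{\partial l}{\partial Q_a}\right|_{Q_a^{(t)},\,Q_b^{(t)}},
\qquad
Q_b^{(t+1)} = Q_b^{(t)} - \eta_b \left.\frac{\partial l}{\partial Q_b}\right|_{Q_a^{(t)},\,Q_b^{(t)}}
```

Because the loss is stated to be convex in $Q_a$ and $Q_b$, sufficiently small step sizes can be expected to decrease $l$ at every iteration.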
Further, in step 3), pedestrian re-identification means computing the distance between the feature vector of any image under camera A in the test data set and the feature vectors of all images under camera B, and sorting the distances in ascending order; the image ranked first is taken to be the same pedestrian under the other camera.
Specifically, pedestrian re-identification comprises the following steps:
3.1) in the test data set, compute the distances between the image feature of the first person under camera A and the features of all pedestrians under camera B, obtaining the first row $M_1$ of the distance matrix $M$;
3.2) repeat step 3.1) until the features of all pedestrians under camera A have been compared with those of the pedestrians under camera B, obtaining the remaining rows $M_2, M_3, \ldots$ of the distance matrix, where $M_{i,j}$ denotes the feature distance between the i-th pedestrian in A and the j-th pedestrian in B;
Each row of $M$ is then sorted in ascending order. The image in B corresponding to the distance ranked i-th is the i-th best match for the image in A associated with that row, and the image ranked in the first column is the best match.
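The row-wise sorting described above can be sketched as follows. The distance values are made up for illustration and the variable names are hypothetical.

```python
import numpy as np

# Hypothetical 3 x 4 distance matrix: M[i, j] is the distance between
# pedestrian i under camera A and pedestrian j under camera B.
M = np.array([
    [0.9, 0.2, 0.7, 0.5],
    [0.1, 0.8, 0.6, 0.4],
    [0.3, 0.9, 0.05, 0.6],
])

# Sort every row in ascending order. ranking[i, k] is the index in B of the
# (k+1)-th closest candidate for pedestrian i in A.
ranking = np.argsort(M, axis=1)

# The first column holds the best match for each pedestrian in A.
best_match = ranking[:, 0]
```

Keeping the whole `ranking` array, rather than only `best_match`, makes it possible to inspect the top ten candidates as in Fig. 5.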
More preferably, the distance is computed as shown in formula (7):
where $\phi(A_{test})$ and $\phi(A_{train})$ are the sets formed by the kernel-space feature vectors of the test set and of the training set in camera A, respectively; correspondingly, $\phi(B_{test})$ and $\phi(B_{train})$ are the sets formed by the kernel-space feature vectors of the test set and of the training set in camera B; $Q_a$ and $Q_b$ are the mapping matrices learned in step 2); and $e_i$ and $e_j$ are column vectors whose i-th and j-th element, respectively, is one and whose other elements are all zero.
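A minimal sketch of the test-time distance computation follows. It assumes that formula (7) has the form $\|Q_a \phi(A_{train})^T \phi(A_{test}) e_i - Q_b \phi(B_{train})^T \phi(B_{test}) e_j\|^2$, which matches the symbols listed above but is an assumption, and it uses an RBF kernel with a hypothetical `gamma` value. All data are random placeholders.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    """RBF kernel matrix between the columns of X and the columns of Y."""
    sq = (X**2).sum(0)[:, None] + (Y**2).sum(0)[None, :] - 2.0 * X.T @ Y
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kernel_distance_matrix(Q_a, Q_b, A_train, A_test, B_train, B_test,
                           kernel=rbf_kernel):
    """M[i, j] = squared distance between mapped test sample i of A and j of B."""
    P_a = Q_a @ kernel(A_train, A_test)          # (r, n_test): mapped A features
    P_b = Q_b @ kernel(B_train, B_test)          # (r, n_test): mapped B features
    diff = P_a[:, :, None] - P_b[:, None, :]     # all pairwise differences
    return (diff**2).sum(axis=0)

rng = np.random.default_rng(0)
d, n_train, n_test, r = 5, 8, 4, 3               # hypothetical sizes
A_train = rng.standard_normal((d, n_train))
B_train = rng.standard_normal((d, n_train))
A_test = rng.standard_normal((d, n_test))
B_test = rng.standard_normal((d, n_test))
Q_a = rng.standard_normal((r, n_train))
Q_b = rng.standard_normal((r, n_train))

M = kernel_distance_matrix(Q_a, Q_b, A_train, A_test, B_train, B_test)
```

Multiplying by $e_i$ or $e_j$ simply selects one column of the kernel matrix, so the whole distance matrix can be computed at once instead of pair by pair.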
According to another aspect of the present invention, a pedestrian re-identification system based on a view-adaptive subspace learning algorithm is provided. The system comprises a feature extraction module, a subspace mapping matrix learning module (the view-adaptive subspace learning module) and a pedestrian re-identification module, connected in sequence, wherein:
The feature extraction module takes original pedestrian images as input and extracts a d-dimensional feature vector from each input image. A number of pedestrian images are drawn at random from all pedestrians as the training data set, and their features serve as the input of the subspace mapping matrix learning module; the remaining pedestrian images form the test data set;
The subspace mapping matrix learning module takes as input the training data set output by the feature extraction module and adaptively learns the best mapping matrix for each camera, so that the transformed feature vectors conform as closely as possible to the desired feature distribution, that is, feature vectors of the same pedestrian are close together and feature vectors of different pedestrians are far apart. This module outputs the learned transformation matrices $Q_a$ and $Q_b$;
The re-identification module operates on the test data set. It uses the transformation matrices $Q_a$ and $Q_b$ obtained by the subspace mapping matrix learning module to compute distances between the images of the test data set, and outputs the pedestrian under camera B that is most similar to a given pedestrian under camera A as the re-identification result.
Compared with the prior art, the present invention has the following beneficial effects:
Conventional metric learning usually applies the same transformation to picture features from different cameras so that the transformed feature space conforms as closely as possible to the desired feature distribution (features of the same pedestrian are close together, and features of different pedestrians are far apart). However, pedestrian images from different camera views differ in illumination and viewing angle, and applying an identical transformation matrix to all cameras cannot capture the characteristics of each camera, so the learned transformation space is not optimal. The present invention therefore proposes a recognition method using a view-adaptive subspace learning algorithm which, building on conventional metric learning, additionally takes into account that different cameras have different characteristics and applies different transformations (linear or nonlinear) to compensate for them. In this way the invention can learn the optimal transformation for each pair of cameras more flexibly, so that the transformed features from different cameras come closer to the desired feature distribution. Experimental results on the pedestrian re-identification task confirm the effectiveness of the proposed method.
Brief description of the drawings
Other features, objects and advantages of the present invention will become more apparent from the detailed description of non-limiting embodiments given with reference to the following drawings:
Fig. 1 is a flow chart of one embodiment of the invention;
Fig. 2 is a flow chart of the view-adaptive subspace learning algorithm in one embodiment of the invention;
Fig. 3 is a schematic diagram illustrating why the view-adaptive subspace learning algorithm of one embodiment of the invention outperforms conventional metric learning algorithms;
Fig. 4 shows several groups of pedestrian images to be matched, drawn at random from a commonly used person re-identification data set, in one embodiment of the invention;
Fig. 5 is a visualization of the recognition results of the method of one embodiment of the invention: the first column contains the images to be matched, and the other columns contain the top ten matching images obtained by feature matching with the features extracted according to the invention, the second column being the best match obtained by the method of the invention;
Fig. 6 compares the accuracy of the subspace learning algorithm proposed by the invention with that of other methods when applied to person re-identification;
Fig. 7 is a block diagram of the system structure in one embodiment of the invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to understand the invention further, but do not limit it in any way. It should be noted that those skilled in the art can make various variations and improvements without departing from the concept of the invention, all of which fall within its scope of protection.
As shown in Fig. 1, a pedestrian re-identification method based on a view-adaptive subspace learning algorithm takes as input either a rectangular image containing only a single pedestrian or a target rectangle cropped from the original video frame according to tracking results. Feature vectors are extracted from the input images and the data set is divided into a training data set and a test data set. On the training data set, transformation matrices are learned with the view-adaptive subspace learning algorithm; on the test data set, the learned transformation matrices are used for distance computation and pedestrian re-identification. Specifically, the method comprises the following steps:
Step 1): extract features from every image in the data set to obtain d-dimensional feature vectors; a portion of the feature vectors is then drawn at random as the training data set, and the remainder forms the test data set;
This step can be implemented with methods described in the prior art, for example the feature extraction method of "Large Scale Metric Learning from Equivalence Constraints" (Koestinger, M., Hirzer, M., Wohlhart, P., Roth, P. M., & Bischof, H., Computer Vision and Pattern Recognition, 2012);
In the present embodiment, the method is implemented as follows:
Given an input image, it is first resized to 128 × 48. A 16 × 8 window is then slid over the image with a stride of 8 × 8, which divides the 128 × 48 image into 90 overlapping 16 × 8 regions (15 window positions vertically by 6 horizontally);
In each region, Lab and HSV color histograms and LBP texture features are extracted: a 24-dimensional histogram is extracted from each channel of Lab and HSV, and the LBP feature is a 59-dimensional uniform LBP histogram. Each region therefore yields a 203-dimensional feature vector (6 × 24 + 59 = 203);
The feature vectors extracted from the regions are concatenated in order to give the feature vector of the whole image, whose final dimension is 18270 (90 × 203).
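The dimensions quoted above can be reproduced with a short sketch. It assumes that 128 and 16 are heights and that 48 and 8 are widths (the reading that yields 90 regions), and it uses random arrays as placeholders for an image already converted to Lab, HSV and uniform LBP codes; the color conversion and LBP computation themselves are not shown, and the function names are hypothetical.

```python
import numpy as np

BINS_COLOR, BINS_LBP = 24, 59

def extract_patches(image, win=(16, 8), stride=(8, 8)):
    """Return all win-sized patches obtained by sliding over the image."""
    h, w = image.shape[:2]
    return [
        image[top:top + win[0], left:left + win[1]]
        for top in range(0, h - win[0] + 1, stride[0])
        for left in range(0, w - win[1] + 1, stride[1])
    ]

def patch_descriptor(lab, hsv, lbp_codes):
    """Concatenate six 24-bin color histograms and one 59-bin LBP histogram."""
    parts = []
    for channels in (lab, hsv):
        for c in range(3):
            hist, _ = np.histogram(channels[..., c], bins=BINS_COLOR, range=(0, 256))
            parts.append(hist)
    lbp_hist, _ = np.histogram(lbp_codes, bins=BINS_LBP, range=(0, BINS_LBP))
    parts.append(lbp_hist)
    return np.concatenate(parts).astype(np.float64)

rng = np.random.default_rng(0)
# Placeholders for one resized 128 x 48 image in each representation.
lab = rng.integers(0, 256, size=(128, 48, 3))
hsv = rng.integers(0, 256, size=(128, 48, 3))
lbp = rng.integers(0, BINS_LBP, size=(128, 48))

descriptors = [
    patch_descriptor(p_lab, p_hsv, p_lbp)
    for p_lab, p_hsv, p_lbp in zip(
        extract_patches(lab), extract_patches(hsv), extract_patches(lbp))
]
feature = np.concatenate(descriptors)   # full image feature vector
```

With these settings the sketch produces 90 regions of 203 dimensions each, for a total of 18270 dimensions.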
In the present embodiment, to reduce information redundancy and improve computation speed, the PCA algorithm is further applied to reduce the dimension of the feature vectors; the dimension after reduction is 34.
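The random split of step 1) and the PCA reduction can be sketched together. The sketch assumes 316 training and 316 test pedestrians, the sizes used later in this embodiment, and fits PCA on the training features only, which is a design choice made here rather than something stated above. The feature dimension is shrunk to 200 to keep the example fast, and all helper names are hypothetical.

```python
import numpy as np

def split_pairs(n_pairs, n_train, seed=0):
    """Randomly split pedestrian indices; the same indices serve both cameras."""
    perm = np.random.default_rng(seed).permutation(n_pairs)
    return np.sort(perm[:n_train]), np.sort(perm[n_train:])

def pca_fit(X, n_components):
    """X holds one sample per row. Returns the mean and the projection basis."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    """Project samples onto the principal components."""
    return (X - mean) @ components.T

rng = np.random.default_rng(1)
n_train, n_test, dim = 316, 316, 200      # dim is a small stand-in for 18270
n_pairs = n_train + n_test
feats_a = rng.standard_normal((n_pairs, dim))   # camera A, row i = pedestrian i
feats_b = rng.standard_normal((n_pairs, dim))   # camera B, same row order

train_idx, test_idx = split_pairs(n_pairs, n_train)
train_all = np.vstack([feats_a[train_idx], feats_b[train_idx]])
mean, components = pca_fit(train_all, n_components=34)

train_a = pca_transform(feats_a[train_idx], mean, components)
test_b = pca_transform(feats_b[test_idx], mean, components)
```

Indexing both cameras with the same `train_idx` and `test_idx` preserves the pairing between the two views of each pedestrian.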
Step 2): on the training data set, learn the view-adaptive subspace transformation matrices;
View adaptation means learning a specific transformation matrix for each camera so that the transformed image features remain consistent across cameras, which improves re-identification more flexibly. Because real scenes are complex, a kernel function is introduced to model nonlinear transformations and thus better counteract the effect of camera-induced changes on pedestrian appearance features. Fig. 3 illustrates the advantage of using different transformation matrices for different cameras and of modelling nonlinear transformations with a kernel function.
Fig. 2 shows the flow chart of the subspace learning algorithm proposed in one embodiment of the invention. The learning process is as follows (parameters not explained below are defined in the summary of the invention):
2.1) For a given data set (for example VIPER; Fig. 4 shows some sample pictures from this data set), the data are divided into two groups, each containing one picture of every pedestrian. VIPER has 632 pedestrian pairs, so the first group contains one image of each of the 632 pairs and the second group contains the other image, with the same pedestrian at the same position in both groups. From the grouped data, part of the pedestrian data is selected as the training data set (for example, all pictures of 316 randomly chosen pedestrian pairs in VIPER) and the rest forms the test data set (step 2 uses only the features of the training samples);
2.2) Because the dimension of the kernel space may be infinite, the kernel-space mapping matrices cannot be optimized directly. They are therefore expressed through $Q_a$ and $Q_b$, as described in the summary of the invention, and the optimization is carried out over $Q_a$ and $Q_b$ instead. Initialize $Q_a$ and $Q_b$ as identity matrices and set the loss threshold for the convergence test to $\varepsilon = 10^{-5}$;
2.3) compute the loss function according to formula (4);
2.4) compute the gradients with respect to $Q_a$ and $Q_b$ according to formula (5);
2.5) update $Q_a$ and $Q_b$ according to formula (6);
2.6) using the updated $Q_a$ and $Q_b$, compute the loss function $l$ according to formula (4); if the change in loss $\Delta l > \varepsilon$, return to step 2.4); otherwise declare convergence and output the corresponding $Q_a$ and $Q_b$.
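The control flow of steps 2.2) to 2.6) can be sketched as a generic loop. The loss used below is a simple convex stand-in, not formula (4): it only serves to exercise the loop, and the function names, target matrices and step sizes are hypothetical.

```python
import numpy as np

def train(loss_and_grads, Q_a, Q_b, eta_a, eta_b, eps=1e-5, max_iter=10_000):
    """Gradient descent that stops when the loss changes by at most eps."""
    loss, _, _ = loss_and_grads(Q_a, Q_b)             # step 2.3
    for _ in range(max_iter):
        _, g_a, g_b = loss_and_grads(Q_a, Q_b)        # step 2.4
        Q_a = Q_a - eta_a * g_a                       # step 2.5
        Q_b = Q_b - eta_b * g_b
        new_loss, _, _ = loss_and_grads(Q_a, Q_b)     # step 2.6
        converged = abs(loss - new_loss) <= eps
        loss = new_loss
        if converged:
            break
    return Q_a, Q_b, loss

rng = np.random.default_rng(0)
n = 4
T_a, T_b = rng.standard_normal((n, n)), rng.standard_normal((n, n))

def toy_loss_and_grads(Q_a, Q_b):
    """Stand-in convex loss 0.5*||Q_a - T_a||_F^2 + 0.5*||Q_b - T_b||_F^2."""
    r_a, r_b = Q_a - T_a, Q_b - T_b
    return 0.5 * (np.sum(r_a**2) + np.sum(r_b**2)), r_a, r_b

# Step 2.2: initialise both matrices as identity matrices.
Q_a, Q_b, final_loss = train(toy_loss_and_grads, np.eye(n), np.eye(n),
                             eta_a=0.1, eta_b=0.1)
```

Passing the loss and its gradients as a single callable keeps the loop independent of the particular loss, so the real formulas (4) and (5) could be substituted without changing the control flow.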
Step 3): using the $Q_a$ and $Q_b$ learned in step 2), perform pedestrian re-identification on the test data set, as follows:
3.1) in the test data set, compute the distances between the image feature of the first person under camera A and the features of all pedestrians under camera B according to formula (7), obtaining the first row $M_1$ of the distance matrix $M$. For the VIPER data set, the test set has 316 pedestrians, so $M_1$ contains 316 distances.
3.2) repeat step 3.1) until the features of all pedestrians under camera A have been compared with those of the pedestrians under camera B, obtaining the rows $M_2, M_3, \ldots, M_{316}$ and finally a matrix of size 316 × 316, where $M_{i,j}$ denotes the feature distance between the i-th pedestrian in A and the j-th pedestrian in B;
Each row of $M$ is then sorted in ascending order. The image in B corresponding to the distance ranked i-th is the i-th best match given by this method for the image in A associated with that row, and the image ranked in the first column is the best match.
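Given the sorted distance matrix, a rank-based accuracy can be computed as sketched below. The sketch assumes that the true match of the i-th pedestrian in A is the i-th pedestrian in B, which follows from both groups keeping the same ordering; the metric itself (the fraction of pedestrians whose true match appears within the top k) is a common way to score such rankings and is an assumption about how accuracy is measured here. The distance values are made up.

```python
import numpy as np

def rank_k_accuracy(M, max_rank):
    """Fraction of rows whose true match (column i for row i) is in the top k."""
    order = np.argsort(M, axis=1)
    # Position of the true match in each sorted row (0 means ranked first).
    true_rank = np.argmax(order == np.arange(M.shape[0])[:, None], axis=1)
    return np.array([(true_rank < k).mean() for k in range(1, max_rank + 1)])

# Hypothetical 3 x 3 distance matrix between test pedestrians of A and B.
M = np.array([
    [0.1, 0.5, 0.9],
    [0.2, 0.6, 0.3],
    [0.4, 0.8, 0.7],
])
acc = rank_k_accuracy(M, max_rank=3)
```

The resulting curve is non-decreasing in k and reaches 1 once k equals the number of candidates.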
As shown in Fig. 7, based on the above method, the present invention also provides a pedestrian re-identification system based on a view-adaptive subspace learning algorithm. The system comprises a feature extraction module, a subspace mapping matrix learning module (the view-adaptive subspace learning module) and a pedestrian re-identification module, wherein:
The feature extraction module takes original pedestrian images as input and represents each input pedestrian image as a d-dimensional feature vector. A number of pedestrian images are drawn at random from all pedestrians as the training data set, and their features serve as the input of the subspace mapping matrix learning module; the remaining pedestrian images form the test data set;
The subspace mapping matrix learning module takes as input the training data set output by the feature extraction module. It adaptively learns the best mapping matrix for each camera and applies the resulting $Q_a$ and $Q_b$ to transform the features in the training data set, so that the transformed feature vectors conform as closely as possible to the desired feature distribution (feature vectors of the same pedestrian are closer together, and feature vectors of different pedestrians are farther apart);
The re-identification module operates on the test data set. It uses the learned transformation matrices $Q_a$ and $Q_b$ to map the features of the test data set, computes distances between the mapped features according to formula (7), and outputs the pedestrian under camera B that is most similar to a given pedestrian under camera A as the re-identification result.
In the present embodiment, for a given pedestrian in camera A, the pedestrians in camera B are sorted by distance in ascending order, and the pedestrian in B ranked first is output as the match for that pedestrian in camera A.
The techniques used by each of the above modules correspond to the respective parts of the method described above and are not repeated here.
Fig. 5 shows the top ten matches obtained in one embodiment. The first column contains the images to be matched, and the following columns contain, in order, the ten highest-ranked matches given by the present embodiment; the actual match is marked with a dashed circle. It can be seen that the method proposed in the present embodiment identifies and matches the same pedestrian well.
Fig. 6 compares the re-identification accuracy of the subspace learning of one embodiment with that of non-adaptive methods (ILIDS data set), where: SDALF extracts color and texture features based on symmetry and fuses the various features for person re-identification; SVMML combines metric learning with a locally adaptive distance comparison threshold, overcoming the low recognition rate caused by a single threshold; KISSME proposes a fast metric learning method from the viewpoint of statistical inference that requires no iterative optimization; KLFDA improves classification results based on the principle of minimizing within-class covariance and maximizing between-class covariance; PCCA uses training data to learn a low-dimensional subspace in which the labeled training sample pairs satisfy the desired feature distribution; PRDC learns a better metric by maximizing the likelihood that the distance between a pair of feature samples of the same pedestrian is smaller than the distance between a pair belonging to different pedestrians. OurLinearKernel and OurRBFKernel are the accuracy results of the present embodiment, tested with a linear kernel and a nonlinear RBF kernel, respectively. It can be seen that the recognition accuracy of the present embodiment is comparable to that of the other methods, and that the accuracy of the method of the invention converges to 1 faster.
Specific embodiments of the invention have been described above. It should be understood that the invention is not limited to the particular embodiments described; those skilled in the art can make various variations or modifications within the scope of the claims without affecting the substance of the invention.