CN107798686A - A real-time target tracking method based on multi-feature discriminative learning - Google Patents
A real-time target tracking method based on multi-feature discriminative learning
- Publication number: CN107798686A (application CN201710788553.1A)
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/277—Analysis of motion involving stochastic approaches, e.g. using Kalman filters
Abstract
The invention discloses a real-time target tracking method based on multi-feature discriminative learning, comprising the steps of: 1) obtaining the grayscale video frames of a video and describing the brightness attribute of the tracked target with a cross-bin distribution-field feature; 2) modeling the texture variation of the tracked target with the enhanced histogram-of-gradients feature EHOG; 3) extracting the color-names feature CN from the color video frames to maintain color consistency; 4) projecting the multi-dimensional features obtained in steps 1)-3) into a high-dimensional feature space through a Hilbert-space mapping to obtain an inner-product map; 5) putting the resulting confidence map into the CSK framework for tracking, finding the target position, and then updating the template to continue tracking. The invention effectively addresses the illumination variation, background interference, occlusion, and poor real-time performance encountered in target tracking.
Description
Technical field
The present invention relates to the technical field of target tracking, and in particular to a real-time target tracking method based on multi-feature discriminative learning, applicable to intelligent video surveillance, autonomous driving, human-computer interaction, and the like.
Background technology
Target tracking is one of the most challenging problems in computer vision. It plays an important role in many applications, particularly in the military, medical, surveillance, and human-computer interaction domains. In recent years many algorithms have been proposed for target tracking, but their performance suffers greatly from target deformation, illumination change, and occlusion.
Current tracking frameworks can generally be divided into two modules: an appearance model and a tracking model. The appearance model in most frameworks extracts target features from brightness information alone. Lucas and Kanade proposed a global template based on raw intensities; other methods model the target appearance with brightness histograms, but such methods lose spatial information. To address this, multi-kernel descriptors that incorporate spatial information have been proposed.
Under illumination change the target appearance varies drastically. Raw intensity information alone is highly sensitive to lighting and cannot track the target robustly, so a variety of illumination-invariant features have been proposed, such as texture descriptors insensitive to illumination change, e.g. the covariance region descriptor and local binary patterns. More recently, online incremental subspace models have also been used to handle tracking under illumination change.
Because target color carries richer information, a variety of tracking algorithms based on color appearance models have been proposed in recent years, such as the mean-shift algorithm based on color histograms and adaptive dimensionality reduction of a multi-attribute color space. However, using color features alone makes the tracker unstable for targets in front of similarly colored backgrounds or with low saturation.
Extensive experiments show that no single feature can cope with the changes arising under all scene conditions; no single-feature algorithm handles every tracking scenario. Fusing multiple features for tracking has therefore become the mainstream. Most researchers fuse multiple cues with probabilistic methods such as dynamic Bayesian networks, Monte Carlo algorithms, and particle filters. In multi-cue tracking, the selection of features and the fusion mechanism are the key challenges.
The content of the invention
The object of the present invention is to overcome the deficiencies of the prior art by proposing a stable and robust real-time target tracking method based on multi-feature discriminative learning, which effectively addresses illumination variation, background interference, occlusion, and poor real-time performance in target tracking.
To achieve the above object, the technical scheme provided by the present invention is a real-time target tracking method based on multi-feature discriminative learning, comprising the following steps:
1) Obtain the grayscale video frames of the video and describe the brightness attribute of the tracked target with a cross-bin distribution-field feature.
2) Model the texture variation of the tracked target with the enhanced histogram-of-gradients feature EHOG.
3) Extract the color-names feature CN from the color video frames to maintain color consistency.
4) Project the multi-dimensional features obtained in steps 1)-3) into a high-dimensional feature space through a Hilbert-space mapping to obtain an inner-product map, comprising the steps:
4.1) Extend the original CSK to the acquired multi-dimensional features with a multichannel correlation filter, i.e. combine the multi-dimensional features in the Fourier domain into the final feature through a Gaussian weighting with variance σ.
4.2) Adaptively fuse the response maps of the multi-dimensional features of steps 1)-3) with a discriminative learning method.
4.3) Weight the responses of the different cues with the weights learned in step 4.2) to obtain the final weighted confidence map over all features.
5) Put the confidence map of step 4.3) into the CSK framework for tracking, find the target position, and then update the template to continue tracking.
In step 1), a video sequence to be tested, containing T frames in total, is chosen from a standard video library. Starting from frame 1, the target data x of frame t (t any frame from 1 to T) is determined and labeled x_{m,n}, where m and n index the rows and columns of the grayscale image.
In step 2), for the labeled training data (x_{1,1}, y_{1,1}), ..., (x_{m,n}, y_{m,n}), where x_{m,n} is the training-sample data and y_{m,n} the desired result, the multi-dimensional feature F_f(m, n, k) must be computed, with k the bin index of the gray-level histogram. The multi-dimensional feature comprises the brightness feature CDF, the texture feature EHOG, and the color feature CN, computed as follows:
1. Compute the brightness feature F_cdf(m, n, k), describing brightness with a local-correlation approximation and a cross-bin distribution-field measure. A distribution field is a probability-distribution matrix over every pixel of the image, giving the probability that a pixel belongs to a given feature value; it is a three-dimensional matrix spanning the image width, image height, and the range of gray values. In the gray-level feature space, a picture of size w × h generates a three-dimensional w × h × b feature field, where w is the image width, h the image height, and b the number of bins of the gray-level histogram. The original grayscale image I(m, n) is split into per-bin distribution-field layers d(m, n, k) = 1 if I(m, n) ∈ D_k and 0 otherwise, where D_k = [255k/b, 255(k+1)/b] and k ∈ {0, 1, ..., b-1} indexes the bins of the gray-level histogram. The brightness feature is then computed in cross-bin metric form, i.e. by histogram edge filtering: F_cdf(m, n, k) = (d(·, ·, k) * K_s) * K_k, where K_s is a Gaussian kernel over the spatial dimensions with standard deviation σ_s, K_k is a one-dimensional Gaussian kernel over the feature dimension with standard deviation σ_k, and * denotes convolution.
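The distribution-field construction just described can be sketched in a few lines of NumPy. This is only an illustration, not the patent's implementation: the bin count b = 8 and the kernel widths σ_s, σ_k are illustrative values, and the helper names (`distribution_field`, `cdf_feature`) are ours.

```python
import numpy as np

def distribution_field(img, b=8):
    """Explode an h x w grayscale image into an h x w x b binary field:
    layer k is 1 where the pixel intensity falls in D_k = [255k/b, 255(k+1)/b)."""
    h, w = img.shape
    field = np.zeros((h, w, b))
    bins = np.clip((img.astype(np.float64) * b / 256).astype(int), 0, b - 1)
    for k in range(b):
        field[:, :, k] = (bins == k)
    return field

def gaussian_kernel1d(sigma, radius):
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    g = np.exp(-x**2 / (2 * sigma**2))
    return g / g.sum()

def cdf_feature(img, b=8, sigma_s=1.0, sigma_k=0.5):
    """Smooth the field spatially (sigma_s) and across bins (sigma_k):
    the cross-bin smoothing lets nearby intensity levels reinforce each other."""
    field = distribution_field(img, b)
    ks = gaussian_kernel1d(sigma_s, 2)
    kk = gaussian_kernel1d(sigma_k, 1)
    for k in range(b):  # spatial smoothing, rows then columns (separable)
        layer = field[:, :, k]
        layer = np.apply_along_axis(lambda r: np.convolve(r, ks, mode="same"), 1, layer)
        layer = np.apply_along_axis(lambda c: np.convolve(c, ks, mode="same"), 0, layer)
        field[:, :, k] = layer
    # cross-bin smoothing along the feature (bin) axis
    return np.apply_along_axis(lambda v: np.convolve(v, kk, mode="same"), 2, field)
```

Before smoothing, each pixel's bin vector is a one-hot distribution; the two convolutions spread that mass over neighboring pixels and neighboring bins, which is what makes the feature tolerant to small brightness shifts.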
2. Compute the texture feature F_ehog(m, n, k), describing texture consistency with the enhanced histogram of oriented gradients EHOG. Gradients are computed with the finite-difference filter [-1, 0, +1] and its transpose. Depending on whether the gradient orientation is contrast sensitive, the orientation is quantized into a contrast-sensitive bin set B_1 or a contrast-insensitive bin set B_2, where θ(m, n) is the gradient direction at a pixel and p and q are the numbers of contrast-insensitive and contrast-sensitive orientation bins, respectively. Letting B denote B_1 or B_2 and k ∈ {0, 1, ..., p+q-1} the histogram bin, the feature map of the cell containing (m, n) is F_hog(m, n, k) = α(m, n) if B(m, n) = k and 0 otherwise, where α(m, n) is the gradient magnitude at the pixel. The gradient energy must further be normalized for invariance, using four different normalization factors N_{1,-1}(m, n, k), N_{1,1}(m, n, k), N_{-1,-1}(m, n, k), N_{-1,1}(m, n, k) derived from the gradient-energy maps N_{δx,δy}(m, n), δ_x ∈ {-1, 1}, δ_y ∈ {-1, 1}. After the four normalizations, the normalized feature map is obtained by stacking the four normalized copies:
F_ehog(m, n, k) = [F_hog(m, n, k)/N_{1,1}(m, n, k); F_hog(m, n, k)/N_{1,-1}(m, n, k); F_hog(m, n, k)/N_{-1,1}(m, n, k); F_hog(m, n, k)/N_{-1,-1}(m, n, k)].
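A minimal sketch of the gradient-orientation voting behind the HOG map above, under stated assumptions: it uses the [-1, 0, +1] filter and its transpose as described, but keeps only the contrast-insensitive orientation voting and omits cell pooling and the four-way normalization; p = 9 bins and the name `hog_map` are illustrative choices of ours.

```python
import numpy as np

def hog_map(img, p=9):
    """Per-pixel magnitude-weighted orientation votes over p
    contrast-insensitive bins (directions folded onto [0, pi))."""
    img = img.astype(np.float64)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]   # [-1, 0, +1] horizontally
    gy[1:-1, :] = img[2:, :] - img[:-2, :]   # its transpose vertically
    alpha = np.sqrt(gx**2 + gy**2)           # gradient magnitude alpha(m, n)
    theta = np.arctan2(gy, gx) % np.pi       # contrast-insensitive direction
    bins = np.minimum((theta * p / np.pi).astype(int), p - 1)
    F = np.zeros(img.shape + (p,))
    for k in range(p):
        F[:, :, k] = alpha * (bins == k)     # vote weighted by magnitude
    return F
```

A vertical step edge, for instance, puts all its energy into the horizontal-gradient bin, which is the behavior the texture feature relies on.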
3. Compute the color feature F_cn(m, n, k). The RGB color space used in the computer must be mapped through a mapping matrix into an 11-dimensional color-probability matrix; the mapping matrix is learned from Google image search. The color probability is expressed as F_cn(m, n, k) = Map(R(m, n), G(m, n), B(m, n), k), where R(m, n), G(m, n), B(m, n) are the three color channels of the RGB picture, Map is the mapping matrix from RGB to the 11-dimensional color space, and k indexes the histogram bin. Each value of F_cn(m, n, k) is the probability that the pixel belongs to the corresponding one of the 11 colors.
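The shape of the CN feature can be illustrated with a toy stand-in. To be clear about assumptions: the real mapping matrix is learned from Google image search, whereas the prototype colors and soft-assignment bandwidth below are crude values of our own invention, used only to show the per-pixel 11-way probability vector.

```python
import numpy as np

NAMES = ["black", "blue", "brown", "grey", "green", "orange",
         "pink", "purple", "red", "white", "yellow"]
PROTOTYPES = np.array([  # rough RGB prototypes, illustrative values only
    [0, 0, 0], [0, 0, 255], [139, 69, 19], [128, 128, 128], [0, 255, 0],
    [255, 165, 0], [255, 192, 203], [128, 0, 128], [255, 0, 0],
    [255, 255, 255], [255, 255, 0]], dtype=np.float64)

def cn_feature(rgb):
    """rgb: h x w x 3 array -> h x w x 11 soft assignment to color names."""
    flat = rgb.reshape(-1, 3).astype(np.float64)
    d2 = ((flat[:, None, :] - PROTOTYPES[None, :, :])**2).sum(-1)
    w = np.exp(-d2 / (2 * 60.0**2))       # soft nearest-prototype weights
    w /= w.sum(axis=1, keepdims=True)     # rows sum to 1: probabilities
    return w.reshape(rgb.shape[0], rgb.shape[1], 11)
```

Each pixel thus becomes a distribution over the 11 color names rather than a raw RGB triple, which is what gives the feature its tolerance to small color shifts.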
In step 3), the multi-cue features are fused: the multi-dimensional features obtained in step 2) are combined in the Fourier domain, x̂_t = Σ_k F(F_f(·, ·, k)), giving the fused feature x̂_t of frame t, where F denotes the Fourier transform and Σ_k the accumulation over the bins 0 to k. The fused feature is then projected into the high-dimensional feature space through the Hilbert-space mapping h_{m,n}(x, x') = κ(x̂_t, M̂_t), giving the inner-product map of frame t, where κ denotes the Gaussian kernel operation and M̂_t is the frequency-domain model of frame t, maintained by the template update of step 5).
In step 4.1), the multi-cue features are put into the CSK framework for tracking: the original CSK is extended into a multi-dimensional, multi-cue tracker. Tracking is treated as a classification problem separating foreground and background; each pixel value represents, according to the multi-cue fusion standard, the probability that the position belongs to the foreground or the background. Every cue, i.e. every feature, has its own confidence map; fusing the confidence maps of all the features yields the final confidence map, which is put into the CSK framework for tracking.
The multi-dimensional features are extended through the multichannel correlation filter, i.e. combined in the Fourier domain into the final feature through a Gaussian weighting with variance σ:
κ(x, x') = exp( -( ‖x‖² + ‖x'‖² - 2 F⁻¹( Σ_k F(x_{(m,n)}^k) ⊙ F̄(x'_{(m,n)}^k) ) ) / σ² ),
where x_{(m,n)} is the target data of frame t, x'_{(m,n)} the target data of frame t+1, κ denotes the Gaussian kernel operation, k indexes the histogram bins, exp is the exponential function with base e, F denotes the Fourier transform, F̄ its complex conjugate, and F⁻¹ the inverse Fourier transform.
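The multichannel Gaussian kernel correlation above can be sketched directly with FFTs. A minimal NumPy illustration under stated assumptions: circular boundary handling, the common normalization of the distance by the feature size, and an illustrative σ; the function name is ours.

```python
import numpy as np

def gaussian_correlation(x, z, sigma=0.5):
    """x, z: h x w x c feature stacks. Returns the h x w kernel map
    exp(-(||x||^2 + ||z||^2 - 2 corr(x, z)) / (sigma^2 * h * w * c))."""
    h, w, c = x.shape
    xf = np.fft.fft2(x, axes=(0, 1))
    zf = np.fft.fft2(z, axes=(0, 1))
    # sum over channels of the per-channel circular cross-correlation
    corr = np.real(np.fft.ifft2((xf * np.conj(zf)).sum(axis=2)))
    d2 = (x**2).sum() + (z**2).sum() - 2.0 * corr
    return np.exp(-np.maximum(d2, 0) / (sigma**2 * h * w * c))
```

Evaluating the kernel at every circular shift at once, instead of one shift at a time, is exactly what makes the CSK family real-time: the cost is a handful of FFTs rather than a sliding-window search.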
In step 4.2), for the training data x, the closed-form kernel solution of the regularized least-squares (RLS) loss is A_f = (H_f + λI)⁻¹Y, where I is the identity matrix, λ the regularization coefficient, H_f the matrix with elements h_{m,n}(x, x'), and Y = {y_{m,n}} the label matrix; the autocorrelation of the training samples in the high-dimensional space is computed through the kernel function. Transforming this closed-form solution into the Fourier domain gives the CSK correlation-filter parameters Â_f = Ŷ / (ĥ + λ), from which the feature response r_f = F⁻¹(Â_f ⊙ F(h(x, x'))) is computed, ⊙ denoting element-wise multiplication. The weight ω_f of each of the features of items 1-3 above is then learned by minimizing the multi-feature discrimination objective ω = argmin_ω Σ_{m,n} (Σ_f ω_f r_f(m, n) - y_{m,n})², where r_f denotes the feature response r_{m,n} and argmin(·) returns the arguments minimizing the function.
In step 4.3), the feature responses of the different cues are weighted with the weights learned in step 4.2): r_{m,n} = Σ_f ω_f F⁻¹(Â_f ⊙ F(κ(F_f, M_f))), where F_f is the appearance feature, M_f the frequency-domain model, and ω_f the weight of the feature; r_{m,n} is the final fused multi-cue confidence map over all features.
In step 5), the confidence map of step 4.3) is put into the CSK framework for tracking: the target position is found as (m̂, n̂) = argmax_{m,n} r_{m,n}. The predicted frequency-domain model M'_f is computed by updating the frequency-domain model at the target position, from which the predicted closed-form solution A'_f is further computed as A'_f = Y/(H'_f + λ) = Y/(κ(M'_f, M'_f) + λ), where H'_f denotes the predicted inner-product map. The template update M_f^{t+1} = (1-γ)M_f^t + γ M'_f, A_f^{t+1} = (1-γ)A_f^t + γ A'_f then yields the target model of frame t+1, where M_f^t is the frequency-domain model of frame t and A_f^t its closed-form solution; the target data of frame t+1 is thus found, realizing target tracking, where λ is the regularization coefficient and γ ∈ [0, 1] is the learning rate.
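The localization and template update of step 5) reduce to two one-liners. A minimal sketch; γ = 0.075 is an illustrative learning rate, not a value the patent specifies, and the helper names are ours.

```python
import numpy as np

def locate(confidence):
    """Return the (row, col) of the confidence-map peak: the new target position."""
    return np.unravel_index(np.argmax(confidence), confidence.shape)

def update_model(M_old, A_old, M_new, A_new, gamma=0.075):
    """Linear-interpolation update with learning rate gamma in [0, 1]:
    the template adapts to the new appearance without forgetting abruptly."""
    M = (1.0 - gamma) * M_old + gamma * M_new
    A = (1.0 - gamma) * A_old + gamma * A_new
    return M, A
```

A small γ keeps the model stable under occlusion; a larger γ tracks fast appearance change at the cost of drift, which is the usual trade-off in this family of trackers.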
Compared with the prior art, the present invention has the following advantages and beneficial effects:
1. The invention combines a robust appearance model built on multi-dimensional features with the tracking model of an extended CSK (a visual tracking algorithm based on kernelized circulant structure), successfully realizing target tracking.
2. By fusing the three features of brightness, texture, and color with a discriminative learning method, the invention finds the decision boundary between target and background and learns the optimal multi-feature fusion; the improved model obtains leading tracking results in complex and varied environments.
3. The three features used by the invention cope with different scenarios; under different conditions, different features take the leading role.
Brief description of the drawings
Fig. 1 is a flowchart of the method of the invention.
Embodiment
The invention is further described below with reference to a specific embodiment.
Referring to Fig. 1, the real-time target tracking method based on multi-feature discriminative learning provided by this embodiment comprises the following steps:
1) Obtain the grayscale video frames of the video: a video sequence to be tested, containing T frames, is chosen from a standard video library; starting from frame t = 1, the target data x of frame t is determined and labeled x_{m,n}, with m and n the row and column indices of the grayscale image.
2) Model the tracked target with multi-dimensional features: for the labeled training data (x_{1,1}, y_{1,1}), ..., (x_{m,n}, y_{m,n}), where x is the training-sample data and y the desired result, the multi-dimensional feature F_f must be computed. The multi-dimensional feature comprises the brightness feature CDF, the texture feature EHOG, and the color feature CN, computed as follows:
1. Compute the brightness feature F_cdf(m, n, k), describing brightness with a local-correlation approximation and a cross-bin distribution-field measure. A distribution field is a probability-distribution matrix over every pixel of the image, giving the probability that a pixel belongs to a given feature value; it is a three-dimensional matrix spanning the image width, image height, and the range of gray values. In the gray-level feature space, a picture of size w × h generates a three-dimensional w × h × b feature field, where w is the image width, h the image height, and b the number of bins of the gray-level histogram. The original grayscale image I(m, n) is split into per-bin distribution-field layers d(m, n, k) = 1 if I(m, n) ∈ D_k and 0 otherwise, where D_k = [255k/b, 255(k+1)/b] and k ∈ {0, 1, ..., b-1} indexes the bins of the gray-level histogram. The brightness feature is then computed in cross-bin metric form, i.e. by histogram edge filtering: F_cdf(m, n, k) = (d(·, ·, k) * K_s) * K_k, where K_s is a Gaussian kernel over the spatial dimensions with standard deviation σ_s, K_k is a one-dimensional Gaussian kernel over the feature dimension with standard deviation σ_k, and * denotes convolution.
2. Compute the texture feature F_ehog(m, n, k), describing texture consistency with the enhanced histogram of oriented gradients EHOG. Gradients are computed with the finite-difference filter [-1, 0, +1] and its transpose. Depending on whether the gradient orientation is contrast sensitive, the orientation is quantized into a contrast-sensitive bin set B_1 or a contrast-insensitive bin set B_2, where θ(m, n) is the gradient direction at a pixel and p and q are the numbers of contrast-insensitive and contrast-sensitive orientation bins, respectively. Letting B denote B_1 or B_2 and k ∈ {0, 1, ..., p+q-1} the histogram bin, the feature map of the cell containing (m, n) is F_hog(m, n, k) = α(m, n) if B(m, n) = k and 0 otherwise, where α(m, n) is the gradient magnitude at the pixel. The gradient energy must further be normalized for invariance, using four different normalization factors N_{1,-1}(m, n, k), N_{1,1}(m, n, k), N_{-1,-1}(m, n, k), N_{-1,1}(m, n, k) derived from the gradient-energy maps N_{δx,δy}(m, n), δ_x ∈ {-1, 1}, δ_y ∈ {-1, 1}. After the four normalizations, the normalized feature map is obtained by stacking the four normalized copies:
F_ehog(m, n, k) = [F_hog(m, n, k)/N_{1,1}(m, n, k); F_hog(m, n, k)/N_{1,-1}(m, n, k); F_hog(m, n, k)/N_{-1,1}(m, n, k); F_hog(m, n, k)/N_{-1,-1}(m, n, k)].
3. Compute the color feature F_cn(m, n, k). The RGB color space used in the computer must be mapped through a mapping matrix into an 11-dimensional color-probability matrix; the mapping matrix is learned from Google image search. The color probability is expressed as F_cn(m, n, k) = Map(R(m, n), G(m, n), B(m, n), k), where R(m, n), G(m, n), B(m, n) are the three color channels of the RGB picture, Map is the mapping matrix from RGB to the 11-dimensional color space, and k indexes the histogram bin. Each value of F_cn(m, n, k) is the probability that the pixel belongs to the corresponding one of the 11 colors.
3) Fuse the multi-cue features: the multi-dimensional features obtained in step 2) are combined in the Fourier domain, x̂_t = Σ_k F(F_f(·, ·, k)), giving the fused feature x̂_t of frame t, where F denotes the Fourier transform and Σ_k the accumulation over the bins 0 to k. The fused feature is then projected into the high-dimensional feature space through the Hilbert-space mapping h_{m,n}(x, x') = κ(x̂_t, M̂_t), giving the inner-product map of frame t, where κ denotes the Gaussian kernel operation and M̂_t is the frequency-domain model of frame t, maintained by the template update of step 5).
4) Project the multi-dimensional features obtained in steps 1)-3) into the high-dimensional feature space through the Hilbert-space mapping to obtain the inner-product map, comprising the steps:
4.1) Put the multi-cue features into the CSK framework for tracking: the original CSK is extended into a multi-dimensional, multi-cue tracker. Tracking is treated as a classification problem separating foreground and background; each pixel value represents, according to the multi-cue fusion standard, the probability that the position belongs to the foreground or the background. Every cue, i.e. every feature, has its own confidence map; fusing the confidence maps of all the features yields the final confidence map, which is put into the CSK framework for tracking.
The multi-dimensional features are extended through the multichannel correlation filter, i.e. combined in the Fourier domain into the final feature through a Gaussian weighting with variance σ:
κ(x, x') = exp( -( ‖x‖² + ‖x'‖² - 2 F⁻¹( Σ_k F(x_{(m,n)}^k) ⊙ F̄(x'_{(m,n)}^k) ) ) / σ² ),
where x_{(m,n)} is the target data of frame t, x'_{(m,n)} the target data of frame t+1, κ denotes the Gaussian kernel operation, k indexes the histogram bins, exp is the exponential function with base e, F denotes the Fourier transform, F̄ its complex conjugate, and F⁻¹ the inverse Fourier transform.
4.2) For the training data x, the closed-form kernel solution of the regularized least-squares (RLS) loss is A_f = (H_f + λI)⁻¹Y, where I is the identity matrix, λ the regularization coefficient, H_f the matrix with elements h_{m,n}(x, x'), and Y = {y_{m,n}} the label matrix; the autocorrelation of the training samples in the high-dimensional space is computed through the kernel function. Transforming this closed-form solution into the Fourier domain gives the CSK correlation-filter parameters Â_f = Ŷ / (ĥ + λ), from which the feature response r_f = F⁻¹(Â_f ⊙ F(h(x, x'))) is computed, ⊙ denoting element-wise multiplication. The weight ω_f of each of the features of items 1-3 above is then learned by minimizing the multi-feature discrimination objective ω = argmin_ω Σ_{m,n} (Σ_f ω_f r_f(m, n) - y_{m,n})², where r_f denotes the feature response r_{m,n} and argmin(·) returns the arguments minimizing the function.
4.3) characteristic response of different clues is subjected to multi thread weighting using the weights that step 4.2) learns,Wherein FfIt is appearance features, MfIt is frequency-domain model, ωfIt is this
The weighted value of feature, rm.nThe multi thread confidence map of all features of fusion as finally obtained;
5) The confidence map of step 4.3) is put into the CSK framework for tracking: the target position is found as (m̂, n̂) = argmax_{m,n} r_{m,n}. The predicted frequency-domain model M'_f is computed by updating the frequency-domain model at the target position, from which the predicted closed-form solution A'_f is further computed as A'_f = Y/(H'_f + λ) = Y/(κ(M'_f, M'_f) + λ), where H'_f denotes the predicted inner-product map. The template update M_f^{t+1} = (1-γ)M_f^t + γ M'_f, A_f^{t+1} = (1-γ)A_f^t + γ A'_f then yields the target model of frame t+1, where M_f^t is the frequency-domain model of frame t and A_f^t its closed-form solution; the target data of frame t+1 is thus found, realizing target tracking, where λ is the regularization coefficient and γ ∈ [0, 1] is the learning rate.
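Pulling the pieces of the embodiment together, the toy script below runs one train/detect cycle of a CSK-style kernelized correlation filter on a single synthetic grayscale channel and recovers a known circular shift. It is a sketch of the underlying machinery only, not the patented multi-feature method: all parameter values (σ, λ, the 32 × 32 patch, the label width) are illustrative.

```python
import numpy as np

def gauss_corr(x, z, sigma=0.2):
    """Single-channel Gaussian kernel correlation over all circular shifts."""
    xf, zf = np.fft.fft2(x), np.fft.fft2(z)
    corr = np.real(np.fft.ifft2(xf * np.conj(zf)))
    d2 = (x**2).sum() + (z**2).sum() - 2 * corr
    return np.exp(-np.maximum(d2, 0) / (sigma**2 * x.size))

def gaussian_labels(h, w, s=2.0):
    """Desired response: a Gaussian peaked at (0, 0), wrapping around."""
    ys, xs = np.arange(h)[:, None], np.arange(w)[None, :]
    d2 = np.minimum(ys, h - ys)**2 + np.minimum(xs, w - xs)**2
    return np.exp(-d2 / (2 * s**2))

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))                 # training patch
y = gaussian_labels(32, 32)
lam = 1e-4
A = np.fft.fft2(y) / (np.fft.fft2(gauss_corr(x, x)) + lam)  # A = Y/(K + lam)

z = np.roll(x, (3, 5), axis=(0, 1))               # target moved by (3, 5)
resp = np.real(np.fft.ifft2(A * np.fft.fft2(gauss_corr(z, x))))
dy, dx = np.unravel_index(resp.argmax(), resp.shape)
```

The response peak lands at the shift of the target, which is the single property the whole tracking loop rests on; the patented method layers the three-feature fusion and learned weights on top of exactly this mechanism.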
The embodiment described above is only a preferred embodiment of the invention and does not limit its scope of implementation; any change made according to the shape and principle of the present invention shall therefore be covered within the scope of protection of the present invention.
Claims (2)
- 1. A real-time target tracking method based on multi-feature discriminative learning, characterized by comprising the following steps: 1) obtaining the grayscale video frames of a video and describing the brightness attribute of the tracked target with a cross-bin distribution-field feature; 2) modeling the texture variation of the tracked target with the enhanced histogram-of-gradients feature EHOG; 3) extracting the color-names feature CN from the color video frames to maintain color consistency; 4) projecting the multi-dimensional features obtained in steps 1)-3) into a high-dimensional feature space through a Hilbert-space mapping to obtain an inner-product map, comprising: 4.1) extending the original CSK to the acquired multi-dimensional features with a multichannel correlation filter, i.e. combining the multi-dimensional features in the Fourier domain into the final feature through a Gaussian weighting with variance σ; 4.2) adaptively fusing the response maps of the multi-dimensional features of steps 1)-3) with a discriminative learning method; 4.3) weighting the responses of the different cues with the weights learned in step 4.2) to obtain the final weighted confidence map over all features; 5) putting the confidence map of step 4.3) into the CSK framework for tracking, finding the target position, and then updating the template to continue tracking.
- A kind of 2. real-time modeling method method that study is differentiated based on multiple features according to claim 1, it is characterised in that:In step 1), a video sequence to be tested is chosen from normal video storehouse, the total quantity for acquisition is regarding for T Frequency sequence, since the 1st frame, determine that it, to the arbitrary frame between T, is labeled as x by the target data x of t frames, t for 1m,n, m and N represents the ranks value of gray level image;In step 2), for tape label training data (x1,1,y1,1),...,(xm,n,ym,n), wherein xm,nFor number of training According to ym,nIt is expected as a result, it is desirable to calculate various dimensions feature Ff(m, n, k), k represent the bin values of gray feature histogram, multidimensional Degree feature includes brightness CDF, textural characteristics EHOG and color characteristic CN, is specifically calculated as follows:1. calculate brightness Fcdf(m, n, k), retouched using local correlations approximation with cross-bin distribution fields measure State brightness;Wherein, distribution field is the probability distribution matrix of each pixel of image, represents that this pixel belongs to feature Probable value, a distribution field are the probability distribution matrixes of a 3-dimensional, include the model of the width of image, height and signature grey scale value Enclose, in gray feature space, w × h sizes picture can generate three-dimensional w × h × b Characteristic Field, and w is the width of image, h It is the height of image, b is the quantity of the bin values of gray feature histogram, is usedWill One original-gray image I (m, n) is split to the different distributions field of different characteristic layer, DkIt is a set, its scope is Dk= [255k/b, 255 (k+1)/b], k ∈ { 0,2 ..., b-1 } represent the bin values of gray feature histogram, and b is gray feature Nogata The quantity of the bin values of figure;Then cross-bin metric forms are used, i.e. 
the mode of histogram edge filter calculates brightness spy Sign, specific formula areWherein,It is the Gaussian kernel of Spatial Dimension, standard deviation For σs,It is the one-dimensional Gaussian kernel of feature space, standard deviation σk, * expression convolution operations;2. calculate textural characteristics Fehog(m, n, k), texture consistency is described using enhancing histograms of oriented gradients EHOG, used Finite-difference Filtering device [- 1,0 ,+1] and its transposition calculate gradient;Wherein, according to gradient to orientation-sensitive whether, by ladder Degree is divided into contrast sensitivity B1With the insensitive B of contrast2, definitionθ (m, n) represents the direction of pixel in image, and p, q represent the direction that contrast is insensitive and contrast is sensitive respectively The number of gradient;Then B is represented with B1Or B2, k ∈ { 0,1 ..., p+q-1 }, k represent the bin values of histogram, then can The characteristic pattern F of grid where calculating (m, n)hog(m, n, k) is:α (m, n) table The range value of pixel in diagram picture, furthermore, it is necessary to which the consistency for obtaining gradient is normalized, use four kinds of different normalizings Change factor N1,-1(m,n,k)、N1,1(m,n,k)、N-1,-1(m,n,k)、N-1,1(m, n, k), obtain the characteristic pattern of gradient consistencyWherein δx∈ { -1,1 }, δy∈ { -1,1 }, specific formula for calculation is:After four kinds of different normalization modes, the characteristic pattern F after being normalizedehog(m, n, k), specific formula for calculation are:<mrow> <msub> <mi>F</mi> <mrow> <mi>e</mi> <mi>h</mi> <mi>o</mi> <mi>g</mi> </mrow> </msub> <mrow> <mo>(</mo> <mi>m</mi> <mo>,</mo> <mi>n</mi> <mo>,</mo> <mi>k</mi> <mo>)</mo> </mrow> <mo>=</mo> <mfenced open = "{" close = ""> <mtable> <mtr> <mtd> <mrow> <msub> <mi>F</mi> <mrow> <mi>h</mi> <mi>o</mi> <mi>g</mi> </mrow> </msub> <mrow> <mo>(</mo> <mi>m</mi> <mo>,</mo> <mi>n</mi> <mo>,</mo> 
<mi>k</mi> <mo>)</mo> </mrow> <mo>/</mo> <msub> <mi>N</mi> <mrow> <mn>1</mn> <mo>,</mo> <mn>1</mn> </mrow> </msub> <mrow> <mo>(</mo> <mi>m</mi> <mo>,</mo> <mi>n</mi> <mo>,</mo> <mi>k</mi> <mo>)</mo> </mrow> </mrow> </mtd> </mtr> <mtr> <mtd> <mrow> <msub> <mi>F</mi> <mrow> <mi>h</mi> <mi>o</mi> <mi>g</mi> </mrow> </msub> <mrow> <mo>(</mo> <mi>m</mi> <mo>,</mo> <mi>n</mi> <mo>,</mo> <mi>k</mi> <mo>)</mo> </mrow> <mo>/</mo> <msub> <mi>N</mi> <mrow> <mn>1</mn> <mo>,</mo> <mo>-</mo> <mn>1</mn> </mrow> </msub> <mrow> <mo>(</mo> <mi>m</mi> <mo>,</mo> <mi>n</mi> <mo>,</mo> <mi>k</mi> <mo>)</mo> </mrow> </mrow> </mtd> </mtr> <mtr> <mtd> <mrow> <msub> <mi>F</mi> <mrow> <mi>h</mi> <mi>o</mi> <mi>g</mi> </mrow> </msub> <mrow> <mo>(</mo> <mi>m</mi> <mo>,</mo> <mi>n</mi> <mo>,</mo> <mi>k</mi> <mo>)</mo> </mrow> <mo>/</mo> <msub> <mi>N</mi> <mrow> <mo>-</mo> <mn>1</mn> <mo>,</mo> <mn>1</mn> </mrow> </msub> <mrow> <mo>(</mo> <mi>m</mi> <mo>,</mo> <mi>n</mi> <mo>,</mo> <mi>k</mi> <mo>)</mo> </mrow> </mrow> </mtd> </mtr> <mtr> <mtd> <mrow> <msub> <mi>F</mi> <mrow> <mi>h</mi> <mi>o</mi> <mi>g</mi> </mrow> </msub> <mrow> <mo>(</mo> <mi>m</mi> <mo>,</mo> <mi>n</mi> <mo>,</mo> <mi>k</mi> <mo>)</mo> </mrow> <mo>/</mo> <msub> <mi>N</mi> <mrow> <mo>-</mo> <mn>1</mn> <mo>,</mo> <mo>-</mo> <mn>1</mn> </mrow> </msub> <mrow> <mo>(</mo> <mi>m</mi> <mo>,</mo> <mi>n</mi> <mo>,</mo> <mi>k</mi> <mo>)</mo> </mrow> </mrow> </mtd> </mtr> </mtable> </mfenced> </mrow>3. 
Compute the color feature $F_{cn}(m,n,k)$. The RGB colors used by the computer are mapped, through a mapping matrix, into an 11-dimensional color-name probability matrix; the mapping matrix is learned from Google image search. The color probability is expressed as $F_{cn}(m,n,k)=\mathrm{Map}(R(m,n),G(m,n),B(m,n),k)$, where $R(m,n)$, $G(m,n)$ and $B(m,n)$ denote the three color channels of the RGB image, $\mathrm{Map}$ denotes the mapping matrix from RGB to the 11 color names, and $k$ denotes the histogram bin index. Each value in $F_{cn}(m,n,k)$ is the probability that the pixel belongs to each of the 11 colors.

In step 3), the multi-cue features are fused: the multi-dimensional features obtained in step 2) are merged by accumulating over the bin index and transforming to the frequency domain, giving the fused feature of frame $t$, $\hat F_t=\mathcal{F}\big(\sum_k F_t(m,n,k)\big)$, where $F_t$ denotes the multi-cue feature of frame $t$, $\mathcal{F}$ denotes the Fourier transform, and $\sum_k$ denotes accumulation over the bins $0$ to $k$. The features are then projected into a high-dimensional feature space by a Hilbert-space mapping, giving the inner-product mapping of frame $t$, $H_t=\kappa(M_t,M_t)$, where $\kappa$ denotes the Gaussian kernel function operation and $M_t$ denotes the frequency-domain model of frame $t$, computed from the fused frequency-domain feature $\hat F_t$.

In step 4.1), the multi-cue features are fed into the CSK framework for tracking; the original CSK is extended into a multi-dimensional, multi-cue tracker. Tracking is treated as a classification problem that separates foreground from background: each pixel value represents the probability, judged by the fused multi-cue criterion, that the position belongs to the foreground or the background. Each cue, i.e. each feature, has its own confidence map, and fusing the confidence maps of all features yields the final confidence map, which is fed into the CSK framework for tracking. The multi-dimensional features are handled by a multichannel correlation filter: in the Fourier domain the channels are summed under a Gaussian kernel of variance $\sigma^2$ into the final feature,

$$\kappa(x,x')=\exp\!\left(-\frac{1}{\sigma^{2}}\Big(\|x\|^{2}+\|x'\|^{2}-2\,\mathcal{F}^{-1}\big(\textstyle\sum_{k}\mathcal{F}(x_{k})\odot\mathcal{F}(x'_{k})^{*}\big)\Big)\right)$$

where $x_{(m,n)}$ denotes the target data of frame $t$, $x'_{(m,n)}$ denotes the target data of frame $t+1$, $\kappa$ denotes the Gaussian kernel function operation, $k$ denotes the histogram bin index, $\exp$ denotes the exponential function with base $e$, $\mathcal{F}$ denotes the Fourier transform, and $\mathcal{F}^{-1}$ denotes the inverse Fourier transform.

In step 4.2), for training data $x$, the closed-form solution of the kernelized regularized least squares (RLS) loss is $A_f=(H_f+\lambda I)^{-1}Y$, where $I$ is the identity matrix, $\lambda$ is the regularization coefficient, $H_f$ is the matrix with elements $h_{m,n}(x,x')$, and $Y=\{y_{m,n}\}$ is the label matrix; the kernel function computes the autocorrelation of the training sample in the high-dimensional space. Transforming this closed-form solution into the Fourier domain yields the parameters of the CSK correlation filter, $\hat A_f=\hat Y/(\hat H_f+\lambda)$, so that the feature response can be computed as $r_f=\mathcal{F}^{-1}(\hat A_f\odot\hat H_f)$, where $\odot$ denotes point-wise multiplication. The weight $\omega_f$ of each of the features from steps 2.1), 2.2) and 2.3) is then learned by the multi-feature discriminative minimization method, where $r_f$ denotes the feature response $r_{m,n}$ of feature $f$ and $\arg\min(\cdot)$ denotes the set of arguments at which a function attains its minimum.

In step 4.3), the feature responses of the different cues are combined with the weights learned in step 4.2),

$$r_{m,n}=\sum_{f}\omega_{f}\,\mathcal{F}^{-1}\big(\hat A_{f}\odot\mathcal{F}(\kappa(F_{f},M_{f}))\big)$$

where $F_f$ is the appearance feature, $M_f$ is the frequency-domain model, and $\omega_f$ is the weight of that feature; $r_{m,n}$ is the final fused multi-cue confidence map over all features.

In step 5), the confidence map obtained in step 4.3) is fed into the CSK framework for tracking. The target position is found as $(m^{*},n^{*})=\arg\max_{m,n}\,r_{m,n}$, and the frequency-domain model is updated at the target position to obtain the predicted frequency-domain model $M'_f$, from which the predicted closed-form solution $A'_f$ is further computed as $A'_f=Y/(H'_f+\lambda)=Y/(\kappa(M'_f,M'_f)+\lambda)$, where $H'_f$ denotes the predicted inner-product mapping. The template update rule then gives the target model of frame $t+1$,

$$M_{t+1}=(1-\gamma)M_{t}+\gamma M'_{f},\qquad A_{t+1}=(1-\gamma)A_{t}+\gamma A'_{f}$$

where $M_t$ denotes the frequency-domain model of frame $t$ and $A_t$ denotes the closed-form solution of frame $t$; the target data of frame $t+1$ is thereby located, achieving target tracking. Here $\lambda$ is the regularization coefficient and $\gamma\in[0,1]$ is the learning rate.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710788553.1A CN107798686A (en) | 2017-09-04 | 2017-09-04 | A kind of real-time modeling method method that study is differentiated based on multiple features |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710788553.1A CN107798686A (en) | 2017-09-04 | 2017-09-04 | A kind of real-time modeling method method that study is differentiated based on multiple features |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107798686A true CN107798686A (en) | 2018-03-13 |
Family
ID=61531542
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710788553.1A Pending CN107798686A (en) | 2017-09-04 | 2017-09-04 | A kind of real-time modeling method method that study is differentiated based on multiple features |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107798686A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109255304A (en) * | 2018-08-17 | 2019-01-22 | 西安电子科技大学 | Method for tracking target based on distribution field feature |
CN109410245A (en) * | 2018-09-13 | 2019-03-01 | 北京米文动力科技有限公司 | A kind of video target tracking method and equipment |
CN109816694A (en) * | 2019-01-28 | 2019-05-28 | 北京旷视科技有限公司 | Method for tracking target, device and electronic equipment |
CN109934853A (en) * | 2019-03-21 | 2019-06-25 | 云南大学 | Correlation filtering tracking based on the fusion of response diagram confidence region self-adaptive features |
CN110009660A (en) * | 2019-03-06 | 2019-07-12 | 浙江大学 | Object space method for tracing based on correlation algorithm filter |
CN110766723A (en) * | 2019-10-22 | 2020-02-07 | 湖南大学 | Unmanned aerial vehicle target tracking method and system based on color histogram similarity |
CN110852335A (en) * | 2019-11-19 | 2020-02-28 | 燕山大学 | Target tracking system based on multi-color feature fusion and depth network |
CN111145121A (en) * | 2019-12-27 | 2020-05-12 | 安徽工业大学 | Confidence term filter target tracking method for strengthening multi-feature fusion |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103218628A (en) * | 2013-03-22 | 2013-07-24 | 中国科学技术大学 | Abnormal behavior description method based on characteristics of block mass and track |
CN104992451A (en) * | 2015-06-25 | 2015-10-21 | 河海大学 | Improved target tracking method |
CN105872859A (en) * | 2016-06-01 | 2016-08-17 | 深圳市唯特视科技有限公司 | Video compression method based on moving target trajectory extraction of object |
CN106530325A (en) * | 2016-10-26 | 2017-03-22 | 合肥润客软件科技有限公司 | Multi-target visual detection and tracking method |
CN106570486A (en) * | 2016-11-09 | 2017-04-19 | 华南理工大学 | Kernel correlation filtering target tracking method based on feature fusion and Bayesian classification |
Non-Patent Citations (1)
Title |
---|
Xu Guicong: "Face Tracking, Recognition and Age Estimation", China Master's Theses Full-text Database, Information Science and Technology Series * |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109255304B (en) * | 2018-08-17 | 2021-07-27 | 西安电子科技大学 | Target tracking method based on distribution field characteristics |
CN109255304A (en) * | 2018-08-17 | 2019-01-22 | 西安电子科技大学 | Method for tracking target based on distribution field feature |
CN109410245A (en) * | 2018-09-13 | 2019-03-01 | 北京米文动力科技有限公司 | A kind of video target tracking method and equipment |
CN109410245B (en) * | 2018-09-13 | 2021-08-10 | 北京米文动力科技有限公司 | Video target tracking method and device |
CN109816694A (en) * | 2019-01-28 | 2019-05-28 | 北京旷视科技有限公司 | Method for tracking target, device and electronic equipment |
CN110009660A (en) * | 2019-03-06 | 2019-07-12 | 浙江大学 | Object space method for tracing based on correlation algorithm filter |
CN109934853A (en) * | 2019-03-21 | 2019-06-25 | 云南大学 | Correlation filtering tracking based on the fusion of response diagram confidence region self-adaptive features |
CN110766723B (en) * | 2019-10-22 | 2020-11-24 | 湖南大学 | Unmanned aerial vehicle target tracking method and system based on color histogram similarity |
CN110766723A (en) * | 2019-10-22 | 2020-02-07 | 湖南大学 | Unmanned aerial vehicle target tracking method and system based on color histogram similarity |
CN110852335A (en) * | 2019-11-19 | 2020-02-28 | 燕山大学 | Target tracking system based on multi-color feature fusion and depth network |
CN110852335B (en) * | 2019-11-19 | 2023-06-20 | 燕山大学 | Target tracking system based on multi-color feature fusion and depth network |
CN111145121A (en) * | 2019-12-27 | 2020-05-12 | 安徽工业大学 | Confidence term filter target tracking method for strengthening multi-feature fusion |
CN111145121B (en) * | 2019-12-27 | 2023-02-28 | 安徽工业大学 | Confidence term filter target tracking method for strengthening multi-feature fusion |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107798686A (en) | A kind of real-time modeling method method that study is differentiated based on multiple features | |
CN108986140B (en) | Target scale self-adaptive tracking method based on correlation filtering and color detection | |
CN108549891B (en) | Multi-scale diffusion well-marked target detection method based on background Yu target priori | |
US9025880B2 (en) | Visual saliency estimation for images and video | |
CN109753975A (en) | Training sample obtaining method and device, electronic equipment and storage medium | |
Yu et al. | Context-based hierarchical unequal merging for SAR image segmentation | |
CN105761238B (en) | A method of passing through gray-scale statistical data depth information extraction well-marked target | |
CN110929593A (en) | Real-time significance pedestrian detection method based on detail distinguishing and distinguishing | |
CN106778687A (en) | Method for viewing points detecting based on local evaluation and global optimization | |
Ma et al. | Saliency detection based on singular value decomposition | |
CN110222572A (en) | Tracking, device, electronic equipment and storage medium | |
CN109146925A (en) | Conspicuousness object detection method under a kind of dynamic scene | |
Xiao et al. | Pedestrian object detection with fusion of visual attention mechanism and semantic computation | |
Xu et al. | Bayberry image segmentation based on manifold ranking salient object detection method | |
CN107610136B (en) | Salient object detection method based on convex hull structure center query point sorting | |
CN107423771B (en) | Two-time-phase remote sensing image change detection method | |
CN110135435B (en) | Saliency detection method and device based on breadth learning system | |
CN106846377A (en) | A kind of target tracking algorism extracted based on color attribute and active features | |
Lou et al. | Hierarchical co-salient object detection via color names | |
Yuan et al. | Explore double-opponency and skin color for saliency detection | |
Xu et al. | Covariance descriptor based convolution neural network for saliency computation in low contrast images | |
You et al. | Salient object detection via point-to-set metric learning | |
Zhang et al. | A unified saliency detection framework for visible and infrared images | |
Sun et al. | Study on subtractive clustering video moving object locating method with introduction of eigengap | |
CN110472653A (en) | A kind of semantic segmentation method based on maximization region mutual information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20180313 |
WD01 | Invention patent application deemed withdrawn after publication |