[1] C. Rasmussen, G. Hager. Probabilistic data association methods for tracking complex visual objects [J]. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2001, 23(6): 560-576.
[2] C. Hue, J.-P. Le Cadre, P. Pérez. Sequential Monte Carlo methods for multiple target tracking and data fusion [J]. IEEE Trans. on Signal Processing, 2002, 50(2): 309-325.
[3] I. Matthews, T. Ishikawa, S. Baker. The template update problem [J]. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2004, 26(6): 810-815.
[4] H. T. Nguyen, A. W. M. Smeulders. Fast occluded object tracking by a robust appearance filter [J]. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2004, 26(8): 1099-1104.
[5] A. Yilmaz, X. Li, M. Shah. Contour-based object tracking with occlusion handling in video acquired using mobile cameras [J]. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2004, 26(11): 1531-1536.
[6] Y. Chen, Y. Rui, T. Huang. JPDAF based HMM for real-time contour tracking [A]. Proc. IEEE Computer Society Conf. on Computer Vision and Pattern Recognition [C]. 1: 543-550, 2001.
[7] Z. Jia, A. Balasuriya, S. Challa. Target tracking with Bayesian fusion based template matching [A]. Proc. IEEE Int. Conf. on Image Processing [C]. 2: II-826-829, 2005.
[8] M. J. Black, Y. Yacoob. Recognizing facial expressions in image sequences using local parameterized models of image motion [J]. Int. J. Computer Vision, 1997, 25(1): 23-48.
[9] H. Sidenbladh, M. J. Black, D. J. Fleet. Stochastic tracking of 3D human figures using 2D image motion [A]. Proc. European Conf. on Computer Vision [C]. 2: 702-718, 2000.
[10] T. Kaneko, O. Hori. Template update criterion for template matching of image sequences [A]. Proc. IEEE Int. Conf. on Pattern Recognition [C]. 2: 1-5, 2002.
[11] A. M. Peacock, S. Matsunaga, D. Renshaw, J. Hannah, A. Murray. Reference block updating when tracking with block matching algorithm [J]. Electronics Letters, 2000, 36: 309-310.
[12] C. Haworth, A. M. Peacock, D. Renshaw. Performance of reference block updating techniques when tracking with the block matching algorithm [A]. Proc. IEEE Int. Conf. on Image Processing [C]. 1: 365-368, 2001.
[13] H. T. Nguyen, M. Worring, R. van den Boomgaard. Occlusion robust adaptive template tracking [A]. Proc. IEEE Int. Conf. on Computer Vision [C]. 1: 678-683, 2001.
[14] L. K. Liu, E. Feig. A block-based gradient descent search algorithm for block motion estimation in video coding [J]. IEEE Trans. on Circuits and Systems for Video Technology, 1996, 6: 419-422.
[15] S. Baker, I. Matthews. Lucas-Kanade 20 years on: a unifying framework [J]. Int. J. Computer Vision, 2004, 56(3): 221-255.
[16] J. Pan, B. Hu, J. Q. Zhang. An efficient object tracking algorithm with adaptive prediction of initial searching point [A]. Lecture Notes in Computer Science, 4319: 1113-1122, 2006.
[17] R. G. Brown, P. Y. C. Hwang. Introduction to Random Signals and Applied Kalman Filtering [M]. John Wiley, 1992.
Summary of the invention
The objective of the present invention is to propose a target tracking algorithm that can effectively suppress template drift.
The key of the present invention is to model the cause of template drift quantitatively as a drift noise and to treat this noise as part of the observation noise of the template update filter.
The target tracking algorithm of the present invention comprises:
Updating the template with a Kalman filter, and suppressing template drift by incorporating the influence of drift explicitly into the Kalman filter.
Expressing the influence of template drift quantitatively as an error power, namely the drift noise power σ_MD².
Taking the drift noise power σ_MD² as one component of the observation noise power σ_M² of the Kalman template update filter.
Taking the camera noise power σ_MC² as the other component of the observation noise power σ_M² of the Kalman template update filter.
Obtaining the observation noise power σ_M² of the Kalman template update filter as a weighted sum of σ_MD² and σ_MC²: σ_M² = λσ_MD² + σ_MC², where λ is a constant.
Analyzing the drift noise power σ_MD² quantitatively from the quantization error of the coordinate transform parameters and the probability distribution of the true pixel values.
When the camera noise power is not known a priori, estimating it as the pixel variance of a test image produced from a uniform gray background.
When the target motion can be described by translation and scaling only, computing the drift noise power σ_MD² with a fast algorithm that merges into one term all summation terms whose transformation parameter vectors produce the same transformed coordinate, thereby converting an m-dimensional summation into a two-dimensional one and greatly reducing the computational load.
The invention is described in further detail below.
1. Target tracking based on template matching
In target tracking based on template matching, the target is represented by a sub-image depicting its appearance; this sub-image is called the template. The original template is usually the appearance of the target in the first frame. In the present invention the template is denoted T(x), where x = [x, y]^T is the pixel coordinate. Because of observation noise the true target appearance cannot be obtained, so what the tracking algorithm actually works with is an estimate of the template, denoted T̂(x). In each frame the template is mapped into the image frame by the coordinate transform φ(x; a), where a is the transformation parameter vector. φ(x; a) describes the motion and deformation of the target. For a coordinate transform comprising translation, scaling and rotation, a is a four-dimensional vector, a = (a_1, a_2, a_3, a_4), and φ(x; a) can be expressed as equation (1). Here a_1 is the scale factor, a_2 is the rotation angle, and (a_3, a_4) is the translation in the (x, y) coordinate system. In theory φ(x; a) may have arbitrarily many parameters and describe arbitrarily complex target motion, but the motion model of (1) covers most situations met in practice.
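The body of equation (1) is not reproduced above. From the parameter descriptions (a_1 a scale factor, a_2 a rotation angle, (a_3, a_4) a translation), a plausible reconstruction is the similarity transform below; this is a sketch inferred from the text, not reproduced from the patent drawing:

```latex
\varphi(\mathbf{x};\mathbf{a}) =
a_1
\begin{bmatrix} \cos a_2 & -\sin a_2 \\ \sin a_2 & \cos a_2 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
+
\begin{bmatrix} a_3 \\ a_4 \end{bmatrix}
\tag{1}
```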
The transformation parameter vector a reflects the geometric state of the target in the current frame. Its optimal estimate is obtained by finding the image region of the current frame that best matches the template, i.e. by the search of equation (2). In that equation â_n is the optimal estimate of the transformation parameter vector for frame n; I_n(x) is the pixel value of the n-th frame at coordinate x; Ω_T is the set of template pixel coordinates; and N is the number of pixels in the template. A series of fast search algorithms [14, 15] exists for carrying out (2), and the initial search point can be obtained by the algorithm of [16]. In addition, since the coordinates produced by the transform φ are not necessarily integers, I_n[φ(x; a)] must be computed by interpolation.
Ideally the transformation parameter vector obtained from (2) would reflect the true geometric state of the target, but because the final search result of (2) is necessarily drawn from a discrete vector space, the quantization error involved inevitably leaves an error between the optimal estimate â and the true value a_0 of the transformation parameter vector. Consequently, in every frame the observed target appearance I_n[φ(x; â)] deviates from its true value I_n[φ(x; a_0)]. The part of the observation noise caused by this appearance deviation is what we call the drift noise. The accumulation of drift noise is the fundamental cause of the template drift phenomenon. Drift noise and camera noise together constitute the observation noise.
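As a concrete illustration of the search (2), the sketch below does an exhaustive scan over a discrete translation-and-scale grid, minimizing the mean squared difference between the sampled image region and the template. It is a minimal stand-in for the fast search methods of [14, 15], with nearest-neighbour sampling in place of interpolation; all names are illustrative:

```python
import numpy as np

def warp_coords(xs, ys, a):
    """Translation + scale transform: phi(x; a) = a1*x + (a2, a3)."""
    a1, a2, a3 = a
    return a1 * xs + a2, a1 * ys + a3

def match(frame, template, scales, txs, tys):
    """Brute-force search for the transform minimizing the SSD criterion of eq. (2)."""
    h, w = template.shape
    ys, xs = np.mgrid[0:h, 0:w]
    best, best_a = np.inf, None
    for a1 in scales:
        for a2 in txs:
            for a3 in tys:
                u, v = warp_coords(xs, ys, (a1, a2, a3))
                # nearest-neighbour sampling stands in for interpolation here
                ui, vi = np.round(u).astype(int), np.round(v).astype(int)
                if ui.min() < 0 or vi.min() < 0 or \
                   ui.max() >= frame.shape[1] or vi.max() >= frame.shape[0]:
                    continue
                err = np.mean((frame[vi, ui] - template) ** 2)
                if err < best:
                    best, best_a = err, (a1, a2, a3)
    return best_a, best

# Toy example: an 8x8 template cut from the frame at offset (5, 7), unit scale.
rng = np.random.default_rng(0)
frame = rng.uniform(0, 255, (40, 40))
template = frame[7:15, 5:13].copy()
a_hat, err = match(frame, template, scales=[1.0], txs=range(32), tys=range(32))
print(a_hat)   # (1.0, 5, 7)
```

The exhaustive scan is for clarity only; the quantization step of the grid (here 1 pixel) is exactly the Δ_i that the drift-noise analysis below builds on.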
2. The Kalman template update filter
To obtain an optimal estimate of the true target appearance, a Kalman filter is applied to the template update. To analyze how each template pixel changes, Kalman filtering is carried out on each template pixel separately. For template pixel x the state equation is
T(x, n) = T(x, n-1) + ε_S(x, n-1)  (3)
where T(x, n) is the gray value of template pixel x at frame n, and ε_S(x, n) is the state transition noise, reflecting the change of the target appearance itself from frame n to frame n+1. We may reasonably take the state transition noise to be zero-mean white Gaussian noise with power spectral density σ_S²(x, n).
The observation equation of template pixel x is
I_n[φ(x; â)] = T(x, n) + ε_M(x, n)  (4)
where ε_M(x, n) is the observation noise, likewise zero-mean white Gaussian noise, with power spectral density σ_M²(x, n).
Let e_P(x, n) be the prediction error of T(x, n) and e_E(x, n) its estimation error; then
e_P(x, n) = T(x, n) - T̂(x, n|n-1)  (5)
e_E(x, n) = T(x, n) - T̂(x, n|n)  (6)
where T̂(x, n|n-1) is the prediction of T(x, n) given the observations of the first n-1 frames and T̂(x, n|n) is the estimate of T(x, n) given the observations of the first n frames. Since the state transition coefficient in (3) is 1,
T̂(x, n|n-1) = T̂(x, n-1|n-1)  (7)
From (3) and (5)-(7), the relation between the prediction error and the estimation error of T(x, n) is
e_P(x, n) = e_E(x, n-1) + ε_S(x, n-1)  (8)
Since the state transition noise is uncorrelated with the estimation error, we obtain
σ_P²(x, n) = σ_E²(x, n-1) + σ_S²(x, n-1)  (9)
where σ_P² and σ_E² are the power spectral densities of the prediction error and the estimation error, respectively. Where no confusion can arise, power spectral densities are below referred to simply as powers for brevity.
According to Kalman filtering theory [17], the optimum Kalman gain of template pixel x is
G(x, n) = σ_P²(x, n) / [σ_P²(x, n) + σ_M²(x, n)]  (10)
The template is then updated according to
T̂(x, n|n) = T̂(x, n|n-1) + G(x, n)α(x, n)  (11)
where α(x, n) = I_n[φ(x; â)] - T̂(x, n|n-1) is the innovation of frame n. After the update, the estimation error power of T(x, n) becomes
σ_E²(x, n) = [1 - G(x, n)]σ_P²(x, n)  (12)
Equations (7) and (9)-(12) constitute one complete iteration of the Kalman template update.
The only quantity that needs initialization here is the estimation error power σ_E² of the template. Because the original template is cut directly from the target region of the first frame, its estimation error is caused solely by the camera noise. Therefore the initial estimation error power equals the camera noise power, i.e.
σ_E²(x, 0) = σ_MC²  (13)
where σ_MC² is the camera noise power.
3. The relation between the two noise powers
In the standard Kalman filtering problem the state transition noise power σ_S² and the observation noise power σ_M² are both taken to be known a priori. In target tracking, however, these two noise powers must be estimated online. Fortunately, once one of them is obtained, the other is easy to estimate [13]. Considering (3)-(5) and (7) together, we immediately obtain the equation relating the innovation, the estimation error and the two noises:
α(x, n) = e_E(x, n-1) + ε_S(x, n-1) + ε_M(x, n)  (14)
Since the terms on the right-hand side are pairwise uncorrelated,
σ_α²(x, n) = σ_E²(x, n-1) + σ_S²(x, n-1) + σ_M²(x, n)  (15)
where σ_α²(x, n) is the power of the innovation α(x, n). The innovation power can be estimated by a mean square over time and space, i.e.
σ_α²(x, n) ≈ (1/N_L) Σ_{j=n-L+1..n} Σ_{y∈Ω_L(x)} α²(y, j)  (16)
where L is the length of the temporal running-mean window, generally taken between 15 and 25; Ω_L(x) is the spatial image block centered at x; and N_L is the total number of samples taking part in the average. In the examples of the present invention L = 20 and the spatial image block is 11 × 11 pixels.
Since the estimation error power σ_E² is produced automatically in the course of the Kalman filtering, it follows from (15) that if σ_M² (respectively σ_S²) is known, then σ_S² (respectively σ_M²) is obtained at once. The question is therefore how to estimate one of the two noise powers first. References [4] and [13] make different assumptions about the values of the two noise powers. As noted above, those assumptions correspond to two extreme cases that may occur in practice; a more effective estimate of the noise powers is needed. Since the state transition noise reflects the change of the target appearance itself, and this change can be entirely arbitrary, it is not easy to estimate its power directly. As we shall see below, however, the observation noise power can be obtained by estimating the drift noise power online. Once the observation noise power σ_M² has been estimated, the state transition noise power σ_S² follows immediately:
σ_S²(x, n-1) = σ_α²(x, n) - σ_E²(x, n-1) - σ_M²(x, n)  (17)
In some cases (17) yields a negative value, which indicates that the target appearance at x has hardly changed from frame n-1 to frame n. In that case σ_S² should be set to zero, and σ_M² should correspondingly be adjusted to
σ_M²(x, n) = σ_α²(x, n) - σ_E²(x, n-1)  (18)
Next we discuss concretely how to estimate the powers of the drift noise and the observation noise online.
4. Estimating the observation noise power
As stated above, the accumulation of the drift noise component of the observation noise is the fundamental cause of the template drift phenomenon. For the Kalman template update filter to suppress template drift, the drift noise power must therefore be estimated quantitatively.
Fig. 1 illustrates how the discrepancy between the optimal estimate â and the true value a_0 of the transformation parameter vector produces a drift error in I_n[φ(x; â)]. In Fig. 1, the true position φ(x; a_0) of template pixel x in frame n is some point in the neighborhood Ω_u of φ(x; â), so the true value of template pixel x should also be drawn from some point of Ω_u. The drift noise of template pixel x is in fact the error between I_n[φ(x; â)] and the expected true value of x over Ω_u. When the search precision of (2) is reduced, Ω_u grows, which in general causes the drift noise to increase.
For simplicity we use a in place of a_0 below to denote the true value of the transformation parameter vector. By the above argument, the drift noise power of template pixel x can be expressed as
σ_MD²(x, n) = ∫ p_a(a) |I_n[φ(x; â)] - I_n[φ(x; a)]|² da  (19)
where σ_MD²(x, n) is the drift noise power of template pixel x at frame n and p_a is the joint posterior probability density of the components of a given â. When â is close to a, the posterior densities of the components of a can be regarded as mutually independent, because the value taken by one component then cannot influence the values of the other components. As long as the target is not lost, this condition is always satisfied. Therefore (19) can be rewritten as
σ_MD²(x, n) = ∫ p_1(a_1) ⋯ p_m(a_m) |I_n[φ(x; â)] - I_n[φ(x; a)]|² da  (20)
where p_i is the posterior probability density of the i-th component a_i of a, and m is the number of parameters of the coordinate transform φ.
The next problem is how to compute p_i. As can be seen from Fig. 2, â_i can only take quantized values, and the conditional probability of â_i is
P_i(â_i | a_i) = 1 if |â_i - a_i| ≤ Δ_i/2, and 0 otherwise  (21)
where P_i is the conditional probability of â_i given a_i, and Δ_i is the final step length used by the search of (2) for â_i. By Bayes' rule, the posterior distribution of a_i is
p_i(a_i | â_i) = P_i(â_i | a_i) p(a_i) / ∫ P_i(â_i | a) p(a) da  (22)
Substituting (21) into (22) gives (23). Although the explicit value of p(a_i) is not easy to obtain, we may reasonably regard p(a_i) as approximately constant over the integration interval of (23), because Δ_i is small and the integration interval lies near a rather flat maximum of p(a_i). With this approximation, (23) reduces to
p_i(a_i | â_i) ≈ 1/Δ_i for |a_i - â_i| ≤ Δ_i/2, and 0 otherwise  (24)
Substituting (24) into (20) yields the drift noise power of template pixel x at frame n. However, because the computation involves interpolation, an analytical expression for the drift noise power is very hard to obtain. An approximate numerical result can instead be obtained by discretizing the integral:
σ_MD²(x, n) ≈ Σ_k p_a(a_k) |I_n[φ(x; â)] - I_n[φ(x; a_k)]|² Δa_1 ⋯ Δa_m  (25)
where Δa_1 ⋯ Δa_m form the summation cell in the m-dimensional space in which a lies, and a_k = [k_1Δa_1, …, k_mΔa_m]^T. The size of the summation cell determines the precision of (25). The integers k_i range over
|k_iΔa_i - â_i| ≤ Δ_i/2, i = 1, 2, …, m  (26)
Once the drift noise power of a template pixel has been estimated, the camera noise power must be added to it to obtain the final observation noise power. Unlike the drift noise power, the camera noise power is regarded as constant in both space and time. Its value can be found in the technical specification of the sensor, or obtained as the pixel variance of a test image produced from a uniform gray background, i.e.
σ_MC² = (1/M) Σ_{x∈Ω_G} [I_G(x) - Ī_G]²  (27)
where I_G is the test image, Ī_G is its mean over the test region, Ω_G is the coordinate set of the test region, and M is the number of pixels in the test region.
Finally, the observation noise power is given by
σ_M²(x, n) = λσ_MD²(x, n) + σ_MC²  (28)
where λ is a constant greater than 1 whose value depends on the precision of the interpolation method. In general 1 < λ ≤ 3, and the more precise the interpolation the closer λ is to 1. For the bilinear interpolation adopted in the present invention, λ is taken to be 2.
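Estimating the camera noise power from a uniform-background test image per (27), and combining it with a drift noise power per (28), can be sketched as follows; the synthetic test image and λ = 2 follow the text, while the noise level of 3 gray levels is an arbitrary choice for the example:

```python
import numpy as np

def camera_noise_power(test_img):
    """(27): pixel variance of a test image of a uniform gray background."""
    return float(np.mean((test_img - test_img.mean()) ** 2))

def observation_noise_power(sigma_MD2, sigma_MC2, lam=2.0):
    """(28): weighted sum of the drift noise and camera noise powers."""
    return lam * sigma_MD2 + sigma_MC2

rng = np.random.default_rng(1)
test_img = 128.0 + rng.normal(0.0, 3.0, (256, 256))   # gray background + sensor noise
sigma_MC2 = camera_noise_power(test_img)
print(round(sigma_MC2, 1))                            # close to 3**2 = 9
print(observation_noise_power(1.5, sigma_MC2) - sigma_MC2)  # 3.0
```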
The derivation and discussion above show that the observation noise power depends strongly on the target appearance itself: the denser the texture or edges at a location, the larger the observation noise generated there. This is understandable, since the same geometric deviation causes a larger appearance error where the target appearance is more complex, and hence damages the template there more severely. Thus the more detail the target appearance contains, the more serious the template drift phenomenon. In our algorithm, the more appearance detail a location has at a given moment, the larger its observation noise power at that moment; this keeps the corresponding Kalman gain from growing too large, effectively protecting the template and markedly suppressing drift.
5. The fast algorithm
If the joint posterior probability density p_u of the transformed coordinate u = φ(x; a) = [v, w]^T can be obtained from p_a and φ, the computational load of (25) can be reduced dramatically. Rewriting (19) in terms of the transformed coordinate u gives
σ_MD²(x, n) = ∫_{Ω_u} p_u(u) |I_n[φ(x; â)] - I_n(u)|² du  (29)
and (25) then becomes
σ_MD²(x, n) ≈ Σ_{l_1, l_2} p_u([l_1Δv, l_2Δw]^T) |I_n[φ(x; â)] - I_n([l_1Δv, l_2Δw]^T)|² ΔvΔw  (30)
In these two equations Ω_u is the region shown in Fig. 1 and ΔvΔw is the rectangular summation cell; the integers l_1, l_2 range over
[l_1Δv, l_2Δw]^T ∈ Ω_u  (31)
By replacing (25) with (30), all summation terms whose transformation parameter vectors produce the same transformed coordinate are merged into one, converting the m-dimensional summation into a two-dimensional one and greatly reducing the computational load.
When the coordinate transform φ is too complicated, solving for the joint posterior probability density of the transformed coordinate u is very difficult. If, however, φ comprises only translation and scaling (the two most common classes of target motion), i.e.
φ(x; a) = a_1x + [a_2, a_3]^T  (32)
then an analytical expression for p_u can be obtained.
The distribution function of u can be expressed as (33), where capital letters denote random variables. Using (32), (33) can be rewritten as (34), in which the last equality holds because of the independence of the coordinate transform parameters. The joint posterior probability density of u is then obtained by taking partial derivatives of (34), i.e. (35).
Substituting the distributions of (24) into (35), we obtain (36), in which the values of B_L and B_H can be shown to be those given in (37) and (38). If x = 0, the second term inside the braces of the two equations vanishes when |v - â_2| ≤ Δ_2/2, and otherwise the joint posterior density of u equals zero; if y = 0, the third term inside the braces vanishes when |w - â_3| ≤ Δ_3/2, and otherwise the joint posterior density of u equals zero. For brevity the derivation of B_L and B_H is omitted here. When the target motion can be described by translation and scaling only, equations (30) and (36)-(38) can thus be used to reduce dramatically the computational load of estimating the drift noise power.
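The merging idea behind (30) can be illustrated numerically without the analytical p_u (whose B_L and B_H terms are omitted above): accumulate the posterior mass of every parameter grid point onto its transformed coordinate, then evaluate the squared pixel deviation once per distinct coordinate. The transform and weights below are toy stand-ins, not the patent's (36)-(38):

```python
from collections import defaultdict

def merged_sum(param_grid, transform, pixel_diff2):
    """Fast-algorithm idea behind (30): merge all parameter-grid terms that map
    to the same transformed coordinate, then sum once per distinct coordinate."""
    mass = defaultdict(float)
    weight = 1.0 / len(param_grid)       # uniform posterior, as in (24)
    for a in param_grid:
        mass[transform(a)] += weight     # p_u accumulated on the 2-D grid
    return sum(p * pixel_diff2(u) for u, p in mass.items())

# Toy case: 16 parameter vectors collapse onto only 3 transformed coordinates.
grid = [(a1, a2) for a1 in range(4) for a2 in range(4)]
transform = lambda a: ((a[0] + a[1]) % 3, 0)   # hypothetical many-to-one map
diff2 = lambda u: float(u[0] ** 2)
direct = sum((1.0 / len(grid)) * diff2(transform(a)) for a in grid)
print(merged_sum(grid, transform, diff2) == direct,
      len({transform(a) for a in grid}))       # True 3
```

The merged sum evaluates the (expensive, interpolated) pixel term 3 times instead of 16, while producing the same result as the direct m-dimensional sum; this is the source of the savings shown in Fig. 5.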
Following the above, the concrete steps of the template-drift-suppressing target tracking algorithm of the present invention are as follows:
1. Initialize the estimation error power σ_E², i.e. run (13). Here σ_MC² is the camera noise power, which can be looked up in the technical specification of the camera sensor; if no such data are available, estimate σ_MC² with (27).
2. Select the target region in the first frame.
3. Initialize the template by sampling the initial ROI through the initial coordinate transform φ(x; a_s), where a_s is the initial coordinate transformation parameter vector of the target.
4. Read in the next frame.
5. Take the prediction template of the current frame to be the estimated template of the previous frame, i.e. run (7).
6. Map the prediction template into the current frame by the coordinate transform φ(x; a), and obtain the transformation parameter vector reflecting the geometric state of the target in the current frame by finding the image region of the current frame that best matches the prediction template, i.e. run (2).
7. Obtain the drift noise power σ_MD² of each template pixel by computing the expected squared error between the pixel observations produced by the quantization error of the transformation parameter vector in (2) and their true values. Since no analytical expression is available, discretize the transformation parameters to convert the integral of (19) into the summation of (25); specifically, estimate σ_MD² with (25) and (26). When the target motion can be described by translation and scaling only, all summation terms whose transformation parameter vectors produce the same transformed coordinate can be merged into one, converting the m-dimensional summation into a two-dimensional one and greatly reducing the computational load; specifically, this fast algorithm reduces the cost of estimating the drift noise power via (30) and (36)-(38).
8. Take the observation noise power σ_M² of each template pixel to be the weighted sum of its drift noise power and the camera noise power σ_MC²; specifically, compute σ_M² with (28). Here λ is a constant greater than 1 whose value depends on the precision of the interpolation method; the more precise the interpolation, the closer λ is to 1. For the bilinear interpolation adopted in the examples of the present invention, λ is taken to be 2.
9. Estimate the innovation power σ_α² of each template pixel with (16).
10. Compute the state transition noise power σ_S² of each template pixel with (17). If (17) yields a negative value, set σ_S² to zero and adjust σ_M² accordingly by (18).
11. Run (9) to obtain the prediction error power σ_P² of each template pixel.
12. Determine the optimum Kalman gain G(x, n) of each template pixel by (10).
13. Obtain the estimated template of the current frame by (11).
14. Obtain the estimation error power σ_E² of each template pixel of the current frame by (12).
15. If the video stream is not finished, go to step 5; otherwise stop.
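The fifteen steps above can be exercised end to end on a toy one-pixel "template" with synthetic observations. The sketch below is a deliberate simplification (no spatial search, drift noise power supplied as a constant, innovation power averaged over time only, and a small numerical floor added to (18)); it is meant only to show how the quantities of steps 5-14 feed one another:

```python
import numpy as np

def track_pixel(observations, sigma_MC2, sigma_MD2, lam=2.0, L=20):
    """Steps 1-15 reduced to a single template pixel, with the drift noise
    power taken as given (an illustrative simplification)."""
    T = observations[0]                 # steps 2-3: template from the first frame
    sigma_E2 = sigma_MC2                # step 1 / (13): initial error power
    alpha2_hist = []
    for obs in observations[1:]:        # step 4: read in the next frame
        T_pred = T                      # step 5 / (7)
        alpha = obs - T_pred            # innovation of the matched observation
        alpha2_hist.append(alpha ** 2)
        sigma_alpha2 = float(np.mean(alpha2_hist[-L:]))   # step 9 / (16), time only
        sigma_M2 = lam * sigma_MD2 + sigma_MC2            # step 8 / (28)
        sigma_S2 = sigma_alpha2 - sigma_E2 - sigma_M2     # step 10 / (17)
        if sigma_S2 < 0.0:
            sigma_S2 = 0.0
            sigma_M2 = max(sigma_alpha2 - sigma_E2, 1e-12)  # (18), with a floor
        sigma_P2 = sigma_E2 + sigma_S2                    # step 11 / (9)
        G = sigma_P2 / (sigma_P2 + sigma_M2)              # step 12 / (10)
        T = T_pred + G * alpha                            # step 13 / (11)
        sigma_E2 = (1.0 - G) * sigma_P2                   # step 14 / (12)
    return T

rng = np.random.default_rng(2)
obs = 100.0 + rng.normal(0.0, 1.0, 200)   # constant appearance + observation noise
print(abs(track_pixel(obs, sigma_MC2=1.0, sigma_MD2=0.0) - 100.0) < 10.0)  # True
```

Because the update (11) always moves the estimate toward the current observation with a gain between 0 and 1, the tracked value stays within the range of the observations, here close to the true constant appearance of 100.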
Embodiment
1. Real-world video stream experiments
We first compared the drift-suppression performance of different tracking algorithms on a large number of real-world video streams, in which the tracked targets exhibit different degrees of appearance change. Since within each class of experiment we obtained similar results on all video streams, we include one typical video stream per class in the present invention, as shown in Fig. 3. Each row of Fig. 3 represents one video stream; from the first row to the fourth, the degree of target appearance change grows from none to very large. We compared the drift-suppression performance of three algorithms: the algorithm of [3], the algorithm of [4], and the algorithm proposed in the present invention. In each row of Fig. 3, the leftmost image is the common initial frame of all algorithms, and the next three images show, from left to right, the tracking results of the three algorithms on the last frame. In the fourth video stream the algorithm of [3] lost the target during tracking, so its run ended early. The lower-right corner of each image shows the current template.
In all experiments we observe that the algorithm of [3], because it corrects the target position with the first-frame template, exhibits almost no template drift when the appearance change is very small (see Fig. 3a2); but as the degree of appearance change increases, its performance deteriorates dramatically (see Figs. 3b2, 3c2), to the point of losing the target (see Fig. 3d2), because the first-frame template is no longer valid for correcting the target position.
For the algorithm of [4], when the appearance change is small, its overly fast template update produces very serious template drift (see Fig. 3a3); and when the degree of appearance change increases, the drift is reduced but still quite evident (see Figs. 3b3, 3c3, 3d3). This is because the algorithm does not account for the drift noise power, so its template update filter cannot achieve a satisfactory drift-suppression effect.
For the algorithm proposed in the present invention, the template drift is always effectively restrained regardless of the degree of appearance change (see Figs. 3a4, 3b4, 3c4, 3d4). This is because the template update filter of our algorithm models the drift noise effectively and can therefore adapt the template update strategy optimally to the various tracking scenes.
2. Synthetic video stream experiments
To compare quantitatively the degree of template drift produced by the different algorithms, we also tested each algorithm on synthetic video streams in addition to the extensive experiments on real-world streams. Artificial streams are used because the true values of the geometric parameters of the target are then known in every frame, and because the various experimental conditions can be controlled effectively, giving a deeper understanding of algorithm performance.
In our experiments the test video streams are generated from the 512 × 512 standard test image "lake" as follows: the sailboat in the original image (rows 367-447, columns 296-350) is extracted as the target, scaled and appearance-changed, pasted back onto the original image, and moved over it along a prescribed trajectory. The trajectory is a constant-speed spiral whose radius r changes continually so that the translation speed of the target is always 2 pixels/frame. The scale of the target varies between 0.5 and 1.5 at a rate of 0.03/frame. The appearance of the target changes according to a formula in which I(x, n) and I′(x, n) denote the target pixel values before and after the appearance change respectively; k is a time-varying parameter that varies between -1 and 1 at a fixed rate; and Ω_A is the region whose appearance is modified, taken in turn to be the first half, the second half, the left half and the right half of the target, with k completing one cycle within each part. By changing the rate of change of k we can control the speed of the appearance change. The original image and one frame of the synthetic stream are shown in Fig. 4.
First we hold k at zero and generate a 300-frame video stream in which the appearance of the target remains unchanged. The tracking errors of the algorithms are shown in Table 1. Here the tracking error is the mean Euclidean distance between the true value of the target's coordinate transform parameter vector in each frame and the tracking algorithm's estimate. Columns 2 to 4 of Table 1 give the tracking errors obtained with the Kalman gain of the template update filter fixed at 0, 0.5 and 1 on all pixels; columns 5 to 8 give the tracking errors of [3], [4], [13] and the present algorithm; Δ_L and Δ_S are the search precisions of the position and scale coordinate transform parameters, respectively.
Table 1 shows that fixing the Kalman gain at zero yields the smallest tracking error. This is as expected: when the target appearance is constant, the best update strategy is simply never to update the template. The tracking error of the algorithm of [3] is very close to this minimum, because in this case the first-frame template always corrects the target position effectively. Note that although the target appearance remains constant in this experiment, we observe that when the scale of the target is small (so that the detail contained per unit area is dense), much larger innovations are produced than at other times. Clearly these innovations are not caused by changes in the target appearance itself but stem from the drift noise introduced by matching error. The present algorithm correctly attributes the source of the innovations and automatically raises the observation noise power (rather than the state transition noise power), keeping the Kalman gain of every pixel of the template update filter at a very low level; its tracking error is therefore close to the minimum at all search precisions, and the drift is barely visible to the naked eye. The algorithm of [4], the algorithm of [13] and the method fixing the Kalman gain at 0.5 all show larger tracking errors, because the template update strategies they yield are not optimal. The largest tracking error occurs when the Kalman gain is fixed at 1: updating the template without reservation in every frame causes serious template drift.
To observe the tracking errors of the algorithms when the target appearance changes, we let k fluctuate at a constant rate and generate a 300-frame test video stream. The results are shown in Table 2. In this experiment the tracking errors of the keep-the-template-constant strategy and of the algorithm of [3] both increase markedly; the method fixing the Kalman gain at 1 still produces very large template drift; and among the remaining algorithms, the present algorithm achieves tracking errors far smaller than the others at all search precisions.
These experiments show that, whether or not the target appearance changes, the present algorithm always updates the template pixels in an optimal manner, at the appropriate times and places, according to the appearance changes and the search precision, so that the template is updated in time without being updated excessively; the tracking error is thereby reduced effectively and template drift is suppressed.
3. Effect of the fast algorithm
Fig. 5 shows the number of multiply-accumulate (MAC) operations needed for one template update before and after adopting the fast algorithm. The x axis is the mean, over all template pixels, of the estimation error of the observation noise power. The exact value of the observation noise power is obtained by taking a very small summation cell in (25). By varying the summation cell sizes of (25) and (30) we obtain a series of estimation errors of the observation noise power, before and after adopting the fast algorithm, together with the corresponding computational complexities. As Fig. 5 shows, without the fast algorithm the computational load increases sharply as the estimation error of the observation noise power decreases, whereas with the fast algorithm the load grows only very slowly.
Table 1. Comparison of the tracking errors of the different algorithms when the target appearance is fixed

| k = 1 | G = 0 | G = 0.5 | G = 1 | [3] | [4] | [13] | Present invention |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Δ_L = 1, Δ_S = 0.05 | 0.0125 | 0.0557 | 0.2666 | 0.0125 | 0.0443 | 0.3222 | 0.0137 |
| Δ_L = 1, Δ_S = 0.08 | 0.0207 | 0.1116 | 0.1057 | 0.0207 | 0.0574 | 0.1057 | 0.0238 |
| Δ_L = 2, Δ_S = 0.05 | 0.7481 | 3.8261 | 49.878 | 0.7973 | 1.0834 | 1.5804 | 1.0356 |
| Δ_L = 2, Δ_S = 0.08 | 0.6982 | 2.0403 | 60.996 | 0.7830 | 1.7115 | 2.5966 | 1.3418 |

The data in the table are the mean Euclidean distance between the true value of the target's coordinate transform parameter vector in each frame and the tracking algorithm's estimate.
Table 2. Comparison of the tracking errors of the different algorithms when the target appearance changes

| \|Δk/Δn\| = 0.01 | G = 0 | G = 0.5 | G = 1 | [3] | [4] | [13] | Present invention |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Δ_L = 1, Δ_S = 0.05 | 7.3143 | 0.0607 | 0.2666 | 0.3357 | 0.0599 | 0.1510 | 0.0146 |
| Δ_L = 1, Δ_S = 0.08 | 7.3981 | 0.1350 | 0.7376 | 0.2164 | 0.0571 | 0.8215 | 0.0238 |
| Δ_L = 2, Δ_S = 0.05 | 16.324 | 2.0436 | 57.152 | 17.192 | 1.7046 | 1.3896 | 1.2039 |
| Δ_L = 2, Δ_S = 0.08 | 19.243 | 2.6870 | 12.120 | 13.020 | 1.7864 | 3.5415 | 1.2619 |

The data in the table are the mean Euclidean distance between the true value of the target's coordinate transform parameter vector in each frame and the tracking algorithm's estimate.