CN101026759A - Visual tracking method and system based on particle filtering - Google Patents

Visual tracking method and system based on particle filtering

Info

Publication number
CN101026759A
CN101026759A · CN200710090883A
Authority
CN
China
Prior art keywords
multiple visual features
visual features
synthesis weights
features
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200710090883
Other languages
Chinese (zh)
Other versions
CN100571392C (en)
Inventor
杨杰
程建
凌建国
张翼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN 200710090883 priority Critical patent/CN100571392C/en
Publication of CN101026759A publication Critical patent/CN101026759A/en
Application granted granted Critical
Publication of CN100571392C publication Critical patent/CN100571392C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The method includes the steps of: calculating, from the current set of multi-feature synthesis weights, the multi-feature fused state filtering estimate of the current frame, the fused estimate containing target position information; and determining the target position of the current frame of the visual tracking from the target position information. The invention also discloses a visual tracking system based on particle filtering. Embodiments of the invention embed the democratic integration mechanism into the state estimation procedure of the particle filter to implement visual tracking, which improves the reliability and robustness of visual tracking.

Description

Visual tracking method and system based on particle filter
Technical field
The present invention relates to visual tracking technology, and in particular to a visual tracking method and system based on particle filtering.
Background technology
The essence of visual tracking is to recursively estimate the position of a target of interest in a video sequence using specific image features; image features include color, shape, texture, motion, and so on. The visual tracking problem can thus be regarded as a recursive state estimation problem: once the state estimate of the visual target is obtained, the position of the tracked target can be determined. The prior art includes the following visual tracking methods:
In the prior art, the Democratic Integration method was proposed from the perspective of adaptive, self-organizing fusion of multiple visual features and applied to face tracking. Democratic Integration is a self-organizing multi-feature fusion algorithm that can organize correlated features well so that they jointly play a larger role. The visual tracking method applying Democratic Integration to face tracking is as follows: for each pixel in the image, multiple features are extracted; the similarity between the target represented by each pixel and the real target is obtained by Democratic Integration; the pixel with the highest similarity is then taken as the target center, thereby realizing the tracking process. This is in fact an exhaustive search; it lacks an effective theoretical framework for visual tracking, so the tracking algorithm is inefficient. Moreover, experiments show that the method is sensitive to discontinuous motion of the visual target and to interference from false targets; once a tracking error caused by a false target occurs, recovery is difficult. The visual tracking method of the prior art therefore yields low reliability and poor robustness.
It can thus be seen that the visual tracking methods of the prior art suffer from low reliability and poor robustness.
Summary of the invention
Embodiments of the invention provide a visual tracking method based on particle filtering; the method can improve the reliability and robustness of visual tracking.
Embodiments of the invention also provide a visual tracking system based on particle filtering; the system can improve the reliability and robustness of visual tracking.
A visual tracking method based on particle filtering comprises:
calculating, from the current set of multi-feature synthesis weights, the multi-feature fused state filtering estimate of the current frame, the fused state filtering estimate containing target position information; and
determining the target position of the visual tracking in the current frame from the target position information.
A visual tracking system based on particle filtering comprises a synthesis weight acquisition module, a multi-feature fused state filtering estimation module, and a target position determination module:
the synthesis weight acquisition module obtains the current set of multi-feature synthesis weights and transmits it to the multi-feature fused state filtering estimation module;
the multi-feature fused state filtering estimation module calculates the multi-feature fused state filtering estimate of the current frame from the set of synthesis weights transmitted by the synthesis weight acquisition module;
the target position determination module determines the target position of the current frame from the fused state filtering estimate.
As can be seen from the above scheme, embodiments of the invention obtain the set of multi-feature synthesis weights, calculate the multi-feature fused state filtering estimate from this set, and determine the target position of the current frame from the fused estimate. In this way, the democratic integration mechanism is embedded into the state estimation procedure of the particle filter to implement visual tracking, which improves the reliability and robustness of visual tracking.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the particle-filter-based visual tracking system of an embodiment of the invention;
Fig. 2 is a flow chart of the principle of the particle-filter-based visual tracking method of an embodiment of the invention;
Fig. 3 is a flow chart of a specific embodiment of visual tracking using the principle shown in Fig. 2;
Fig. 4 is a schematic diagram of the tracking process of the particle-filter-based visual tracking method of an embodiment of the invention.
Detailed description of the embodiments
Referring to Fig. 1, a schematic structural diagram of the particle-filter-based visual tracking system of an embodiment of the invention, the system comprises a synthesis weight acquisition module 110, a multi-feature fused state filtering estimation module 120, and a target position determination module 130.
The synthesis weight acquisition module 110 obtains the current set of multi-feature synthesis weights and transmits it to the fused state filtering estimation module 120.
The fused state filtering estimation module 120 calculates the multi-feature fused state filtering estimate of the current frame from the set of synthesis weights transmitted by the synthesis weight acquisition module 110.
The target position determination module 130 determines the target position of the current frame from the fused state filtering estimate calculated by the fused state filtering estimation module 120.
The synthesis weight acquisition module 110 may comprise a single-feature state filtering estimation submodule 111, a state estimation deviation calculation submodule 112, a quality metric calculation submodule 113, and a synthesis weight calculation submodule 114.
The single-feature state filtering estimation submodule 111 obtains the single-feature state filtering estimates of the current frame. A single-feature state filtering estimate is computed from the weights of the sampled particles; the concrete computation is described at step 201.
The state estimation deviation calculation submodule 112 calculates the state estimation deviations from the single-feature estimates transmitted by submodule 111 and the fused estimate transmitted by the fused state filtering estimation module 120; the concrete computation is described at step 204.
The quality metric calculation submodule 113 calculates the feature quality metrics from the state estimation deviations obtained by submodule 112; the concrete computation is described at step 204.
The synthesis weight calculation submodule 114 calculates the current set of multi-feature synthesis weights from the quality metrics obtained by submodule 113; the concrete computation is described at step 205.
The visual tracking system of the embodiment may further comprise a feature disappearance judging module 140 and a feature generation module 150.
The feature disappearance judging module 140 judges whether each synthesis weight in the set transmitted by the synthesis weight acquisition module 110 is less than or equal to a decision value; if so, the feature corresponding to that synthesis weight has disappeared, and the synthesis weight is deleted from the set. The decision value is a preset constant, which may be 0 or another value. The concrete judgment procedure is described at step 3031.
The feature generation module 150, after judging that the number of features corresponding to the set of synthesis weights transmitted by the synthesis weight acquisition module 110 is less than the feature-count threshold, generates a feature label, sets a corresponding synthesis weight for the label, and adds the set synthesis weight to the current set of synthesis weights. The concrete procedure of generating a label and adding its synthesis weight to the set is described at steps 3032 and 3033.
Referring to Fig. 2, a flow chart of the principle of the particle-filter-based visual tracking method of an embodiment of the invention, the method comprises the following steps:
Step 201: obtain the single-feature state filtering estimates. The single-feature state filtering estimate is denoted $\hat{X}_k^m$, the state filtering estimate of the visual target for the m-th visual feature at time k; it is used in step 204 to calculate the state estimation deviation. It is obtained as follows:
First, for the m-th visual feature, calculate the weight $w_k^{(i),m}$ of the sampled particle $X_k^{(i)}$:
$w_k^{(i),m} \propto p(Z_k^{(i),m} \mid X_k^{(i)})$    (1)
where m = 1, 2, ..., M is the index of the feature, i is the index of the sampled particle, and k is the time instant (k = 0 denotes the initial time, in which case the current frame is the initial frame); $w_k^{(i),m}$ is the particle weight under the m-th feature, $Z_k^{(i),m}$ is the observation of the m-th visual feature, $p(Z_k^{(i),m} \mid X_k^{(i)})$ is the observation probability of the m-th visual feature, and $\propto$ denotes proportionality.
Then, from the sampled particle weights $w_k^{(i),m}$, obtain the state filtering estimate of the visual target for the m-th feature, i.e. the single-feature state filtering estimate $\hat{X}_k^m$:
$\hat{X}_k^m = \sum_{i=1}^{N} w_k^{(i),m} X_k^{(i)}$    (2)
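For illustration, a minimal Python sketch of step 201 follows (all function and variable names are hypothetical; the per-feature likelihoods $p(Z_k^{(i),m} \mid X_k^{(i)})$ are assumed to come from feature-specific observation models, e.g. a color-histogram matcher, which the patent does not fix):

```python
import numpy as np

def single_feature_estimate(particles, likelihoods_m):
    """Single-feature state filtering estimate, formulas (1)-(2).

    particles:     (N, D) array of sampled particle states X_k^(i)
    likelihoods_m: (N,) array of p(Z_k^(i),m | X_k^(i)) for one feature m
    Returns the estimate X_hat_k^m and the normalized weights w_k^(i),m.
    """
    w = likelihoods_m / likelihoods_m.sum()   # (1): weight proportional to the likelihood
    x_hat_m = w @ particles                   # (2): weighted mean of the particles
    return x_hat_m, w
```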
Step 202: calculate the multi-feature fused state filtering estimate $\hat{X}_k$ from the set of multi-feature synthesis weights.
The set of synthesis weights is obtained when the visual tracking of the previous frame finishes; its acquisition is the procedure of steps 204 and 205. If the current frame is the first frame, a pre-configured set of multi-feature synthesis weights is used as the set of synthesis weights. The fused state filtering estimate is obtained as follows:
First, define the observation probability $p(Z_k^{(i)} \mid X_k^{(i)})$ of the sampled particle $X_k^{(i)}$ as
$p(Z_k^{(i)} \mid X_k^{(i)}) = \sum_{m=1}^{M} \pi_k^m \, p(Z_k^{(i),m} \mid X_k^{(i)})$    (3)
where m = 1, 2, ..., M; $\pi_k^m$ is the synthesis weight of the m-th visual feature from the previous frame; the synthesis weights of all M visual features of the previous frame are collectively called the set of multi-feature synthesis weights, with $\sum_{m=1}^{M} \pi_k^m = 1$; and $p(Z_k^{(i)} \mid X_k^{(i)})$ is the observation probability of the sampled particle $X_k^{(i)}$. Formula (3) is called the weighted synthesis model.
Then, substitute the synthesis weights $\pi_k^m$ of the previous frame into formula (3) to obtain the observation probability $p(Z_k^{(i)} \mid X_k^{(i)})$ of the sampled particle, and substitute this observation probability into formula (4); by the proportionality of (4), the fused particle weight $w_k^{(i)}$ is obtained:
$w_k^{(i)} \propto p(Z_k^{(i)} \mid X_k^{(i)})$    (4)
Finally, from the particle weights $w_k^{(i)}$ calculated by formula (4), calculate the fused state filtering estimate of the visual target:
$\hat{X}_k = \sum_{i=1}^{N} w_k^{(i)} X_k^{(i)}$    (5)
The fused state filtering estimate $\hat{X}_k$ is used in step 203 to determine the target position of the current frame, and in step 204 to calculate the feature quality metrics.
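A minimal sketch of the weighted synthesis model (3) and the fused estimate (4)-(5), under the same assumptions as the sketch above (names hypothetical; the per-feature likelihoods are supplied externally):

```python
import numpy as np

def fused_estimate(particles, likelihoods, pi):
    """Multi-feature fused state filtering estimate, formulas (3)-(5).

    particles:   (N, D) sampled particle states
    likelihoods: (M, N) per-feature observation probabilities
    pi:          (M,) synthesis weights of the previous frame, summing to 1
    """
    p_fused = pi @ likelihoods        # (3): weighted synthesis model
    w = p_fused / p_fused.sum()       # (4): fused particle weights, normalized
    x_hat = w @ particles             # (5): fused state estimate
    return x_hat, w
```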
Step 203: determine the target position of the current frame from the fused state estimate $\hat{X}_k$ obtained in step 202.
The fused state estimate $\hat{X}_k$ of the current frame contains the information of the target position of the current frame, so the target position can be determined directly from $\hat{X}_k$.
With multiple visual features, the state filtering estimate of the visual target exists at two levels: the single-feature state filtering estimates and the multi-feature fused state filtering estimate. Step 201 calculates the single-feature estimates and step 202 calculates the fused estimate; steps 201 and 202 may be executed simultaneously, or steps 202 and 203 may be executed first and step 201 afterwards.
Step 204: obtain the feature quality metric $q_k^m$, as follows:
First, from the single-feature state filtering estimate $\hat{X}_k^m$ calculated in step 201 and the fused state filtering estimate $\hat{X}_k$ calculated in step 202, obtain the state estimation deviation between the two levels, denoted $d_k^m$:
$d_k^m = \lVert \hat{X}_k - \hat{X}_k^m \rVert_2$    (6)
where $\lVert \cdot \rVert_2$ is the Euclidean norm.
Then, from the deviation $d_k^m$, obtain the feature quality metric $q_k^m$:
$q_k^m = A(d_k^m)$    (7)
where $A(\cdot)$ is a mapping from the set of real numbers into the range (0, 1), so that $0 \le q_k^m \le 1$. $q_k^m$ reflects the contribution of visual feature m to the tracking result. Applying the mapping A to $d_k^m$ smooths the resulting quality metric; A is chosen empirically and may be defined as a negative exponential function, a 0-1 function, or another function. Here, $A(\cdot)$ is assumed to be the 0-1 function:
$q_k^m = A(d_k^m) = \begin{cases} 1 & \text{if } d_k^m \le \delta \\ 0 & \text{if } d_k^m > \delta \end{cases}$    (8)
where δ is the state estimation deviation threshold; a typical value is 4 pixels, i.e. δ = 4 pixels. $q_k^m = 1$ means that at time k feature m contributes well to the tracking; $q_k^m = 0$ means that at time k feature m contributes nothing. Under this 0-1 mapping, the initial value $q_0^m$ of the quality metric is determined from prior knowledge of the feature characteristics in the scene.
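A sketch of step 204 under the 0-1 mapping of formula (8); the vectorized form over all M features is an implementation choice of this sketch, not prescribed by the patent:

```python
import numpy as np

def quality_metrics(x_hat, x_hat_per_feature, delta=4.0):
    """Feature quality metrics under the 0-1 mapping, formulas (6)-(8).

    x_hat:             (D,) fused estimate X_hat_k
    x_hat_per_feature: (M, D) single-feature estimates X_hat_k^m
    delta:             deviation threshold, typically 4 pixels
    """
    d = np.linalg.norm(x_hat_per_feature - x_hat, axis=1)  # (6): deviations d_k^m
    return (d <= delta).astype(float)                      # (8): 1 if d <= delta else 0
```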
Step 205: obtain the set of multi-feature synthesis weights from the quality metric $q_k^m$. The calculated set of synthesis weights is used to calculate the multi-feature fused state filtering estimate of the next frame. The procedure is as follows:
First, within the particle filter framework, the self-organizing correction of the synthesis weight $\pi_k^m$ based on the quality metric is denoted $\dot{\pi}_k^m$ and defined by
$\tau \dot{\pi}_k^m = q_k^m - \pi_k^m$    (9)
In the actual tracking process, the difference form of formula (9) is used to express the self-organizing corrected synthesis weight $\pi_k^m$:
$\pi_k^m = \pi_{k-1}^m + \beta (q_{k-1}^m - \pi_{k-1}^m)$    (10)
where $q_{k-1}^m - \pi_{k-1}^m = \tau \dot{\pi}_{k-1}^m$ and β is the synthesis weight adjustment rate; a typical value is β = 0.05.
Then, normalize the resulting synthesis weights $\pi_k^m$.
Because the [0, 1] mapping in the calculation of $q_k^m$ by formula (8) means that $q_k^m$ no longer has the normalization property, the $\pi_k^m$ calculated by formula (10) no longer have it either, so the synthesis weights $\pi_k^m$ must be normalized. The normalization is: sum the M synthesis weights $\pi_k^m$ over m = 1, 2, ..., M, and divide the synthesis weight of each feature m by this sum; this yields the normalized synthesis weights, and the M normalized weights for m = 1, 2, ..., M form the normalized set of multi-feature synthesis weights. The normalization expression is $\pi_k^m / \sum_{m=1}^{M} \pi_k^m$.
The normalization here is normalization in the broad sense: it may normalize the sum to 1, to 2, or to another value.
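A sketch of the self-organizing update (10) followed by broad-sense normalization to a sum of 1 (one of the admissible choices):

```python
import numpy as np

def update_synthesis_weights(pi_prev, q_prev, beta=0.05):
    """Self-organizing correction of the synthesis weights, formula (10),
    followed by broad-sense normalization (here, to sum 1)."""
    pi = pi_prev + beta * (q_prev - pi_prev)  # (10): move pi toward the quality metric
    return pi / pi.sum()
```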
Steps 204 and 205 calculate the normalized set of $\pi_k^m$, which is applied in the tracking computation of the next frame: when calculating the multi-feature fused state filtering estimate of the next frame, the normalized $\pi_k^m$ are substituted into formula (3), and steps 201-205 are executed again.
The above scheme is the basic principle of the visual tracking method of the invention. In Fig. 2, steps 201-203 are the single-feature state filtering estimation and multi-feature fused state filtering estimation procedures, i.e. the particle filtering process; steps 204 and 205 are the procedure of obtaining the synthesis weights with the democratic integration algorithm. The method of Fig. 2 is elaborated below with the specific embodiment shown in Fig. 3.
Referring to Fig. 3, a flow chart of a specific embodiment of visual tracking using the principle shown in Fig. 2, the method comprises the following steps:
Step 301: perform the particle state transition.
Specifically, the particles of the current time are computed from a given state transition model and the sampled particles of the previous frame. The sampled particles of the previous frame are denoted $X_{k-1}^{(i)}$, the sampled particles of the current frame are denoted $\tilde{X}_k^{(i)}$, and the sampled particle set of the current frame is denoted $\{\tilde{X}_k^{(i)}, w_k^{(i)}\}_{i=1}^{N}$. Computing the particles of the current time from a state transition model and the sampled particles of the previous frame is a process well known to those skilled in the art and is not repeated here. The state transition model may be a motion state equation.
If the current frame is the initial frame of the visual tracking, the particle state transition is the initialization of the visual tracking, which comprises:
At the initial frame, i.e. k = 0: determine the initial sampled particle set according to the prior distribution $p(X_0)$, expressed as $\{X_0^{(i)}, \frac{1}{N}\}_{i=1}^{N}$; for the sampled particle set, initialize the multi-feature synthesis weights, here assumed to be $\pi_0^m = \frac{1}{M}$ for m = 1, ..., M, the M synthesis weights together forming the set of multi-feature synthesis weights; initialize the feature quality metrics $q_0^m$ according to the feature characteristics of the tracked visual target in the scene; and determine the state estimation deviation threshold and the synthesis weight adjustment rate, here assumed to be 4 and 0.05 respectively, i.e. δ = 4 and β = 0.05. Initializing the synthesis weights and the quality metrics is precisely the pre-configuration of the multi-feature synthesis weights and quality metrics.
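A sketch of the initialization and state transition of step 301. The patent leaves the prior $p(X_0)$ and the state transition model unspecified, so the prior sampler is passed in and a Gaussian random walk stands in for the motion state equation; both are assumptions of this sketch:

```python
import numpy as np

def init_tracker(N, M, prior_sampler, q0):
    """Initialization at k = 0 (step 301): draw the initial particle set
    {X_0^(i), 1/N} from the prior p(X_0), set uniform synthesis weights
    pi_0^m = 1/M and scene-dependent initial quality metrics q_0^m.
    prior_sampler and q0 are caller-supplied assumptions."""
    particles = prior_sampler(N)              # (N, D) samples from p(X_0)
    weights = np.full(N, 1.0 / N)
    pi = np.full(M, 1.0 / M)                  # pi_0^m = 1/M
    q = np.asarray(q0, dtype=float)           # q_0^m from scene priors
    return particles, weights, pi, q

def propagate(particles, step_std=2.0, rng=None):
    """Particle state transition: a Gaussian random walk stands in for the
    unspecified motion state equation (an assumption of this sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    return particles + rng.normal(0.0, step_std, size=particles.shape)
```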
Step 302: calculate the single-feature state filtering estimates $\hat{X}_k^m$, m = 1, ..., M, according to step 201. The detailed procedure is:
First, for the m-th feature, calculate the weight $w_k^{(i),m}$ of the sampled particle $\tilde{X}_k^{(i)}$:
$w_k^{(i),m} \propto p(Z_k^{(i),m} \mid \tilde{X}_k^{(i)})$    (11)
where m = 1, 2, ..., M is the index of the feature, i is the index of the sampled particle, and k is the time instant (k = 0 denotes the initial time, in which case the current frame is the initial frame); $w_k^{(i),m}$ is the particle weight under the m-th feature, $Z_k^{(i),m}$ is the observation of the m-th visual feature, $p(Z_k^{(i),m} \mid \tilde{X}_k^{(i)})$ is the observation probability of the m-th visual feature, and $\propto$ denotes proportionality.
Then, normalize the calculated weights: sum the N weights $w_k^{(i),m}$ over i = 1, 2, ..., N, and divide the weight of each particle i by this sum, which yields the normalized weights:
$w_k^{(i),m} / \sum_{i=1}^{N} w_k^{(i),m}$    (12)
The normalization here is normalization in the broad sense: it may normalize the sum to 1, to 2, or to another value.
Finally, substitute the weights $w_k^{(i),m}$ normalized by formula (12) into formula (13) to obtain the state filtering estimate of the visual target for the m-th feature, i.e. the single-feature state filtering estimate $\hat{X}_k^m$:
$\hat{X}_k^m = \sum_{i=1}^{N} w_k^{(i),m} \tilde{X}_k^{(i)}$    (13)
Step 303: calculate the fused particle weights $w_k^{(i)}$. The procedure is:
Step 3031: perform the feature disappearance judgment.
Suppose the number of visual features available for tracking is M; each visual feature has a feature label, and the set of the M feature labels is denoted Ω. Suppose the number of features used for tracking at time k is $M_k$, and the set of these $M_k$ feature labels is denoted $\Omega_k$. The lower threshold on the number of features used for tracking is denoted $M_0$; here $M_0$ takes the typical value 2.
The feature disappearance judgment is: for each synthesis weight in the current set of multi-feature synthesis weights, judge whether it is less than or equal to a decision value; if so, the feature corresponding to that synthesis weight has disappeared, and the synthesis weight is deleted from the set. The decision value is a preset constant, which may be 0 or another value. For example, for the synthesis weight $\pi_k^m$ of a feature m: if $\pi_k^m \le 0$, feature m has disappeared, so its synthesis weight is deleted from the set, its label is deleted from $\Omega_k$, and $M_k = |\Omega_k|$ is updated; otherwise feature m persists. Here $|\Omega_k|$ denotes the cardinality of the label set $\Omega_k$.
The synthesis weights $\pi_k^m$ above are the values calculated during the tracking of the previous frame; $\pi_k^m$ may be a value very close to zero, and may even be negative.
Step 3032: perform the feature generation judgment, as follows:
(1) Judge the feature count $M_k$: if $M_k$ is less than the configured lower threshold $M_0$, i.e. $M_k < M_0$, execute (2); otherwise execute step 3033.
(2) Generate a random number, denoted u. It may be uniformly distributed on (0, 1), or follow another distribution on another interval.
(3) Multiply u by M and round the product; the result is a feature label, denoted label: label = round(uM). The label corresponds to one feature.
(4) Judge whether the label belongs to the label set $\Omega_k$. If not, i.e. label ∉ $\Omega_k$, the feature corresponding to the label is valid: add the label to $\Omega_k$, update $M_k = |\Omega_k|$, and execute (5); otherwise the feature corresponding to the label is invalid, so execute (2).
(5) According to the characteristics in the scene of the feature corresponding to the label, initialize the quality metric and the synthesis weight of the feature whose label was added to $\Omega_k$ in step (4), add the initialized synthesis weight to the set of multi-feature synthesis weights, and execute (1). The initialization is the same as the initialization of the initial frame in step 301 and is not repeated here.
Steps (2)-(5) thus randomly select a feature not yet in the label set and initialize its quality metric and synthesis weight, as sketched below.
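A sketch of steps 3031 and 3032, with dictionaries keyed by feature label standing in for π, q, and $\Omega_k$; the initial values assigned to a newly generated feature (pi_init, q_init) are hypothetical placeholders for the scene-dependent initialization:

```python
import numpy as np

def manage_features(pi, q, labels, M, M0=2, pi_init=0.1, q_init=0.5, rng=None):
    """Feature disappearance (step 3031) and generation (step 3032).

    pi, q:   dicts label -> synthesis weight / quality metric
    labels:  set Omega_k of active feature labels (subset of 1..M)
    M, M0:   total feature count and lower threshold (typically 2)
    pi_init, q_init: hypothetical initial values for a newly added feature
    """
    rng = np.random.default_rng() if rng is None else rng
    # Step 3031: a feature whose synthesis weight is <= the decision value
    # (here 0) has disappeared; delete it from the set.
    for m in list(labels):
        if pi[m] <= 0.0:
            labels.discard(m)
            pi.pop(m)
            q.pop(m)
    # Step 3032: while M_k < M_0, draw label = round(u * M) at random and,
    # if it is valid and not yet in Omega_k, add and initialize it.
    while len(labels) < M0:
        label = int(round(rng.uniform() * M))
        if label < 1 or label > M or label in labels:
            continue
        labels.add(label)
        pi[label] = pi_init
        q[label] = q_init
    # Broad-sense normalization of the surviving synthesis weights (sum to 1).
    total = sum(pi.values())
    for m in pi:
        pi[m] /= total
    return pi, q, labels
```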
The feature disappearance judgment and the feature generation judgment reduce the computational complexity of multi-feature fused tracking and improve the real-time performance of the tracking algorithm. Step 303 may also omit steps 3031 and 3032.
Step 3033: substitute each synthesis weight $\pi_k^m$ of the set of multi-feature synthesis weights obtained in step (5) into the weighted synthesis model, i.e. formula (3), to calculate the observation probability $p(Z_k^{(i)} \mid X_k^{(i)})$ of the sampled particle $X_k^{(i)}$. Substitute the calculated observation probability into formula (4) to obtain the fused particle weight $w_k^{(i)}$. Normalize the weights $w_k^{(i)}$: sum the N weights over i = 1, 2, ..., N, and divide the weight of each particle i by this sum, which yields the normalized weights: $w_k^{(i)} / \sum_{i=1}^{N} w_k^{(i)}$.
The normalization here is normalization in the broad sense: it may normalize the sum to 1, to 2, or to another value.
Step 302 and step 303 may be executed simultaneously, or step 303 may be executed before step 302.
Step 304: substitute the normalized fused particle weights $w_k^{(i)}$ calculated in step 303 into formula (5) to obtain the multi-feature fused state filtering estimate $\hat{X}_k$ of the current frame:
$\hat{X}_k = \sum_{i=1}^{N} w_k^{(i)} \tilde{X}_k^{(i)}$
Step 305: determine the target position of the current frame from the fused state filtering estimate $\hat{X}_k$ of the current frame calculated in step 304.
The fused state estimate $\hat{X}_k$ of the current frame contains the information of the target position of the current frame, so the target position can be determined directly from $\hat{X}_k$.
Step 306: calculate the feature quality metric $q_k^m$ of the current frame, as follows:
First, from the single-feature state filtering estimate $\hat{X}_k^m$ calculated in step 302 and the normalized fused state filtering estimate $\hat{X}_k$ calculated in step 304, calculate the state estimation deviation between the two levels:
$d_k^m = \lVert \hat{X}_k - \hat{X}_k^m \rVert_2$
where $\lVert \cdot \rVert_2$ is the Euclidean norm.
Then, from the deviation $d_k^m$, obtain the feature quality metric:
$q_k^m = A(d_k^m)$
where $A(\cdot)$ is a mapping from the set of real numbers into the range (0, 1), so that $0 \le q_k^m \le 1$; $q_k^m$ reflects the contribution of feature m to the tracking result. Applying A to $d_k^m$ smooths the resulting quality metric; A is chosen empirically and may be defined as a negative exponential function, a 0-1 function, or another function. Here, $A(\cdot)$ is assumed to be the 0-1 function:
$q_k^m = A(d_k^m) = \begin{cases} 1 & \text{if } d_k^m \le \delta \\ 0 & \text{if } d_k^m > \delta \end{cases}$
where δ is the state estimation deviation threshold; a typical value is 4 pixels, i.e. δ = 4 pixels. $q_k^m = 1$ means that at time k feature m contributes well to the tracking; $q_k^m = 0$ means that at time k feature m contributes nothing. Under this 0-1 mapping, the initial value $q_0^m$ of the quality metric is determined from prior knowledge of the feature characteristics in the scene.
Step 307: obtain the synthesis weights $\pi_k^m$ of the current frame from the quality metrics $q_k^m$. The calculated synthesis weights are used to calculate the fused state filtering estimate of the next frame. The procedure is:
First, within the particle filter framework, define the self-organizing correction of the synthesis weight $\pi_k^m$ based on the quality metric as
$\tau \dot{\pi}_k^m = q_k^m - \pi_k^m$
In the actual tracking process, the difference form of formula (9) is used to express the self-organizing corrected synthesis weight:
$\pi_k^m = \pi_{k-1}^m + \beta (q_{k-1}^m - \pi_{k-1}^m)$
where $q_{k-1}^m - \pi_{k-1}^m = \tau \dot{\pi}_{k-1}^m$ and β is the synthesis weight adjustment rate; a typical value is β = 0.05.
Then, normalize the resulting synthesis weights $\pi_k^m$.
Because the [0, 1] mapping in the calculation of $q_k^m$ by formula (8) means that $q_k^m$ no longer has the normalization property, the $\pi_k^m$ calculated by formula (10) no longer have it either, so the $\pi_k^m$ must be normalized; the M normalized synthesis weights for m = 1, 2, ..., M form the normalized set of multi-feature synthesis weights. The normalization expression is:
$\pi_k^m / \sum_{m=1}^{M} \pi_k^m$
The normalization here is normalization in the broad sense: it may normalize the sum to 1, to 2, or to another value.
Steps 306 and 307 calculate the normalized set of multi-feature synthesis weights, which will be used in the tracking computation of the next frame.
Step 308: perform particle resampling.
According to the normalized fused particle weights $w_k^{(i)}$ calculated in step 3033, re-draw N particles from the sampled particle set $\{\tilde{X}_k^{(i)}, w_k^{(i)}\}_{i=1}^{N}$ to form a new sampled particle set, expressed as $\{X_k^{(i)}, \frac{1}{N}\}_{i=1}^{N}$.
Then set k = k + 1 and execute step 301 to perform the tracking computation of the next frame.
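A sketch of step 308 using multinomial resampling, one common way to re-draw N particles in proportion to their weights (the patent does not prescribe a particular resampling scheme):

```python
import numpy as np

def resample(particles, weights, rng=None):
    """Step 308: multinomial resampling. Re-draw N particles in proportion
    to the normalized fused weights w_k^(i) and reset all weights to 1/N."""
    rng = np.random.default_rng() if rng is None else rng
    N = len(weights)
    idx = rng.choice(N, size=N, p=weights)    # weights must sum to 1
    return particles[idx], np.full(N, 1.0 / N)
```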
Referring to Fig. 4, a schematic diagram of the tracking process of the particle-filter-based visual tracking method of an embodiment of the invention: the tracked target in Fig. 4 is the head of the woman in the image, and it can be seen that the tracking process of the embodiment stays locked on the woman's head throughout. Thus the visual tracking method of the embodiment of the invention has higher reliability and robustness than the prior art.
The technical scheme of the embodiments of the invention can be applied to video surveillance, video compression coding, robot navigation and localization, intelligent human-machine interaction, virtual reality, imaging guidance, and other fields.
The technical scheme of the embodiments of the invention obtains the set of multi-feature synthesis weights by the democratic integration algorithm, calculates the multi-feature fused state filtering estimate from this set, and then determines the target position of the visual tracking in the current frame. That is, the embodiments embed the democratic integration mechanism into the particle filtering process to implement visual tracking. In this way, the visual tracking attains higher reliability and robustness, tracking under complex scenes is realized, the efficiency of the tracking algorithm is improved, and the real-time requirements of practical visual tracking application systems are satisfied.
The specific embodiments described above further explain the purpose, technical scheme, and beneficial effects of the present invention in detail. It should be understood that the above are only specific embodiments of the present invention and are not intended to limit the scope of the invention; any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (12)

1. A visual tracking method based on particle filtering, characterized in that the method comprises the steps of:
calculating, from the current set of multi-feature synthesis weights, the multi-feature fused state filtering estimate of the current frame, the fused state filtering estimate containing target position information; and
determining the target position of the visual tracking in the current frame from the target position information.
2. The method of claim 1, characterized in that, if the current frame is the first frame, a pre-configured set of multi-feature synthesis weights is used as the set of multi-feature synthesis weights.
3. The method of claim 1, characterized in that calculating the multi-feature fused state filtering estimate of the current frame from the current set of multi-feature synthesis weights comprises:
for each synthesis weight in the set of multi-feature synthesis weights, judging whether the synthesis weight is less than or equal to a decision value; if so, the feature corresponding to the synthesis weight has disappeared, and the synthesis weight is deleted from the set of multi-feature synthesis weights.
4. The method of claim 1 or 3, characterized in that calculating the multi-feature fused state filtering estimate of the current frame from the current set of multi-feature synthesis weights further comprises the steps of:
judging whether the number of features corresponding to all synthesis weights in the current set of multi-feature synthesis weights is less than a preset lower threshold on the number of visual features; if so, executing the step of generating a random number; otherwise, continuing to execute the step of determining the target position of the visual tracking in the current frame from the target position information;
generating a random number;
multiplying the random number by the feature count and rounding the resulting product to obtain a feature label;
judging whether the resulting feature label belongs to the feature label set; if not, adding the resulting feature label to the feature label set and executing the step of setting a corresponding synthesis weight for the feature label added to the feature label set; otherwise, the feature corresponding to the label is invalid, and the step of generating a random number is executed; wherein the feature label set is the set of labels of the features corresponding to all synthesis weights in the current set of multi-feature synthesis weights;
setting a corresponding synthesis weight for the feature label added to the feature label set, adding the set synthesis weight to the set of multi-feature synthesis weights, and executing the step of judging whether the number of features corresponding to all synthesis weights in the current set of multi-feature synthesis weights is less than the preset lower threshold on the number of visual features.
5. The method of claim 1, characterized in that the step of calculating the multi-feature fused state filtering estimate of the current frame specifically comprises:
determining the current sampled particle set;
obtaining the observation probabilities of the sampled particles of the current sampled particle set from the current set of multi-feature synthesis weights and the weighted synthesis model;
obtaining the normalized fused weights of the sampled particles from the observation probabilities;
obtaining the multi-feature fused state filtering estimate of the current frame from the normalized fused weights of the sampled particles.
6. The method of claim 5, characterized in that the step of determining the current sampled particle set comprises: re-drawing particles from the sampled particle set of the previous frame according to the multi-feature fused particle weights of the previous frame to form the current sampled particle set.
7. The method of claim 1, characterized in that obtaining the current set of multi-feature synthesis weights comprises:
obtaining the single-feature state filtering estimates of the previous frame;
obtaining the state estimation deviations from the single-feature state filtering estimates of the previous frame and the multi-feature fused state filtering estimate of the previous frame;
obtaining the feature quality metrics from the state estimation deviations;
obtaining the set of multi-feature synthesis weights from the quality metrics.
8. The method of claim 7, characterized in that the single-feature state filtering estimates of the previous frame are obtained from the weights of the sampled particles of the previous frame.
9. A visual tracking system based on particle filtering, characterized in that the system comprises a synthesis weight acquisition module, a multi-feature fused state filtering estimation module, and a target position determination module:
the synthesis weight acquisition module obtains the current set of multi-feature synthesis weights;
the multi-feature fused state filtering estimation module calculates the multi-feature fused state filtering estimate of the current frame from the current set of multi-feature synthesis weights;
the target position determination module determines the target position of the current frame from the fused state filtering estimate.
10. The system of claim 9, characterized in that the synthesis weight acquisition module comprises a single-feature state filtering estimation submodule, a state estimation deviation calculation submodule, a quality metric calculation submodule, and a synthesis weight calculation submodule:
the single-feature state filtering estimation submodule obtains the single-feature state filtering estimates of the current frame;
the state estimation deviation calculation submodule calculates the state estimation deviations from the single-feature state filtering estimates and the fused state filtering estimate transmitted by the multi-feature fused state filtering estimation module;
the quality metric calculation submodule obtains the feature quality metrics from the state estimation deviations;
the synthesis weight calculation submodule calculates the current set of multi-feature synthesis weights from the quality metrics obtained by the quality metric calculation submodule.
11. The system of claim 9 or 10, characterized in that the system further comprises a feature disappearance judging module;
the feature disappearance judging module judges whether each synthesis weight in the set of multi-feature synthesis weights is less than or equal to a decision value; if so, the feature corresponding to the synthesis weight has disappeared, and the synthesis weight is deleted from the set of multi-feature synthesis weights.
12. The system of claim 9 or 10, characterized in that the system further comprises a feature generation module, which, after judging that the number of features corresponding to the set of multi-feature synthesis weights is less than the feature-count threshold, generates a feature label, sets a corresponding synthesis weight for the label, and adds the set synthesis weight to the current set of multi-feature synthesis weights.
CN 200710090883 2007-04-09 2007-04-09 Visual tracking method and system based on particle filter Expired - Fee Related CN100571392C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200710090883 CN100571392C (en) 2007-04-09 2007-04-09 Visual tracking method and system based on particle filter

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 200710090883 CN100571392C (en) 2007-04-09 2007-04-09 Visual tracking method and system based on particle filter

Publications (2)

Publication Number Publication Date
CN101026759A true CN101026759A (en) 2007-08-29
CN100571392C CN100571392C (en) 2009-12-16

Family

ID=38744594

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200710090883 Expired - Fee Related CN100571392C (en) 2007-04-09 2007-04-09 Visual tracking method and system based on particle filter

Country Status (1)

Country Link
CN (1) CN100571392C (en)


Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101394546B (en) * 2007-09-17 2010-08-25 华为技术有限公司 Video target profile tracing method and device
CN101339655B (en) * 2008-08-11 2010-06-09 浙江大学 Visual sense tracking method based on target characteristic and bayesian filtering
CN101923716B (en) * 2009-06-10 2012-07-18 新奥特(北京)视频技术有限公司 Method for improving particle filter tracking effect
CN102194234A (en) * 2010-03-03 2011-09-21 中国科学院自动化研究所 Image tracking method based on sequential particle swarm optimization
CN102186078A (en) * 2011-05-11 2011-09-14 浙江大学 Particle filter based multi-frame reference motion estimation method
CN103024344A (en) * 2011-09-20 2013-04-03 佳都新太科技股份有限公司 Automatic PTZ (Pan/Tilt/Zoom) target tracking method based on particle filter
CN102722702B (en) * 2012-05-28 2015-01-28 河海大学 Multiple feature fusion based particle filter video object tracking method
CN102722702A (en) * 2012-05-28 2012-10-10 河海大学 Multiple feature fusion based particle filter video object tracking method
CN102982555A (en) * 2012-11-01 2013-03-20 江苏科技大学 Guidance infrared small target tracking method based on self-adaption manifold particle filters
CN102982555B (en) * 2012-11-01 2016-12-21 江苏科技大学 Guidance Tracking Method of IR Small Target based on self adaptation manifold particle filter
CN103150546A (en) * 2012-12-26 2013-06-12 冉阳 Video face identification method and device
CN103150546B (en) * 2012-12-26 2016-03-16 冉阳 video face identification method and device
CN103557792B (en) * 2013-11-12 2015-10-28 中国科学院自动化研究所 A kind of vision of drogue target is followed the tracks of and location measurement method
CN103557792A (en) * 2013-11-12 2014-02-05 中国科学院自动化研究所 Method for visual tracking and position measurement of drogue object
CN104200495A (en) * 2014-09-25 2014-12-10 重庆信科设计有限公司 Multi-target tracking method in video surveillance
CN104200495B (en) * 2014-09-25 2017-03-29 重庆信科设计有限公司 A kind of multi-object tracking method in video monitoring
CN106127808A (en) * 2016-06-20 2016-11-16 浙江工业大学 A kind of block particle filter method for tracking target based on color and the anti-of local binary patterns Feature Fusion
CN106127808B (en) * 2016-06-20 2018-09-07 浙江工业大学 It is a kind of that particle filter method for tracking target is blocked based on color and the anti-of local binary patterns Fusion Features
CN107403222A (en) * 2017-07-19 2017-11-28 燕山大学 A kind of motion tracking method based on auxiliary more new model and validity check

Also Published As

Publication number Publication date
CN100571392C (en) 2009-12-16

Similar Documents

Publication Publication Date Title
CN100571392C (en) Visual tracking method and system based on particle filter
CN103106667B (en) A kind of towards blocking the Moving Objects method for tracing with scene change
CN109669049B (en) Particle image velocity measurement method based on convolutional neural network
CN104215249B (en) Smoothening method of driving track
EP2071515A1 (en) Visually tracking an object in real world using 2D appearance and multicue depth estimations
EP2418622A2 (en) Image processing method and image processing apparatus
CN107563286A (en) A kind of dynamic gesture identification method based on Kinect depth information
CN102103689B (en) Frontal face image synthesis-based face recognition method
Jia et al. Sensor fusion-based visual target tracking for autonomous vehicles with the out-of-sequence measurements solution
CN102156537A (en) Equipment and method for detecting head posture
CN103150728A (en) Vision positioning method in dynamic environment
CN106127125A (en) Distributed DTW human body behavior intension recognizing method based on human body behavior characteristics
CN104881029A (en) Mobile robot navigation method based on one point RANSAC and FAST algorithm
CN103003846A (en) Articulation region display device, articulation region detection device, articulation region relatedness computation device, articulation shape region relatedness computation device, and articulation region display method
CN104794737A (en) Depth-information-aided particle filter tracking method
CN110827320B (en) Target tracking method and device based on time sequence prediction
Argyros et al. Binocular hand tracking and reconstruction based on 2D shape matching
CN114693720A (en) Design method of monocular vision odometer based on unsupervised deep learning
CN102968615B (en) A kind of three-dimensional somatic data identification method of the anti-interference function had in intensive people flow
CN101799927B (en) Cartoon role contour tracing method based on key frame
Jia et al. Vision based data fusion for autonomous vehicles target tracking using interacting multiple dynamic models
CN102214301A (en) Multi-target tracking method for associated cooperation of adaptive motion
CN114283265A (en) Unsupervised face correcting method based on 3D rotation modeling
Zhao et al. An approach based on mean shift and kalman filter for target tracking under occlusion
Huang et al. Fast initialization method for monocular slam based on indoor model

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20091216

Termination date: 20170409

CF01 Termination of patent right due to non-payment of annual fee