CN1908963A

CN1908963A - Method for detection and tracking of deformable objects

Info

Publication number: CN1908963A
Application number: CNA2006101592420A
Authority: CN
Inventors: D·克雷默斯
Original assignee: Siemens Medical Solutions USA Inc
Current assignee: Siemens Medical Solutions USA Inc; Draeger Medical Systems Inc
Priority date: 2005-08-03
Filing date: 2006-08-03
Publication date: 2007-02-07

Abstract

A method for detecting and tracking a deformable object having a sequentially changing behavior, comprising: developing a temporal statistical shape model of the oscillatory behavior of the embedding function representing the object from prior motion; and then applying the model against future, sequential motion of the object in the presence of unwanted phenomena by maximizing the probability that the developed statistical shape model matches the sequential motion of the object in the presence of unwanted phenomena.

Description

Method for detection and tracking of deformable objects

Technical field

The present invention relates generally to object detection, and more particularly to the detection and tracking of deformable objects.

Background technology

This application claims priority to U.S. Provisional Application No. 60/705,061, filed on August 3, 2005, which is incorporated herein by reference.

As is known in the art, it is often desirable to detect and segment an object from a background of other objects and/or from a background of noise. One application, for example, is in MRI, where it is desired to segment an anatomical feature of a patient, such as the patient's vertebrae. In other cases, it may be desired to segment a moving, deformable anatomical feature such as the heart.

In 1988, Osher and Sethian introduced the level set method in a paper entitled "Fronts propagation with curvature dependent speed: Algorithms based on Hamilton-Jacobi formulations" (J. of Comp. Phys., 79:12-49, 1988). It should be noted that a precursor of the level set method was proposed by Dervieux and Thomasset in a paper entitled "A finite element method for the simulation of Raleigh-Taylor instability" (Springer Lect. Notes in Math., 771:145-158, 1979). The level set method implicitly propagates a hypersurface C(t) in a domain Ω ⊂ R^n by evolving a suitable embedding function φ: Ω × [0, T] → R, where:

C(t) = {x ∈ Ω | φ(x, t) = 0}. (1)

In general, the embedding function is a real-valued height function φ(x) defined at each point x of the image plane, such that the contour C corresponds to all points x in the plane where φ(x) = 0:

C = {x | φ(x) = 0}.

This is a way of representing the contour C implicitly. Instead of working with the contour C directly (moving the contour, and so on), one works with the function φ. Changing the values of φ implicitly moves the "embedded" contour. This is why φ(x) is called an "embedding function": it embeds the contour as its zero level, that is, as the isoline of value 0.

Thus, the ordinary differential equation that propagates explicit boundary points is replaced by a partial differential equation that models the evolution of a higher-dimensional embedding function. The key benefits of this approach are well known. Firstly, the implicit boundary representation does not depend on a specific parameterization, so no control point re-gridding mechanism needs to be introduced during the propagation. Secondly, evolving the embedding function allows topological changes, such as splitting and merging of the embedded boundary, to be modeled elegantly. In the context of shape modeling and statistical learning of shapes, this property allows the construction of shape dissimilarity measures, defined on the embedding functions, that can handle shapes of varying topology. Thirdly, the implicit representation of equation (1) generalizes naturally to hypersurfaces in three or more dimensions. To impose a unique correspondence between a contour and its embedding function, φ can be constrained to be a signed distance function, that is, |∇φ| = 1 almost everywhere.
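The implicit representation of equation (1) and the signed distance constraint can be illustrated with a minimal NumPy sketch. The grid size, the disc shapes, and the helper names (`circle_sdf`, `two_discs`, `count_components_on_row`) are assumptions made for illustration only and are not part of the described method. The sketch uses the convention that φ is positive inside the object.

```python
import numpy as np

def circle_sdf(xx, yy, cx, cy, r):
    """Signed distance to a circle: positive inside, negative outside."""
    return r - np.sqrt((xx - cx) ** 2 + (yy - cy) ** 2)

# Regular grid sampling of the domain (assumed: 101 x 101 points on [-5, 5]^2).
xs = np.linspace(-5.0, 5.0, 101)
xx, yy = np.meshgrid(xs, xs, indexing="xy")
h = xs[1] - xs[0]

phi = circle_sdf(xx, yy, 0.0, 0.0, 2.0)

# Signed distance property: |grad phi| = 1 away from the singularity at the center.
gy, gx = np.gradient(phi, h)
grad_norm = np.sqrt(gx ** 2 + gy ** 2)
away_from_center = np.sqrt(xx ** 2 + yy ** 2) > 0.5

def count_components_on_row(phi_row):
    """Number of separate 'inside' intervals (phi > 0) along one grid row."""
    inside = phi_row > 0
    return int(np.sum(inside[1:] & ~inside[:-1]) + inside[0])

def two_discs(d):
    """Union of two unit discs centered at (-d, 0) and (d, 0): pointwise maximum."""
    return np.maximum(circle_sdf(xx, yy, -d, 0.0, 1.0),
                      circle_sdf(xx, yy, d, 0.0, 1.0))

row = 50  # grid row corresponding to y = 0
merged = count_components_on_row(two_discs(0.5)[row])  # discs overlap
split = count_components_on_row(two_discs(2.0)[row])   # discs are separated
```

Moving the two discs apart changes the topology of the zero level from one region to two, while the embedding function itself remains a single array on a fixed grid; no re-parameterization of the contour is involved.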

In the early 1990s, the first applications of the level set method to image segmentation were pioneered by Malladi et al. in a paper entitled "A finite element method for the simulation of Raleigh-Taylor instability" (Springer Lect. Notes in Math., 771:145-158, 1979), by Caselles et al. in a paper entitled "Geodesic active contour" (Proc. IEEE Intl. Conf. on Comp. Vis., pages 694-699, Boston, USA, 1995), by Kichenassamy et al. in a paper entitled "Gradient flows and geometric active contour models" (IEEE Intl. Conf. on Comp. Vis., pages 810-815, 1995), and by Paragios and Deriche in a paper entitled "Geodesic active regions and level set methods for supervised texture segmentation" (Int. J. of Computer Vision, 46(3):223-247, 2002). Level set implementations of the Mumford-Shah functional (see the paper entitled "Optimal approximations by piecewise smooth functions and associated variational problems", Comm. Pure Appl. Math., 42:577-685, 1989) were proposed independently by Chan and Vese, see the paper entitled "Active contours without edges" (IEEE Trans. Image Processing, 10(2):266-277, 2001), and by Tsai et al. in a paper entitled "Model-based curve evolution technique for image segmentation" (Comp. Vision Patt. Recog., pages 463-468, Kauai, Hawaii, 2001).

In recent years, researchers have proposed introducing statistical shape knowledge into level set based segmentation methods in order to cope with insufficient low-level information. While these priors were shown to drastically improve the segmentation of familiar objects, the focus so far has been on statistical shape priors that are static in time. Yet, in the context of tracking deformable objects, it is clear that certain silhouettes (such as those of a beating human heart in an MRI application, or of a walking person in another application) may become more or less likely over time. Leventon et al., in a paper entitled "Geometry and prior-based segmentation" (T. Pajdla and V. Hlavac, editors, European Conf. on Computer Vision, volume 3024 of LNCS, pages 50-61, Prague, Springer, 2004), proposed to model the embedding function by principal component analysis (PCA) of a set of training shapes and to add appropriate driving terms to the level set evolution equation. Tsai et al., in a paper entitled "Curve evolution implementation of the Mumford-Shah functional for image segmentation, de-noising, interpolation, and magnification" (IEEE Trans. on Image Processing, 10(8):1169-1186, 2001), suggested performing the optimization directly within the subspace of the first few eigenmodes. Rousson et al. (see "Shape priors for level set representations", A. Heyden et al., editors, Proc. of the Europ. Conf. on Comp. Vis., volume 2351 of LNCS, pages 78-92, Copenhagen, May 2002, Springer, Berlin, and "Implicit active shape models for 3d segmentation in MRI imaging", MICCAI, pages 209-216, 2004) suggested introducing shape information on the variational level, while Chen et al. (see "Using shape priors in geometric active contours in a variational framework", Int. J. of Computer Vision, 50(3):315-328, 2002) imposed shape constraints directly on the contour given by the zero level of the embedding function. More recently, Riklin-Raviv et al. (see European Conf. on Computer Vision, volume 3024 of LNCS, pages 50-61, Prague, Springer, 2004) proposed to introduce projective invariance by slicing the signed distance function at various angles.

In the above works, statistically learned shape information was shown to cope with missing or misleading information in the input images caused by noise, clutter and occlusion. These shape priors were developed to segment objects of familiar shape in a given image. Although they can be applied to tracking objects in image sequences, see [Cremers et al., "Nonlinear shape statistics in Mumford-Shah based segmentation" (A. Heyden et al., editors, Europ. Conf. on Comp. Vis., volume 2351 of LNCS, pages 93-108, Copenhagen, May 2002, Springer)], [Moelich and Chan, "Tracking objects with the Chan-Vese algorithm" (Technical Report 03-14, Computational Applied Mathematics, UCLA, Los Angeles, 2003)] and [Cremers et al., "Kernel density estimation and intrinsic alignment for knowledge-driven segmentation: Teaching level sets to walk" (Pattern Recognition, volume 3175 of LNCS, pages 36-44, Springer, 2004)], they are not well suited to this task, because they neglect the temporal coherence of silhouettes that characterizes many deforming shapes.

When tracking a three-dimensional deformable object over time, clearly not all shapes are equally likely at a given time instance. Regularly sampled images of a walking person, for example, exhibit a typical pattern of consecutive silhouettes. Similarly, the projections of a rigid three-dimensional object rotating at constant speed are generally not independent samples from a statistical shape distribution. Instead, the resulting set of silhouettes can be expected to contain strong temporal correlations.

Summary of the invention

In accordance with the present invention, a method is provided for detecting and tracking a deformable object having a sequentially changing behavior. The method develops a temporal statistical shape model of the sequentially changing behavior of the embedding function representing the object from prior motion, and then applies the model against future, sequential motion of the object in the presence of unwanted phenomena by maximizing the probability that the developed statistical shape model matches the sequential motion of the object in the presence of the unwanted phenomena.

In accordance with another feature of the invention, a method generates a dynamical model of the time-evolution of the embedding function of prior observations of a boundary shape of an object, such object having an observable, sequentially changing boundary shape, and subsequently uses such model for probabilistic inference about the future shape of the object.

The method develops temporal statistical shape models for implicitly represented shapes of the object. In particular, the probability of a shape at a given time is a function of the shapes of the object observed at previous times.

In one embodiment, the dynamical shape models are integrated into a segmentation process within a Bayesian framework for level set based image sequence segmentation.

In one embodiment, the optimization is obtained by a partial differential equation for the level set function. The optimization includes the evolution of an interface that is driven both by the intensity information of the current image and by a dynamical shape prior, which relies on the segmentations obtained on the preceding frames.

With such a method, in contrast to existing approaches to segmentation with statistical shape priors, the resulting segmentations are not only similar to previously learned shapes, but are also consistent with the temporal correlations estimated from sample sequences. The resulting segmentation process can cope with large amounts of noise and occlusion, because it exploits prior knowledge about temporal shape consistency and because it aggregates information from the input images over time (rather than treating each image independently).

The development of dynamical models for implicitly represented shapes, and their integration into image sequence segmentation on the basis of the Bayesian framework, draws on much prior work in various fields. The theory of dynamical systems and time series analysis has a long tradition in the literature (see for example [A. Papoulis, "Probability, Random Variables, and Stochastic Processes" (McGraw-Hill, New York, 1984)]). Autoregressive models for explicit shape representations were developed, among others, by Blake, Isard and coworkers [A. Blake and M. Isard, "Active Contours" (Springer, London, 1998)]. In these works, successful tracking results were obtained by particle filtering based on edge information extracted from the intensity images. Here, however, the method of the invention differs from these approaches in three ways:

● Here, the dynamical models are for implicitly represented shapes. As a consequence, the dynamical shape model automatically handles shapes of varying topology. The model extends readily to higher dimensions (for example, 3D shapes), since it does not need to deal with the combinatorial problem of determining point correspondences or with the control point re-gridding issues associated with explicit shape representations.

● Inspired by [Zhu, Yuille 1996; Chan, Vese 1999], the method according to the invention integrates the intensity information of the input images in a statistical formulation. This leads to a region-based tracking scheme rather than an edge-based one. The statistical formulation implies that, with respect to the assumed intensity models, the method exploits the input information optimally. The method does not rely on a precomputation of heuristically defined image edge features. The assumed probabilistic intensity models are nonetheless quite simple (namely, Gaussian distributions). More sophisticated models for the intensity, color or texture of objects and background could be employed.

● The Bayesian a posteriori optimization is solved in a variational setting by gradient descent rather than by stochastic sampling techniques. While this limits the algorithms used in the invention to tracking only the most likely hypothesis (rather than multiple hypotheses), it facilitates an extension to higher-dimensional shape representations without the drastic increase in computational complexity inherent in sampling methods.

Recently, Goldenberg et al. [Goldenberg, Kimmel, Rivlin and Rudzsky, Pattern Recognition, 38:1033-1043, July 2005] successfully applied PCA to an aligned shape sequence in order to classify periodic shape motion. Although that work is also focused on characterizing moving, implicitly represented shapes, it differs from the present invention in that the shapes are not represented by the level set embedding function (but rather by a binary mask), it does not make use of autoregressive models, and it is focused on behavior classification of pre-segmented shape sequences rather than on segmentation or tracking with dynamical shape priors.

Description of drawings

The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.

Fig. 1 shows a low-dimensional approximation of a set of training silhouettes; the silhouettes at the top are hand-segmented, and the silhouettes at the bottom are their approximations by the PCA of their embedding functions according to the invention;

Fig. 2 shows autocorrelation functions used to validate the autoregressive model according to the invention; the autocorrelation functions of the residuals associated with the first four shape modes are plotted;

Fig. 3 is a model comparison showing the original shape sequence (left) and the sequence synthesized by a statistically learned second order Markov chain according to the invention (right), in terms of the time-evolution of the first, second and sixth shape eigenmodes;

Fig. 4 is a walking sequence generated by the process according to the invention, the sample silhouettes being generated by a statistically learned second order Markov model on the embedding functions;

Fig. 5 shows samples from an image sequence with increasing amounts of noise;

Fig. 6 shows segmentation according to the invention using a static shape prior with 25% noise;

Fig. 7 shows segmentation according to the invention using a dynamic shape prior with 50% noise;

Fig. 8 shows segmentation according to the invention using a dynamic shape prior with 50% noise;

Fig. 9 shows segmentation according to the invention using a dynamic shape prior with 75% noise;

Fig. 10 shows segmentation according to the invention using a dynamic shape prior with 90% noise;

Fig. 11 shows a quantitative evaluation of the segmentation accuracy according to the invention;

Fig. 12 shows tracking in the presence of occlusion according to the invention;

Fig. 13 is a flow diagram of the process according to the invention; and

Fig. 14 shows statistically generated embedding surfaces, obtained by sampling from a second order autoregressive model according to the invention, and the contours given by the zero level lines of the synthesized surfaces. The implicit formulation allows the embedded contour to change topology (bottom left image).

Like reference symbols in the various drawings indicate like elements.

Detailed description

Referring now to Fig. 13, a flow diagram of a method for detecting and tracking a deformable object is shown. In the example used here to describe the method, the object is a walking person. It should be understood, however, that the method applies to other deformable objects, including anatomical objects such as the beating human heart.

Before discussing the steps in Fig. 13, a Bayesian formulation for level set based image sequence segmentation will be introduced. First, a general formulation in the space of embedding functions is discussed; then a computationally efficient formulation in a low-dimensional subspace is described.

2.1 General formulation

In the following, a shape is defined as a set of closed two-dimensional contours modulo a certain transformation group, the elements of which are denoted by T_θ with a parameter vector θ. Depending on the application, these may be rigid-body transformations, similarity or affine transformations, or larger transformation groups. The shape is represented implicitly by an embedding function φ according to equation (1). Thus, objects of interest are given by φ(T_θ x), where the transformation T_θ acts on the grid, leading to a corresponding transformation of the implicitly represented contour. Here, the shape φ is purposely separated from the transformation parameters θ, since one may want to use different models to represent and learn their respective time-evolution.
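The statement that T_θ acts on the grid can be made concrete with a short sketch. It assumes a similarity transformation T_θ x = s R x + t and an analytic disc as the shape; both choices, and the function names `disc_phi`, `similarity_transform` and `transformed_phi`, are illustrative assumptions rather than part of the described method.

```python
import numpy as np

def disc_phi(points, radius=1.0):
    """Embedding function of a disc centered at the origin (positive inside)."""
    return radius - np.linalg.norm(points, axis=-1)

def similarity_transform(points, scale, angle, shift):
    """Apply T_theta x = scale * R(angle) x + shift to an array of 2D points."""
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -s], [s, c]])
    return scale * points @ rotation.T + shift

def transformed_phi(points, scale, angle, shift):
    """Evaluate phi(T_theta x): the transformation is applied to the grid points."""
    return disc_phi(similarity_transform(points, scale, angle, shift))

xs = np.linspace(-4.0, 4.0, 81)
grid = np.stack(np.meshgrid(xs, xs, indexing="xy"), axis=-1)  # shape (81, 81, 2)

# Assumed parameter vector theta = (scale, angle, shift).
theta = dict(scale=0.5, angle=0.0, shift=np.array([-1.0, 0.0]))
values = transformed_phi(grid, **theta)
```

With these parameters the zero level of φ(T_θ x) is the circle of radius 2 centered at (2, 0): the unit disc is stored once, and changing θ alone moves and scales the implicitly represented contour.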

Assume that consecutive images I_t: Ω → R from an image sequence are given, where I_{1:t} denotes the set of images {I_1, I_2, ..., I_t} at different time instances. Using the Bayesian formula (with all expressions conditioned on I_{1:t-1}), the problem of segmenting the current frame I_t can be addressed by maximizing the conditional probability

P(φ_t, θ_t | I_{1:t}) = \frac{P(I_t | φ_t, θ_t, I_{1:t-1}) \, P(φ_t, θ_t | I_{1:t-1})}{P(I_t | I_{1:t-1})} (2)

with respect to the embedding function φ_t and the transformation parameters θ_t. (Modeling probability distributions on infinite-dimensional spaces is in general an open problem, including issues of defining appropriate measures and of integrability. Therefore, the function φ may be thought of as a finite-dimensional approximation obtained by sampling the embedding function on a regular grid.) For the sake of brevity, the philosophical interpretation of the Bayesian approach will not be discussed here. Suffice it to say that the Bayesian framework can be seen as an inversion of the image formation process in a probabilistic setting.

The denominator in equation (2) does not depend on the estimated quantities and can therefore be neglected in the maximization. Moreover, the second term in the numerator can be rewritten using the Chapman-Kolmogorov equation [A. Papoulis, "Probability, Random Variables, and Stochastic Processes" (McGraw-Hill, New York, 1984)]:

P(φ_t, θ_t | I_{1:t-1}) = \int P(φ_t, θ_t | φ_{1:t-1}, θ_{1:t-1}) \, P(φ_{1:t-1}, θ_{1:t-1} | I_{1:t-1}) \, dφ_{1:t-1} \, dθ_{1:t-1} (3)

In the following, several assumptions are made to simplify the expression in equation (2), leading to a computationally more feasible estimation problem:

● Assume that the images I_{1:t} are mutually independent:

P(I_t | φ_t, θ_t, I_{1:t-1}) = P(I_t | φ_t, θ_t). (4)

● Assume that the intensities of the shape of interest and of the background are independent samples from two Gaussian distributions with unknown means μ_1, μ_2 and variances σ_1^2, σ_2^2. The data term above can then be written as:

P(I_t | φ_t, θ_t) = \prod_{x:\, φ_t(T_{θ_t} x) \ge 0} \frac{1}{\sqrt{2π}\, σ_1} \exp\left(-\frac{(I_t(x) - μ_1)^2}{2σ_1^2}\right) \prod_{x:\, φ_t(T_{θ_t} x) < 0} \frac{1}{\sqrt{2π}\, σ_2} \exp\left(-\frac{(I_t(x) - μ_2)^2}{2σ_2^2}\right)

\propto \exp\left(-\int_Ω \left[\left(\frac{(I_t - μ_1)^2}{2σ_1^2} + \log σ_1\right) Hφ_t(T_{θ_t} x) + \left(\frac{(I_t - μ_2)^2}{2σ_2^2} + \log σ_2\right)\left(1 - Hφ_t(T_{θ_t} x)\right)\right] dx\right),

where the Heaviside step function Hφ ≡ H(φ) has been introduced to denote the regions where φ is positive (Hφ = 1) or negative (Hφ = 0). Related intensity models have been proposed, among others, by D. Mumford and J. Shah, "Optimal approximations by piecewise smooth functions and associated variational problems" (Comm. Pure Appl. Math., 42:577-685, 1989), S. C. Zhu and A. Yuille, "Region competition: Unifying snakes, region growing, and Bayes/MDL for multiband image segmentation" (IEEE PAMI, 18(9):884-900, 1996), and T. F. Chan and L. A. Vese, "Active contours without edges" (IEEE Trans. Image Processing, 10(2):266-277, 2001). The model parameters μ_i and σ_i are estimated jointly with the shape φ_t and the transformation θ_t. Their optimal values are given by the means and variances of the intensity I_t inside and outside the current shape:

μ_1 = \frac{1}{a_1} \int I_t \, Hφ_t \, dx, \quad σ_1^2 = \frac{1}{a_1} \int I_t^2 \, Hφ_t \, dx - μ_1^2, \quad where a_1 = \int Hφ_t \, dx, (5)

and similarly for μ_2 and σ_2, with Hφ_t replaced by (1 - Hφ_t). To keep the notation simple, these parameters are not displayed as part of the dynamic variables.

● To avoid the computational burden of considering all possible intermediate shapes φ_{1:t-1} and transformations θ_{1:t-1} in equation (3), assume that the distributions of the previous states are strongly peaked around the maxima of the respective distributions:

P(φ_{1:t-1}, θ_{1:t-1} | I_{1:t-1}) \approx δ(φ_{1:t-1} - \hat{φ}_{1:t-1}) \, δ(θ_{1:t-1} - \hat{θ}_{1:t-1}), (6)

where

(\hat{φ}_i, \hat{θ}_i) = \arg\max P(φ_i, θ_i | I_{1:i-1})

are the estimates of shape and transformation obtained for the past frames, and δ(·) denotes the Dirac delta function. An alternative justification for this approximation is as follows: assume that, because of memory limitations, the tracking system cannot store the acquired images, but stores only the past estimates of shape and transformation. The inference problem at time t then reduces to maximizing, with respect to the embedding function φ_t and the transformation parameters θ_t, the conditional distribution

P(φ_t, θ_t | I_t, \hat{φ}_{1:t-1}, \hat{θ}_{1:t-1}) \propto P(I_t | φ_t, θ_t) \, P(φ_t, θ_t | \hat{φ}_{1:t-1}, \hat{θ}_{1:t-1}). (7)

This is equivalent to the original inference problem of equation (2), subject to the approximation of equation (6).

● A central contribution of this work is to model the joint prior on the shape φ_t and the transformation θ_t, conditioned on the previous shapes and transformations. To this end, two approximations are considered:

In a first step, shape and transformation are assumed to be mutually independent, that is, P(φ_t, θ_t | φ_{1:t-1}, θ_{1:t-1}) = P(φ_t | φ_{1:t-1}) P(θ_t | θ_{1:t-1}), and a uniform prior on the transformation parameters is assumed, that is, P(θ_t | θ_{1:t-1}) = const. This is complementary to the recent work of Rathi et al., see Y. Rathi, N. Vaswani, A. Tannenbaum and A. Yezzi, "Particle filtering for geometric active contours and application to tracking deforming objects" (IEEE Int. Conf. on Comp. Vision and Patt. Recognition, 2005), who proposed a temporal model for these transformation parameters while not imposing any specific model on the shape.

In a second step, the more general case of a joint distribution P(φ_t, θ_t | φ_{1:t-1}, θ_{1:t-1}) of shape and transformation parameters is considered, taking into account couplings between shape and transformation. Experimental results confirm that this leads to superior performance in dealing with occlusions.
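The Gaussian intensity model in the list of assumptions above, together with the parameter estimates of equation (5), can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the image is a synthetic bright disc with additive noise, the integrals are replaced by pixel sums, the transformation is taken to be the identity, and the names `region_statistics` and `data_energy` as well as the small `eps` guard are hypothetical.

```python
import numpy as np

def region_statistics(image, phi):
    """Mean and variance inside (phi >= 0) and outside (phi < 0), as in equation (5)."""
    inside = (phi >= 0).astype(float)          # discrete Heaviside H(phi)
    stats = []
    for mask in (inside, 1.0 - inside):
        area = mask.sum()                      # a_i: discrete integral of the mask
        mean = (image * mask).sum() / area
        var = (image ** 2 * mask).sum() / area - mean ** 2
        stats.append((mean, var))
    return stats

def data_energy(image, phi, eps=1e-8):
    """Negative log of the two-Gaussian data term, up to an additive constant."""
    inside = (phi >= 0).astype(float)
    (mu1, var1), (mu2, var2) = region_statistics(image, phi)
    var1, var2 = var1 + eps, var2 + eps        # assumed guard against zero variance
    e1 = (image - mu1) ** 2 / (2 * var1) + 0.5 * np.log(var1)  # log sigma = 0.5 log var
    e2 = (image - mu2) ** 2 / (2 * var2) + 0.5 * np.log(var2)
    return float((e1 * inside + e2 * (1.0 - inside)).sum())

# Synthetic test image (assumption): bright disc on a dark background plus noise.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
phi_true = 15.0 - np.hypot(xx - 32, yy - 32)      # disc, positive inside
phi_shifted = 15.0 - np.hypot(xx - 42, yy - 32)   # same disc displaced by 10 pixels
image = (phi_true >= 0).astype(float) + 0.1 * rng.standard_normal((64, 64))

(mu1, var1), (mu2, var2) = region_statistics(image, phi_true)
energy_true = data_energy(image, phi_true)
energy_shifted = data_energy(image, phi_shifted)
```

Maximizing the data term corresponds to minimizing this energy, so the correctly placed disc yields a lower value than the displaced one. The sketch does not handle an empty inside or outside region.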

2.2 Finite-dimensional formulation

When estimating the conditional probability P(φ_t, θ_t | \hat{φ}_{1:t-1}, \hat{θ}_{1:t-1}) in equation (7) from sample data, it is necessary to revert to finite-dimensional approximations of the embedding function. It is well known that statistical models can be estimated more reliably if the dimensionality of the model and of the data is low. The Bayesian inference is therefore recast in a low-dimensional formulation within the subspace spanned by the largest principal eigenmodes of a set of sample shapes. The training sequence is then exploited in two ways. Firstly, it serves to define a low-dimensional subspace in which to perform the estimation. Secondly, within this subspace, the process uses the training sequence to learn dynamical models for implicit shapes.

Let {φ_1, ..., φ_N} be a temporal sequence of training shapes. Assume that all training shapes φ_i are signed distance functions. An arbitrary linear combination of eigenmodes will, however, in general not yield a signed distance function. While the proposed statistical shape models favor shapes that are close to the training shapes (and therefore close to the set of signed distance functions), not all shapes sampled in the considered subspace correspond to signed distance functions. Let φ_0 denote the mean shape and ψ_1, ..., ψ_n the n largest eigenmodes, with n << N. The process then approximates each training shape as:

φ_i(x) = φ_0(x) + \sum_{j=1}^{n} α_{ij} ψ_j(x), (8)

where

α_{ij} = \langle φ_i - φ_0, ψ_j \rangle \equiv \int (φ_i - φ_0) ψ_j \, dx. (9)

Such PCA-based representations of level set functions have been successfully applied to the construction of statistical shape priors, see M. Leventon, W. Grimson and O. Faugeras, "Statistical shape influence in geodesic active contours" (CVPR, volume 1, pages 316-323, Hilton Head Island, SC, 2000), A. Tsai, A. Yezzi, W. Wells, C. Tempany, D. Tucker, A. Fan, E. Grimson and A. Willsky, "Model-based curve evolution technique for image segmentation" (Comp. Vision Patt. Recog., pages 463-468, Kauai, Hawaii, 2001), M. Rousson, N. Paragios and R. Deriche, "Implicit active shape models for 3d segmentation in MRI imaging" (MICCAI, pages 209-216, 2004), and M. Rousson and D. Cremers, "Efficient kernel density estimation of shape and intensity priors for level set segmentation" (MICCAI, volume 1, pages 757-764, 2005). In the following, the vector of the first n eigenmodes is denoted by ψ = (ψ_1, ..., ψ_n). Each sample shape φ_i is therefore approximated by the n-dimensional shape vector α_i = (α_{i1}, ..., α_{in}). Similarly, an arbitrary shape φ can be approximated by a shape vector of the form

α_φ = \langle φ - φ_0, ψ \rangle. (10)
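Equations (8) to (10) amount to a standard PCA of the sampled embedding functions. The following NumPy sketch illustrates this under stated assumptions: the eigenmodes are computed with a singular value decomposition, the inner product is a plain sum over grid points (the grid cell area is absorbed into the normalization of the modes), and the training shapes are synthetic discs with oscillating radius and position. The function names are hypothetical.

```python
import numpy as np

def fit_shape_pca(training_phis, n_modes):
    """PCA of embedding functions; returns the mean shape phi_0 and eigenmodes psi."""
    n_shapes = training_phis.shape[0]
    flat = training_phis.reshape(n_shapes, -1)
    phi0 = flat.mean(axis=0)
    # Rows of vt are orthonormal eigenmodes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(flat - phi0, full_matrices=False)
    return phi0, vt[:n_modes]

def shape_vector(phi, phi0, psi):
    """Equation (10): alpha = <phi - phi_0, psi> with a discrete inner product."""
    return psi @ (phi.ravel() - phi0)

def reconstruct(alpha, phi0, psi, grid_shape):
    """Equation (8): phi is approximated by phi_0 + sum_j alpha_j psi_j."""
    return (phi0 + alpha @ psi).reshape(grid_shape)

# Synthetic training sequence (assumption): a disc with oscillating radius and position.
xs = np.linspace(-3.0, 3.0, 41)
xx, yy = np.meshgrid(xs, xs, indexing="xy")
training = np.array([
    (1.0 + 0.3 * np.sin(0.4 * t)) - np.sqrt((xx - 0.5 * np.cos(0.4 * t)) ** 2 + yy ** 2)
    for t in range(40)
])  # shape (40, 41, 41)

phi0, psi = fit_shape_pca(training, n_modes=6)
alphas = np.array([shape_vector(phi, phi0, psi) for phi in training])  # shape (40, 6)
approx = reconstruct(alphas[0], phi0, psi, training[0].shape)
```

Each 41 × 41 embedding function is thereby summarized by a 6-dimensional shape vector, which is the low-dimensional quantity on which a dynamical model can subsequently be learned.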

Fig. 1 shows a set of silhouettes from a sequence of a walking person and their approximation by the first 6 eigenmodes. While this approximation is certainly a rough one that lacks some of the details of the shapes, it has been found to be sufficient. More particularly, the sequence of six silhouettes in the upper half of Fig. 1 comes from hand tracing in accordance with step 100 of Fig. 13, and the six silhouettes in the lower half of Fig. 1 are the PCA approximations as in step 102 of Fig. 13. Fig. 1 thus shows a low-dimensional approximation of a set of training silhouettes: the silhouettes hand-segmented in step 100 of Fig. 13 (top of Fig. 1) are approximated by the first 6 principal components (PCA) of their embedding functions (bottom of Fig. 1), see equation (8).

In analogy to the derivation presented in the previous section, the goal of image sequence segmentation within this subspace can be stated as follows: given consecutive images I_t: Ω → R from an image sequence, and given the segmentations α_{1:t-1} and transformations \hat{θ}_{1:t-1} obtained on the previous images I_{1:t-1}, the process maximizes the following conditional probability with respect to the shape parameters α_t and the transformation parameters θ_t:

P(α_t, θ_t | I_t, α_{1:t-1}, \hat{θ}_{1:t-1}) = \frac{P(I_t | α_t, θ_t) \, P(α_t, θ_t | α_{1:t-1}, \hat{θ}_{1:t-1})}{P(I_t | α_{1:t-1}, \hat{θ}_{1:t-1})}. (11)

The process models the conditional probability

P(α_t, θ_t | α_{1:t-1}, \hat{θ}_{1:t-1}), (12)

which constitutes the probability of observing a particular shape α_t and a particular transformation θ_t at time t, conditioned on the parameter estimates for shape and transformation obtained on the previous images.

3. Dynamical statistical shape models

Abundant theory has been developed to model temporally correlated time series data. Applications of dynamical systems to the modeling of deformable shapes were proposed, among others, in [A. Blake and M. Isard, "Active Contours" (Springer, London, 1998)]. Here, the process learns dynamical models for implicitly represented shapes. To simplify the discussion, the focus is first on dynamical models of shape deformation. In other words, a uniform distribution on the transformation parameters is assumed, and only the conditional distribution P(α_t | α_{1:t-1}) is modeled.

3.1 Dynamical models of deformation

Referring again to Fig. 13, in step 100, a hand-segmented image sequence of a deformable object having, for example, a sequentially changing oscillatory behavior is obtained in any conventional manner. More particularly, as described in section 2.2 above, let {φ_1, ..., φ_N} be a temporal sequence of training shapes. The results are shown for the example of the walking person in Fig. 1, which shows the black training shapes on a white background and the level set function of each shape. As described in section 2.2, let φ_0 denote the mean shape and ψ_1, ..., ψ_n the n largest eigenmodes, with n << N.

In step 102, the PCA of this process usage level set function) expresses and calculate major component, be referred to as the shape vector as in below formula (9) and (10).The PCA that the profile sequence has been shown in 6 training shapes (herein being 6 profiles) below Fig. 1 expresses.Notice that because approximate among the PCA, last can be observed, the right crus of diaphragm in last in 6 profiles in the set on top has sharper toe than corresponding last profile in 6 shapes in bottom last.

Therefore, according to PCA, then approximate each training shapes of this process as providing in superincumbent formula (8) and (9).

In step 104, this process is estimated dynamic (autoregression) model at the shape vector sequence.Be illustrated in this formula (13) below.Fig. 3 illustrates the shape vector (left side) of list entries and utilizes the shape vector (right side) of the sequence that this model synthesizes.More particularly, Fig. 3 is that model compares; Original-shape sequence (top) with show similar vibration characteristics and Modulation and Amplitude Modulation by the synthetic sequence (bottom) of the second-order Markov chain of statistical learning.These curves show the time-evolution of first, second and the 6th shape natural mode.

Fig. 5 shows the sample from the image sequence of the noise with increase.Fig. 5 is the image from the sequence of the noisiness with increase, be from the sample incoming frame in the sequence with 25%, 50% and 90% noise, wherein 90% noise means by 90% of alternative all pixels of the random strength of sampling from even distribution herein.

More particularly, the time statistical shape model of vibration characteristics of the imbedding function of the object people of walking (herein for) from priori motion is represented in this process study.

More particularly still, the process learns the temporal dynamics of the deformable shape by approximating the shape vectors

\alpha_t \equiv \alpha_{\phi_t}

of the sequence of level set functions by a Markov chain of order k [Neumaier and Schneider, 2001], that is:

\alpha_t = \mu + A_1 \alpha_{t-1} + A_2 \alpha_{t-2} + \dots + A_k \alpha_{t-k} + \eta, \qquad (13)

where η is zero-mean Gaussian noise with covariance Σ. The probability of a shape, conditioned on the shapes observed in the previous time steps, is therefore given by the corresponding autoregressive model of order k:

P(\alpha_t \mid \alpha_{1:t-1}) \propto \exp\left( -\frac{1}{2} v^{\top} \Sigma^{-1} v \right), \qquad (14)

where

v \equiv \alpha_t - \mu - A_1 \alpha_{t-1} - A_2 \alpha_{t-2} - \dots - A_k \alpha_{t-k}. \qquad (15)
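Formulas (14) and (15) can be illustrated with a minimal NumPy sketch. The function names and all numerical values below are hypothetical and chosen only for illustration; the sketch evaluates the residual v and the negative logarithm of the shape probability up to an additive constant.

```python
import numpy as np

def ar_residual(alpha_t, previous, mu, A):
    """Residual v of formula (15); previous[j] holds alpha_{t-1-j}."""
    v = np.asarray(alpha_t, dtype=float) - mu
    for A_j, alpha_prev in zip(A, previous):
        v = v - A_j @ alpha_prev
    return v

def neg_log_shape_probability(alpha_t, previous, mu, A, Sigma):
    """-log P(alpha_t | alpha_{1:t-1}) from formula (14), up to a constant."""
    v = ar_residual(alpha_t, previous, mu, A)
    return 0.5 * v @ np.linalg.solve(Sigma, v)

# Toy second-order model with n = 2 shape modes (made-up values).
mu = np.array([0.1, 0.0])
A = [np.array([[1.6, 0.0], [0.0, 0.5]]),    # A_1
     np.array([[-0.8, 0.0], [0.0, 0.2]])]   # A_2
Sigma = 0.01 * np.eye(2)
previous = [np.array([1.0, 2.0]), np.array([0.5, 1.0])]  # alpha_{t-1}, alpha_{t-2}

# The most probable next shape vector is the noise-free prediction.
prediction = mu + A[0] @ previous[0] + A[1] @ previous[1]
```

The energy vanishes at the prediction and grows quadratically, weighted by the inverse noise covariance, as the candidate shape vector moves away from it.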

Various methods have been proposed in the literature to estimate the model parameters, which are given by the mean μ ∈ R^n and the transition and noise matrices A_1, ..., A_k, Σ ∈ R^{n×n}. Here, the process applies the stepwise least-squares algorithm proposed in [A. Neumaier and T. Schneider, "Estimation of parameters and eigenmodes of multivariate autoregressive models", ACM T. on Mathematical Software, 27(1):27-57, 2001]. Different tests have been devised to quantify the accuracy of the model fit. Two established criteria for model accuracy are Akaike's Final Prediction Error (see H. Akaike, "Autoregressive model fitting for control", Ann. Inst. Statist. Math., 23:163-180, 1971) and Schwarz's Bayesian Criterion (see G. Schwarz, "Estimating the dimension of a model", Ann. Statist., 6:461-464, 1978). Using dynamical models of up to order 8, it was found that, according to Schwarz's Bayesian Criterion, the training sequences used were best approximated by an autoregressive model of second order.
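As a rough illustration of the parameter estimation, the following sketch fits an autoregressive model of order k by ordinary least squares. This is a simplified stand-in, not the stepwise least-squares algorithm of Neumaier and Schneider that the process applies, and it performs no order selection; the function name `fit_ar` is hypothetical.

```python
import numpy as np

def fit_ar(alphas, k=2):
    """Ordinary least-squares fit of
    alpha_t = mu + A_1 alpha_{t-1} + ... + A_k alpha_{t-k} + eta.
    alphas: array of shape (T, n), one shape vector per row."""
    alphas = np.asarray(alphas, dtype=float)
    T, n = alphas.shape
    # Regressor rows [1, alpha_{t-1}, ..., alpha_{t-k}] for t = k .. T-1.
    X = np.hstack([np.ones((T - k, 1))] +
                  [alphas[k - j:T - j] for j in range(1, k + 1)])
    Y = alphas[k:]
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)       # shape (1 + k*n, n)
    mu = B[0]
    # Block j of B holds the transpose of A_j.
    A = [B[1 + (j - 1) * n:1 + j * n].T for j in range(1, k + 1)]
    residuals = Y - X @ B
    Sigma = residuals.T @ residuals / (len(Y) - X.shape[1])
    return mu, A, Sigma
```

For the setting described in the text, `alphas` would hold the 151 six-dimensional shape vectors and `k` would be 2.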

The parameters of a second-order autoregressive model are estimated from a training sequence of 151 consecutive silhouettes. The model is then validated by plotting the autocorrelation functions of the residuals associated with each of the modeled eigenmodes (see Fig. 2). These curves show that the residuals are essentially uncorrelated. Thus, the autoregressive model estimated in step 104 of Fig. 13 can be validated with the autocorrelation functions of the residuals associated with the first four shape modes, shown in Fig. 2. Clearly, the residuals are statistically uncorrelated.
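The residual check described above can be sketched as follows. The helper `autocorrelation` is a hypothetical name; it computes the normalized sample autocorrelation of one residual component, which should be close to zero for every nonzero lag if the model has captured the temporal structure.

```python
import numpy as np

def autocorrelation(residual, max_lag):
    """Normalized sample autocorrelation of a 1-D series for lags 0..max_lag."""
    r = np.asarray(residual, dtype=float)
    r = r - r.mean()
    denom = r @ r
    return np.array([(r[:len(r) - lag] @ r[lag:]) / denom
                     for lag in range(max_lag + 1)])
```

Applied to the residual series of each modeled eigenmode, values near zero for all lags greater than zero indicate uncorrelated residuals, whereas a slowly decaying curve would indicate that the model order is too low.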

In addition, the estimated model parameters allow the process to synthesize a walking sequence according to formula (13). To remove the dependency on the initial conditions, the first several hundred samples were discarded. Fig. 3 shows the temporal evolution of the first, second, and sixth eigenmodes in the input sequence (left) and in the synthesized sequence (right). Clearly, the second-order model captures some of the key elements of the oscillatory behavior: the original shape sequence and the sequence synthesized by the statistically learned second-order Markov chain (step 104, Fig. 13) exhibit similar oscillatory behavior and amplitude modulation.
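The synthesis step can be sketched as sampling from formula (13) and discarding an initial transient. The function name, the zero initial condition, and the default burn-in length of 500 samples are assumptions made for this illustration only.

```python
import numpy as np

def synthesize(mu, A, Sigma, length, burn_in=500, seed=0):
    """Sample shape vectors from the autoregressive model of formula (13)."""
    rng = np.random.default_rng(seed)
    n, k = len(mu), len(A)
    history = [np.zeros(n) for _ in range(k)]    # arbitrary initial condition
    samples = []
    for _ in range(burn_in + length):
        eta = rng.multivariate_normal(np.zeros(n), Sigma)
        alpha = mu + sum(A_j @ h for A_j, h in zip(A, history)) + eta
        history = [alpha] + history[:-1]         # newest shape vector first
        samples.append(alpha)
    return np.array(samples[burn_in:])           # drop the transient
```

Each returned row is a shape vector; the corresponding embedding function would be obtained by adding the weighted eigenmodes to the mean shape.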

While the synthesized sequence captures the characteristic motion of a walking person, Fig. 4 shows that the individual synthesized silhouettes do not mimic valid shapes in all instances. Such limitations are to be expected from a model that strongly compresses the represented input sequence: instead of 151 shapes defined on a 256 × 256 grid, the model retains only the mean shape φ_0, 6 eigenmodes ψ, and the autoregressive model parameters given by a 6-dimensional mean and three 6 × 6 matrices. This amounts to 458851 parameters instead of 9895936, a compression to 4.6% of the original size. While the synthesis of dynamical shape models using autoregressive models has been studied before in [A. Blake and M. Isard, Active Contours, London, Springer, 1998], it should be noted that the shape synthesis here is based on an implicit representation. More particularly, Fig. 4 shows a synthetically generated walking sequence produced by the process. The sample silhouettes are generated by a statistically learned second-order Markov model on the embedding functions, see formula (13). While the Markov model captures many of the typical oscillatory characteristics of a walking person, not all generated samples correspond to permissible shapes; compare the last two silhouettes at the bottom right. Nevertheless, as described in Section 5, the model is sufficiently accurate to appropriately constrain a segmentation process.
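The parameter counts quoted above can be reproduced with a short calculation. The breakdown below is an assumption: it reaches the stated total if the symmetric noise covariance is counted by its 21 distinct entries rather than by all 36.

```python
grid = 256 * 256                 # pixels per embedding function
n_shapes, n_modes = 151, 6

raw = n_shapes * grid                                 # all training shapes
mean_and_modes = (1 + n_modes) * grid                 # phi_0 and 6 eigenmodes
ar_mean = n_modes                                     # 6-dimensional mean
ar_transitions = 2 * n_modes * n_modes                # A_1 and A_2
ar_covariance = n_modes * (n_modes + 1) // 2          # symmetric Sigma (assumed)

model = mean_and_modes + ar_mean + ar_transitions + ar_covariance
ratio = model / raw
print(raw, model, f"{100 * ratio:.1f}%")
```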

Referring to Fig. 14, a sequence of statistically synthesized embedding functions is shown, together with the resulting contours given by the zero level line of the respective surfaces. The sequence is generated statistically by sampling embedding surfaces from the second-order autoregressive model. In particular, the implicit representation allows synthesizing shapes of varying topology: the embedded contour can change topology, as in the silhouette at the lower left of Fig. 14, which consists of two contours.

3.2 Joint dynamics of deformation and transformation

In the previous section, an autoregressive model was introduced to capture the temporal dynamics of implicitly represented shapes. To this end, the degrees of freedom corresponding to transformations such as translation and rotation were removed before the dynamical model was learned. The learning therefore incorporates only the deformation modes and ignores all information about pose and location. For example, the synthesized shapes in Fig. 4 show a person walking "on the spot".

In general, the deformation parameters α_t and the transformation parameters θ_t can be expected to be tightly coupled. A model that captures the joint dynamics of shape and transformation is clearly more powerful than one that ignores the transformations. At the same time, the process should learn dynamical shape models that are invariant to translation, rotation and other transformations. To this end, one can exploit the fact that transformations form a group, which means that the transformation θ_t at time t can be obtained from the previous transformation θ_{t-1} by applying an incremental transformation Δθ_t:

T_{\theta_t} x = T_{\Delta\theta_t} T_{\theta_{t-1}} x.

Instead of learning models of the absolute transformation θ_t, the process simply learns models of the update transformations Δθ_t (for example, the changes in translation and rotation). By construction, such models are invariant with respect to the global pose or location of the modeled shape.
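The incremental transformation can be illustrated for planar rigid motions written as 3 × 3 homogeneous matrices. The parametrization by a rotation angle and a translation, and the helper names, are assumptions made for this sketch; the text does not fix a particular transformation group.

```python
import numpy as np

def rigid(angle, tx, ty):
    """Homogeneous matrix of a planar rotation followed by a translation."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0.0, 0.0, 1.0]])

def incremental(T_t, T_prev):
    """Update transformation T_delta satisfying T_t = T_delta @ T_prev."""
    return T_t @ np.linalg.inv(T_prev)

T_prev = rigid(0.30, 1.0, 2.0)     # pose at time t-1 (made-up values)
T_t = rigid(0.50, 1.5, 2.0)        # pose at time t
T_delta = incremental(T_t, T_prev)
```

The parameters of `T_delta` would then be stacked with the deformation parameters and modeled by the autoregressive process, so that the learned dynamics describe pose changes rather than absolute poses.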

To jointly model transformation and deformation, the process simply obtains, for each training shape in the learning sequence, the deformation parameters α_i and the transformation changes Δθ_i, and fits the autoregressive model given in formulas (14) and (15) to the combined vector

\alpha_t = \begin{pmatrix} \alpha_t \\ \Delta\theta_t \end{pmatrix}.

In the case of the walking person, it was found that, as in the stationary case, a second-order autoregressive model gives the best model fit. Synthesizing from this model allows generating silhouettes of a walking person that are similar to those shown in Fig. 4 but that move forward in space, starting from an arbitrary (user-specified) initial position.

4. Dynamical shape priors in variational segmentation

Given an image I_t from an image sequence and a set of previously segmented shapes with shape parameters α_{1:t-1} and transformation parameters θ_{1:t-1}, the goal of tracking is to maximize the conditional probability in equation (11) with respect to the shape α_t and the transformation θ_t. This can be done by minimizing its negative logarithm, which (up to a constant) is given by an energy of the form:

E(\alpha_t, \theta_t) = E_{data}(\alpha_t, \theta_t) + \nu E_{shape}(\alpha_t, \theta_t). \qquad (17)

An additional weight ν is introduced to allow a relative weighting between the prior and the data term. In particular, if the intensity information is not consistent with the assumptions (Gaussian intensity distributions of object and background), a larger weight ν is preferable. The data term is given by:

E_{data}(\alpha_t, \theta_t) = \int_{\Omega} \left( \frac{(I_t - \mu_1)^2}{2\sigma_1^2} + \log \sigma_1 \right) H\phi_{\alpha_t, \theta_t} + \left( \frac{(I_t - \mu_2)^2}{2\sigma_2^2} + \log \sigma_2 \right) \left( 1 - H\phi_{\alpha_t, \theta_t} \right) dx, \qquad (18)

where, to simplify notation, the expression

\phi_{\alpha_t, \theta_t} \equiv \phi_0(T_{\theta_t} x) + \alpha_t^{\top} \psi(T_{\theta_t} x) \qquad (19)

is introduced to denote the embedding function of the shape generated with deformation parameters α_t and transformed with parameters θ_t.
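A discrete version of formulas (18) and (19) can be sketched on a pixel grid. The sketch assumes unit pixel area, an identity transformation T_θ, and the convention that the Heaviside function equals 1 where the embedding function is positive; the function names are hypothetical.

```python
import numpy as np

def embedding(phi0, psi, alpha):
    """phi_alpha = phi_0 + alpha^T psi, formula (19) with T_theta = identity.
    phi0: (H, W) mean shape, psi: (n, H, W) eigenmodes, alpha: (n,)."""
    return phi0 + np.tensordot(alpha, psi, axes=1)

def data_energy(image, phi, mu1, sigma1, mu2, sigma2):
    """Discrete data term of formula (18), summed over all pixels."""
    H = (phi > 0).astype(float)            # 1 inside the object, 0 outside
    inside = (image - mu1) ** 2 / (2 * sigma1 ** 2) + np.log(sigma1)
    outside = (image - mu2) ** 2 / (2 * sigma2 ** 2) + np.log(sigma2)
    return float(np.sum(inside * H + outside * (1 - H)))
```

The energy is small when the region where the embedding function is positive covers pixels that fit the first Gaussian intensity model and the remaining pixels fit the second.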

Using the autoregressive model of equation (14), the shape energy is given by:

E_{shape}(\alpha_t, \theta_t) = \frac{1}{2} v^{\top} \Sigma^{-1} v, \qquad (20)

with v defined in formula (15). To incorporate the joint model of deformation and transformation introduced in Section 3.2, the above expression for v needs to be augmented by the relative transformations Δθ:

v \equiv \begin{pmatrix} \alpha_t \\ \Delta\theta_t \end{pmatrix} - \mu - A_1 \begin{pmatrix} \alpha_{t-1} \\ \Delta\hat{\theta}_{t-1} \end{pmatrix} - A_2 \begin{pmatrix} \alpha_{t-2} \\ \Delta\hat{\theta}_{t-2} \end{pmatrix} - \dots - A_k \begin{pmatrix} \alpha_{t-k} \\ \Delta\hat{\theta}_{t-k} \end{pmatrix}, \qquad (21)

where μ and A_i denote the statistically learned mean and transition matrices for the joint space of deformations and transformations, and k is the model order. In the experiments, a model order of k = 2 was selected.

It can easily be shown that a second-order autoregressive model can be interpreted as a stochastic version of a time-discrete damped harmonic oscillator. It is therefore well suited to modeling essentially oscillatory shape deformations. It was found, however, that higher-order autoregressive models provide qualitatively similar results.
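The oscillator interpretation can be checked numerically in the scalar case. With coefficients a_1 = 2ρ cos ω and a_2 = -ρ², the noise-free second-order recursion reproduces the damped oscillation ρ^t cos(ωt); the values of ρ and ω below are made up for illustration.

```python
import math

rho, omega = 0.95, 0.3     # per-step damping factor and angular frequency
a1, a2 = 2 * rho * math.cos(omega), -rho ** 2

# Start from x_0 and x_1 of the damped oscillation rho**t * cos(omega * t).
x = [1.0, rho * math.cos(omega)]
for t in range(2, 50):
    x.append(a1 * x[t - 1] + a2 * x[t - 2])    # noise-free AR(2) recursion

closed_form = [rho ** t * math.cos(omega * t) for t in range(50)]
max_gap = max(abs(u - w) for u, w in zip(x, closed_form))
```

Adding the noise term η of formula (13) to each step turns this deterministic oscillation into its stochastic counterpart.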

The optimal segmentation of the test sequence is computed so as to be consistent with the learned dynamical model. This is done by finding the shape vector that maximizes the conditional probability in equation (11). The maximization is implemented by performing gradient descent on the negative logarithm of this probability, as shown in equation (22). Intuitively, this process deforms the shape such that it best matches both the current image and the prediction of the dynamical model. The optimal shapes for each test image are shown in Figs. 9, 10, 11 and 13.

Tracking an object of interest over an image sequence I_{1:t} with a dynamical shape prior is carried out by minimizing the energy in formula (17). A gradient descent strategy is used, which leads to the following differential equation for estimating the shape vector α_t:

\frac{d\alpha_t(\tau)}{d\tau} = -\frac{\partial E_{data}(\alpha_t, \theta_t)}{\partial \alpha_t} - \nu \frac{\partial E_{shape}(\alpha_t, \theta_t)}{\partial \alpha_t}, \qquad (22)

where τ denotes the artificial evolution time, as opposed to the physical time t. The data term is given by:

\frac{\partial E_{data}}{\partial \alpha_t} = \left\langle \psi, \; \delta(\phi_{\alpha_t}) \left( \frac{(I_t - \mu_1)^2}{2\sigma_1^2} - \frac{(I_t - \mu_2)^2}{2\sigma_2^2} + \log \frac{\sigma_1}{\sigma_2} \right) \right\rangle,

and the shape term by:

\frac{\partial E_{shape}}{\partial \alpha_t} = \frac{\partial v}{\partial \alpha_t} \Sigma^{-1} v = \begin{pmatrix} 1_n & 0 \\ 0 & 0 \end{pmatrix} \Sigma^{-1} v, \qquad (23)

where v is given in formula (21) and 1_n is the n-dimensional unit matrix modeling the projection onto the shape components of v, n being the number of shape modes. These two terms affect the shape evolution as follows. The first term draws the shape toward separating the image intensities according to the two Gaussian intensity models. Since variations in the shape vector α_t affect the shape through the eigenmodes ψ, the data term is a projection onto these eigenmodes. The second term induces a relaxation of the shape vector α_t toward the most likely shape, as predicted by the dynamical model on the basis of the shape vectors and transformation parameters obtained for the previous time frames.
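The evolution of the shape vector can be sketched with an explicit Euler scheme for a deformation-only model, in which the derivative of the residual with respect to the shape vector is the identity. The function name, the step size, and the iteration count are assumptions; `data_grad` is a hypothetical callback standing in for the image-driven term, which is not computed here.

```python
import numpy as np

def evolve_shape(alpha0, prediction, Sigma, data_grad,
                 weight=1.0, step=0.1, iters=200):
    """Explicit Euler integration of the shape evolution equation.
    prediction: mu + A_1 alpha_{t-1} + ... + A_k alpha_{t-k}
    weight:     relative weight of the shape prior
    data_grad:  callable returning dE_data/dalpha at a given alpha"""
    alpha = np.array(alpha0, dtype=float)
    Sigma_inv = np.linalg.inv(Sigma)
    for _ in range(iters):
        v = alpha - prediction                       # AR residual
        alpha -= step * (data_grad(alpha) + weight * (Sigma_inv @ v))
    return alpha
```

With a vanishing data term the shape vector relaxes to the prediction of the dynamical model; with a nonzero data term it settles at a compromise between the image evidence and the prediction.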

Similarly, minimization with respect to the transformation parameters θ_t is obtained by evolving the corresponding gradient descent equation:

\frac{d\theta_t(\tau)}{d\tau} = -\frac{\partial E_{data}(\alpha_t, \theta_t)}{\partial \theta_t} - \nu \frac{\partial E_{shape}(\alpha_t, \theta_t)}{\partial \theta_t}, \qquad (24)

where the data term is given by

\frac{\partial E_{data}(\alpha_t, \theta_t)}{\partial \theta_t} = \left\langle \nabla\psi \, \frac{d(T_{\theta_t} x)}{d\theta_t}, \; \delta(\phi_{\alpha_t}) \left[ \frac{(I_t - \mu_1)^2}{2\sigma_1^2} - \frac{(I_t - \mu_2)^2}{2\sigma_2^2} + \log \frac{\sigma_1}{\sigma_2} \right] \right\rangle, \qquad (25)

and the driving term from the prior by:

\frac{\partial E_{shape}}{\partial \theta_t} = \frac{\partial v}{\partial \theta_t} \Sigma^{-1} v = \frac{d(\Delta\theta_t)}{d\theta_t} \begin{bmatrix} 0 & 0 \\ 0 & 1_s \end{bmatrix} \Sigma^{-1} v, \qquad (26)

where, as above, the shape prior contributes a driving force toward the most likely transformation predicted by the dynamical model. The block diagonal matrix in equation (26) simply models the projection onto the s transformation components of the joint vector v defined in formula (21).

5. Experimental results

5.1 Dynamical versus static statistical shape priors

In the following, the dynamical statistical shape priors introduced above are applied to level-set-based tracking.

According to the process, to construct the shape prior, a hand segmentation of a sequence of (in this example) a walking person is obtained, as indicated in step 100 of Fig. 13; the sequence is centered and each shape is binarized. Then, in step 102 of Fig. 13, the process determines the set of signed distance functions {φ_i}_{i=1..N} associated with the shapes and computes the 6 dominant eigenmodes. Each training shape is projected onto these eigenmodes and, in step 104, the process obtains a sequence of shape vectors {α_i ∈ R^6}_{i=1..N}. The process fits a second-order multivariate autoregressive model to this sequence by computing the mean vector μ, the transition matrices A_1, A_2 ∈ R^{6×6} and the noise covariance Σ ∈ R^{6×6} appearing in equation (14). The process then compares segmentations of noisy sequences obtained in the 6-dimensional subspace without and with the dynamical statistical shape prior.

Segmentation without the dynamical prior corresponds to that obtained with a static uniform prior in the subspace of the first few eigenmodes, as proposed in A. Tsai, A. Yezzi, W. Wells, C. Tempany, D. Tucker, A. Fan, E. Grimson and A. Willsky, "Model-based curve evolution technique for image segmentation" (Comp. Vision Patt. Recog., pages 463-468, Kauai, Hawaii, 2001). Although alternative models exist for static shape priors (for example, the Gaussian model in M. Leventon, W. Grimson and O. Faugeras, "Statistical shape influence in geodesic active contours" (CVPR, volume 1, pages 316-323, Hilton Head Island, SC, 2000), or the nonparametric static models in D. Cremers, S. J. Osher and S. Soatto, "Kernel density estimation and intrinsic alignment for knowledge-driven segmentation: Teaching level sets to walk" (Pattern Recognition, volume 3175 of LNCS, pages 36-44, Springer, 2004) and in M. Rousson and D. Cremers [MICCAI, volume 1, pages 757-764]), it was found in the experiments that all of these models exhibit qualitatively similar limitations when applied to image sequence segmentation (see Fig. 7): because they do not exploit temporal shape correlations, they tend to get stuck in local minima.

In step 108, the process then applies the model to future sequential motion of the object in the presence of unwanted phenomena by maximizing the probability that the learned statistical shape model matches the sequential motion of the object in the presence of the unwanted phenomena.

Fig. 5 shows sample input frames from sequences with 25%, 50%, 75% and 90% noise. (Here, 25% noise means that 25% of all pixels were replaced by a random intensity sampled from a uniform distribution. It is worth noting that the algorithm easily copes with uniform noise, although its probabilistic formulation is based on the assumption of Gaussian noise.) Fig. 6 shows a set of segmentations obtained with a uniform static shape prior on the walking sequence with 25% noise. Constraining the level set evolution to a low-dimensional subspace allows coping with some amount of noise. While segmentation without a dynamical prior is successful in the presence of moderate noise, Fig. 7 shows that it can break down when the noise level is increased. Fig. 7 shows segmentations with a static (uniform) shape prior on the walking sequence with 50% noise: with a static shape prior alone, the segmentation scheme cannot cope with larger amounts of noise, and after the first few frames the segmentation gets stuck in a local minimum. Since static shape priors provide no predictions in time, they tend to get stuck at the shape estimate obtained on the previous image.

Fig. 8 shows segmentations of the same sequence as in Fig. 7 (50% noise), obtained with a dynamical statistical shape prior derived from a second-order autoregressive model. In contrast to the segmentation with a static prior shown in Fig. 7, the dynamical prior imposes statistically learned information about the temporal dynamics of the shape evolution to cope with missing or misleading low-level information.

Figs. 9 and 10 show that the dynamical statistical shape prior provides good segmentations even with up to 90% noise: Fig. 9 shows tracking with the dynamical shape prior at 75% noise, and Fig. 10 at 90% noise. Clearly, exploiting the temporal statistics of dynamical shapes makes the segmentation process very robust to missing and misleading information, since the statistically learned dynamical shape model allows the low-level information to be disambiguated. The quantitative evaluation with respect to ground truth, shown on the left of Fig. 11, indicates that the tracking scheme can indeed compete with the capacities of a human observer, providing reliable segmentations where human observers fail. Because the segmentation process with the dynamical shape prior accumulates image information (and memory) over time, the segmentation results on the first few frames are initially inaccurate.
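The noise model described for Fig. 5 can be sketched as follows. The function name and the intensity range [0, 1) are assumptions made for this illustration; the text only states that the chosen fraction of pixels is replaced by intensities sampled from a uniform distribution.

```python
import numpy as np

def add_uniform_noise(image, fraction, seed=0):
    """Replace the given fraction of pixels by uniformly sampled intensities."""
    rng = np.random.default_rng(seed)
    noisy = np.array(image, dtype=float)             # work on a copy
    n_replace = int(round(fraction * noisy.size))
    # Choose distinct pixel positions so that exactly n_replace are replaced.
    idx = rng.choice(noisy.size, size=n_replace, replace=False)
    noisy.flat[idx] = rng.uniform(0.0, 1.0, size=n_replace)
    return noisy
```

For example, `add_uniform_noise(frame, 0.9)` would correspond to the 90% noise level, where `frame` is a hypothetical input image with intensities in [0, 1).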

5.2 Quantitative evaluation of noise robustness

To quantify the accuracy of the segmentation, hand segmentations of the original test sequence are used. The following error measure is then defined:

\epsilon = \frac{\int \left( H\phi(x) - H\phi_0(x) \right)^2 dx}{\int H\phi(x) \, dx + \int H\phi_0(x) \, dx}, \qquad (27)

where H is again the Heaviside step function, φ_0 is the true segmentation, and φ is the estimated one. This error corresponds to the area of the set-symmetric difference (that is, the union of the two segmentations minus their intersection), divided by the sum of the areas of the two segmentations. Although various measures of segmentation accuracy exist, this measure was chosen because it takes values in the well-defined range 0 ≤ ε ≤ 1, where ε = 0 corresponds to a perfect segmentation.
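On a pixel grid, the error measure of formula (27) reduces to counting pixels. The sketch below assumes that the Heaviside function equals 1 where the embedding function is positive and that both segmentations are non-empty; the function name is hypothetical.

```python
import numpy as np

def segmentation_error(phi, phi_true):
    """Discrete version of formula (27) for two embedding functions."""
    H = (phi > 0).astype(float)
    H0 = (phi_true > 0).astype(float)
    return float(np.sum((H - H0) ** 2) / (np.sum(H) + np.sum(H0)))
```

Identical segmentations give an error of 0, and two segmentations without any overlap give an error of 1.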

The left side of Fig. 11 shows the segmentation error averaged over the test sequence as a function of the noise level. A dynamical shape prior of deformation and transformation (Section 3.2) was used, and the segmentation process was initialized with an estimate of the initial location. The curve shows several things. First, the error remains fairly constant for noise levels below 60%. This is because the weight ν of the prior was set to a fixed value for all experiments (ideally, smaller weights would be used for less noise). The residual error of around 5% therefore stems from the discrepancy between the estimated dynamical model and the true sequence, accumulating the errors introduced by the principal component approximation and by the autoregressive model approximation. Second, as expected, the error increases for larger amounts of noise. The deviation from monotonicity (in particular at 90% noise) is probably an effect of statistical fluctuations. The good estimate of the initial pose, combined with the prior knowledge about the translational component of walking, accounts for the fact that even at 100% noise the error remains well below 1, and below that of a random segmentation. The plot on the right of Fig. 11 shows that, for walking speeds v slower than the learned speed v_0, the segmentation error (at 70% noise) remains low, whereas for faster walking sequences the accuracy slowly degrades. Yet even for sequences at 5 times the learned walking speed, segmentation with the dynamical prior outperforms segmentation with a static prior.

5.3 Robustness to variations in frequency and frame rate

Given the segmentations obtained on the last few frames, the dynamical shape prior introduces prior knowledge about which silhouettes are most likely. Suppose the process has learned a dynamical model of a walking person from a sequence with a fixed walking speed v_0. Clearly, the estimated model will be tuned to this specific walking speed. When such a model is applied in practice, however, there is no guarantee that the person in the test sequence walks at exactly the same speed. Likewise, there is no guarantee that (even if the walking speed is identical) the frame rate of the camera is the same. To be useful in practice, the proposed prior must be robust to variations in walking frequency and frame rate.

To validate this robustness, test sequences of different walking speeds were synthesized either by dropping certain frames (to speed up the gait) or by replicating frames (thereby slowing down the gait). The right side of Fig. 11 shows the segmentation error ε defined in equation (27), averaged over test sequences with 70% noise and with speeds ranging from 1/5 to 5 times the speed of the training sequence. While slowing down the sequence does not affect the accuracy, the accuracy gradually decreases once the speed is increased. Nevertheless, the segmentation process is quite robust to such drastic changes in speed. The reason for this robustness is twofold. First, the Bayesian formulation allows model prediction and input data to be combined in such a way that the segmentation process constantly adapts to the incoming input data. Second, the autoregressive model relies only on the last few estimated silhouettes to generate the shape probability for the current frame. It does not assume long-range temporal consistency and can therefore handle sequences with varying walking speed. The experiments show that, even for sequences at 5 times the original walking speed, segmentation with the dynamical model is superior to segmentation with a static model. This is not surprising: in contrast to the static model, the dynamical model provides a prediction of the temporal shape evolution. Even if this prediction is suboptimal for a very different walking speed, it still improves the segmentation process.
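The construction of test sequences at different speeds can be sketched with integer index arithmetic. The function name and the representation of the speed factor as a ratio `num / den` are assumptions for this illustration; a factor above 1 drops frames and a factor below 1 replicates them.

```python
def resample(frames, num, den):
    """Resample a frame sequence to num/den times the original speed."""
    n_out = len(frames) * den // num
    # Output frame i shows input frame floor(i * num / den).
    return [frames[(i * num) // den] for i in range(n_out)]

frames = list(range(6))            # stand-in for six image frames
faster = resample(frames, 2, 1)    # twice the speed: every second frame
slower = resample(frames, 1, 2)    # half the speed: each frame shown twice
```

Speed factors of 5 and 1/5, the extremes mentioned in the text, would correspond to `resample(frames, 5, 1)` and `resample(frames, 1, 5)`.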

5.4 Joint dynamics of deformation and transformation

In Section 3.2, dynamical models were introduced to capture the joint evolution of deformation and transformation parameters. On the tasks shown so far, pure deformation models and joint models of deformation and transformation were found to give similar segmentation results. While the joint model provides a prior on which transformation parameters are most likely at a given time instant, the pure deformation model requires these parameters to be estimated solely from the data.

As a final example, a segmentation task was generated in which the transformation parameters cannot be reliably estimated from the data because of a prominent occlusion. The test sequence shows a person walking from right to left and an occluding bar moving from left to right, corrupted by 80% noise. The top row of Fig. 12 shows segmentations obtained with a dynamical shape prior that captures both deformation and transformation. Even when the walking silhouette is completely occluded, the model still generates silhouettes walking to the left, and it adapts to the image data once the figure reappears.

The bottom row of Fig. 12, on the other hand, shows segmentations of the same frames obtained with a dynamical model that incorporates only the shape deformation. Since no knowledge about translation is assumed, the segmentation process must rely entirely on the image information to estimate the transformation parameters. As a consequence, the prominent occlusion misleads the segmentation process. When the figure reappears from behind the bar, the process integrates contradictory information about translation, provided by the person walking to the left and by the bar moving to the right. Once the figure of interest is lost, the prior merely "hallucinates" silhouettes of a person walking "on the spot"; see the last image at the bottom right. Although this is a "failed" experiment, the result is believed to illustrate best how the dynamical model and the image information are fused within the Bayesian formulation for image sequence segmentation. In summary, Fig. 12 shows tracking in the presence of occlusion: the top row was generated with a dynamical prior that combines deformation and transformation, the bottom row with a dynamical prior that captures only the deformation component. Since the latter provides no prediction of the translational motion, the estimate of translation is based purely on the image data. It is misled by the occlusion and cannot recover once the person reappears from behind the bar.

6. conclusion

Utilize process recited above, the dynamic statistics shape is used to the shape of implied expression.Compare with the existing shape at the implicit expression shape, these models are caught the time correlation that characterizes the deformed shape feature, and this deformed shape is such as the people's of walking continuous profile.This dynamic shape specification of a model is following true: the probability of observing specific shape in the given moment depends in observed shape of the moment before.

To construct the statistical shape models, the process extends the concepts of Markov chains and autoregressive models to the domain of implicitly represented shapes. As a consequence, the resulting dynamical shape models can handle shapes of varying topology. Moreover, these models are easily extended to higher-dimensional shapes (i.e., surfaces).

The estimated dynamical models allow shape sequences of arbitrary length to be synthesized. For the case of a walking person, the accuracy of the estimated dynamical models was validated by comparing the dynamical shape evolution of the input sequence with that of synthesized sequences for the various shape eigenmodes, and by verifying that the residuals are statistically uncorrelated. Although the synthesized shapes do not correspond to valid shapes in all cases, the dynamical model can still be used to constrain a segmentation and tracking process so that it favors familiar shape evolutions.
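The fitting, synthesis, and residual check summarized above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names `fit_ar` and `synthesize` are hypothetical, the model is fitted by ordinary least squares, and the training data are a toy two-dimensional oscillating sequence standing in for the principal-component shape vectors of real silhouettes.

```python
import numpy as np

def fit_ar(alphas, k=2):
    """Least-squares fit of alpha_t = mu + sum_i A_i alpha_{t-i} + eta (hypothetical helper)."""
    alphas = np.asarray(alphas, dtype=float)
    T, n = alphas.shape
    # One regressor row per time step t = k..T-1: [1, alpha_{t-1}, ..., alpha_{t-k}]
    X = np.hstack([np.ones((T - k, 1))] +
                  [alphas[k - i:T - i] for i in range(1, k + 1)])
    Y = alphas[k:]
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    mu = W[0]
    # Block i of W maps alpha_{t-i-1} to alpha_t in row form, so A_i is its transpose
    A = [W[1 + i * n:1 + (i + 1) * n].T for i in range(k)]
    residuals = Y - X @ W
    Sigma = np.atleast_2d(np.cov(residuals, rowvar=False))
    return mu, A, Sigma, residuals

def synthesize(mu, A, Sigma, init, length, rng):
    """Sample a shape-vector sequence of the given length from the fitted model."""
    k = len(A)
    seq = [np.asarray(a, dtype=float) for a in init[-k:]]
    for _ in range(length):
        eta = rng.multivariate_normal(np.zeros(len(mu)), Sigma)
        seq.append(mu + sum(A[i] @ seq[-1 - i] for i in range(k)) + eta)
    return np.array(seq[k:])

# Toy training data (an assumption): a damped oscillation driven by Gaussian noise
rng = np.random.default_rng(0)
w, r = 2 * np.pi / 20, 0.98
A_true = [2 * r * np.cos(w) * np.eye(2), -r**2 * np.eye(2)]
train = [np.zeros(2), np.zeros(2)]
for _ in range(2000):
    train.append(A_true[0] @ train[-1] + A_true[1] @ train[-2]
                 + 0.1 * rng.standard_normal(2))
train = np.array(train)

mu, A, Sigma, res = fit_ar(train, k=2)
synthetic = synthesize(mu, A, Sigma, train[:2], 100, rng)

# Lag-1 autocorrelation of the residuals of the first component;
# a value near zero is consistent with uncorrelated residuals.
lag1 = np.corrcoef(res[:-1, 0], res[1:, 0])[0, 1]
```

In this sketch the synthesized sequence can be made arbitrarily long by changing `length`, and a small `lag1` plays the role of the residual check described in the text.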

To this end, a Bayesian formulation for level set based image sequence segmentation was developed, which allows the statistically learned dynamical models to be imposed as a shape prior on the segmentation process. In contrast to most existing tracking approaches, the autoregressive models are integrated as statistical priors in a variational approach that can be minimized by local gradient descent (rather than by stochastic optimization methods).
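As a rough illustration of how such a prior enters a local gradient descent, the Python sketch below minimizes the sum of a data energy and a weighted shape energy, where the shape energy is the negative logarithm of the Gaussian autoregressive prior. Several simplifications are assumptions made for illustration only: the transformation parameters are omitted, the names `shape_energy_grad` and `segment_frame` are hypothetical, and the image-driven data term is replaced by a simple quadratic stand-in rather than a term computed from an actual image.

```python
import numpy as np

def shape_energy_grad(alpha, prev, mu, A, Sigma_inv):
    """Gradient of 0.5 * v^T Sigma^{-1} v, with v the deviation from the AR prediction."""
    v = alpha - mu - sum(A[i] @ prev[-1 - i] for i in range(len(A)))
    return Sigma_inv @ v

def segment_frame(alpha0, data_grad, prev, mu, A, Sigma_inv,
                  weight=1.0, step=0.05, iters=500):
    """Explicit gradient descent on E_data + weight * E_shape (hypothetical helper)."""
    alpha = np.array(alpha0, dtype=float)
    for _ in range(iters):
        grad = data_grad(alpha) + weight * shape_energy_grad(alpha, prev, mu, A, Sigma_inv)
        alpha -= step * grad
    return alpha

# Toy second-order model; prev holds [alpha_{t-2}, alpha_{t-1}]
mu = np.zeros(2)
A = [1.5 * np.eye(2), -0.5 * np.eye(2)]
Sigma_inv = np.eye(2)
prev = np.array([[0.0, 1.0], [1.0, 0.0]])

# Stand-in data term (an assumption): E_data = 0.5 * ||alpha - obs||^2
obs = np.array([0.5, 0.5])
data_grad = lambda a: a - obs

alpha_hat = segment_frame(np.zeros(2), data_grad, prev, mu, A, Sigma_inv)
```

With equal weights and identity covariance, the minimizer lies halfway between the observation `obs` and the autoregressive prediction, which shows in miniature how the image information and the dynamical prior are balanced.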

The experimental results confirm that, when tracking a walking person in the presence of large amounts of noise, dynamical shape priors outperform static shape priors. Segmentation accuracy was evaluated quantitatively as a function of the noise level. In addition, the model-based segmentation process was found to be quite robust to large (up to a factor of 5) variations in frame rate and walking speed. Furthermore, when tracking a walking person through a prominent occlusion, a dynamical prior in the joint space of deformation and transformation was shown to outperform a purely deformation-based prior.

A number of embodiments of the invention have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of the following claims.

Claims

1. A method for detecting and tracking a deformable object having a sequentially changing behavior, the method comprising:

developing a temporal statistical shape model of the oscillatory behavior of the embedding function representing the object from prior motion; and

then applying the model against future, sequential motion of the object in the presence of unwanted phenomena by maximizing the probability that the developed statistical shape model matches the sequential motion of the object in the presence of the unwanted phenomena.

2. A method comprising:

generating a dynamical model of the temporal evolution of embedding functions of prior observations of a boundary shape of an object, such object having an observable, sequentially changing boundary shape; and

subsequently using such model for probabilistic inference about such shape of the object in the future.

3. A method for detecting and tracking a deformable object having a sequentially changing behavior, the method comprising:

developing a temporal statistical shape model of the oscillatory behavior of the embedding function representing the object from prior motion, comprising:

developing a hand-segmented image sequence showing training shapes of the object and a level set function for each of the shapes;

computing a sequence of shape vectors comprising principal component representations of the level set functions;

estimating a dynamical model for the sequence of shape vectors; and

determining the one of the sequence of shape vectors having the maximum probability of matching the dynamical model.

4. The method of claim 3, wherein estimating the dynamical model comprises using an autoregressive model for the sequence of shape vectors.

5. The method of claim 3, wherein determining the maximum probability comprises maximizing a conditional probability.

6. The method of claim 3, wherein determining the maximum probability comprises performing gradient descent on the negative logarithm of a conditional probability.

7. The method of claim 3, wherein estimating the dynamical model comprises approximating the shape vectors of the sequence of level set functions,

α_{t} \equiv α_{φ_{t}},

by a Markov chain of order k according to

α_{t} = μ + A_{1} α_{t-1} + A_{2} α_{t-2} + \ldots + A_{k} α_{t-k} + η,

where η is zero-mean Gaussian noise with covariance Σ.

8. The method of claim 6, wherein, given a consecutive image I_{t}: Ω → R from an image sequence, and given the segmentations α_{1:t-1} and transformations \hat{θ}_{1:t-1} obtained on the previous images I_{1:t-1}, determining the maximum probability comprises:

maximizing, with respect to the shape vector α_{t} and the transformation parameters θ_{t}, the conditional probability

P(α_{t}, θ_{t} | I_{t}, α_{1:t-1}, \hat{θ}_{1:t-1}) = \frac{P(I_{t} | α_{t}, θ_{t}) \, P(α_{t}, θ_{t} | α_{1:t-1}, \hat{θ}_{1:t-1})}{P(I_{t} | α_{1:t-1}, \hat{θ}_{1:t-1})}, (11)

wherein the conditional probability is modeled as:

P(α_{t}, θ_{t} | α_{1:t-1}, \hat{θ}_{1:t-1}), (12)

which constitutes the probability of observing a particular shape α_{t} and a particular transformation θ_{t} at time t, conditioned on the parameter estimates for shape and transformation obtained on the previous images.

9. The method of claim 6, wherein performing gradient descent on the negative logarithm of the conditional probability comprises using a gradient descent strategy, which leads to the following differential equation for estimating the shape vector α_{t}:

\frac{d α_{t}(τ)}{dτ} = - \frac{\partial E_{data}(α_{t}, θ_{t})}{\partial α_{t}} - v \frac{\partial E_{shape}(α_{t}, θ_{t})}{\partial α_{t}}, (22)

where τ denotes the artificial evolution time, as opposed to the physical time t.