CN103106667B - A moving-object tracking method robust to occlusion and scene change - Google Patents

A moving-object tracking method robust to occlusion and scene change

Info

Publication number
CN103106667B
CN103106667B (granted from application CN201310039754.3A)
Authority
CN
China
Prior art keywords
tracing
kalman filter
image
moving objects
surf
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310039754.3A
Other languages
Chinese (zh)
Other versions
CN103106667A (en)
Inventor
房胜
汴紫涵
徐田帅
王飞
党超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University of Science and Technology
Original Assignee
Shandong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University of Science and Technology filed Critical Shandong University of Science and Technology
Priority to CN201310039754.3A priority Critical patent/CN103106667B/en
Publication of CN103106667A publication Critical patent/CN103106667A/en
Application granted granted Critical
Publication of CN103106667B publication Critical patent/CN103106667B/en

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a moving-object tracking method robust to occlusion and scene change, comprising the following steps: a. perform foreground motion detection on the input video sequence and extract the moving objects; b. if features of the tracked object have already been saved, go to step d; otherwise, initialize the target template and extract SURF features from the region selected by the user, and initialize the Kalman filter; c. track the moving target by Kalman-filter prediction until the video content ends, then go to step e; if occlusion occurs during tracking, go to step d; d. determine the tracked object by SURF feature matching; once the matches stabilize and the occlusion is judged to have ended, reinitialize the Kalman filter and return to step c; e. output and save the target object's feature information. The invention is a complete moving-object tracking method for video from a fixed-background monocular camera; it can be implemented as software and is convenient to apply.

Description

A moving-object tracking method robust to occlusion and scene change
Technical field
The invention belongs to the field of image processing, specifically the tracking of moving objects. It relates to a fast, accurate tracking method that combines Kalman filtering with SURF feature matching to track moving objects under occlusion and scene change.
Background technology
Existing methods for tracking moving objects in video sequences include the following:
One is region-based tracking: each frame is first segmented into video objects, and correspondences are then established between the segmented objects to realize tracking. This method places very high demands on the segmentation; if the object is mis-segmented in even one or a few frames of a clip, tracking of the whole video object fails.
Two is the Graph Cuts method (also called the Min-Cut/Max-Flow method), a classic image-segmentation method from which many current segmentation methods are derived. Because tracking of moving objects usually starts from extraction of the foreground object, this region-based tracking approach is widely applied. However, the method cannot segment mutually occluding objects well, so it performs poorly in scenes where occlusion is frequent.
Three is model-based tracking, which at present falls mainly into two classes: model-based human tracking and model-based vehicle tracking. Owing to the characteristics of the method, once the correspondence between the object's 2D image coordinates and its 3D coordinates is obtained, the 3D model can be used to track the object even under large changes of viewing angle. The method requires the tracked object to be modeled first, and tracking is then realized by matching the model against the video content; it also requires enough prior knowledge of the tracked object to build an effective object model.
Summary of the invention
The object of the present invention is to provide a moving-object tracking method robust to occlusion and scene change that can track a specific moving object in video quickly and accurately.
Its technical solution is as follows:
A moving-object tracking method robust to occlusion and scene change comprises the following steps:
a. Perform foreground motion detection on the input video sequence and extract the moving objects; then go to step b.
b. If features of the tracked object have been saved, go to step d; otherwise, initialize the target template and extract SURF features from the region selected by the user, and initialize the Kalman filter; then go to step c.
c. Track the moving target by Kalman-filter prediction until the video content ends, then go to step e; if occlusion occurs during tracking, go to step d.
d. Determine the tracked object by SURF feature matching; once the matches stabilize and the occlusion is judged to have ended, reinitialize the Kalman filter and return to step c.
e. Output and save the target object's feature information.
In step a above, two reference frames I_bg(x, y) and I_up(x, y) are maintained: I_bg(x, y) is the background frame of the current scene, and I_up(x, y) is a reference frame continuously updated over time. The current frame I(x, y) is differenced and binarized against I_bg(x, y) and I_up(x, y); the results are denoted F_bg(x, y) and F_up(x, y), and abandoned objects and moving objects in the scene are distinguished according to these two values.
In step c above, the Kalman filter is first initialized, and prediction and tracking are then carried out from the observed state of the target object. During tracking, the template image is updated adaptively according to the contour changes of the target object, and representative feature information is saved; whether occlusion occurs is modeled, analyzed and judged with a contour-intersection criterion.
In step d above, the video content is searched automatically for the foreground blob with the most feature-point matches to the tracked object. After SURF matching points are obtained, the RANSAC algorithm is applied to remove false matches caused by measurement error and noise, and the homography between the images is estimated from the exact matches to locate the target object in the video. Whether the occlusion has ended is judged with the same model used above to judge whether occlusion occurs, and the Kalman filter is reinitialized with the same method as its initialization above.
The present invention has the following advantageous effects:
The method combines the Kalman filter with SURF feature matching. On the one hand, when there is no occlusion or scene change, the Kalman filter completes prediction and tracking quickly; on the other hand, under occlusion and scene change, properties of the SURF feature such as scale invariance effectively solve the tracking problem in the cases where the Kalman filter fails. The method is therefore both fast and accurate, and because the target template is updated adaptively as the object's contour changes, robustness is also good. The invention integrates an abandoned-object detection algorithm, the Kalman filter, SURF features and occlusion judgment into a complete moving-object tracking method for video from a fixed-background monocular camera; it can be implemented as software and is convenient to apply.
Brief description of the drawings
The present invention is further described below with reference to the drawings and an embodiment:
Fig. 1 is a schematic flow diagram of the foreground motion detection in the present invention.
Fig. 2 is a schematic workflow of the basic Kalman filter used in the present invention.
Fig. 3 compares the scale spaces built by the SURF and SIFT algorithms used in the present invention.
Fig. 4 shows the box-filter templates in three directions used by the SURF algorithm in the present invention.
Fig. 5 is the flow block diagram of the present invention.
Embodiment
To aid understanding and implementation of the present invention, the technical background it builds on is first described below:
One, moving-object detection algorithms.
1. Temporal differencing
Temporal differencing subtracts two adjacent frames of the video sequence and extracts the moving object from the resulting pixel differences. The method is simple and convenient and is applicable to extraction under dynamic backgrounds, but the object contour it produces may be incomplete. For example, when the moving object moves very slowly and itself contains a large smooth region, differencing two adjacent frames yields nothing in the overlapping part, and "holes" appear in the extracted contour.
A current improvement is to use three-frame differencing instead of two-frame differencing, which detects the contour of the moving target in the intermediate frame much better. Let three consecutive frames of the video sequence be I_{t−1}(x, y), I_t(x, y), I_{t+1}(x, y), and compute the pixel differences of each pair of adjacent frames:
d_{(t, t−1)}(x, y) = |I_t(x, y) − I_{t−1}(x, y)|    (1)
d_{(t+1, t)}(x, y) = |I_{t+1}(x, y) − I_t(x, y)|    (2)
Then binarize the difference images, obtaining b_{(t, t−1)}(x, y) and b_{(t+1, t)}(x, y), and take their logical AND to obtain the binary image B_t(x, y):
B_t(x, y) = 1 if b_{(t, t−1)}(x, y) ∩ b_{(t+1, t)}(x, y) = 1, and 0 otherwise    (3)
Finally, erosion, dilation and similar morphological operations are applied to the resulting binary image to remove noise and "holes".
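The three-frame differencing above can be sketched in a few lines of NumPy. This is a minimal illustration under our own assumptions (the threshold value and the toy frames are ours, not the patent's), with the final morphological clean-up omitted:

```python
import numpy as np

def three_frame_diff(prev, cur, nxt, thresh=25):
    """Three-frame differencing: binarize both adjacent difference
    images and AND them, as in equation (3) above."""
    d1 = np.abs(cur.astype(np.int16) - prev.astype(np.int16))
    d2 = np.abs(nxt.astype(np.int16) - cur.astype(np.int16))
    return ((d1 > thresh) & (d2 > thresh)).astype(np.uint8)  # B_t(x, y)

# synthetic example: a 1-pixel-wide bright bar moves right one pixel per frame
f0 = np.zeros((5, 5), np.uint8); f0[1:3, 0] = 200
f1 = np.zeros((5, 5), np.uint8); f1[1:3, 1] = 200
f2 = np.zeros((5, 5), np.uint8); f2[1:3, 2] = 200
mask = three_frame_diff(f0, f1, f2)  # fires only where the bar sits in the middle frame
```

Note how the AND keeps only the intermediate frame's target position, which is exactly why three-frame differencing recovers the middle frame's contour better than a single two-frame difference.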
2. Optical-flow methods
The basic principle of detecting moving objects by optical flow is to assign a velocity vector to every pixel in the image. If the image contains no moving object, the optical-flow vectors vary continuously over the whole image region; if a moving object is present, there is relative motion between the object and the background, their velocity vectors differ, and the moving object is thereby detected. Because computing optical flow is very complex and computationally expensive, it is generally not adopted by real-time systems.
3. Background modeling
Background modeling resembles temporal differencing in that both difference two frames; the difference is that background modeling differences the current frame against a reference frame (the background frame). Background-modeling methods are widely used for motion detection with a static camera, and the choice of background frame is the key of the whole algorithm. Background modeling means modeling the background frame: ideally the background frame contains no moving object, and it is updated according to some strategy to adapt to dynamic changes of the scene, such as illumination changes, swaying leaves, ripples, or falling rain and snow. Existing background-modeling methods fall broadly into six classes: incremental Gaussian averaging, temporal median filtering, Gaussian mixture models, kernel density estimation, sequential kernel density approximation, and eigen-background models.
The motion-detection method adopted in the present invention is a simple and fast method based on the idea of background modeling: an abandoned-object detection method (see Fig. 1). Two reference frames I_bg(x, y) and I_up(x, y) are maintained. I_bg(x, y) is the background frame of the current scene, and I_up(x, y) is a reference frame continuously updated over time. The current frame I(x, y) is differenced and binarized against I_bg(x, y) and I_up(x, y), giving F_bg(x, y) and F_up(x, y), and abandoned objects and moving objects in the scene are distinguished according to these two values.
Two, tracking algorithm based on the Kalman filter (see Fig. 2).
1. The discrete Kalman filter
This is the moving-object tracking method used by the present invention when there is no occlusion and no scene change. Basic Kalman filtering solves the linear filtering and prediction problem with minimum mean-squared error as its criterion; it is simple and fast.
Kalman filter prediction:
X̂_k^− = F_k X̂_{k−1} + B_{k−1} u_{k−1}    (1)
P_k^− = F_k P_{k−1} F_k^T + Q_{k−1}    (2)
Kalman filter update:
K_k = P_k^− H_k^T (H_k P_k^− H_k^T + R_k)^{−1}    (3)
X̂_k = X̂_k^− + K_k (Z_k − H_k X̂_k^−)    (4)
P_k = (I − K_k H_k) P_k^−    (5)
Here F is the state-transition matrix, H the measurement matrix, Z the measurement, B the input-transformation matrix and u the input (some systems take no input, so B and u may be omitted). Q and R are the covariances of the noise in the state-transition process and the measurement process respectively. X̂_k^− denotes the best prediction of the state X_k made at time k−1 for time k, and X̂_k denotes the updated estimate of X_k obtained from the measurement Z_k at time k and the prediction made at the previous moment. P is the covariance, with the same super- and subscripts as X, and K_k is the Kalman gain.
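The predict/update cycle in equations (1)-(5) can be sketched directly in NumPy. This is a minimal sketch, not the patent's implementation; the constant-velocity F and H below match the model defined later in step c, while the Q, R and initial values here are illustrative:

```python
import numpy as np

def kalman_predict(x, P, F, Q):
    # equations (1)-(2): a-priori state and covariance (no input term B*u)
    return F @ x, F @ P @ F.T + Q

def kalman_update(x_pred, P_pred, z, H, R):
    # equations (3)-(5): gain, a-posteriori state and covariance
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x = x_pred + K @ (z - H @ x_pred)
    P = (np.eye(len(x)) - K @ H) @ P_pred
    return x, P

# constant-velocity model in x and y: state (x, y, vx, vy), measurement (x, y)
F = np.array([[1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0], [0, 0, 0, 1]], float)
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)
Q = np.eye(4) * 0.01
R = np.eye(2) * 0.1
x = np.array([0.0, 0.0, 1.0, 1.0])   # start at origin with unit velocity
P = np.eye(4) * 100.0
for t in range(1, 6):                # feed measurements along the true diagonal path
    x, P = kalman_predict(x, P, F, Q)
    x, P = kalman_update(x, P, np.array([float(t), float(t)]), H, R)
```

With measurements that agree with the prediction, the filter simply carries the state forward; with noisy measurements, the gain K would blend prediction and observation.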
2. The extended Kalman filter
For suboptimal estimation with nonlinear state-space models, the most widely used class of methods is the extended Kalman filter (EKF). The basic idea of the EKF is to first linearize the nonlinear system and then perform a procedure similar to the linear Kalman filter. Concretely, the Taylor expansion of the nonlinear function is truncated, thereby linearizing it; depending on whether the truncation is first- or second-order, the EKF divides mainly into the first-order EKF and the second-order EKF.
Although the extended Kalman filter performs outstandingly on nonlinear models, it has clear shortcomings in practice: first, linearizing the nonlinear model can produce unstable filtering; second, computing the Jacobian derivatives is relatively complex to implement; third, the model functions met in practice may be non-differentiable, which makes the EKF fail. Therefore, when the model is strongly nonlinear and the system noise is non-Gaussian, the estimation accuracy of the EKF degrades greatly and finally leads to failure.
Three, SIFT and SURF feature matching (see Fig. 3). The left part of the figure shows the image pyramid built by the classic method, where each layer is a down-sampling of the layer below; the right part shows how SURF builds its scale space: the image stays fixed and only the size of the filter template changes.
SURF is an improved algorithm derived from SIFT. It is far faster than SIFT in feature matching, so it can further be applied in real-time image-matching scenarios. The present invention adopts SURF for feature-point matching.
The SURF algorithm consists of four parts: scale-space construction, feature-point detection, feature-descriptor generation, and feature-point matching.
1. Scale-space construction
Scale-space construction in the SIFT algorithm
The traditional scale space is described as a pyramid. The Gaussian kernel is the only linear kernel that realizes scale change; given an image I(x, y), its scale space is defined as
L(x, y, δ) = G(x, y, δ) * I(x, y)    (1)
where G(x, y, δ) is the variable-scale Gaussian function
G(x, y, δ) = (1 / (2πδ²)) e^(−(x² + y²) / (2δ²))    (2)
(x, y) are the spatial coordinates and δ is the scale coordinate; the size of δ determines the smoothness of the image. Using the formulas above, the number of octaves of the pyramid scale space, and the number of image layers within each octave, are determined from the image size. The first layer of the first octave is the original image, and each layer above it is obtained by Gaussian convolution of the layer below with gradually increasing δ; intuitively, the image becomes blurrier going up.
To detect stable keypoints effectively in the scale space, Lowe et al. proposed the difference-of-Gaussian (DoG) scale space:
D(x, y, δ) = (G(x, y, kδ) − G(x, y, δ)) * I(x, y) = L(x, y, kδ) − L(x, y, δ)    (3)
Each layer of the DoG pyramid is obtained by subtracting two adjacent layers of the Gaussian pyramid, so the DoG pyramid has the same number of octaves as the Gaussian pyramid but one layer fewer per octave.
Scale-space construction in the SURF algorithm
The drawback of the way SIFT builds its scale space is that every layer depends on the layer below and the image has to be resized, so the method is computationally expensive.
When building its image pyramid, SURF changes not the image size but the size of the filter template. SURF can build the scale space in parallel and needs no sub-sampling of the image, which improves speed. The difference between the scale spaces constructed by SURF and SIFT is shown in Fig. 3.
2. Feature-point detection
Given an image I(x, y), its integral image is
I_Σ(x, y) = Σ_{i=0..x} Σ_{j=0..y} I(i, j)    (4)
SURF feature-point detection uses the determinant of the Hessian matrix to judge whether a point of the image is an extremum. If f(x, y) is a twice-differentiable function, its Hessian matrix is
H(f(x, y)) = [ ∂²f/∂x²   ∂²f/∂x∂y ;  ∂²f/∂x∂y   ∂²f/∂y² ]    (5)
and the determinant of H is
det H = (∂²f/∂x²)(∂²f/∂y²) − (∂²f/∂x∂y)²    (6)
which equals the product of the eigenvalues of H: if det H < 0, the point (x, y) is not a local extremum; if det H > 0, it is. Accordingly, the Hessian matrix of the image I(x, y) at scale δ is
H(x, y, δ) = [ L_xx(x, y, δ)   L_xy(x, y, δ) ;  L_xy(x, y, δ)   L_yy(x, y, δ) ]    (7)
where L_xx(x, y, δ) is the convolution of the image at point (x, y) with the second-order partial derivative of the Gaussian in x; L_xy(x, y, δ) and L_yy(x, y, δ) are defined similarly.
Bay et al. replace the convolution kernels, after reasonable discretization and cropping, with box-filter templates, and use the integral image to cut the computation and speed up the convolution. The three cropped kernels are D_xx, D_yy and D_xy, simplified versions of L_xx, L_yy and L_xy. The 9 × 9 box-filter templates are shown in Fig. 4 and correspond to a second-order Gaussian filter of scale δ = 1.2.
Because box filtering is only an approximation of second-order Gaussian filtering, an error is introduced when the Hessian determinant is computed with box filters in place of Gaussian filters; to compensate, the determinant is taken as
det H = D_xx D_yy − (0.9 D_xy)²    (8)
The kernel of side length 9 is the smallest-scale kernel; as the scale grows, the size of the kernel (filter template) grows proportionally. For an N × N filter template, the corresponding scale is δ = N × 1.2 / 9.
After the extrema at each scale are obtained from the Hessian determinant, each extremum is compared with its 8 neighbors at the same scale and the 9 points each at the scales above and below; only when its value is the maximum or minimum among these 26 values is it taken as a candidate feature point.
Finally, the interpolation method described by M. Brown is applied to obtain sub-pixel feature-point positions and their scale values. Low-contrast feature points and unstable edge-response points are removed (the DoG method produces strong edge responses), which improves noise resistance and match stability.
Because the feature points are locally stable points chosen across image scales, they meet the requirements of feature matching under scale change.
SURF runs more efficiently than SIFT because it uses the integral image and the Hessian matrix to accelerate feature-point detection.
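The integral image of equation (4) is what makes SURF's box filtering fast: any rectangular sum, whatever its size, costs four lookups. A minimal NumPy sketch of the idea (the function names and toy image are ours):

```python
import numpy as np

def integral_image(img):
    # equation (4): cumulative sum over rows, then over columns
    return img.astype(np.float64).cumsum(0).cumsum(1)

def box_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] recovered from the integral image in O(1)
    with four lookups; this constant-time box filter is what lets SURF
    grow the filter template instead of shrinking the image."""
    total = ii[r1 - 1, c1 - 1]
    if r0 > 0:
        total -= ii[r0 - 1, c1 - 1]
    if c0 > 0:
        total -= ii[r1 - 1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total

img = np.arange(16, dtype=np.float64).reshape(4, 4)
ii = integral_image(img)
s = box_sum(ii, 1, 1, 3, 3)   # sum of the central 2x2 block
```

The D_xx, D_yy and D_xy responses of equation (8) are then just signed combinations of a few such box sums.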
3. SURF feature-descriptor generation
Generating the SURF descriptor takes two steps: orientation assignment and descriptor construction.
Orientation assignment. To ensure rotation invariance of the feature point, pixels in a neighborhood of radius 6δ around it (δ being the feature point's scale) are sampled with step δ, and the Haar-wavelet responses in the x and y directions are computed at the sample points with wavelets of side 4δ. So that samples near the feature point contribute more than distant ones, the wavelet responses are weighted with a Gaussian of σ = 2δ and then represented as points on a 2D plane, giving the distribution of all sample responses. A sliding window with an opening angle of 60° is then swept with a fixed step; at each position the responses within the 60° range are summed into a new vector, and the direction of the longest such vector is chosen as the feature point's dominant orientation.
Descriptor construction. Centered at the feature point, the x-axis is rotated to the dominant orientation. A square region of side 20δ is chosen and divided into 4 × 4 subregions; in each subregion the Haar responses of 25 sample points are computed, denoted d_x and d_y. The responses on each subwindow are weighted with a Gaussian of σ = 3.3δ and accumulated into Σd_x, Σd_y, Σ|d_x| and Σ|d_y|. Each subregion thus yields a four-dimensional vector, and since the square region contains 16 subwindows, each feature point yields a descriptor of 16 × 4 = 64 dimensions.
Through dominant-orientation alignment, the SURF feature is rotation invariant and can be used for matching under rotation.
4. SURF feature-point matching
Feature points are matched using the Euclidean distance between feature vectors as the similarity measure between keypoints in the two images. The Euclidean distance is
d = sqrt(Σ_i (x_i1 − x_i2)²)    (9)
where x_i1 is the i-th component of a keypoint descriptor in the first image and x_i2 the i-th component of one in the second image.
The concrete procedure is: take a keypoint in the first image and find the two keypoints in the second image with the smallest Euclidean distances to it. If the nearest of the two distances is less than a certain threshold times the second-nearest, a matching pair is considered found. Raising this threshold increases the number of matches but lowers the accuracy; lowering it reduces the number of matches but raises the accuracy.
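The nearest/second-nearest ratio test just described can be sketched as follows. This is a minimal NumPy version under our own assumptions (the 0.7 ratio and the synthetic 64-dimensional descriptors are illustrative, not values from the patent):

```python
import numpy as np

def ratio_match(desc1, desc2, ratio=0.7):
    """Ratio test on Euclidean distances (eq. (9)): accept a match only
    if the best distance is well below the second-best."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        order = np.argsort(dists)
        if dists[order[0]] < ratio * dists[order[1]]:
            matches.append((i, int(order[0])))
    return matches

rng = np.random.default_rng(0)
desc2 = rng.normal(size=(10, 64))                          # 64-dim, like SURF descriptors
desc1 = desc2[:3] + rng.normal(scale=0.01, size=(3, 64))   # slightly perturbed copies
m = ratio_match(desc1, desc2)
```

A brute-force scan like this is O(n²) in the number of keypoints, which is one source of the matching inefficiency the patent lists among SURF's shortcomings below.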
5. Shortcomings of SURF in moving-object tracking
Although SIFT/SURF features are fairly stably invariant to scaling, rotation, translation and brightness change of the image, applying them to moving-object tracking still has shortcomings, mainly the following:
1) Because the scale space is built as an image pyramid, the layers may be too sparse, introducing error into the scale matching. When the original image itself is small, building the scale space contributes little to feature-point extraction.
2) Low-contrast points and unstable edge-response points are removed when SURF feature points are filtered. If the image content contains large smooth regions, the feature information of those regions is filtered out; edge information, equally important, may likewise be discarded.
3) SURF features are local features of the image content and ignore the global information of the image itself.
4) The search strategy SURF uses for feature-point matching is not efficient and does not exploit the positional relationships between neighboring feature points, so mismatches can occur.
5) The SIFT/SURF algorithms use only the grayscale characteristics of the image and do not consider its inherent color information.
6) Compared with Kalman-filter tracking, the computational load of SURF is far larger; using SURF alone for tracking moving objects in video hardly meets real-time requirements.
Four, the homography matrix.
The concept of the homography matrix is used when locating the target object. For two images A and B of the same space, if there is a one-to-one mapping from image A to image B, the matrix representation of that mapping is a homography. If the homogeneous coordinates of corresponding points in the two images are I(x, y, 1) and I′(x′, y′, 1) and the homography is H, the projection relation is
k [x′, y′, 1]^T = H [x, y, 1]^T = [h1 h2 h3; h4 h5 h6; h7 h8 h9] [x, y, 1]^T    (1)
where k is a scale factor and H is usually a transformation matrix with 8 degrees of freedom. Setting h9 = 1, it follows from (1) that
h1 x + h2 y + h3 − h7 x x′ − h8 y x′ − x′ = 0    (2)
h4 x + h5 y + h6 − h7 x y′ − h8 y y′ − y′ = 0    (3)
Thus the coordinates of 4 pairs of matched points suffice to compute the homography matrix H.
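Solving for H from exactly 4 correspondences can be sketched by stacking equations (2)-(3) into an 8×8 linear system with h9 fixed to 1. This is a minimal direct-linear-transform sketch (the translation example is ours; a real pipeline would use RANSAC over many noisy matches, as the patent's step d does):

```python
import numpy as np

def homography_from_points(src, dst):
    """Build equations (2)-(3) for each of the four correspondences and
    solve for h1..h8, taking h9 = 1."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp]); b.append(xp)
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp]); b.append(yp)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

# four corners of a unit square mapped by a pure translation (+2, +3)
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(2, 3), (3, 3), (3, 4), (2, 4)]
H = homography_from_points(src, dst)
```

For this translation the recovered H is the identity with (2, 3) in the last column; any fifth point then maps through H up to the scale factor k of equation (1).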
Combining the foregoing with Fig. 1 and Fig. 5, the technical scheme of the present invention is described further below:
a. Perform foreground motion detection on the input video sequence and extract the moving objects.
In the application scenarios the present invention targets, it can happen that a moving object enters the scene and then suddenly stays in the same position for a long time, or that an object in the background starts to move. The present invention therefore extracts moving objects with an abandoned-object detection method.
Abandoned-object detection differs from motion detection: it must not only detect objects absent from the original scene but also judge whether such an object has come to rest in the scene.
The concrete method is: maintain two reference frames I_bg(x, y) and I_up(x, y). I_bg(x, y) is the background frame of the current scene (containing no moving object, like the background frame in ordinary background modeling), and I_up(x, y) is a reference frame continuously updated over time. With the current frame I(x, y), the update is
I_up(x, y) = (1 − α) I_up(x, y) + α I(x, y)    (1)
where α is the update-rate weight: the larger α, the faster the update; the smaller α, the slower. Thus if a foreign object comes to rest in the scene, after some time it is absorbed into I_up(x, y) and becomes part of the "background". I(x, y) is differenced and binarized against I_bg(x, y) and I_up(x, y), giving F_bg(x, y) and F_up(x, y). If the values at a point (x, y) are F_bg(x, y) = 1 and F_up(x, y) = 0, the point can be judged to belong to an abandoned object; in general, abandoned objects and moving objects in the scene are distinguished by the rules defined in Table 1.
Table 1: decision table of the abandoned-object detection method

F_bg(x, y)   F_up(x, y)   Judged type
1            1            Moving object
1            0            Temporarily stationary object (abandoned object)
0            1            Random noise
0            0            Stationary object in the scene
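The update rule of equation (1) and the decision rule of Table 1 can be sketched together as follows. A minimal per-pixel sketch (the function names, the α value and the label strings are ours):

```python
def update_reference(i_up, frame, alpha=0.05):
    # equation (1): running-average update; larger alpha adapts faster
    return (1 - alpha) * i_up + alpha * frame

def classify_pixel(f_bg, f_up):
    """Decision rule of Table 1: f_bg is the binarized difference against
    the fixed background frame, f_up against the running-average frame."""
    if f_bg == 1 and f_up == 1:
        return "moving object"
    if f_bg == 1 and f_up == 0:
        return "abandoned object"     # temporarily stationary
    if f_bg == 0 and f_up == 1:
        return "random noise"
    return "scene background"
```

The intuition: an abandoned object still differs from the fixed background frame (F_bg = 1) but has already been absorbed into the running-average frame (F_up = 0).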
b. If features of the tracked object have been saved, go directly to step d; otherwise, initialize the target template and extract SURF features from the region selected by the user, and initialize the Kalman filter.
If features of the tracked object have been saved under the designated directory on disk, a scene change has occurred. The saved template information and feature information of the target object are first imported from that directory, and a global search and match is then performed in the video frames. If the number of matches between a moving object in some frame and the template exceeds a specified threshold, the target object is considered to have appeared in this scene, and the Kalman filter must be initialized to carry out tracking.
If no features of the tracked object are saved under the designated directory, the user must select the target object by drawing a box with the mouse on the monitoring picture; the algorithm then initializes the template and extracts SURF features from the selected target object.
The Kalman filter is initialized as follows:
1) Owing to the constraints of the invention's application scenarios, the speed of the target object in the scene generally does not change greatly, so the present invention analyzes the object with the equations of uniform motion:
x_t = x_{t−1} + v_x    (2)
y_t = y_{t−1} + v_y    (3)
Formulas (2) and (3) are the equations of motion of the object in the x- and y-directions, with v_x and v_y its velocities in the two directions. At time t the state of a moving object is represented as X_t = (x_t, y_t, v_x, v_y)^T and the measurement as Z_t = (x_t, y_t)^T.
2) the state-transition matrix F and the calculation matrix H that contrast Kalman filter model and the known system of the equation of motion are respectively:
F = 1 0 1 0 0 1 0 1 0 0 1 0 0 0 0 1 , H = 1 0 0 0 0 1 0 0 - - - ( 4 )
3) Based on related experiments and the practical application scenarios of the present invention, the covariance matrices Q and R of the noise vectors in the state-transition process and the observation process of the adopted Kalman filter are set to:

Q = 0.01 · I_4,        R = | 0.2845  0.0045 |
                           | 0.0045  0.0455 |        (5)

where I_4 denotes the 4×4 identity matrix.
4) The covariance matrix P of the initial state vector of the system is defined as:

P_0 = 100 · I_4        (6)
5) When the position of the camera is adjusted, the definitions of the matrices Q, R and P_0 above can be tuned to the actual conditions. For the definition of the initial state, the algorithm allows user-defined selection: the user draws a box with the mouse around the target object to be tracked at a chosen moment, and the system then initializes the position and velocity of the state from the observed coordinates of the top-left vertex of this object.
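The initialization and per-frame operation described above can be sketched in pure NumPy. This is a minimal illustration using the matrices of formulas (4)-(6); the simulated measurements in the usage example are invented for the sketch, not taken from the patent:

```python
import numpy as np

# Constant-velocity model, formulas (2)-(3); matrices from formulas (4)-(6).
F = np.array([[1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.],
              [0., 0., 0., 1.]])          # state-transition matrix, (4)
H = np.array([[1., 0., 0., 0.],
              [0., 1., 0., 0.]])          # measurement matrix, (4)
Q = 0.01 * np.eye(4)                      # process-noise covariance, (5)
R = np.array([[0.2845, 0.0045],
              [0.0045, 0.0455]])          # measurement-noise covariance, (5)
P0 = 100.0 * np.eye(4)                    # initial state covariance, (6)

def kalman_step(x, P, z):
    """One predict/update cycle; x is (x, y, v_x, v_y), z is (x, y)."""
    x_pred = F @ x                         # predict the state
    P_pred = F @ P @ F.T + Q               # predict the covariance
    S = H @ P_pred @ H.T + R               # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)  # correct with the measurement
    P_new = (np.eye(4) - K @ H) @ P_pred
    return x_new, P_new
```

Initializing x from the user-selected top-left vertex and feeding the observed object positions frame by frame makes the velocity components of the state converge to the object's actual per-frame displacement.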
c. Use a Kalman-filter-based method to predict and track the moving target until the video content ends, then go to step e; if occlusion occurs during tracking, go to step d.
After the Kalman filter has been initialized, the system can predict and track the position of the target object. During tracking, the template image is updated adaptively according to contour changes of the target object, and representative feature information is saved.
The present invention decides in real time whether to update the target template and the SURF feature descriptors according to contour changes of the tracked object. A contour change here mainly refers to a change, with the camera focal length fixed, in the total pixel area occupied by the target in the image, caused by changes in the target's position and orientation. That is, when
|A_m − A_n| > H    (7)
holds, the target is judged to have undergone a contour change, where A_m and A_n are the pixel areas occupied by the target at times m and n respectively, and H is a threshold that can be adjusted for the specific scene.
After a change in the pixel area of the target has been detected, the system further checks whether the aspect ratio of the target object has changed: if the current aspect ratio R_t of the target differs significantly from the aspect ratio R_m of the previously saved template, the object is considered to have changed its viewing angle, and the template information of the target object must be updated:
|R_t − R_m| > H_R,    R_t = W_t / H_t    (8)
H_R is the threshold for judging a large change in the aspect ratio of the target object, and W_t and H_t denote the width and height of the target object at time t.
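The checks of formulas (7) and (8) can be sketched as follows, computing the pixel area and bounding-box aspect ratio of a blob from a binary mask. The function names and the threshold values `h_area` and `h_ratio` are illustrative, not taken from the patent:

```python
import numpy as np

def blob_stats(mask):
    """Pixel area and bounding-box aspect ratio W/H of a binary mask."""
    ys, xs = np.nonzero(mask)
    area = xs.size
    w = xs.max() - xs.min() + 1
    h = ys.max() - ys.min() + 1
    return area, w / h

def needs_template_update(area_prev, ratio_prev, mask_now,
                          h_area=200, h_ratio=0.3):
    """Formula (7): significant area change; formula (8): aspect-ratio change."""
    area_now, ratio_now = blob_stats(mask_now)
    if abs(area_now - area_prev) <= h_area:       # formula (7) not satisfied
        return False
    return abs(ratio_now - ratio_prev) > h_ratio  # formula (8)
```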
Judge the generation of blocking.The present invention adopts the determination methods intersected based on profile, and the method is effective fast, and it needs blocking of solution two kinds of situations: one is mutually blocking between object, and two is that destination object is blocked by background.
We are known when destination object blocks, and the elemental area shared by its profile increases sharply or reduces.Elemental area increases expression and there occurs blocking between object, and elemental area reduces the situation representing and there occurs background occlusion objects.Therefore, after being detected obtained the binary image of moving object contours by motion, recycling following formula carries out judging just can know and blocked the moment of generation.
|S_t − S_{t−1}| > T    (9)
S_t and S_{t−1} are the pixel areas occupied by the target object at the current and previous moments respectively, and T is the chosen threshold, which must be adapted to the camera focal length of the specific scene.
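The criterion of formula (9) can be sketched as follows; the threshold value and function name are illustrative, and the sign of the area jump distinguishes the two occlusion cases described above:

```python
def occlusion_event(s_t, s_prev, t_thresh=400):
    """Formula (9): an abrupt blob-area change signals the start of occlusion.

    Returns None if no occlusion is detected, 'merge' if the area jumped up
    (two objects' blobs merged), or 'background' if it dropped sharply
    (the target is being occluded by the background).
    """
    if abs(s_t - s_prev) <= t_thresh:
        return None
    return "merge" if s_t > s_prev else "background"
```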
d. Determine the tracked object with a matching process based on SURF features; once the feature matching has stabilized and the occlusion is judged to have ended, reinitialize the Kalman filter and return to step c.
Among the matched feature-point pairs obtained between two images by the SURF algorithm, not all pairs are correct; some are erroneous matches caused by measurement error and noise. Therefore, after obtaining the SURF matching point pairs, the present invention applies the RANSAC algorithm to refine the matches and obtain the homography matrix of the transformation between the images, and thereby locates the target object in the video.
RANSAC (random sample consensus) is an iterative method for estimating the parameters of a mathematical model. Its basic idea is to find, by random sampling and verification, the parameters of a model that most of the samples (here, the mutually matched feature-point pairs) can satisfy.
The concrete steps of the RANSAC algorithm as used in the present invention are as follows:
1) Randomly select four pairs of SURF match points as the initial inlier set, and compute the transformation matrix H from these four pairs.
2) Check whether points outside the inlier set can be added to it: for each remaining match (I, I′), compute the distance between I′ and the projected point H·I; if this distance is below the chosen threshold, add the pair to the inlier set.
3) Repeat steps 1) and 2) N times, and take the inlier set with the largest number of points as the final match set. Finally, re-estimate the transformation matrix H from this set by least squares.
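The three steps above can be sketched in NumPy. Since formulas (11) and (12) below use the six parameters h_1…h_6, the model fitted here is the corresponding 6-parameter (affine) transform, estimated by least squares; in the general projective case a DLT homography would be used instead. All names, the iteration count, and the threshold are illustrative:

```python
import numpy as np

def fit_transform(src, dst):
    """Least-squares fit of the 6-parameter model of formulas (11)-(12)."""
    n = len(src)
    M = np.zeros((2 * n, 6))
    b = np.zeros(2 * n)
    M[0::2, 0:2] = src; M[0::2, 2] = 1.0; b[0::2] = dst[:, 0]
    M[1::2, 3:5] = src; M[1::2, 5] = 1.0; b[1::2] = dst[:, 1]
    h, *_ = np.linalg.lstsq(M, b, rcond=None)
    return h.reshape(2, 3)                 # [[h1, h2, h3], [h4, h5, h6]]

def ransac(src, dst, n_iter=50, thresh=3.0, seed=0):
    """Steps 1)-3): sample 4 pairs, grow the inlier set, keep the best."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(n_iter):
        idx = rng.choice(len(src), 4, replace=False)    # step 1: 4 random pairs
        A = fit_transform(src[idx], dst[idx])
        proj = src @ A[:, :2].T + A[:, 2]               # step 2: H·I for every I
        inliers = np.linalg.norm(proj - dst, axis=1) < thresh
        if inliers.sum() > best.sum():                  # step 3: keep largest set
            best = inliers
    return fit_transform(src[best], dst[best]), best    # final least-squares refit
```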
In the above process, suppose the final inlier matches form a fraction p of all initial SURF matches. Then the probability that four randomly drawn pairs are not all correct matches is 1 − p^4, and the probability that none of the N iterations draws four correct pairs is (1 − p^4)^N, so the probability P of obtaining a correct transformation matrix H is:
P = 1 − (1 − p^4)^N    (10)
In practical applications, to obtain the transformation matrix H with high probability while keeping the number of iterations N small, N is generally chosen between 10 and 20.
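Formula (10) directly gives the trade-off between the iteration count N and the success probability; a one-function sketch:

```python
def ransac_success_prob(p, n_iter):
    """Formula (10): probability that at least one of n_iter random
    4-pair samples consists entirely of correct matches."""
    return 1.0 - (1.0 - p ** 4) ** n_iter
```

For instance, with half the initial matches correct (p = 0.5), N = 20 iterations give a success probability of about 0.72, while doubling N pushes it above 0.92.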
Once the transformation matrix H has been obtained, mapping the four vertices of the target object in the template gives the approximate location of the object in the image. The computation is as follows:
x_i′ = h_1·x_i + h_2·y_i + h_3    (11)
y_i′ = h_4·x_i + h_5·y_i + h_6    (12)
(x_i′, y_i′) are the coordinates of the i-th vertex of the target object in the video image, and (x_i, y_i) are the coordinates of the i-th vertex of the target object in the template.
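Applying formulas (11) and (12) to the four template vertices is a single matrix operation. A sketch, assuming the six parameters are stored in the 2×3 layout [[h1, h2, h3], [h4, h5, h6]]:

```python
import numpy as np

def map_vertices(h, vertices):
    """Formulas (11)-(12): map template vertices (x_i, y_i) to (x_i', y_i')."""
    v = np.asarray(vertices, dtype=float)
    return v @ h[:, :2].T + h[:, 2]
```

An identity transform leaves the vertices unchanged; a pure translation (h3, h6) shifts the whole quadrilateral.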
Judging the end of occlusion. This part uses the same method as the judgment of occlusion onset, namely testing whether the object contours intersect. When the occlusion process ends, the pixel area occupied by the blob containing the target object in the binary image changes significantly, so formula (9) again determines the moment at which the occlusion ends.
When the occlusion has ended, the system reinitializes the Kalman filter according to the result of the preceding search-and-match, and tracking then continues through the repeated prediction and update of the Kalman filter.
e. Output and save the feature information of the target object.
According to the user's needs, the feature information of the target object is saved to the designated directory.
Significance of the invention:
The tracking of video objects is a key technology of intelligent video processing and a prerequisite for higher-level semantic operations such as behavior recognition, event recognition, and identity recognition. Fast and accurate tracking of video objects under complex conditions such as occlusion and scene change is a focus and difficulty of research at home and abroad. In recent years much research has addressed tracking under occlusion; although existing methods solve parts of the problem, no single method solves all of it well. For example, layered-image methods can handle tracking under occlusion, but their complexity is high and real-time requirements are hard to meet; trackers based on color or contour are difficult to initialize and their object models are hard to update, so they are difficult to use in real systems. Joint monitoring with multiple cameras in the same scene is currently a popular approach to the occlusion problem, but it remains far less mature than single-camera tracking, and both its cost and its complexity are high. Compared with occlusion, tracking across scene changes is even harder; reference material in this field is still scarce, and the main approach remains extending single-scene tracking algorithms to multi-scene tracking. The present invention, by contrast, needs no multiple cameras: a single camera suffices for fast tracking of moving objects against a fixed background, with high accuracy and robustness under both occlusion and scene change.
Special note: this work was supported by the National Natural Science Foundation of China (project 61170253) and the innovative research team program of the College of Information Science and Engineering, Shandong University of Science and Technology.
Technical details not described above are implemented by adopting or drawing on the prior art.
It should be noted that, under the teaching of this specification, those skilled in the art can make various straightforward variations, such as equivalents or obvious modifications; all such variations fall within the protection scope of the present invention.

Claims (1)

1. A moving-object tracking method for occlusion and scene change, characterized by comprising the following steps:
a. Perform foreground motion detection on the input video sequence and extract the moving objects; then go to step b.
b. If the features of the tracked object have been saved, go to step d; if the features of the tracked object have not been saved, use the region selected by the user to perform template initialization and SURF feature extraction for the target object, and initialize the Kalman filter; then go to step c.
c. Use a Kalman-filter-based method to predict and track the moving target until the video content ends, then go to step e; if occlusion occurs during tracking, go to step d.
d. Determine the tracked object with a matching process based on SURF features; once the feature matching has stabilized and the occlusion is judged to have ended, reinitialize the Kalman filter and return to step c.
e. Output and save the feature information of the target object.
In step a above, two reference frames I_bg(x, y) and I_up(x, y) are established: I_bg(x, y) is the background frame of the current scene, and I_up(x, y) is a reference frame that is continuously updated over time. The current frame I(x, y) is differenced and binarized against I_bg(x, y) and I_up(x, y) respectively, the results being denoted F_bg(x, y) and F_up(x, y); the values of the two distinguish abandoned objects from moving objects in the scene.
In step c above, the Kalman filter is first initialized, and prediction and tracking are then carried out according to the observed state of the target object. During tracking, the template image is updated adaptively according to contour changes of the target object, and representative feature information is saved; whether occlusion occurs is modeled, analyzed and judged with the contour-intersection method.
In step d above, the video content is searched automatically for the foreground blob with the largest number of feature-point matches against the tracked object; to handle erroneous matches caused by measurement error and noise, the RANSAC algorithm is applied after the SURF matching point pairs have been obtained, yielding the homography matrix of the transformation between the images and locating the target object in the video. Whether the occlusion has ended is judged with the same model as the above judgment of occlusion; the Kalman filter is reinitialized with the same method as its initialization above.
CN201310039754.3A 2013-02-01 2013-02-01 A kind of towards blocking the Moving Objects method for tracing with scene change Expired - Fee Related CN103106667B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310039754.3A CN103106667B (en) 2013-02-01 2013-02-01 A kind of towards blocking the Moving Objects method for tracing with scene change

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310039754.3A CN103106667B (en) 2013-02-01 2013-02-01 A kind of towards blocking the Moving Objects method for tracing with scene change

Publications (2)

Publication Number Publication Date
CN103106667A CN103106667A (en) 2013-05-15
CN103106667B true CN103106667B (en) 2016-01-20

Family

ID=48314494

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310039754.3A Expired - Fee Related CN103106667B (en) 2013-02-01 2013-02-01 A kind of towards blocking the Moving Objects method for tracing with scene change

Country Status (1)

Country Link
CN (1) CN103106667B (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103268480B (en) * 2013-05-30 2016-07-06 重庆大学 A kind of Visual Tracking System and method
CN103428408B (en) * 2013-07-18 2016-08-10 北京理工大学 A kind of image digital image stabilization method being applicable to interframe
CN104573613B (en) * 2013-10-16 2018-05-01 深圳市捷顺科技实业股份有限公司 A kind of Video security based on mass tracking is prevented pounding method and device
CN103985138A (en) * 2014-05-14 2014-08-13 苏州盛景空间信息技术有限公司 Long-sequence image SIFT feature point tracking algorithm based on Kalman filter
CN105469379B (en) * 2014-09-04 2020-07-28 广东中星微电子有限公司 Video target area shielding method and device
JP6655878B2 (en) * 2015-03-02 2020-03-04 キヤノン株式会社 Image recognition method and apparatus, program
US9860553B2 (en) * 2015-03-18 2018-01-02 Intel Corporation Local change detection in video
CN105139424B (en) * 2015-08-25 2019-01-18 四川九洲电器集团有限责任公司 Method for tracking target based on signal filtering
CN106228577B (en) * 2016-07-28 2019-02-19 西华大学 A kind of dynamic background modeling method and device, foreground detection method and device
CN107808393B (en) * 2017-09-28 2021-07-23 中冶华天南京电气工程技术有限公司 Target tracking method with anti-interference performance in intelligent video monitoring field
CN107993255B (en) * 2017-11-29 2021-11-19 哈尔滨工程大学 Dense optical flow estimation method based on convolutional neural network
CN110349182A (en) * 2018-04-07 2019-10-18 苏州竺星信息科技有限公司 A kind of personage's method for tracing based on video and positioning device
CN108734109B (en) * 2018-04-24 2020-11-17 中南民族大学 Visual target tracking method and system for image sequence
CN108921879A (en) * 2018-05-16 2018-11-30 中国地质大学(武汉) The motion target tracking method and system of CNN and Kalman filter based on regional choice
CN111325217B (en) * 2018-12-14 2024-02-06 京东科技信息技术有限公司 Data processing method, device, system and medium
CN110060276B (en) * 2019-04-18 2023-05-16 腾讯科技(深圳)有限公司 Object tracking method, tracking processing method, corresponding device and electronic equipment
CN113468931B (en) * 2020-03-31 2022-04-29 阿里巴巴集团控股有限公司 Data processing method and device, electronic equipment and storage medium
CN112085769A (en) * 2020-09-09 2020-12-15 武汉融氢科技有限公司 Object tracking method and device and electronic equipment
CN112287867B (en) * 2020-11-10 2021-06-08 上海依图网络科技有限公司 Multi-camera human body action recognition method and device
CN115082509B (en) * 2022-08-22 2022-11-04 成都大公博创信息技术有限公司 Method for tracking non-feature target

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102881022A (en) * 2012-07-20 2013-01-16 西安电子科技大学 Concealed-target tracking method based on on-line learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4915655B2 (en) * 2006-10-27 2012-04-11 パナソニック株式会社 Automatic tracking device
CN102034114A (en) * 2010-12-03 2011-04-27 天津工业大学 Characteristic point detection-based template matching tracing method

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102881022A (en) * 2012-07-20 2013-01-16 西安电子科技大学 Concealed-target tracking method based on on-line learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Research on Moving Object Detection and Tracking Based on SURF; Song Ning; China Master's Theses Full-text Database, Information Science and Technology; 2012-12-15; pp. 8, 31, 35, 37-38, 44-45 *
Research on Detection and Tracking Algorithms for Moving Objects in Video Sequences; Li Jie; China Master's Theses Full-text Database, Information Science and Technology; 2010-11-15; pp. 8, 10-11 *

Also Published As

Publication number Publication date
CN103106667A (en) 2013-05-15

Similar Documents

Publication Publication Date Title
CN103106667B (en) A kind of towards blocking the Moving Objects method for tracing with scene change
CN114782691B (en) Robot target identification and motion detection method based on deep learning, storage medium and equipment
CN101216941B (en) Motion estimation method under violent illumination variation based on corner matching and optic flow method
CN104851094A (en) Improved method of RGB-D-based SLAM algorithm
CN104200485A (en) Video-monitoring-oriented human body tracking method
KR101257207B1 (en) Method, apparatus and computer-readable recording medium for head tracking
CN105335986A (en) Characteristic matching and MeanShift algorithm-based target tracking method
CN103871076A (en) Moving object extraction method based on optical flow method and superpixel division
CN101408983A (en) Multi-object tracking method based on particle filtering and movable contour model
CN104794737A (en) Depth-information-aided particle filter tracking method
CN103279961A (en) Video segmentation method based on depth recovery and motion estimation
CN104915969A (en) Template matching tracking method based on particle swarm optimization
Qu et al. Evaluation of SIFT and SURF for vision based localization
CN105321189A (en) Complex environment target tracking method based on continuous adaptive mean shift multi-feature fusion
CN110531618B (en) Closed loop detection robot self-positioning error elimination method based on effective key frame
CN104881029A (en) Mobile robot navigation method based on one point RANSAC and FAST algorithm
CN103824305A (en) Improved Meanshift target tracking method
CN116449384A (en) Radar inertial tight coupling positioning mapping method based on solid-state laser radar
CN102289822A (en) Method for tracking moving target collaboratively by multiple cameras
CN107742306A (en) Moving Target Tracking Algorithm in a kind of intelligent vision
Iraei et al. Object tracking with occlusion handling using mean shift, Kalman filter and edge histogram
CN104866853A (en) Method for extracting behavior characteristics of multiple athletes in football match video
CN105913084A (en) Intensive track and DHOG-based ultrasonic heartbeat video image classifying method
CN102663812A (en) Direct method of three-dimensional motion detection and dense structure reconstruction based on variable optical flow
Zhao et al. 3D object tracking via boundary constrained region-based model

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160120

Termination date: 20190201