CN102970528A - Video object division method based on change detection and frame difference accumulation - Google Patents

Video object division method based on change detection and frame difference accumulation

Info

Publication number
CN102970528A
CN102970528A CN2012104024434A CN201210402443A
Authority
CN
China
Prior art keywords
frame
video
time domain
image
edge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012104024434A
Other languages
Chinese (zh)
Other versions
CN102970528B (en)
Inventor
Zhu Shiping (祝世平)
Gao Jie (高洁)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Xin Xiang Technology Co., Ltd.
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN201210402443.4A priority Critical patent/CN102970528B/en
Publication of CN102970528A publication Critical patent/CN102970528A/en
Application granted granted Critical
Publication of CN102970528B publication Critical patent/CN102970528B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a video object segmentation method based on change detection and frame difference accumulation. In the method, inter-frame changes between symmetric frames spaced k frames apart are detected by a t significance test; the detected initial motion-change regions are accumulated by temporal fixed-interval frame-difference accumulation and further integrated to form a memory mask. A Kirsch edge-detection operator based on a discontinuity-detection technique adjusts its threshold through an image-edge continuity test to obtain all connected edge information in the current frame. A semantic video object plane is then obtained through a spatio-temporal filter, and segmentation of the video object is completed by selectively applying filling and morphological processing operations. The method is a novel video object segmentation method: it effectively solves the problems of missing video-object interiors and exposed background caused by the irregular object motion that frequently troubles video object segmentation, and offers large improvements in segmentation speed, segmentation quality, range of application and portability.

Description

Video object segmentation method based on change detection and frame difference accumulation
Technical field
The present invention relates to a processing method for video object extraction, in particular to a video object segmentation method based on change detection and frame difference accumulation. While guaranteeing segmentation quality and speed, it removes the shortcoming of traditional change detection, which needs repeated experiments to obtain the frame-difference threshold, and it fully exploits the information in each pixel's neighbourhood, so the video object is obtained more accurately. Frame difference accumulation uses the information of a period of time and thus solves the local loss of moving objects brought about by irregular object motion. The Kirsch edge-detection operator based on a discontinuity-detection technique obtains the complete connected edge information of the current frame and binarizes it. Applied in combination, these three techniques make the method more practical and more general.
Background technology
Object-based video segmentation is the key to realizing the content-based coding and interactive functionality of MPEG-4. With the spread of the content-based coding techniques of the MPEG-4 standard, video object segmentation has become a focus of current research. At present, object-based video segmentation has very wide application prospects in fields such as video surveillance, human-computer interaction, military applications and communications. Although video object segmentation has been studied extensively in recent years, the problem is still not completely solved.
According to whether manual participation is needed in the segmentation process, video object segmentation methods are divided into automatic segmentation (see Thomas Meier, King N. Ngan. Automatic segmentation of moving objects for video object plane generation[J]. IEEE Transactions on Circuits and Systems for Video Technology, 1998, 8(5): 525-538) and semi-automatic segmentation (see Munchurl Kim, J.G. Jeon, J.S. Kwak, M.H. Lee, C. Ahn. Moving object segmentation in video sequences by user interaction and automatic object tracking[J]. Image and Vision Computing, 2001, 19(5): 245-260).
According to the information used, segmentation methods mainly comprise temporal segmentation, spatial segmentation and spatio-temporal fusion. Temporal segmentation methods exploit the temporal order of the video sequence and detect moving targets through the temporal changes between consecutive frames; common approaches are the frame-difference method and the background-subtraction method. The frame-difference method (see Zhan C H. An improved moving object detection algorithm based on frame difference and edge detection[A]. International Conference on Image and Graphics[C], 2007: 519-523) is simple to realize, has low programming complexity, is not overly sensitive to scene changes such as lighting, and adapts to various dynamic environments with good stability; however, it cannot extract the complete region of an object, only its boundary, and it depends on the time interval chosen for the frame difference. The background-subtraction method (see Olivier Barnich, Marc Van Droogenbroeck. A universal background subtraction algorithm for video sequences[J]. IEEE Transactions on Image Processing, 2011, 20(6): 1709-1723) is simple in principle and design; the threshold is determined from the actual conditions, and the result directly reflects the position, size and shape of the moving target, so fairly accurate moving-target information can be obtained. However, the computational cost of background updating is large, discrete noise points are easily produced, and the method is strongly affected by changes of external conditions such as lighting and weather. Optical flow (see Jareal A, Venkatesh K S. A new color based optical flow algorithm for environment mapping using a mobile robot[A]. IEEE International Symposium on Intelligent Control[C]. 2007: 567-572) carries the motion information of the target and can detect moving objects well even when the scene information is unknown, with high detection accuracy, but its computational cost is large and it generally cannot be applied to real-time processing. Spatial segmentation methods (see Jaime S. Cardoso, Jorge C.S. Cardoso, Luis Corte-Real. Object-based spatial segmentation of video guided by depth and motion information[A]. IEEE Workshop on Motion and Video Computing[C]. 2007) extract the video object using spatial attributes of the image such as colour, brightness, texture and edge information. Spatio-temporal fusion methods (Ahmed, R., Karmakar, G.C., Dooley, L.S. Incorporation of texture information for joint spatio-temporal probabilistic video object[A]. IEEE International Conference on Image Processing[C]. 2007, 6, 293-296) are currently the most common: they combine the temporal information and the spatial information of the video sequence, and the two kinds of information are fused to obtain relatively accurate segmentation results.
However, no matter which method is used to segment moving objects, exposed background and irregular object motion (the moving object, or some part of it, remaining static for a period of time) cause a decline in segmentation accuracy. In motion analysis, exposed background and the static foreground regions caused by irregular object motion are easily misdetected as foreground or background, respectively, degrading the segmentation accuracy.
Summary of the invention:
The present invention proposes a video object segmentation method based on change detection and frame difference accumulation. First, each video frame is smoothed by Gaussian filtering; a t significance test then detects the inter-frame changes between symmetric frames spaced k frames apart. The detected initial motion-change regions are accumulated by temporal fixed-interval frame-difference accumulation and further integrated into a memory mask. Next, a Kirsch edge-detection operator based on a discontinuity-detection technique adjusts its threshold through an image-edge continuity test, so that all connected edge information of the current frame is obtained; this reduces the residual noise in the memory mask and guarantees edge continuity while preserving low-intensity edge details well. A semantic video object plane is then obtained through a spatio-temporal filter, and finally filling and morphological processing operations are applied selectively to complete the segmentation of the video object. This is a new video object segmentation method: it effectively solves the problems, frequent in video object segmentation, of missing video-object interiors caused by irregular motion (the moving object, or some part of it, remaining static for a period of time) and of exposed background, and its segmentation speed, segmentation quality, range of application and portability are all greatly improved.
The technical problems to be solved by the present invention are:
1. When the change region of a moving object is obtained directly by inter-frame differencing, the threshold of the frame-difference method must be obtained through many experiments; the method is sensitive to noise and lighting, and the extracted moving region suffers serious gaps;
2. The occlusion (covered/exposed) problem produced by irregular video-object motion (the moving object, or some part of it, remaining static for a period of time);
3. The object boundary obtained by the Kirsch edge-detection operator is discontinuous.
The technical solution adopted by the present invention to solve these problems is a video object segmentation method based on change detection and frame difference accumulation, comprising the following steps:
Step 1: Smooth each frame of the video sequence by Gaussian filtering, and use a t significance test to detect the inter-frame changes between symmetric frames spaced k frames apart, obtaining the initial motion-change region of each frame; AND the change regions of the two symmetric frame distances to obtain the complete motion-change region; then perform temporal fixed-interval frame-difference accumulation to obtain the effective template of each time segment, and integrate further to form the memory mask, completing the temporal segmentation of the video object;
Step 2: Perform edge detection on each frame of the original video with the improved Kirsch edge-detection operator, i.e. the Kirsch operator based on the discontinuity-detection technique, and binarize the edge-detection result, completing the spatial segmentation of the video object;
Step 3: In a parallel spatio-temporal fusion manner, AND the segment memory mask formed in step 1 with the binarized edge-detection result of each video frame obtained in step 2 to extract the exact boundary contour of the moving object; according to the boundary information, selectively apply morphological opening/closing and filling to complete the extraction of the video object.
Compared with the prior art, the advantages of the present invention are:
1. The method uses a t-distribution significance test to detect inter-frame changes, so the noise variance of the video need not be known and the estimation of noise parameters is avoided; no manual experiments are needed to obtain the threshold of the frame difference image, since the optimal threshold can be looked up in the t distribution table. The statistical change detection of hypothesis testing suppresses the influence of camera noise on the segmentation result well, and the segmentation result is markedly better than that obtained by threshold segmentation.
2. By using images spaced k frames apart, the method handles slowly moving video objects better. The concept of the memory mask MT (Memory Template) is proposed, and the memory template is obtained by temporal fixed-interval frame-difference accumulation, effectively solving the problem of missing moving-region boundaries.
3. Edge lines obtained directly by Kirsch edge detection are prone to breakpoints, and the effect is not ideal. The method uses 4 × 4 directional templates to compute the differences around a target point in 6 directions; when the maximum difference exceeds a certain threshold the point is regarded as a discontinuity point, and the discontinuity of the image edge is thereby detected. In this way all connected edge information of the current frame is obtained; residual noise in the memory mask is reduced and edge continuity is guaranteed while low-intensity edge details are preserved well.
Description of drawings:
Fig. 1 is the flow chart of the video object segmentation method based on change detection and frame difference accumulation of the present invention.
Fig. 2 shows the change-detection and frame-difference-accumulation results for the Akiyo video sequence: (a) the 5th frame of the Akiyo sequence; (b) the 21st frame of the Akiyo sequence; (c) the initial motion-change region of (a) obtained by t significance test detection; (d) the initial motion-change region of (b) obtained by t significance test detection; (e) the complete motion-change region of (c) after temporal fixed-interval frame-difference accumulation; (f) the complete motion-change region of (d) after temporal fixed-interval frame-difference accumulation.
Fig. 3 shows the memory masks of the Akiyo video sequence: (a) the first and second memory masks of the Akiyo sequence.
Fig. 4 shows the VOP extraction results for the Akiyo video sequence: (a) the 5th frame of the Akiyo sequence; (b) the 21st frame of the Akiyo sequence; (c) the VOP extracted from (a); (d) the VOP extracted from (b).
Fig. 5 shows the VOPs extracted from the Grandma video sequence by the method of the invention: (a) the 4th frame of the Grandma sequence; (b) the VOP extracted from (a); (c) the 19th frame of the Grandma sequence; (d) the VOP extracted from (c).
Fig. 6 shows the VOPs extracted from the Claire video sequence by the method of the invention: (a) the 8th frame of the Claire sequence; (b) the VOP extracted from (a); (c) the 16th frame of the Claire sequence; (d) the VOP extracted from (c).
Fig. 7 shows the VOPs extracted from the Miss-American video sequence by the method of the invention: (a) the 20th frame of the Miss-American sequence; (b) the VOP extracted from (a); (c) the 40th frame of the Miss-American sequence; (d) the VOP extracted from (c).
Fig. 8 shows the VOPs extracted from the Mother and daughter and Hall monitor video sequences by the method of the invention: (a) the 15th frame of the Mother and daughter sequence; (b) the VOP extracted from (a); (c) the 70th frame of the Hall monitor sequence; (d) the VOP extracted from (c).
Fig. 9 compares the spatial accuracy of the first 20 frames of the Grandma and Miss-American video sequences segmented by the present method and by the reference method (Zhu Shiping, Ma Li, Hou Yangshuan. Video object segmentation algorithm based on temporal fixed-interval memory compensation[J]. Optoelectronics · Laser, 2010, 21(8): 1241-1246): (a) the spatial accuracy comparison for the Grandma sequence; (b) the spatial accuracy comparison for the Miss-American sequence. Spatial accuracy 1 denotes the result of the present method, and spatial accuracy 2 the result of the reference method.
Embodiment:
The present invention is described in further detail below in conjunction with the drawings and specific embodiments.
Fig. 1 is the flow chart of the video object segmentation method based on change detection and frame difference accumulation of the present invention; the method comprises the following steps:
Step 1: Smooth each frame of the video sequence by Gaussian filtering, and use a t significance test to detect the inter-frame changes between symmetric frames spaced k frames apart, obtaining the initial motion-change region of each frame; AND the change regions of the two symmetric frame distances to obtain the complete motion-change region; then perform temporal fixed-interval frame-difference accumulation to obtain the effective template of each time segment, and integrate further to form the memory mask, completing the temporal segmentation of the video object.
Let F(n) denote the n-th frame of the image sequence. The difference image between F(n) and F(n−k) contains the video object in F(n) plus the background region exposed by the object's motion; the frame-difference mask between F(n+k) and F(n) contains the video object in F(n) plus the background region covered in F(n+k) by the object's motion. Change detection is then applied to the video images using a t-distribution-based change-detection technique: the values of the changed and unchanged regions of the frame difference image are analysed statistically to obtain the initial change-detection mask. The t-distribution-based technique avoids obtaining the segmentation threshold through many experiments and takes full account of the information in each pixel's neighbourhood, so the decisions obtained are more accurate. Inside the initial motion-change region there remain scattered noise regions and interior holes, caused by motion details lost through the false-alarm probability of the hypothesis test and by the lack of texture detail inside the moving object.
When performing temporal segmentation, if part of the target does not move noticeably it is difficult to find a moving region that contains the complete video object, so the frame differences at the symmetric frame distance are accumulated first.
Let F_n(x, y) denote the n-th frame of the video sequence after grey-scale conversion, and G_n(x, y) its result after Gaussian smoothing. The noise in each frame of the video sequence is denoted N_n(x, y), with variance σ_N². The n-th frame G_n(x, y) of the video sequence can then be expressed as:
G_n(x, y) = Ḡ_n(x, y) + N_n(x, y)
where Ḡ_n(x, y) is the true value of the video image. From this formula the difference image follows:
FD(x, y) = Ḡ_n(x, y) − Ḡ_{n−k}(x, y) + N_n(x, y) − N_{n−k}(x, y)
Let D(x, y) = N_n(x, y) − N_{n−k}(x, y), where N_n(x, y) and N_{n−k}(x, y) are mutually independent random variables with identical probability densities; then D(x, y) is still an additive zero-mean Gaussian random variable, with variance 2σ_N².
Because the noise at each pixel is independent, if all non-zero frame differences inside a window are caused by noise alone, the mean μ of these values should be zero; a hypothesis test is therefore carried out according to probability theory. The null hypothesis H_0 that position (x, y) belongs to the background is H_0: μ = 0. Since the noise variance is unknown, a t significance test is used, and the test statistic t is constructed from the pixels in a neighbourhood window of p pixels:
t = A_d(n) / (s / √p)
where A_d(n) and s are the sample mean and sample standard deviation in the neighbourhood window:
A_d(n) = (1/p) Σ_{l=−n}^{n} Σ_{m=−n}^{n} |FD(x + l, y + m)|
s² = (1/(p − 1)) Σ_{l=−n}^{n} Σ_{m=−n}^{n} (FD(x + l, y + m) − A_d(n))²
According to significance-test theory, the threshold is determined by the given significance level α and the distribution obeyed by t:
|t| ≥ t_{α/2}(p − 1)
The choice of the significance level α depends on the camera-noise intensity of the particular video sequence; its value is usually set to 10⁻², 10⁻⁶, etc. (here α = 10⁻² and a 5 × 5 window are chosen), which gives good results. If, for the set significance level α, the inequality above holds, the central pixel belongs to the changed region m(n).
The initial motion-change region detected between a pair of symmetric frames is recorded as a binary mask. AND-ing the change regions of the two symmetric frame distances includes the parts of the video object that move only slightly, giving the complete motion-change region:
m(n) = m_{n−k,n}(x, y) ∩ m_{n,n+k}(x, y)
where m_{n−k,n} and m_{n,n+k} are the change masks detected between frames n−k and n and between frames n and n+k, respectively.
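As a rough illustration of this decision rule, the following sketch flags a pixel as changed when the windowed t statistic exceeds the critical value. It assumes greyscale frames as NumPy arrays; the function name and the hard-coded critical value t_{0.005}(24) ≈ 2.797 (two-sided α = 10⁻², 5 × 5 window, so p − 1 = 24 degrees of freedom) are illustrative, not taken from the patent.

```python
import numpy as np

def change_mask(frame_a, frame_b, win=5, t_crit=2.797):
    """Windowed t significance test on the absolute frame difference.
    t_crit defaults to t_{alpha/2}(p - 1) for alpha = 0.01 and p = 25,
    i.e. t_{0.005}(24) ~ 2.797 from a t distribution table."""
    fd = np.abs(frame_a.astype(np.float64) - frame_b.astype(np.float64))
    p = win * win
    pad = win // 2
    padded = np.pad(fd, pad, mode="edge")
    mask = np.zeros(fd.shape, dtype=np.uint8)
    for y in range(fd.shape[0]):
        for x in range(fd.shape[1]):
            window = padded[y:y + win, x:x + win]
            a_d = window.mean()         # sample mean A_d(n)
            s = window.std(ddof=1)      # sample standard deviation
            if s == 0:                  # perfectly uniform window: keep H0
                continue
            t = a_d / (s / np.sqrt(p))
            if abs(t) >= t_crit:        # reject H0: pixel has changed
                mask[y, x] = 255
    return mask
```

A perfectly uniform window has zero sample variance, so the sketch simply keeps the null hypothesis there; real camera noise makes that case rare.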
If the interior texture of the moving target is highly uniform, or the target is wholly or partly static or moving slowly for some time, the change-detection method above cannot detect the complete moving region; an accurate boundary contour of the moving object then cannot be obtained at spatio-temporal fusion, and the target is partly missing in the final extraction.
For this situation, the temporal fixed-interval frame-difference accumulation method is used here; it effectively solves the problem of missing video-object boundaries. The method exploits the correlation of the video frames in the time domain and the continuity of target motion, and considers how many times each pixel appears within a certain period: within a preset time segment, the parts that appear frequently are taken as the effective template for the whole motion segment. The motion continuity of the video object over a period of time is thus taken fully into account; that is, the temporal information over the whole segment is fully exploited.
Suppose a preset time segment contains L frames of video, whose change-detection mask images are M_1(x, y), M_2(x, y), …, M_{L−1}(x, y), M_L(x, y). The effective mask EM (effective mask) corresponding to this segment has the same size as each video frame:
EM(x, y) = 255, if T ≥ τ;  0, if T < τ
where T = n_i / L and n_i is the number of times point (x, y) is marked as a motion point in the L mask images; τ is a preset proportion threshold. Different proportion thresholds τ are chosen for different video sequences: for video with fast motion and large motion amplitude, a larger value can be chosen; conversely, for video with slow motion and small amplitude, a smaller value should be selected.
For any pixel (x, y), if EM(x, y) = 0 no frame-difference accumulation is computed; if EM(x, y) = 255, temporal fixed-interval frame-difference accumulation is carried out.
After the frame-difference accumulation, the pixel value of the corresponding point in every video frame of the corresponding time segment is set to 255, i.e.
F_1(x, y) = F_2(x, y) = … = F_L(x, y) = 255
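The accumulation rule EM(x, y) = 255 when T = n_i / L ≥ τ can be sketched as follows. This is a minimal NumPy version; the function name and mask layout are illustrative, not from the patent.

```python
import numpy as np

def effective_mask(masks, tau):
    """Temporal fixed-interval accumulation of L change-detection masks:
    a pixel enters the effective mask when the fraction of frames in
    which it was marked as a motion point, T = n_i / L, reaches tau."""
    stack = np.stack([np.asarray(m) > 0 for m in masks])  # L binary masks
    freq = stack.sum(axis=0) / len(masks)                 # T per pixel
    return np.where(freq >= tau, 255, 0).astype(np.uint8)
```

A pixel marked as moving in only one of many masks is discarded as noise, while a pixel that recurs across the segment survives even if any single frame difference missed it.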
Fig. 2 shows the results of t-distribution change detection and fixed-interval frame-difference accumulation for Akiyo. In the test, the symmetric frame distance was k = 2 and the proportion threshold of the frame-difference accumulation was τ = 2/12.
As can be seen from Fig. 2, the moving region of the frame difference image obtained by change detection contains many holes and the moving target is incomplete; after frame-difference accumulation the result improves considerably: not only is the complete boundary contour obtained, but the holes inside the moving region are also filled rather well.
After temporal fixed-interval frame-difference accumulation, the result is greatly improved over the initial t-distribution change detection, but holes still exist inside the video object. The concept of the memory mask MT (Memory Template) is therefore proposed here: the masks obtained by frame-difference accumulation are subjected to morphological processing and filling to obtain complete video-object masks, yielding N/L memory masks in total.
Opening and closing are important operations in morphology; they are formed by cascading dilation and erosion.
Grey-scale dilation and grey-scale erosion can be regarded as image-filtering operations in which a structuring element dilates and erodes the signal; they are defined as follows:
(A ⊕ B)(s, t) = max{ A(s − x, t − y) + B(x, y) | (s − x, t − y) ∈ D_A; (x, y) ∈ D_B }
(A Θ B)(s, t) = min{ A(s + x, t + y) − B(x, y) | (s + x, t + y) ∈ D_A; (x, y) ∈ D_B }
where D_A and D_B are the domains of definition of A and B respectively, and B is the square structuring element used for the reconstruction operation.
Opening generally smooths the contour of an image and breaks narrow isthmuses and burrs on the contour. Closing also makes contour lines smoother, but in contrast to opening it usually eliminates narrow gulfs and long thin gaps, removes small holes, and fills breaks in the contour line.
The following two formulas express morphological opening and closing, respectively:
A ∘ B = (A Θ B) ⊕ B
A • B = (A ⊕ B) Θ B
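The dilation/erosion cascades above can be sketched with a flat square structuring element. This is a simplification: for a flat element the grey-scale add/subtract of B(x, y) reduces to a moving maximum/minimum over each neighbourhood; function names are illustrative, not from the patent.

```python
import numpy as np

def dilate(a, size=3):
    """Grey-scale dilation by a flat size x size square structuring
    element: a moving maximum over each neighbourhood."""
    pad = size // 2
    p = np.pad(a, pad, mode="edge")
    out = np.empty_like(a)
    for y in range(a.shape[0]):
        for x in range(a.shape[1]):
            out[y, x] = p[y:y + size, x:x + size].max()
    return out

def erode(a, size=3):
    """Grey-scale erosion by a flat square element: a moving minimum."""
    pad = size // 2
    p = np.pad(a, pad, mode="edge")
    out = np.empty_like(a)
    for y in range(a.shape[0]):
        for x in range(a.shape[1]):
            out[y, x] = p[y:y + size, x:x + size].min()
    return out

def opening(a):
    """A o B = (A eroded by B) dilated by B: removes burrs/isolated points."""
    return dilate(erode(a))

def closing(a):
    """A . B = (A dilated by B) eroded by B: fills small holes and breaks."""
    return erode(dilate(a))
```

Opening deletes structures smaller than the element, closing fills holes smaller than it, which matches the contour-smoothing behaviour described above.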
The moving-object template is filled from its boundary in the following steps:
1. Horizontal filling: traverse the whole moving-object template, find the first and last boundary points in every row, and mark all pixels between these two points as interior points of the moving object;
2. Vertical filling: traverse the whole moving-object template, find the first and last boundary points in every column, and mark all pixels between these two points as interior points of the moving object;
3. Take the intersection of the horizontal and vertical filling results, finally obtaining the completely filled moving-object template.
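The three filling steps above can be sketched directly, assuming a binary template as a NumPy array (the function name is illustrative, not from the patent):

```python
import numpy as np

def fill_template(mask):
    """Fill a moving-object template from its boundary: mark every pixel
    between the first and last boundary point of each row (horizontal
    fill), likewise for each column (vertical fill), then intersect."""
    binary = mask > 0
    h_fill = np.zeros_like(binary)
    for y in range(binary.shape[0]):
        cols = np.flatnonzero(binary[y])
        if cols.size:
            h_fill[y, cols[0]:cols[-1] + 1] = True
    v_fill = np.zeros_like(binary)
    for x in range(binary.shape[1]):
        rows = np.flatnonzero(binary[:, x])
        if rows.size:
            v_fill[rows[0]:rows[-1] + 1, x] = True
    return np.where(h_fill & v_fill, 255, 0).astype(np.uint8)
```

The intersection in step 3 is what keeps concavities of the object from being over-filled by either pass alone.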
Because opening and closing increase the amount of computation, an MT with a smooth contour need not be morphologically processed; filling alone suffices. The mask contours in Fig. 2 are continuous and smooth and can be filled directly; the filling results are listed in Fig. 3.
Step 2: Perform edge detection on each frame of the original video with the improved Kirsch edge-detection operator, i.e. the Kirsch operator based on the discontinuity-detection technique, and binarize the edge-detection result, completing the spatial segmentation of the video object.
In edge detection, some important edge details become blurred and faint because of interference or insufficient contrast, and edge lines obtained directly by Kirsch edge detection are prone to breakpoints, so the effect is not ideal. Here the threshold is adjusted using an image-edge continuity test, so that connected image edges are obtained. At a discontinuity of an edge the pixel values usually differ considerably, so 4 × 4 directional templates are used to compute the differences around a target point in 6 directions; when the maximum difference exceeds a certain threshold the point is considered a discontinuity point, and the discontinuity of the image edge is thereby detected. This method suppresses noise and guarantees edge continuity while preserving low-intensity edge details well, giving satisfactory results. The threshold determines the accuracy of edge localisation and edge continuity. The result of edge detection, filling and binarization of the original video sequence is denoted M_e.
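For reference, the classical Kirsch compass operator underlying this step can be sketched as follows: the eight 3 × 3 directional templates with a fixed threshold. The patent's 4 × 4 discontinuity templates and continuity-driven threshold adjustment are not reproduced here, and the threshold value is illustrative.

```python
import numpy as np

# The eight classical Kirsch compass templates.
KIRSCH = [np.array(k) for k in (
    [[ 5,  5,  5], [-3, 0, -3], [-3, -3, -3]],
    [[ 5,  5, -3], [ 5, 0, -3], [-3, -3, -3]],
    [[ 5, -3, -3], [ 5, 0, -3], [ 5, -3, -3]],
    [[-3, -3, -3], [ 5, 0, -3], [ 5,  5, -3]],
    [[-3, -3, -3], [-3, 0, -3], [ 5,  5,  5]],
    [[-3, -3, -3], [-3, 0,  5], [-3,  5,  5]],
    [[-3, -3,  5], [-3, 0,  5], [-3, -3,  5]],
    [[-3,  5,  5], [-3, 0,  5], [-3, -3, -3]],
)]

def kirsch_edges(img, thresh):
    """Binary edge map: maximum response over the 8 Kirsch compass
    templates at each pixel, thresholded."""
    img = img.astype(np.float64)
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.uint8)
    for y in range(h):
        for x in range(w):
            win = padded[y:y + 3, x:x + 3]
            resp = max(float((win * k).sum()) for k in KIRSCH)
            if resp >= thresh:
                out[y, x] = 255
    return out
```

Because each template's weights sum to zero, uniform regions give zero response; only intensity steps produce edge points, which is why a fixed threshold alone tends to break faint edges, motivating the continuity test above.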
Step 3: In a parallel spatio-temporal fusion manner, AND the segment memory mask formed in step 1 with the binarized edge-detection result of each video frame obtained in step 2 to extract the exact boundary contour of the moving object; according to the boundary information, selectively apply morphological opening/closing and filling to complete the extraction of the video object.
The N/L temporal memory masks MT are each fused with the spatial binarized edge-detection results M_e to extract the binary moving-object template:
B(x, y) = MT(x, y) ∩ M_e(x, y)
If B(x, y) = 255, the point is finally labelled foreground; otherwise it is labelled background. With this fusion, the occlusion regions produced in the memory template by the motion of the video object are clearly weeded out by the constraint of the boundary. Finally, combining with the original video sequence V_O(x, y), the segmentation of the video object is completed:
VO(x, y) = V_O(x, y), if B(x, y) = 255;  255, if B(x, y) = 0
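The fusion B = MT ∩ M_e and the final cut-out can be sketched as below (illustrative names; the background is set to white, 255, as in the formula above):

```python
import numpy as np

def extract_vop(frame, mt, edge_mask):
    """Fuse the temporal memory template MT with the spatial binarized
    edge mask M_e, then cut the VOP out of the original frame; pixels
    outside the fused template B are set to white (255)."""
    b = (mt > 0) & (edge_mask > 0)          # B = MT AND M_e
    return np.where(b, frame, 255).astype(np.uint8)
```

The AND is what lets the spatial edge map trim away the covered/exposed regions that survive in the temporal template.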
To illustrate the validity of the method, the standard video test sequences "Akiyo", "Grandma", "Claire", "Miss-American", "Mother and daughter" and "Hall monitor" were selected as experimental subjects. All six test videos are in QCIF format, 176 × 144 pixels. The experimental results show that the method achieves good segmentation on different types of video sequences.
The method was implemented in the C language on a Core™ 2 Duo E6300 at 1.86 GHz with 2 GB of memory, programmed in the Visual C++ 6.0 development environment.
To better reflect the correctness of this method, the accuracy evaluation proposed by Wollborn et al. in the MPEG-4 core experiments is adopted. The spatial-accuracy evaluation defines the spatial accuracy SA (Spatial Accuracy) of the segmented object mask of each frame. The segmentation accuracy can then be computed by the following formula.
SA = 1 − |I_e − I_r| / |I_r|
In the formula, I_e and I_r denote, respectively, the object template obtained by the actual segmentation method and the reference segmentation for frame t. The spatial accuracy reflects the shape similarity between the segmentation result of each frame and the reference segmentation template: the larger SA is, the more accurate the segmentation; the smaller SA is, the less accurate the segmentation.
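Under one common reading of this measure (mismatched pixels of the two binary masks, normalized by the reference area), the spatial accuracy can be sketched as follows; the function name and the 0/255 mask convention are illustrative assumptions:

```python
import numpy as np

def spatial_accuracy(est, ref):
    """SA = 1 - |est XOR ref| / |ref| for binary 0/255 masks;
    1.0 means the segmentation matches the reference exactly."""
    mismatch = np.count_nonzero(est != ref)   # pixels labeled differently
    ref_area = np.count_nonzero(ref)          # size of the reference object
    return 1.0 - mismatch / ref_area
```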
Tables 1 and 2 list the spatial-accuracy comparison between this method and the reference method (Zhu Shiping, Ma Li, Hou Yangshuan. Video object segmentation algorithm based on time-domain fixed-interval memory compensation [J]. Journal of Optoelectronics·Laser, 2010, 21(8): 1241-1246.) for the first 20 frames of the Grandma and Miss-American video sequences. The comparison shows that the spatial accuracy of this method is consistently better than that of the reference method.
Table 1. Spatial-accuracy comparison for the first 20 frames of Grandma using this method and the reference method
Table 2. Spatial-accuracy comparison for the first 20 frames of Miss-American using this method and the reference method.

Claims (4)

1. A video object segmentation method based on change detection and frame-difference accumulation, characterized in that: the temporal segmentation uses a t significance test to detect inter-frame changes, so no threshold needs to be set from tedious experimental data; the optimal threshold is obtained by looking up the t-distribution table, and the noise variance of the video need not be known, thus avoiding the estimation of noise parameters; in the frame-difference accumulation stage, the concepts of the effective template and the memory mask, together with their uses and formation methods, are proposed; the spatial segmentation uses an improved Kirsch edge-detection operator, namely a Kirsch operator based on a discontinuity-detection technique, to obtain complete, fine, connected edges. The concrete steps of the video object segmentation method are as follows:
Step 1: each frame of the video sequence is smoothed by Gaussian filtering; a t significance test is used to detect the inter-frame changes of symmetric frames spaced k frames apart, obtaining the initial motion-change region of each frame; the detected initial motion-change regions are combined with an AND operation to obtain the complete motion-change region; time-domain fixed-interval frame-difference accumulation is then performed to obtain the effective template of each time segment, which is further integrated to form the memory mask, completing the temporal segmentation of the video object;
Step 2: an improved Kirsch edge-detection operator, namely a Kirsch operator based on a discontinuity-detection technique, is applied to each frame of the original video for edge detection; the edge-detection result is binarized, completing the spatial segmentation of the video object;
Step 3: in a parallel temporal-spatial fusion mode, the segmentation memory masks formed in step 1 and the binarized edge-detection results obtained in step 2 for each frame of the video sequence are combined with an AND operation to extract the exact boundary contour of the moving object; selective morphological opening/closing and filling operations are then performed according to the boundary information to complete the extraction of the video object.
2. The video object segmentation method based on change detection and frame-difference accumulation according to claim 1, characterized in that the temporal motion detection of step 1 is as follows: frame differencing is first performed on the gray-level images of symmetric frames spaced k apart, the initial motion-change region is then obtained by t significance-test detection, time-domain fixed-interval frame-difference accumulation is then performed, and the results are further integrated to form the memory template. The concrete steps are as follows:
(1) Let the n-th frame of the video sequence after graying be F_n(x, y), and let it be G_n(x, y) after Gaussian smoothing.
(2) The noise in each frame of the video sequence is denoted N_n(x, y), with variance σ_n².
Then the n-th gray-level frame G_n(x, y) of the video sequence can be expressed as:
G_n(x, y) = Ḡ_n(x, y) + N_n(x, y)
where Ḡ_n(x, y) is the true value of the video image. From the above, the difference image is:
FD(x, y) = Ḡ_n(x, y) − Ḡ_{n−k}(x, y) + N_n(x, y) − N_{n−k}(x, y)
Let D(x, y) = N_n(x, y) − N_{n−k}(x, y), where N_n(x, y) and N_{n−k}(x, y) are mutually independent random variables with identical probability densities; D(x, y) is therefore still an additive zero-mean Gaussian random variable, with variance 2σ_n².
Because the noise at each pixel is mutually independent, if all the non-zero frame differences within the window are caused by noise, the mean μ of these values should be zero. A hypothesis test is therefore performed according to probability theory, with the null hypothesis H_0 that position (x, y) is background: H_0: μ = 0. Since the noise variance is unknown, a t significance test is used, and the test statistic t is constructed from the pixels in the neighborhood window:
t = A_d(n) / (s / √p)
where A_d(n) and s are, respectively, the sample mean and sample standard deviation in the neighborhood window of p pixels:

A_d(n) = (1/p) Σ_{l=−n}^{n} Σ_{m=−n}^{n} |FD(x+l, y+m)|

s² = (1/(p−1)) Σ_{l=−n}^{n} Σ_{m=−n}^{n} (FD(x+l, y+m) − A_d(n))²
According to significance-test theory, the threshold is determined by the given significance level α and the distribution obeyed by t:
|t| ≥ t_{α/2}(p − 1)
The choice of the significance level α is related to the camera-noise intensity of the specific video sequence. For the set significance level α, if |t| ≥ t_{α/2}(p − 1) holds, the central pixel is judged to belong to the changed region m(n).
The initial motion-change region can therefore be expressed as:

m(x, y) = 255, if |t| ≥ t_{α/2}(p − 1); m(x, y) = 0, otherwise
The change regions obtained from the two symmetric frame differences are combined with an AND operation, so that weakly moving parts of the video object are included, giving the complete motion-change region:

M_n(x, y) = m_{n−k,n}(x, y) ∩ m_{n,n+k}(x, y)
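The change test of steps (1) and (2) can be sketched as below. The window size and critical value are illustrative assumptions: t_crit plays the role of t_{α/2}(p − 1) and would normally be looked up in a t-distribution table (for a 5 × 5 window, p = 25 and t_{0.025}(24) ≈ 2.06).

```python
import numpy as np

def t_test_change_mask(g_n, g_nk, win=2, t_crit=2.06):
    """Mark pixels whose neighborhood frame difference rejects H0: mu = 0."""
    fd = np.abs(g_n.astype(np.float64) - g_nk.astype(np.float64))
    h, w = fd.shape
    p = (2 * win + 1) ** 2                      # samples in the window
    mask = np.zeros((h, w), dtype=np.uint8)
    for y in range(win, h - win):
        for x in range(win, w - win):
            window = fd[y - win:y + win + 1, x - win:x + win + 1]
            a_d = window.mean()                 # sample mean A_d(n)
            s = window.std(ddof=1)              # sample standard deviation s
            if s == 0:
                changed = a_d > 0               # uniform nonzero difference
            else:
                changed = a_d / (s / np.sqrt(p)) >= t_crit
            if changed:
                mask[y, x] = 255
    return mask
```

Running this on the backward and forward symmetric differences of frame n and ANDing the two masks gives the complete motion-change region described above.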
(3) Time-domain fixed-interval frame-difference accumulation: for video objects whose interior texture is highly consistent over a certain time period, or video objects that are static or move slowly, the change-detection method of steps (1) and (2) alone cannot detect the complete moving region; an accurate boundary contour of the moving object then cannot be obtained during spatio-temporal filtering, causing local loss of the target in the final video object extraction.
Time-domain fixed-interval frame-difference accumulation effectively solves this local-loss problem. The method exploits the correlation of the video frames in the time domain and the continuity of the target motion: within a given time period it counts the number of times each pixel is detected, and takes the parts that appear frequently over the whole motion segment as the effective template. The motion continuity of the video object over the period is thus fully considered, that is, the temporal information of the whole segment is fully exploited.
Let a time segment contain L frames of video, whose change-detection masks are M_1, M_2, ..., M_{L−1}, M_L. The effective template EM (effective mask) corresponding to this segment has the same size as each video frame:
EM(x, y) = 255, if T ≥ τ
EM(x, y) = 0,   if T < τ
where T = n_i / L and n_i is the number of times point (x, y) is marked as a moving point in the L mask images M_1(x, y), M_2(x, y), ..., M_{L−1}(x, y), M_L(x, y) obtained by change detection, and τ is the set proportion threshold. Different proportion thresholds τ are chosen for different video sequences: for video with fast motion and large motion amplitude a larger value can be chosen; conversely, for video with slow motion and small amplitude a smaller value should be chosen.
For any pixel (x, y), if EM(x, y) is 0, frame-difference accumulation is not performed; if EM(x, y) is 255, time-domain fixed-interval frame-difference accumulation is performed.
After frame-difference accumulation, the pixel values of the corresponding points in every video frame of the corresponding time period are set to 255, namely:

F_1(x, y) = F_2(x, y) = ... = F_m(x, y) = 255
(4) After time-domain fixed-interval frame-difference accumulation, although the result is greatly improved over the initial t-test change detection, holes still remain inside the video object. The concept of the memory mask MT (Memory Template) is therefore proposed: the mask obtained by frame-difference accumulation is subjected to morphological processing and filling to obtain a complete video object mask, yielding N/L memory masks in total.
Since opening and closing operations increase the amount of computation, an MT with a relatively smooth contour need not be morphologically processed; filling alone is sufficient.
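The proportion test of step (3) can be sketched as follows; this is an illustrative sketch assuming the change masks are 0/255 NumPy arrays, with the hole filling and morphology of step (4) omitted:

```python
import numpy as np

def effective_template(masks, tau):
    """Effective mask EM over one time segment of L change masks:
    EM(x, y) = 255 where T = n_i / L >= tau, else 0."""
    stack = np.stack(masks)                 # shape (L, H, W), values 0/255
    ratio = (stack == 255).mean(axis=0)     # T = n_i / L for every pixel
    return np.where(ratio >= tau, 255, 0).astype(np.uint8)
```

The memory mask MT of step (4) would then be obtained by filling the holes of EM (and by a morphological open/close when the contour is rough).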
3. The video object segmentation method based on change detection and frame-difference accumulation according to claim 1, characterized in that the improved Kirsch edge-detection operator of step 2, namely the Kirsch operator based on the discontinuity-detection technique, performs edge detection by the following concrete steps:
(1) The traditional Kirsch edge-detection operator is applied to obtain the initial edge image of each frame of the video sequence. During edge detection, some important edge details are blurred or faint because of interference or insufficient contrast.
(2) Edge lines obtained directly by Kirsch edge detection are prone to breakpoints, and the effect is unsatisfactory. This method uses a 4 × 4 directional template to compute the differences around the target point in 6 directions; when the maximum difference exceeds a certain threshold, the point is judged to be a discontinuity point, thereby detecting the discontinuity of the image edge and obtaining all the connected edge information in the current frame. This reduces the residual noise in the memory mask while ensuring edge continuity and preserving low-intensity edge details. M_e is obtained after binarization.
4. The video object segmentation method based on change detection and frame-difference accumulation according to claim 1, characterized in that the spatio-temporal filtering of step 3 obtains the complete semantic video object by the following concrete steps:
(1) The N/L time-domain memory masks MT are each fused with the corresponding spatial binarized edge-detection result M_e to extract the binary moving-object template:
B(x, y) = MT(x, y) ∩ M_e(x, y)
If the corresponding B(x, y) is 255, the point is finally labeled as foreground; otherwise it is labeled as background.
(2) With this fusion scheme, the constraint of the boundary clearly weeds out from the memory mask the exposed-background regions produced by the motion of the video object. Finally, combined with the original video sequence V_O(x, y), the segmentation of the video object is completed:
VO(x, y) = V_O(x, y), if B(x, y) = 255
VO(x, y) = 255,        if B(x, y) = 0
CN201210402443.4A 2012-12-28 2012-12-28 The video picture segmentation method accumulated based on change-detection and frame difference Active CN102970528B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210402443.4A CN102970528B (en) 2012-12-28 2012-12-28 The video picture segmentation method accumulated based on change-detection and frame difference


Publications (2)

Publication Number Publication Date
CN102970528A true CN102970528A (en) 2013-03-13
CN102970528B CN102970528B (en) 2016-12-21

Family

ID=47800372

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210402443.4A Active CN102970528B (en) 2012-12-28 2012-12-28 The video picture segmentation method accumulated based on change-detection and frame difference

Country Status (1)

Country Link
CN (1) CN102970528B (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103218830A (en) * 2013-04-07 2013-07-24 北京航空航天大学 Method for extracting video object contour based on centroid tracking and improved GVF Snake
CN105046682A (en) * 2015-05-20 2015-11-11 王向恒 Video monitoring method based on local computing
CN106156747A (en) * 2016-07-21 2016-11-23 四川师范大学 The method of the monitor video extracting semantic objects of Behavior-based control feature
CN106530248A (en) * 2016-10-28 2017-03-22 中国南方电网有限责任公司 Method for intelligently detecting scene video noise of transformer station
CN106664417A (en) * 2014-05-15 2017-05-10 英特尔公司 Content adaptive background-foreground segmentation for video coding
CN107851180A (en) * 2015-07-09 2018-03-27 亚德诺半导体集团 Take the Video processing of detection
CN108769803A (en) * 2018-06-29 2018-11-06 北京字节跳动网络技术有限公司 Recognition methods, method of cutting out, system, equipment with frame video and medium
US10133927B2 (en) 2014-11-14 2018-11-20 Sony Corporation Method and system for processing video content
CN109784164A (en) * 2018-12-12 2019-05-21 北京达佳互联信息技术有限公司 Prospect recognition methods, device, electronic equipment and storage medium
CN110263789A (en) * 2019-02-18 2019-09-20 北京爱数智慧科技有限公司 A kind of object boundary recognition methods, device and equipment
CN110378327A (en) * 2019-07-09 2019-10-25 浙江大学 Add the object detecting device and method of complementary notable feature
CN110728746A (en) * 2019-09-23 2020-01-24 清华大学 Modeling method and system for dynamic texture
CN112017135A (en) * 2020-07-13 2020-12-01 香港理工大学深圳研究院 Method, system and equipment for spatial-temporal fusion of remote sensing image data
CN112669324A (en) * 2020-12-31 2021-04-16 中国科学技术大学 Rapid video target segmentation method based on time sequence feature aggregation and conditional convolution
CN113160273A (en) * 2021-03-25 2021-07-23 常州工学院 Intelligent monitoring video segmentation method based on multi-target tracking
CN113329227A (en) * 2021-05-27 2021-08-31 中国电信股份有限公司 Video coding method and device, electronic equipment and computer readable medium
CN114071166A (en) * 2020-08-04 2022-02-18 四川大学 HEVC compressed video quality improvement method combined with QP detection
CN116225972A (en) * 2023-05-09 2023-06-06 成都赛力斯科技有限公司 Picture difference comparison method, device and storage medium
CN116524026A (en) * 2023-05-08 2023-08-01 哈尔滨理工大学 Dynamic vision SLAM method based on frequency domain and semantics

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030081836A1 (en) * 2001-10-31 2003-05-01 Infowrap, Inc. Automatic object extraction
US20070195993A1 (en) * 2006-02-22 2007-08-23 Chao-Ho Chen Method for video object segmentation
CN101719979A (en) * 2009-11-27 2010-06-02 北京航空航天大学 Video object segmentation method based on time domain fixed-interval memory compensation
CN101854467A (en) * 2010-05-24 2010-10-06 北京航空航天大学 Method for adaptively detecting and eliminating shadow in video segmentation


Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103218830A (en) * 2013-04-07 2013-07-24 北京航空航天大学 Method for extracting video object contour based on centroid tracking and improved GVF Snake
CN103218830B (en) * 2013-04-07 2016-09-14 北京航空航天大学 Based on centroid trace and the object video contour extraction method of improvement GVF Snake
CN106664417A (en) * 2014-05-15 2017-05-10 英特尔公司 Content adaptive background-foreground segmentation for video coding
CN106664417B (en) * 2014-05-15 2020-02-18 英特尔公司 Method, system, and machine-readable medium for content adaptive background-foreground segmentation for video coding
US10133927B2 (en) 2014-11-14 2018-11-20 Sony Corporation Method and system for processing video content
CN105046682A (en) * 2015-05-20 2015-11-11 王向恒 Video monitoring method based on local computing
CN105046682B (en) * 2015-05-20 2018-04-03 王向恒 A kind of video frequency monitoring method based on local computing
CN107851180A (en) * 2015-07-09 2018-03-27 亚德诺半导体集团 Take the Video processing of detection
CN107851180B (en) * 2015-07-09 2022-04-29 亚德诺半导体国际无限责任公司 Video processing for occupancy detection
CN106156747A (en) * 2016-07-21 2016-11-23 四川师范大学 The method of the monitor video extracting semantic objects of Behavior-based control feature
CN106156747B (en) * 2016-07-21 2019-06-28 四川师范大学 The method of the monitor video extracting semantic objects of Behavior-based control feature
CN106530248A (en) * 2016-10-28 2017-03-22 中国南方电网有限责任公司 Method for intelligently detecting scene video noise of transformer station
CN108769803A (en) * 2018-06-29 2018-11-06 北京字节跳动网络技术有限公司 Recognition methods, method of cutting out, system, equipment with frame video and medium
CN109784164A (en) * 2018-12-12 2019-05-21 北京达佳互联信息技术有限公司 Prospect recognition methods, device, electronic equipment and storage medium
CN109784164B (en) * 2018-12-12 2020-11-06 北京达佳互联信息技术有限公司 Foreground identification method and device, electronic equipment and storage medium
CN110263789A (en) * 2019-02-18 2019-09-20 北京爱数智慧科技有限公司 A kind of object boundary recognition methods, device and equipment
CN110378327A (en) * 2019-07-09 2019-10-25 浙江大学 Add the object detecting device and method of complementary notable feature
CN110728746A (en) * 2019-09-23 2020-01-24 清华大学 Modeling method and system for dynamic texture
CN112017135B (en) * 2020-07-13 2021-09-21 香港理工大学深圳研究院 Method, system and equipment for spatial-temporal fusion of remote sensing image data
CN112017135A (en) * 2020-07-13 2020-12-01 香港理工大学深圳研究院 Method, system and equipment for spatial-temporal fusion of remote sensing image data
CN114071166A (en) * 2020-08-04 2022-02-18 四川大学 HEVC compressed video quality improvement method combined with QP detection
CN114071166B (en) * 2020-08-04 2023-03-03 四川大学 HEVC compressed video quality improvement method combined with QP detection
CN112669324A (en) * 2020-12-31 2021-04-16 中国科学技术大学 Rapid video target segmentation method based on time sequence feature aggregation and conditional convolution
CN112669324B (en) * 2020-12-31 2022-09-09 中国科学技术大学 Rapid video target segmentation method based on time sequence feature aggregation and conditional convolution
CN113160273A (en) * 2021-03-25 2021-07-23 常州工学院 Intelligent monitoring video segmentation method based on multi-target tracking
CN113329227A (en) * 2021-05-27 2021-08-31 中国电信股份有限公司 Video coding method and device, electronic equipment and computer readable medium
CN116524026A (en) * 2023-05-08 2023-08-01 哈尔滨理工大学 Dynamic vision SLAM method based on frequency domain and semantics
CN116524026B (en) * 2023-05-08 2023-10-27 哈尔滨理工大学 Dynamic vision SLAM method based on frequency domain and semantics
CN116225972A (en) * 2023-05-09 2023-06-06 成都赛力斯科技有限公司 Picture difference comparison method, device and storage medium

Also Published As

Publication number Publication date
CN102970528B (en) 2016-12-21

Similar Documents

Publication Publication Date Title
CN102970528A (en) Video object division method based on change detection and frame difference accumulation
CN102222346B (en) Vehicle detecting and tracking method
Čech et al. Scene flow estimation by growing correspondence seeds
Camplani et al. Depth-color fusion strategy for 3-D scene modeling with Kinect
CN103971386B (en) A kind of foreground detection method under dynamic background scene
CN102903119B (en) A kind of method for tracking target and device
CN101719979B (en) Video object segmentation method based on time domain fixed-interval memory compensation
CN103077539A (en) Moving object tracking method under complicated background and sheltering condition
Barranco et al. Bio-inspired motion estimation with event-driven sensors
CN103871076A (en) Moving object extraction method based on optical flow method and superpixel division
CN101324956A (en) Method for tracking anti-shield movement object based on average value wander
CN110458862A (en) A kind of motion target tracking method blocked under background
CN103729858A (en) Method for detecting article left over in video monitoring system
Ye et al. Estimating piecewise-smooth optical flow with global matching and graduated optimization
CN106651923A (en) Method and system for video image target detection and segmentation
Wu et al. Overview of video-based vehicle detection technologies
CN105741326A (en) Target tracking method for video sequence based on clustering fusion
CN111161308A (en) Dual-band fusion target extraction method based on key point matching
CN103455997A (en) Derelict detection method and system
CN101571952A (en) Method for segmenting video object based on fixed period regional compensation
CN103810472A (en) Method for pupil position filtering based on movement correlation
CN110858392A (en) Monitoring target positioning method based on fusion background model
CN115546764A (en) Obstacle detection method, device, equipment and storage medium
Xun et al. Congestion detection of urban intersections based on surveillance video
CN101447083B (en) Beaconing-free vision measuring-technique for moving target based on time-space correlative characteristics

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20181219

Address after: 518000 N District, 5th Floor, No. 3011 Shahe West Road, Xili Street, Nanshan District, Shenzhen City, Guangdong Province

Patentee after: Shenzhen Xin Xiang Technology Co., Ltd.

Address before: 100191 Xueyuan Road, Haidian District, Beijing, No. 37

Patentee before: Beihang University