CN102970528A - Video object division method based on change detection and frame difference accumulation - Google Patents

Video object division method based on change detection and frame difference accumulation

Info

Publication number
CN102970528A
CN102970528A CN2012104024434A CN201210402443A
Authority
CN
China
Prior art keywords
frame
video
time domain
image
edge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012104024434A
Other languages
Chinese (zh)
Other versions
CN102970528B (en)
Inventor
Zhu Shiping (祝世平)
Gao Jie (高洁)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Xin Xiang Technology Co., Ltd.
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN201210402443.4A priority Critical patent/CN102970528B/en
Publication of CN102970528A publication Critical patent/CN102970528A/en
Application granted granted Critical
Publication of CN102970528B publication Critical patent/CN102970528B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a video object segmentation method based on change detection and frame difference accumulation. In the method, inter-frame changes between symmetric frames spaced k frames apart are detected by a t significance test; the detected initial motion-change regions are accumulated by temporal fixed-interval frame-difference accumulation and further integrated to form a memory mask. A Kirsch edge-detection operator based on a discontinuity-detection technique adjusts its threshold through an image-edge continuity test to obtain all connected edge information in the current frame. A semantic video object plane is then obtained through a spatio-temporal filter, and segmentation of the video object is completed by selectively applying filling and morphological processing operations. The method is a novel video object segmentation method: it effectively solves the problems of missing video-object interiors and exposed background caused by the irregular object motion that frequently troubles video object segmentation, and offers large improvements in segmentation speed, segmentation quality, range of application and portability.

Description

Video object segmentation method based on change detection and frame difference accumulation
Technical field
The present invention relates to a processing method for video object extraction, in particular to a video object segmentation method based on change detection and frame difference accumulation. While guaranteeing segmentation quality and speed, it removes the shortcoming of traditional change detection, which needs repeated experiments to obtain the frame-difference threshold, and it fully exploits the information in each pixel's neighbourhood, so the video object is obtained more accurately. Frame difference accumulation uses the information of a period of time and thus solves the local loss of moving objects brought about by irregular object motion. The Kirsch edge-detection operator based on a discontinuity-detection technique obtains the complete connected edge information of the current frame and binarizes it. Applied in combination, these three techniques make the method more practical and more general.
Background technology
Object-based video segmentation is the key to realizing the content-based coding and interactive functionality of MPEG-4. With the spread of the content-based coding techniques of the MPEG-4 standard, video object segmentation has become a focus of current research. At present, object-based video segmentation has very wide application prospects in fields such as video surveillance, human-computer interaction, military applications and communications. Although video object segmentation has been studied extensively in recent years, the problem is still not completely solved.
According to whether manual participation is needed in the segmentation process, video object segmentation methods are divided into automatic segmentation (see Thomas Meier, King N. Ngan. Automatic segmentation of moving objects for video object plane generation[J]. IEEE Transactions on Circuits and Systems for Video Technology, 1998, 8(5): 525-538) and semi-automatic segmentation (see Munchurl Kim, J.G. Jeon, J.S. Kwak, M.H. Lee, C. Ahn. Moving object segmentation in video sequences by user interaction and automatic object tracking[J]. Image and Vision Computing, 2001, 19(5): 245-260).
According to the information used, segmentation methods mainly comprise temporal segmentation, spatial segmentation and spatio-temporal fusion. Temporal segmentation methods exploit the temporal order of the video sequence and detect moving targets through the temporal changes between consecutive frames; common approaches are the frame-difference method and the background-subtraction method. The frame-difference method (see Zhan C H. An improved moving object detection algorithm based on frame difference and edge detection[A]. International Conference on Image and Graphics[C], 2007: 519-523) is simple to realize, has low programming complexity, is not overly sensitive to scene changes such as lighting, and adapts to various dynamic environments with good stability; however, it cannot extract the complete region of an object, only its boundary, and it depends on the time interval chosen for the frame difference. The background-subtraction method (see Olivier Barnich, Marc Van Droogenbroeck. A universal background subtraction algorithm for video sequences[J]. IEEE Transactions on Image Processing, 2011, 20(6): 1709-1723) is simple in principle and design; the threshold is determined from the actual conditions, and the result directly reflects the position, size and shape of the moving target, so fairly accurate moving-target information can be obtained. However, the computational cost of background updating is large, discrete noise points are easily produced, and the method is strongly affected by changes of external conditions such as lighting and weather. Optical flow (see Jareal A, Venkatesh K S. A new color based optical flow algorithm for environment mapping using a mobile robot[A]. IEEE International Symposium on Intelligent Control[C]. 2007: 567-572) carries the motion information of the target and can detect moving objects well even when the scene information is unknown, with high detection accuracy, but its computational cost is large and it generally cannot be applied to real-time processing. Spatial segmentation methods (see Jaime S. Cardoso, Jorge C.S. Cardoso, Luis Corte-Real. Object-based spatial segmentation of video guided by depth and motion information[A]. IEEE Workshop on Motion and Video Computing[C]. 2007) extract the video object using spatial attributes of the image such as colour, brightness, texture and edge information. Spatio-temporal fusion methods (Ahmed, R., Karmakar, G.C., Dooley, L.S. Incorporation of texture information for joint spatio-temporal probabilistic video object[A]. IEEE International Conference on Image Processing[C]. 2007, 6, 293-296) are currently the most common: they combine the temporal information and the spatial information of the video sequence, and the two kinds of information are fused to obtain relatively accurate segmentation results.
However, no matter which method is used to segment moving objects, exposed background and irregular object motion (the moving object, or some part of it, remaining static for a period of time) cause a decline in segmentation accuracy. In motion analysis, exposed background and the static foreground regions caused by irregular object motion are easily misdetected as foreground or background, respectively, degrading the segmentation accuracy.
Summary of the invention:
The present invention proposes a video object segmentation method based on change detection and frame difference accumulation. First, each video frame is smoothed by Gaussian filtering; a t significance test then detects the inter-frame changes between symmetric frames spaced k frames apart. The detected initial motion-change regions are accumulated by temporal fixed-interval frame-difference accumulation and further integrated into a memory mask. Next, a Kirsch edge-detection operator based on a discontinuity-detection technique adjusts its threshold through an image-edge continuity test, so that all connected edge information of the current frame is obtained; this reduces the residual noise in the memory mask and guarantees edge continuity while preserving low-intensity edge details well. A semantic video object plane is then obtained through a spatio-temporal filter, and finally filling and morphological processing operations are applied selectively to complete the segmentation of the video object. This is a new video object segmentation method: it effectively solves the problems, frequent in video object segmentation, of missing video-object interiors caused by irregular motion (the moving object, or some part of it, remaining static for a period of time) and of exposed background, and its segmentation speed, segmentation quality, range of application and portability are all greatly improved.
The technical problems to be solved by the present invention are:
1. When the change region of a moving object is obtained directly by inter-frame differencing, the threshold of the frame-difference method must be obtained through many experiments; the method is sensitive to noise and lighting, and the extracted moving region suffers serious gaps;
2. The occlusion (covered/exposed) problem produced by irregular video-object motion (the moving object, or some part of it, remaining static for a period of time);
3. The object boundary obtained by the Kirsch edge-detection operator is discontinuous.
The technical solution adopted by the present invention to solve these problems is a video object segmentation method based on change detection and frame difference accumulation, comprising the following steps:
Step 1: Smooth each frame of the video sequence by Gaussian filtering, and use a t significance test to detect the inter-frame changes between symmetric frames spaced k frames apart, obtaining the initial motion-change region of each frame; AND the change regions of the two symmetric frame distances to obtain the complete motion-change region; then perform temporal fixed-interval frame-difference accumulation to obtain the effective template of each time segment, and integrate further to form the memory mask, completing the temporal segmentation of the video object;
Step 2: Perform edge detection on each frame of the original video with the improved Kirsch edge-detection operator, i.e. the Kirsch operator based on the discontinuity-detection technique, and binarize the edge-detection result, completing the spatial segmentation of the video object;
Step 3: In a parallel spatio-temporal fusion manner, AND the segment memory mask formed in step 1 with the binarized edge-detection result of each video frame obtained in step 2 to extract the exact boundary contour of the moving object; according to the boundary information, selectively apply morphological opening/closing and filling to complete the extraction of the video object.
Compared with the prior art, the advantages of the present invention are:
1. The method uses a t-distribution significance test to detect inter-frame changes, so the noise variance of the video need not be known and the estimation of noise parameters is avoided; no manual experiments are needed to obtain the threshold of the frame difference image, since the optimal threshold can be looked up in the t distribution table. The statistical change detection of hypothesis testing suppresses the influence of camera noise on the segmentation result well, and the segmentation result is markedly better than that obtained by threshold segmentation.
2. By using images spaced k frames apart, the method handles slowly moving video objects better. The concept of the memory mask MT (Memory Template) is proposed, and the memory template is obtained by temporal fixed-interval frame-difference accumulation, effectively solving the problem of missing moving-region boundaries.
3. Edge lines obtained directly by Kirsch edge detection are prone to breakpoints, and the effect is not ideal. The method uses 4 × 4 directional templates to compute the differences around a target point in 6 directions; when the maximum difference exceeds a certain threshold the point is regarded as a discontinuity point, and the discontinuity of the image edge is thereby detected. In this way all connected edge information of the current frame is obtained; residual noise in the memory mask is reduced and edge continuity is guaranteed while low-intensity edge details are preserved well.
Description of drawings:
Fig. 1 is the flow chart of the video object segmentation method based on change detection and frame difference accumulation of the present invention.
Fig. 2 shows the change-detection and frame-difference-accumulation results for the Akiyo video sequence: (a) the 5th frame of the Akiyo sequence; (b) the 21st frame of the Akiyo sequence; (c) the initial motion-change region of (a) obtained by t significance test detection; (d) the initial motion-change region of (b) obtained by t significance test detection; (e) the complete motion-change region of (c) after temporal fixed-interval frame-difference accumulation; (f) the complete motion-change region of (d) after temporal fixed-interval frame-difference accumulation.
Fig. 3 shows the memory masks of the Akiyo video sequence: (a) the first and second memory masks of the Akiyo sequence.
Fig. 4 shows the VOP extraction results for the Akiyo video sequence: (a) the 5th frame of the Akiyo sequence; (b) the 21st frame of the Akiyo sequence; (c) the VOP extracted from (a); (d) the VOP extracted from (b).
Fig. 5 shows the VOPs extracted from the Grandma video sequence by the method of the invention: (a) the 4th frame of the Grandma sequence; (b) the VOP extracted from (a); (c) the 19th frame of the Grandma sequence; (d) the VOP extracted from (c).
Fig. 6 shows the VOPs extracted from the Claire video sequence by the method of the invention: (a) the 8th frame of the Claire sequence; (b) the VOP extracted from (a); (c) the 16th frame of the Claire sequence; (d) the VOP extracted from (c).
Fig. 7 shows the VOPs extracted from the Miss-American video sequence by the method of the invention: (a) the 20th frame of the Miss-American sequence; (b) the VOP extracted from (a); (c) the 40th frame of the Miss-American sequence; (d) the VOP extracted from (c).
Fig. 8 shows the VOPs extracted from the Mother and daughter and Hall monitor video sequences by the method of the invention: (a) the 15th frame of the Mother and daughter sequence; (b) the VOP extracted from (a); (c) the 70th frame of the Hall monitor sequence; (d) the VOP extracted from (c).
Fig. 9 compares the spatial accuracy of the first 20 frames of the Grandma and Miss-American video sequences segmented by the present method and by the reference method (Zhu Shiping, Ma Li, Hou Yangshuan. Video object segmentation algorithm based on temporal fixed-interval memory compensation[J]. Optoelectronics · Laser, 2010, 21(8): 1241-1246): (a) the spatial accuracy comparison for the Grandma sequence; (b) the spatial accuracy comparison for the Miss-American sequence. Spatial accuracy 1 denotes the result of the present method, and spatial accuracy 2 the result of the reference method.
Embodiment:
The present invention is described in further detail below in conjunction with the drawings and specific embodiments.
Fig. 1 is the flow chart of the video object segmentation method based on change detection and frame difference accumulation of the present invention; the method comprises the following steps:
Step 1: Smooth each frame of the video sequence by Gaussian filtering, and use a t significance test to detect the inter-frame changes between symmetric frames spaced k frames apart, obtaining the initial motion-change region of each frame; AND the change regions of the two symmetric frame distances to obtain the complete motion-change region; then perform temporal fixed-interval frame-difference accumulation to obtain the effective template of each time segment, and integrate further to form the memory mask, completing the temporal segmentation of the video object.
Let F(n) denote the n-th frame of the image sequence. The difference image between F(n) and F(n−k) contains the video object in F(n) plus the background region exposed by the object's motion; the frame-difference mask between F(n+k) and F(n) contains the video object in F(n) plus the background region covered in F(n+k) by the object's motion. Change detection is then applied to the video images using a t-distribution-based change-detection technique: the values of the changed and unchanged regions of the frame difference image are analysed statistically to obtain the initial change-detection mask. The t-distribution-based technique avoids obtaining the segmentation threshold through many experiments and takes full account of the information in each pixel's neighbourhood, so the decisions obtained are more accurate. Inside the initial motion-change region there remain scattered noise regions and interior holes, caused by motion details lost through the false-alarm probability of the hypothesis test and by the lack of texture detail inside the moving object.
When performing temporal segmentation, if part of the target does not move noticeably it is difficult to find a moving region that contains the complete video object, so the frame differences at the symmetric frame distance are accumulated first.
Let F_n(x, y) denote the n-th frame of the video sequence after grey-scale conversion, and G_n(x, y) its result after Gaussian smoothing. The noise in each frame of the video sequence is denoted N_n(x, y), with variance σ_N². The n-th frame G_n(x, y) of the video sequence can then be expressed as:
G_n(x, y) = Ḡ_n(x, y) + N_n(x, y)
where Ḡ_n(x, y) is the true value of the video image. From this formula the difference image follows:
FD(x, y) = Ḡ_n(x, y) − Ḡ_{n−k}(x, y) + N_n(x, y) − N_{n−k}(x, y)
Let D(x, y) = N_n(x, y) − N_{n−k}(x, y), where N_n(x, y) and N_{n−k}(x, y) are mutually independent random variables with identical probability densities; then D(x, y) is still an additive zero-mean Gaussian random variable, with variance 2σ_N².
Because the noise at each pixel is independent, if all non-zero frame differences inside a window are caused by noise alone, the mean μ of these values should be zero; a hypothesis test is therefore carried out according to probability theory. The null hypothesis H_0 that position (x, y) belongs to the background is H_0: μ = 0. Since the noise variance is unknown, a t significance test is used, and the test statistic t is constructed from the pixels in a neighbourhood window of p pixels:
t = A_d(n) / (s / √p)
where A_d(n) and s are the sample mean and sample standard deviation in the neighbourhood window:
A_d(n) = (1/p) Σ_{l=−n}^{n} Σ_{m=−n}^{n} |FD(x + l, y + m)|
s² = (1/(p − 1)) Σ_{l=−n}^{n} Σ_{m=−n}^{n} (FD(x + l, y + m) − A_d(n))²
According to significance-test theory, the threshold is determined by the given significance level α and the distribution obeyed by t:
|t| ≥ t_{α/2}(p − 1)
The choice of the significance level α depends on the camera-noise intensity of the particular video sequence; its value is usually set to 10⁻², 10⁻⁶, etc. (here α = 10⁻² and a 5 × 5 window are chosen), which gives good results. If, for the set significance level α, the inequality above holds, the central pixel belongs to the changed region m(n).
The initial motion-change region detected between a pair of symmetric frames is recorded as a binary mask. AND-ing the change regions of the two symmetric frame distances includes the parts of the video object that move only slightly, giving the complete motion-change region:
m(n) = m_{n−k,n}(x, y) ∩ m_{n,n+k}(x, y)
where m_{n−k,n} and m_{n,n+k} are the change masks detected between frames n−k and n and between frames n and n+k, respectively.
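As a rough illustration of this decision rule, the following sketch flags a pixel as changed when the windowed t statistic exceeds the critical value. It assumes greyscale frames as NumPy arrays; the function name and the hard-coded critical value t_{0.005}(24) ≈ 2.797 (two-sided α = 10⁻², 5 × 5 window, so p − 1 = 24 degrees of freedom) are illustrative, not taken from the patent.

```python
import numpy as np

def change_mask(frame_a, frame_b, win=5, t_crit=2.797):
    """Windowed t significance test on the absolute frame difference.
    t_crit defaults to t_{alpha/2}(p - 1) for alpha = 0.01 and p = 25,
    i.e. t_{0.005}(24) ~ 2.797 from a t distribution table."""
    fd = np.abs(frame_a.astype(np.float64) - frame_b.astype(np.float64))
    p = win * win
    pad = win // 2
    padded = np.pad(fd, pad, mode="edge")
    mask = np.zeros(fd.shape, dtype=np.uint8)
    for y in range(fd.shape[0]):
        for x in range(fd.shape[1]):
            window = padded[y:y + win, x:x + win]
            a_d = window.mean()         # sample mean A_d(n)
            s = window.std(ddof=1)      # sample standard deviation
            if s == 0:                  # perfectly uniform window: keep H0
                continue
            t = a_d / (s / np.sqrt(p))
            if abs(t) >= t_crit:        # reject H0: pixel has changed
                mask[y, x] = 255
    return mask
```

A perfectly uniform window has zero sample variance, so the sketch simply keeps the null hypothesis there; real camera noise makes that case rare.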
If the interior texture of the moving target is highly uniform, or the target is wholly or partly static or moving slowly for some time, the change-detection method above cannot detect the complete moving region; an accurate boundary contour of the moving object then cannot be obtained at spatio-temporal fusion, and the target is partly missing in the final extraction.
For this situation, the temporal fixed-interval frame-difference accumulation method is used here; it effectively solves the problem of missing video-object boundaries. The method exploits the correlation of the video frames in the time domain and the continuity of target motion, and considers how many times each pixel appears within a certain period: within a preset time segment, the parts that appear frequently are taken as the effective template for the whole motion segment. The motion continuity of the video object over a period of time is thus taken fully into account; that is, the temporal information over the whole segment is fully exploited.
Suppose a preset time segment contains L frames of video, whose change-detection mask images are M_1(x, y), M_2(x, y), …, M_{L−1}(x, y), M_L(x, y). The effective mask EM (effective mask) corresponding to this segment has the same size as each video frame:
EM(x, y) = 255, if T ≥ τ;  0, if T < τ
where T = n_i / L and n_i is the number of times point (x, y) is marked as a motion point in the L mask images; τ is a preset proportion threshold. Different proportion thresholds τ are chosen for different video sequences: for video with fast motion and large motion amplitude, a larger value can be chosen; conversely, for video with slow motion and small amplitude, a smaller value should be selected.
For any pixel (x, y), if EM(x, y) = 0 no frame-difference accumulation is computed; if EM(x, y) = 255, temporal fixed-interval frame-difference accumulation is carried out.
After the frame-difference accumulation, the pixel value of the corresponding point in every video frame of the corresponding time segment is set to 255, i.e.
F_1(x, y) = F_2(x, y) = … = F_L(x, y) = 255
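The accumulation rule EM(x, y) = 255 when T = n_i / L ≥ τ can be sketched as follows. This is a minimal NumPy version; the function name and mask layout are illustrative, not from the patent.

```python
import numpy as np

def effective_mask(masks, tau):
    """Temporal fixed-interval accumulation of L change-detection masks:
    a pixel enters the effective mask when the fraction of frames in
    which it was marked as a motion point, T = n_i / L, reaches tau."""
    stack = np.stack([np.asarray(m) > 0 for m in masks])  # L binary masks
    freq = stack.sum(axis=0) / len(masks)                 # T per pixel
    return np.where(freq >= tau, 255, 0).astype(np.uint8)
```

A pixel marked as moving in only one of many masks is discarded as noise, while a pixel that recurs across the segment survives even if any single frame difference missed it.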
Fig. 2 shows the results of t-distribution change detection and fixed-interval frame-difference accumulation for Akiyo. In the test, the symmetric frame distance was k = 2 and the proportion threshold of the frame-difference accumulation was τ = 2/12.
As can be seen from Fig. 2, the moving region of the frame difference image obtained by change detection contains many holes and the moving target is incomplete; after frame-difference accumulation the result improves considerably: not only is the complete boundary contour obtained, but the holes inside the moving region are also filled rather well.
After temporal fixed-interval frame-difference accumulation, the result is greatly improved over the initial t-distribution change detection, but holes still exist inside the video object. The concept of the memory mask MT (Memory Template) is therefore proposed here: the masks obtained by frame-difference accumulation are subjected to morphological processing and filling to obtain complete video-object masks, yielding N/L memory masks in total.
Opening and closing are important operations in morphology; they are formed by cascading dilation and erosion.
Grey-scale dilation and grey-scale erosion can be regarded as image-filtering operations in which a structuring element dilates and erodes the signal; they are defined as follows:
(A ⊕ B)(s, t) = max{ A(s − x, t − y) + B(x, y) | (s − x, t − y) ∈ D_A; (x, y) ∈ D_B }
(A Θ B)(s, t) = min{ A(s + x, t + y) − B(x, y) | (s + x, t + y) ∈ D_A; (x, y) ∈ D_B }
where D_A and D_B are the domains of definition of A and B respectively, and B is the square structuring element used for the reconstruction operation.
Opening generally smooths the contour of an image and breaks narrow isthmuses and burrs on the contour. Closing also makes contour lines smoother, but in contrast to opening it usually eliminates narrow gulfs and long thin gaps, removes small holes, and fills breaks in the contour line.
The following two formulas express morphological opening and closing, respectively:
A ∘ B = (A Θ B) ⊕ B
A • B = (A ⊕ B) Θ B
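The dilation/erosion cascades above can be sketched with a flat square structuring element. This is a simplification: for a flat element the grey-scale add/subtract of B(x, y) reduces to a moving maximum/minimum over each neighbourhood; function names are illustrative, not from the patent.

```python
import numpy as np

def dilate(a, size=3):
    """Grey-scale dilation by a flat size x size square structuring
    element: a moving maximum over each neighbourhood."""
    pad = size // 2
    p = np.pad(a, pad, mode="edge")
    out = np.empty_like(a)
    for y in range(a.shape[0]):
        for x in range(a.shape[1]):
            out[y, x] = p[y:y + size, x:x + size].max()
    return out

def erode(a, size=3):
    """Grey-scale erosion by a flat square element: a moving minimum."""
    pad = size // 2
    p = np.pad(a, pad, mode="edge")
    out = np.empty_like(a)
    for y in range(a.shape[0]):
        for x in range(a.shape[1]):
            out[y, x] = p[y:y + size, x:x + size].min()
    return out

def opening(a):
    """A o B = (A eroded by B) dilated by B: removes burrs/isolated points."""
    return dilate(erode(a))

def closing(a):
    """A . B = (A dilated by B) eroded by B: fills small holes and breaks."""
    return erode(dilate(a))
```

Opening deletes structures smaller than the element, closing fills holes smaller than it, which matches the contour-smoothing behaviour described above.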
The moving-object template is filled from its boundary in the following steps:
1. Horizontal filling: traverse the whole moving-object template, find the first and last boundary points in every row, and mark all pixels between these two points as interior points of the moving object;
2. Vertical filling: traverse the whole moving-object template, find the first and last boundary points in every column, and mark all pixels between these two points as interior points of the moving object;
3. Take the intersection of the horizontal and vertical filling results, finally obtaining the completely filled moving-object template.
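The three filling steps above can be sketched directly, assuming a binary template as a NumPy array (the function name is illustrative, not from the patent):

```python
import numpy as np

def fill_template(mask):
    """Fill a moving-object template from its boundary: mark every pixel
    between the first and last boundary point of each row (horizontal
    fill), likewise for each column (vertical fill), then intersect."""
    binary = mask > 0
    h_fill = np.zeros_like(binary)
    for y in range(binary.shape[0]):
        cols = np.flatnonzero(binary[y])
        if cols.size:
            h_fill[y, cols[0]:cols[-1] + 1] = True
    v_fill = np.zeros_like(binary)
    for x in range(binary.shape[1]):
        rows = np.flatnonzero(binary[:, x])
        if rows.size:
            v_fill[rows[0]:rows[-1] + 1, x] = True
    return np.where(h_fill & v_fill, 255, 0).astype(np.uint8)
```

The intersection in step 3 is what keeps concavities of the object from being over-filled by either pass alone.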
Because opening and closing increase the amount of computation, an MT with a smooth contour need not be morphologically processed; filling alone suffices. The mask contours in Fig. 2 are continuous and smooth and can be filled directly; the filling results are listed in Fig. 3.
Step 2: Perform edge detection on each frame of the original video with the improved Kirsch edge-detection operator, i.e. the Kirsch operator based on the discontinuity-detection technique, and binarize the edge-detection result, completing the spatial segmentation of the video object.
In edge detection, some important edge details become blurred and faint because of interference or insufficient contrast, and edge lines obtained directly by Kirsch edge detection are prone to breakpoints, so the effect is not ideal. Here the threshold is adjusted using an image-edge continuity test, so that connected image edges are obtained. At a discontinuity of an edge the pixel values usually differ considerably, so 4 × 4 directional templates are used to compute the differences around a target point in 6 directions; when the maximum difference exceeds a certain threshold the point is considered a discontinuity point, and the discontinuity of the image edge is thereby detected. This method suppresses noise and guarantees edge continuity while preserving low-intensity edge details well, giving satisfactory results. The threshold determines the accuracy of edge localisation and edge continuity. The result of edge detection, filling and binarization of the original video sequence is denoted M_e.
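For reference, the classical Kirsch compass operator underlying this step can be sketched as follows: the eight 3 × 3 directional templates with a fixed threshold. The patent's 4 × 4 discontinuity templates and continuity-driven threshold adjustment are not reproduced here, and the threshold value is illustrative.

```python
import numpy as np

# The eight classical Kirsch compass templates.
KIRSCH = [np.array(k) for k in (
    [[ 5,  5,  5], [-3, 0, -3], [-3, -3, -3]],
    [[ 5,  5, -3], [ 5, 0, -3], [-3, -3, -3]],
    [[ 5, -3, -3], [ 5, 0, -3], [ 5, -3, -3]],
    [[-3, -3, -3], [ 5, 0, -3], [ 5,  5, -3]],
    [[-3, -3, -3], [-3, 0, -3], [ 5,  5,  5]],
    [[-3, -3, -3], [-3, 0,  5], [-3,  5,  5]],
    [[-3, -3,  5], [-3, 0,  5], [-3, -3,  5]],
    [[-3,  5,  5], [-3, 0,  5], [-3, -3, -3]],
)]

def kirsch_edges(img, thresh):
    """Binary edge map: maximum response over the 8 Kirsch compass
    templates at each pixel, thresholded."""
    img = img.astype(np.float64)
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.uint8)
    for y in range(h):
        for x in range(w):
            win = padded[y:y + 3, x:x + 3]
            resp = max(float((win * k).sum()) for k in KIRSCH)
            if resp >= thresh:
                out[y, x] = 255
    return out
```

Because each template's weights sum to zero, uniform regions give zero response; only intensity steps produce edge points, which is why a fixed threshold alone tends to break faint edges, motivating the continuity test above.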
Step 3: In a parallel spatio-temporal fusion manner, AND the segment memory mask formed in step 1 with the binarized edge-detection result of each video frame obtained in step 2 to extract the exact boundary contour of the moving object; according to the boundary information, selectively apply morphological opening/closing and filling to complete the extraction of the video object.
The N/L temporal memory masks MT are each fused with the spatial binarized edge-detection results M_e to extract the binary moving-object template:
B(x, y) = MT(x, y) ∩ M_e(x, y)
If B(x, y) = 255, the point is finally labelled foreground; otherwise it is labelled background. With this fusion, the occlusion regions produced in the memory template by the motion of the video object are clearly weeded out by the constraint of the boundary. Finally, combining with the original video sequence V_O(x, y), the segmentation of the video object is completed:
VO(x, y) = V_O(x, y), if B(x, y) = 255;  255, if B(x, y) = 0
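The fusion B = MT ∩ M_e and the final cut-out can be sketched as below (illustrative names; the background is set to white, 255, as in the formula above):

```python
import numpy as np

def extract_vop(frame, mt, edge_mask):
    """Fuse the temporal memory template MT with the spatial binarized
    edge mask M_e, then cut the VOP out of the original frame; pixels
    outside the fused template B are set to white (255)."""
    b = (mt > 0) & (edge_mask > 0)          # B = MT AND M_e
    return np.where(b, frame, 255).astype(np.uint8)
```

The AND is what lets the spatial edge map trim away the covered/exposed regions that survive in the temporal template.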
To illustrate the validity of the method, the standard video test sequences "Akiyo", "Grandma", "Claire", "Miss-American", "Mother and daughter" and "Hall monitor" were selected as experimental subjects. All six test videos are in QCIF format, 176 × 144 pixels. The experimental results show that the method achieves good segmentation on different types of video sequences.
The method was implemented in the C language on a Core™ 2 Duo E6300 at 1.86 GHz with 2 GB of memory, programmed in the Visual C++ 6.0 development environment.
To better reflect the correctness of this method, the accuracy evaluation proposed by Wollborn et al. in the MPEG-4 core experiments is adopted. The spatial-accuracy evaluation defines the spatial accuracy SA (Spatial Accuracy) of the segmented object mask of each frame. The segmentation accuracy can then be computed by the following formula.
SA = 1 − |I_e − I_r| / |I_r|
In the formula, I_e and I_r denote, respectively, the object template obtained by the actual segmentation method and the reference segmentation for frame t. The spatial accuracy reflects the shape similarity between the segmentation result of each frame and the reference segmentation template: the larger SA is, the more accurate the segmentation; the smaller SA is, the less accurate the segmentation.
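Under one common reading of this measure (mismatched pixels of the two binary masks, normalized by the reference area), the spatial accuracy can be sketched as follows; the function name and the 0/255 mask convention are illustrative assumptions:

```python
import numpy as np

def spatial_accuracy(est, ref):
    """SA = 1 - |est XOR ref| / |ref| for binary 0/255 masks;
    1.0 means the segmentation matches the reference exactly."""
    mismatch = np.count_nonzero(est != ref)   # pixels labeled differently
    ref_area = np.count_nonzero(ref)          # size of the reference object
    return 1.0 - mismatch / ref_area
```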
Tables 1 and 2 list the spatial-accuracy comparison between this method and the reference method (Zhu Shiping, Ma Li, Hou Yangshuan. Video object segmentation algorithm based on time-domain fixed-interval memory compensation [J]. Journal of Optoelectronics·Laser, 2010, 21(8): 1241-1246.) for the first 20 frames of the Grandma and Miss-American video sequences. The comparison shows that the spatial accuracy of this method is consistently better than that of the reference method.
Table 1. Spatial-accuracy comparison for the first 20 frames of Grandma using this method and the reference method
Table 2. Spatial-accuracy comparison for the first 20 frames of Miss-American using this method and the reference method.

Claims (4)

1. A video object segmentation method based on change detection and frame-difference accumulation, characterized in that: the temporal segmentation uses a t significance test to detect inter-frame changes, so no threshold needs to be set from tedious experimental data; the optimal threshold is obtained by looking up the t-distribution table, and the noise variance of the video need not be known, thus avoiding the estimation of noise parameters; in the frame-difference accumulation stage, the concepts of the effective template and the memory mask, together with their uses and formation methods, are proposed; the spatial segmentation uses an improved Kirsch edge-detection operator, namely a Kirsch operator based on a discontinuity-detection technique, to obtain complete, fine, connected edges. The concrete steps of the video object segmentation method are as follows:
Step 1: each frame of the video sequence is smoothed by Gaussian filtering; a t significance test is used to detect the inter-frame changes of symmetric frames spaced k frames apart, obtaining the initial motion-change region of each frame; the detected initial motion-change regions are combined with an AND operation to obtain the complete motion-change region; time-domain fixed-interval frame-difference accumulation is then performed to obtain the effective template of each time segment, which is further integrated to form the memory mask, completing the temporal segmentation of the video object;
Step 2: an improved Kirsch edge-detection operator, namely a Kirsch operator based on a discontinuity-detection technique, is applied to each frame of the original video for edge detection; the edge-detection result is binarized, completing the spatial segmentation of the video object;
Step 3: in a parallel temporal-spatial fusion mode, the segmentation memory masks formed in step 1 and the binarized edge-detection results obtained in step 2 for each frame of the video sequence are combined with an AND operation to extract the exact boundary contour of the moving object; selective morphological opening/closing and filling operations are then performed according to the boundary information to complete the extraction of the video object.
2. The video object segmentation method based on change detection and frame-difference accumulation according to claim 1, characterized in that the temporal motion detection of step 1 is as follows: frame differencing is first performed on the gray-level images of symmetric frames spaced k apart, the initial motion-change region is then obtained by t significance-test detection, time-domain fixed-interval frame-difference accumulation is then performed, and the results are further integrated to form the memory template. The concrete steps are as follows:
(1) Let the n-th frame of the video sequence after graying be F_n(x, y), and let it be G_n(x, y) after Gaussian smoothing.
(2) The noise in each frame of the video sequence is denoted N_n(x, y), with variance σ_n².
Then the n-th gray-level frame G_n(x, y) of the video sequence can be expressed as:
G_n(x, y) = Ḡ_n(x, y) + N_n(x, y)
where Ḡ_n(x, y) is the true value of the video image. From the above, the difference image is:
FD(x, y) = Ḡ_n(x, y) − Ḡ_{n−k}(x, y) + N_n(x, y) − N_{n−k}(x, y)
Let D(x, y) = N_n(x, y) − N_{n−k}(x, y), where N_n(x, y) and N_{n−k}(x, y) are mutually independent random variables with identical probability densities; D(x, y) is therefore still an additive zero-mean Gaussian random variable, with variance 2σ_n².
Because the noise at each pixel is mutually independent, if all the non-zero frame differences within the window are caused by noise, the mean μ of these values should be zero. A hypothesis test is therefore performed according to probability theory, with the null hypothesis H_0 that position (x, y) is background: H_0: μ = 0. Since the noise variance is unknown, a t significance test is used, and the test statistic t is constructed from the pixels in the neighborhood window:
t = A_d(n) / (s / √p)
where A_d(n) and s are, respectively, the sample mean and sample standard deviation in the neighborhood window of p pixels:

A_d(n) = (1/p) Σ_{l=−n}^{n} Σ_{m=−n}^{n} |FD(x+l, y+m)|

s² = (1/(p−1)) Σ_{l=−n}^{n} Σ_{m=−n}^{n} (FD(x+l, y+m) − A_d(n))²
According to significance-test theory, the threshold is determined by the given significance level α and the distribution obeyed by t:
|t| ≥ t_{α/2}(p − 1)
The choice of the significance level α is related to the camera-noise intensity of the specific video sequence. For the set significance level α, if |t| ≥ t_{α/2}(p − 1) holds, the central pixel is judged to belong to the changed region m(n).
The initial motion-change region can therefore be expressed as:

m(x, y) = 255, if |t| ≥ t_{α/2}(p − 1); m(x, y) = 0, otherwise
The change regions obtained from the two symmetric frame differences are combined with an AND operation, so that weakly moving parts of the video object are included, giving the complete motion-change region:

M_n(x, y) = m_{n−k,n}(x, y) ∩ m_{n,n+k}(x, y)
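The change test of steps (1) and (2) can be sketched as below. The window size and critical value are illustrative assumptions: t_crit plays the role of t_{α/2}(p − 1) and would normally be looked up in a t-distribution table (for a 5 × 5 window, p = 25 and t_{0.025}(24) ≈ 2.06).

```python
import numpy as np

def t_test_change_mask(g_n, g_nk, win=2, t_crit=2.06):
    """Mark pixels whose neighborhood frame difference rejects H0: mu = 0."""
    fd = np.abs(g_n.astype(np.float64) - g_nk.astype(np.float64))
    h, w = fd.shape
    p = (2 * win + 1) ** 2                      # samples in the window
    mask = np.zeros((h, w), dtype=np.uint8)
    for y in range(win, h - win):
        for x in range(win, w - win):
            window = fd[y - win:y + win + 1, x - win:x + win + 1]
            a_d = window.mean()                 # sample mean A_d(n)
            s = window.std(ddof=1)              # sample standard deviation s
            if s == 0:
                changed = a_d > 0               # uniform nonzero difference
            else:
                changed = a_d / (s / np.sqrt(p)) >= t_crit
            if changed:
                mask[y, x] = 255
    return mask
```

Running this on the backward and forward symmetric differences of frame n and ANDing the two masks gives the complete motion-change region described above.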
(3) Time-domain fixed-interval frame-difference accumulation: for video objects whose interior texture is highly consistent over a certain time period, or video objects that are static or move slowly, the change-detection method of steps (1) and (2) alone cannot detect the complete moving region; an accurate boundary contour of the moving object then cannot be obtained during spatio-temporal filtering, causing local loss of the target in the final video object extraction.
Time-domain fixed-interval frame-difference accumulation effectively solves this local-loss problem. The method exploits the correlation of the video frames in the time domain and the continuity of the target motion: within a given time period it counts the number of times each pixel is detected, and takes the parts that appear frequently over the whole motion segment as the effective template. The motion continuity of the video object over the period is thus fully considered, that is, the temporal information of the whole segment is fully exploited.
Let a time segment contain L frames of video, whose change-detection masks are M_1, M_2, ..., M_{L−1}, M_L. The effective template EM (effective mask) corresponding to this segment has the same size as each video frame:
EM(x, y) = 255, if T ≥ τ
EM(x, y) = 0,   if T < τ
where T = n_i / L and n_i is the number of times point (x, y) is marked as a moving point in the L mask images M_1(x, y), M_2(x, y), ..., M_{L−1}(x, y), M_L(x, y) obtained by change detection, and τ is the set proportion threshold. Different proportion thresholds τ are chosen for different video sequences: for video with fast motion and large motion amplitude a larger value can be chosen; conversely, for video with slow motion and small amplitude a smaller value should be chosen.
For any pixel (x, y), if EM(x, y) is 0, frame-difference accumulation is not performed; if EM(x, y) is 255, time-domain fixed-interval frame-difference accumulation is performed.
After frame-difference accumulation, the pixel values of the corresponding points in every video frame of the corresponding time period are set to 255, namely:

F_1(x, y) = F_2(x, y) = ... = F_m(x, y) = 255
(4) After time-domain fixed-interval frame-difference accumulation, although the result is greatly improved over the initial t-test change detection, holes still remain inside the video object. The concept of the memory mask MT (Memory Template) is therefore proposed: the mask obtained by frame-difference accumulation is subjected to morphological processing and filling to obtain a complete video object mask, yielding N/L memory masks in total.
Since opening and closing operations increase the amount of computation, an MT with a relatively smooth contour need not be morphologically processed; filling alone is sufficient.
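The proportion test of step (3) can be sketched as follows; this is an illustrative sketch assuming the change masks are 0/255 NumPy arrays, with the hole filling and morphology of step (4) omitted:

```python
import numpy as np

def effective_template(masks, tau):
    """Effective mask EM over one time segment of L change masks:
    EM(x, y) = 255 where T = n_i / L >= tau, else 0."""
    stack = np.stack(masks)                 # shape (L, H, W), values 0/255
    ratio = (stack == 255).mean(axis=0)     # T = n_i / L for every pixel
    return np.where(ratio >= tau, 255, 0).astype(np.uint8)
```

The memory mask MT of step (4) would then be obtained by filling the holes of EM (and by a morphological open/close when the contour is rough).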
3. The video object segmentation method based on change detection and frame-difference accumulation according to claim 1, characterized in that the improved Kirsch edge-detection operator of step 2, namely the Kirsch operator based on the discontinuity-detection technique, performs edge detection by the following concrete steps:
(1) The traditional Kirsch edge-detection operator is applied to obtain the initial edge image of each frame of the video sequence. During edge detection, some important edge details are blurred or faint because of interference or insufficient contrast.
(2) Edge lines obtained directly by Kirsch edge detection are prone to breakpoints, and the effect is unsatisfactory. This method uses a 4 × 4 directional template to compute the differences around the target point in 6 directions; when the maximum difference exceeds a certain threshold, the point is judged to be a discontinuity point, thereby detecting the discontinuity of the image edge and obtaining all the connected edge information in the current frame. This reduces the residual noise in the memory mask while ensuring edge continuity and preserving low-intensity edge details. M_e is obtained after binarization.
4. The video object segmentation method based on change detection and frame-difference accumulation according to claim 1, characterized in that the spatio-temporal filtering of step 3 obtains the complete semantic video object by the following concrete steps:
(1) The N/L time-domain memory masks MT are each fused with the corresponding spatial binarized edge-detection result M_e to extract the binary moving-object template:
B(x, y) = MT(x, y) ∩ M_e(x, y)
If the corresponding B(x, y) is 255, the point is finally labeled as foreground; otherwise it is labeled as background.
(2) With this fusion scheme, the constraint of the boundary clearly weeds out from the memory mask the exposed-background regions produced by the motion of the video object. Finally, combined with the original video sequence V_O(x, y), the segmentation of the video object is completed:
VO(x, y) = V_O(x, y), if B(x, y) = 255
VO(x, y) = 255,        if B(x, y) = 0
CN201210402443.4A 2012-12-28 2012-12-28 The video picture segmentation method accumulated based on change-detection and frame difference Active CN102970528B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210402443.4A CN102970528B (en) 2012-12-28 2012-12-28 The video picture segmentation method accumulated based on change-detection and frame difference


Publications (2)

Publication Number Publication Date
CN102970528A true CN102970528A (en) 2013-03-13
CN102970528B CN102970528B (en) 2016-12-21

Family

ID=47800372

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210402443.4A Active CN102970528B (en) 2012-12-28 2012-12-28 The video picture segmentation method accumulated based on change-detection and frame difference

Country Status (1)

Country Link
CN (1) CN102970528B (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103218830A (en) * 2013-04-07 2013-07-24 北京航空航天大学 Method for extracting video object contour based on centroid tracking and improved GVF Snake
CN105046682A (en) * 2015-05-20 2015-11-11 王向恒 Video monitoring method based on local computing
CN106156747A (en) * 2016-07-21 2016-11-23 四川师范大学 The method of the monitor video extracting semantic objects of Behavior-based control feature
CN106530248A (en) * 2016-10-28 2017-03-22 中国南方电网有限责任公司 Method for intelligently detecting scene video noise of transformer station
CN106664417A (en) * 2014-05-15 2017-05-10 英特尔公司 Content adaptive background-foreground segmentation for video coding
CN107851180A (en) * 2015-07-09 2018-03-27 亚德诺半导体集团 Take the Video processing of detection
CN108769803A (en) * 2018-06-29 2018-11-06 北京字节跳动网络技术有限公司 Recognition methods, method of cutting out, system, equipment with frame video and medium
US10133927B2 (en) 2014-11-14 2018-11-20 Sony Corporation Method and system for processing video content
CN109784164A (en) * 2018-12-12 2019-05-21 北京达佳互联信息技术有限公司 Prospect recognition methods, device, electronic equipment and storage medium
CN110263789A (en) * 2019-02-18 2019-09-20 北京爱数智慧科技有限公司 A kind of object boundary recognition methods, device and equipment
CN110378327A (en) * 2019-07-09 2019-10-25 浙江大学 Add the object detecting device and method of complementary notable feature
CN110728746A (en) * 2019-09-23 2020-01-24 清华大学 Modeling method and system for dynamic texture
CN112017135A (en) * 2020-07-13 2020-12-01 香港理工大学深圳研究院 Method, system and equipment for spatial-temporal fusion of remote sensing image data
CN112669324A (en) * 2020-12-31 2021-04-16 中国科学技术大学 Rapid video target segmentation method based on time sequence feature aggregation and conditional convolution
CN113160273A (en) * 2021-03-25 2021-07-23 常州工学院 Intelligent monitoring video segmentation method based on multi-target tracking
CN113329227A (en) * 2021-05-27 2021-08-31 中国电信股份有限公司 Video coding method and device, electronic equipment and computer readable medium
CN114071166A (en) * 2020-08-04 2022-02-18 四川大学 HEVC compressed video quality improvement method combined with QP detection
CN116225972A (en) * 2023-05-09 2023-06-06 成都赛力斯科技有限公司 Picture difference comparison method, device and storage medium
CN116524026A (en) * 2023-05-08 2023-08-01 哈尔滨理工大学 Dynamic vision SLAM method based on frequency domain and semantics

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030081836A1 (en) * 2001-10-31 2003-05-01 Infowrap, Inc. Automatic object extraction
US20070195993A1 (en) * 2006-02-22 2007-08-23 Chao-Ho Chen Method for video object segmentation
CN101719979A (en) * 2009-11-27 2010-06-02 北京航空航天大学 Video object segmentation method based on time domain fixed-interval memory compensation
CN101854467A (en) * 2010-05-24 2010-10-06 北京航空航天大学 Method for adaptively detecting and eliminating shadow in video segmentation


Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103218830A (en) * 2013-04-07 2013-07-24 北京航空航天大学 Method for extracting video object contour based on centroid tracking and improved GVF Snake
CN103218830B (en) * 2013-04-07 2016-09-14 北京航空航天大学 Based on centroid trace and the object video contour extraction method of improvement GVF Snake
CN106664417A (en) * 2014-05-15 2017-05-10 英特尔公司 Content adaptive background-foreground segmentation for video coding
CN106664417B (en) * 2014-05-15 2020-02-18 英特尔公司 Method, system, and machine-readable medium for content adaptive background-foreground segmentation for video coding
US10133927B2 (en) 2014-11-14 2018-11-20 Sony Corporation Method and system for processing video content
CN105046682A (en) * 2015-05-20 2015-11-11 王向恒 Video monitoring method based on local computing
CN105046682B (en) * 2015-05-20 2018-04-03 王向恒 A kind of video frequency monitoring method based on local computing
CN107851180A (en) * 2015-07-09 2018-03-27 亚德诺半导体集团 Take the Video processing of detection
CN107851180B (en) * 2015-07-09 2022-04-29 亚德诺半导体国际无限责任公司 Video processing for occupancy detection
CN106156747A (en) * 2016-07-21 2016-11-23 四川师范大学 The method of the monitor video extracting semantic objects of Behavior-based control feature
CN106156747B (en) * 2016-07-21 2019-06-28 四川师范大学 The method of the monitor video extracting semantic objects of Behavior-based control feature
CN106530248A (en) * 2016-10-28 2017-03-22 中国南方电网有限责任公司 Method for intelligently detecting scene video noise of transformer station
CN108769803A (en) * 2018-06-29 2018-11-06 北京字节跳动网络技术有限公司 Recognition methods, method of cutting out, system, equipment with frame video and medium
CN109784164A (en) * 2018-12-12 2019-05-21 北京达佳互联信息技术有限公司 Prospect recognition methods, device, electronic equipment and storage medium
CN109784164B (en) * 2018-12-12 2020-11-06 北京达佳互联信息技术有限公司 Foreground identification method and device, electronic equipment and storage medium
CN110263789A (en) * 2019-02-18 2019-09-20 北京爱数智慧科技有限公司 A kind of object boundary recognition methods, device and equipment
CN110378327A (en) * 2019-07-09 2019-10-25 浙江大学 Add the object detecting device and method of complementary notable feature
CN110728746A (en) * 2019-09-23 2020-01-24 清华大学 Modeling method and system for dynamic texture
CN112017135B (en) * 2020-07-13 2021-09-21 香港理工大学深圳研究院 Method, system and equipment for spatial-temporal fusion of remote sensing image data
CN112017135A (en) * 2020-07-13 2020-12-01 香港理工大学深圳研究院 Method, system and equipment for spatial-temporal fusion of remote sensing image data
CN114071166A (en) * 2020-08-04 2022-02-18 四川大学 HEVC compressed video quality improvement method combined with QP detection
CN114071166B (en) * 2020-08-04 2023-03-03 四川大学 HEVC compressed video quality improvement method combined with QP detection
CN112669324A (en) * 2020-12-31 2021-04-16 中国科学技术大学 Rapid video target segmentation method based on time sequence feature aggregation and conditional convolution
CN112669324B (en) * 2020-12-31 2022-09-09 中国科学技术大学 Rapid video target segmentation method based on time sequence feature aggregation and conditional convolution
CN113160273A (en) * 2021-03-25 2021-07-23 常州工学院 Intelligent monitoring video segmentation method based on multi-target tracking
CN113329227A (en) * 2021-05-27 2021-08-31 中国电信股份有限公司 Video coding method and device, electronic equipment and computer readable medium
CN116524026A (en) * 2023-05-08 2023-08-01 哈尔滨理工大学 Dynamic vision SLAM method based on frequency domain and semantics
CN116524026B (en) * 2023-05-08 2023-10-27 哈尔滨理工大学 Dynamic vision SLAM method based on frequency domain and semantics
CN116225972A (en) * 2023-05-09 2023-06-06 成都赛力斯科技有限公司 Picture difference comparison method, device and storage medium

Also Published As

Publication number Publication date
CN102970528B (en) 2016-12-21

Similar Documents

Publication Publication Date Title
CN102970528A (en) Video object division method based on change detection and frame difference accumulation
CN102222346B (en) Vehicle detecting and tracking method
Čech et al. Scene flow estimation by growing correspondence seeds
Camplani et al. Depth-color fusion strategy for 3-D scene modeling with Kinect
CN103971386B (en) A kind of foreground detection method under dynamic background scene
CN102903119B (en) A kind of method for tracking target and device
CN101719979B (en) Video object segmentation method based on time domain fixed-interval memory compensation
CN103077539A (en) Moving object tracking method under complicated background and sheltering condition
Barranco et al. Bio-inspired motion estimation with event-driven sensors
CN103871076A (en) Moving object extraction method based on optical flow method and superpixel division
CN101324956A (en) Method for tracking anti-shield movement object based on average value wander
CN110458862A (en) A kind of motion target tracking method blocked under background
CN103729858A (en) Method for detecting article left over in video monitoring system
Ye et al. Estimating piecewise-smooth optical flow with global matching and graduated optimization
CN106651923A (en) Method and system for video image target detection and segmentation
Wu et al. Overview of video-based vehicle detection technologies
CN105741326A (en) Target tracking method for video sequence based on clustering fusion
CN111161308A (en) Dual-band fusion target extraction method based on key point matching
CN103455997A (en) Derelict detection method and system
CN101571952A (en) Method for segmenting video object based on fixed period regional compensation
CN103810472A (en) Method for pupil position filtering based on movement correlation
CN110858392A (en) Monitoring target positioning method based on fusion background model
CN115546764A (en) Obstacle detection method, device, equipment and storage medium
Xun et al. Congestion detection of urban intersections based on surveillance video
CN101447083B (en) Beaconing-free vision measuring-technique for moving target based on time-space correlative characteristics

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20181219

Address after: 518000 N District, 5th Floor, No. 3011 Shahe West Road, Xili Street, Nanshan District, Shenzhen City, Guangdong Province

Patentee after: Shenzhen Xin Xiang Technology Co., Ltd.

Address before: 100191 Xueyuan Road, Haidian District, Beijing, No. 37

Patentee before: Beihang University