CN107871315A

CN107871315A - Video image moving-target detection method and device

Info

Publication number: CN107871315A
Application number: CN201710934534.5A
Authority: CN
Inventors: 安振宇; 孙亭; 李毅; 叶云; 丁杰; 龚少麟
Original assignee: CETC 28 Research Institute
Current assignee: CETC 28 Research Institute
Priority date: 2017-10-09
Filing date: 2017-10-09
Publication date: 2018-04-03
Anticipated expiration: 2037-10-09
Also published as: CN107871315B

Abstract

The present invention proposes a video image target detection method, comprising: performing background modeling on the input video images with a two-layer Gaussian mixture model to obtain the video image background of the input video images, wherein the input of the second Gaussian mixture model is the result of the first Gaussian mixture model's modeling; differencing the video image background and the input video images frame by frame to obtain the video image foreground; and applying to the video image foreground, in sequence, binarization based on the Otsu threshold, connected-region elimination based on morphological erosion and dilation, and small-region elimination based on pixel count, to form the foreground target of the input video image. The invention further proposes a video image target detection device and system.

Description

Video image moving-target detection method and device

Technical field

The present invention relates to a method and device for detecting moving targets in video images, and belongs to the field of artificial intelligence.

Background technology

As the pace of urbanization in China continues to accelerate, urban safety management has become an important issue. Intelligent video surveillance systems are being applied more and more widely to urban safety management, and their scale keeps expanding. As one of the core technologies of video surveillance, the detection of moving targets in video sequences has attracted particular attention, and how to further improve the stability and accuracy of moving target detection in practice has become a research focus for many experts and scholars at home and abroad. Common moving target detection methods include inter-frame differencing, optical flow, and Gaussian mixture models. As a classical adaptive background modeling method, the Gaussian mixture model adapts well to complex backgrounds and performs relatively well in practice, and has therefore long attracted researchers' attention.

Prior art 1, Chinese patent ZL201410090199.1, discloses a moving target detection method based on a Gaussian mixture model and edge detection: the current image frame is read from the video captured by a camera; a Gaussian mixture model is used to initialize the background, update it continuously, and at the same time separate and binarize the moving target; the moving target is also extracted with the Canny edge detection method; the two extracted moving targets are combined by an OR operation and holes are filled; shadows are removed; the necessary post-processing is applied to obtain the final result; and the loop continues until all image frames have been processed. By OR-ing the moving target extracted by the Gaussian mixture model with the one extracted by the Canny operator, that invention addresses the severe loss of the moving target that conventional methods suffer when the target and the background have similar colors. Although that patent handles the case of similar target and background colors, it is limited by the performance of a single-layer Gaussian mixture model: its background modeling ability is insufficient, and part of the foreground may be misjudged as background.

Prior art 2, Chinese patent ZL200710304222.2, discloses a moving target detection method based on an extended Gaussian mixture model: a first-level model building module constructs, based on the extended Gaussian mixture model, the probability density functions of shadow, background and foreground; a second-level model building module constructs, based on these three classes of models, the probability density functions of moving targets and non-moving targets; a classification module classifies using the MAP-MRF (Maximum a Posteriori-Markov Random Field) method; and feedback information from tracking is applied to further refine the foreground model. By fusing spatial information into the Gaussian mixture model, that invention can overcome false foreground detections caused by background motion; by merging background modeling, foreground detection and shadow removal in one probabilistic framework, it can overcome the adverse effects of shadows and thereby improve moving target detection. Although prior art 2 models the background well, the algorithm is based on a Markov random field, its computational complexity is high, and it struggles to meet the real-time requirements of actual video surveillance.

In view of the above shortcomings of the prior art, the present invention proposes a method and device for detecting moving targets in video images.

Summary of the invention

The purpose of the present invention is to overcome the defects of the prior art and to provide a method and device for detecting moving targets in video images. A further object of the present invention is to provide a stable video image target detection method.

Another object of the present invention is to implement a two-layer Gaussian modeling algorithm.

Yet another object of the present invention is to provide a post-processing algorithm for the processed images.

To achieve the above objects, the present invention proposes a method for detecting moving targets in video images, comprising:

S1. performing background modeling on the input video images with a two-layer Gaussian mixture model to obtain the video image background of the input video images, wherein the input of the second Gaussian mixture model is the result of the first Gaussian mixture model's modeling;

S2. differencing the video image background and the input video images frame by frame to obtain the video image foreground;

S3. applying to the video image foreground, in sequence, binarization based on the Otsu threshold, connected-region elimination based on morphological erosion and dilation, and small-region elimination based on pixel count, to form the foreground target of the input video image.

In addition, the invention proposes a video image target detection device, comprising: a background modeling unit, which performs background modeling on the input video images with a two-layer Gaussian mixture model to obtain the video image background of the input video images, wherein the input of the second Gaussian mixture model is the result of the first Gaussian mixture model's modeling; a background elimination unit, which differences the video image background output by the background modeling unit with the input video images frame by frame to obtain the video image foreground; and a post-processing unit, which applies to the video image foreground output by the background elimination unit, in sequence, binarization based on the Otsu threshold, connected-region elimination based on morphological erosion and dilation, and small-region elimination based on pixel count, to form the foreground target of the input video image.

In addition, the invention proposes a video image target detection device, comprising: one or more processors; and a non-transitory computer-readable storage medium storing one or more instructions. When executing the one or more instructions, the processor is configured to: perform background modeling on the input video images with a two-layer Gaussian mixture model to obtain the video image background of the input video images, wherein the input of the second Gaussian mixture model is the result of the first Gaussian mixture model's modeling; difference the video image background and the input video images frame by frame to obtain the video image foreground; and apply to the video image foreground, in sequence, binarization based on the Otsu threshold, connected-region elimination based on morphological erosion and dilation, and small-region elimination based on pixel count, to form the foreground target of the input video image.

The beneficial effects of the invention are as follows. First, the two-layer Gaussian mixture modeling provided by the invention achieves a relative balance between computation speed and background modeling accuracy, with stable results. Second, the algorithm applies several image post-processing operations, including binarization and morphological operations, which avoids interference from small targets. Third, the algorithm can effectively model the background of continuous video images and thereby effectively detect moving targets.

Brief description of the drawings

Fig. 1 is a schematic diagram of video image moving-target detection based on the two-layer Gaussian mixture model;

Fig. 2 is a flow chart of the video image moving-target detection method based on the two-layer Gaussian mixture model;

Fig. 3 is a block diagram of the video image target detection device;

Fig. 4 is a block diagram of the post-processing unit;

Fig. 5 is a block diagram of another video image target detection device;

Fig. 6 shows the results obtained with the video image detection method.

Embodiment

In image and video processing, a video is actually a series of images with time-series characteristics, i.e. an image sequence. For ordinary surveillance video, the camera is relatively stable, so the image sequence contains a relatively stable scene that does not change, namely the background. Objects that change against the background, i.e. moving targets, carry valuable information and need to be detected; they constitute the so-called foreground information. Background and foreground are both images, but the background is generally constant, whereas the foreground, being the moving target, changes in real time.

Fig. 1 is a schematic diagram of the video image moving-target detection proposed by the present invention, based on the two-layer Gaussian mixture model. The input video images are processed by the two-layer Gaussian mixture model 101 to obtain the video background; background elimination 102 is then performed; and the result of background elimination undergoes image post-processing 103 to obtain the foreground image of the moving target.

Fig. 2 is a flow chart of the video image moving-target detection method proposed by the present invention, based on the two-layer Gaussian mixture model. It is divided into three main steps: two-layer Gaussian mixture model background modeling S1, background elimination by differencing S2, and image post-processing S3.

1) Background modeling S1: Images are input frame by frame from the input surveillance video, and the background is modeled with a two-layer Gaussian mixture model, wherein the input of the second Gaussian mixture model is the result of the first modeling. Modeling with the two-layer model finally yields the background of the video image.

The steps of modeling with a single Gaussian mixture model are as follows:

Initialize the background model μ, with initial background mean μ₀, initial standard deviation σ₀ and initial difference threshold T (set to 20), where I_{x,y} is the pixel value at pixel (x, y):

μ(x, y) = I_{x,y}

σ(x, y) = T

Here T is expressed in image gray levels and is dimensionless; it can be set manually according to the environment.

i. Check whether pixel I_{x,y} belongs to the foreground or the background by testing whether it lies within a certain range of the mean μ(x, y), where λ is a threshold parameter:

If |I_{x,y} - μ(x, y)| < λ * σ(x, y), I_{x,y} is background

Otherwise, I_{x,y} is foreground

ii. Update the background by learning with the following formula, where α is the learning rate, typically set to 1e-4:

μ(x, y) = (1 - α) * μ(x, y) + α * I_{x,y}

iii. Repeat steps i and ii until the algorithm stops; the stopping criterion uses a small constant ε, which can be taken as 1e-5.
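The single-layer steps and the two-layer chaining can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function names `run_layer` and `two_layer_background` are hypothetical, the value `lam=2.5` is an assumed choice for λ (the text gives none), the update is applied only to pixels classified as background, a single pass over the frames stands in for the stopping criterion (which the text does not spell out), and "the result of the first modeling" is interpreted as the sequence of background estimates produced by the first layer.

```python
import numpy as np

def run_layer(frames, T=20.0, lam=2.5, alpha=1e-4):
    """Run one background-model layer; return the estimate after every frame."""
    frames = np.asarray(frames, dtype=np.float64)
    mu = frames[0].copy()            # mu(x, y) = I_{x,y}
    sigma = np.full_like(mu, T)      # sigma(x, y) = T
    estimates = [mu.copy()]
    for frame in frames[1:]:
        # Step i: a pixel is background when |I - mu| < lam * sigma.
        is_background = np.abs(frame - mu) < lam * sigma
        # Step ii: mu = (1 - alpha) * mu + alpha * I.
        # Assumption: only background pixels are updated.
        updated = (1.0 - alpha) * mu + alpha * frame
        mu = np.where(is_background, updated, mu)
        estimates.append(mu.copy())
    return np.stack(estimates)

def two_layer_background(frames, T=20.0, lam=2.5, alpha=1e-4):
    """Feed the first layer's estimates into a second layer (assumed reading)."""
    first = run_layer(frames, T, lam, alpha)
    second = run_layer(first, T, lam, alpha)
    return second[-1]                # final background image
```

With a static scene and a brief bright object, the object pixel fails the step i test and is left out of the update, so the returned background stays at the scene's gray level.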

2) Background elimination S2: The video background established in the first step is differenced frame by frame with the video images to eliminate the background. The result of the differencing is the foreground of the video image, i.e. the moving target of the video image sequence.

The differencing formula is:

D_{x,y} = I_{x,y} - μ(x, y)

where I_{x,y} denotes the original image and μ(x, y) is the computed background.
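The differencing step is a single subtraction, but a short sketch shows one practical detail. The function name `subtract_background` is hypothetical, and converting to floating point is an assumption made here so that 8-bit frames do not wrap around when the background is brighter than the frame.

```python
import numpy as np

def subtract_background(frame, background):
    """Return D = I - mu as a signed floating-point image."""
    frame = np.asarray(frame, dtype=np.float64)
    background = np.asarray(background, dtype=np.float64)
    # Signed difference; with uint8 inputs a direct subtraction would wrap.
    return frame - background
```

For example, a frame pixel of 10 over a background of 50 gives -40 rather than the wrapped value 216.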

3) Image post-processing S3: Image post-processing operations are applied to the differenced image to form the final foreground target of the image. These operations are carried out in sequence and comprise: binarization based on the Otsu threshold, connected-region elimination based on morphological erosion and dilation, and small-region elimination based on pixel count.

The binarization based on the Otsu threshold is as follows:

The Otsu method assumes that the image histogram is bimodal. Its basic premise is that the threshold separating the foreground and background of an image G should maximize the between-class variance of the foreground and background pixels. Mathematically, the Otsu threshold t should satisfy the following optimality condition:

$$ t = \arg\max_t \{ \omega_0 (\mu_0 - \bar{\mu})^2 + \omega_1 (\mu_1 - \bar{\mu})^2 \} $$

where ω₀ = N₀/N and ω₁ = N₁/N. Here N₀, N₁ and N denote the numbers of foreground, background and total pixels respectively, p_i denotes the frequency of gray level i, and μ₀, μ₁ and μ̄ denote the mean gray levels of the foreground, background and whole-image pixels respectively. For an RGB image, t ranges from 0 to 255. Once t has been obtained, thresholding yields the segmentation image R^seg.

In the segmentation image R^seg, pixels representing the foreground are all marked 1 and background pixels are marked 0.
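The threshold search can be illustrated with an exhaustive scan over gray levels. This is a sketch under assumptions: the input is a non-negative integer gray image with values below `levels`, the two classes are taken as "pixels at or below t" and "pixels above t" (the objective is symmetric in the two classes), and pixels above t are treated as foreground. The names `otsu_threshold` and `binarize` are hypothetical.

```python
import numpy as np

def otsu_threshold(image, levels=256):
    """Scan all thresholds and keep the one maximizing between-class variance."""
    hist = np.bincount(np.asarray(image).ravel(), minlength=levels)
    p = hist.astype(np.float64) / hist.sum()   # p_i: frequency of gray level i
    grays = np.arange(levels)
    mean_all = (grays * p).sum()               # whole-image mean gray level
    best_t, best_var = 0, -1.0
    for t in range(levels - 1):
        w0 = p[:t + 1].sum()                   # weight of the class <= t
        w1 = 1.0 - w0                          # weight of the class > t
        if w0 == 0.0 or w1 == 0.0:
            continue                           # one class is empty, skip
        mu0 = (grays[:t + 1] * p[:t + 1]).sum() / w0
        mu1 = (grays[t + 1:] * p[t + 1:]).sum() / w1
        var = w0 * (mu0 - mean_all) ** 2 + w1 * (mu1 - mean_all) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

def binarize(image, t):
    """Mark pixels above t as 1 (assumed foreground) and the rest as 0."""
    return (np.asarray(image) > t).astype(np.uint8)
```

On an image that contains only two gray levels, any threshold between them separates the classes equally well, so the scan returns a value in that interval.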

The step of eliminating connected regions based on morphological erosion and dilation is as follows:

Given two images A and B, let A be the object being processed, i.e. the data after Otsu-threshold binarization, and let B be used to process A; B is then called the structuring element, and is figuratively called a brush. Structuring elements are usually relatively small images. The binarized data are first eroded and then dilated, which is the morphological opening operation.

The erosion operation is:

The result of eroding X with S is the set of all points x such that S, after being translated by x, is still contained in X. In other words, the set obtained by eroding X with S is the set of origin positions of S for which S is entirely contained in X.

The dilation operation is:

Dilation can be regarded as the dual operation of erosion. It is defined as follows: translate the structuring element B by a to obtain B_a; if B_a hits X, record the point a. The set of all points a satisfying this condition is called the result of dilating X by B.
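The two set definitions above translate directly into shifted logical AND and OR operations on a binary mask. The following sketch makes several assumptions: the structuring element has odd dimensions with its origin at the center, pixels outside the image count as background, and the helper names (`_shifted_views`, `erode`, `dilate`, `opening`) are hypothetical.

```python
import numpy as np

def _shifted_views(mask, se):
    """Yield one shifted view of the mask per active structuring-element cell."""
    kh, kw = se.shape
    ph, pw = kh // 2, kw // 2
    # Assumption: everything outside the image is background (False).
    padded = np.pad(mask, ((ph, ph), (pw, pw)), constant_values=False)
    h, w = mask.shape
    for dy in range(kh):
        for dx in range(kw):
            if se[dy, dx]:
                yield padded[dy:dy + h, dx:dx + w]

def erode(mask, se):
    """Keep positions where the translated element fits entirely inside the mask."""
    mask = np.asarray(mask, dtype=bool)
    out = np.ones_like(mask)
    for view in _shifted_views(mask, np.asarray(se, dtype=bool)):
        out &= view
    return out

def dilate(mask, se):
    """Keep positions where the translated element hits the mask at least once."""
    mask = np.asarray(mask, dtype=bool)
    out = np.zeros_like(mask)
    for view in _shifted_views(mask, np.asarray(se, dtype=bool)):
        out |= view
    return out

def opening(mask, se):
    """Erosion followed by dilation."""
    return dilate(erode(mask, se), se)
```

With a 3x3 square element, an isolated foreground pixel disappears during erosion and is not restored by dilation, whereas a 3x3 block shrinks to its center and then grows back to its original extent.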

Small-region elimination based on pixel count is then applied to the data from which connected regions have been eliminated by morphological erosion and dilation. The small-region elimination method is as follows:

Let the connected regions obtained in image G be {A₁, A₂, ..., A_N}, with corresponding pixel counts {n₁, n₂, ..., n_N}. If n_i < ε, where ε is set manually and can be taken as 30, the region is discarded and judged to be a non-target; if n_i > ε, it is a target, i.e. foreground.

Fig. 3 shows a video image target detection device 300 proposed by the present invention, comprising: a background modeling unit 301, which performs background modeling on the input video images with a two-layer Gaussian mixture model to obtain the video image background of the input video images, wherein the input of the second Gaussian mixture model is the result of the first Gaussian mixture model's modeling; a background elimination unit 302, which differences the video image background output by the background modeling unit with the input video images frame by frame to obtain the video image foreground; and a post-processing unit 303, which applies to the video image foreground output by the background elimination unit, in sequence, binarization based on the Otsu threshold, connected-region elimination based on morphological erosion and dilation, and small-region elimination based on pixel count, to form the foreground target of the input video image.

Each layer of the two-layer Gaussian mixture model in the background modeling unit is modeled as described above for the video image target detection method of Fig. 1.

As shown in Fig. 4, the post-processing unit 303 comprises an Otsu-threshold binarization module 304, a module 305 for connected-region elimination based on morphological erosion and dilation, and a module 306 for small-region elimination based on pixel count. The image foreground passes through the three modules in sequence to yield the foreground target of the input video image.

Fig. 5 shows another video image target detection device 400 proposed by the present invention, comprising: one or more processors 401; and a non-transitory computer-readable storage medium 403 storing one or more instructions 402. The computer-readable storage medium 403 may also store the data to be processed, or the data may be stored in another storage medium. When executing the one or more instructions, the processor is configured to: perform background modeling on the input video images with a two-layer Gaussian mixture model to obtain the video image background of the input video images, wherein the input of the second Gaussian mixture model is the result of the first Gaussian mixture model's modeling; difference the video image background and the input video images frame by frame to obtain the video image foreground; and apply to the video image foreground, in sequence, binarization based on the Otsu threshold, connected-region elimination based on morphological erosion and dilation, and small-region elimination based on pixel count, to form the foreground target of the input video image.

Fig. 6 shows, taking an indoor scene as an example, the results obtained by the method and device of the invention. Apart from the location, city video and indoor video are processed in the same way.

In summary, the modeling approach based on a two-layer Gaussian mixture model provided by the invention effectively avoids the influence of noise points in video image moving-target detection and achieves a relative balance between computation speed and background modeling accuracy. The several image post-processing operations, such as binarization and morphological operations, avoid interference from small targets. Overall, the detection method for moving targets in video sequences enables automatic analysis and assessment of city surveillance video, and can provide effective auxiliary support for handling urban problems such as illegal parking and unauthorized road occupation.

Claims

1. A video image target detection method, comprising:

S1. performing background modeling on the input video images with a two-layer Gaussian mixture model to obtain the video image background of the input video images, wherein the input of the second Gaussian mixture model is the result of the first Gaussian mixture model's modeling;

S2. differencing the video image background and the input video images frame by frame to obtain the video image foreground;

S3. applying to the video image foreground, in sequence, binarization based on the Otsu threshold, connected-region elimination based on morphological erosion and dilation, and small-region elimination based on pixel count, to form the foreground target of the input video image.

2. The image target detection method of claim 1, wherein each layer of the two-layer Gaussian mixture model is modeled by:

S11. initializing the image background model as μ(x, y) = I_{x,y}, σ(x, y) = T, where the initial background mean is μ₀, the initial standard deviation is σ₀, the initial difference threshold is T (set to 20), and I_{x,y} is the pixel value at pixel (x, y);

S12. checking whether pixel I_{x,y} belongs to the foreground or the background by testing whether it lies within a certain range of the mean μ(x, y): if |I_{x,y} - μ(x, y)| < λ * σ(x, y), I_{x,y} is background; otherwise, I_{x,y} is foreground, where λ is a threshold parameter;

S13. updating the background by learning through μ(x, y) = (1 - α) * μ(x, y) + α * I_{x,y}, and repeating step S12 until μ(x, y) satisfies the stopping condition, where α is the learning rate.

3. The method of claim 2, wherein the learning rate α is 1e-4.

4. The method of claim 1, wherein the binarization based on the Otsu threshold comprises: assuming that the image histogram is bimodal, and setting the Otsu threshold that separates the image foreground and background such that the between-class variance of the foreground and background pixels is maximized.

5. The method of claim 4, wherein the Otsu threshold t satisfies the following formula:

$$ t = \arg\max_t \{ \omega_0 (\mu_0 - \bar{\mu})^2 + \omega_1 (\mu_1 - \bar{\mu})^2 \} $$

where ω₀ = N₀/N and ω₁ = N₁/N; N₀, N₁ and N denote the numbers of foreground, background and total pixels respectively; p_i denotes the frequency of gray level i; and μ₀, μ₁ and μ̄ denote the mean gray levels of the foreground, background and whole-image pixels respectively.

6. The method of claim 1, wherein the connected-region elimination based on morphological erosion and dilation comprises: setting a structuring element B, and first eroding and then dilating the result of the Otsu-threshold binarization.

7. The method of claim 1, wherein the small-region elimination based on pixel count comprises: denoting the connected regions as {A₁, A₂, ..., A_N}, with corresponding pixel counts {n₁, n₂, ..., n_N}; if n_i < ε, where ε is set manually, the region is discarded and judged to be a non-target; if n_i > ε, it is a target, i.e. foreground.

8. The method of claim 7, wherein ε is 30.

9. The method of claim 2, wherein T is 20.

10. A video image target detection device, comprising:

a background modeling unit, which performs background modeling on the input video images with a two-layer Gaussian mixture model to obtain the video image background of the input video images, wherein the input of the second Gaussian mixture model is the result of the first Gaussian mixture model's modeling;

a background elimination unit, which differences the video image background output by the background modeling unit with the input video images frame by frame to obtain the video image foreground;

a post-processing unit, which applies to the video image foreground output by the background elimination unit, in sequence, binarization based on the Otsu threshold, connected-region elimination based on morphological erosion and dilation, and small-region elimination based on pixel count, to form the foreground target of the input video image.

11. The device of claim 10, wherein each layer of the two-layer Gaussian mixture model in the background modeling unit is modeled by:

initializing the image background model as μ(x, y) = I_{x,y}, σ(x, y) = T, where the initial background mean is μ₀, the initial standard deviation is σ₀, the initial difference threshold is T (set to 20), and I_{x,y} is the pixel value at pixel (x, y);

checking whether pixel I_{x,y} belongs to the foreground or the background by testing whether it lies within a certain range of the mean μ(x, y): if |I_{x,y} - μ(x, y)| < λ * σ(x, y), I_{x,y} is background; otherwise, I_{x,y} is foreground, where λ is a threshold parameter;

updating the background by learning through μ(x, y) = (1 - α) * μ(x, y) + α * I_{x,y}, and repeating the preceding foreground/background check until μ(x, y) satisfies the stopping condition, where α is the learning rate.

12. The device of claim 10, wherein the post-processing unit comprises an Otsu-threshold binarization module, a module for connected-region elimination based on morphological erosion and dilation, and a module for small-region elimination based on pixel count, and the image foreground passes through the three modules in sequence to yield the foreground target of the input video image.

13. The device of claim 12, wherein the Otsu-threshold binarization module assumes that the image histogram is bimodal, and sets the Otsu threshold that separates the image foreground and background such that the between-class variance of the foreground and background pixels is maximized.

14. The device of claim 13, wherein the Otsu threshold t satisfies the following formula:

$$ t = \arg\max_t \{ \omega_0 (\mu_0 - \bar{\mu})^2 + \omega_1 (\mu_1 - \bar{\mu})^2 \} $$

15. The device of claim 12, wherein the modules for connected-region elimination based on morphological erosion and dilation and for small-region elimination based on pixel count are configured to: set a structuring element B, and first erode and then dilate the result of the Otsu-threshold binarization.

16. The device of claim 12, wherein the small-region elimination based on pixel count comprises: denoting the connected regions as {A₁, A₂, ..., A_N}, with corresponding pixel counts {n₁, n₂, ..., n_N}; if n_i < ε, where ε is set manually, the region is discarded and judged to be a non-target; if n_i > ε, it is a target, i.e. foreground.

17. A video image target detection device, comprising:

one or more processors;

a non-transitory computer-readable storage medium storing one or more instructions, wherein the processor, when executing the one or more instructions, is configured to:

perform background modeling on the input video images with a two-layer Gaussian mixture model to obtain the video image background of the input video images, wherein the input of the second Gaussian mixture model is the result of the first Gaussian mixture model's modeling;

difference the video image background and the input video images frame by frame to obtain the video image foreground;

apply to the video image foreground, in sequence, binarization based on the Otsu threshold, connected-region elimination based on morphological erosion and dilation, and small-region elimination based on pixel count, to form the foreground target of the input video image.