CN103325121A - Method and system for estimating network topological relations of cameras in monitoring scenes

Info

Publication number: CN103325121A
Application number: CN2013102703492A
Authority: CN
Inventors: 张红广; 崔建竹; 唐潮; 田飞; 王鹏; 邓娜娜; 蒋建彬; 马娜; 高会武; 徐尚鹏; 季益华; 马铁; 宋成国
Original assignee: SMART CITY INFORMATION TECHNOLOGY Co Ltd; Shanghai Advanced Research Institute of CAS; China Security and Surveillance Technology PRC Inc
Current assignee: SMART CITY INFORMATION TECHNOLOGY Co Ltd; Shanghai Advanced Research Institute of CAS; China Security and Surveillance Technology PRC Inc
Priority date: 2013-06-28
Filing date: 2013-06-28
Publication date: 2013-09-25
Anticipated expiration: 2033-06-28
Also published as: CN103325121B

Abstract

The invention relates to the technical field of security and protection and provides a method and system for establishing the network topology of cameras in monitoring scenes. The method comprises: decomposing the monitoring scene in the video stream captured by each camera in a monitoring network into grids; obtaining color histogram information of the optical flow of each grid in the monitoring scene; clustering the grids in the monitoring scene according to that color histogram information to obtain a semantic region segmentation result of the monitoring scene; and determining the network topology among the cameras according to the semantic region segmentation results of the monitoring scenes. The method and system address a problem of the prior art: because existing estimates of the topology among cameras all rely on locating and tracking specific target activity, algorithm performance decreases sharply when occlusions exist in the monitored environment or the monitoring image resolution is low.

Description

Method and system for estimating the network topology of cameras in a monitoring scene

Technical field

The invention belongs to the technical field of security and protection, and in particular relates to a method and system for estimating the network topology of cameras in a monitoring scene.

Background art

Estimating the topology of a camera network is a key issue in camera network deployment. An accurate topology estimate not only captures the motion patterns of targets such as individuals and crowds in the monitored area, but can also be fed back to further optimize the deployment.

The prior art provides several methods for estimating the topology of a camera network, including:

One: based on person detection and tracking results obtained by image background removal, the correlation of uncontrolled crowd activity between multiple cameras is obtained, providing a basis for analyzing and establishing the target activity patterns of the whole scene.

Two: pace information of people captured by multiple camera robots is used to obtain the general pattern of people's activity, and the cameras are redeployed according to this pattern, so that the monitored targets are covered with better viewing angles and fewer cameras.

Three: a mixture probability density estimator based on Parzen windows and Gaussian kernels is used to estimate the probability density function formed by quantities such as the time interval, the positions of entering and leaving the field of view, and the movement velocity when entering and leaving the field of view; the whole estimation process is realized by learning from training-set data.

Four: a fuzzy time interval is used to represent, in terms of time-domain constraints, the likelihood that an observed target appears in the next camera; this likelihood is estimated from the equation of motion.

Five: a large amount of target observation data is used to automatically establish, by unsupervised learning, the spatio-temporal topology between the cameras of a multi-camera monitoring network. On this basis, the authors also give a method for verifying the algorithm's performance and realize target tracking in this network.

Six: a more general information-theoretic notion of statistical dependence is used, combining uncertain correspondence with Bayesian methods, which reduces the assumptions required and yields better performance.

Seven: all cameras are assumed to be potentially connected, and impossible connections are then removed through observation. Experiments show that this method is efficient and effective at learning the topology of large-scale camera networks, especially when few learning samples are available.

Eight: a large body of work uses the topology of multiple cameras for global activity analysis and pedestrian identification.

However, the above topology inference algorithms are essentially all based on locating and tracking specific target activity and place high demands on surveillance video quality; when occlusions exist in the monitored environment or the monitoring image resolution is low, algorithm performance drops sharply.

Summary of the invention

The purpose of the embodiments of the invention is to provide a method and system for estimating the network topology of cameras in a monitoring scene, so as to solve the problem in the prior art that existing topology inference algorithms are essentially all based on locating and tracking specific target activity, place high demands on surveillance video quality, and suffer a sharp drop in performance when occlusions exist in the monitored environment or the monitoring image resolution is low.

Embodiments of the invention are realized as a method for estimating the network topology of cameras in a monitoring scene, the method comprising the following steps:

decomposing the monitoring scene in the video stream captured by each camera in the monitoring network into grids;

for each monitoring scene, obtaining color histogram information of the optical flow of each grid in the monitoring scene;

for each monitoring scene, clustering the grids in the monitoring scene according to the color histogram information of the optical flow of each grid in the monitoring scene, to obtain a semantic region segmentation result of the monitoring scene;

determining the network topology between the cameras in the monitoring network according to the semantic region segmentation result of each monitoring scene.

Another embodiment of the present invention provides a system for estimating the network topology of cameras in a monitoring scene, the system comprising:

a decomposition unit, configured to decompose the monitoring scene in the video stream captured by each camera in the monitoring network into grids;

an acquisition unit, configured to obtain, for each monitoring scene, color histogram information of the optical flow of each grid in the monitoring scene;

a clustering unit, configured to cluster, for each monitoring scene, the grids in the monitoring scene according to the color histogram information of the optical flow of each grid in the monitoring scene, to obtain a semantic region segmentation result of the monitoring scene;

a determination unit, configured to determine the network topology between the cameras in the monitoring network according to the semantic region segmentation result of each monitoring scene.

The embodiment of the invention computes the color histogram feature of the optical flow of each grid with an optical flow algorithm and then computes the topology between cameras from these features, so it does not need to locate or track moving targets precisely. It thereby solves the problem in the prior art that computing the topology between cameras always relies on locating and tracking specific target activity, places high demands on surveillance video quality, and suffers a sharp drop in performance when occlusions exist in the monitored environment or the monitoring image resolution is low.

Description of drawings

To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of the method for estimating the network topology of cameras in a monitoring scene provided by one embodiment of the invention;

Fig. 2 is a diagram of the camera topology estimation results provided by another embodiment of the invention;

Fig. 3 is a sketch of the floor and camera deployment provided by another embodiment of the invention;

Fig. 4 is a module structure diagram of the system for estimating the network topology of cameras in a monitoring scene provided by another embodiment of the invention.

Detailed description of the embodiments

To make the purpose, technical solution and advantages of the present invention clearer, the invention is further elaborated below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and are not intended to limit it.

One embodiment of the invention provides a method for estimating the network topology of cameras in a monitoring scene. The method is shown in Fig. 1, and its steps are as follows.

In step S101, the monitoring scene in the video stream captured by each camera in the monitoring network is decomposed into grids.

In this embodiment, the monitoring network comprises at least two cameras, each of which captures a video stream. A video stream comprises multiple frames, each of which is an image; among these frames, the images through which a moving target passes constitute the monitoring scene.

It should be noted that the grid size is generally 10*10, or may be preset to another value, but the grid size must be the same for the monitoring scenes decomposed from the video streams of all cameras.
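Step S101 can be illustrated with a minimal sketch. It assumes that a frame is a two-dimensional list of pixel values, that "10*10" means cells of 10 by 10 pixels, and that edge pixels which do not fill a whole cell are dropped; the function name `decompose_into_grids` is hypothetical and none of these choices are stated in the source.

```python
def decompose_into_grids(frame, cell=10):
    """Split a 2-D frame (list of rows) into non-overlapping cell x cell grids.

    Returns a dict mapping (grid_row, grid_col) to the block of pixels.
    Assumption: leftover pixels at the right/bottom edges are discarded.
    """
    height = len(frame)
    width = len(frame[0]) if height else 0
    grids = {}
    for gy in range(height // cell):
        for gx in range(width // cell):
            # Slice `cell` rows, then `cell` columns out of each row.
            block = [row[gx * cell:(gx + 1) * cell]
                     for row in frame[gy * cell:(gy + 1) * cell]]
            grids[(gy, gx)] = block
    return grids
```

Using the same `cell` value for every camera keeps the grid size consistent across all monitoring scenes, as the embodiment requires.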

In step S102, for each monitoring scene, the color histogram information of the optical flow of each grid in the monitoring scene is obtained.

Specifically, the color histogram information of the optical flow of each grid in the monitoring scene is obtained as follows:

Define the video stream captured by a camera as I_n(X), where X is the coordinate of a grid in the monitoring scene of the video stream, X = (x, y)^T, x is the horizontal coordinate (abscissa) of the grid, y is the vertical coordinate (ordinate) of the grid, T denotes matrix transposition, and n is the index of the video frame within the video stream;

Define

W(X; p) = \begin{pmatrix} (1 + p_1) x + p_3 y + p_5 \\ p_2 x + (1 + p_4) y + p_6 \end{pmatrix} \quad (1)

where W denotes the deformable template, p = (p_1, p_2, p_3, p_4, p_5, p_6)^T, p_1, p_2, p_3 and p_4 are 0, and p_5 and p_6 are the optical flow information of the grid;
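As a small illustration of equation (1), the sketch below evaluates the template for one coordinate. The function name `warp` is hypothetical. With p_1 to p_4 set to 0, as the embodiment specifies, the template reduces to a pure translation by (p_5, p_6).

```python
def warp(x, y, p):
    """Evaluate the deformable template W(X; p) of equation (1) at X = (x, y)."""
    p1, p2, p3, p4, p5, p6 = p
    new_x = (1 + p1) * x + p3 * y + p5
    new_y = p2 * x + (1 + p4) * y + p6
    return new_x, new_y


# With p1..p4 = 0 the point is only shifted by the optical flow (p5, p6).
shifted = warp(3, 4, (0, 0, 0, 0, 1.5, -2))
```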

Define

p = \arg\min_{p} \sum_{x} [I(W(x; p + \Delta p)) - T(x)]^{2} \quad (2)

where Δp denotes the difference between the values of p in two successive iterations, and T(x) denotes the grid decomposed from the first frame of the video stream.

It should be noted that the grid decomposed from the first frame of the video stream refers to the grid obtained by decomposing the first frame image in the video stream.

Iterate according to (3), (4) and (5) until Δp is less than a preset threshold ε:

\nabla I = W(\Delta x; p) = \begin{pmatrix} I_x + p_5 \\ I_y + p_6 \end{pmatrix} \quad (3)

where I_x denotes the gradient map of the grid along the x axis, I_y denotes the gradient map of the grid along the y axis, and \nabla I denotes the gradient map of the grid after transformation by the deformable template W(X; p);

H = \sum_{x} [\nabla I]^{T} [\nabla I] \quad (4)

\Delta p = H^{-1} \sum_{x} [\nabla I]^{T} [T(x) - I(W(x; p))] \quad (5)

Take the values of p_5 and p_6 at which Δp is less than the preset threshold ε;
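The iteration in (2) to (5) has the shape of a Gauss-Newton (Lucas-Kanade style) update. The sketch below is a translation-only version written under stated assumptions, not the patent's exact procedure: image gradients are taken by central differences at the warped positions rather than by the literal form of equation (3), sub-pixel values come from bilinear interpolation, and the names `bilinear` and `estimate_translation` are hypothetical.

```python
import math


def bilinear(img, x, y):
    """Sample img (list of rows) at real-valued (x, y), clamped to the image."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.001)
    y = min(max(y, 0.0), h - 1.001)
    x0, y0 = int(x), int(y)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x0 + 1] * fx
    bottom = img[y0 + 1][x0] * (1 - fx) + img[y0 + 1][x0 + 1] * fx
    return top * (1 - fy) + bottom * fy


def estimate_translation(template, image, x0, y0, eps=1e-3, max_iter=50):
    """Estimate (p5, p6) such that image(x + p5, y + p6) matches template(x, y).

    template: grid T(x) cut from the first frame; (x0, y0) is its top-left
    corner in frame coordinates. image: the current frame I.
    """
    p5 = p6 = 0.0
    for _ in range(max_iter):
        h11 = h12 = h22 = b1 = b2 = 0.0
        for j, row in enumerate(template):
            for i, t_val in enumerate(row):
                wx, wy = x0 + i + p5, y0 + j + p6
                # Central-difference gradient of I at the warped position.
                ix = (bilinear(image, wx + 1, wy) - bilinear(image, wx - 1, wy)) / 2
                iy = (bilinear(image, wx, wy + 1) - bilinear(image, wx, wy - 1)) / 2
                err = t_val - bilinear(image, wx, wy)  # T(x) - I(W(x; p))
                h11 += ix * ix   # entries of H, equation (4)
                h12 += ix * iy
                h22 += iy * iy
                b1 += ix * err   # right-hand sum of equation (5)
                b2 += iy * err
        det = h11 * h22 - h12 * h12
        if abs(det) < 1e-12:
            break  # no texture in the grid: the flow cannot be determined
        dp5 = (h22 * b1 - h12 * b2) / det   # delta p = H^-1 * b
        dp6 = (h11 * b2 - h12 * b1) / det
        p5 += dp5
        p6 += dp6
        if math.hypot(dp5, dp6) < eps:      # stop once delta p is below eps
            break
    return p5, p6
```

The early exit on a near-singular H is a design choice of this sketch: a grid with no intensity variation gives no constraint on the flow, so the estimate is left at its current value.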

Obtain the color optical flow information of the grid from the three RGB components of the optical flow information of the grid;

According to the color optical flow information of the grid, compute the histogram information of the optical flow in 8 directions. This histogram information of the optical flow in 8 directions is the color histogram feature of the optical flow of the grid, and the color histogram feature of the optical flow of the grid comprises the optical flow u'_b in the horizontal direction and the optical flow v'_b in the vertical direction.

It should be noted that the 8 directions are directions spaced 45 degrees apart.
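A minimal sketch of the 8-direction histogram follows. It assumes that each flow vector (u, v) is assigned to the nearest of 8 directions spaced 45 degrees apart and weighted by its magnitude; the weighting, the bin centering and the name `direction_histogram` are assumptions, and the pooling of the three RGB components is not modeled.

```python
import math


def direction_histogram(flows, bins=8):
    """Magnitude-weighted histogram of flow directions.

    flows: iterable of (u, v) flow vectors for one grid.
    Bin k is centered on the direction k * 45 degrees (for bins=8).
    """
    hist = [0.0] * bins
    width = 2 * math.pi / bins
    for u, v in flows:
        magnitude = math.hypot(u, v)
        if magnitude == 0.0:
            continue  # a zero vector has no direction
        angle = math.atan2(v, u) % (2 * math.pi)
        # Shift by half a bin so that each bin is centered on its direction.
        k = int(((angle + width / 2) % (2 * math.pi)) / width) % bins
        hist[k] += magnitude
    return hist
```

Centering the bins on the 8 directions means that a purely horizontal or purely diagonal flow falls in the middle of a bin rather than on a boundary, which makes the assignment robust to rounding.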

In step S103, for each monitoring scene, the grids in the monitoring scene are clustered according to the color histogram information of the optical flow of each grid in the monitoring scene, to obtain the semantic region segmentation result of the monitoring scene.

Specifically, step S103 is implemented as:

u_n = \sum_{b \in r_n} u'_b \quad (6)

v_n = \sum_{b \in r_n} v'_b \quad (7)
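Equations (6) and (7) can be read as summing the per-grid flow components over a region. The source does not define r_n; the sketch below assumes r_n is the set of grids assigned to the n-th semantic region and that b indexes the grids in it. The function name and the dictionary layout are hypothetical, and the clustering that produces the regions is not shown.

```python
def aggregate_region_flow(cell_flow, regions):
    """Sum per-grid flow components over each region, as in (6) and (7).

    cell_flow: dict mapping grid id b to (u'_b, v'_b).
    regions:   dict mapping region id n to the list of grid ids in r_n.
    Returns a dict mapping n to (u_n, v_n).
    """
    result = {}
    for n, cells in regions.items():
        cells = list(cells)
        u_n = sum(cell_flow[b][0] for b in cells)
        v_n = sum(cell_flow[b][1] for b in cells)
        result[n] = (u_n, v_n)
    return result
```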

In step S104, the network topology between the cameras in the monitoring network is determined according to the semantic region segmentation result of each monitoring scene.

Specifically, taking two cameras as an example, the two cameras are a first camera and a second camera; "first" and "second" do not imply any order and only distinguish the cameras. The first camera captures a first video stream and the second camera captures a second video stream. A video stream comprises multiple frames, each of which is an image; the first video stream comprises first images and the second video stream comprises second images. Among the first images, those through which a moving target passes form the first monitoring scene, where moving targets include people, animals or other objects; among the second images, those through which a moving target passes form the second monitoring scene.

\rho_{a_i, a_j}(\tau) = \frac{E[a_i c]}{\sqrt{E[a_i^{2}] E[c^{2}]}} \quad (8)

\hat{\tau}_{a_i, a_j} = \arg\max_{\tau} \frac{\sum \rho_{a_i, a_j}(\tau)}{\Gamma} \quad (9)

\Psi_{i,j} = \rho_{a_i, a_j}(\tau) (1 - \hat{\tau}_{a_i, a_j}) \quad (10)

where a_i denotes the color histogram feature of the optical flow of a first grid, the first grid being decomposed from the first monitoring scene, which is captured by the first camera; a_j denotes the color histogram feature of the optical flow of a second grid, the second grid being decomposed from the second monitoring scene, which is captured by the second camera; the first camera and the second camera are any two cameras in the monitoring network; c denotes the second grid after time τ; \rho_{a_i, a_j}(\tau) denotes the degree of association between the color histogram feature of the optical flow of the first grid and that of the second grid; \hat{\tau}_{a_i, a_j} denotes the time shift between the first grid and the second grid; and \Psi_{i,j} denotes the topology estimation result for the first camera and the second camera.

It should be noted that when \Psi_{i,j} is greater than 0.5, the first camera and the second camera are topologically related; in step S104, the topology estimation result must be computed for every pair of cameras.
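Equations (8) to (10) can be sketched as a normalized time-lagged cross-correlation. The sketch relies on several assumptions that the source does not state: a_i and a_j are equal-length scalar time series (one feature value per frame), c(t) = a_j(t + τ), Γ is taken to be the largest lag searched, and the time shift in (10) is the best lag divided by Γ so that it lies in [0, 1]. The function names are hypothetical.

```python
import math


def lagged_correlation(a_i, a_j, tau):
    """Equation (8): normalized correlation of a_i(t) with c(t) = a_j(t + tau)."""
    n = len(a_i) - tau
    if n <= 0:
        return 0.0
    x = a_i[:n]
    c = a_j[tau:tau + n]
    num = sum(p * q for p, q in zip(x, c)) / n
    den = math.sqrt((sum(p * p for p in x) / n) * (sum(q * q for q in c) / n))
    return num / den if den > 0 else 0.0


def topology_score(a_i, a_j, max_lag):
    """Return (psi, best_lag) under the assumptions stated above."""
    # Equation (9): pick the lag with the strongest correlation.
    best_lag = max(range(max_lag + 1),
                   key=lambda t: lagged_correlation(a_i, a_j, t))
    rho = lagged_correlation(a_i, a_j, best_lag)
    # Assumed normalization of the time shift by Gamma = max_lag.
    tau_hat = best_lag / max_lag if max_lag else 0.0
    # Equation (10): strong correlation at a short delay gives a high score.
    return rho * (1 - tau_hat), best_lag
```

Under this reading, two cameras would be treated as topologically related when the returned score exceeds the 0.5 threshold given in the embodiment.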

Another embodiment of the present invention provides the camera topology estimation results shown in Fig. 2, which were obtained as follows:

This embodiment uses 7 cameras on the first floor of the administrative building of the Shanghai Advanced Research Institute, Chinese Academy of Sciences, and takes the video streams recorded between 11 a.m. and 1 p.m. on one day as the sample for computing the camera topology estimation results.

In the experiment, the 7 cameras are deployed on the same floor: cameras No. 1 and No. 3 are deployed at elevator entrances, and the remaining cameras are deployed at 5 passage entrances and exits. A sketch of the floor and camera deployment is shown in Fig. 3.

The circled numbers in Fig. 2 are camera numbers and correspond to the camera numbers in Fig. 3. In the experimental results, a solid line connecting two cameras indicates that the targets monitored by these two cameras are associated, i.e. the same target appears in the fields of view of both cameras, which reflects the trend of target activity from a probabilistic and statistical perspective. The absence of a solid line between two cameras indicates that they are unassociated or only weakly associated. For example, cameras No. 1, No. 6 and No. 7 are strongly associated: camera No. 7 is located at the main entrance of the whole administrative building, and after entering the floor a person must either go upstairs via the elevator at camera No. 1 or enter the dining hall at the rear of the first floor via the passage at camera No. 6. Because the selected period is lunchtime, many people from the second floor and above take the elevator at camera No. 1 down to the first floor and then enter the dining hall through the passage at camera No. 6; after lunch they return the same way. Cameras No. 2 and No. 3 are unassociated (or only weakly associated) because the elevator at camera No. 3 is a freight elevator used only inside the dining hall, and the dining hall's back kitchen lies between the two cameras, so there is no direct path.
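A graph like the one in Fig. 2 can be derived from pairwise scores by keeping only the camera pairs whose score exceeds the 0.5 threshold. The sketch below shows that step; the function name is hypothetical and the scores in the usage example are made-up placeholders, not the measured results of the experiment.

```python
def build_topology(scores, threshold=0.5):
    """Turn pairwise scores into a sorted list of undirected edges.

    scores: dict mapping (camera_a, camera_b) to the estimate Psi.
    An edge (the solid line in the result graph) is kept when Psi > threshold.
    """
    edges = set()
    for (a, b), psi in scores.items():
        if a != b and psi > threshold:
            edges.add((min(a, b), max(a, b)))  # undirected: store once
    return sorted(edges)


# Hypothetical scores for illustration only.
example_scores = {(1, 6): 0.8, (6, 1): 0.7, (1, 7): 0.9, (2, 3): 0.1, (6, 7): 0.5}
example_edges = build_topology(example_scores)
```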

Another embodiment of the present invention provides a system for estimating the network topology of cameras in a monitoring scene. The module structure of the system is shown in Fig. 4 and specifically comprises:

a decomposition unit 41, configured to decompose the monitoring scene in the video stream captured by each camera in the monitoring network into grids;

an acquisition unit 42, configured to obtain, for each monitoring scene, color histogram information of the optical flow of each grid in the monitoring scene;

a clustering unit 43, configured to cluster, for each monitoring scene, the grids in the monitoring scene according to the color histogram information of the optical flow of each grid in the monitoring scene, to obtain a semantic region segmentation result of the monitoring scene;

a determination unit 44, configured to determine the network topology between the cameras in the monitoring network according to the semantic region segmentation result of each monitoring scene.
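The four units can be pictured as stages of one pipeline. The sketch below wires them together as interchangeable callables; the class name, field names and data shapes are assumptions made for illustration and are not part of the described system.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TopologyEstimator:
    decompose: Callable   # unit 41: monitoring scene -> grids
    acquire: Callable     # unit 42: grids -> per-grid optical flow features
    cluster: Callable     # unit 43: features -> semantic regions
    determine: Callable   # unit 44: {camera: regions} -> network topology

    def run(self, scenes):
        """scenes: dict mapping a camera id to its monitoring scene."""
        regions = {}
        for camera, scene in scenes.items():
            grids = self.decompose(scene)
            features = self.acquire(grids)
            regions[camera] = self.cluster(features)
        # The determination unit sees the regions of all cameras at once.
        return self.determine(regions)
```

Passing the units in as callables mirrors the remark in the description that the modules are divided by function and that other divisions are possible as long as the functions are realized.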

Optionally, the acquisition unit 42 is specifically configured to:

Define

W(X; p) = \begin{pmatrix} (1 + p_1) x + p_3 y + p_5 \\ p_2 x + (1 + p_4) y + p_6 \end{pmatrix} \quad (1)

Define

p = \arg\min_{p} \sum_{x} [I(W(x; p + \Delta p)) - T(x)]^{2} \quad (2)

\nabla I = W(\Delta x; p) = \begin{pmatrix} I_x + p_5 \\ I_y + p_6 \end{pmatrix} \quad (3)

H = \sum_{x} [\nabla I]^{T} [\nabla I] \quad (4)

\Delta p = H^{-1} \sum_{x} [\nabla I]^{T} [T(x) - I(W(x; p))] \quad (5)

According to the color optical flow information of the grid, compute the histogram information of the optical flow in 8 directions. This histogram information of the optical flow in 8 directions is the color histogram feature of the optical flow of the grid, and the color histogram feature of the optical flow of the grid comprises the optical flow u'_b in the horizontal direction and the optical flow v'_b in the vertical direction.

Optionally, the clustering unit 43 is specifically configured to:

u_n = \sum_{b \in r_n} u'_b \quad (6)

v_n = \sum_{b \in r_n} v'_b \quad (7)

Optionally, the determination unit 44 is specifically configured to:

\rho_{a_i, a_j}(\tau) = \frac{E[a_i c]}{\sqrt{E[a_i^{2}] E[c^{2}]}} \quad (8)

\hat{\tau}_{a_i, a_j} = \arg\max_{\tau} \frac{\sum \rho_{a_i, a_j}(\tau)}{\Gamma} \quad (9)

\Psi_{i,j} = \rho_{a_i, a_j}(\tau) (1 - \hat{\tau}_{a_i, a_j}) \quad (10)

where a_i denotes the color histogram feature of the optical flow of a first grid, the first grid being decomposed from the first monitoring scene, which is captured by the first camera; a_j denotes the color histogram feature of the optical flow of a second grid, the second grid being decomposed from the second monitoring scene, which is captured by the second camera; the first camera and the second camera are any two cameras in the monitoring network; c denotes the second grid after time τ; \rho_{a_i, a_j}(\tau) denotes the degree of association between the color histogram feature of the optical flow of the first grid and that of the second grid; \hat{\tau}_{a_i, a_j} denotes the time shift between the first grid and the second grid; and \Psi_{i,j} denotes the topology estimation result for the first camera and the second camera.

Optionally, the 8 directions are directions spaced 45 degrees apart.

A person of ordinary skill in the art will appreciate that the modules in the above embodiments are divided according to functional logic, but the division is not limited to the above as long as the corresponding functions can be realized; in addition, the specific names of the functional modules only serve to distinguish them from one another and do not limit the protection scope of the present invention.

A person of ordinary skill in the art will also appreciate that all or part of the steps of the methods in the above embodiments can be completed by a program instructing the relevant hardware; the program may be stored in a readable storage medium, such as ROM/RAM.

The above are only preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its protection scope.

Claims

1. A method for estimating the network topology of cameras in a monitoring network, characterized in that the method comprises:

2. The method of claim 1, characterized in that obtaining the color histogram information of the optical flow of each grid in the monitoring scene specifically comprises:

Define

W(X; p) = \begin{pmatrix} (1 + p_1) x + p_3 y + p_5 \\ p_2 x + (1 + p_4) y + p_6 \end{pmatrix} \quad (1)

Define

p = \arg\min_{p} \sum_{x} [I(W(x; p + \Delta p)) - T(x)]^{2} \quad (2)

\nabla I = W(\Delta x; p) = \begin{pmatrix} I_x + p_5 \\ I_y + p_6 \end{pmatrix} \quad (3)

H = \sum_{x} [\nabla I]^{T} [\nabla I] \quad (4)

\Delta p = H^{-1} \sum_{x} [\nabla I]^{T} [T(x) - I(W(x; p))] \quad (5)

3. The method of claim 2, characterized in that, for each monitoring scene, clustering the grids in the monitoring scene according to the color histogram information of the optical flow of each grid in the monitoring scene to obtain the semantic region segmentation result of the monitoring scene specifically comprises:

u_n = \sum_{b \in r_n} u'_b \quad (6)

v_n = \sum_{b \in r_n} v'_b \quad (7)

4. The method of claim 3, characterized in that determining the network topology between the cameras in the monitoring network according to the semantic region segmentation result of each monitoring scene specifically comprises:

\rho_{a_i, a_j}(\tau) = \frac{E[a_i c]}{\sqrt{E[a_i^{2}] E[c^{2}]}} \quad (8)

\hat{\tau}_{a_i, a_j} = \arg\max_{\tau} \frac{\sum \rho_{a_i, a_j}(\tau)}{\Gamma} \quad (9)

\Psi_{i,j} = \rho_{a_i, a_j}(\tau) (1 - \hat{\tau}_{a_i, a_j}) \quad (10)

5. The method of claim 2, characterized in that the 8 directions are directions spaced 45 degrees apart.

6. A system for estimating the network topology of cameras in a monitoring network, characterized in that the system comprises:

a clustering unit, configured to cluster, for each monitoring scene, the grids in the monitoring scene according to the color histogram information of the optical flow of each grid in the monitoring scene, to obtain a semantic region segmentation result of the monitoring scene;

7. The system of claim 6, characterized in that the acquisition unit is specifically configured to:

Define the video stream captured by a camera as I_n(X), where X is the coordinate of a grid in the monitoring scene of the video stream, X = (x, y)^T, x is the horizontal coordinate (abscissa) of the grid, y is the vertical coordinate (ordinate) of the grid, T denotes matrix transposition, and n is the index of the video frame within the video stream; define

W(X; p) = \begin{pmatrix} (1 + p_1) x + p_3 y + p_5 \\ p_2 x + (1 + p_4) y + p_6 \end{pmatrix} \quad (1)

where W denotes the deformable template, p = (p_1, p_2, p_3, p_4, p_5, p_6)^T, p_1, p_2, p_3 and p_4 are 0, and p_5 and p_6 are the optical flow information of the grid; define

p = \arg\min_{p} \sum_{x} [I(W(x; p + \Delta p)) - T(x)]^{2} \quad (2)

\nabla I = W(\Delta x; p) = \begin{pmatrix} I_x + p_5 \\ I_y + p_6 \end{pmatrix} \quad (3)

H = \sum_{x} [\nabla I]^{T} [\nabla I] \quad (4)

\Delta p = H^{-1} \sum_{x} [\nabla I]^{T} [T(x) - I(W(x; p))] \quad (5)

8. The system of claim 7, characterized in that the clustering unit is specifically configured to:

u_n = \sum_{b \in r_n} u'_b \quad (6)

v_n = \sum_{b \in r_n} v'_b \quad (7)

9. The system of claim 8, characterized in that the determination unit is specifically configured to:

\rho_{a_i, a_j}(\tau) = \frac{E[a_i c]}{\sqrt{E[a_i^{2}] E[c^{2}]}} \quad (8)

\hat{\tau}_{a_i, a_j} = \arg\max_{\tau} \frac{\sum \rho_{a_i, a_j}(\tau)}{\Gamma} \quad (9)

\Psi_{i,j} = \rho_{a_i, a_j}(\tau) (1 - \hat{\tau}_{a_i, a_j}) \quad (10)

where a_i denotes the color histogram feature of the optical flow of a first grid, the first grid being decomposed from the first monitoring scene, which is captured by the first camera; a_j denotes the color histogram feature of the optical flow of a second grid, the second grid being decomposed from the second monitoring scene, which is captured by the second camera; the first camera and the second camera are any two cameras in the monitoring network; c denotes the second grid after time τ; \rho_{a_i, a_j}(\tau) denotes the degree of association between the color histogram feature of the optical flow of the first grid and that of the second grid; \hat{\tau}_{a_i, a_j} denotes the time shift between the first grid and the second grid; and \Psi_{i,j} denotes the topology estimation result for the first camera and the second camera.

10. The system of claim 7, characterized in that the 8 directions are directions spaced 45 degrees apart.