CN102116621B - Object tracking method and system across sensors - Google Patents

Object tracking method and system across sensors

Info

Publication number
CN102116621B
Authority
CN
China
Prior art keywords
sensor
entry
poor
object tracking
probability function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN 201010002105
Other languages
Chinese (zh)
Other versions
CN102116621A (en)
Inventor
黄钟贤
周正全
吴瑞成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial Technology Research Institute ITRI
Original Assignee
Industrial Technology Research Institute ITRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial Technology Research Institute ITRI filed Critical Industrial Technology Research Institute ITRI
Priority to CN 201010002105 priority Critical patent/CN102116621B/en
Publication of CN102116621A publication Critical patent/CN102116621A/en
Application granted granted Critical
Publication of CN102116621B publication Critical patent/CN102116621B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to a cross-sensor object tracking method applied to a sensor network. The method comprises a training phase and a detection phase. In the training phase, measurement data from multiple sensors are collected through each sensor in the sensor network as training samples; at least one entry/exit point is designated within the measurement ranges of the sensors in the network; and at least three object-related characteristic functions are estimated through automatic learning, namely a function of the spatial correlation between the sensors in the network, a function of the travel-time difference, and a function of the appearance similarity. In the detection phase, these at least three characteristic functions serve as the criteria for linking object tracks and establishing correlations.

Description

Object tracking method and system across sensors
Technical field
The present invention discloses an object tracking method and system for use in a sensor network.
Background technology
In recent years, automatic monitoring systems aided by computer vision techniques have played an increasingly important role. A video surveillance system analyzes the behavior of moving persons in monitored images, detects the occurrence of abnormal security events, and effectively notifies security officers to respond. Basic topics in visual surveillance, such as background subtraction, moving-object detection and tracking, and shadow removal, have already been studied in a considerable body of literature. Higher-order event detection, such as behavior analysis, unattended object detection, linger detection, and jam detection, is expected to be in great demand for automated, intelligent behavior analysis. A stable moving-object tracking technique is one of the basic modules of an intelligent visual surveillance system.
The measurement range of a single sensor, for example the field of view of a camera, cannot completely cover the environment to be monitored. A camera network composed of multiple cameras, constrained by cost considerations, often has non-overlapping fields of view, and as the number of cameras grows, color calibration between cameras and the network structure also become more complicated. Taiwan Patent Publication No. 200806020 discloses an image tracking technique in which multiple fixed cameras with preset priorities cooperate with PTZ cameras to perform object tracking. When the field of view of a camera with priority detects a moving object, a PTZ camera is activated to track the moving object so that its field of view covers that of the fixed camera.
Another document, Taiwan Patent Publication No. 200708102, discloses a video monitoring system that fuses data from multiple monitoring cameras for a wide-area scene. For the scene to be monitored, it provides a photo of the target site, a dimensional map of the target site, and a sensor network model of the site. As shown in Fig. 1, this information can be stored in a map-view mapping 104, a human-size map 108, and a camera network model 112, which are produced and managed by a map-based calibrator 102, a view-based calibrator 106, and a camera network model manager 110, respectively.
U.S. Patent No. 7,149,325 discloses a cooperative camera network framework that records pedestrians' color features and stores them in a database for person re-identification; tracking of a moving object can only be completed where the fields of view of the cameras overlap. U.S. Patent No. 7,394,916 discloses a target tracking method in which, when a person appears in different cameras, the appearances of all previously departed persons in other scenes and the likelihoods of transition between scenes are compared as the basis for person tracking and association. The transition likelihoods are set by the user according to the scene blueprint, moving-object speed, and gateway distance or traffic behavior.
China Patent Publication No. 101142593A discloses a method of tracking a target in video sequences. When a foreground object appears in different cameras, the method compares changes in its appearance features; when comparing different foreground objects, it also performs extra comparisons for the case where foreground objects merge, thereby avoiding the situation in which the correct corresponding foreground object cannot be found because of merging. When comparing foreground objects across cameras, a combination of the color distribution and edge-density information of the foreground objects is used to compute the correlation between them.
China Patent Publication No. 101090485A discloses an image monitoring system and an object-area tracking method, in which the functional modules of an image processing unit 200, shown in Fig. 2, perform object-area detection and object-area tracking of moving objects in images. For object-area tracking between different cameras, the unit uses unique identification information to associate the current object area with past object areas. When an object area is occluded and disappears, the object-area tracking module keeps the identification information of the disappeared object area and, when that object area reappears, assigns the retained identification information to it.
Cross-camera person tracking is then based on object features such as color and time of appearance: in the training stage, persons are labeled manually, probability distributions between different cameras are derived from the training samples, and in the test stage the trained probability distributions are used to associate objects across cameras, thereby achieving cross-camera object tracking.
Summary of the invention
The embodiments disclosed herein provide a cross-sensor object tracking method and system, where the object tracking is carried out in a sensor network containing multiple sensors.
In one embodiment, a cross-sensor object tracking method is disclosed, comprising a training phase and a detection phase. In the training phase, a plurality of sensor measurement data are obtained through each sensor in the sensor network as training samples; at least one entry/exit point is designated within the measurement ranges of the sensors in the network; and at least three object-related characteristic functions are estimated through automatic learning, including a function of the spatial correlation between the sensors in the network, a function of the travel-time difference, and a function of the appearance similarity. In the detection phase, these at least three characteristic functions serve as the criteria for linking object tracks and establishing correlations.
In another embodiment, a cross-sensor object tracking system is disclosed. The system may comprise multiple sensors in the sensor network, a training-phase processing module, a characteristic function estimating and updating module, and a detection-phase tracking module, wherein at least one entry/exit point has been designated within the measurement ranges of the sensors. The training-phase processing module obtains, through each sensor, a plurality of sensor measurement data as training samples and, for each enter event occurring at each entry/exit point of each sensor, first records all leave events within a preceding period of time in a training sample space. The characteristic function estimating and updating module uses the samples in the training sample space to estimate, through automatic learning, at least three object-related characteristic functions, including a function of the spatial correlation between the sensors in the network, a function of the travel-time difference, and a function of the appearance similarity. The detection-phase tracking module uses the estimated at least three characteristic functions as the criteria for linking object tracks and establishing correlations.
Description of drawings
The above and other objects and advantages of the present invention are described in detail below with reference to the accompanying drawings, in which:
Fig. 1 is a schematic example of the site-model manager of a wide-area video monitoring system.
Fig. 2 is a schematic example illustrating the functional modules of the image processing unit of an image monitoring system.
Fig. 3A is a schematic example of the fields of view and entry/exit points of the cameras in a camera network, consistent with certain disclosed embodiments.
Fig. 3B is a schematic example illustrating that the cross-camera object tracking problem is equivalent to the problem of associating the objects observed at the entry/exit points at different times and different entry/exit points, consistent with certain disclosed embodiments.
Fig. 4 illustrates, when a person enters a camera's field of view from an entry/exit point, the basis for tracking where that person left, consistent with certain disclosed embodiments.
Fig. 5 is an exemplary flowchart illustrating a cross-sensor object tracking method, consistent with certain disclosed embodiments.
Fig. 6A to Fig. 6C are schematic examples illustrating the whole training phase, consistent with certain disclosed embodiments.
Fig. 7 is a block diagram illustrating the design of the recursive learning strategy, consistent with certain disclosed embodiments.
Fig. 8 is an exemplary flowchart detailing the steps of the recursive learning strategy, consistent with certain disclosed embodiments.
Fig. 9A and Fig. 9B are schematic examples illustrating the experimental scene and camera configuration of a working example in a camera network, consistent with certain disclosed embodiments.
Fig. 10 is a schematic example of a training result of Fig. 9 with an actual correspondence, in which Fig. 10A and Fig. 10B are the histograms H(Δa) and H(Δt), respectively, and Fig. 10C is the mixture-of-Gaussians approximation of histogram H(Δt), consistent with certain disclosed embodiments.
Fig. 11 is a schematic example of a training result of Fig. 9 without an actual correspondence, in which Fig. 11A and Fig. 11B are the histograms H(Δa) and H(Δt), respectively, and Fig. 11C is the approximate model of histogram H(Δt), consistent with certain disclosed embodiments.
Fig. 12 is a schematic example of a query result in which the correct correlated event is found, in which Fig. 12A is the enter event of the queried person and Fig. 12B, Fig. 12C, and Fig. 12D are three possibly correlated leave events found, consistent with certain disclosed embodiments.
Fig. 13 is a schematic example of a query result in which no correlated event can be found, in which Fig. 13A is the enter event of the queried person and Fig. 13B and Fig. 13C are two found events with very low correlation values, consistent with certain disclosed embodiments.
Fig. 14 is a schematic example of a cross-sensor object tracking system, consistent with certain disclosed embodiments.
Embodiment
Cross-sensor moving-object tracking is defined over a sensor network comprising k sensors C_1, C_2, ..., C_k, where each sensor C_k contains n_k entry/exit points. For instance, the measurement range of sensor C_1 contains n1 entry/exit points a_1, a_2, ..., a_n1, and the measurement range of sensor C_2 contains n2 entry/exit points b_1, b_2, ..., b_n2. An entry/exit point is a region within a sensor's measurement range through which objects appear or disappear. Assuming the entry/exit points within each sensor's measurement range are well defined and object tracking within a single sensor's measurement range is solved, cross-sensor object tracking is equivalent to the problem of associating the objects observed at the entry/exit points at different times and different entry/exit points.
In the disclosed embodiments, the sensors may be of many kinds. For example, a sensor may be a color camera, and a camera network may be constructed to track moving objects among the cameras; however, the sensors are not limited to this type and may also be sensing devices of other kinds, such as black-and-white cameras, thermal cameras, infrared cameras, microphones, ultrasonic sensors, laser rangefinders, or weighing scales.
Taking cameras as the sensors and their fields of view as the measurement ranges, suppose the fields of view of three cameras are A, B, and C. Field of view A contains entry/exit points A1 and A2; field of view B contains entry/exit points B1, B2, and B3; field of view C contains entry/exit points C1 and C2, as shown in the example of Fig. 3A. In this example, suppose 14 different object images in total are observed at the entry/exit points at different times, as shown in the example of Fig. 3B, where, for instance, object image 311 appears in a camera's field of view when entering at A1, and object image 312 disappears from a camera's field of view when leaving at A1; object image 321 appears in a camera's field of view when entering at B1, and object image 322 disappears from a camera's field of view when leaving at B3; object image 331 appears in a camera's field of view when entering at C1. Cross-camera object tracking is then equivalent to correctly establishing the dashed-line link 310 between object images 312 and 321, and the dashed-line link 320 between object images 322 and 331. By analogy, correctly establishing the dashed-line links among all object images completes cross-camera object tracking.
Taking persons as an example, if O_i_p denotes person p observed at entry/exit point i, then the appearance feature O_i_p(a), together with the time difference between a person leaving one entry/exit point and entering another as feature O_i_p(t), can serve as the basis for person association to accomplish object tracking. Further, a movement event M((i,j),(p,q)) can be defined, denoting that person p leaves a camera's field of view at entry/exit point i and person q enters a camera's field of view at entry/exit point j; if the leaving and entering persons are the same person, then p = q.
Thus, this association problem can be expressed as the conditional probability P(M((i,j),(p,q)) | O_i_p, O_j_q), which is the probability, given the observations (O_i_p, O_j_q), that persons p and q are the same person who moved from entry/exit point i to entry/exit point j, where i and j belong to different cameras. Therefore, if person q is observed entering an entry/exit point j at time t, and within the time interval (t − t_Max, t) there is an event set E = {O_i1_p1, O_i2_p2, ...} occurring at entry/exit points other than j, where t_Max denotes the maximum time required to move between any two entry/exit points in this camera network, then the most probable correlated event of q is found by the following formula (1), completing cross-camera person tracking:
O_i_p* = argmax P(M((i,j),(p,q)) | O_i_p, O_j_q),  ∀ O_i_p ∈ E    (1)
As mentioned above, for each observation (i.e., each moving person), the appearance difference Δa between its samples in different cameras and the travel-time difference Δt of the person across cameras can be computed as features. Assuming that a moving person's appearance changes little while moving within the camera network and that most people move at roughly the same speed, both reasonable assumptions in most scenes, the term P(M((i,j),(p,q)) | O_i_p, O_j_q) in formula (1) can be rewritten by Bayes' rule as:
P(M((i,j),(p,q)) | O_i_p, O_j_q)
  = P(O_i_p, O_j_q | M((i,j),(p,q))) · P(M((i,j),(p,q))) / P(O_i_p, O_j_q)
  = P(Δa(p,q) | M((i,j),(p,q))) · P(Δt(p,q) | M((i,j),(p,q))) · P(M((i,j),(p,q))) / P(O_i_p, O_j_q)
  ≈ c · P(Δa(p,q) | M((i,j),(p,q))) · P(Δt(p,q) | M((i,j),(p,q))) · P(M((i,j),(p,q)))
Because P(O_i_p, O_j_q) can be approximated by a uniform distribution, it is replaced by a constant value c; similarly, P(M((i,j),(p,q))) is proportional to P(M(i,j)), so formula (1) can be rewritten as:
O_i_p* = argmax P(Δa(p,q) | M((i,j),(p,q))) · P(Δt(p,q) | M((i,j),(p,q))) · P(M(i,j)),  ∀ O_i_p ∈ E    (2)
Taking persons as an example, the meaning of formula (2) is: when a person q enters a camera's field of view from an entry/exit point, from where did that person leave? The basis for tracking person q is as follows: trace back over the time window ΔT to all persons who left any camera's field of view, and maximize formula (2), which is positively correlated with P(Δa(p,q) | M((i,j),(p,q))), P(Δt(p,q) | M((i,j),(p,q))), and P(M(i,j)), i.e., positively correlated with features such as the appearance similarity, the travel-time difference, and the camera spatial correlation; these features can be estimated with probability functions. Taking the fields of view and entry/exit points of the three cameras of Fig. 3A as an example, Fig. 4 illustrates, when a person enters a camera's field of view from entry/exit point A2, the basis for tracking where that person left, consistent with certain disclosed embodiments.
In the example of Fig. 4, suppose a person enters a camera's field of view at time t from entry/exit point A2, and the camera image of this enter event is obtained as camera sample 401. Then all persons who left any camera's field of view within the time window ΔT are traced back, i.e., all persons who left within the interval (t − ΔT, t). Suppose the camera samples 411-413 of three leave events are found, leaving from entry/exit points B2, B3, and C1 at times t1, t2, and t3, respectively, with correlation results P1, P2, and P3. Correlation P1 is positively correlated with the appearance similarity of camera sample 411, the camera spatial correlation M(A2, B2), and the travel-time difference (t − t1); correlation P2 is positively correlated with the appearance similarity of camera sample 412, the camera spatial correlation M(A2, B3), and the travel-time difference (t − t2); correlation P3 is positively correlated with the appearance similarity of camera sample 413, the camera spatial correlation M(A2, C1), and the travel-time difference (t − t3). From P1, P2, and P3, the person's leave event with the highest correlation can be selected as the correct correlated event.
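As a concrete illustration of this selection step, the following minimal Python sketch scores each candidate leave event by the product in formula (2) and returns the argmax. It is a sketch under assumptions, not the patent's implementation: observations are assumed to be dicts with hypothetical keys `appearance` (a feature vector), `time`, and `gate`, and `models` is assumed to map each entry/exit-point pair to its already-estimated P(Δa | M), P(Δt | M), and P(M).

```python
import numpy as np

def correlation_score(p_da, p_dt, p_m, enter_obs, leave_obs):
    # Formula (2): product of the appearance-difference likelihood,
    # the travel-time-difference likelihood, and the spatial prior P(M).
    delta_a = np.linalg.norm(enter_obs["appearance"] - leave_obs["appearance"])
    delta_t = enter_obs["time"] - leave_obs["time"]
    return p_da(delta_a) * p_dt(delta_t) * p_m

def most_likely_origin(enter_obs, leave_events, models, delta_T):
    # Trace back over the window (t - ΔT, t); None means no candidate exists.
    t = enter_obs["time"]
    candidates = [e for e in leave_events if t - delta_T < e["time"] < t]
    if not candidates:
        return None
    def score(e):
        p_da, p_dt, p_m = models[(e["gate"], enter_obs["gate"])]
        return correlation_score(p_da, p_dt, p_m, enter_obs, e)
    return max(candidates, key=score)
```

In the Fig. 4 example, camera samples 411-413 would be the three candidates, and their scores would correspond to P1, P2, and P3.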
The present disclosure aims to perform object tracking in a sensor network without common measurement ranges, without requiring the scene blueprint information of the sensors, and without prior operator intervention in the learning process. Using a large amount of data together with machine learning and statistics, it automatically learns P(Δa(p,q) | M((i,j),(p,q))), P(Δt(p,q) | M((i,j),(p,q))), and P(M(i,j)) of formula (2), i.e., features such as the appearance similarity, the travel-time difference, and the sensor spatial correlation. The present disclosure proposes an automatic learning method for estimating the required probability functions; this method neither limits the number of samples appearing in the training data nor requires manual labeling of corresponding persons, and can be a recursive training method, described in detail below.
After the training data are organized into a training sample space according to their entry/exit-point relations, features such as the appearance and time of moving objects are measured to automatically estimate the spatial correlation between cameras, the distribution of the time differences between leaving and entering points, and the distribution of the appearance color differences of objects, as the basis for object tracking. Accordingly, the disclosed cross-sensor object tracking technique can be implemented in a training phase and a detection phase. Fig. 5 is an exemplary flowchart illustrating a cross-sensor object tracking method, consistent with certain disclosed embodiments.
In the training phase, a plurality of sensor measurement data are obtained as training samples through each sensor in the sensor network, as shown in step 510. At least one entry/exit point is designated within the measurement ranges of the sensors in the network, as shown in step 520. Through automatic learning, at least three object-related characteristic functions are estimated, including a function of the spatial correlation between the sensors in the network, a function of the time difference of objects passing between different sensors' measurement ranges, and a function of the appearance similarity difference of objects, as shown in step 530. In the detection phase, these at least three characteristic functions serve as the criteria for linking object tracks and establishing correlations, as shown in step 540.
As mentioned above, the automatic learning method may be, for example, a recursive learning strategy. The schematic examples of Fig. 6A to Fig. 6C below illustrate the whole training phase, consistent with certain disclosed embodiments.
First, an n × n training sample space can be configured in a memory, where n is the total number of entry/exit points in the whole sensor network. Each field in this space stores the pairwise-correlated enter/leave events; the space can be represented, for example, by an n × n matrix, where field (d, b) of the matrix holds, for each observed event of an object entering entry/exit point d, all events of leaving entry/exit point b within a preceding specific time period. For instance, if an enter event q occurs at entry/exit point d at time t, the past time interval t_Max is traced back, i.e., the interval (t − t_Max, t); if a leave event exists at another entry/exit point of a different sensor, for example b, that event pair is collected and placed at position (d, b) of the training sample space, i.e., field (d, b). In other words, each field of this sample space carries the sensor spatial correlation.
Taking cameras as the sensors and their fields of view as the measurement ranges, Fig. 6A is a schematic example of how to configure a 7 × 7 training sample space, represented by a 7 × 7 matrix 630, consistent with certain disclosed embodiments. The whole camera network has three camera fields of view A, B, and C with 7 entry/exit points in total: field of view A contains entry/exit points A1 and A2; field of view B contains B1, B2, and B3; field of view C contains C1 and C2. Suppose the full set of training sample data 610 comprises a plurality of object images, e.g., object image 1, object image 2, object image 3, and so on. From all training sample data 610, table 615, for example, lists all samples entering A2 (e.g., object image 1, object image 5, ...) and all samples leaving B2 (e.g., object image 6, object image 7, object image 10, ...), and field (A2, B2) of matrix 630 holds, for each observed event of an object entering A2, all events of leaving B2 within the preceding time period ΔT, e.g., (object image 1, object image 6), (object image 1, object image 7), and so on, denoted by label 620. As can be seen, each field of matrix 630 carries the correlation between the entry/exit points of these three cameras.
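The construction of this sample space can be sketched as follows — a minimal illustration of the data structure only, assuming the same hypothetical event dicts as above, here with `gate`, `sensor`, and `time` keys:

```python
from collections import defaultdict

def build_sample_space(enter_events, leave_events, t_max):
    # Field (d, b) of the n x n training sample space collects, for each
    # enter event at gate d, every leave event at a gate b of a different
    # sensor that occurred within the preceding window (t - t_max, t).
    pool = defaultdict(list)
    for q in enter_events:
        for o in leave_events:
            if o["sensor"] != q["sensor"] and 0 < q["time"] - o["time"] < t_max:
                pool[(q["gate"], o["gate"])].append((q, o))
    return pool
```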
After all training sample data have been processed, the events stored at position (d, b) are used to train the corresponding probability distribution functions P(Δa | M(b, d)), P(Δt | M(b, d)), and P(M(b, d)). Clearly, if a link exists between two entry/exit points and the time objects need to move between them is less than t_Max, then the correct enter/leave correlated events must be included in the event set associated with these entry/exit points; but likewise, incorrect enter/leave correlated events are also included in this set. Therefore, a repeated recursive procedure can next screen out the low-reliability events one by one, keeping only the correct correlated events. From the correct events retained in the sample space, the required corresponding probability distribution functions can then be successfully estimated.
In the disclosed example, for each field, the appearance-feature difference and the cross-sensor travel-time difference of persons are taken as features and tallied in histograms. For example, the first step is, for each possibly existing link, to start from the statistical distribution of the appearance similarity and remove the outliers; the second step is, among the data with higher appearance similarity, to look for data with a distinct time interval. After repeating these two steps, if a link really exists, converging distributions of the appearance difference and the time interval can be found.
Taking fields (A2, B2) and (A2, C2) as examples, the example of Fig. 6B is the histogram taken over the appearance-feature difference, with the horizontal axis representing the appearance difference Δa and the vertical axis its statistical distribution H(Δa). Removing the outliers means screening out the small portion of potential outliers, i.e., the data at the far right of the histogram. The example of Fig. 6C takes, from the remaining data with higher similarity, the cross-sensor travel-time difference of objects as the feature for the histogram, in order to find the data with a more distinct time interval; the horizontal axis represents the travel-time difference Δt and the vertical axis the histogram count H(Δt). The symbol x denotes the removal of those data.
The two histograms H(Δa) and H(Δt) can be approximated by mixture Gaussian models. In histogram H(Δa), one can expect a Gaussian with smaller mean and variance and other Gaussians with larger mean and variance: the consistency of a moving object's appearance makes correct matches yield a smaller appearance difference Δa, corresponding to the Gaussian with smaller mean and variance, while the Gaussians with larger mean and variance correspond to sample outliers, which are then further deleted.
By the same reasoning, if a physical-space link exists between two entry/exit points, then in histogram H(Δt) there must be a Gaussian with small variance corresponding to the correct samples, whose mean represents the travel time a person needs to cross between the two entry/exit points, while the Gaussians with larger variance correspond to sample outliers. Otherwise, any two entry/exit points with high probability have no spatial link; in that case the distributions of H(Δa) and H(Δt) should be rather random, presenting uniform distributions, and P(M) approaches 0.
Since the characteristic similarity functions of linked entry/exit points have the above properties, the final correct samples can be filtered out by repeated recursion. First, histograms H(Δa) and H(Δt) are tallied over all possible samples. For H(Δa), the small portion of potential outliers, i.e., the data at the far right of the histogram, is screened out, while H(Δt) is simultaneously updated and observed for whether its Gaussian shows a concentrating trend. If so, the screening of H(Δa) and updating of the H(Δt) data continue until the similarity distribution functions converge; otherwise, if there is no concentrating trend and P(M) is small relative to other combinations, the two entry/exit points have no physical-space link.
Fig. 7 is a block diagram illustrating the design of the recursive learning strategy, consistent with certain disclosed embodiments. Referring to Fig. 7, first, for every two entry/exit points d and b, i.e., all fields (d, b) of the n × n matrix, all possible corresponding leave events are collected according to the example of Fig. 6A to build an event pool 710. From the event pool 710, the appearance-difference probability function P(Δa) and the travel-time-difference probability function P(Δt) are estimated and updated. The estimation and updating of the appearance-difference probability function comprise estimating P(Δa), data trimming, and updating P(Δa); in this data trimming, a mixture Gaussian model G1(Δa) approximates the appearance-difference probability function P(Δa) and the peripheral outlier data are removed. The estimation and updating of the travel-time-difference probability function comprise estimating P(Δt), data trimming, and updating P(Δt); in this data trimming, another mixture Gaussian model G2(Δt) approximates the travel-time-difference probability function P(Δt) and the data without central tendency are removed.
After updating P(Δt), whether the travel-time-difference probability function P(Δt) has converged is judged; if not, the procedure returns to the event pool 710 and continues to estimate and update the appearance-difference probability function P(Δa) and the travel-time-difference probability function P(Δt); if converged, the procedure ends. Whether to remove a peripheral outlier datum can be decided according to whether the conditional probability P(Δa | G1) is less than a predetermined value K1; likewise, whether to remove a datum without central tendency can be decided according to whether the conditional probability P(Δt | G2) is less than a predetermined value K2. The convergence condition of the travel-time-difference probability function P(Δt) is, for example, that the number of removed events is less than a predetermined number K3. The larger K1 and K2 are set, the higher the proportion of data deleted and the sooner the convergence condition is reached, though setting them too high may delete too much data; the larger K3 is set, the more easily the convergence condition is satisfied, but many events without physical links may be left over. K1, K2, and K3 can be set according to the actual application environment and requirements, for example as empirical values from experiments.
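A minimal sketch of this recursive trimming loop follows, with scikit-learn's GaussianMixture standing in for the mixture Gaussian models G1(Δa) and G2(Δt). The threshold values for K1, K2, and K3 are illustrative placeholders — the patent leaves them as empirical settings — and the event-pair representation is the same hypothetical one used above:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def recursive_trim(pairs, k1=1e-3, k2=1e-3, k3=5, max_iter=20):
    # Alternately fit G1(Δa) and G2(Δt), trimming pairs whose likelihood
    # falls below K1 / K2, until fewer than K3 events are removed in a pass.
    pairs = list(pairs)
    for _ in range(max_iter):
        if len(pairs) < 8:          # too few samples left to fit the models
            break
        da = np.array([[np.linalg.norm(q["appearance"] - o["appearance"])]
                       for q, o in pairs])
        dt = np.array([[q["time"] - o["time"]] for q, o in pairs])

        g1 = GaussianMixture(n_components=2).fit(da)   # G1(Δa)
        keep = np.exp(g1.score_samples(da)) >= k1      # drop peripheral outliers
        pairs = [p for p, k in zip(pairs, keep) if k]
        dt = dt[keep]
        if len(pairs) < 8:
            break

        g2 = GaussianMixture(n_components=2).fit(dt)   # G2(Δt)
        keep = np.exp(g2.score_samples(dt)) >= k2      # drop data without central tendency
        removed = int((~keep).sum())
        pairs = [p for p, k in zip(pairs, keep) if k]

        if removed < k3:            # convergence: fewer than K3 events removed
            break
    return pairs
```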
Following the above, Fig. 8 is an exemplary flowchart illustrating the steps of the recursive learning strategy, consistent with certain disclosed embodiments. Referring to Fig. 8, for each enter event occurring at each entry/exit point of each sensor, all leave events within a preceding period of time are first recorded in a training sample space, as shown in step 810. From the samples in the training sample space, the entry/exit-point correlation probability function, the travel-time-difference probability function, and the appearance-difference probability function are estimated, as shown in step 820. The appearance-difference probability function is inspected and the data that are statistically peripheral outliers are removed, as shown in step 830. The travel-time-difference probability function and the appearance-difference probability function are updated from the remaining data, as shown in step 840. Steps 830 and 840 are repeated until the travel-time-difference probability function converges, as shown in step 850.
In step 830, a mixture Gaussian model can approximate the appearance-difference probability function. In step 840, before the travel-time-difference probability function is updated, another mixture Gaussian model can first approximate the travel-time-difference probability function and be used to observe whether the data without central tendency have been removed. In step 850, whether the travel-time-difference probability function has converged can be decided, for example, by whether the number of removed events is less than a predetermined number. After step 850, the data of the final events can be used to estimate the entry/exit-point correlation probability function.
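The patent does not fix the estimator for the entry/exit-point correlation probability function; one plausible sketch — an assumption, not the disclosed method — is to normalize the counts of pairs surviving the trimming, so that fields whose pairs were mostly removed (no physical link) get a P(M) near 0:

```python
def estimate_gate_prior(trimmed_pools):
    # trimmed_pools: {(d, b): surviving (enter, leave) pairs after trimming}.
    # P(M(b, d)) is estimated as that field's share of all surviving pairs.
    total = sum(len(v) for v in trimmed_pools.values()) or 1
    return {key: len(v) / total for key, v in trimmed_pools.items()}
```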
The following is a working example of carrying out the disclosed cross-sensor object tracking method in a camera network with cameras as the sensors. The experimental scene and camera configuration in this camera network are shown in the examples of Fig. 9A and Fig. 9B. The experimental scene of Fig. 9A is set in an office area, with four cameras A, B, C, and D installed with non-overlapping fields of view, i.e., the four dashed regions. In the example of Fig. 9B, field of view 910 of camera A has one entry/exit point a1; field of view 920 of camera B has three entry/exit points b1, b2, and b3; field of view 930 of camera C has two entry/exit points c1 and c2; and field of view 940 of camera D has two entry/exit points d1 and d2.
In this experimental scene, one video segment is used, with the first 7 minutes as the training phase and the last minute as the detection phase. In the training phase, the appearance changes and time differences between the entry/exit points are estimated. In the detection phase, when a query is made with a person's enter event, the persons' leave events with higher correlation to the enter event are listed. All persons' leave events occur within the period (t − t_Max, t), where t is the time point at which the person enters.
For instance, consider the correlation between entry/exit point a1 and entry/exit point b1 in Fig. 9B. According to the experimental results, after six recursions a training result with an actual correspondence is obtained, as shown in the example of Fig. 10, where Fig. 10A and Fig. 10B are the histograms H(Δa) and H(Δt), respectively, and Fig. 10C is the mixture-of-Gaussians approximation of histogram H(Δt). The horizontal axis of Fig. 10A is the appearance difference of the entering and leaving persons in an event pair, where closer to 0 means higher similarity; the horizontal axes of Fig. 10B and Fig. 10C are the time difference of the entering and leaving persons in an event pair, in seconds; the vertical axes of Fig. 10 all represent event counts.
A training result without an actual correspondence is shown in the example of Fig. 11. The two entry/exit points c1 and d1 in Fig. 9B with high probability have no spatial link; in this case the distributions of H(Δa) and H(Δt) should be rather random, as shown in the examples of Fig. 11A and Fig. 11B, respectively, presenting uniform distributions, whose approximate probability model is shown in the example of Fig. 11C, and P(M) approaches 0.
In the detection phase, when a query is made with an object's (e.g., a person's) enter event, the results broadly fall into two kinds: either the person's leave event with the highest correlation is the correct correlated event, or no correlated leave event is found. The example of Fig. 12 shows a query result in which the correct correlated event is found. Fig. 12A is the camera image of the queried person's enter event (entering b2), and Fig. 12B, Fig. 12C, and Fig. 12D are the camera images of three possibly correlated persons' leave events, leaving a1, c2, and d2, respectively. The correlation results of Fig. 12B, Fig. 12C, and Fig. 12D can be computed by formula (2), i.e., the product of P(Δa(p,q) | M((i,j),(p,q))), P(Δt(p,q) | M((i,j),(p,q))), and P(M(i,j)), and are 0.2609, 2.82×10^-100, and 2.77×10^-229, respectively. The leaving person of Fig. 12B and the queried entering person are the same person, showing that the correlation of the correct event is obviously higher than those of the other irrelevant events and that the query result is correct. That is, the criterion for object tracking and correlation is positively correlated with the aforementioned appearance similarity, travel-time difference, and camera spatial correlation, i.e., the criterion of formula (2).
The example of Fig. 13 is a case where, after a query with an entering person, no correlated leaving person can be found. Fig. 13A is the queried person's enter event; because no strongly correlated leave event exists, the two correlated events found, i.e., Fig. 13B and Fig. 13C, have correlations of 7.86×10^-4 and 3.83×10^-138, respectively, which are relatively very low.
Besides the color camera network described above, the disclosed embodiments may also be implemented in sensor network frameworks of other kinds, such as networks of black-and-white cameras, thermal cameras, infrared cameras, microphones, ultrasonic sensors, laser rangefinders, or weighing scales. For such sensors, the object-distinguishing features extracted from each sensor's measurement data replace the appearance features obtained from color cameras. For instance, when the sensor is a black-and-white camera or an infrared camera, the appearance feature can be the texture or gray-intensity distribution of the moving object; for a thermal camera, the appearance feature can be the object's temperature value or temperature distribution; if the information is obtained by a microphone, the feature can be the audio frequency or timbre emitted by different objects; for ultrasonic sensors, laser rangefinders, or weighing scales, the measured height or weight of the moving object can serve as the appearance feature.
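To make the feature substitution concrete, the following toy sketch — with invented extractor functions, purely illustrative — shows how the quantity playing the role of Δa could be computed per sensor kind:

```python
from typing import Callable, Dict
import numpy as np

# Hypothetical per-sensor feature extractors: each maps a raw measurement
# to a vector whose distance plays the role of the appearance difference Δa.
FeatureFn = Callable[[np.ndarray], np.ndarray]

SENSOR_FEATURES: Dict[str, FeatureFn] = {
    "color_camera":   lambda m: m.mean(axis=(0, 1)),    # mean color of the patch
    "bw_camera":      lambda m: np.histogram(m, bins=16, range=(0, 255))[0] / m.size,  # gray-intensity distribution
    "thermal_camera": lambda m: np.array([m.mean()]),   # mean temperature
    "weighing_scale": lambda m: np.ravel(m)[:1],        # measured weight
}

def appearance_difference(kind: str, meas_p: np.ndarray, meas_q: np.ndarray) -> float:
    f = SENSOR_FEATURES[kind]
    return float(np.linalg.norm(f(meas_p) - f(meas_q)))
```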
The above cross-sensor object tracking method can be carried out by an object tracking system, for example in a camera network. Fig. 14 is a schematic example of this object tracking system, consistent with certain disclosed embodiments. In the example of Fig. 14, object tracking system 1400 may comprise m sensors in the sensor network, a training-phase processing module 1410, a characteristic function estimating and updating module 1420, and a detection-phase tracking module 1430, where the m sensors are denoted sensor 1 to sensor m, m ≥ 2, and at least one entry/exit point is designated within the measurement range of each sensor j, 1 ≤ j ≤ m.
Training-phase processing module 1410 obtains, through each sensor j, a plurality of sensor measurement data as training samples and, for each enter event occurring at each entry/exit point of each sensor j, first records all leave events within a preceding period of time in a training sample space 1410a. Characteristic function estimating and updating module 1420 uses the samples in training sample space 1410a to estimate, through automatic learning, at least three object-related characteristic functions, including a function 1421 of the sensor spatial correlation, a function 1422 of the travel-time difference, and a function 1423 of the appearance similarity. Detection-phase tracking module 1430 uses the estimated at least three characteristic functions as the criteria for object tracking and correlation.
The sensors may be arranged in a sensor network without common measurement ranges. Suppose n entry/exit points in total are designated within the measurement ranges of the m sensors; then an n × n training sample space can be configured in a memory. As mentioned above, the training sample space can be represented, for example, by an n × n matrix, where field (d, b) of the matrix holds, for each observed event of an object entering entry/exit point d, all events of leaving entry/exit point b within a preceding specific time period. The aforementioned three characteristic functions can be estimated with the probability functions described above.
In summary, the disclosed embodiments provide a cross-sensor object tracking method and system, where the object tracking can be carried out in a sensor network without common measurement ranges. The disclosed embodiments need not understand the scene blueprint of the sensors, and the learning process needs no prior operator intervention: by measuring features such as the appearance and time of moving objects, and using machine learning and statistics over a large number of observed samples, they automatically estimate the spatial correlation between cameras, the distribution of the time differences between leaving and entering points, and the distribution of the appearance color differences of objects, as the basis for object tracking.
The sensors may be of many kinds: for example, a sensor may be a color camera, with a camera network constructed to track moving objects among the cameras, or a sensing device of another kind, such as a black-and-white camera, thermal camera, infrared camera, microphone, ultrasonic sensor, laser rangefinder, or weighing scale.
The above-described embodiments are merely illustrative of the present disclosure and are not intended to limit the scope of practicing the invention. All equivalent changes and modifications made according to the claims of this application shall remain within the scope covered by the claims of the present invention.

Claims (16)

1. A cross-sensor object tracking method, carried out in a sensor network comprising a plurality of sensors, the method being divided into a training phase and a detection phase and comprising:
in the training phase, obtaining, through each sensor of the plurality of sensors, a plurality of sensor measurement data as training samples;
designating at least one entry/exit point within the measurement range of each sensor;
estimating, through automatic learning, at least three object-related characteristic functions, including a function of the sensor spatial correlation, a function of the travel-time difference, and a function of the appearance similarity; and
in the detection phase, using the at least three characteristic functions as the criteria for linking object tracks and establishing correlations,
wherein the automatic learning is a recursive learning strategy further comprising: (a) for each enter event occurring at each entry/exit point of each sensor, first recording all leave events within a preceding period of time in a training sample space; (b) from the samples in the training sample space, estimating a sensor spatial correlation function, a travel-time-difference probability function, and an appearance-difference probability function; (c) inspecting the appearance-difference probability function and removing the data that are statistically peripheral outliers; (d) updating the travel-time-difference probability function and the appearance-difference probability function from the remaining data; and repeating steps (c) and (d) until the travel-time-difference probability function converges.
2. The cross-sensor object tracking method as claimed in claim 1, wherein the sensor spatial correlation is the correlation between different entry/exit points of the plurality of sensors, the travel-time difference is the time difference of an object passing between the fields of view of different sensors of the plurality of sensors, and the appearance similarity is the similarity difference of the object's appearance.
3. The cross-sensor object tracking method as claimed in claim 1, wherein the at least three characteristic functions are all estimated with probability functions.
4. The cross-sensor object tracking method as claimed in claim 3, wherein a mixture Gaussian model approximates the appearance-difference probability function.
5. The cross-sensor object tracking method as claimed in claim 3, wherein, before the travel-time-difference probability function is updated, another mixture Gaussian model approximates the travel-time-difference probability function and is used to observe whether the data without central tendency have been removed.
6. The cross-sensor object tracking method as claimed in claim 3, wherein whether to remove the peripheral outlier data is decided according to whether the function value of the appearance-difference probability function is less than a predetermined value.
7. The cross-sensor object tracking method as claimed in claim 5, wherein, when the function value of the travel-time-difference probability function is less than another predetermined value, the data without central tendency are removed.
8. The cross-sensor object tracking method as claimed in claim 3, wherein the training sample space is represented by an n × n matrix, n being the total number of entry/exit points within the measurement ranges of the plurality of sensors, and each field (d, b) of the matrix stores the pairwise-correlated enter/leave events, representing, for each observed event of an object entering entry/exit point d, all events of leaving entry/exit point b within a preceding specific time period.
9. The cross-sensor object tracking method as claimed in claim 2, wherein the criterion for object tracking and correlation is positively correlated with the appearance similarity, the travel-time difference, and the sensor spatial correlation.
10. The cross-sensor object tracking method as claimed in claim 3, wherein whether the travel-time-difference probability function has converged is decided by whether the number of removed events is less than a predetermined number.
11. The cross-sensor object tracking method as claimed in claim 10, wherein the data of the final events are used to estimate the entry/exit-point correlation probability function.
12. A cross-sensor object tracking system, comprising:
a plurality of sensors, at least one entry/exit point being designated within the measurement range of each sensor;
a training-phase processing module that obtains, through each sensor of the plurality of sensors, a plurality of sensor measurement data as training samples and, for each enter event occurring at each entry/exit point of each sensor, first records all leave events within a preceding period of time in a training sample space;
a characteristic function estimating and updating module that, from the samples in the training sample space, estimates through automatic learning at least three object-related characteristic functions, including a function of the sensor spatial correlation, a function of the travel-time difference, and a function of the appearance similarity, the sensor spatial correlation being the correlation between different entry/exit points of the plurality of sensors; and
a detection-phase tracking module that uses the estimated at least three characteristic functions as the criteria for linking object tracks and establishing correlations,
wherein the automatic learning is a recursive learning strategy further comprising: (a) for each enter event occurring at each entry/exit point of each sensor, first recording all leave events within a preceding period of time in a training sample space; (b) from the samples in the training sample space, estimating an entry/exit-point correlation probability function, a travel-time-difference probability function, and an appearance-difference probability function; (c) inspecting the appearance-difference probability function and removing the data that are statistically peripheral outliers; (d) updating the travel-time-difference probability function and the appearance-difference probability function from the remaining data; and repeating steps (c) and (d) until the travel-time-difference probability function converges.
13. The cross-sensor object tracking system as claimed in claim 12, wherein the training-phase processing module has a memory, and the training sample space is configured in the memory.
14. The cross-sensor object tracking system as claimed in claim 13, wherein the training sample space is represented by an n × n matrix, n being the total number of entry/exit points within the measurement ranges of the plurality of sensors, and each field (d, b) of the matrix stores the pairwise-correlated enter/leave events, representing, for each observed event of an object entering entry/exit point d, all events of leaving entry/exit point b within a preceding specific time period.
15. The cross-sensor object tracking system as claimed in claim 13, wherein the plurality of sensors are arranged in a sensor network without common measurement ranges.
16. The cross-sensor object tracking system as claimed in claim 12, wherein the plurality of sensors are selected from sensing devices of at least one of the following kinds: color camera, black-and-white camera, thermal camera, infrared camera, microphone, ultrasonic sensor, laser rangefinder, or weighing scale.
CN 201010002105 2010-01-05 2010-01-05 Object tracking method and system across sensors Active CN102116621B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201010002105 CN102116621B (en) 2010-01-05 2010-01-05 Object tracking method and system across sensors


Publications (2)

Publication Number Publication Date
CN102116621A CN102116621A (en) 2011-07-06
CN102116621B true CN102116621B (en) 2013-01-02

Family

ID=44215495

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201010002105 Active CN102116621B (en) 2010-01-05 2010-01-05 Object tracking method and system across sensors

Country Status (1)

Country Link
CN (1) CN102116621B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3108207B1 (en) * 2014-02-17 2020-06-24 Oxford University Innovation Limited Determining the position of a mobile device in a geographical area

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002086831A2 (en) * 2001-04-19 2002-10-31 Honeywell International Inc. Method and apparatus for tracking with identification
US7394916B2 (en) * 2003-02-10 2008-07-01 Activeye, Inc. Linking tracked objects that undergo temporary occlusion
CN101520502A (en) * 2009-03-24 2009-09-02 中国航空无线电电子研究所 Method for tracking and positioning mobile node of wireless sensor network


Also Published As

Publication number Publication date
CN102116621A (en) 2011-07-06

Similar Documents

Publication Publication Date Title
TWI416068B (en) Object tracking method and apparatus for a non-overlapping-sensor network
Pavlidis et al. Urban surveillance systems: from the laboratory to the commercial world
CN110428522A (en) A kind of intelligent safety and defence system of wisdom new city
CN106355604A (en) Target image tracking method and system
CN114299417A (en) Multi-target tracking method based on radar-vision fusion
CN102892007B (en) Promote the synchronous method and system with obtaining tracking of color balance between multiple video camera
CN108447076B (en) Multi-target tracking method based on deep reinforcement learning
CN109359563B (en) Real-time lane occupation phenomenon detection method based on digital image processing
CN109389044B (en) Multi-scene crowd density estimation method based on convolutional network and multi-task learning
CN106251695A (en) Destination's parking stall intelligent recommendation system and method based on parking space state monitoring
CN101930611A (en) Multiple view face tracking
CN110874583A (en) Passenger flow statistics method and device, storage medium and electronic equipment
CN112241969A (en) Target detection tracking method and device based on traffic monitoring video and storage medium
Jain et al. Performance analysis of object detection and tracking algorithms for traffic surveillance applications using neural networks
CN105100718A (en) Intelligent video analysis method based on video abstraction
CN103971384B (en) Node cooperation target tracking method of wireless video sensor
CN102254394A (en) Antitheft monitoring method for poles and towers in power transmission line based on video difference analysis
CN102902960A (en) Leave-behind object detection method based on Gaussian modelling and target contour
CN102111530A (en) Device and method for movable object detection
CN106384348A (en) Monitor image anomaly detection method and device
Yang et al. Intelligent video analysis: A Pedestrian trajectory extraction method for the whole indoor space without blind areas
CN117456726A (en) Abnormal parking identification method based on artificial intelligence algorithm model
CN106447698A (en) Multi-pedestrian tracking method and system based on distance sensor
Ding et al. Mit-avt clustered driving scene dataset: Evaluating perception systems in real-world naturalistic driving scenarios
CN102116621B (en) Object tracking method and system across sensors

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant