CN102831378B - Method and system for detecting and tracking people - Google Patents


Info

Publication number
CN102831378B
CN102831378B (application CN201110159513.3A)
Authority
CN
China
Prior art keywords
people
depth image
connected domain
pixel
current detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201110159513.3A
Other languages
Chinese (zh)
Other versions
CN102831378A (en)
Inventor
范圣印
王鑫
王刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd filed Critical Ricoh Co Ltd
Priority to CN201110159513.3A priority Critical patent/CN102831378B/en
Publication of CN102831378A publication Critical patent/CN102831378A/en
Application granted granted Critical
Publication of CN102831378B publication Critical patent/CN102831378B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Landscapes

  • Image Analysis (AREA)

Abstract

A method for detecting and tracking people comprises: inputting multiple frames of depth images; performing connected-component analysis on the depth image of each frame to detect connected components; for the depth image of the current detection target frame, computing the difference of the current detection target frame based on the multiple frames of depth images, filtering the pixels of the current detection target frame based on its connected components, detecting, based on the differences of the filtered pixels within a connected component, whether that connected component corresponds to a person, and, if it does, storing the information of the connected component as a person entry in a list; for the current tracking target frame among the multiple frames, tracking the person entries in the list based on connected-component matching; checking the tracked person entries against predefined rules; and outputting the information of the checked person entries. A corresponding system for detecting and tracking people is also provided.

Description

Method and system for detecting and tracking people
Technical field
The present invention relates generally to a method and a system for detecting and tracking people.
Background technology
The detection and tracking of people is one of the foundations and core technologies of video surveillance, human-machine interfaces, digital cameras, computer games, and intelligent video conferencing systems, and has been studied and practiced for many years. However, achieving people detection and tracking with a high detection rate and a low false-detection rate in complex environments under limited computational resources remains a difficult problem for both academia and industry.
The detection and tracking of people essentially belongs to the field of image and video processing. It usually makes use of depth images, in which the value of each pixel represents the distance, at the time the depth image was captured, between the real-world object corresponding to that pixel and the camera.
Patent document US 20090183125 A1 proposes a three-dimensional user interactive interface. It defines an interactive surface, which is a region of interaction defined in space. A sequence of depth maps of this interaction region is captured over time, and the depth maps of the sequence contain at least a part of a person's body. The depth map sequence is processed to detect the direction and speed of movement of the body part as it passes through the interaction region, and the computed direction and speed are used to control a computer application. In this patent document, an interaction region must be defined in advance; the part of the person's body within the region is then determined based on depth connected-component analysis, and the direction and speed of the interaction point are estimated by computing the direction and speed of motion of the connected component in the region. The aim of this patent application is to detect and track gestures within the interaction region; it cannot detect and track objects outside the region.
Patent document US 20110019920 A1 proposes a method, a device, and a program for detecting objects. According to this patent document, a background model is first built, the difference between the current frame image and the background is computed, pixels with large differences are found, and these pixels are merged into objects. This patent application operates on RGB and grayscale images, and its detection results are not robust to shadows, lighting conditions, and illumination changes. The technique of this patent document also requires building a background model; when the environment is complex or changes frequently, an accurate background model is difficult to establish, and an inaccurate background model degrades the subsequent detection results.
Patent document US 20100215271 A1 proposes a method for human feature detection and human pose estimation based on the inner distance of the shape context. The shape contour of a person is segmented from the depth image, and sample points are taken along the contour edge. The inner-distance shape context (IDSC) is computed from the sample points, and human feature points are matched according to the IDSC descriptor. The technique of this patent document uses depth images and a shape contour model, and is not robust to the person's pose. Moreover, the IDSC it uses is a histogram over sample points, an extension of the original shape context; it requires sampling contour points, and such sample points are usually unstable.
Summary of the invention
The present invention is proposed in view of the above-mentioned defects of the prior art, and provides a method for detecting and tracking people and a system for detecting and tracking people.
According to one aspect of the present invention, a method for detecting and tracking people is provided, comprising: an input step of inputting multiple frames of depth images; an object detection step of performing connected-component analysis on the depth image of each frame of the multiple frames to detect connected components; a people detection step of, for the depth image of the current detection target frame among the multiple frames, computing the difference of the current detection target frame based on the multiple frames, filtering the pixels of the current detection target frame based on its connected components, detecting, based on the differences of the filtered pixels within a connected component, whether that connected component corresponds to a person, and, if it does, storing the information of the connected component as a person entry in a list; a people tracking step of, for the current tracking target frame among the multiple frames, tracking the person entries in the list based on connected-component matching; a people checking step of checking the tracked person entries against predefined rules; and an output step of outputting the information of the checked person entries.
According to another aspect of the present invention, a system for detecting and tracking people is provided, comprising: an input device for inputting multiple frames of depth images; an object detection device for performing connected-component analysis on the depth image of each frame of the multiple frames to detect connected components; a people detection device for, with respect to the depth image of the current detection target frame among the multiple frames, computing the difference of the current detection target frame based on the multiple frames, filtering the pixels of the current detection target frame based on its connected components, detecting, based on the differences of the filtered pixels within a connected component, whether that connected component corresponds to a person, and, if it does, storing the information of the connected component as a person entry in a list; a people tracking device for tracking, for the current tracking target frame among the multiple frames, the person entries in the list based on connected-component matching; a people checking device for checking the tracked person entries against predefined rules; and an output device for outputting the information of the checked person entries.
The method and system for detecting and tracking people proposed by the present invention use depth images and perform detection and tracking based on connected components and frame differences, and are therefore highly robust to the person's pose, pose changes, and speed of motion.
Accompanying drawing explanation
Fig. 1 comprises Fig. 1A and Fig. 1B, which illustrate schematic images used by the method and system for detecting and tracking people according to an embodiment of the present invention, where Fig. 1A is an example of an original image and Fig. 1B is the depth image corresponding to the original image shown in Fig. 1A.
Fig. 2 illustrates the overall flow chart of the method for detecting and tracking people according to an embodiment of the present invention.
Fig. 3 illustrates the flow chart of the people detection step according to an embodiment of the present invention.
Fig. 4 illustrates the effect of applying the difference processing of an embodiment of the present invention to a depth image.
Fig. 5 illustrates the flow chart of detecting person objects based on filtering and pixel differences according to an embodiment of the present invention.
Fig. 6A schematically shows an example of the effect of the filtering processing according to an embodiment of the present invention.
Fig. 6B schematically shows the image of Fig. 6A after the pixels that have become farther are removed.
Fig. 7 illustrates the flow chart of the people tracking step according to an embodiment of the present invention.
Fig. 8 illustrates the flow chart of judging whether a connected component matches a person entry according to an embodiment of the present invention.
Fig. 9 illustrates the flow chart of processing person entries that were not tracked according to an embodiment of the present invention.
Fig. 10 illustrates the overall block diagram of the system for detecting and tracking people according to an embodiment of the present invention.
Fig. 11 illustrates an example of the results output by implementing the method and system for detecting and tracking people according to an embodiment of the present invention.
Embodiments
In order to enable those skilled in the art to better understand the present invention, the present invention is described in further detail below through embodiments, with reference to the accompanying drawings.
Fig. 1 comprises Fig. 1A and Fig. 1B, which illustrate schematic images used by the method and system for detecting and tracking people according to an embodiment of the present invention, where Fig. 1A is an example of an original image and Fig. 1B is the depth image corresponding to the original image shown in Fig. 1A. Fig. 1B schematically illustrates the input of the method and system for detecting and tracking people of the embodiments of the present invention.
The method and system for detecting and tracking people of the embodiments of the present invention realize the detection and tracking of people based on the processing of depth images. The depth images are obtained from a device that supports depth image output. Such a device may obtain depth images based on infrared or laser sensing, or through a binocular (stereo) mechanism.
The embodiments of the present invention may employ a device that can obtain depth images and RGB/grayscale images simultaneously. For example, Fig. 1A shows a schematic grayscale rendering of the original RGB image, and Fig. 1B shows the corresponding depth image. In a depth image, the pixel value (gray value) of each pixel represents the distance from the actual photographed object corresponding to that pixel to the camera center; a larger gray value indicates a greater distance, and a smaller value a closer one. Considering the precision limits of depth imaging devices, points or surfaces whose distance cannot be measured may be represented by the maximum distance value. The image shown in Fig. 1A is provided only as an example and is not required for implementing the embodiments of the present invention; the method and system of the embodiments only need a depth image such as that shown in Fig. 1B to detect and track people.
Fig. 2 illustrates the overall flow chart of the method for detecting and tracking people according to an embodiment of the present invention. As shown in Fig. 2, the method comprises: an input step S11, in which multiple frames of depth images may be input; an object detection step S12, in which connected-component analysis may be performed on the depth image of each frame of the multiple frames to detect connected components; a people detection step S13, in which, for the depth image of the current detection target frame among the multiple frames, the difference of the current detection target frame may be computed based on the multiple frames, the pixels of the current detection target frame may be filtered based on its connected components, whether a connected component corresponds to a person may be detected based on the differences of the filtered pixels within the connected component, and, if it does, the information of the connected component may be stored as a person entry in a list; a people tracking step S14, in which, for the current tracking target frame among the multiple frames, the person entries in the list may be tracked based on connected-component matching; a people checking step S15, in which the tracked person entries may be checked against predefined rules; and an output step S16, in which the information of the checked person entries may be output.
The depth images input at input step S11 may be obtained from any device capable of outputting depth images, for example a TOF (time-of-flight) camera, a structured-light camera, or a binocular camera. Any such device can obtain depth images of multiple frames from video; these multiple frames of depth images, imported through input step S11, are the processing object of the method for detecting and tracking people of the embodiment of the present invention.
In object detection step S12, objects may be detected based on connected-component analysis of the depth image. Connected-component analysis is a mature technique that is usually applied to binary, grayscale, or RGB images. Connected-component analysis based on a depth image simply applies the analysis to the depth image, similarly to how it is applied to a grayscale image. By performing connected-component analysis on the depth image of each frame of the multiple frames, the connected components of each frame can be obtained; the depth image of a frame may contain one, several, or no (zero) connected components.
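The connected-component analysis of step S12 can be sketched as a 4-connected flood fill over the depth image. This is a minimal pure-Python illustration, not the patent's implementation: the merging criterion (neighboring pixels join one component when their depths differ by at most `max_gap`) and the treatment of maximum-distance pixels as background are assumptions, since the document does not specify them.

```python
from collections import deque

def connected_components(depth, max_gap=200):
    """Label 4-connected regions of a depth image (list of lists, mm).

    Neighbors join the same component when their depth values differ
    by at most max_gap (assumed criterion). Pixels at the maximum
    distance value are treated as unmeasurable background and get
    label -1. Returns (label map, number of components).
    """
    h, w = len(depth), len(depth[0])
    background = max(max(row) for row in depth)  # "cannot measure" value
    labels = [[-1] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx] != -1 or depth[sy][sx] == background:
                continue
            # breadth-first flood fill of one component
            queue = deque([(sy, sx)])
            labels[sy][sx] = next_label
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny][nx] == -1
                            and depth[ny][nx] != background
                            and abs(depth[ny][nx] - depth[y][x]) <= max_gap):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels, next_label
```

In practice a library routine such as `scipy.ndimage.label` or OpenCV's `connectedComponentsWithStats` would normally replace this hand-written fill.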
For the purpose of the following explanation, it may first be defined here that, among the multiple frames of depth images, the frame currently processed by people detection step S13 is the current detection target frame, and the frame currently processed by people tracking step S14 is the current tracking target frame. Those skilled in the art will understand that for both frames, "current" indicates a dynamic concept: as the frames of the multiple depth images are processed in turn, the current frame keeps changing. However, people detection step S13 and people tracking step S14 can be separated, i.e., they can independently process different frames of the multiple depth images; each may process the depth images frame by frame or skip a certain number of frames, and the numbers of frames skipped by the two steps are independent of each other. That is, the frame processed by people detection step S13 and the frame processed by people tracking step S14 may be the same or different; in one implementation of the method of the embodiment of the present invention, the current detection target frame and the current tracking target frame may be the same frame or different frames. Those skilled in the art will also understand that, because people tracking step S14 follows people detection step S13, in one implementation of the method the current tracking target frame is at least no earlier than the current detection target frame.
People detection step S13 is described in detail below. Fig. 3 illustrates the flow chart of people detection step S13 according to an embodiment of the present invention. As shown in Fig. 3, in people detection step S13, first, in step S131, the difference of the depth image of the current detection target frame is computed based on the multiple frames of depth images. Step S131 may comprise: computing the difference of the current detection target frame by subtracting, from the depth value of each pixel of the current detection target frame, the depth value of the corresponding pixel of a previous detection target frame that is a predetermined number of frames earlier.
When computing the difference, the number of frames between the current detection target frame and the previous detection target frame, i.e., the number of frames skipped during detection, needs to be determined. Different numbers of skipped frames mean different amounts of accumulated difference for the current frame. For example, for an object moving in one fixed direction, the more frames are skipped, the larger the accumulated difference; for an object moving back and forth, skipping more frames does not necessarily yield a larger accumulated difference. The number of skipped frames can be found experimentally and determined according to the size of the object to be detected. For example, for a large moving object such as a person, the number of skipped frames may be set to 3; for a small moving object such as a hand, it may be set to 0. Here, the object detected by the people detection step is a person, so the number of skipped frames may be, for example, 3. Obviously, the number of frames between the current detection target frame and the previous detection target frame is not limited to 3; other numbers such as 2 or 5 are also possible, and 0 may also be chosen, i.e., no frame is skipped and detection is performed frame by frame.
Subsequently, in step S132, the pixels of the depth image of the current detection target frame can be divided into three classes according to the sign of the difference: pixels whose distance is unchanged, pixels that have become nearer, and pixels that have become farther. The difference of the current detection target frame computed from the multiple frames of depth images may be negative (less than 0), positive (greater than 0), or equal to 0. A pixel with a negative difference has become nearer relative to the pixel at the same image position in the previously detected detection target frame, and is a "nearer" pixel. A pixel with a positive difference has become farther relative to the pixel at the same position in the previously detected detection target frame, and is a "farther" pixel. A pixel with a difference of 0 is a pixel whose distance is unchanged between the two detection target frames.
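Steps S131 and S132 can be sketched as follows, with frames represented as lists of lists of depth values in millimeters. The indexing of the previous detection target frame (`frames[t - skip - 1]` for `skip` skipped frames) is an assumption about how the interval is counted, and the function names are illustrative.

```python
def depth_difference(frames, t, skip=3):
    """Step S131: pixel-wise difference of detection target frame t
    against the previous detection target frame. With `skip` frames
    between the two, the previous frame is taken here as
    frames[t - skip - 1] (assumed interval convention). A negative
    difference means the pixel has become nearer."""
    prev, cur = frames[t - skip - 1], frames[t]
    return [[cur[y][x] - prev[y][x] for x in range(len(cur[0]))]
            for y in range(len(cur))]

def classify_pixel(diff_value):
    """Step S132: classify a pixel by the sign of its difference."""
    if diff_value < 0:
        return "nearer"
    if diff_value > 0:
        return "farther"
    return "unchanged"
```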
Fig. 4 illustrates the effect of applying the difference processing of the embodiment of the present invention to a depth image, in which three different colors indicate that the pixels of the depth image of the current detection target frame have been divided into three classes according to their differences: the parts shown in black are pixels that have become nearer, the parts shown in gray are pixels that have become farther, and the remaining white parts are pixels whose distance is unchanged.
In step S133, person objects are detected based on filtering and pixel differences. Specifically, the pixels of the depth image of the current detection target frame may be filtered based on its connected components; whether a connected component corresponds to a person may be detected based on the differences of the filtered pixels within the connected component; and if it does, the information of the connected component is stored as the information of a person entry in list 134. List 134 may be, for example, a database of person entry information, used to store the information of the people involved in the detection and tracking process of the embodiment of the present invention.
Fig. 5 illustrates the flow chart of step S133 of detecting person objects based on filtering and pixel differences according to an embodiment of the present invention.
First, in step S1331, the difference of each pixel of the depth image of the current detection target frame is extracted. The difference value of each pixel of this image was computed earlier in step S131, so in step S1331 the difference data of each pixel are simply retrieved.
In step S1332, the connected components of the depth image of the current detection target frame are enumerated in turn. Because the connected components in the depth image of each frame were detected earlier in step S12, they are retrieved here for the depth image of the current detection target frame. A connected component detected in step S12 is a detected object, which may be a person or an object of another type.
In step S1333, the pixels of the depth image of the current detection target frame are filtered based on its connected components, which may specifically comprise: judging whether a pixel of the depth image of the current detection target frame belongs to a connected component, and extracting the differences of the pixels that belong to the connected component. That is, if a pixel belongs to a connected component, its difference is retained; if it does not, the pixel does not enter the subsequent processing.
Here, in order to obtain a better technical effect, a step S1334 of filtering pixels based on a threshold may be employed to remove noise. Specifically, in step S1334, it is judged whether the absolute value of the difference of a pixel of the connected component is less than a predetermined threshold; if it is, the difference of the pixel is set to 0. The threshold can be determined experimentally and may be 150 mm. Those skilled in the art will understand that 150 mm is only an example; the threshold that may be adopted here is not limited thereto, and other values such as 50 mm, 100 mm, 120 mm, or 175 mm may also be used in the processing of step S1334.
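The two filters of steps S1333 and S1334 can be sketched together, assuming a per-pixel label map as produced by the connected-component analysis of step S12. The representation of excluded pixels as `None` and the function name are illustrative choices, not from the document.

```python
def filter_differences(diff, labels, component, threshold=150):
    """Steps S1333/S1334: keep only the differences of pixels inside
    the given connected component, and zero out differences whose
    absolute value is below the threshold (150 mm in the text) to
    remove noise. Pixels outside the component are returned as None,
    i.e. excluded from subsequent processing."""
    h, w = len(diff), len(diff[0])
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if labels[y][x] == component:      # S1333: inside the component?
                d = diff[y][x]
                out[y][x] = 0 if abs(d) < threshold else d  # S1334: threshold
    return out
```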
The difference of the depth image shown in Fig. 4 after the above filtering processing is shown in Fig. 6A. Fig. 6A schematically shows an example of the effect of the filtering processing according to an embodiment of the present invention. The image shown in Fig. 4, after the processing of steps S1331, S1332, S1333, and S1334, i.e., after the differences have been filtered based on the connected components and the predetermined threshold, yields the schematic diagram shown in Fig. 6A. In Fig. 6A, three connected-component objects are shown (object-1, object-2, object-3); most of the noise has been removed, and only some gray noise points (pixels that have become farther) remain at the edges of the connected-component objects.
In the following steps S1335, S1336, and S1337, whether a connected component corresponds to a person is detected based on the differences of the filtered pixels within the connected component of the current detection target frame, which may specifically comprise: counting the total number of pixels belonging to the connected component and the number of pixels belonging to the connected component whose difference is negative, computing the ratio of the number of negative-difference pixels to the total number of pixels, and judging according to this ratio whether the connected component corresponds to a person.
Specifically, in step S1335, within the connected component currently processed in the depth image of the current detection target frame, the pixels that have become nearer are retained, and the number of such pixels is counted. In most cases, such as when a foreground object moves left, right, up, down, or forward, or in combinations of these (excluding the foreground object moving away in the depth direction), the movement of the foreground object causes the distance values of the pixels of the foreground object in the current detection target frame to decrease relative to their history values, i.e., the difference values are negative.
Fig. 6B schematically shows the image of Fig. 6A after the pixels that have become farther are removed. As can be seen from the figure, the three connected-component objects (object-1, object-2, object-3) are still shown, and after the processing of step S1335 all the gray pixels have been removed. The retained nearer pixels provide higher stability and better reflect the movement of the objects.
In step S1336, the number of pixels in the current connected component whose difference value is negative after the connected-component filtering is counted, i.e., the number of pixels in the connected component that have become nearer is obtained. Then formula (1) is used to compute the ratio of the number of nearer pixels in the connected component to the total number of pixels the connected component contains:

move_ratio = number_nearer_difference_pixels / number_CCA_pixels    (1)

where number_nearer_difference_pixels is the number of nearer pixels in the connected component concerned, number_CCA_pixels is the total number of pixels the connected component contains, and move_ratio is the computed ratio.
In step S1337, whether the connected component corresponds to a person is judged according to the size of the ratio. If the ratio is greater than a certain predetermined threshold, the connected component is a detected person object. This predetermined threshold is determined experimentally and may be, for example, 0.05; obviously, 0.05 is only an example, and those skilled in the art will understand that the threshold may also take other values such as 0.01 or 0.1.
For example, formula (2) may be used to judge whether the connected component corresponds to a person:

CCA_type = human         if move_ratio >= move_threshold
CCA_type = other_object  if move_ratio <  move_threshold    (2)

where CCA_type represents the judgment result, human means the result is a person object, other_object means the result is an object of another type, move_ratio is the ratio computed by formula (1), and move_threshold is the predetermined threshold used for the judgment here.
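Under the same illustrative representation (component pixels carry a difference value, excluded pixels are `None`), steps S1335 to S1337, counting the nearer pixels, computing move_ratio by formula (1), and classifying by formula (2), can be sketched as:

```python
def classify_component(filtered_diff, move_threshold=0.05):
    """Steps S1335-S1337: count the pixels of one connected component
    (entries that are not None) and those whose difference is negative
    ("nearer" pixels), compute move_ratio by formula (1), and classify
    the component by formula (2)."""
    number_cca_pixels = 0
    number_nearer = 0
    for row in filtered_diff:
        for d in row:
            if d is None:            # pixel outside the component
                continue
            number_cca_pixels += 1
            if d < 0:                # distance became nearer
                number_nearer += 1
    move_ratio = number_nearer / number_cca_pixels   # formula (1)
    return ("human" if move_ratio >= move_threshold else "other_object",
            move_ratio)                              # formula (2)
```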
If a connected component is judged to correspond to a person, then in step S1338 the information of the connected component (i.e., the information of the person object) is added to list 134 as the information of a person entry. Direct addition may be adopted here, although this may cause redundancy in the list; however, the redundancy can be removed effectively by the discard mechanism of the tracking phase. In complex environments, or when a person exhibits various poses and motions, detection accuracy may decrease, tracking may lose targets, and CCA (connected-component analysis) may miss detections; a good detection and tracking effect can therefore be ensured by balancing the redundancy of detection against the discard mechanism of tracking.
After all the connected components in the depth image of the current detection target frame have been processed, the processing of people detection step S13 ends.
Then, in step S14, people are tracked based on connected-component matching.
Fig. 7 illustrates the flow chart of people tracking step S14 according to an embodiment of the present invention.
First, in step S141, the list of connected components of the depth image of the current tracking target frame is extracted. Because the connected components of each depth image were detected in object detection step S12, they are simply extracted here.
Then, in step S142, the list of person entries is extracted from list 134, where the list of person entries has at least been updated by the latest people detection step S13.
In step S143, the matched person is looked up by computing the matching degree between connected components and person entries, and the matched person is updated. Specifically, for the current tracking target frame among the multiple frames of depth images, the matching degree between a connected component in the current tracking target frame and a person entry in the list may be computed based on the two-dimensional rectangle, the average depth, and the number of pixels; if the matching degree meets a predetermined standard, the connected component is judged to match the person entry, the person entry is considered tracked, and the information of the person entry in the list is updated with the information of the connected component. Fig. 8 illustrates the flow chart of judging whether a connected component matches a person entry according to an embodiment of the present invention.
First, in step S1431, each connected component is enumerated in turn. A connected component is a detected object, which may be a person object or an object of another type.
Then, in step S1432, each person entry in the list is enumerated in turn.
Then, in step S1433, the matching degree of the information of connected domain and people's entry is calculated based on 2D (two dimension) rectangle, mean depth and/or number of pixels etc., wherein 2D rectangle is the boundary rectangle of connected domain, mean depth is the depth-averaged value of this connected domain interior pixels, and number of pixels is the number of the pixel of this connected domain inside.Can be such as that if the Duplication of both 2D rectangles, both rate of change of number of pixels of both Distance geometry of mean depth all meet respective predetermined threshold, then this people's entry is mated with this connected domain.
The overlap rate of the 2D rectangles of a connected domain and a person entry can be computed by formula (3):
2D_overlap_rate = overlap_area / Min(CCA_2D_area, Human_2D_area)    (3)
where CCA_2D_area is the area of the 2D rectangle of the connected domain, Human_2D_area is the area of the 2D rectangle of the person object corresponding to the person entry information, Min() is the minimum operation, overlap_area is the overlapping area of the rectangle of the person object and the rectangle of the connected domain, and 2D_overlap_rate is the computed overlap rate of the 2D rectangles of the connected domain and the person entry.
The distance between the mean depths of a connected domain and a person entry can be computed by formula (4):
Dist=abs(CCA_averageDepth-Human_averageDepth) (4)
where CCA_averageDepth is the mean depth of the connected domain, Human_averageDepth is the mean depth of the person object corresponding to the person entry, abs() is the absolute-value operation, and Dist is the computed distance between the mean depths of the connected domain and the person entry.
The change rate of the pixel numbers of a connected domain and a person entry can be computed by formula (5):
ChangeRate = abs(CCA_pixel_number - Human_pixel_number) / Human_pixel_number    (5)
where CCA_pixel_number is the number of pixels contained in the connected domain, Human_pixel_number is the number of pixels of the person object of the person entry, abs() is the absolute-value operation, and ChangeRate is the computed change rate of the pixel numbers of the connected domain and the person entry.
Whether the person entry matches the connected domain can be judged by formula (6): if the overlap rate of the two 2D rectangles, the distance between the two mean depths and the change rate of the two pixel numbers all meet their respective thresholds, the person entry matches the connected domain:
bMatch = (2D_overlap_rate > 2D_threshold) ∩ (Dist < Dist_threshold) ∩ (ChangeRate < Change_threshold)    (6)
where 2D_threshold is the threshold set for the overlap rate of the 2D rectangles, Dist_threshold is the threshold for the distance between mean depths, and Change_threshold is the threshold for the change rate of pixel numbers. These three thresholds can be determined by experiment. For example, 2D_threshold can be 0.5, Dist_threshold can be 200 mm, and Change_threshold can be 1; those skilled in the art will understand that these values are only examples, and other values can also be applied to the embodiments of the present invention.
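For illustration only, the matching test of formulas (3) to (6) can be sketched as follows. The record layout (dicts with keys 'rect', 'average_depth', 'pixel_number') and the helper names are assumptions of this sketch, not part of the disclosed method; the default thresholds are the example values given above.

```python
# Illustrative sketch of the matching test of formulas (3) to (6).
# The dict layout ('rect', 'average_depth', 'pixel_number') and the
# helper names are assumptions of this sketch, not part of the patent.

def rect_overlap_area(a, b):
    """Overlap area of two axis-aligned rectangles given as (x, y, w, h)."""
    w = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    h = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)

def matches(cca, human,
            overlap_threshold=0.5,   # example value of 2D_threshold
            dist_threshold=200.0,    # example value of Dist_threshold, in mm
            change_threshold=1.0):   # example value of Change_threshold
    """Return True if the connected domain matches the person entry (formula (6))."""
    cca_area = cca['rect'][2] * cca['rect'][3]
    human_area = human['rect'][2] * human['rect'][3]
    # Formula (3): overlap rate of the two 2D rectangles
    overlap_rate = rect_overlap_area(cca['rect'], human['rect']) / min(cca_area, human_area)
    # Formula (4): distance between the two mean depths
    dist = abs(cca['average_depth'] - human['average_depth'])
    # Formula (5): change rate of the two pixel numbers
    change_rate = abs(cca['pixel_number'] - human['pixel_number']) / human['pixel_number']
    # Formula (6): all three criteria must hold simultaneously
    return (overlap_rate > overlap_threshold
            and dist < dist_threshold
            and change_rate < change_threshold)
```

In use, such a test would be evaluated for every (connected domain, person entry) pair enumerated in steps S1431 and S1432.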
In step S1434, for a matched person entry, its information is updated with the information of the matched connected domain. The updated information can include the 2D rectangle, the mean depth and the number of pixels. After the update, the information of the person entry is written back to list 134.
In step S1435, the state information of the person entry is updated in list 134. For example, the accumulation counter "lost in short term" of the person entry is set to zero, the accumulation counter "lost" of the person entry is set to zero, the lost-state flag bLost is set to false (F), and the tracking-matched flag bTrackingMatched is set to true (T). After the state information is updated, the information of the person entry is written back to list 134.
Then, in step S144, the lost state of each person is computed based on a combined short-term and long-term loss mechanism, and the loss and removal of persons are controlled. Specifically, if the consecutive number of times that a person entry in the list fails to obtain a match is less than a predetermined first threshold, the person entry is judged to be in the short-term lost state; if this number is greater than or equal to the predetermined first threshold, the person entry is judged to be in the lost state; and if this number is greater than a predetermined second threshold, the person entry is removed from the list.
Fig. 9 illustrates a flowchart of processing the person entries that have not been tracked, according to an embodiment of the present invention.
First, in step S1441, the unmatched person entries are enumerated in turn.
Then, in step S1442, the counter "lost in short term" is accumulated for the specified unmatched person entry. If a person entry is not matched in the current tracking object frame, its counter "lost in short term" is set to 1 and the person entry is in the short-term lost state. If the person entry is again unmatched in the next frame, the counter "lost in short term" is incremented by 1, becoming 2, and so on.
In step S1443, whether the person is in the lost state is judged according to the value of this counter: if the value is greater than or equal to a given first threshold, the unmatched person entry is set to the lost state. The first threshold is obtained by experiment and can be, for example, 5; obviously, 5 is only an example.
In step S1444, the counter "lost" is accumulated for an unmatched person entry that is in the lost state. If a person entry is unmatched in the current tracking object frame and is in the lost state, its counter "lost" is set to 1. If in the next frame the person entry is still unmatched and still in the lost state, the counter "lost" is incremented by 1, becoming 2, and so on.
In step S1445, whether the person entry is in the long-term lost state is judged according to the value of the counter "lost": if the value is greater than a given second threshold, the unmatched person entry is set to the removal state. The second threshold is obtained by experiment and can be, for example, 30; obviously, 30 is only an example.
In step S1446, a person in the removal state is removed from the person entries of list 134.
In step S1447, the information of the unmatched person entries is updated. This information includes, for example, the counter "lost in short term", the lost state, the counter "lost" and the removal state.
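The loss bookkeeping of steps S1441 to S1447 can be sketched as follows, using the example thresholds 5 and 30 given above; the dict keys are assumed names for the counters and flags kept per person entry in list 134.

```python
# Sketch of the loss bookkeeping of steps S1441 to S1447, using the example
# thresholds 5 and 30 from the description; the dict keys are assumed names
# for the counters and flags kept per person entry in list 134.

def update_unmatched(entry, first_threshold=5, second_threshold=30):
    """Update one unmatched person entry; return True if it should be removed."""
    entry['short_lost'] += 1                                  # step S1442
    entry['b_lost'] = entry['short_lost'] >= first_threshold  # step S1443
    if entry['b_lost']:                                       # steps S1444-S1445
        entry['lost'] += 1
    return entry['lost'] > second_threshold                   # step S1446

def reset_on_match(entry):
    """Step S1435: a matched entry resets both counters and the lost flag."""
    entry['short_lost'] = 0
    entry['lost'] = 0
    entry['b_lost'] = False
```

With these example thresholds, an entry survives about 35 consecutive unmatched frames (5 in the short-term lost state, then 30 in the lost state) before it is removed.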
In step S145, the tracked persons can be output; that is, the information of the matched person entries, or of the person entries that are unmatched but in the short-term lost state, is output.
In the person check step S15, the person entries are checked based on predefined rules to eliminate false detections. The rules that can be adopted include, for example: the ratio of the height to the width of the person object corresponding to a person entry should meet a certain threshold; within a certain given distance, the change of the number of pixels of a person entry should be within a limited range; and within a certain given distance, the number of pixels of a person entry should be greater than a certain given threshold; etc. All of the above rules can be used, or a subset of them can be used in combination.
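One possible reading of these check rules is sketched below. The concrete threshold values and the depth-dependent pixel-count table are invented for illustration; the patent leaves the numbers open.

```python
# One possible reading of the check rules of step S15. The concrete
# threshold values and the depth-dependent pixel-count table are invented
# for illustration; the patent leaves the numbers open.

def verify_person(height, width, pixel_number, depth,
                  min_aspect=1.0, max_aspect=4.0,
                  min_pixels_by_depth=((1000, 5000), (3000, 1500), (6000, 300))):
    """Return True if a person entry passes the false-detection checks."""
    # Rule: the height/width ratio of the person object meets a threshold
    if not (min_aspect <= height / width <= max_aspect):
        return False
    # Rule: within a given distance, the pixel count exceeds a given threshold
    for max_depth_mm, min_pixels in min_pixels_by_depth:
        if depth <= max_depth_mm:
            return pixel_number >= min_pixels
    return True
```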
In the output step S16, the results for the detected and tracked persons are output. These results can be output to memory, a database, a hard disk or other devices.
The present invention can also be embodied as a person detection and tracking system that can perform the aforementioned person detection and tracking method.
Figure 10 illustrates the general block diagram of the person detection and tracking system according to an embodiment of the present invention. As shown in Figure 10, the person detection and tracking system may comprise: an input device 11, which can perform the aforementioned input step S11 and is for inputting a multi-frame depth image; an object detection device 12, which can perform the aforementioned object detection step S12 and is for performing connected domain analysis on the depth image of each frame of the multi-frame depth image to detect connected domains; a person detection device 13, which can perform the aforementioned person detection step S13 and is for, with respect to the depth image of the current detection object frame in the multi-frame depth image, computing the difference of the current detection object frame depth image based on the multi-frame depth image, filtering the pixels of the current detection object frame depth image based on the connected domains of the current detection object frame depth image, detecting whether a connected domain relates to a person object based on the differences of the filtered pixels within the connected domain, and, if the connected domain relates to a person object, taking the information of the connected domain as the information of a person entry in a list; a person tracking device 14, which can perform the aforementioned person tracking step S14 and is for tracking the person entries in the list based on connected domain matching for the current tracking object frame in the multi-frame depth image; a person check device 15, which can perform the aforementioned person check step S15 and is for checking the tracked person entries based on predefined rules; and an output device 16, which can perform the aforementioned output step S16 and is for outputting the information of the checked person entries.
In the multi-frame depth image, the frame currently processed by the person detection device 13 is the current detection object frame, and the frame currently processed by the person tracking device 14 is the current tracking object frame.
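A purely illustrative layout for one person entry in list 134, gathering the fields the description updates (the 2D rectangle, the mean depth, the pixel number, the two loss counters and the state flags), could look as follows; the field names are assumptions.

```python
# A purely illustrative layout for one person entry in list 134, gathering
# the fields the description updates; the field names are assumptions.

from dataclasses import dataclass

@dataclass
class PersonEntry:
    rect: tuple            # bounding 2D rectangle (x, y, w, h)
    average_depth: float   # mean depth of the pixels, e.g. in mm
    pixel_number: int      # number of pixels of the connected domain
    short_lost: int = 0    # accumulation counter "lost in short term"
    lost: int = 0          # accumulation counter "lost"
    b_lost: bool = False   # lost-state flag bLost
    b_tracking_matched: bool = False  # flag bTrackingMatched
```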
The person detection device 13 can perform the aforementioned step S131: computing the difference of the current detection object frame depth image by subtracting, from the depth value of each pixel of the current detection object frame depth image, the depth value of the corresponding pixel of a previous detection object frame depth image that precedes the current detection object frame by a predetermined number of frames.
The person detection device 13 can perform the aforementioned step S1333: judging whether a pixel of the current detection object frame depth image belongs to a connected domain, and extracting the differences of the pixels belonging to that connected domain.
The person detection device 13 can perform the aforementioned steps S1335 to S1337: counting the total number of pixels belonging to the connected domain and the number of pixels of the connected domain whose difference is negative, computing the ratio of the number of negative-difference pixels to the total number of pixels, and judging according to this ratio whether the connected domain relates to a person object.
The person detection device 13 can perform the aforementioned step S1334: judging whether the absolute value of the difference of a pixel of the connected domain is less than a predetermined difference threshold, and, if so, setting the difference of that pixel to 0.
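The per-connected-domain person test of steps S131 and S1333 to S1337 can be sketched as follows. The noise threshold and the ratio threshold are assumed values (the specification determines such thresholds by experiment); a negative difference means the pixel's distance has become nearer, as in the three-class division of the claims.

```python
# Sketch of the per-connected-domain person test of steps S131 and
# S1333 to S1337. The noise threshold and the ratio threshold are assumed
# values (the specification determines such thresholds by experiment).

import numpy as np

def is_person_domain(curr_depth, prev_depth, mask,
                     noise_threshold=50.0, ratio_threshold=0.3):
    """curr_depth/prev_depth: depth images a predetermined number of frames
    apart; mask: boolean array selecting the pixels of one connected domain."""
    diff = curr_depth.astype(np.float64) - prev_depth   # step S131: frame difference
    diff = diff[mask]                                   # step S1333: keep domain pixels
    diff[np.abs(diff) < noise_threshold] = 0.0          # step S1334: suppress small noise
    total = diff.size                                   # step S1335: total pixel count
    negative = int(np.count_nonzero(diff < 0))          # pixels whose distance became nearer
    ratio = negative / total if total else 0.0          # step S1336: negative-pixel ratio
    return ratio > ratio_threshold                      # step S1337: judge person object
```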
The person tracking device 14 can perform the aforementioned step S143: for the current tracking object frame in the multi-frame depth image, computing the matching degree between a connected domain in that frame and a person entry in the list based on the two-dimensional rectangle, the mean depth and the number of pixels; if the matching degree meets a predetermined standard, judging that the connected domain matches the person entry, regarding the person entry as tracked, and updating the information of the person entry in the list with the information of the connected domain.
The person tracking device 14 can also perform the aforementioned step S144: if the consecutive number of times that a person entry in the list fails to obtain a match is less than a predetermined first threshold, judging that the person entry is in the short-term lost state; if this number is greater than or equal to the predetermined first threshold, judging that the person entry is in the lost state; and if this number is greater than a predetermined second threshold, removing the person entry from the list.
The person tracking device 14 can also perform the aforementioned step S145: outputting the information of the person entries in the short-term lost state.
Figure 11 illustrates an example of the results output by the person detection and tracking method and system according to the embodiments of the present invention. The output results can be, for example, those of the aforementioned output step S16 and those produced by the output device 16.
The results of person detection and tracking performed on the depth image shown in the grayscale map of Figure 1B, which corresponds to the RGB image shown in Figure 1A, are identified in Figure 11. In Figure 11, white rectangular boxes identify the person objects finally detected and tracked through the processing of the embodiments of the present invention.
The series of operations described in this specification can be performed by hardware, software, or a combination of hardware and software. When the series of operations is performed by software, the computer program can be installed into memory built into a computer equipped with dedicated hardware, so that the computer executes the program. Alternatively, the computer program can be installed into a general-purpose computer capable of performing various types of processing, so that the computer executes the program.
For example, the computer program can be stored in advance in a hard disk or a ROM (read-only memory) serving as a recording medium. Alternatively, the computer program can be stored (recorded) temporarily or permanently in a removable recording medium, such as a floppy disk, a CD-ROM (compact disc read-only memory), an MO (magneto-optical) disk, a DVD (digital versatile disc), a magnetic disk or a semiconductor memory. Such a removable recording medium can be provided as packaged software.
The present invention has been described in detail with reference to specific embodiments. However, it is clear that those skilled in the art can make modifications and substitutions to the embodiments without departing from the spirit of the present invention. In other words, the present invention has been disclosed by way of illustration and is not to be construed restrictively. To determine the gist of the present invention, the appended claims should be considered.

Claims (9)

1. A person detection and tracking method, comprising:
an input step of inputting a multi-frame depth image;
an object detection step of performing connected domain analysis on the depth image of each frame of the multi-frame depth image to detect connected domains;
a person detection step of, for the depth image of a current detection object frame in the multi-frame depth image, computing the difference of the current detection object frame depth image based on the multi-frame depth image, filtering the pixels of the current detection object frame depth image based on the connected domains of the current detection object frame depth image, detecting whether a connected domain relates to a person object based on the differences of the filtered pixels within the connected domain, and, if the connected domain relates to a person object, taking the information of the connected domain as the information of a person entry in a list;
a person tracking step of, for a current tracking object frame in the multi-frame depth image, tracking the person entries in the list based on connected domain matching;
a person check step of checking the tracked person entries based on predefined rules; and
an output step of outputting the information of the checked person entries,
wherein, in the person detection step, the computing of the difference of the current detection object frame depth image based on the multi-frame depth image comprises:
computing the difference of the current detection object frame depth image by subtracting, from the depth value of each pixel of the current detection object frame depth image, the depth value of the corresponding pixel of a previous detection object frame depth image that precedes the current detection object frame by a predetermined number of frames; and
dividing the pixels in the depth image of the current detection object frame into three classes according to the sign of the computed difference: pixels whose distance is unchanged, pixels whose distance has become nearer, and pixels whose distance has become farther.
2. The person detection and tracking method according to claim 1, wherein, in the multi-frame depth image, the frame currently processed by the person detection step is the current detection object frame, and the frame currently processed by the person tracking step is the current tracking object frame.
3. The person detection and tracking method according to any one of claims 1-2, wherein, in the person detection step, the filtering of the pixels of the current detection object frame depth image based on the connected domains of the current detection object frame depth image comprises: judging whether a pixel of the current detection object frame depth image belongs to a connected domain, and extracting the differences of the pixels belonging to that connected domain.
4. The person detection and tracking method according to claim 3, wherein, in the person detection step, the detecting of whether the connected domain relates to a person object based on the differences of the filtered pixels within the connected domain comprises: counting the total number of pixels belonging to the connected domain and the number of pixels of the connected domain whose difference is negative, computing the ratio of the number of negative-difference pixels to the total number of pixels, and judging according to this ratio whether the connected domain relates to a person object.
5. The person detection and tracking method according to claim 4, wherein, in the person detection step, before the detecting of whether the connected domain relates to a person object based on the differences of the filtered pixels within the connected domain, it is judged whether the absolute value of the difference of a pixel of the connected domain is less than a predetermined difference threshold, and, if so, the difference of that pixel is set to 0.
6. The person detection and tracking method according to claim 1, wherein, in the person tracking step, for the current tracking object frame in the multi-frame depth image, the matching degree between a connected domain in that frame and a person entry in the list is computed based on the two-dimensional rectangle, the mean depth and the number of pixels; if the matching degree meets a predetermined standard, the connected domain is judged to match the person entry, the person entry is regarded as tracked, and the information of the person entry in the list is updated with the information of the connected domain.
7. The person detection and tracking method according to claim 6, wherein the person tracking step further comprises: if the consecutive number of times that a person entry in the list fails to obtain a match is less than a predetermined first threshold, judging that the person entry is in a short-term lost state; if this number is greater than or equal to the predetermined first threshold, judging that the person entry is in a lost state; and if this number is greater than a predetermined second threshold, removing the person entry from the list.
8. The person detection and tracking method according to claim 7, wherein the person tracking step further comprises: outputting the information of the person entries in the short-term lost state.
9. A person detection and tracking system, comprising:
an input device for inputting a multi-frame depth image;
an object detection device for performing connected domain analysis on the depth image of each frame of the multi-frame depth image to detect connected domains;
a person detection device for, with respect to the depth image of a current detection object frame in the multi-frame depth image, computing the difference of the current detection object frame depth image based on the multi-frame depth image, filtering the pixels of the current detection object frame depth image based on the connected domains of the current detection object frame depth image, detecting whether a connected domain relates to a person object based on the differences of the filtered pixels within the connected domain, and, if the connected domain relates to a person object, taking the information of the connected domain as the information of a person entry in a list;
a person tracking device for tracking the person entries in the list based on connected domain matching, for a current tracking object frame in the multi-frame depth image;
a person check device for checking the tracked person entries based on predefined rules; and
an output device for outputting the information of the checked person entries,
wherein the person detection device computes the difference of the current detection object frame depth image by subtracting, from the depth value of each pixel of the current detection object frame depth image, the depth value of the corresponding pixel of a previous detection object frame depth image that precedes the current detection object frame by a predetermined number of frames, and divides the pixels in the depth image of the current detection object frame into three classes according to the sign of the computed difference: pixels whose distance is unchanged, pixels whose distance has become nearer, and pixels whose distance has become farther.
CN201110159513.3A 2011-06-14 2011-06-14 The detection and tracking method and system of people Active CN102831378B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110159513.3A CN102831378B (en) 2011-06-14 2011-06-14 The detection and tracking method and system of people


Publications (2)

Publication Number Publication Date
CN102831378A CN102831378A (en) 2012-12-19
CN102831378B true CN102831378B (en) 2015-10-21

Family

ID=47334509

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110159513.3A Active CN102831378B (en) 2011-06-14 2011-06-14 The detection and tracking method and system of people

Country Status (1)

Country Link
CN (1) CN102831378B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106355603B (en) * 2016-08-29 2019-10-22 深圳市商汤科技有限公司 Human body tracing method and human body tracking device
CN107958458B (en) 2016-10-17 2021-01-22 京东方科技集团股份有限公司 Image segmentation method, image segmentation system and equipment comprising image segmentation system
CN106548487B (en) * 2016-11-25 2019-09-03 浙江光跃环保科技股份有限公司 Method and apparatus for detection and tracking mobile object
WO2018195905A1 (en) * 2017-04-28 2018-11-01 深圳市大疆创新科技有限公司 Control method for landing unmanned aerial vehicle on palm, control apparatus, and unmanned aerial vehicle
CN107563313B (en) * 2017-08-18 2020-07-07 北京航空航天大学 Multi-target pedestrian detection and tracking method based on deep learning
CN107608392A (en) * 2017-09-19 2018-01-19 浙江大华技术股份有限公司 The method and apparatus that a kind of target follows
CN108596128B (en) * 2018-04-28 2020-06-26 京东方科技集团股份有限公司 Object recognition method, device and storage medium
CN113009897A (en) * 2021-03-09 2021-06-22 北京灵汐科技有限公司 Control method and device of intelligent household appliance, intelligent household appliance and storage medium
CN113223043A (en) * 2021-03-26 2021-08-06 西安闻泰信息技术有限公司 Method, device, equipment and medium for detecting moving target

Citations (3)

Publication number Priority date Publication date Assignee Title
CN101630364A (en) * 2009-08-20 2010-01-20 天津大学 Method for gait information processing and identity identification based on fusion feature
CN101702238A (en) * 2009-09-07 2010-05-05 无锡景象数字技术有限公司 Motion segmentation method based on relief image
CN101739686A (en) * 2009-02-11 2010-06-16 北京智安邦科技有限公司 Moving object tracking method and system thereof


Non-Patent Citations (2)

Title
Segmentation and labeling of multiple moving objects based on a "labeling" algorithm; Zheng Bigeng et al.; Journal of Shaanxi Normal University (Natural Science Edition); 30 Nov. 2007; full text *
Detection of moving vehicles at night; Tan Rongwei et al.; Computer Engineering and Applications; 1 Jan. 2006 (No. 13); pp. 227-228, 232 *

Also Published As

Publication number Publication date
CN102831378A (en) 2012-12-19


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant