CN101406390A - Method and apparatus for detecting part of human body and human, and method and apparatus for detecting objects - Google Patents

Method and apparatus for detecting part of human body and human, and method and apparatus for detecting objects Download PDF

Info

Publication number
CN101406390A
CN101406390A CNA2007101639084A CN200710163908A
Authority
CN
China
Prior art keywords
human body
image
head
difference image
people
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2007101639084A
Other languages
Chinese (zh)
Other versions
CN101406390B (en)
Inventor
陈茂林
郑文植
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Samsung Telecommunications Technology Research Co Ltd
Samsung Electronics Co Ltd
Original Assignee
Beijing Samsung Telecommunications Technology Research Co Ltd
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Samsung Telecommunications Technology Research Co Ltd, Samsung Electronics Co Ltd filed Critical Beijing Samsung Telecommunications Technology Research Co Ltd
Priority to CN2007101639084A priority Critical patent CN101406390B
Priority to KR1020080011390A priority patent KR101441333B1
Priority to US12/285,694 priority patent US8447100B2
Publication of CN101406390A publication Critical patent CN101406390A
Application granted granted Critical
Publication of CN101406390B publication Critical patent CN101406390B
Priority to US13/867,464 priority patent US9400935B2
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/255 Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
    • G06V10/40 Extraction of image or video features
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation


Abstract

The invention discloses a method and apparatus for detecting human body parts and humans by using difference images as feature images. The method comprises the following steps: calculating the difference image of an image to be detected; and detecting a first human body part, based on the calculated difference image of the image to be detected, by using a first-human-body-part model corresponding to the first human body part, wherein the first-human-body-part model is obtained by learning a feature set extracted from the difference images of positive samples and negative samples of the first human body part.

Description

Method and apparatus for detecting a part of a human body and a human, and method and apparatus for detecting an object
Technical field
The present invention relates to an object detection method and apparatus, and more particularly, to a method and apparatus for detecting a part of a human body and a human, and an object detection method and apparatus, which use difference images as feature images.
Background art
Object detection is extremely important for automatic video analysis technologies (for example, content-based video or image retrieval, video surveillance, video compression and driver assistance). In the real world, detecting people is one of the most challenging detection categories. Human detection can be applied in three kinds of situations:
The first situation is determining whether there is a person in the field of view. For example, in driver assistance, when a pedestrian on the road approaches the vehicle, the system warns the driver. This can be implemented as an embedded intelligent device integrated with an imaging device.
The second situation is coherent tracking of the same person with a still camera or a PTZ (pan-tilt-zoom) camera. The former can collect a person's motion trajectory, which is suitable for intelligent behavior analysis. The latter can adjust its attitude to follow a moving person and keep the person at the center of the image, so as to record his or her details or behavior. This can be implemented as a smart camera, or a PTZ camera connected through the Internet to a storage device or a display device.
The third situation is when a robot wishes to follow a person, or tries to gaze at a person during person-to-person interaction. The robot tries to detect the person's position in its camera image and takes a corresponding action, for example, moving, following, gazing, or adjusting its posture. This can be implemented as an embedded device integrated into the robot's functional units.
Clothing of all kinds and patterns causes great variability in both the local and the overall appearance of people, so only a few local regions can serve as features characteristic of the class; what is needed is a feature set that remains robust and discriminative even in cluttered backgrounds and under different lighting conditions. In addition, the global shape undergoes a wide range of deformations caused by the many possible articulations of the body and by occlusion, and a person's silhouette can change widely when several people appear in the same image region; this requires an algorithm that can infer the correct result from the overall evidence while overcoming a minority of interference.
Various attempts have been made to overcome these problems. Examples of these attempts include: a multi-view human head detection method (disclosed in "Multi-view human head detection in static images", Machine Vision Applications 2005, by M. Chen et al., hereinafter referred to as R1); a detection method using motion and appearance information (disclosed at the International Conference on Computer Vision 2003, by V. Paul et al., hereinafter referred to as R2); a method of detecting people using histograms of gradients (disclosed at the International Conference on Computer Vision and Pattern Recognition, by Q. Zhu et al., hereinafter referred to as R3); and a method of detecting people using the statistical distribution of edge orientations (disclosed in United States Patent 20060147108A1, hereinafter referred to as R4).
Existing human detection methods use either a global model (for example, a whole-body appearance or silhouette detector) or a set of local-feature or part detectors. The former extracts global features of a person and builds a global model based on appearance or silhouette [R1]. The latter decomposes the human body into several parts (for example, the head, the torso, the legs and the arms) [R2, R3, R4]; human detection is then performed through part detection, and the research problem is reduced to models corresponding to the human body parts. Model learning methods generally include SVM, AdaBoost and other auxiliary methods.
Face detection has made great progress in recent years and can reach a very high detection rate and a low false-alarm rate in real-time processing. For practical applications, however, human detection still needs a lot of work. First, a human detector should adapt to changes of human appearance with clothing patterns and different lighting conditions, and should be built on robust features that can capture the characteristic shapes of people under the various deformations of human appearance; finally, it should require a small amount of computation and process in real time.
Summary of the invention
Exemplary embodiments of the present invention overcome the above disadvantages and other disadvantages not described above. However, the present invention is not required to overcome the above disadvantages, and an exemplary embodiment of the present invention may not overcome any of the problems described above.
According to an aspect of the present invention, there is provided a method of detecting a human body part/human in an image, the method comprising: calculating the difference image of an image to be detected; and detecting a first human body part, based on the calculated difference image of the image to be detected, by using a first-human-body-part model corresponding to the first human body part, wherein the first-human-body-part model is obtained by learning a feature set extracted from the difference images of positive samples and negative samples of the first human body part.
According to a further aspect of the present invention, there is provided an apparatus for detecting a human body part/human in an image, the apparatus comprising: an image processor which calculates the difference image of an image; a training DB which stores positive samples and negative samples of the human body part; a sub-window processor which extracts a feature set from the difference images, calculated by the image processor, of the positive and negative samples stored in the training DB; and a first-human-body-part classifier which, based on the difference image of the image to be detected calculated by the image processor, uses the first-human-body-part model to detect the first human body part corresponding to the classifier, wherein the first-human-body-part model is obtained by learning the feature set extracted by the sub-window processor from the difference images of the positive and negative samples of the first human body part stored in the training DB.
According to a further aspect of the present invention, there is provided a method of detecting a human body part/human in an image, the method comprising: (a) calculating the difference image of the image to be detected; (b) based on the calculated difference image, using one of a plurality of human-body-part models, in one-to-one correspondence with a plurality of different human body parts, to detect the human body part corresponding to that model, wherein each of the plurality of models is obtained by learning a feature set extracted from the difference images of positive and negative samples of the corresponding human body part; (c) repeating step (b) for another human body part, among the plurality of different parts, that differs from the part of step (b), using the model corresponding to that other part to detect it, and removing false alarms from the detected parts according to human geometry based on the other detected part; and (d) based on the result of step (c), finally determining the detected human body parts and determining the detected human according to human geometry.
According to a further aspect of the present invention, there is provided an apparatus for detecting a human body part/human in an image, the apparatus comprising: a plurality of human-body-part detectors, in one-to-one correspondence with a plurality of different human body parts, each detecting its corresponding part; and a determiner which, according to human geometry and based on the parts detected by the plurality of detectors, removes false alarms to determine the human body parts and the human in the image; wherein each of the plurality of detectors comprises: an image processor which calculates the difference image of an image; a training DB which stores positive and negative samples of the human body part; a sub-window processor which extracts a feature set from the difference images, calculated by the image processor, of the positive and negative samples stored in the training DB; and a human-body-part classifier which, based on the difference image of the image to be detected calculated by the image processor, uses the human-body-part model to detect the part corresponding to the classifier, wherein the model is obtained by learning the feature set extracted by the sub-window processor from the difference images of the positive and negative samples of the part stored in the training DB.
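The geometry-based false-alarm removal described in the aspects above can be sketched as follows. This is a minimal illustration under assumed rules: the summary does not define the geometric constraints, so the specific check (a head is kept only if a wider torso detection lies roughly below it) is a hypothetical example, not the patented method.

```python
from dataclasses import dataclass

@dataclass
class Box:
    """A part detection: top-left corner, size, and part label."""
    x: float
    y: float
    w: float
    h: float
    part: str  # e.g. "head", "torso"

def consistent(head, torso, tol=0.5):
    """Hypothetical geometric rule: the torso must start roughly below the
    head, be horizontally aligned with it, and be at least as wide."""
    below = torso.y > head.y + head.h * (1 - tol)
    aligned = abs((head.x + head.w / 2) - (torso.x + torso.w / 2)) < torso.w / 2
    return below and aligned and torso.w >= head.w

def remove_false_alarms(heads, torsos):
    """Keep only head detections supported by at least one consistent torso."""
    return [h for h in heads if any(consistent(h, t) for t in torsos)]
```

A head detection with no geometrically consistent torso anywhere in the image is discarded as a false alarm; real systems would use learned or calibrated thresholds rather than the fixed `tol` above.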
According to a further aspect of the present invention, there is provided an imaging device comprising: an imaging unit which captures an image of an object; a detection unit which, based on the difference image of the captured image of the object, uses an object model to detect the region of the captured object in the image, wherein the object model is obtained by learning a feature set extracted from the difference images of positive samples and negative samples of the captured object; an attitude-parameter calculation unit which, according to the region in the image of the captured object detected by the detection unit, calculates parameters for adjusting the attitude of the imaging device so as to place the object in the central region of the image; a control unit which receives the attitude-adjustment parameters from the attitude-parameter calculation unit and adjusts the attitude of the imaging device; a storage unit which stores the captured image of the object; and a display unit which displays the captured image of the object.
According to a further aspect of the present invention, there is provided a method of detecting an object in an image, the method comprising: calculating the difference image of the image to be detected; and, based on the calculated difference image of the image to be detected, using an object model to detect the object, wherein the object model is obtained by learning a feature set extracted from the difference images of positive samples and negative samples of the object.
According to a further aspect of the present invention, there is provided an apparatus for detecting an object in an image, the apparatus comprising: an image processor which calculates the difference image of an image; a training DB which stores positive samples and negative samples of the object; a sub-window processor which extracts a feature set from the difference images, calculated by the image processor, of the positive and negative samples of the object stored in the training DB; and an object classifier which, based on the calculated difference image of the image to be detected, uses the object model to detect the object, wherein the object model is obtained by learning the feature set extracted from the difference images of the positive and negative samples of the object.
Description of drawings
These and/or other aspects, features and advantages of the present invention will become apparent and more readily appreciated from the following description of exemplary embodiments, taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a block diagram of a head detection apparatus according to an exemplary embodiment of the present invention;
Fig. 2 is a diagram illustrating a method by which the image processor 110 calculates difference images according to an exemplary embodiment of the present invention;
Fig. 3 illustrates some typical negative samples whose contour shapes are similar to those of positive head samples;
Fig. 4 illustrates the difference images of objects with linear structure;
Fig. 5 illustrates an example of the difference images of a person's head calculated by the method shown in Fig. 2;
Fig. 6 illustrates the sub-windows used in the feature extraction method according to an exemplary embodiment of the present invention;
Fig. 7 illustrates the division of the views of a person's head according to an exemplary embodiment of the present invention;
Fig. 8 illustrates a pyramid detector for detecting a person's head according to an exemplary embodiment of the present invention;
Fig. 9 is a flowchart of detection by the pyramid detector shown in Fig. 8 according to an exemplary embodiment of the present invention;
Fig. 10 is a block diagram of a detector having multiple human-body-part detectors according to an exemplary embodiment of the present invention;
Fig. 11 is a detailed block diagram of one exemplary detector of the multi-part detector of Fig. 10 according to an exemplary embodiment of the present invention;
Fig. 12 is a block diagram of a detector having multiple human-body-part detectors according to another exemplary embodiment of the present invention;
Fig. 13 is a block diagram of an imaging device according to an exemplary embodiment of the present invention;
Fig. 14 is a block diagram of the detection unit of Fig. 13 according to an exemplary embodiment of the present invention.
Detailed description of exemplary embodiments
Hereinafter, the present invention will be described more fully with reference to the accompanying drawings, in which exemplary embodiments of the invention are shown; like reference numerals denote like elements throughout the drawings. The embodiments are described below in order to explain the present invention.
Fig. 1 is a block diagram of a head detection apparatus according to an exemplary embodiment of the present invention.
Referring to Fig. 1, the head detection apparatus 100 comprises an image processor 110, a training DB 120, a sub-window processor 130 and a head classifier 140.
The image processor 110 calculates the difference images of an image. The training DB 120 stores positive and negative samples of people's heads. The sub-window processor 130 extracts a feature set from the difference images, calculated by the image processor 110, of the positive and negative head samples stored in the training DB 120. The head classifier 140, based on the difference image of the input image to be detected calculated by the image processor 110, detects the head region using a head model obtained by learning from the extracted feature set. The operation of the image processor 110 is now described in detail with reference to Fig. 2.
Fig. 2 is a diagram illustrating the method by which the image processor 110 calculates difference images according to an exemplary embodiment of the present invention. The image processor 110 calculates four difference images: dx, dy, du and dv.
Referring to Fig. 2, each pixel value of the difference image dx is the difference of pixels in the horizontal direction within an N*N neighborhood of the original image. Referring to dx in Fig. 2, if N equals 3, the sum of the pixel values of the gray rectangles minus the sum of the pixel values of the gray circles gives the value of the center pixel of dx. Each pixel value of the difference image dy is the difference of pixels in the vertical direction within the N*N neighborhood; referring to dy in Fig. 2, if N equals 3, the sum of the gray rectangles minus the sum of the gray circles gives the value of the center pixel of dy. Each pixel value of the difference image du is the difference of pixels on the right-left diagonal of the N*N neighborhood, and each pixel value of the difference image dv is the difference of pixels on the left-right diagonal, computed in the same way from the gray rectangles and gray circles of Fig. 2. In this way, each pixel of a difference image represents the average gray-level change of the pixels in the neighborhood along the target direction.
Meanwhile, difference images can be calculated at different scales. For example, in dx, dy, du and dv of Fig. 2, the sum of the pixel values of the black rectangles minus the sum of the pixel values of the black circles gives the value of the center pixel for a 5*5 neighborhood; in general, the neighborhood can be expressed as (2n+1)*(2n+1), n = 1, 2, .... For multiple scales, the image is successively sub-sampled; for example, when n equals 2, the difference image is calculated at every other pixel. For a coarse-resolution image, the difference image is calculated at a larger scale (that is, a larger neighborhood), to reduce the influence of background noise when extracting features. Meanwhile, for a high-resolution image, the difference image can be calculated at a smaller scale (that is, a smaller neighborhood), to capture local details.
Suppose the image width is w pixels (0...w-1) and the height is h pixels (0...h-1). The difference image is calculated for widths 1 to w-2 and heights 1 to h-2, and pixel values beyond the image border are taken to be 0. For example, when calculating the horizontal pixel difference of the 5*5 neighborhood for the pixel at width 1 and height 1, only part of the neighborhood lies within the image, and the two pixels beyond the image are given the value 0. Before the difference images are calculated, the source gray-level image is sub-sampled at a coarse scale; the feature images are then calculated as introduced above.
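A minimal NumPy sketch of the four directional difference images, with out-of-image pixels taken as 0 as the text specifies. The exact pattern of rectangles and circles in Fig. 2 is not recoverable from the text alone, so a simple two-point difference at offset n along each direction is assumed here:

```python
import numpy as np

def difference_images(img, n=1):
    """Compute dx, dy, du, dv over a (2n+1)x(2n+1) neighborhood.
    Pixels outside the image are treated as 0 (zero padding).
    Assumed weighting: each output pixel is the difference of the two
    opposite neighbors at offset n along the given direction."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    p = np.pad(img, n, mode="constant", constant_values=0.0)
    # shifted view of the padded image: element (i, j) = img[i+r, j+c]
    shift = lambda r, c: p[n + r: n + r + h, n + c: n + c + w]
    dx = shift(0, n) - shift(0, -n)    # horizontal change
    dy = shift(n, 0) - shift(-n, 0)    # vertical change
    du = shift(-n, n) - shift(n, -n)   # right-left diagonal change
    dv = shift(-n, -n) - shift(n, n)   # left-right diagonal change
    return dx, dy, du, dv
```

For a multi-scale feature image, the same routine would be applied to sub-sampled copies of the gray-level image, or called with a larger n on coarse-resolution input.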
Fig. 3 illustrates some typical negative samples whose contour shapes are similar to those of positive head samples. Although these negative samples consist of roughly rectilinear lines, they look like deformed shapes of a person's head, which challenges the discriminative ability of a classifier. Indeed, the false alarms that often appear in detection results mainly comprise objects whose shapes are similar to the shape of a person's head. On the other hand, this shows the importance of feature transformation and extraction. The present invention provides a better method of reducing this difficulty.
Fig. 4 illustrates the difference images of objects with linear structure.
Referring to Fig. 4, a straight-line object in the image is decomposed into the different difference images. This decomposition provides better raw material for feature extraction. The difference image dx preserves horizontal image changes, the difference image dy preserves vertical changes, and the difference images du and dv preserve the diagonal changes. Fig. 5 illustrates an example of the difference images of a person's head calculated by the method shown in Fig. 2. Compared with the difference images in Fig. 4, we can find that although the lines are decomposed, the contour of the head is well preserved.
Fig. 6 illustrates the sub-windows used by the sub-window processor 130 in the feature extraction method according to an exemplary embodiment of the present invention. Single-window features are created by sliding a single window over the training image and varying its width and height over the possible scales that can be adjusted with the image scale. Double-window features are created by sliding two windows over the training image, varying their width and height over the possible scales that can be adjusted with the image scale, and, at the same time, changing the width and height of the two windows by the same magnification factor so as to capture implicit patterns over a wide range of scales. The two windows can move relative to each other in the horizontal and vertical directions, to capture implicit patterns over a wide range. Triple-window features are created by sliding three connected windows over the training samples. The three windows keep the same width and height while the scale is changed, in order to capture patterns over a wide range of scales. The second window can move relative to the first and third windows, to capture convex and concave contours. There are two kinds of triple-window features: a horizontal layout of the three windows and a vertical layout of the three windows. In the horizontal layout, the second window can move horizontally relative to the first and third windows; in the vertical layout, the second window can move vertically relative to the first and third windows.
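The window sums that the features below are built from can be sketched directly. This is an illustrative reading of Fig. 6 under assumptions: the windows are taken as axis-aligned rectangles, and the shift of the middle window is modeled as a simple offset along the layout direction (a production implementation would use integral images for speed):

```python
import numpy as np

def window_sum(G, top, left, h, w):
    """Sum of difference image G over one rectangular feature window."""
    return float(G[top:top + h, left:left + w].sum())

def triple_window_sums(G, top, left, h, w, shift=0, horizontal=True):
    """Sums over three connected, equal-sized windows (Fig. 6).
    The middle window may be shifted along the layout direction, as the
    text describes, to capture convex and concave contours."""
    if horizontal:
        a = window_sum(G, top, left, h, w)
        b = window_sum(G, top, left + w + shift, h, w)
        c = window_sum(G, top, left + 2 * w, h, w)
    else:
        a = window_sum(G, top, left, h, w)
        b = window_sum(G, top + h + shift, left, h, w)
        c = window_sum(G, top + 2 * h, left, h, w)
    return a, b, c
```

The three sums would then be combined by the operators of equations (13) to (17), for example as 2b - a - c to respond to a bright or dark center band.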
Let f be an extracted feature, G a difference image, and w a feature window. OP1 is an operator used for feature extraction comprising the two simple operators '+' and '-': OP1 = {+, -}. OP2 is a second operator used in the present invention, comprising '+', '-' and the dominance operator domin: OP2 = {+, -, domin}.
For feature extraction on a single difference image, features can be calculated by equations (1), (2) and (3), corresponding respectively to the single, double and triple feature windows, where a is one of the four difference images:

f1^a = Σ_{(i,j)∈w1} G_a(i,j)    (1)
f2^a = OP1( Σ_{(i,j)∈w1} G_a(i,j), Σ_{(i,j)∈w2} G_a(i,j) )    (2)
f3^a = OP1( Σ_{(i,j)∈w1} G_a(i,j), Σ_{(i,j)∈w2} G_a(i,j), Σ_{(i,j)∈w3} G_a(i,j) )    (3)

For feature extraction on two difference images, features can be calculated by equations (4), (5) and (6), corresponding respectively to the single, double and triple feature windows overlaid on the difference images, where a and b are any two of the four difference images:

f1^ab = OP2(f1^a, f1^b)    (4)
f2^ab = OP2(f2^a, f2^b)    (5)
f3^ab = OP2(f3^a, f3^b)    (6)

For feature extraction on three difference images, features can be calculated by equations (7), (8) and (9), corresponding respectively to the single, double and triple feature windows, where a, b and c are any three of the four difference images:

f1^abc = OP2(f1^a, f1^b, f1^c)    (7)
f2^abc = OP2(f2^a, f2^b, f2^c)    (8)
f3^abc = OP2(f3^a, f3^b, f3^c)    (9)

For feature extraction on four difference images, features can be calculated by equations (10), (11) and (12), corresponding respectively to the single, double and triple feature windows, where a, b, c and d are the four difference images:

f1^abcd = OP2(f1^a, f1^b, f1^c, f1^d)    (10)
f2^abcd = OP2(f2^a, f2^b, f2^c, f2^d)    (11)
f3^abcd = OP2(f3^a, f3^b, f3^c, f3^d)    (12)
As shown in equations (13) to (17), operator OP1 comprises the addition and subtraction operators, while operator OP2 additionally comprises the dominance operator:

OP1(a,b) = (a+b) or (a-b)    (13)
OP2(a,b) = (a+b), (a-b), a/(a+b) or b/(a+b)    (14)
OP1(a,b,c) = (a+b+c) or (2b-a-c)    (15)
OP2(a,b,c) = (a+b+c), (2b-a-c), or a/(a+b+c), b/(a+b+c), c/(a+b+c)    (16)
OP2(a,b,c,d) = (a+b+c+d), (3a-b-c-d), (3b-a-c-d), (3c-a-b-d), (3d-a-b-c), or a/(a+b+c+d), b/(a+b+c+d), c/(a+b+c+d), d/(a+b+c+d)    (17)
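The operators of equations (13) to (17) can be sketched compactly. The index argument `i`, used to select which value dominates or is weighted in the four-value subtraction, is a notational convenience assumed here rather than notation from the patent:

```python
def op1(vals, kind):
    """OP1 (eqs. 13, 15): '+' sums the window values; '-' is a - b for
    two windows and 2b - a - c (center against sides) for three."""
    if kind == "+":
        return sum(vals)
    if len(vals) == 2:
        return vals[0] - vals[1]
    a, b, c = vals
    return 2 * b - a - c

def op2(vals, kind, i=0):
    """OP2 (eqs. 14, 16, 17) adds the dominance operator: the share of
    vals[i] in the total. For four values, '-' weights vals[i] as
    3*vals[i] minus the sum of the other three, i.e. 4*vals[i] - total."""
    total = sum(vals)
    if kind == "domin":
        return vals[i] / total
    if kind == "+":
        return total
    if len(vals) <= 3:
        return op1(vals, "-")
    return 4 * vals[i] - total
```

For example, applying `op2` with kind `"domin"` to the sums of the same window over dx and dy measures how strongly the local gradient is dominated by one direction.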
The sub-windows used in the feature extraction method shown in Fig. 6 are only exemplary and are not intended to be limiting; the present invention can be implemented with other numbers and types of sub-windows.
The sub-window processor 130 can extract features by scanning a single difference image, two difference images, three difference images or four difference images. The feature set can consist of the features extracted from a single difference image, from two, from three or from four difference images, or any combination of these. In addition, in order to obtain more features, the difference images can be calculated at different scales, and feature sets extracted from the difference images calculated at the different scales.
For the extracted feature set, a statistical learning method (for example, AdaBoost, SVM, etc.) is used to select the features with discriminative ability, producing the final classification model. In pattern recognition, a classification model is usually implemented as a classifier that uses the model.
In an exemplary embodiment of the present invention, positive and negative samples of people's heads are prepared, and the image processor 110 calculates the difference images of the positive and negative head samples. Based on the calculated difference images, the sub-window processor 130 uses the feature extraction method described above to create a feature set with a large number of features; a statistical method is then used to learn a head classification model, thereby obtaining the head classifier 140 that uses this model. Likewise, the same method can be used to obtain a model and a classifier for another part of the human body. For example, for the torso, positive and negative torso samples are prepared; the sub-window processor 130 extracts a feature set from the difference images of the positive and negative torso samples using the feature extraction method introduced above; and a statistical learning method (for example, AdaBoost, SVM, etc.) is used to select the discriminative features and produce the final torso model and torso classifier.
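The statistical learning step can be illustrated with a minimal discrete AdaBoost over threshold stumps, where each stump corresponds to one candidate feature. This is a stand-in sketch: the patent names AdaBoost and SVM without fixing a variant, and the tiny data set in the usage example is invented:

```python
import numpy as np

def train_adaboost_stumps(X, y, rounds=3):
    """Discrete AdaBoost with single-feature threshold stumps.
    X[i, f] is feature f of sample i; y[i] is -1 or +1.
    Each round selects the feature/threshold/polarity with the lowest
    weighted error, which is exactly the feature-selection role AdaBoost
    plays in the text."""
    n, m = X.shape
    w = np.full(n, 1.0 / n)
    model = []
    for _ in range(rounds):
        best = None
        for f in range(m):
            for thr in np.unique(X[:, f]):
                for sign in (1, -1):
                    pred = np.where(sign * (X[:, f] - thr) >= 0, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, f, thr, sign, pred)
        err, f, thr, sign, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)           # avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)           # stump weight
        w *= np.exp(-alpha * y * pred)                  # reweight samples
        w /= w.sum()
        model.append((alpha, f, thr, sign))
    return model

def predict(model, x):
    """Sign of the weighted vote of the selected stumps."""
    s = sum(alpha * (1 if sign * (x[f] - thr) >= 0 else -1)
            for alpha, f, thr, sign in model)
    return 1 if s >= 0 else -1
```

In the apparatus of Fig. 1, the rows of `X` would be the sub-window features of the positive and negative head samples, and the stumps retained in `model` identify the discriminative features.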
Those skilled in the art will understand that the head detection apparatus 100 shown in Fig. 1 is only exemplary; the head classifier 140 in the head detection apparatus 100 can be replaced with a classifier for another human body part (for example, a torso classifier, a leg classifier, an arm classifier or a whole-body classifier) to form a detection apparatus for that other body part, so as to detect the other body part in an image.
The head classifier 140 in Fig. 1 can also be a head detector with a plurality of head classifiers, for example the pyramid detector 800 shown in Fig. 8.
Fig. 7 illustrates the division of the view of people's head according to an exemplary embodiment of the present.Fig. 8 illustrates the pyramid detector of the head that is used to detect the people according to an exemplary embodiment of the present invention.
Referring to Fig. 7, since the contour shape changes with the angle of the camera toward the human head, the head views are classified into eight divisions: front, front-left, left, back-left, back, back-right, right, and front-right. The eight discrete views represent all views over the 360 degrees around the human head. Each view covers a range of viewing angles rather than a single viewpoint. For example, if the front view corresponds to 0 degrees and the left view to 90 degrees, then the front view actually covers [-22.5, +22.5] degrees, the front-left view covers [22.5, 67.5] degrees, and the left view covers [67.5, 112.5] degrees.
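The mapping from a viewing angle to one of the eight 45-degree divisions can be sketched as follows; the view names and the angle convention (front = 0 degrees, left = 90 degrees) follow the example above, while the modular-arithmetic bucketing is an illustrative implementation choice:

```python
VIEWS = ["front", "front-left", "left", "back-left",
         "back", "back-right", "right", "front-right"]

def head_view(angle_deg):
    """Map a viewing angle in degrees (front = 0, left = 90) to one of
    the eight 45-degree head-view divisions of Fig. 7."""
    a = angle_deg % 360.0
    # Shift by 22.5 degrees so each bucket is centered on its nominal angle,
    # e.g. the front bucket covers [-22.5, +22.5) degrees.
    return VIEWS[int(((a + 22.5) % 360.0) // 45.0)]
```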
A first head model is obtained by learning from the feature set extracted from the difference images of the positive and negative samples of the front, front-left, left, back-left, back, back-right, right, and front-right views of the human head, together with a first head classifier that uses the first head model.
A second head model is obtained by learning from the feature set extracted from the difference images of the positive and negative samples of the front and back views of the human head, together with a second head classifier that uses the second head model.
A third head model is obtained by learning from the feature set extracted from the difference images of the positive and negative samples of the left and right views of the human head, together with a third head classifier that uses the third head model.
A fourth head model is obtained by learning from the feature set extracted from the difference images of the positive and negative samples of the front-left, back-left, back-right, and front-right views of the human head, together with a fourth head classifier that uses the fourth head model.
In producing the above four classifiers, the feature sets are extracted by the same method as used in producing the classifier 140 of Fig. 1.
The above first, second, third, and fourth head classifiers correspond respectively to the A, F, P, and HP classifiers in Fig. 8.
Referring to Fig. 8, during detection the image processor 110 first computes four difference images of the input image, that is, the horizontal, vertical, left-right diagonal, and right-left diagonal difference images. The pyramid detector 800 then detects on the image, searching and testing every possible scale and position in the image. Regions that pass the evaluation of the pyramid detector 800 are accepted as human-head candidates; otherwise they are removed as false alarms.
Specifically, classifier A first searches and tests every possible scale and position of a human head in the image, yielding the head regions detected by classifier A. The head regions detected by classifier A are then further evaluated by classifiers F, P, and HP. If a sample is accepted by one of classifiers F, P, and HP, the sample is determined to be a head candidate of the head view corresponding to that classifier. Classifiers F, P, and HP evaluate the samples accepted by classifier A one by one, until all samples accepted by classifier A have been evaluated.
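The two-level evaluation just described can be sketched as follows. The classifier callables and the toy windows are stand-ins for the trained models A, F, P, and HP, not the models themselves:

```python
def pyramid_detect(windows, clf_all, view_clfs):
    """Two-level pyramid evaluation (illustrative): a coarse all-view
    classifier filters candidate windows, then per-view classifiers
    confirm each survivor and label its view."""
    candidates = []
    for w in windows:
        if not clf_all(w):           # level 1: all-view classifier A
            continue                 # rejected: treated as a false alarm
        for view, clf in view_clfs:  # level 2: view-specific classifiers
            if clf(w):
                candidates.append((w, view))
                break                # accepted by one view classifier
    return candidates

# Toy stand-ins: each window is (position, true_view) and each
# classifier simply checks the view tag (an assumption for the sketch).
clf_a = lambda w: w[1] != "background"
view_clfs = [("front/back", lambda w: w[1] == "front/back"),
             ("left/right", lambda w: w[1] == "left/right"),
             ("half-views", lambda w: w[1] == "half-views")]
windows = [(0, "background"), (1, "front/back"), (2, "left/right")]
candidates = pyramid_detect(windows, clf_a, view_clfs)
```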
Fig. 9 is a flowchart of the detection performed by the pyramid detector 800 shown in Fig. 8 according to an exemplary embodiment of the present invention.
Referring to Fig. 9, in the training process, positive and negative samples of the front, front-left, left, back-left, back, back-right, right, and front-right views of the human head are prepared.
As described above, the first head model is obtained by learning from the feature set extracted from the difference images of the positive and negative samples of the front, front-left, left, back-left, back, back-right, right, and front-right views of the human head; the second head model is obtained by learning from the feature set extracted from the difference images of the positive and negative samples of the front and back views of the human head; the third head model is obtained by learning from the feature set extracted from the difference images of the positive and negative samples of the left and right views of the human head; the fourth head model is obtained by learning from the feature set extracted from the difference images of the positive and negative samples of the front-left, back-left, back-right, and front-right views of the human head. Classifiers A, F, P, and HP of the pyramid detector 800 use the first, second, third, and fourth head models, respectively.
In the detection process, the four difference images of the input image are computed first, that is, the horizontal, vertical, left-right diagonal, and right-left diagonal difference images. The pyramid detector 800 then detects on the input image and outputs the detected human heads as the detection result.
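The four directional difference images amount to subtracting neighboring pixels along the four directions. A minimal sketch follows, where the list-of-rows image representation and the absolute-difference convention are illustrative assumptions:

```python
def diff_images(img):
    """Compute the horizontal, vertical, left-right diagonal, and
    right-left diagonal difference images of a grayscale image (a list
    of rows of ints) by subtracting neighboring pixels.  Each result is
    one row and/or column smaller than the input."""
    h, w = len(img), len(img[0])
    horiz = [[abs(img[y][x + 1] - img[y][x]) for x in range(w - 1)]
             for y in range(h)]
    vert  = [[abs(img[y + 1][x] - img[y][x]) for x in range(w)]
             for y in range(h - 1)]
    diag1 = [[abs(img[y + 1][x + 1] - img[y][x]) for x in range(w - 1)]
             for y in range(h - 1)]                 # left-right diagonal
    diag2 = [[abs(img[y + 1][x] - img[y][x + 1]) for x in range(w - 1)]
             for y in range(h - 1)]                 # right-left diagonal
    return horiz, vert, diag1, diag2

img = [[0, 1],
       [2, 4]]
horiz, vert, diag1, diag2 = diff_images(img)
```

Note that, consistent with the advantage stated later in this description, only pixel subtractions are required, with no division or arctangent operations.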
In another exemplary embodiment of the present invention, a classifier (detector) for at least one body part other than the human head (for example, the torso, the legs, or the arms) is used, and head false alarms are further removed, according to human geometry, from the head regions determined by the pyramid detector 800 shown in Fig. 8, improving the precision of human-head detection. In addition, the human-head detector according to an exemplary embodiment of the present invention can be used as a human detector, because when a human head is detected, the presence of a person can be determined.
Figure 10 is a block diagram of a detector 1000 having multiple body-part detectors according to an exemplary embodiment of the present invention.
Referring to Fig. 10, the detector 1000 comprises three part detectors, one of which serves as the primary detector. The detector 1000 detects in the input image the body part corresponding to the primary detector. In an exemplary embodiment of the present invention, part detector I is used as the primary detector.
The operation of the detector 1000 will now be described. First, the input image to be detected is input to part detector I. Part detector I detects in the input image the region of body part I corresponding to part detector I. Next, part detector II detects in the input image the region of body part II corresponding to part detector II. At this point, according to human-geometry constraints, and based on the body-part-II regions detected by part detector II, some of the body-part-I candidates output by part detector I are removed as false alarms (some of the body-part-II candidates are likewise removed), while the body-part-II regions corresponding to the remaining body-part-I candidates are kept. Part detector N then detects in the input image the region of body part N corresponding to part detector N. Based on the detected body-part-N regions and the human-geometry constraints, further false alarms are removed from the body-part-I candidates that survived the evaluation by part detector II (some of the body-part-II and body-part-N candidates are likewise removed), yielding the body-part-II and body-part-N regions corresponding to the remaining body-part-I candidates. In this way, compared with using only the detector for body part I, using the detectors of multiple parts to detect body part I yields a more accurate result. Moreover, by using multiple part detectors in this way according to human geometry, both people and the other body parts can be detected from the input image.
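The cross-validation between part detectors can be sketched as follows. The candidate representation, the geometric predicate, and its numeric thresholds are illustrative assumptions standing in for constraints learned from training data:

```python
def cross_validate_parts(head_cands, torso_cands, consistent):
    """Remove head false alarms using torso detections (illustrative).
    A head candidate survives only if some torso candidate is
    geometrically consistent with it; `consistent` stands in for the
    human-geometry constraints described in the text."""
    kept_heads, kept_torsos = [], []
    for head in head_cands:
        matches = [t for t in torso_cands if consistent(head, t)]
        if matches:
            kept_heads.append(head)
            kept_torsos.extend(m for m in matches if m not in kept_torsos)
    return kept_heads, kept_torsos

# Toy candidates as (x, y, size); the predicate below (torso roughly
# below and near the head) is an illustrative placeholder.
consistent = lambda head, torso: (abs(torso[0] - head[0]) < head[2]
                                  and 0 < torso[1] - head[1] < 3 * head[2])
heads = [(10, 10, 5), (50, 10, 5)]   # the second head has no torso support
torsos = [(11, 20, 12)]
kept_heads, kept_torsos = cross_validate_parts(heads, torsos, consistent)
```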
As described above, part detector II serves as a validator of the candidates from part detector I. In this case, part detector II scans for body part II in the neighborhood of the candidates from part detector I. The neighborhood is determined according to the human-geometry constraints. For example, if part detector I is a head detector and part detector II is a torso detector, then based on a head candidate from part detector I, the approximate position and size of the torso in the scanning space (that is, the input image to be detected) can be determined according to the relative-position and size constraints between the human head and torso, which can be obtained by statistical analysis of the positive training samples.
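Deriving the torso scan region from a head candidate might be sketched as below. The ratio defaults are illustrative placeholders for values that would come from statistics over positive training samples, not figures from this specification:

```python
def torso_search_region(head, pos_ratio=1.2, width_ratio=3.0, height_ratio=4.0):
    """Derive an approximate torso scan region from a head candidate
    given as (x, y, w, h).  The ratios are illustrative stand-ins for
    statistics gathered from positive training samples."""
    x, y, w, h = head
    cx = x + w / 2.0                      # head center column
    tw = width_ratio * w                  # expected torso width
    th = height_ratio * h                 # expected torso height
    ty = y + h * pos_ratio                # torso starts just below the head
    return (cx - tw / 2.0, ty, tw, th)    # (x, y, width, height)

region = torso_search_region((40, 10, 20, 20))
```

Restricting part detector II to such a region is what makes the validation step cheap compared with scanning the whole image.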
In the present invention, the number of part detectors in the detector 1000 is not limited to three; the detector 1000 can be realized with fewer or more than three part detectors. The part detectors can be head detectors, torso detectors, leg detectors, arm detectors, whole-body detectors, and the like.
Figure 11 is a detailed block diagram of an exemplary detector 1100 of the detector 1000 having multiple body-part detectors of Fig. 10, according to an exemplary embodiment of the present invention.
Referring to Fig. 11, the detector 1100 comprises a determiner 1110, a head detector 1120, a leg detector 1130, and a torso detector 1140. In the present embodiment, the head detector 1120 is the primary detector.
The head detector 1120 can be the head detection apparatus 100 shown in Fig. 1. The leg detector 1130 and the torso detector 1140 have the same structure as the head detection apparatus of Fig. 1.
Each of the head detector 1120, the leg detector 1130, and the torso detector 1140 comprises an image processor (not shown), a training DB (not shown), and a sub-window processor (not shown), which have the same functions as the image processor 110, the training DB 120, and the sub-window processor 130 in Fig. 1; their detailed description is therefore omitted. In addition, the head detector 1120 comprises a head classifier obtained from a head model learned from a feature set, where the sub-window processor included in the head detector 1120 extracts the feature set from the difference images of the positive and negative head samples stored in the training DB included in the head detector 1120, and the image processor included in the head detector 1120 computes the difference images of the positive and negative head samples. Likewise, the leg detector 1130 comprises a leg classifier obtained from a leg model learned from a feature set extracted by its sub-window processor from the difference images, computed by its image processor, of the positive and negative leg samples stored in its training DB; and the torso detector 1140 comprises a torso classifier obtained from a torso model learned from a feature set extracted by its sub-window processor from the difference images, computed by its image processor, of the positive and negative torso samples stored in its training DB.
First, the head detector 1120, the leg detector 1130, and the torso detector 1140 each detect on the image and output head candidates, leg candidates, and torso candidates, respectively.
The determiner 1110, based on the head candidates from the head detector 1120 and the torso candidates from the torso detector 1140, removes false alarms from the detected head candidates according to human geometry (for example, the relative-position and size constraints between the human head and torso). Then, based on the head candidates with false alarms removed, the determiner 1110 combines the torso candidates with the leg candidates according to human geometry to further remove false alarms among the head candidates, so that the human head, torso, legs, and the person can be detected.
In another exemplary embodiment of the present invention, the head detector 1120 can also be the pyramid detector 800 shown in Fig. 8.
In yet another exemplary embodiment of the present invention, the head detector 1120, the leg detector 1130, and the torso detector 1140 do not each include an image processor, a training DB, and a sub-window processor; instead, the head detector 1120, the leg detector 1130, and the torso detector 1140 share the same image processor, training DB, and sub-window processor.
Figure 12 is a block diagram of a detector having multiple body-part detectors according to another exemplary embodiment of the present invention.
Referring to Fig. 12, the detector 1200 comprises N part detectors I-N (N is a natural number). In addition, the detector 1200 comprises an image processor (not shown), a training DB (not shown), and a sub-window processor (not shown), which have the same functions as the image processor 110, the training DB 120, and the sub-window processor 130 in Fig. 1; their detailed description is therefore omitted. Each of the part detectors I-N can include its own image processor, training DB, and sub-window processor, or they can share the same image processor, training DB, and sub-window processor.
In addition, the detector 1200 comprises a determiner (not shown) that, according to human geometry and based on the body parts detected by the part detectors I-N, removes false alarms among the detected body parts to determine the body parts and the person in the image to be detected.
Detector I comprises m1 classifiers S11, S12, S13, ..., S1m1 for body part I; detector II comprises m2 classifiers S21, S22, S23, ..., S2m2 for body part II; ...; detector n (n = 1, 2, 3, ..., N, denoting the n-th of the N part detectors) comprises mn classifiers Sn1, Sn2, Sn3, ..., Snmn for body part n; ...; detector N comprises mN classifiers SN1, SN2, SN3, ..., SNmN for body part N (where m1, m2, ..., mN are natural numbers). The classifiers are trained by the same method as the classifiers for the respective parts described with reference to Fig. 11, so the description of classifier training is omitted. As shown in Fig. 12, within each of the N part detectors of the detector 1200, the classifiers are arranged and used in ascending order of their required computation. In general, the computation required by a classifier roughly corresponds to the number of features it uses, so the classifiers are arranged in ascending order of their number of features. The operation of the detector 1200 is described in detail below with reference to Fig. 12.
After the image to be detected is input to the detector 1200, it is first passed through several front-end classifiers of detector I (that is, S11 and S12), yielding candidates for body part I (operation reaches point A). The image is then passed through several front-end classifiers of detector II (that is, S21 and S22), yielding candidates for body part II. Next, the determiner, according to human geometry and based on the obtained candidates for body parts I and II, removes false alarms from the body-part-I candidates (operation reaches point B). In the same way, the front-end classifiers of the remaining detectors III-N are used in turn, and the determiner further removes false alarms (operation reaches point F). Operation then reaches point K: the remaining classifiers of detector I are used for detection and the determiner further removes false alarms, after which detectors II through N are used in turn, and detectors I-N are used repeatedly until all classifiers in detectors I-N have been used. The principle of this detection scheme is that, by first cooperatively using the front-end classifiers of each detector (those with fewer features), most false alarms can be removed with little computation, and only then are the classifiers with more features applied step by step, which greatly improves detection speed. For the same false alarm, if only detector I is used, classifiers S11, S12, and S13, comprising 50 features in total, may be needed to remove it; if classifiers S11 and S12 of detector I and classifier S21 of detector II are used instead, only 5 features may suffice. Switching from one detector to the next occurs automatically according to a predetermined threshold on the number of features in the classifiers. Although the threshold can be chosen in different ways, the principle is the same: switch between detectors so that false alarms are removed up front with fewer features.
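The cost-ordered cascade just described can be sketched as follows. The stage feature counts and threshold classifiers are illustrative assumptions, not trained models from this specification:

```python
def interleaved_cascade(window, stages):
    """Evaluate one detection window through classifier stages ordered
    by ascending feature count across all part detectors (illustrative).
    Each stage is (n_features, classifier); returns (accepted,
    features_spent), so cheap stages reject false alarms early."""
    spent = 0
    for n_features, clf in sorted(stages, key=lambda s: s[0]):
        spent += n_features
        if not clf(window):
            return False, spent   # rejected: stop paying for features
    return True, spent

# Toy stages drawn from two part detectors; the counts and the simple
# threshold classifiers are assumptions for the sketch.
stages = [(5, lambda w: w > 0),    # cheap front-end stage
          (10, lambda w: w > 1),
          (35, lambda w: w > 2)]   # expensive back-end stage
```

A false alarm that fails the first stage costs only 5 features, while a true candidate pays for all 50, which mirrors the speed-up argument in the text.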
In one exemplary embodiment of the present invention, all classifiers in the detector 1200 are used for detection in ascending order of the number of features they have. That is, when the detector 1200 performs detection, the part detector to which a classifier belongs is disregarded; the classifiers in the detector 1200 are used in ascending order of their number of features, starting with the classifier having the fewest features.
The present invention is not limited to the detection of body parts and people; it can be applied to the detection of any object with a definite shape (for example, animals, plants, buildings, natural scenes, and articles of daily life and production).
Figure 13 is a block diagram of an imaging device 1300 with an object detection function according to an exemplary embodiment of the present invention.
Referring to Fig. 13, the imaging device 1300 comprises an imaging unit 1301, an object detecting unit 1302, a parameter calculation unit 1303, a control unit 1304, a storage unit 1305, a display unit 1306, and a marking unit 1307. The imaging device can be any of a PTZ (pan, tilt, zoom) camera, a static surveillance camera, a DC (digital camera), a camera phone, a DV (digital video camera), and a PVR (personal video recorder).
The imaging unit 1301 is a hardware device, for example a CCD or CMOS device, used to sense and produce natural images.
For tracking a moving object, there are two methods of providing the position and size of the moving object's region. The first is an automatic method, which uses an embedded object detection function to extract the size and position of the region of the object of interest. The second is a manual method, in which the user or operator marks the region of the object of interest on the displayed image (for example, on a touch screen). With the automatic method, using the object detection method according to the present invention, the object can be detected automatically. The marking unit 1307 provides the marking function, so that the user or operator can manually mark the object of interest on the image with a stylus or finger.
The object detecting unit 1302 can receive image data from the imaging unit 1301, and can also receive the size and position information of the object region of interest marked by the user, for example in the form of a rough outline. The object detecting unit 1302 detects the exact region where the object is located, and the parameter calculation unit 1303 calculates and produces, from the object region provided by the object detecting unit 1302, the parameters for adjusting the pose of the imaging device. Note that when the first method (that is, the automatic method) of providing the object position and size is used, the marking unit 1307 is optional. When multiple tracked objects are available for selection, for example when multiple moving objects are being tracked, the user can modify the tracked object selected automatically by the imaging device of the present invention. The object detecting unit 1302 is described in detail below with reference to Fig. 14.
Figure 14 is a block diagram of the object detecting unit of Fig. 13 according to an exemplary embodiment of the present invention.
Referring to Fig. 14, the object detecting unit 1302 comprises an image processor 1410, a training DB 1420, a sub-window processor 1430, an object classifier 1440, and an output unit 1450.
The image processor 1410 computes difference images of the image output by the imaging unit 1301 of Fig. 13, using the method described with reference to Fig. 2. The training DB 1420 stores positive and negative samples of various objects (for example, various living things, plants, buildings, natural scenes, and articles of daily life and production). The sub-window processor 1430 extracts feature sets, using the feature extraction method described with reference to Fig. 6, from the difference images, computed by the image processor 1410, of the positive and negative samples of the objects stored in the training DB 1420. The object classifier 1440 detects the object region in the image, based on the difference images of the input image to be detected computed by the image processor 1410, using an object model obtained by learning from the extracted feature sets. The object model can be obtained by the same method as the learning method for the head model described above. The output unit 1450 outputs the region in the image where the object from the object classifier 1440 and/or the marking unit 1307 is located.
Alternatively, the object detecting unit 1302 may omit the training DB 1420 and the sub-window processor 1430, with the classification models (that is, the object models) of various predetermined objects preset in the object classifier 1440. The object classifier 1440 then configures the object classification model according to the object type the user wishes to detect.
The control unit 1304 can adjust the pose of the imaging device, either by controlling the pan, tilt, zoom, and focus-region-selection operations of a PTZ camera, or by controlling the zoom and focus operations of a static surveillance camera, DC, DV, or PVR. The control unit 1304 receives the parameters for adjusting the pose of the imaging device from the parameter calculation unit 1303. The object detecting unit 1302 can provide the position and size information of the object at a new time point or in a new frame of data. The control unit 1304 adjusts the pose of the imaging device according to the parameters: it centers the object in the image by pan/tilt operations, selects the region of the object of interest by the focus-region-selection operation, and focuses on the region of the object of interest by the zoom operation, so as to obtain high-quality details of the moving object. In the focus-region-selection operation, the control unit can direct the imaging device to select the new region where the object is located as the basis for focusing, so that that region is brought into focus. In addition, when the control unit controls the focus-region selection of the imaging device, the imaging device, instead of selecting the image center region as the default focus region, dynamically selects the new image region where the object is located as the focus region, and dynamically adjusts the zoom factor, focal length, and pan or tilt parameters of the imaging device according to the image data of the focus region, thereby obtaining a better imaging result.
For a handheld electronic product such as a DC, DV, or PVR, the user can manually adjust its pose so that the object of interest is centered in the image; the control unit in an exemplary embodiment of the present invention can then dynamically adjust the zoom factor and focus parameters of the imaging device according to the detection result and the parameters provided by the parameter calculation unit.
The storage unit 1305 stores the images or video, and the display unit 1306 displays the live image or video to the user.
The object detecting unit 1302 according to an exemplary embodiment of the present invention can also be implemented as software running on an embedded system connected to the imaging device and its control unit, to adjust the pose parameters of the imaging device. Such an embedded imaging device system can receive video as input and send commands to the control unit of the imaging device to adjust the pose, lens focus region, and so on of the imaging device.
The present invention has the following effects and advantages:
(1) Less computation. Computing the difference images requires only subtraction of neighboring pixels, with no division or arctangent operations.
(2) Full representation. The source image is divided into multiple sub-images at multiple scales, so the source image is fully represented without quantization.
(3) Strong discriminative power. Experiments show that the new features reduce complexity and improve efficiency.
(4) The multi-view, multi-part detectors can be freely combined into a human detector according to the user's needs.
(5) Multi-scale feature extraction reduces the influence of background noise. At coarse scales, speckled background clutter can be suppressed; at fine scales, local detail can be obtained.
Although the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the present invention as defined by the claims.

Claims (47)

1. A method of detecting a body part/person in an image, the method comprising:
calculating a difference image of the image to be detected;
detecting a first body part, based on the calculated difference image of the image to be detected, using a first body-part model corresponding to the first body part, wherein the first body-part model is obtained by learning a feature set extracted from difference images of positive and negative samples of the first body part.
2. The method of claim 1, further comprising:
detecting at least one second body part different from the first body part, based on the calculated difference image of the image to be detected, using a second body-part model corresponding to the second body part, wherein the second body-part model is obtained by learning a feature set extracted from difference images of positive and negative samples of the second body part;
removing false alarms from the detected first body part, based on the detected second body part, according to spatial-position and size constraints of the different body parts learned from human geometry.
3. The method of claim 1 or 2, wherein the difference image comprises difference images calculated at at least one scale in the horizontal direction, the vertical direction, the left-right diagonal direction, and the right-left diagonal direction of the image.
4. The method of claim 3, wherein the step of extracting the feature set from the difference images comprises extracting the feature set using a single window or multiple windows in at least one of the difference images in the horizontal direction, the vertical direction, the left-right diagonal direction, and the right-left diagonal direction.
5. The method of claim 4, wherein the first body part is one of a human head, a human torso, human legs, human arms, and a whole human body.
6. The method of claim 5, wherein the first body part is a human head.
7. The method of claim 6, wherein the positive and negative samples of the first body part comprise positive and negative samples of the front, front-left, left, back-left, back, back-right, right, and front-right views of the human head.
8. The method of claim 7, wherein the first body-part model comprises:
a first head model obtained by learning from the positive and negative samples of the front, front-left, left, back-left, back, back-right, right, and front-right views of the human head and used to detect the head;
a second head model obtained by learning from the positive and negative samples of the front and back views of the human head and used to detect the front and back views of the head;
a third head model obtained by learning from the positive and negative samples of the left and right views of the human head and used to detect the left and right views of the head;
a fourth head model obtained by learning from the positive and negative samples of the front-left, back-left, back-right, and front-right views of the human head and used to detect the front-left, back-left, back-right, and front-right views of the head.
9. The method of claim 8, wherein the step of detecting the first body part corresponding to the first body-part model, based on the calculated difference image of the image to be detected, comprises:
detecting a head in the image to be detected using the first head model, based on the calculated difference image of the image to be detected;
evaluating the head detected by the first head model using the second, third, or fourth head model, based on the calculated difference image of the image to be detected, so as to remove front and back view false alarms, left and right view false alarms, and front-left, back-left, back-right, and front-right view false alarms of the human head, respectively.
10. The method of claim 9, wherein the second body part comprises at least one of a human torso, human legs, human arms, and a whole human body.
11. An apparatus for detecting a body part/person in an image, the apparatus comprising:
an image processor that calculates a difference image of the image;
a training DB that stores positive and negative samples of body parts;
a sub-window processor that extracts a feature set from the difference images, calculated by the image processor, of the positive and negative samples of the body parts stored in the training DB;
a first body-part classifier that detects a first body part corresponding to the first body-part classifier, based on the difference image of the image to be detected calculated by the image processor, using a first body-part model, wherein the first body-part model is obtained by learning the feature set extracted by the sub-window processor from the difference images of the positive and negative samples of the first body part stored in the training DB.
12. The apparatus of claim 11, further comprising:
at least one second body-part classifier that detects a second body part, based on the difference image of the image to be detected calculated by the image processor, using a second body-part model, wherein the second body-part model is obtained by learning the feature set extracted by the sub-window processor from the difference images of the positive and negative samples of the second body part stored in the training DB, and wherein each of the at least one second body-part classifier corresponds one-to-one with at least one second body part different from the first body part;
a determiner that removes false alarms from the detected first body part, based on the detected second body part, according to spatial-position and size constraints of the different body parts learned from human geometry.
13. The apparatus of claim 11 or 12, wherein the image processor computes difference images of the image at at least one scale in the horizontal, vertical, left-to-right diagonal, and right-to-left diagonal directions.
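The four directional difference images of claim 13 can be sketched as pixel differences along each direction. The exact difference operator is not specified in this record, so a minimal one-pixel forward difference is assumed here:

```python
def difference_images(img):
    """Compute the horizontal, vertical, and two diagonal difference
    images of a 2-D grayscale image (list of lists) using one-pixel
    forward differences; border pixels without a neighbor yield 0."""
    h, w = len(img), len(img[0])

    def diff(dy, dx):
        out = [[0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    out[y][x] = abs(img[ny][nx] - img[y][x])
        return out

    return {
        "horizontal": diff(0, 1),   # toward the right neighbor
        "vertical": diff(1, 0),     # toward the lower neighbor
        "diag_lr": diff(1, 1),      # left-to-right diagonal
        "diag_rl": diff(1, -1),     # right-to-left diagonal
    }
```

The multi-scale variant in the claim would apply the same operator to down-sampled copies of the image.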
14. The apparatus of claim 13, wherein the sub-window processor extracts the feature set by using single windows or multiple windows in at least one of the difference images in the horizontal, vertical, left-to-right diagonal, and right-to-left diagonal directions.
15. The apparatus of claim 14, wherein the first human body part is one of a human head, a human torso, human legs, human arms, and a whole human body.
16. The apparatus of claim 15, wherein the first human body part is a human head.
17. The apparatus of claim 16, wherein the positive and negative samples of the first human body part comprise positive and negative samples of the front, the rear, the left and right sides, and the left, left-rear, right, and right-rear halves of a human head.
18. The apparatus of claim 17, wherein the first human body part classifier comprises:
a first head classifier which detects heads, based on the head difference images, by using a first head model learned from positive and negative samples of the front, the rear, the left and right sides, and the left, left-rear, right, and right-rear halves of a human head;
a second head classifier which detects the front and rear of a head, based on the head difference images, by using a second head model learned from positive and negative samples of the front and rear of a human head;
a third head classifier which detects the left and right sides of a head, based on the head difference images, by using a third head model learned from positive and negative samples of the left and right sides of a human head;
a fourth head classifier which detects the left-rear half, left half, right-rear half, and right half of a human head, based on the head difference images, by using a fourth head model learned from positive and negative samples of the left-rear half, left half, right-rear half, and right half of a human head.
19. The apparatus of claim 18, wherein the heads detected by the first head classifier are evaluated using the second, third, and fourth head classifiers, so as to remove false alarms for the front and rear of the head, for the left and right sides of the head, and for the left-rear, left, right-rear, and right halves of the head, respectively.
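The verification scheme of claim 19 can be sketched as follows: candidates found by the general head model are kept only if at least one orientation-specific verifier also accepts them. The verifier predicates here are hypothetical placeholders, not the patent's trained classifiers:

```python
def prune_head_candidates(candidates, verifiers):
    """Keep a candidate from the general (first) head model only if at
    least one orientation-specific verifier (the second/third/fourth
    head models in the claims) also accepts it; otherwise the candidate
    is treated as a false alarm and dropped.  `verifiers` maps an
    orientation name to a predicate over a candidate window."""
    kept = []
    for cand in candidates:
        if any(accepts(cand) for accepts in verifiers.values()):
            kept.append(cand)
    return kept
```

A usage sketch with dummy score thresholds: `verifiers = {"front_rear": f, "left_right": g, "half_profiles": h}`, where each function inspects the candidate window.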
20. The apparatus of claim 19, wherein the second human body part comprises at least one of a human torso, human legs, human arms, and a whole human body.
21. A method of detecting human body parts and/or humans in an image, the method comprising:
(a) computing difference images of an input image;
(b) detecting, based on the computed difference images of the input image, a human body part corresponding to one of a plurality of human body part models that correspond one-to-one to a plurality of different human body parts, wherein each of the plurality of human body part models is obtained by learning a feature set extracted from the difference images of positive and negative samples of the human body part corresponding to that model;
(c) repeating step (b) for another of the plurality of different human body parts that differs from the human body part of step (b), detecting the other human body part by using the human body part model corresponding to it, and removing false alarms from the detected human body parts, based on the detected other human body part, according to human body geometry;
(d) finally determining the detected human body parts based on the result of step (c), and determining the detected human according to human body geometry.
22. The method of claim 21, wherein the difference images comprise difference images computed at at least one scale in the horizontal, vertical, left-to-right diagonal, and right-to-left diagonal directions of the image.
23. The method of claim 22, wherein extracting the feature set from the difference images comprises extracting the feature set by using single windows or multiple windows in at least one of the difference images in the horizontal, vertical, left-to-right diagonal, and right-to-left diagonal directions.
24. The method of claim 23, wherein each of the plurality of different human body parts corresponds to a plurality of human body part models.
25. The method of claim 24, wherein step (b) comprises: for one human body part among the plurality of different human body parts, using at least one of the plurality of human body part models corresponding to that human body part, in ascending order of the number of features each model has, to remove false alarms.
26. The method of claim 25, wherein step (c) comprises: repeating step (b) for the plurality of different human body parts in a predetermined order until the human body part models corresponding to all of the plurality of different human body parts have been used.
27. The method of claim 24, wherein steps (b) and (c) further comprise: using all of the human body part models in ascending order of the number of features each model has, regardless of the detection order of the human body parts corresponding to the models.
28. The method of claim 26 or 27, wherein the plurality of human body parts comprise a human head, a human torso, human legs, human arms, and a whole human body.
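The geometry-based false-alarm removal of claims 21–28 can be sketched for a head/torso pair: a head detection survives only if some torso detection lies at a plausible relative position and scale. The tolerances below are illustrative values, not figures from the patent:

```python
def geometry_filter(heads, torsos, x_tol=0.6, y_min=0.5, y_max=2.0):
    """Remove head detections with no geometrically consistent torso.
    Boxes are (x, y, w, h) with y growing downward.  A torso supports
    a head when its centre is horizontally within x_tol head-widths of
    the head centre and its top edge lies between y_min and y_max head
    heights below the head's top edge."""
    kept = []
    for hx, hy, hw, hh in heads:
        hcx = hx + hw / 2.0
        for tx, ty, tw, th in torsos:
            tcx = tx + tw / 2.0
            if (abs(tcx - hcx) <= x_tol * hw
                    and y_min * hh <= ty - hy <= y_max * hh):
                kept.append((hx, hy, hw, hh))
                break
    return kept
```

The same pattern extends to other part pairs (head/legs, torso/whole body) with their own learned constraints.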
29. An apparatus for detecting human body parts and/or humans in an image, the apparatus comprising:
a plurality of human body part detectors which correspond one-to-one to a plurality of different human body parts and detect the corresponding human body parts;
a determiner which removes false alarms according to human body geometry, based on the human body parts detected by the plurality of human body part detectors, to determine the human body parts and humans in the input image;
wherein each of the plurality of human body part detectors comprises:
an image processor which computes difference images of an image;
a training DB which stores positive samples and negative samples of human body parts;
a sub-window processor which extracts feature sets from the difference images, computed by the image processor, of the positive and negative samples of the human body parts stored in the training DB;
a human body part classifier which detects, based on the difference image of the input image computed by the image processor, a human body part corresponding to the human body part classifier by using a human body part model, wherein the human body part model is obtained by learning the feature set that the sub-window processor extracts from the difference images of the positive and negative samples of the human body part stored in the training DB.
30. The apparatus of claim 29, wherein the image processor computes difference images of the image at at least one scale in the horizontal, vertical, left-to-right diagonal, and right-to-left diagonal directions.
31. The apparatus of claim 30, wherein the sub-window processor extracts the feature set by using single windows or multiple windows in at least one of the four difference images.
32. The apparatus of claim 31, wherein the sub-window processor extracts the feature set by using single windows or multiple windows in at least one of the difference images in the horizontal, vertical, left-to-right diagonal, and right-to-left diagonal directions.
33. The apparatus of claim 32, wherein each of the plurality of human body part detectors has a plurality of human body part classifiers.
34. The apparatus of claim 32, wherein the plurality of different human body parts are detected in a predetermined order, using at least one human body part classifier in the corresponding human body part detector, and detection of the plurality of different human body parts is repeated in the predetermined order until all human body part classifiers in the apparatus have been used; wherein the determiner removes false alarms based on the detection result after each human body part is detected; and wherein, when a human body part detector is used, its at least one human body part classifier is applied in ascending order of the number of features of the human body part models used by the classifiers in that detector.
35. The apparatus of claim 33, wherein all human body part classifiers are used for detection in ascending order of the number of features of the human body part models used by the classifiers, regardless of the detection order of the human body parts corresponding to the classifiers, and wherein the determiner removes false alarms based on the detection result after each human body part is detected.
36. The apparatus of claim 34 or 35, wherein the plurality of human body parts comprise a human head, a human torso, human legs, human arms, and a whole human body.
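The ascending-feature-count ordering of claims 34 and 35 is the usual cascade economy: models with few features are cheap and reject most false alarms before expensive models run. A minimal sketch, with classifiers represented as hypothetical (num_features, predicate) pairs:

```python
def cascade_by_feature_count(window, classifiers):
    """Apply part classifiers to a candidate window in ascending order
    of the number of features their models use: cheap models run first
    and reject most false alarms, so costly models run rarely.  Each
    classifier is a (num_features, predicate) pair; the window is
    accepted only if every predicate accepts it."""
    for _, accepts in sorted(classifiers, key=lambda c: c[0]):
        if not accepts(window):
            return False   # early rejection by a cheaper model
    return True
```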
37. An imaging device comprising:
an imaging unit which captures an image of an object;
a detecting unit which detects a region of the captured object in the image, based on difference images of the captured image, by using an object model, wherein the object model is obtained by learning a feature set extracted from difference images of positive and negative samples of the captured object;
an attitude parameter calculating unit which, according to the region of the captured object in the image detected by the detecting unit, calculates parameters for adjusting the attitude of the imaging device so as to place the object in the central region of the image;
a control unit which receives the attitude adjustment parameters from the attitude parameter calculating unit and adjusts the attitude of the imaging device;
a storage unit which stores the captured image of the object;
a display unit which displays the captured image of the object.
38. The imaging device of claim 37, further comprising:
a marking unit which provides the detecting unit with an object region manually marked on the image by a user.
39. The imaging device of claim 38, wherein the control unit adjusts at least one of the pan, tilt, zoom, and focus-region selection operations of the imaging device according to the attitude adjustment parameters.
40. The imaging device of claim 39, wherein, in the focus-region selection operation, the control unit controls the imaging device to use the new region where the object is located as the basis for focusing, so that the region is brought into focus.
41. The imaging device of claim 40, wherein, when the control unit controls the imaging device to select a focus region, the imaging device selects the central region of the image as the default focus region, or dynamically selects the new image region where the object is located as the focus region, and dynamically adjusts the zoom factor, focal length, pan, or tilt parameters of the imaging device according to the image data of the focus region.
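The attitude parameter calculation of claims 37–41 can be sketched as deriving pan/tilt offsets and a zoom factor from the detected bounding box. The mapping from pixel offsets to actual pan/tilt angles depends on the lens and is not modelled; the target size fraction is an assumed parameter:

```python
def attitude_parameters(box, img_w, img_h, target_frac=0.5):
    """From a detected object box (x, y, w, h), derive the pan/tilt
    offsets in pixels (sign gives direction) that would centre the
    object, and the zoom factor that would make it occupy target_frac
    of the frame width."""
    x, y, w, h = box
    pan = (x + w / 2.0) - img_w / 2.0    # > 0: object right of centre
    tilt = (y + h / 2.0) - img_h / 2.0   # > 0: object below centre
    zoom = (target_frac * img_w) / w     # > 1: zoom in
    return pan, tilt, zoom
```

The control unit would then translate these offsets into device commands and re-detect on the next frame, closing the loop.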
42. A method of detecting an object in an image, the method comprising:
computing difference images of an input image;
detecting the object, based on the computed difference images of the input image, by using an object model, wherein the object model is obtained by learning a feature set extracted from difference images of positive and negative samples of the object.
43. The method of claim 42, wherein the difference images comprise difference images computed at at least one scale in the horizontal, vertical, left-to-right diagonal, and right-to-left diagonal directions of the image.
44. The method of claim 43, wherein extracting the feature set from the difference images comprises extracting the feature set by using single windows or multiple windows in at least one of the difference images in the horizontal, vertical, left-to-right diagonal, and right-to-left diagonal directions.
45. An apparatus for detecting an object in an image, the apparatus comprising:
an image processor which computes difference images of an image;
a training DB which stores positive samples and negative samples of an object;
a sub-window processor which extracts feature sets from the difference images, computed by the image processor, of the positive and negative samples of the object stored in the training DB;
an object classifier which detects the object, based on the computed difference image of the input image, by using an object model, wherein the object model is obtained by learning the feature set extracted from the difference images of the positive and negative samples of the object.
46. The apparatus of claim 45, wherein the image processor computes difference images of the image at at least one scale in the horizontal, vertical, left-to-right diagonal, and right-to-left diagonal directions.
47. The apparatus of claim 46, wherein the sub-window processor extracts the feature set by using single windows or multiple windows in at least one of the difference images in the horizontal, vertical, left-to-right diagonal, and right-to-left diagonal directions.
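The single-window / multi-window features of claims 44 and 47 can be sketched as summed responses over rectangles of a difference image. A Haar-like two-window combination (first window minus the rest) is assumed here; the patent's exact multi-window form is not stated in this record:

```python
def window_feature(diff_img, windows):
    """A sub-window feature over a difference image: the summed
    response inside one window, or, when several windows are given,
    the first window's sum minus the sum of the remaining windows
    (an assumed Haar-like combination).  Each window is (x, y, w, h)."""
    def wsum(x, y, w, h):
        return sum(diff_img[r][c]
                   for r in range(y, y + h)
                   for c in range(x, x + w))

    sums = [wsum(*win) for win in windows]
    return sums[0] - sum(sums[1:])
```

In practice such sums are usually computed in constant time from an integral image of each difference image.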
CN2007101639084A 2007-10-10 2007-10-10 Method and apparatus for detecting part of human body and human, and method and apparatus for detecting objects Expired - Fee Related CN101406390B (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN2007101639084A CN101406390B (en) 2007-10-10 2007-10-10 Method and apparatus for detecting part of human body and human, and method and apparatus for detecting objects
KR1020080011390A KR101441333B1 (en) 2007-10-10 2008-02-04 Detecting Apparatus of Human Component AND Method of the same
US12/285,694 US8447100B2 (en) 2007-10-10 2008-10-10 Detecting apparatus of human component and method thereof
US13/867,464 US9400935B2 (en) 2007-10-10 2013-04-22 Detecting apparatus of human component and method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2007101639084A CN101406390B (en) 2007-10-10 2007-10-10 Method and apparatus for detecting part of human body and human, and method and apparatus for detecting objects

Publications (2)

Publication Number Publication Date
CN101406390A true CN101406390A (en) 2009-04-15
CN101406390B CN101406390B (en) 2012-07-18

Family

ID=40569763

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2007101639084A Expired - Fee Related CN101406390B (en) 2007-10-10 2007-10-10 Method and apparatus for detecting part of human body and human, and method and apparatus for detecting objects

Country Status (2)

Country Link
KR (1) KR101441333B1 (en)
CN (1) CN101406390B (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102147917A (en) * 2010-02-08 2011-08-10 三星电子株式会社 Method for detecting rod-shaped target part
CN102592143A (en) * 2012-01-09 2012-07-18 清华大学 Method for detecting phone holding violation of driver in driving
CN103207985A (en) * 2012-01-13 2013-07-17 索尼公司 Image Processing Device, Method Thereof, And Program
CN104268598A (en) * 2014-09-26 2015-01-07 东南大学 Human leg detection method based on two-dimensional scanning lasers
CN104573612A (en) * 2013-10-16 2015-04-29 北京三星通信技术研究有限公司 Equipment and method for estimating postures of multiple overlapped human body objects in range image
CN104573669A (en) * 2015-01-27 2015-04-29 中国科学院自动化研究所 Image object detection method
CN105303523A (en) * 2014-12-01 2016-02-03 维沃移动通信有限公司 Image processing method and mobile terminal
CN105893926A (en) * 2015-12-15 2016-08-24 乐视致新电子科技(天津)有限公司 Hand identification method, system and device
CN108229418A (en) * 2018-01-19 2018-06-29 北京市商汤科技开发有限公司 Human body critical point detection method and apparatus, electronic equipment, storage medium and program
CN109414221A (en) * 2016-06-29 2019-03-01 视觉探索工业股份有限公司 Measurement and custom-built system for apparatus for correcting
CN109948432A (en) * 2019-01-29 2019-06-28 江苏裕兰信息科技有限公司 A kind of pedestrian detection method
WO2019205729A1 (en) * 2018-04-26 2019-10-31 京东方科技集团股份有限公司 Method used for identifying object, device and computer readable storage medium
WO2020087383A1 (en) * 2018-10-31 2020-05-07 深圳市大疆创新科技有限公司 Image-recognition-based control method and apparatus, and control device
CN111311963A (en) * 2014-06-23 2020-06-19 株式会社电装 Driving incapability state detection device for driver
WO2021198753A1 (en) * 2020-04-01 2021-10-07 Sensetime International Pte. Ltd. Image recognition method, apparatus, and storage medium
US11361589B2 (en) 2020-04-01 2022-06-14 Sensetime International Pte. Ltd. Image recognition method, apparatus, and storage medium
CN114677633A (en) * 2022-05-26 2022-06-28 之江实验室 Multi-component feature fusion-based pedestrian detection multi-target tracking system and method

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101607569B1 (en) * 2009-07-14 2016-03-30 엘지이노텍 주식회사 Apparatus for detecting person and method thereof
US9443136B2 (en) 2013-04-12 2016-09-13 Samsung Electronics Co., Ltd. Apparatus and method for detecting body parts from user image

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101551908B (en) * 2003-11-19 2011-12-14 美国西门子医疗解决公司 A system and a method for detecting and matching anatomical structure by means of appearance and shape
KR100695136B1 (en) * 2005-01-04 2007-03-14 삼성전자주식회사 Face detection method and apparatus in image
CN100336070C (en) * 2005-08-19 2007-09-05 清华大学 Method of robust human face detection in complicated background image
CN100336071C (en) * 2005-08-19 2007-09-05 清华大学 Method of robust accurate eye positioning in complicated background image

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102147917A (en) * 2010-02-08 2011-08-10 三星电子株式会社 Method for detecting rod-shaped target part
CN102592143A (en) * 2012-01-09 2012-07-18 清华大学 Method for detecting phone holding violation of driver in driving
CN103207985A (en) * 2012-01-13 2013-07-17 索尼公司 Image Processing Device, Method Thereof, And Program
CN104573612B (en) * 2013-10-16 2019-10-22 北京三星通信技术研究有限公司 The device and method of the posture for the multiple human objects being overlapped in estimating depth image
CN104573612A (en) * 2013-10-16 2015-04-29 北京三星通信技术研究有限公司 Equipment and method for estimating postures of multiple overlapped human body objects in range image
CN111311963B (en) * 2014-06-23 2024-05-07 株式会社电装 Device for detecting driving incapacity state of driver
CN111311963A (en) * 2014-06-23 2020-06-19 株式会社电装 Driving incapability state detection device for driver
CN104268598A (en) * 2014-09-26 2015-01-07 东南大学 Human leg detection method based on two-dimensional scanning lasers
CN104268598B (en) * 2014-09-26 2017-05-03 东南大学 Human leg detection method based on two-dimensional scanning lasers
CN105303523A (en) * 2014-12-01 2016-02-03 维沃移动通信有限公司 Image processing method and mobile terminal
CN104573669B (en) * 2015-01-27 2018-09-04 中国科学院自动化研究所 Image object detection method
CN104573669A (en) * 2015-01-27 2015-04-29 中国科学院自动化研究所 Image object detection method
CN105893926A (en) * 2015-12-15 2016-08-24 乐视致新电子科技(天津)有限公司 Hand identification method, system and device
CN109414221B (en) * 2016-06-29 2021-08-17 视觉探索工业股份有限公司 Measurement and customization system for orthotic devices
CN109414221A (en) * 2016-06-29 2019-03-01 视觉探索工业股份有限公司 Measurement and custom-built system for apparatus for correcting
CN108229418B (en) * 2018-01-19 2021-04-02 北京市商汤科技开发有限公司 Human body key point detection method and apparatus, electronic device, storage medium, and program
CN108229418A (en) * 2018-01-19 2018-06-29 北京市商汤科技开发有限公司 Human body critical point detection method and apparatus, electronic equipment, storage medium and program
WO2019205729A1 (en) * 2018-04-26 2019-10-31 京东方科技集团股份有限公司 Method used for identifying object, device and computer readable storage medium
US11093800B2 (en) 2018-04-26 2021-08-17 Boe Technology Group Co., Ltd. Method and device for identifying object and computer readable storage medium
WO2020087383A1 (en) * 2018-10-31 2020-05-07 深圳市大疆创新科技有限公司 Image-recognition-based control method and apparatus, and control device
CN109948432A (en) * 2019-01-29 2019-06-28 江苏裕兰信息科技有限公司 A kind of pedestrian detection method
WO2021198753A1 (en) * 2020-04-01 2021-10-07 Sensetime International Pte. Ltd. Image recognition method, apparatus, and storage medium
US11361589B2 (en) 2020-04-01 2022-06-14 Sensetime International Pte. Ltd. Image recognition method, apparatus, and storage medium
JP2022531029A (en) * 2020-04-01 2022-07-06 センスタイム インターナショナル ピーティーイー.リミテッド Image recognition method, device and storage medium
CN114677633A (en) * 2022-05-26 2022-06-28 之江实验室 Multi-component feature fusion-based pedestrian detection multi-target tracking system and method

Also Published As

Publication number Publication date
KR101441333B1 (en) 2014-09-18
CN101406390B (en) 2012-07-18
KR20090037275A (en) 2009-04-15

Similar Documents

Publication Publication Date Title
CN101406390B (en) Method and apparatus for detecting part of human body and human, and method and apparatus for detecting objects
US9400935B2 (en) Detecting apparatus of human component and method thereof
CN104378582B (en) A kind of intelligent video analysis system and method cruised based on Pan/Tilt/Zoom camera
CN100504910C (en) Detection method and apparatus of human
CN103824070B (en) A kind of rapid pedestrian detection method based on computer vision
US9008365B2 (en) Systems and methods for pedestrian detection in images
CN105930822A (en) Human face snapshot method and system
Wu et al. A detection system for human abnormal behavior
CN101211411B (en) Human body detection process and device
US20070098222A1 (en) Scene analysis
CN105046206B (en) Based on the pedestrian detection method and device for moving prior information in video
JP5227629B2 (en) Object detection method, object detection apparatus, and object detection program
CN105160310A (en) 3D (three-dimensional) convolutional neural network based human body behavior recognition method
CN110929593A (en) Real-time significance pedestrian detection method based on detail distinguishing and distinguishing
EP2704056A2 (en) Image processing apparatus, image processing method
CN107784291A (en) target detection tracking method and device based on infrared video
KR20130043222A (en) Gesture recognition system for tv control
KR101753097B1 (en) Vehicle detection method, data base for the vehicle detection, providing method of data base for the vehicle detection
CN104281839A (en) Body posture identification method and device
CN101398896B (en) Device and method for extracting color characteristic with strong discernment for image forming apparatus
JP5027030B2 (en) Object detection method, object detection apparatus, and object detection program
Kurita et al. Scale and rotation invariant recognition method using higher-order local autocorrelation features of log-polar image
CN105912126A (en) Method for adaptively adjusting gain, mapped to interface, of gesture movement
CN114140745A (en) Method, system, device and medium for detecting personnel attributes of construction site
CN109784215A (en) A kind of in-vivo detection method and system based on improved optical flow method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120718

CF01 Termination of patent right due to non-payment of annual fee