CN108537188B - Pedestrian detection method based on local decorrelation features

Info

Publication number: CN108537188B
Application number: CN201810336812.1A
Authority: CN
Inventors: 孙一品; 李航
Original assignee: University of Shanghai for Science and Technology
Current assignee: University of Shanghai for Science and Technology
Priority date: 2018-04-16
Filing date: 2018-04-16
Publication date: 2021-09-28
Anticipated expiration: 2038-04-16
Also published as: CN108537188A

Abstract

The invention provides a pedestrian detection method based on local decorrelation features, comprising the following steps: labeling pedestrian regions in all sample images of a sample image set, and taking all labeled sample images as a pedestrian detection data set; performing 10-channel transformation on each sample image in the pedestrian detection data set to obtain a training sample data set; calculating the average of the 10 transformation channels over all training sample images in the training sample data set as an average human body model; performing covariance extraction on the head and shoulder regions of all training sample images to realize decorrelation, and taking the matrix generated by covariance extraction on the head and shoulder region of the average human body model as a filter; applying the filter to the training sample data set to obtain final features, and inputting the final features into a decision-tree-based AdaBoost classifier to train it; and taking the filter and the classifier together as the final detector, which is used to detect pedestrians in the image to be detected.

Description

Pedestrian detection method based on local decorrelation features

Technical Field

The invention relates to the technical field of computer vision and image processing, in particular to a pedestrian detection method based on local decorrelation features.

Background

Pedestrian detection is the use of computer vision techniques to determine whether a pedestrian is present in an image or video sequence and, if so, to locate the pedestrian accurately. The technology can be applied in artificial intelligence systems, driver assistance systems, intelligent robots, intelligent video surveillance, human behavior analysis, intelligent transportation, and similar fields.

LDCF (Locally Decorrelated Channel Features) is one of the commonly used non-deep-learning methods for pedestrian detection. In the training stage it performs decorrelation on the 10 transform-channel features generated from each training picture: the HOG channels (the gradient magnitude computed from the image pixel values, plus gradient directions binned at 60-degree intervals over 360 degrees) and the LUV image channels (L is the luminance component; U and V are the chrominance components). The decorrelation reduces dimensionality, and the computation is simplified by carrying it out as a convolution. In the decorrelation operation, however, the method computes the correlation coefficients over all regions of all positive samples and uses their average to generate the filter. The resulting filter is therefore not specific enough, which reduces the tolerance of the filter.

Disclosure of Invention

The present invention is made to solve the above-described problems, and an object of the present invention is to provide a pedestrian detection method based on local decorrelation features.

The invention provides a pedestrian detection method based on local decorrelation features, characterized by comprising the following steps: step one, labeling pedestrian regions in all sample images of a sample image set, and taking all labeled sample images as a pedestrian detection data set; step two, performing 10-channel transformation on each sample image in the pedestrian detection data set to obtain a training sample data set; step three, calculating the average of the 10 transformation channels over all training sample images in the training sample data set, and taking this average as an average human body model; step four, based on the pedestrian region labels from step one, cutting out the head and shoulder regions of all training sample images in the training sample data set as high-discrimination regions; step five, performing covariance extraction on the head and shoulder regions of all training sample images to realize decorrelation, and taking the matrix generated by covariance extraction on the head and shoulder region of the average human body model as a filter; step six, applying the filter obtained in step five to the training sample data set to obtain final features, and inputting the final features into a decision-tree-based AdaBoost classifier to train it; step seven, taking the filter obtained in step five and the classifier obtained in step six as the final detector; and step eight, inputting the image to be detected into the detector for pedestrian detection.

In the pedestrian detection method based on local decorrelation features, the method further has the following feature: in step two, the 10-channel transformation comprises 1 HOG gradient magnitude channel, 6 HOG direction channels, and 3 color space channels.

In the pedestrian detection method based on local decorrelation features, the method further has the following feature: the HOG gradient magnitude channel is calculated as follows. For any sample image I(x, y), the image is first convolved with the gradient operator [-1, 0, 1] to obtain the horizontal gradient component G_x: G_x = I(x+1, y) - I(x-1, y). It is then convolved with the gradient operator [1, 0, -1]^T to obtain the vertical gradient component G_y: G_y = I(x, y+1) - I(x, y-1). The gradient magnitude G_xy at point (x, y) is: G_xy = sqrt(G_x^2 + G_y^2).
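
The gradient magnitude computation above can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation: the function name `gradient_magnitude`, the nested-list image layout, and the clamping of coordinates at the image border are all assumptions, since the text does not say how border pixels are treated.

```python
import math

def gradient_magnitude(image):
    """Central-difference gradients and magnitude for a 2-D list of intensities.

    image[y][x] holds the intensity I(x, y). Border pixels are handled by
    clamping coordinates (an assumption; the text does not specify borders).
    """
    h, w = len(image), len(image[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    mag = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            xl, xr = max(x - 1, 0), min(x + 1, w - 1)
            yu, yd = max(y - 1, 0), min(y + 1, h - 1)
            gx[y][x] = image[y][xr] - image[y][xl]  # G_x = I(x+1, y) - I(x-1, y)
            gy[y][x] = image[yd][x] - image[yu][x]  # G_y = I(x, y+1) - I(x, y-1)
            mag[y][x] = math.sqrt(gx[y][x] ** 2 + gy[y][x] ** 2)
    return gx, gy, mag

# Toy image with a constant slope: I(x, y) = x + 2y.
ramp = [[float(x + 2 * y) for x in range(4)] for y in range(4)]
gx, gy, mag = gradient_magnitude(ramp)
```

On this ramp image every interior pixel has G_x = 2 and G_y = 4, so the interior magnitude is sqrt(20).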

In the pedestrian detection method based on local decorrelation features, the method further has the following feature: the 6 HOG direction channels are calculated as follows. For any sample image, the image is divided into 8 × 8 grid units, and every 2 × 2 grid units form a square unit. The gradient direction Alpha(x, y) of any pixel (x, y) in a square unit is: Alpha(x, y) = arctan(G_y(x, y) / G_x(x, y)). For feature description, HOG divides the gradient direction angle θ over the 360-degree range into 6 uniform intervals S_k of 60 degrees each, and the projection L_k(x, y) of pixel (x, y) onto the k-th gradient direction is:

Votes are then accumulated over the pixels belonging to each square unit. Linear interpolation is used when computing the gradient direction contribution of each pixel in the square unit, which yields the gradient direction feature of that square unit, and the gradient direction features of all square units are then combined.
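
A rough sketch of the direction binning is given below. Several points are assumptions and not taken from the text: the projection L_k is modeled as a hard assignment of the gradient magnitude to the single 60-degree interval containing the direction; `atan2` is used in place of a plain arctangent so that the angle covers the full 360 degrees; and the per-cell voting with linear interpolation is left out. The function name `orientation_channels` is hypothetical.

```python
import math

def orientation_channels(gx, gy, n_bins=6):
    """Split gradient magnitude into n_bins direction channels over 360 degrees."""
    h, w = len(gx), len(gx[0])
    bin_width = 360.0 / n_bins  # 60 degrees when n_bins == 6
    channels = [[[0.0] * w for _ in range(h)] for _ in range(n_bins)]
    for y in range(h):
        for x in range(w):
            mag = math.hypot(gx[y][x], gy[y][x])
            # atan2 resolves the quadrant, giving an angle in [0, 360).
            theta = math.degrees(math.atan2(gy[y][x], gx[y][x])) % 360.0
            k = int(theta // bin_width) % n_bins
            channels[k][y][x] = mag  # assumed hard assignment to interval S_k
    return channels

# Four unit gradients pointing at 0, 90, 180 and 270 degrees.
gx = [[1.0, 0.0], [-1.0, 0.0]]
gy = [[0.0, 1.0], [0.0, -1.0]]
channels = orientation_channels(gx, gy)
```

Each pixel contributes to exactly one channel here, so summing the 6 channels recovers the gradient magnitude image.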

In the pedestrian detection method based on local decorrelation features, the method further has the following feature: the 3 color space channels are obtained by converting the RGB channels to LUV channels, using the following formula:

$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$$

where X, Y, Z are the image LUV channel values; b₁₁, b₁₂, b₁₃, b₂₁, b₂₂, b₂₃, b₃₁, b₃₂, b₃₃ are conversion constants; and R, G, B are the original RGB channel pixel values of the image.
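
As an illustration of the color conversion, the sketch below first applies a 3 × 3 linear transform to RGB and then the standard CIE 1976 formulas to reach L, u, v. The numeric matrix (linear sRGB primaries to XYZ) and the D65 reference white are assumed values chosen for the example; the text does not state which constants b₁₁ to b₃₃ the method uses, and gamma correction is ignored. The function name `rgb_to_luv` is hypothetical.

```python
def rgb_to_luv(r, g, b):
    """Convert linear RGB in [0, 1] to CIE L*u*v* (assumed sRGB primaries, D65 white)."""
    # Assumed values for b11..b33: the standard linear-sRGB to XYZ matrix.
    m = [[0.4124564, 0.3575761, 0.1804375],
         [0.2126729, 0.7151522, 0.0721750],
         [0.0193339, 0.1191920, 0.9503041]]
    x, y, z = (row[0] * r + row[1] * g + row[2] * b for row in m)
    xn, yn, zn = 0.95047, 1.0, 1.08883  # assumed D65 reference white
    t = y / yn
    if t > (6.0 / 29.0) ** 3:
        l_star = 116.0 * t ** (1.0 / 3.0) - 16.0
    else:
        l_star = (29.0 / 3.0) ** 3 * t
    denom = x + 15.0 * y + 3.0 * z
    if denom == 0.0:  # pure black: chromaticity is undefined, return zeros
        return 0.0, 0.0, 0.0
    denom_n = xn + 15.0 * yn + 3.0 * zn
    u_prime, v_prime = 4.0 * x / denom, 9.0 * y / denom
    un_prime, vn_prime = 4.0 * xn / denom_n, 9.0 * yn / denom_n
    u_star = 13.0 * l_star * (u_prime - un_prime)
    v_star = 13.0 * l_star * (v_prime - vn_prime)
    return l_star, u_star, v_star

white = rgb_to_luv(1.0, 1.0, 1.0)
black = rgb_to_luv(0.0, 0.0, 0.0)
```

With these assumed constants, pure white maps to approximately L = 100 with u and v near zero, and pure black maps to all zeros.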

In the pedestrian detection method based on local decorrelation features, the method further has the following feature: the autocorrelation matrix in the decorrelation processing of step five is extracted as follows:

$$R_x = E\left[x x^{H}\right] = \left[r_{ij}\right]$$

where R_x is the autocorrelation matrix, E is the mathematical expectation, x is the random vector, r_ij is the cross-correlation coefficient between components of x, and H denotes the conjugate transpose.
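
The autocorrelation matrix can be estimated from samples as sketched below. For real-valued channel features the conjugate transpose reduces to the ordinary transpose, so the estimate is the average of x xᵀ. The second function, which takes the leading eigenvector of the matrix as a filter by power iteration, follows the LDCF practice of building filters from eigenvectors of the local covariance; it is an assumption about how the matrix is turned into a filter, not something the text spells out. Both function names are hypothetical.

```python
import math

def autocorrelation(vectors):
    """Estimate R_x = E[x x^T] from a list of equal-length real vectors."""
    n, d = len(vectors), len(vectors[0])
    return [[sum(v[i] * v[j] for v in vectors) / n for j in range(d)]
            for i in range(d)]

def dominant_eigenvector(r, iterations=200):
    """Power iteration: approximate the eigenvector of r with the largest eigenvalue."""
    d = len(r)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(r[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    return v

# Toy "patches" flattened to 2-vectors; real input would be small patches
# sampled from the head and shoulder regions of the channel images.
patches = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
r_x = autocorrelation(patches)
filt = dominant_eigenvector(r_x)
```

Restricting `patches` to the head and shoulder region, instead of sampling all regions of the positive samples, is where this method departs from the LDCF procedure described in the Background.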

Action and Effect of the invention

The pedestrian detection method based on local decorrelation features according to the invention is an improvement on LDCF: local regions of the average pedestrian model are used for decorrelation, and the resulting filter is applied to the 10 transform channels in the detection stage. By learning the filter from a highly discriminative region and using it to extract pedestrian features during detection, an efficient detector is trained. Compared with deep learning, the method classifies the extracted pedestrian features using only a decision-tree-based classifier, so training and detection take less time and come closer to the requirements of real-time processing.

Drawings

FIG. 1 is a process diagram of a pedestrian detection method based on local decorrelation features in an embodiment of the present invention; and

FIG. 2 is a schematic diagram of the cut-out head and shoulder area in an embodiment of the present invention.

Detailed Description

To make the technical means, inventive features, objectives, and effects of the present invention easy to understand, the following embodiment describes the pedestrian detection method based on local decorrelation features in detail with reference to the accompanying drawings.

As shown in fig. 1, the pedestrian detection method based on local decorrelation features of the present invention includes the steps of:

Step one: pedestrian regions are labeled in all sample images of the sample image set, and all labeled sample images are taken as the pedestrian detection data set.

Step two: 10-channel transformation is performed on each sample image in the pedestrian detection data set to obtain the training sample data set.

In step two, the 10-channel transformation comprises 1 HOG gradient magnitude channel, 6 HOG direction channels, and 3 color space channels.

The specific calculation process of the HOG gradient magnitude channel is as follows:

For any sample image I(x, y), the image is first convolved with the gradient operator [-1, 0, 1] to obtain the horizontal gradient component G_x:

G_x = I(x+1, y) - I(x-1, y),

It is then convolved with the gradient operator [1, 0, -1]^T to obtain the vertical gradient component G_y:

G_y = I(x, y+1) - I(x, y-1),

The gradient magnitude G_xy at point (x, y) is:

G_xy = sqrt(G_x^2 + G_y^2).

The 6 HOG direction channels are calculated as follows:

For any sample image, the image is divided into 8 × 8 grid units, and every 2 × 2 grid units form a square unit. The gradient direction Alpha(x, y) of any pixel (x, y) in a square unit is:

Alpha(x, y) = arctan(G_y(x, y) / G_x(x, y)),

For feature description, HOG divides the gradient direction angle θ over the 360-degree range into 6 uniform intervals S_k of 60 degrees each, and the projection L_k(x, y) of pixel (x, y) onto the k-th gradient direction is:

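The 3 color space channels are obtained by converting the RGB channels to LUV channels, as follows:

$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$$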

Step three: the average of the 10 transformation channels is calculated over all training sample images in the training sample data set, and this average is taken as the average human body model.
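
The averaging in step three amounts to an element-wise mean over the channel stacks of all training samples. The sketch below assumes every sample has already been resized to the same height and width and is stored as nested lists indexed `[channel][row][col]`; that layout and the name `average_model` are illustrative choices, not part of the method as described.

```python
def average_model(samples):
    """Element-wise mean of samples, each shaped [channel][row][col]."""
    n = len(samples)
    n_ch = len(samples[0])
    h, w = len(samples[0][0]), len(samples[0][0][0])
    return [[[sum(s[c][y][x] for s in samples) / n for x in range(w)]
             for y in range(h)]
            for c in range(n_ch)]

# Two toy samples with 2 channels of size 1 x 2; real samples would have
# the 10 channels produced in step two.
a = [[[1.0, 2.0]], [[3.0, 4.0]]]
b = [[[3.0, 4.0]], [[5.0, 8.0]]]
model = average_model([a, b])
```

The result has the same shape as a single sample, so the head and shoulder region can be cut out of it with the same coordinates used for the individual training images.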

Step four: based on the pedestrian region labels from step one, a specific area is cut out of every training sample image in the training sample data set as the high-discrimination area. As shown in FIG. 2, for the training sample image 1 in the present embodiment, the head and shoulder area 2 is used as the specific area.

Step five: covariance extraction is performed on the head and shoulder regions of all training sample images to realize decorrelation, and the matrix generated by covariance extraction on the head and shoulder region of the average human body model is taken as the filter.

The autocorrelation matrix in the decorrelation processing is extracted as follows:

$$R_x = E\left[x x^{H}\right] = \left[r_{ij}\right]$$

Step six: the filter obtained in step five is applied to the training sample data set to obtain the final features, and the final features are input into a decision-tree-based AdaBoost classifier to train it.
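
To make the training in step six concrete, here is a small discrete AdaBoost over depth-1 decision trees (stumps). It is a sketch under stated assumptions: the text does not give the tree depth, the number of boosting rounds, or the AdaBoost variant, so those choices, along with every function name, are illustrative. Feature rows stand in for the filtered channel features, and labels are +1 for pedestrian and -1 for background.

```python
import math

def train_stump(features, labels, weights):
    """Find the (error, feature, threshold, polarity) stump with lowest weighted error."""
    best = None
    for f in range(len(features[0])):
        for thr in sorted({row[f] for row in features}):
            for polarity in (1, -1):
                preds = [polarity if row[f] >= thr else -polarity for row in features]
                err = sum(w for w, p, y in zip(weights, preds, labels) if p != y)
                if best is None or err < best[0]:
                    best = (err, f, thr, polarity)
    return best

def train_adaboost(features, labels, rounds=5):
    """Discrete AdaBoost over depth-1 decision trees; labels are +1 or -1."""
    n = len(labels)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, f, thr, polarity = train_stump(features, labels, weights)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, f, thr, polarity))
        preds = [polarity if row[f] >= thr else -polarity for row in features]
        # Misclassified samples gain weight, correctly classified ones lose it.
        weights = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, labels, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, row):
    """Sign of the weighted vote of all stumps."""
    score = sum(a * (pol if row[f] >= thr else -pol) for a, f, thr, pol in ensemble)
    return 1 if score >= 0 else -1

X = [[0.1], [0.4], [0.6], [0.9]]
y = [-1, -1, 1, 1]
model = train_adaboost(X, y)
predictions = [predict(model, row) for row in X]
```

On this separable toy set a single threshold between 0.4 and 0.6 classifies every sample, so `predictions` matches `y`.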

Step seven: the filter obtained in step five and the classifier obtained in step six are taken together as the final detector.

Step eight: the image to be detected is input into the detector for pedestrian detection.
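
Detection in this embodiment scans the image with a sliding window. The sketch below shows only that scanning loop. The `score_fn` argument is a hypothetical stand-in for "apply the filter to the window's channels, then run the classifier"; the window size, stride, and threshold are placeholder values, and multi-scale scanning and non-maximum suppression are omitted.

```python
def sliding_window_detect(image, window, stride, score_fn, threshold=0.0):
    """Score every window position and return boxes whose score passes threshold.

    image is a 2-D list, window is (height, width), and each detection is
    returned as (left, top, width, height, score).
    """
    win_h, win_w = window
    h, w = len(image), len(image[0])
    detections = []
    for top in range(0, h - win_h + 1, stride):
        for left in range(0, w - win_w + 1, stride):
            patch = [row[left:left + win_w] for row in image[top:top + win_h]]
            score = score_fn(patch)
            if score >= threshold:
                detections.append((left, top, win_w, win_h, score))
    return detections

def toy_score(patch):
    """Toy scorer: mean intensity minus 0.5, so only bright windows score high."""
    values = [v for row in patch for v in row]
    return sum(values) / len(values) - 0.5

image = [[0, 0, 0, 0],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 0, 0]]
boxes = sliding_window_detect(image, window=(2, 2), stride=1,
                              score_fn=toy_score, threshold=0.25)
```

Only the window that exactly covers the bright 2 × 2 block passes the threshold in this toy image.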

Action and effect of the embodiments

The pedestrian detection method based on local decorrelation features according to this embodiment is an improvement on LDCF: local regions of the average pedestrian model are used for decorrelation, and the resulting filter is applied to the 10 transform channels in the detection stage. By learning the filter from a highly discriminative region and using it to extract pedestrian features during detection, an efficient detector is trained. The method is a shallow learning method: decorrelation is performed on a local region of the positive samples to extract a filter; in the detection stage the filter is applied, using a sliding window, to every candidate region of the image to be detected; and once the pedestrian features are extracted, classification uses only a decision-tree-based classifier. Training and detection therefore take less time, which meets the requirements of real-time processing.

The above embodiments are preferred examples of the present invention, and are not intended to limit the scope of the present invention.

Claims

1. A pedestrian detection method based on local decorrelation features is characterized by comprising the following steps:

step one, carrying out pedestrian region labeling on all sample images in a sample image set, and taking all labeled sample images as a pedestrian detection data set;

step two, performing 10-channel transformation processing on each sample image in the pedestrian detection data set to obtain a training sample data set;

step three, calculating the average value of the 10 transformation channels of all training sample images in the training sample data set, and taking the average value as an average human body model;

step four, cutting out the head and shoulder regions of all training sample images in the training sample data set as high discrimination regions based on the pedestrian region labels in step one;

step five, performing covariance extraction on the head and shoulder regions of all the training sample images to realize decorrelation processing, and taking the matrix generated by covariance extraction on the head and shoulder region of the average human body model as a filter;

step six, the filter obtained in the step five acts on the training sample data set to obtain final characteristics, and the obtained final characteristics are input into an AdaBoost classifier based on a decision tree to train the AdaBoost classifier;

step seven, the filter obtained in the step five and the classifier obtained in the step six are used as finally generated detectors;

and step eight, inputting the image to be detected into the detector to detect the pedestrian.

2. The pedestrian detection method based on local decorrelation features according to claim 1, characterized in that:

in step two, the 10 channel transform processes include 1 HOG gradient magnitude channel, 6 HOG direction channels, and 3 color space channels.

3. The pedestrian detection method based on local decorrelation features according to claim 2, characterized in that:

wherein the HOG gradient magnitude channel is calculated as follows:

G_x = I(x+1, y) - I(x-1, y),

G_y = I(x, y+1) - I(x, y-1),

and the gradient magnitude G_xy at point (x, y) is:

G_xy = sqrt(G_x^2 + G_y^2).

4. The pedestrian detection method based on local decorrelation features according to claim 2, characterized in that:

wherein the 6 HOG direction channels are calculated as follows:

for any sample image, the image is divided into 8 × 8 grid units, and every 2 × 2 grid units form a square unit, where the gradient direction Alpha(x, y) of any pixel (x, y) in the square unit is:

Alpha(x, y) = arctan(G_y(x, y) / G_x(x, y)),

for feature description, HOG divides the gradient direction angle θ over the 360-degree range into 6 uniform intervals S_k of 60 degrees each, and the projection L_k(x, y) of pixel (x, y) onto the k-th gradient direction is:

votes are then accumulated over the pixels belonging to each square unit, linear interpolation is used when computing the gradient direction contribution of each pixel in the square unit to obtain the gradient direction feature of that square unit, and the gradient direction features of all square units are combined.

5. The pedestrian detection method based on local decorrelation features according to claim 2, characterized in that:

wherein, the 3 color space channels are transformed from RGB channels to LUV channels, and the formula is as follows:

6. The pedestrian detection method based on local decorrelation features according to claim 1, characterized in that:

wherein, the autocorrelation matrix extraction process in the decorrelation process of step five is as follows: