CN112215103B - Vehicle pedestrian multi-category detection method and device based on improved ACF - Google Patents
- Publication number
- CN112215103B (application CN202011034733.9A)
- Authority
- CN
- China
- Prior art keywords
- vehicle
- pedestrian
- detection
- training sample
- detection result
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
- G06V20/584—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of vehicle lights or traffic lights
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/08—Detecting or categorising vehicles
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The invention relates to a vehicle and pedestrian multi-category detection method and device based on an improved ACF. The method comprises the following steps: acquiring vehicle training samples and pedestrian training samples, and preprocessing them; using a vehicle and pedestrian detection framework, extracting multi-view aggregated channel features from the preprocessed vehicle training samples and context pixel aggregated channel features from the preprocessed pedestrian training samples, building a vehicle detector from the multi-view aggregated channel features, and building a pedestrian detector from the context pixel aggregated channel features; sharing the aggregated channel features of the image to be detected between the vehicle detector and the pedestrian detector to obtain a vehicle detection result and a pedestrian detection result; and applying a road-constraint-based false detection rejection strategy to the vehicle detection result and the pedestrian detection result. The invention addresses the problems of a single detection target, low detection precision, and frequent false detections in the prior art.
Description
Technical Field
The invention relates to the technical field of visual analysis for unmanned driving, and in particular to a vehicle and pedestrian multi-category detection method, device, and storage medium based on an improved ACF.
Background
With technological progress and rising living standards, car ownership has increased dramatically, and a variety of factors lead to frequent traffic accidents. Unmanned driving aims to mitigate this problem, and vehicle and pedestrian detection is one of its key technologies. The accuracy and real-time performance of the vehicle and pedestrian detection algorithm directly affect the safety of an unmanned vehicle.
Current mainstream vehicle and pedestrian detection algorithms fall into deep learning detection algorithms and statistical feature detection algorithms. In deep learning, CNN features take a long time to train and are computationally expensive. Depending on the detection strategy, statistical feature detection can be subdivided into the DPM method and the decision tree method; the former is rarely used in unmanned systems because of its high complexity and low running speed. In the decision tree method, the design of the feature descriptor is the key to the detection algorithm and is currently the most studied topic; feature descriptors mainly comprise gradient, texture, and color features and their fusions. Haar features mainly extract the texture information of a target and are widely used in vehicle detection. HOG features capture the contour and shape of a target, are representative of gradient features, and are usually used for pedestrian detection. In addition, color features such as grayscale, RGB, and LUV can also be used to characterize a target. However, these features are generally useful only for detecting specific targets, and their expressive power is limited in complex road scenes. To address these problems, Integral Channel Features (ICF) were first proposed, fusing gradient, color, texture, and other features. Aggregated Channel Features (ACF) were then proposed to improve detection performance; unlike ICF, the ACF algorithm extracts features by means of a pixel lookup table, and its detection performance is greatly improved over ICF.
Later, LDCF introduced a filtering operation on top of ACF to strengthen its expressive power. Although this further improves detection precision over ACF, it also adds a large amount of computation and greatly reduces real-time performance, which makes LDCF poorly suited to lightweight vehicle and pedestrian detection.
Therefore, when conventional vehicle and pedestrian detection methods are applied to road scenes, they suffer from a single detection target, low detection precision, and frequent false detections.
Disclosure of Invention
In view of the foregoing, there is a need for a vehicle and pedestrian multi-category detection method, device, and storage medium based on an improved ACF, to solve the problems of a single detection target, low detection precision, and frequent false detections in current vehicle and pedestrian detection.
In a first aspect, the present invention provides a vehicle and pedestrian multi-category detection method based on an improved ACF, including the following steps:
acquiring vehicle training samples and pedestrian training samples, and preprocessing the vehicle training samples and the pedestrian training samples;
extracting multi-view aggregated channel features of the preprocessed vehicle training samples by using a vehicle and pedestrian detection framework, and building a vehicle detector according to the multi-view aggregated channel features;
extracting context pixel aggregated channel features of the preprocessed pedestrian training samples by using the vehicle and pedestrian detection framework, and building a pedestrian detector according to the context pixel aggregated channel features;
acquiring a preprocessed image to be detected, and sharing the aggregated channel features of the image to be detected between the vehicle detector and the pedestrian detector to obtain a vehicle detection result and a pedestrian detection result;
and applying a road-constraint-based false detection rejection strategy to the vehicle detection result and the pedestrian detection result.
Preferably, in the improved-ACF-based vehicle and pedestrian multi-category detection method, preprocessing the vehicle training samples and the pedestrian training samples specifically includes:
scaling the vehicle training samples and the pedestrian training samples in the horizontal and vertical directions while keeping the center position of the target in each sample normalized.
Preferably, in the improved-ACF-based vehicle and pedestrian multi-category detection method, the step of extracting multi-view aggregated channel features of the preprocessed vehicle training samples by using the vehicle and pedestrian detection framework, and building a vehicle detector according to the multi-view aggregated channel features, specifically includes:
computing a similarity (affinity) matrix among all sample points in the preprocessed vehicle training samples by using a spectral clustering algorithm, obtaining multi-dimensional eigenvectors through spectral decomposition of the matrix, clustering the eigenvectors with the K-means algorithm to extract aggregated channel features for a plurality of views, and training a vehicle detector for each view with the aggregated channel features of that view.
Preferably, in the improved-ACF-based vehicle and pedestrian multi-category detection method, the step of extracting context pixel aggregated channel features of the preprocessed pedestrian training samples by using the vehicle and pedestrian detection framework, and building a pedestrian detector according to the context pixel aggregated channel features, specifically includes:
extracting 10 feature channels from the preprocessed pedestrian training samples; applying 2×2 average pooling to the ten channels to obtain aggregated channel features $F_{2\times2}$ with n=2; applying 2×2 average pooling to $F_{2\times2}$ twice more to obtain regional context pixel aggregated channel features $F_{4\times4}$ and $F_{8\times8}$; upsampling $F_{4\times4}$ and $F_{8\times8}$ to the resolution of $F_{2\times2}$ and combining them to form 30 deformation-resistant context pixel aggregated channel features of the same size; and building a pedestrian detector according to the context pixel aggregated channel features.
Preferably, in the improved-ACF-based vehicle and pedestrian multi-category detection method, the step of acquiring the preprocessed image to be detected, and sharing the aggregated channel features of the image to be detected between the vehicle detector and the pedestrian detector to obtain a vehicle detection result and a pedestrian detection result, further includes:
calibrating the confidence scores of the vehicle detection result by using a parameterized Logistic regression calibration method.
Preferably, in the improved-ACF-based vehicle and pedestrian multi-category detection method, the step of applying a road-constraint-based false detection rejection strategy to the vehicle detection result and the pedestrian detection result includes:
normalizing the height of the vehicle calibration frame and the position coordinate of its lower edge in the vehicle training samples, and the height of the pedestrian calibration frame and the position coordinate of its lower edge in the pedestrian training samples;
training, with a support vector machine, a first regression model between the height of the vehicle calibration frame and the position coordinate of its lower edge, and a second regression model between the height of the pedestrian calibration frame and the position coordinate of its lower edge;
computing, with the first regression model, the predicted vehicle calibration frame height corresponding to the lower-edge position of the vehicle in the vehicle detection result, and, with the second regression model, the predicted pedestrian calibration frame height corresponding to the lower-edge position of the pedestrian in the pedestrian detection result;
computing a first error value between the predicted vehicle calibration frame height and the actual vehicle calibration frame height in the vehicle detection result, and a second error value between the predicted pedestrian calibration frame height and the actual pedestrian calibration frame height in the pedestrian detection result;
when the first error value is greater than a first threshold, judging the vehicle detection result to be a false detection, and otherwise accepting it; and when the second error value is greater than a second threshold, judging the pedestrian detection result to be a false detection, and otherwise accepting it.
Preferably, in the improved-ACF-based vehicle and pedestrian multi-category detection method, the first regression model is:
$H_1 = f(Y_1)$,
where $H_1$ denotes the height of the vehicle calibration frame and $Y_1$ denotes the position coordinate of the lower edge of the vehicle calibration frame;
the first error value is calculated as follows:
where $E_1$ denotes the first error value, $h_1$ the height of the actual vehicle calibration frame, $h'_1$ the height of the predicted vehicle calibration frame, and abs the absolute value.
Preferably, in the improved-ACF-based vehicle and pedestrian multi-category detection method, the second regression model is:
$H_2 = f(Y_2)$,
where $H_2$ denotes the height of the pedestrian calibration frame and $Y_2$ denotes the position coordinate of the lower edge of the pedestrian calibration frame;
the second error value is calculated as follows:
where $E_2$ denotes the second error value, $h_2$ the height of the actual pedestrian calibration frame, $h'_2$ the height of the predicted pedestrian calibration frame, and abs the absolute value.
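The road-constraint check described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: the text trains the regression with a support vector machine, whereas the sketch substitutes an ordinary least-squares line for brevity; the error is taken as the plain absolute difference because the error formula is not reproduced in the text; and the names `fit_line` and `is_false_detection`, the sample values, and the threshold are all hypothetical.

```python
from typing import List, Tuple


def fit_line(ys: List[float], hs: List[float]) -> Tuple[float, float]:
    """Least-squares fit of height = slope * y_bottom + intercept.

    Stand-in for the SVM regression named in the text (assumption).
    """
    mean_y = sum(ys) / len(ys)
    mean_h = sum(hs) / len(hs)
    var = sum((y - mean_y) ** 2 for y in ys)
    cov = sum((y - mean_y) * (h - mean_h) for y, h in zip(ys, hs))
    slope = cov / var
    return slope, mean_h - slope * mean_y


def is_false_detection(y_bottom: float, height: float,
                       model: Tuple[float, float], threshold: float) -> bool:
    """Reject a detection whose height deviates too far from the predicted height."""
    slope, intercept = model
    predicted = slope * y_bottom + intercept
    # Assumed error measure: absolute difference of normalized heights.
    return abs(height - predicted) > threshold


# Hypothetical normalized training annotations: lower-edge coordinate and frame height.
train_y = [0.4, 0.5, 0.6, 0.8]
train_h = [0.10, 0.15, 0.20, 0.30]
vehicle_model = fit_line(train_y, train_h)

plausible = is_false_detection(0.7, 0.25, vehicle_model, threshold=0.05)
implausible = is_false_detection(0.7, 0.60, vehicle_model, threshold=0.05)
```

A second model fitted on pedestrian annotations would be used in the same way with its own threshold.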
In a second aspect, the present invention also provides a vehicle and pedestrian multi-category detection device based on the improved ACF, including a processor and a memory;
the memory stores a computer-readable program executable by the processor;
the processor, when executing the computer-readable program, implements the steps of the improved-ACF-based vehicle and pedestrian multi-category detection method described above.
In a third aspect, the present invention also provides a computer-readable storage medium storing one or more programs executable by one or more processors to implement the steps of the improved-ACF-based vehicle and pedestrian multi-category detection method described above.
Beneficial Effects
In the vehicle and pedestrian multi-category detection method, device, and storage medium based on the improved ACF provided by the invention, a multi-category detection framework is adopted to overcome the single detection category of the Adaboost classifier in the ACF detection algorithm, so that vehicles and pedestrians are detected simultaneously; to address the low detection precision for vehicles and pedestrians, a multi-view vehicle detector and a context pixel pedestrian detector are adopted, which effectively capture the view differences among vehicle samples and the deformation of a pedestrian's posture while walking, improving detection precision; and to suppress false detections during vehicle and pedestrian detection, road prior information is used to reject them effectively.
Drawings
FIG. 1 is a flowchart of a preferred embodiment of the improved-ACF-based vehicle and pedestrian multi-category detection method of the present invention;
FIG. 2 is a flowchart of the operation of a preferred embodiment of the vehicle and pedestrian detection framework of the present invention;
FIG. 3 is a schematic diagram of the training process of the vehicle detector of the present invention;
FIG. 4 is a schematic diagram of the training process of the pedestrian detector of the present invention;
FIG. 5 is a statistical graph of the relationship between the height of the target calibration frame and the coordinate of its lower edge;
FIG. 6 is a schematic diagram of the operating environment of a preferred embodiment of the improved-ACF-based vehicle and pedestrian multi-category detection program.
Detailed Description
Preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings, which form a part hereof, and together with the description serve to explain the principles of the invention, and are not intended to limit the scope of the invention.
Referring to fig. 1, the improved-ACF-based vehicle and pedestrian multi-category detection method according to the embodiment of the present invention includes the following steps:
S100, acquiring vehicle training samples and pedestrian training samples, and preprocessing the vehicle training samples and the pedestrian training samples.
In this embodiment, detecting vehicles and pedestrians first requires training on samples. To ensure detection accuracy, the sample data is first augmented and improved through preprocessing. Specifically, in step S100, the samples are preprocessed as follows:
scaling the vehicle training samples and the pedestrian training samples in the horizontal and vertical directions while keeping the center position of the target in each sample normalized.
Specifically, the existing ACF detection algorithm augments the data set by horizontal flipping, which ignores the influence of labeling errors in the data set. In addition, images are usually normalized during training, and the target in a scaled image is easily misaligned, which seriously affects detection accuracy. The invention therefore removes horizontal flipping and adds multi-scale data augmentation: the original training samples are scaled by a factor of 1.1 in the horizontal, vertical, and other directions while the center position of the target is kept normalized. Multi-scale augmentation reduces sensitivity to the background surrounding the labeling frame and thus improves classification robustness.
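As a rough illustration of this augmentation, the sketch below generates rescaled copies of one labeling frame that all share the original center. It is one plausible reading under stated assumptions: the text does not spell out the "other directions", so the sketch assumes they mean scaling both axes at once, and the `(x_center, y_center, width, height)` box layout and the function name `multiscale_variants` are hypothetical.

```python
from typing import List, Tuple

# Assumed box layout: (x_center, y_center, width, height).
Box = Tuple[float, float, float, float]


def multiscale_variants(box: Box, factor: float = 1.1) -> List[Box]:
    """Return the original box plus three rescaled copies that share its center."""
    cx, cy, w, h = box
    return [
        (cx, cy, w, h),                    # original annotation
        (cx, cy, w * factor, h),           # horizontal scaling only
        (cx, cy, w, h * factor),           # vertical scaling only
        (cx, cy, w * factor, h * factor),  # both directions (assumed reading)
    ]


variants = multiscale_variants((100.0, 60.0, 40.0, 20.0))
```

Each variant would then be cropped from the image and resized to the detector's training window, so the target stays centered while the amount of surrounding background changes.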
S200, extracting multi-view aggregated channel features of the preprocessed vehicle training samples by using a vehicle and pedestrian detection framework, and building a vehicle detector according to the multi-view aggregated channel features;
S300, extracting context pixel aggregated channel features of the preprocessed pedestrian training samples by using the vehicle and pedestrian detection framework, and building a pedestrian detector according to the context pixel aggregated channel features.
In this embodiment, because the conventional ACF algorithm detects only a single target category, a multi-category vehicle and pedestrian detection framework is introduced. Features are extracted from the vehicle samples and the pedestrian samples so that the vehicle detector and the pedestrian detector share ACF features, which are then used for vehicle and pedestrian detection. Specifically, the invention provides a feature-sharing vehicle and pedestrian detection framework in which the vehicle detector and the pedestrian detector share ACF features, improving detection efficiency and achieving multi-category detection. In the training stage, the framework trains the pedestrian detector and the vehicle detector at the same time and lets them share ACF features, which can significantly improve training efficiency. The framework is also general: other detectors can be added, so it is easy to extend to other detection categories.
Further, the step S200 specifically includes:
computing a similarity (affinity) matrix among all sample points in the preprocessed vehicle training samples by using a spectral clustering algorithm, obtaining multi-dimensional eigenvectors through spectral decomposition of the matrix, clustering the eigenvectors with the K-means algorithm to extract aggregated channel features for a plurality of views, and training a vehicle detector for each view with the aggregated channel features of that view.
Specifically, as shown in fig. 3, when training the multi-view aggregated channel feature (Multi-view Aggregated Channel Features, Mv-ACF) vehicle detector, the invention first clusters the vehicle training samples, then extracts the features of the samples of each view and trains the corresponding vehicle detector. Considering the high dimensionality of the extracted ACF features, the unsupervised K-means algorithm is used. To address the cluster degradation that may occur, the invention adopts a spectral clustering algorithm: it computes the similarity (affinity) matrix among the sample points, obtains eigenvectors through spectral decomposition of the matrix, constructs a new feature space from them, and clusters in that space with the K-means algorithm.
The effectiveness of the spectral clustering algorithm was verified experimentally. The training samples were also clustered directly with K-means into 20 classes; the clustered samples were used to train and test Mv-ACF detectors, and the results were compared with those of the spectral clustering algorithm, as shown in Table 1. The AP of the spectral clustering algorithm at the different levels is higher than that of the K-means clustering algorithm, so the spectral clustering algorithm achieves the expected effect.
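The clustering pipeline described above (affinity matrix, spectral decomposition, then K-means in the new feature space) can be illustrated with a deliberately simplified, dependency-free sketch. It handles only two clusters and uses a single eigenvector found by power iteration, whereas the text uses multi-dimensional eigenvectors and many view classes; the Gaussian similarity, the `sigma` value, the toy 2-D points, and all function names are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def affinity_matrix(points: Sequence[Sequence[float]], sigma: float) -> List[List[float]]:
    """Gaussian similarity between every pair of sample points (assumed kernel)."""
    n = len(points)
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                      / (2.0 * sigma ** 2)) for j in range(n)] for i in range(n)]


def spectral_embedding(w: List[List[float]], iters: int = 200) -> List[float]:
    """Second eigenvector of D^-1/2 W D^-1/2, found by power iteration with deflation."""
    n = len(w)
    d = [sum(row) for row in w]
    m = [[w[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]
    # The leading eigenvector (eigenvalue 1) is proportional to sqrt(d).
    norm = math.sqrt(sum(d))
    v1 = [math.sqrt(x) / norm for x in d]
    v = [float(i + 1) for i in range(n)]  # deterministic start vector
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(v, v1))
        v = [a - dot * b for a, b in zip(v, v1)]  # remove the trivial component
        v = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    return [v[i] / math.sqrt(d[i]) for i in range(n)]


def kmeans_1d(values: List[float], k: int = 2, iters: int = 20) -> List[int]:
    """Plain K-means on scalar embeddings; centers start evenly spread over the range."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * c / (k - 1) for c in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(x - centers[c])) for x in values]
        for c in range(k):
            members = [x for x, l in zip(values, labels) if l == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels


# Toy stand-ins for high-dimensional ACF feature vectors of two vehicle views.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = kmeans_1d(spectral_embedding(affinity_matrix(points, sigma=2.0)))
```

For more than two views, the embedding would keep several eigenvectors per sample and K-means would run on those vectors; a numerical library's eigen-solver would normally replace the power iteration.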
In a preferred embodiment, the step S300 specifically includes:
extracting 10 feature channels from the preprocessed pedestrian training samples; applying 2×2 average pooling to the ten channels to obtain aggregated channel features $F_{2\times2}$ with n=2; applying 2×2 average pooling to $F_{2\times2}$ twice more to obtain regional context pixel aggregated channel features $F_{4\times4}$ and $F_{8\times8}$; upsampling $F_{4\times4}$ and $F_{8\times8}$ to the resolution of $F_{2\times2}$ and combining them to form 30 deformation-resistant context pixel aggregated channel features of the same size; and building a pedestrian detector according to the context pixel aggregated channel features.
Specifically, a pedestrian's posture deforms while walking, which makes pedestrian detection harder. To address this, the invention proposes the context pixel aggregated channel feature (Context Pixel Aggregated Channel Features, CP-ACF). As shown in fig. 4, 10 feature channels are first extracted in the same way as in the ACF algorithm; the ten channels are then processed with 2×2 average pooling to obtain the ACF features $F_{2\times2}$ with n=2, and 2×2 average pooling is applied twice more to obtain the regional context pixel aggregated features $F_{4\times4}$ and $F_{8\times8}$. Finally, $F_{4\times4}$ and $F_{8\times8}$ are upsampled to the resolution of $F_{2\times2}$ and combined with it to form 30 deformation-resistant CP-ACF channel features of the same size, fusing local and context features. During soft-cascade Adaboost classification, the weak classifiers can adaptively select local and context features of different regions from the CP-ACF channels; compared with ACF, which can only select features of a fixed region, CP-ACF is more resistant to deformation. The AP of CP-ACF and ACF at different levels on the KITTI validation set is shown in the table below.
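The pooling cascade described above can be sketched as follows, treating each channel as a plain 2-D list. This is an illustrative sketch only: the text does not state the upsampling method, so nearest-neighbour upsampling is an assumption, and the function names and the toy 8×8 input are hypothetical.

```python
from typing import List

Channel = List[List[float]]


def avg_pool_2x2(ch: Channel) -> Channel:
    """Non-overlapping 2x2 average pooling (height and width must be even)."""
    return [[(ch[r][c] + ch[r][c + 1] + ch[r + 1][c] + ch[r + 1][c + 1]) / 4.0
             for c in range(0, len(ch[0]), 2)]
            for r in range(0, len(ch), 2)]


def upsample_nearest(ch: Channel, factor: int) -> Channel:
    """Nearest-neighbour upsampling by an integer factor (assumed method)."""
    return [[ch[r // factor][c // factor]
             for c in range(len(ch[0]) * factor)]
            for r in range(len(ch) * factor)]


def cp_acf(channels: List[Channel]) -> List[Channel]:
    """Stack F2x2 with F4x4 and F8x8 upsampled back to the F2x2 resolution."""
    f2 = [avg_pool_2x2(ch) for ch in channels]  # local features
    f4 = [avg_pool_2x2(ch) for ch in f2]        # context over 4x4 pixels
    f8 = [avg_pool_2x2(ch) for ch in f4]        # context over 8x8 pixels
    return (f2
            + [upsample_nearest(ch, 2) for ch in f4]
            + [upsample_nearest(ch, 4) for ch in f8])


# Ten toy 8x8 channels standing in for the ten original ACF channels.
base = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
features = cp_acf([base for _ in range(10)])
```

With ten input channels the result holds 30 channels of identical size, so a weak classifier can pick either a local value or a wider-context value at the same grid position.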
The invention designs the vehicle detector and the pedestrian detector separately within the framework and fuses road information to improve detection precision; feature sharing between the two detectors improves the real-time performance of the algorithm.
S400, acquiring a preprocessed image to be detected, and sharing the aggregated channel features of the image to be detected between the vehicle detector and the pedestrian detector to obtain a vehicle detection result and a pedestrian detection result.
Specifically, the most time-consuming part of a detection algorithm is usually image feature extraction. In conventional vehicle and pedestrian detection algorithms, vehicle detection and pedestrian detection are usually performed separately, that is, features are extracted from the image separately for each, which takes a long time. The invention therefore proposes a feature-sharing vehicle and pedestrian detection framework, shown in fig. 2, in which the vehicle detector and the pedestrian detector share ACF features, improving detection efficiency and achieving multi-category detection.
Further, the Mv-ACF and the CP-ACF use the same 10 original ACF feature channels: the former extracts features with 2×2 average pooling, while the latter extracts features with three average poolings (2×2, 4×4, and 8×8). The feature channels used by the latter therefore include those of the former, and the two feature pyramids are constructed in the same way, so the vehicle detector and the pedestrian detector can share the feature pyramid of the latter.
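The sharing arrangement described above amounts to running feature extraction once and handing the same result to both detectors. The toy sketch below shows only that control flow under stated assumptions: the extractor returns channel names instead of real feature maps, the "detectors" simply report which channels they would read, and every name in it is hypothetical.

```python
from typing import Callable, Dict, List

# Counts how many times feature extraction runs.
calls = {"extract": 0}


def extract_cp_acf(image: List[List[float]]) -> List[str]:
    """Stand-in extractor: returns 30 channel names instead of real feature maps."""
    calls["extract"] += 1
    return [f"F{s}x{s}_{k}" for s in (2, 4, 8) for k in range(10)]


def vehicle_detector(features: List[str]) -> List[str]:
    """Mv-ACF reads only the ten 2x2-pooled channels."""
    return features[:10]


def pedestrian_detector(features: List[str]) -> List[str]:
    """CP-ACF reads all 30 channels."""
    return features


def detect_all(image: List[List[float]],
               detectors: Dict[str, Callable[[List[str]], List[str]]]) -> Dict[str, List[str]]:
    """Extract features once and pass the same object to every detector."""
    shared = extract_cp_acf(image)
    return {name: det(shared) for name, det in detectors.items()}


used = detect_all([[0.0]], {"vehicle": vehicle_detector,
                            "pedestrian": pedestrian_detector})
```

Adding another category in this arrangement means registering one more entry in the detector dictionary; the extraction cost stays the same.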
In a preferred embodiment, the step S400 further includes:
calibrating the confidence scores of the vehicle detection result by using a parameterized Logistic regression calibration method.
Specifically, the Mv-ACF subclass detectors are trained on data from different views, so at test time their detection results may have confidence scores with different distributions and detection frames with inconsistent geometric characteristics (such as aspect ratio). Merging them directly introduces noise, which makes the subsequent NMS unstable and reduces accuracy. The invention therefore introduces a parameterized Logistic regression calibration method to calibrate the confidence scores of the detection results, so that their distribution becomes more reasonable.
Specifically, let $Det_i = \{d_{i1}, d_{i2}, \dots, d_{ij}, \dots, d_{ir}\}$ be the $r$ detection results of the $i$-th subclass detector, where $d_{ij} = \{R_{ij}, c_{ij}\}$ denotes the $j$-th detection result, consisting of a detection frame $R_{ij}$ and a confidence score $c_{ij}$. Let $mDet_i = \{md_{i1}, md_{i2}, \dots, md_{ij}, \dots, md_{ir}\}$ be the calibrated results, where $md_{ij} = \{R_{ij}, c'_{ij}\}$. The purpose of confidence score calibration is to find a calibration function $g_i$ such that $c'_{ij} = g_i(c_{ij})$. A parameterized Logistic regression calibration method is introduced to normalize the scores, namely:
wherein the parameters $A_i$ and $B_i$ of the $i$-th subclass detector are obtained by solving the regularized maximum likelihood problem:
substituting formula (1) into formula (2) gives
wherein,
wherein $r_+$ and $r_-$ are, respectively, the numbers of positive and negative samples of the $i$-th subclass used for training the parameters $A_i$ and $B_i$. $y_j$ denotes the label of the $j$-th sample: $y_j = +1$ denotes a target and $y_j = -1$ denotes background. Through the above process, the confidence score calibration of the vehicle subclass detectors is completed.
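The formulas referenced in this passage are not reproduced in the text. The description (two parameters per detector, regularized targets built from the positive and negative sample counts, labels of +1 and -1) matches the widely used Platt scaling procedure, so the sketch below assumes that standard form: c' = 1 / (1 + exp(A·c + B)), with targets (r₊ + 1)/(r₊ + 2) for positives and 1/(r₋ + 2) for negatives. The plain gradient-descent solver, the function names, and the hyperparameters are illustrative assumptions and may differ from the patent's actual formulas.

```python
import numpy as np

def fit_platt(scores, labels, lr=0.1, n_iter=2000):
    """Fit A and B of an assumed Platt-style calibration by gradient descent.

    scores: raw detector confidences; labels: +1 (target) or -1 (background).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    r_pos = int(np.sum(labels == 1))
    r_neg = int(np.sum(labels == -1))
    # Regularized targets instead of hard 0/1 labels (standard Platt choice).
    targets = np.where(labels == 1,
                       (r_pos + 1.0) / (r_pos + 2.0),
                       1.0 / (r_neg + 2.0))
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        z = np.clip(a * scores + b, -50.0, 50.0)  # avoid overflow in exp
        p = 1.0 / (1.0 + np.exp(z))
        grad_z = targets - p                      # d(cross-entropy)/dz
        a -= lr * np.mean(grad_z * scores)
        b -= lr * np.mean(grad_z)
    return a, b

def calibrate(scores, a, b):
    """Map raw scores to calibrated scores in (0, 1)."""
    z = np.clip(a * np.asarray(scores, dtype=float) + b, -50.0, 50.0)
    return 1.0 / (1.0 + np.exp(z))
```

Fitting one (A, B) pair per subclass detector and calibrating each detector's scores before merging would put the scores on a common scale ahead of NMS.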
S500, adopting a false detection rejection strategy based on road constraint to perform false detection rejection on the vehicle detection result and the pedestrian detection result.
In this embodiment, in order to reduce the false detection rate, a false detection rejection step is further applied to the output results. A false detection rejection strategy based on road constraint is introduced, which lowers the false detection rate and completes the detection algorithm, making it suitable for lightweight vehicle and pedestrian detection. Specifically, the step S500 includes:
normalizing the height of the vehicle calibration frame of the vehicle training sample and the position coordinates of the lower edge of the vehicle calibration frame, and the height of the pedestrian calibration frame of the pedestrian training sample and the position coordinates of the lower edge of the pedestrian calibration frame;
training a first regression model between the height of the vehicle calibration frame and the position coordinate of the lower edge of the vehicle calibration frame and a second regression model between the height of the pedestrian calibration frame and the position coordinate of the lower edge of the pedestrian calibration frame by adopting a support vector machine;
calculating the height of a predicted vehicle calibration frame corresponding to the position of the lower edge of the vehicle detection result by adopting a first regression model, and calculating the height of a predicted pedestrian calibration frame corresponding to the position of the lower edge of the pedestrian in the pedestrian detection result by adopting a second regression model;
calculating a first error value of the height of the predicted vehicle calibration frame and the height of an actual vehicle calibration frame in the vehicle detection result, and a second error value of the height of the predicted pedestrian calibration frame and the height of the actual pedestrian calibration frame in the pedestrian detection result;
when the first error value is larger than a first threshold value, judging that the vehicle detection result is a false detection, otherwise, accepting the vehicle detection result; and when the second error value is larger than a second threshold value, judging that the pedestrian detection result is a false detection, otherwise, accepting the pedestrian detection result.
Wherein the first regression model is:
$H_1 = f(Y_1)$,
wherein $H_1$ represents the height of the vehicle calibration frame and $Y_1$ represents the position coordinates of the lower edge of the vehicle calibration frame;
the first error value is calculated as:
wherein $E_1$ represents the first error value, $h_1$ represents the height of the actual vehicle calibration frame, $h'_1$ represents the height of the predicted vehicle calibration frame, and abs represents the absolute value.
The second regression model is:
$H_2 = f(Y_2)$,
wherein $H_2$ represents the height of the pedestrian calibration frame and $Y_2$ represents the position coordinates of the lower edge of the pedestrian calibration frame;
the second error value is calculated as:
wherein $E_2$ represents the second error value, $h_2$ represents the height of the actual pedestrian calibration frame, $h'_2$ represents the height of the predicted pedestrian calibration frame, and abs represents the absolute value.
In other words, road prior information can be used to eliminate the false detections that occur during vehicle and pedestrian detection. To exploit this prior, the invention first normalizes the heights H and the lower-edge position coordinates Y of the 12186 pedestrian and 15891 vehicle calibration frames in the Caltech and KITTI training data sets and gathers statistics on them. As the result in fig. 5 shows, a certain statistical relationship exists between H and Y. Based on this relationship, the invention proposes a simple and efficient road constraint (Ground Plane Constraints, GPC) false detection rejection strategy: any target that does not conform to the relationship is treated as a false detection. The statistical relationship f is determined by a regression model between H and Y: H and Y of the training samples are first normalized, and a regression model W between H and Y is then trained using an SVM. The vehicle and pedestrian calibration frames given in the detection results are compared with the corresponding ground truth to obtain the calibration frames closest to the true values. After the model is trained, for each detection frame {x, y, w, h} obtained after NMS, the lower-edge position of the detection frame is y+h; the trained regression model W is used to compute the corresponding predicted height h', and the relative error between h and h' is then calculated. If the error value is greater than the set threshold, the detection frame is regarded as a false detection; otherwise, the detection frame is accepted.
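The rejection procedure can be sketched as follows. This is a simplified illustration under several stated assumptions: an ordinary least-squares line (via `numpy.polyfit`) is swapped in for the SVM regression so that the example depends only on NumPy; heights and lower-edge positions are normalized by dividing by the image height; and the relative error is taken as abs(h − h') / h, since the exact error formula is not reproduced in the text. All function names are hypothetical.

```python
import numpy as np

def fit_height_model(bottom_y, height):
    """Fit h ≈ k * y + b on normalized training data.

    Stand-in for the SVM regression described in the text (assumption).
    """
    k, b = np.polyfit(np.asarray(bottom_y, dtype=float),
                      np.asarray(height, dtype=float), deg=1)
    return float(k), float(b)

def is_false_detection(box, model, threshold, image_height):
    """Return True if the box violates the learned height/position relation.

    box: (x, y, w, h) in pixels, with y the top edge, so the lower edge is y + h.
    """
    x, y, w, h = box
    k, b = model
    bottom = (y + h) / image_height      # normalized lower-edge position
    h_actual = h / image_height          # normalized detected height
    h_pred = k * bottom + b              # height predicted by the regression
    rel_err = abs(h_actual - h_pred) / h_actual  # assumed relative-error form
    return rel_err > threshold

def reject_false_detections(boxes, model, threshold, image_height):
    """Keep only the boxes consistent with the ground-plane relation."""
    return [bx for bx in boxes
            if not is_false_detection(bx, model, threshold, image_height)]
```

In use, one model and one threshold would be fitted for vehicles and another pair for pedestrians, and each would be applied to its own detector's post-NMS boxes.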
As shown in fig. 6, based on the above improved ACF-based vehicle pedestrian multi-category detection method, the present invention further provides an improved ACF-based vehicle pedestrian multi-category detection device, which may be a computing device such as a mobile terminal, a desktop computer, a notebook computer, a palmtop computer, or a server. The device includes a processor 10, a memory 20, and a display 30. Fig. 6 shows only some of the components of the device; it is to be understood that not all of the illustrated components need be implemented, and that more or fewer components may be implemented instead.
In some embodiments, the memory 20 may be an internal storage unit of the improved ACF-based vehicle pedestrian multi-category detection device, such as a hard disk or internal memory of the device. In other embodiments, the memory 20 may be an external storage device of the improved ACF-based vehicle pedestrian multi-category detection device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the device. Further, the memory 20 may include both an internal storage unit and an external storage device of the device. The memory 20 is used for storing application software installed on the device and various types of data, such as the program code installed on the device. The memory 20 may also be used to temporarily store data that has been output or is to be output. In an embodiment, the memory 20 stores an improved ACF-based vehicle pedestrian multi-category detection program 40, which is executable by the processor 10 to implement the improved ACF-based vehicle pedestrian multi-category detection method of the embodiments of the present application.
The processor 10 may in some embodiments be a central processing unit (Central Processing Unit, CPU), microprocessor or other data processing chip for executing program code or processing data stored in the memory 20, for example, for performing the improved ACF-based vehicle pedestrian multi-category detection method, etc.
In some embodiments, the display 30 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch display, or the like. The display 30 is used for displaying information of the improved ACF-based vehicle pedestrian multi-category detection device and for displaying a visual user interface. The components 10-30 of the device communicate with each other over a system bus.
In an embodiment, the improved ACF-based vehicle pedestrian multi-category detection method described in the above embodiment is implemented when the processor 10 executes the improved ACF-based vehicle pedestrian multi-category detection program 40 in the memory 20, and since the improved ACF-based vehicle pedestrian multi-category detection method is described in detail above, the description thereof is omitted.
In summary, in the vehicle pedestrian multi-category detection method, device and storage medium based on the improved ACF provided by the invention, the problem of single detection category of the Adaboost classifier in the ACF detection algorithm is solved, and a multi-category detection frame is adopted to detect vehicles and pedestrians simultaneously; in order to solve the problem of low detection precision of vehicles and pedestrians, a multi-view vehicle detector and a context pixel pedestrian detector are adopted, so that the visual angle difference of a vehicle sample and deformation of the posture of the pedestrians during walking can be effectively captured, and the detection precision is improved; in order to overcome the false detection phenomenon in the vehicle pedestrian detection process, the false detection is effectively removed by utilizing the road prior information.
Of course, those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program instructing relevant hardware (e.g., a processor, a controller, etc.). The program may be stored in a computer-readable storage medium and, when executed, may include the steps of the above-described method embodiments. The storage medium may be a memory, a magnetic disk, an optical disk, or the like.
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention.
Claims (8)
1. The vehicle pedestrian multi-category detection method based on the improved ACF is characterized by comprising the following steps of:
acquiring a vehicle training sample and a pedestrian training sample, and preprocessing the vehicle training sample and the pedestrian training sample;
extracting multi-view aggregation channel characteristics of the preprocessed vehicle training samples by using a vehicle pedestrian detection frame, and establishing a vehicle detector according to the multi-view aggregation channel characteristics;
extracting the context pixel aggregation channel characteristics of the preprocessed pedestrian training sample by using a vehicle pedestrian detection frame, and establishing a pedestrian detector according to the context pixel aggregation channel characteristics;
acquiring a preprocessed image to be detected, and sharing the aggregation channel characteristics of the image to be detected to the vehicle detector and the pedestrian detector to obtain a vehicle detection result and a pedestrian detection result;
adopting a false detection rejection strategy based on road constraint to perform false detection rejection on the vehicle detection result and the pedestrian detection result;
the step of extracting the multi-view aggregation channel characteristics of the preprocessed vehicle training sample by using the vehicle pedestrian detection frame and establishing a vehicle detector according to the multi-view aggregation channel characteristics specifically comprises the following steps:
calculating a similar incidence matrix among all sample points in the preprocessed vehicle training sample by adopting a spectral clustering algorithm, obtaining feature vectors with multiple dimensions through matrix spectral decomposition, clustering the feature vectors with multiple dimensions by adopting a K-means algorithm to extract aggregation channel features of multiple visual angles, and training a vehicle detector with corresponding visual angles by utilizing the aggregation channel features of all visual angles;
the step of extracting the context pixel aggregation channel characteristics of the preprocessed pedestrian training sample by using the vehicle pedestrian detection frame and establishing a pedestrian detector according to the context pixel aggregation channel characteristics specifically comprises the following steps:
extracting 10 feature channels of the preprocessed pedestrian training sample, and processing the ten channels using 2×2 average pooling to obtain the aggregation channel feature $F_{2\times2}$ with n=2; then applying 2×2 average pooling to the aggregation channel feature $F_{2\times2}$ twice to obtain the regional context pixel aggregation channel features $F_{4\times4}$ and $F_{8\times8}$; sampling the $F_{4\times4}$ and $F_{8\times8}$ features to the resolution of $F_{2\times2}$ and combining them to form 30 deformation-resistant context pixel aggregation channel features of the same size; and establishing a pedestrian detector according to the context pixel aggregation channel features.
2. The improved ACF-based vehicle pedestrian multi-class detection method of claim 1 wherein the method of preprocessing the vehicle training sample and pedestrian training sample is specifically:
scaling the vehicle training sample and the pedestrian training sample in the horizontal and vertical directions, and maintaining the normalization of the center positions of the targets in the vehicle training sample and the pedestrian training sample.
3. The improved ACF-based vehicle pedestrian multi-category detection method of claim 1, wherein the step of acquiring the preprocessed image to be detected and sharing the aggregation channel characteristics of the image to be detected to the vehicle detector and the pedestrian detector to obtain the vehicle detection result and the pedestrian detection result further comprises:
and carrying out confidence score calibration on the vehicle detection result by adopting a parameterized Logistic regression calibration method.
4. The improved ACF-based vehicle pedestrian multi-category detection method of claim 1 wherein the step of false detection rejection of the vehicle detection result and pedestrian detection result using a road constraint-based false detection rejection strategy comprises:
normalizing the height of the vehicle calibration frame of the vehicle training sample and the position coordinates of the lower edge of the vehicle calibration frame, and the height of the pedestrian calibration frame of the pedestrian training sample and the position coordinates of the lower edge of the pedestrian calibration frame;
training a first regression model between the height of the vehicle calibration frame and the position coordinate of the lower edge of the vehicle calibration frame and a second regression model between the height of the pedestrian calibration frame and the position coordinate of the lower edge of the pedestrian calibration frame by adopting a support vector machine;
calculating the height of a predicted vehicle calibration frame corresponding to the position of the lower edge of the vehicle detection result by adopting a first regression model, and calculating the height of a predicted pedestrian calibration frame corresponding to the position of the lower edge of the pedestrian in the pedestrian detection result by adopting a second regression model;
calculating a first error value of the height of the predicted vehicle calibration frame and the height of an actual vehicle calibration frame in the vehicle detection result, and a second error value of the height of the predicted pedestrian calibration frame and the height of the actual pedestrian calibration frame in the pedestrian detection result;
when the first error value is larger than a first threshold value, judging that the vehicle detection result is a false detection, otherwise, accepting the vehicle detection result; and when the second error value is larger than a second threshold value, judging that the pedestrian detection result is a false detection, otherwise, accepting the pedestrian detection result.
5. The improved ACF-based vehicle pedestrian multi-class detection method of claim 4 wherein the first regression model is:
$H_1 = f(Y_1)$,
wherein $H_1$ represents the height of the vehicle calibration frame and $Y_1$ represents the position coordinates of the lower edge of the vehicle calibration frame;
the first error value is calculated as:
,
wherein $E_1$ represents the first error value, $h_1$ represents the height of the actual vehicle calibration frame, $h'_1$ represents the height of the predicted vehicle calibration frame, and abs represents the absolute value.
6. The improved ACF-based vehicle pedestrian multi-class detection method of claim 4 wherein the second regression model is:
$H_2 = f(Y_2)$,
wherein $H_2$ represents the height of the pedestrian calibration frame and $Y_2$ represents the position coordinates of the lower edge of the pedestrian calibration frame;
the second error value is calculated as:
,
wherein $E_2$ represents the second error value, $h_2$ represents the height of the actual pedestrian calibration frame, $h'_2$ represents the height of the predicted pedestrian calibration frame, and abs represents the absolute value.
7. A vehicle pedestrian multi-category detection device based on an improved ACF, comprising: a processor and a memory;
the memory has stored thereon a computer readable program executable by the processor;
the processor, when executing the computer readable program, implements the steps in the improved ACF-based vehicular pedestrian multi-category detection method as recited in any one of claims 1 to 6.
8. A computer-readable storage medium storing one or more programs executable by one or more processors to implement the steps in the improved ACF-based vehicle pedestrian multi-category detection method of any of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011034733.9A CN112215103B (en) | 2020-09-27 | 2020-09-27 | Vehicle pedestrian multi-category detection method and device based on improved ACF |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112215103A CN112215103A (en) | 2021-01-12 |
CN112215103B true CN112215103B (en) | 2024-02-23 |
Family
ID=74050818
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011034733.9A Active CN112215103B (en) | 2020-09-27 | 2020-09-27 | Vehicle pedestrian multi-category detection method and device based on improved ACF |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112215103B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105760858A (en) * | 2016-03-21 | 2016-07-13 | 东南大学 | Pedestrian detection method and apparatus based on Haar-like intermediate layer filtering features |
CN107657225A (en) * | 2017-09-22 | 2018-02-02 | 电子科技大学 | A kind of pedestrian detection method based on converging channels feature |
CN108376235A (en) * | 2018-01-15 | 2018-08-07 | 深圳市易成自动驾驶技术有限公司 | Image detecting method, device and computer readable storage medium |
CN109190456A (en) * | 2018-07-19 | 2019-01-11 | 中国人民解放军战略支援部队信息工程大学 | Pedestrian detection method is overlooked based on the multiple features fusion of converging channels feature and gray level co-occurrence matrixes |
CN110109455A (en) * | 2019-04-24 | 2019-08-09 | 安徽大学 | A kind of Target Tracking System based on ACF converging channels feature |
Non-Patent Citations (1)
Title |
---|
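Embedded real-time human head-shoulder detection using aggregated channel features; 陆泽早 et al.; 中国图象图形学报 (Journal of Image and Graphics); 2019-04-30; full text * |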
Also Published As
Publication number | Publication date |
---|---|
CN112215103A (en) | 2021-01-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |