CN107330432B - Multi-view vehicle detection method based on weighted Hough voting - Google Patents

Multi-view vehicle detection method based on weighted Hough voting

Info

Publication number
CN107330432B
CN107330432B
Authority
CN
China
Prior art keywords
voting
image
view
sample
positive
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710554766.8A
Other languages
Chinese (zh)
Other versions
CN107330432A (en)
Inventor
李冬梅
李涛
向涛
朱晓珺
张栋梁
曲豪
汪伟
郭航宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
YANCHENG CHANTU INTELLIGENT TECHNOLOGY Co.,Ltd.
Original Assignee
Yancheng Chantu Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yancheng Chantu Intelligent Technology Co., Ltd.
Priority to CN201710554766.8A
Publication of CN107330432A
Application granted
Publication of CN107330432B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06V10/507 Summing image-intensity values; Histogram projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/259 Fusion by voting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08 Detecting or categorising vehicles

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a multi-view vehicle detection method based on weighted Hough voting, comprising the following steps. Step A: define a training sample image set. Step B: divide the positive sample set of the training sample image set into view-angle subclasses. Step C: calculate the contribution weight of each positive sample to the different view-angle subclasses. Step D: determine the voting score of an image block at a candidate position using weighted Hough voting. Step E: determine the vehicle detection frame in the test image. The method uses LLE and k-means to divide the vehicle samples automatically into subclasses of different view angles, uses this division to define the voting weights of the positive sample set under different view angles during Hough voting, and combines these voting weights to perform accurately localized Hough voting, thereby achieving accurate detection of vehicles under multiple view angles. Compared with the prior art, the method greatly improves detection speed and effectively exploits the information shared among the different view-angle subclasses, further improving the accuracy of vehicle detection.

Description

Multi-view vehicle detection method based on weighted Hough voting
Technical Field
The invention relates to the field of vehicle detection in a video traffic environment, in particular to a weighted Hough voting-based multi-view vehicle detection method.
Background
As automobiles become an ever more important tool in daily life, vehicle detection has become a key component of intelligent traffic systems in smart cities. In real scenes, however, multi-view vehicle detection remains a difficult problem: because vehicles appear in an image under different view angles as they move or as shooting positions change, their appearance characteristics vary greatly, and the accuracy of vehicle detection drops sharply as a result.
Prior-art approaches to multi-view vehicle detection fall into three main categories. First: dividing the image training set into different subclasses, either manually or based on the sample aspect ratio, where each subclass covers a certain range of view-angle variation, and building a separate detection model for each subclass. Second: automatic subclass-division methods, or embedding an unsupervised clustering process into the learning of the detector. Third: embedding 3D view-angle information in the model and estimating the target view angle. These three approaches attack the multi-view vehicle detection problem from different directions, but all have obvious shortcomings or limitations, such as ignoring the feature commonality of multi-view targets or depending on 3D view-angle information that is difficult to acquire, which leads to inaccurate multi-view vehicle detection results.
Disclosure of Invention
The invention aims to provide a multi-view vehicle detection method based on weighted Hough voting, which effectively addresses the inaccuracy of multi-view vehicle detection in the prior art caused by ignoring the common characteristics of multi-view targets or by the difficulty of acquiring 3D view-angle information.
In order to achieve the purpose, the invention adopts the following technical scheme:
a multi-view vehicle detection method based on weighted Hough voting comprises the following steps:
Step A: define a training sample image set D = {(I_i, f_i, y_i)}, i = 1, 2, …, N, with image size 128 × 64, where f_i is the image feature expression: HOG features are used for the multi-view division, and multi-channel pixel features are used when training visual words and performing Hough voting; y_i ∈ {−1, +1} is the training sample label: I_i is a background sample when y_i = −1 and a target sample when y_i = +1; N is the size of the training sample set;
Step B: divide the positive sample set D⁺ = {I_j}, j = 1, 2, …, N⁺, of the training sample image set D into view-angle subclasses, where N⁺ is the number of positive samples;
Step C: calculate the contribution weight of each positive sample to the different view-angle subclasses;
Step D: determine the voting score of an image block at a candidate position using weighted Hough voting;
Step E: determine the vehicle detection frame in the test image.
Step B comprises the following steps:
Step B1: use the LLE algorithm to embed the positive sample images of D⁺, expressed by their HOG features, into a two-dimensional space;
Step B2: select the center point of the ring formed by the sample-point distribution in the two-dimensional space, and regularize all samples onto a circle based on the relative angle of each sample point to the center point;
Step B3: cluster the samples on the circle with the k-means algorithm, dividing the positive sample set D⁺ into K view-angle subclasses.
Step C specifically adopts the following method:
In the LLE embedding space, the cluster center of the k-th view-angle subclass sample set is defined as o_k, k ∈ {1, 2, …, K}, and the expression of positive sample image I_j of the positive sample set D⁺ in the LLE embedding space is f′_j; the contribution weight w_jk of positive sample I_j to view-angle subclass k is then defined as:

[equation image in the original: definition of w_jk in terms of d(f′_j, o_k)]

where d(f′_j, o_k) is the distance between f′_j and o_k in the LLE embedding space; to ensure the correctness of the calculation, the contribution weights w_jk of sample image I_j under the view-angle subclasses are normalized so that they sum to 1, i.e.

Σ_{k=1}^{K} w_jk = 1
Step D comprises the following steps:
Step D1: define the visual word matched with image block p_t as L, let E_L be the set of offset vectors of the positive sample image blocks contained in L, and obtain the classification probability C_L by counting the proportion of positive sample image blocks in L; the voting score of image block p_t at candidate position h is then:

V(h | p_t) = (C_L / |E_L|) Σ_{e ∈ E_L} (1 / 2πσ²) exp(−‖h − (q_t − e)‖² / 2σ²)

where the vote of each voting unit e in E_L for candidate position h is estimated with a Gaussian Parzen window, |E_L| denotes the set size, q_t is the center position of image block p_t, and σ is the standard deviation of the Gaussian Parzen window; once the visual word L has been generated, its classification probability C_L is fixed, and the voting score of image block p_t at candidate position h depends mainly on the voting units e in the offset-vector set E_L; using the linear accumulation property of Hough voting scores, V(h | p_t), defined as the accumulated voting scores of the voting units in E_L for candidate position h, can be rewritten as the accumulated voting scores, at candidate position h, of the positive sample images I_j associated with the voting units in E_L, namely:

V(h | p_t) = Σ_{j=1}^{N⁺} V(h | p_t, I_j)

wherein

V(h | p_t, I_j) = (C_L / |E_L|) Σ_{e ∈ E_L} 1(e ∈ I_j) (1 / 2πσ²) exp(−‖h − (q_t − e)‖² / 2σ²)

and the indicator 1(e ∈ I_j) denotes that voting unit e in E_L comes from positive sample image I_j of the positive sample set D⁺; traversing the image blocks p_t in the test image G, the final voting score at candidate position h is:

V(h) = Σ_t V(h | p_t)
Step D2: because the view angles of the sample images in the positive sample set D⁺ differ too greatly, the final Hough map generated by the voting score V(h | p_t) defined above is cluttered and its bright spots are insufficiently concentrated, so the candidate position h cannot be determined accurately; this scheme therefore introduces a view-angle variable k ∈ {1, 2, …, K} into the voting model defined above to restrict voting to the same view angle and guarantee the view-angle consistency of the votes for candidate position h, namely:

V(h, k) = Σ_t Σ_{j: I_j ∈ subclass k} V(h | p_t, I_j)

The formula is computed as follows: first, the multi-view subclass-division method of step B is used to label each sample image of the positive sample set D⁺ with a view-angle subclass k ∈ {1, 2, …, K}; then the definition of step C is used to calculate the view-angle contribution weight w_jk, j ∈ {1, 2, …, N⁺}, of each sample image; with the positive sample set D⁺ finally calibrated with view angles and view-angle contribution weights, V(h, k) is redefined as:

V(h, k) = Σ_t Σ_{j=1}^{N⁺} w_jk · V(h | p_t, I_j)

where W denotes the matrix of the weights w_jk, of size N⁺ × K.
Step E comprises the following steps:
Step E1: first, decompose the test image over scales, defining the scale space of the test image as {λ_m}, m = 1, 2, …, M, where M is the number of discrete scales;
Step E2: in the test image at scale λ_m, densely sample image blocks of the same size, find the visual word matched with each image block, and use the formula V(h, k) above to obtain the Hough voting map at that scale;
Step E3: in the three-dimensional Hough space formed by (h, λ_m), determine the final target center position (h′, λ′_m) using the mean-shift algorithm and a decision threshold, and mark a detection frame of the corresponding size at the corresponding position in the original test image.
The invention has the following beneficial effects:
Compared with the prior art, the multi-view vehicle detection method based on weighted Hough voting uses Locally Linear Embedding (LLE) and k-means to divide the vehicle samples automatically into subclasses of different view angles, uses this division to define the voting weights of the positive sample set under different view angles during Hough voting, and combines these voting weights to perform accurately localized Hough voting, thereby achieving accurate detection of vehicles under multiple view angles; the method greatly improves detection speed and effectively exploits the information shared among the different view-angle subclasses, further improving the accuracy of vehicle detection.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic diagram of part of the detection results obtained with the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention, a multi-view vehicle detection method based on weighted Hough voting, comprises the following steps:
Step A: define a training sample image set D = {(I_i, f_i, y_i)}, i = 1, 2, …, N, with image size 128 × 64, where f_i is the image feature expression (HOG features are used for the multi-view division; multi-channel pixel features are used when training visual words and performing Hough voting); y_i ∈ {−1, +1} is the training sample label (I_i is a background sample when y_i = −1 and a target sample when y_i = +1); N is the size of the training sample set.
In collecting the training samples, the background images and target images should be consistent in size and close in number, and the diversity of the training samples should be ensured as far as possible; that is, the background images should cover the various scenes in which the target may appear, and the target images should cover the various view-angle forms the target may present.
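For concreteness, the feature extraction of step A can be sketched as follows. This is a minimal illustration assuming scikit-image's HOG implementation; the cell and block parameters are illustrative choices, not values specified by the invention.

```python
# Minimal sketch of the HOG feature expression f_i of step A, assuming
# scikit-image; the HOG parameters below are illustrative, not from the patent.
import numpy as np
from skimage.feature import hog
from skimage.transform import resize

def hog_feature(image_gray: np.ndarray) -> np.ndarray:
    """Normalize a sample to the 128 x 64 size and return its HOG descriptor f_i."""
    patch = resize(image_gray, (64, 128))  # 64 rows x 128 columns
    return hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm='L2-Hys')
```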
Step B: divide the positive sample set D⁺ = {I_j}, j = 1, 2, …, N⁺, of the training sample image set D into multiple view-angle subclasses, where N⁺ is the number of positive samples. Specifically:
Step B1: use the LLE algorithm to embed the positive sample images of D⁺ (the multi-view vehicle images), expressed by their HOG features, into a two-dimensional space; the results show that after the multi-view vehicle images represented by HOG features are embedded into the two-dimensional space, the sample points are distributed in a ring, and the vehicle view angle changes smoothly along the ring;
Step B2: from the embedded samples (the positive sample images of D⁺ expressed by HOG features), select the center point of the ring formed by the sample-point distribution in the two-dimensional space, and regularize all samples onto a circle based on the relative angle of each sample point to the center point; samples in nearby regions of the circle then have similar view angles;
Step B3: as shown in Fig. 1, cluster the samples on the circle with the k-means algorithm, so that samples with similar view angles fall on the same arc of the circle; each arc of the circle represents a subclass of similar view angles, giving a division into K view-angle subclasses;
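A minimal sketch of steps B1–B3 follows, assuming scikit-learn's LocallyLinearEmbedding and KMeans as stand-ins for the LLE and k-means algorithms named above; the neighbor count is an illustrative choice.

```python
# Sketch of steps B1-B3 under stated assumptions: sklearn's LLE and k-means.
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.cluster import KMeans

def view_subclasses(F_pos: np.ndarray, K: int = 8, n_neighbors: int = 10):
    """F_pos: (N+, d) HOG features of the positive samples.
    Returns per-sample subclass labels and the K cluster centers on the circle."""
    # B1: embed the HOG features into a two-dimensional space with LLE.
    Y = LocallyLinearEmbedding(n_neighbors=n_neighbors,
                               n_components=2).fit_transform(F_pos)
    # B2: regularize each sample onto a circle via its angle about the ring center.
    center = Y.mean(axis=0)
    theta = np.arctan2(Y[:, 1] - center[1], Y[:, 0] - center[0])
    circle = np.c_[np.cos(theta), np.sin(theta)]  # points on the unit circle
    # B3: k-means on the circle; each cluster is one arc, i.e. one view subclass.
    km = KMeans(n_clusters=K, n_init=10).fit(circle)
    return km.labels_, km.cluster_centers_
```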
Step C: calculate the contribution weight of each sample to the different view-angle subclasses, specifically as follows:
Once the division of the positive sample set D⁺ of the training sample image set D into K view-angle subclasses has been determined, the contribution weight of each positive sample to each view-angle subclass is calculated, so as to make full use of the shared and distinct information among the view-angle subclasses. The calculation proceeds as follows:
In the LLE embedding space, the cluster center of the k-th view-angle subclass sample set is defined as o_k, k ∈ {1, 2, …, K}, and the expression of positive sample image I_j of D⁺ in the LLE embedding space is f′_j; the contribution weight w_jk of positive sample I_j to view-angle subclass k is then defined as:

[equation image in the original: definition of w_jk in terms of d(f′_j, o_k)]

where d(f′_j, o_k) is the distance between f′_j and o_k in the LLE embedding space; to ensure the correctness of the calculation, the contribution weights w_jk of sample image I_j under the view-angle subclasses are normalized so that they sum to 1, i.e.

Σ_{k=1}^{K} w_jk = 1
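Because the defining equation for w_jk survives only as an image, the sketch below assumes one common choice, a weight that decays exponentially with the embedding-space distance d(f′_j, o_k), and then applies the normalization the text does specify.

```python
# Sketch of step C; the exp(-d) form is an assumption, while the row
# normalization (sum over k of w_jk equals 1) is specified by the text.
import numpy as np

def contribution_weights(F_emb: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """F_emb: (N+, 2) embedded positives f'_j; centers: (K, 2) centers o_k.
    Returns W of shape (N+, K) with each row summing to 1."""
    d = np.linalg.norm(F_emb[:, None, :] - centers[None, :, :], axis=2)  # (N+, K)
    W = np.exp(-d)                           # assumed decreasing function of d
    return W / W.sum(axis=1, keepdims=True)  # enforce the stated normalization
```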
Step D: determining the voting score of the image block at the candidate position by using a weighted Hough voting method, which specifically comprises the following steps:
step D1: defining and testing local image blocks ptThe matched visual word is L, and the set of L containing the offset vector of the image block of the normal sample is ELObtaining the classification probability C by counting the proportion of the image blocks of the normal sample in LLThen image block ptVoting value at candidate position h is
Figure GDA0002519768650000061
Wherein E isLThe vote of each offset vector E for candidate position h is estimated using a Gaussian Parzen window, | ELI denotes set size, qtFor image block ptThe center position of (a) is the standard deviation of a gaussian Parzen window; after the visual word L is generated, its corresponding classification probability CLDetermined, local image block ptThe voting score at the candidate position h depends mainly on the set of offset vectors ELThe voting unit e in (1) can convert V (h | p) into V (h | p) by utilizing the linear accumulation characteristic of Hough voting scorest) Is defined by the accumulation of ELIn the method, the voting score of each voting unit to the candidate position h is rewritten into an accumulation ELPositive sample image associated with middle voting unitjThe form of voting score on candidate position h, namely:
Figure GDA0002519768650000062
wherein the content of the first and second substances,
Figure GDA0002519768650000063
wherein the content of the first and second substances,
Figure GDA0002519768650000064
represents ELThe voting unit e in (a) is from the positive sample set D+Mean sample imagej(ii) a Traversing all local image blocks p in test image GtThe final vote score for candidate position h is:
Figure GDA0002519768650000065
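A sketch of the step D1 accumulation under the Gaussian Parzen window model above; evaluating the window densely over the whole map is an illustrative implementation choice.

```python
# Sketch of step D1: one matched image block casts Gaussian Parzen votes at
# q_t - e for every offset vector e in E_L, scaled by C_L / |E_L|.
import numpy as np

def cast_votes(hough: np.ndarray, q_t, E_L: np.ndarray, C_L: float, sigma: float):
    """hough: (H, W) vote map, updated in place; q_t: (x, y) block center;
    E_L: (n, 2) offset vectors of the matched visual word."""
    H, W = hough.shape
    ys, xs = np.mgrid[0:H, 0:W]
    norm = C_L / (len(E_L) * 2 * np.pi * sigma ** 2)
    for e in E_L:
        cx, cy = q_t[0] - e[0], q_t[1] - e[1]  # candidate center voted by e
        hough += norm * np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2)
                               / (2 * sigma ** 2))
```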
Step D2: because the view angles of the sample images in the positive sample set D⁺ differ too greatly, the final Hough map generated by the voting score V(h | p_t) defined above is cluttered and its bright spots are insufficiently concentrated, so the candidate position h cannot be determined accurately; this scheme therefore introduces a view-angle variable k ∈ {1, 2, …, K} into the voting model defined above to restrict voting to the same view angle and guarantee the view-angle consistency of the votes for candidate position h, namely:

V(h, k) = Σ_t Σ_{j: I_j ∈ subclass k} V(h | p_t, I_j)

The formula is computed as follows: first, the view-angle subclass-division method is used to label each image of the positive sample set D⁺ with a view-angle subclass k ∈ {1, 2, …, K}; then the formula of step C is used to calculate the view-angle contribution weight w_jk, j ∈ {1, 2, …, N⁺}, of each image; finally, with the positive sample set D⁺ calibrated with view angles and view-angle contribution weights, V(h, k) is redefined as:

V(h, k) = Σ_t Σ_{j=1}^{N⁺} w_jk · V(h | p_t, I_j)

where W denotes the matrix of the weights w_jk, of size N⁺ × K.
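A sketch of the weighted voting of step D2 under stated assumptions: each vote is additionally scaled by w_jk, the contribution weight of the positive sample I_j that supplied the offset vector, producing one view-consistent Hough map per subclass k.

```python
# Sketch of step D2: per-view weighted votes. src_j[n] gives the index j of the
# positive sample that contributed offset E_L[n]; W is the (N+, K) weight matrix.
import numpy as np

def cast_weighted_votes(hough_k: np.ndarray, q_t, E_L, src_j, C_L: float,
                        sigma: float, W: np.ndarray, k: int):
    """hough_k: (H, Wd) Hough map for view-angle subclass k, updated in place."""
    H, Wd = hough_k.shape
    ys, xs = np.mgrid[0:H, 0:Wd]
    norm = C_L / (len(E_L) * 2 * np.pi * sigma ** 2)
    for e, j in zip(E_L, src_j):
        cx, cy = q_t[0] - e[0], q_t[1] - e[1]
        hough_k += W[j, k] * norm * np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2)
                                           / (2 * sigma ** 2))
```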
Step E: determining a final target detection frame in the vehicle detection image under the multiple view angles, wherein the steps are specifically described as follows:
step E1: firstly, the test image is subjected to scale decomposition, and the scale space of the test image is defined as
Figure GDA0002519768650000074
M is the number of discrete scales;
step E2: in the dimension λmThe same size image blocks in the test image are densely sampled, the visual word matching each image block is found, and then the above formula is used
Figure GDA0002519768650000075
Obtaining a Hough voting graph under the scale;
step E3: at (h, λ)m) Forming a three-dimensional Hough space (h ═ h)x,hy) Including the horizontal and vertical coordinate positions in the image), the final target center position (h ', lambda ') is determined by using the mean-shift algorithm and the judgment threshold value 'm) And in the original test pattern
Figure GDA0002519768650000076
Position mark size of
Figure GDA0002519768650000077
The detection frame of (1).
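A sketch of the step E peak search over the (h, λ_m) space; a simple local-maximum filter stands in here for the mean-shift mode seeking the invention specifies, and scipy is an assumed dependency.

```python
# Sketch of step E: find vote peaks above a decision threshold in the stack of
# per-scale Hough maps (local maxima approximate mean-shift mode seeking).
import numpy as np
from scipy.ndimage import maximum_filter

def detect(hough_stack: np.ndarray, scales, thresh: float):
    """hough_stack: (M, H, W) Hough maps, one per scale lambda_m.
    Returns a list of (x, y, scale, score) detections."""
    peaks = (hough_stack == maximum_filter(hough_stack, size=5)) \
            & (hough_stack > thresh)
    return [(x, y, scales[m], hough_stack[m, y, x])
            for m, y, x in zip(*np.nonzero(peaks))]
```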
Embodiment 1:
This embodiment comprises a number of image samples; when the multi-view vehicle detection method based on weighted Hough voting is used to detect the vehicle positions in these image samples, the following steps are adopted:
Step A: define a training sample image set D = {(I_i, f_i, y_i)}, i = 1, 2, …, N; the sample images used are all normalized to 128 × 64, where f_i is the image feature expression (HOG features are used for the multi-view division; multi-channel pixel features are used when training visual words and performing Hough voting); y_i ∈ {−1, +1} is the training sample label (I_i is a background sample when y_i = −1 and a target sample when y_i = +1); N is the size of the training sample set.
In collecting the training sample images, the background images and target images should be consistent in size and close in number, and the diversity of the training sample images should be ensured as far as possible; that is, the background images should cover the various scenes in which the target may appear, and the target images should cover the various forms the target may present.
Step B: divide the positive sample set D⁺ = {I_j}, j = 1, 2, …, N⁺, of the training sample image set D into multiple view-angle subclasses, where N⁺ is the number of positive samples; in this embodiment, the number of view-angle subclasses is set to 8. Specifically:
Step B1: use the LLE algorithm to embed the positive sample images of D⁺ (the multi-view vehicle images), expressed by their HOG features, into a two-dimensional space; the sample points are distributed in a ring, and the vehicle view angle changes smoothly along the ring;
Step B2: select the center point of the ring formed by the sample-point distribution of step B1, and regularize all samples onto a circle O based on the relative angles of the sample points to the center point; samples in nearby regions of circle O have similar view angles;
Step B3: cluster the samples on circle O with the k-means algorithm, so that samples with similar view angles fall on the same arc of circle O; each arc represents a subclass of similar view angles, giving a division into 8 view-angle subclasses;
Step C: calculate the contribution weight of each sample to the 8 view-angle subclasses, specifically as follows:
In the LLE embedding space, the cluster center of the k-th divided view-angle subclass sample set is defined as o_k, k ∈ {1, 2, …, 8}, and the expression of positive sample image I_j of D⁺ in the LLE embedding space is f′_j; the contribution weight w_jk of positive sample I_j to view-angle subclass k is then defined as:

[equation image in the original: definition of w_jk in terms of d(f′_j, o_k)]

where d(f′_j, o_k) is the distance between f′_j and o_k in the LLE embedding space; to ensure the correctness of the calculation, the contribution weights w_jk of sample image I_j under the view-angle subclasses are normalized so that they sum to 1, i.e.

Σ_{k=1}^{8} w_jk = 1
Step D: determining image block p by using weighted Hough voting methodtThe voting score at the candidate position h specifically comprises the following steps:
step D1: definition and image block ptThe matched visual word is L, and the set of L containing the offset vector of the image block of the normal sample is ELObtaining the classification probability C by counting the proportion of the image blocks of the normal sample in LLThen image block ptVoting value at candidate position h is
Figure GDA0002519768650000085
Wherein E isLThe vote of each offset vector E for candidate position h is estimated using a Gaussian Parzen window, | ELI denotes set size, qtFor image block ptThe center position of (a); local image block ptThe voting score at the candidate position h depends mainly on the set of offset vectors ELThe voting unit e in (1) can convert V (h | p) into V (h | p) by utilizing the linear accumulation characteristic of Hough voting scorest) Is defined by the accumulation of ELThe voting score of each voting unit to the candidate position h is rewritten into an accumulation sum ELPositive sample image associated with middle voting unitjThe form of voting score on candidate position h, namely:
Figure GDA0002519768650000091
wherein the content of the first and second substances,
Figure GDA0002519768650000092
wherein the content of the first and second substances,
Figure GDA0002519768650000093
represents ELThe voting unit e in (a) is from the positive sample set D+Mean sample imagej(ii) a Traversing all local image blocks p in test image GtThe final vote score for candidate position h is:
Figure GDA0002519768650000094
Step D2: because the view angles of the sample images in the positive sample set D⁺ differ too greatly, the final Hough map generated by the voting score V(h | p_t) defined above is cluttered and its bright spots are insufficiently concentrated, so the candidate position h cannot be determined accurately; this scheme therefore introduces a view-angle variable k ∈ {1, 2, …, K} into the voting model defined above to restrict voting to the same view angle and guarantee the view-angle consistency of the votes for candidate position h, namely:

V(h, k) = Σ_t Σ_{j: I_j ∈ subclass k} V(h | p_t, I_j)

The formula is computed as follows: first, the view-angle subclass-division method is used to label each image of the positive sample set D⁺ with a view-angle subclass k ∈ {1, 2, …, K}; then the formula of step C is used to calculate the view-angle contribution weight w_jk, j ∈ {1, 2, …, N⁺}, of each image; finally, with the positive sample set D⁺ calibrated with view angles and view-angle contribution weights, V(h, k) is redefined as:

V(h, k) = Σ_t Σ_{j=1}^{N⁺} w_jk · V(h | p_t, I_j)

where W denotes the matrix of the weights w_jk, of size N⁺ × K;
Step E: determine the final target detection frame in the multi-view vehicle detection image, specifically comprising the following steps:
Step E1: first, decompose the test image over scales, defining the scale space of the test image as {λ_m}, m = 1, 2, …, M, where M is the number of discrete scales;
Step E2: in the test image at scale λ_m, densely sample image blocks of the same size. In this embodiment, 50 image blocks of size 16 × 16 are randomly extracted from each sample image; the view-angle class of each image block is calibrated using step B; the visual words L are generated with a random forest in which the number of trees is 20, the maximum depth is 15, the minimum number of image blocks in a split node is 20, the maximum class purity is 99.5%, and the squared deviation of the positive sample image-block offset vectors in a node is 30; the scale space of each test image is set to 20 scales from 0.1 to 0.8. The formula V(h, k) of step D is then used to obtain the Hough voting map at each scale;
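For reference, the embodiment's hyperparameters are gathered below into a single configuration; the key names are hypothetical, since no public library implements this visual-word forest directly.

```python
# Hyperparameters stated in Embodiment 1; key names are hypothetical.
FOREST_CONFIG = dict(
    n_trees=20,            # trees in the random forest
    max_depth=15,          # maximum tree depth
    min_samples_split=20,  # minimum image blocks in a split node
    max_purity=0.995,      # stop splitting at 99.5% class purity
    max_offset_var=30.0,   # bound on squared deviation of positive offsets
)
PATCH_SIZE = (16, 16)      # densely sampled image-block size
PATCHES_PER_IMAGE = 50     # blocks randomly extracted per sample image
SCALES = [0.1 + i * 0.7 / 19 for i in range(20)]  # 20 scales from 0.1 to 0.8
```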
Step E3: in the three-dimensional Hough space formed by (h, λ_m), where h = (h_x, h_y) comprises the horizontal and vertical coordinate positions in the image, determine the final target center position (h′, λ′_m) using the mean-shift algorithm and a decision threshold, and mark a detection frame of the corresponding size at the corresponding position in the original test image; the position of this detection frame is the position of the vehicle in the target image. Part of the detection results are shown in FIG. 2.
Compared with the prior art, the multi-view vehicle detection method based on weighted Hough voting uses Locally Linear Embedding (LLE) and k-means to divide the vehicle samples automatically into subclasses of different view angles, uses this division to define the voting weights of the positive sample set under different view angles during Hough voting, and combines these voting weights to perform accurately localized Hough voting, thereby achieving accurate detection of vehicles under multiple view angles; the method greatly improves detection speed and effectively exploits the information shared among the different view-angle subclasses, further improving the accuracy of vehicle detection.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (3)

1. A multi-view vehicle detection method based on weighted Hough voting is characterized by comprising the following steps:
Step A: define a training sample image set D = {(I_i, f_i, y_i)}, i = 1, 2, …, N, with image size 128 × 64, where f_i is the image feature expression, HOG features being used for the multi-view division and multi-channel pixel features being used when training visual words and performing Hough voting; y_i ∈ {−1, +1} is the training sample label, I_i being a background sample when y_i = −1 and a target sample when y_i = +1; N is the size of the training sample set;
Step B: divide the positive sample set D⁺ = {I_j}, j = 1, 2, …, N⁺, of the training sample image set D into view-angle subclasses, where N⁺ is the number of positive samples;
Step C: calculate the contribution weight of each positive sample to the different view-angle subclasses;
Step C adopts the following method: in the LLE embedding space, the cluster center of the k-th view-angle subclass sample set is defined as o_k, k ∈ {1, 2, …, K}, and the expression of positive sample image I_j of the positive sample set D⁺ in the LLE embedding space is f′_j; the contribution weight w_jk of positive sample I_j to view-angle subclass k is then defined as:

[equation image in the original: definition of w_jk in terms of d(f′_j, o_k)]

where d(f′_j, o_k) is the distance between f′_j and o_k in the LLE embedding space; to ensure the correctness of the calculation, the contribution weights w_jk of sample image I_j under the view-angle subclasses are normalized so that they sum to 1, i.e.

Σ_{k=1}^{K} w_jk = 1;
Step D: determining the voting score of the image block at the candidate position by using a weighted Hough voting method;
the step D comprises the following steps:
step D1: definition and image block ptThe matched visual word is L, and the set of L containing the offset vectors of the positive sample image blocks is ELAnd obtaining the classification probability C by counting the proportion of the image blocks containing the positive samples in the LLThen image block ptThe vote score at candidate position h is:
Figure FDA0002519768640000021
wherein E isLFor each voting unit e to candidate bitThe vote to set h is estimated using a Gaussian Parzen window, | ELI denotes set size, qtFor image block ptThe center position of (a); after the visual word L is generated, its corresponding classification probability CLDetermined, image block ptThe voting score at the candidate position h depends mainly on the set of offset vectors ELThe voting unit e in (1) can convert V (h | p) into V (h | p) by utilizing the linear accumulation characteristic of Hough voting scorest) Is defined by the accumulation of ELThe voting score of each voting unit to the candidate position h is rewritten into an accumulation sum ELPositive sample image associated with middle voting unitjThe form of voting score on candidate position h, namely:
Figure FDA0002519768640000022
wherein the content of the first and second substances,
Figure FDA0002519768640000023
wherein, is the standard deviation of a Gaussian Parzen window,
Figure FDA0002519768640000024
represents ELThe voting unit e in (a) is from the positive sample set D+Mean sample imagej(ii) a Go through the image block p in the test image GtThe final vote score at candidate position h is:
Figure FDA0002519768640000025
Step D2: because the view angles of the sample images in the positive sample set D⁺ differ too greatly, the final Hough map generated by the voting score V(h | p_t) defined above is cluttered and its bright spots are insufficiently concentrated, so the candidate position h cannot be determined accurately; a view-angle variable k ∈ {1, 2, …, K} is therefore introduced into the voting model defined above to restrict voting to the same view angle and guarantee the view-angle consistency of the votes for candidate position h, namely:

V(h, k) = Σ_t Σ_{j: I_j ∈ subclass k} V(h | p_t, I_j)

The formula is computed as follows: first, the multi-view subclass-division method of step B is used to label each sample image of the positive sample set D⁺ with a view-angle subclass k ∈ {1, 2, …, K}; then the definition of step C is used to calculate the view-angle contribution weight w_jk, j ∈ {1, 2, …, N⁺}, of each sample image; with the positive sample set D⁺ finally calibrated with view angles and view-angle contribution weights, V(h, k) is redefined as:

V(h, k) = Σ_t Σ_{j=1}^{N⁺} w_jk · V(h | p_t, I_j)

where W denotes the matrix of the weights w_jk, of size N⁺ × K;
Step E: determine a vehicle detection frame in the test image.
2. The multi-view vehicle detection method based on weighted Hough voting according to claim 1, wherein step B comprises the following steps:
Step B1: use the LLE algorithm to embed the positive sample images of D⁺, expressed by their HOG features, into a two-dimensional space;
Step B2: select the center point of the ring formed by the sample-point distribution in the two-dimensional space, and regularize all samples onto a circle based on the relative angles of the sample points to the center point;
Step B3: cluster the samples on the circle with the k-means algorithm, dividing the positive sample set D⁺ into K view-angle subclasses.
3. The multi-view vehicle detection method based on weighted Hough voting according to claim 1, wherein step E comprises the following steps:
Step E1: first, decompose the test image over scales, defining the scale space of the test image as {λ_m}, m = 1, 2, …, M, where M is the number of discrete scales;
Step E2: in the test image at scale λ_m, densely sample image blocks of the same size, find the visual word matched with each image block, and use the formula V(h, k) above to obtain the Hough voting map at that scale;
Step E3: in the three-dimensional Hough space formed by (h, λ_m), where h = (h_x, h_y), determine the final target center position (h′, λ′_m) using the mean-shift algorithm and a decision threshold, and mark a detection frame of the corresponding size at the corresponding position in the original test image.
CN201710554766.8A 2017-07-07 2017-07-07 Multi-view vehicle detection method based on weighted Hough voting Active CN107330432B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710554766.8A CN107330432B (en) 2017-07-07 2017-07-07 Multi-view vehicle detection method based on weighted Hough voting

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710554766.8A CN107330432B (en) 2017-07-07 2017-07-07 Multi-view vehicle detection method based on weighted Hough voting

Publications (2)

Publication Number Publication Date
CN107330432A CN107330432A (en) 2017-11-07
CN107330432B (en) 2020-08-18

Family

ID=60197190

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710554766.8A Active CN107330432B (en) 2017-07-07 2017-07-07 Multi-view vehicle detection method based on weighted Hough voting

Country Status (1)

Country Link
CN (1) CN107330432B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109726761B (en) * 2018-12-29 2023-03-31 青岛海洋科学与技术国家实验室发展中心 CNN evolution method, CNN-based AUV cluster working method, CNN evolution device and CNN-based AUV cluster working device and storage medium
CN109948692B (en) * 2019-03-16 2020-12-15 四川大学 Computer-generated picture detection method based on multi-color space convolutional neural network and random forest
CN111723721A (en) * 2020-06-15 2020-09-29 中国传媒大学 Three-dimensional target detection method, system and device based on RGB-D

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104112282A (en) * 2014-07-14 2014-10-22 华中科技大学 A method for tracking a plurality of moving objects in a monitor video based on on-line study
US8953888B2 (en) * 2011-02-10 2015-02-10 Microsoft Corporation Detecting and localizing multiple objects in images using probabilistic inference
CN106529461A (en) * 2016-11-07 2017-03-22 湖南源信光电科技有限公司 Vehicle model identifying algorithm based on integral characteristic channel and SVM training device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9129152B2 (en) * 2013-11-14 2015-09-08 Adobe Systems Incorporated Exemplar-based feature weighting

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8953888B2 (en) * 2011-02-10 2015-02-10 Microsoft Corporation Detecting and localizing multiple objects in images using probabilistic inference
CN104112282A (en) * 2014-07-14 2014-10-22 华中科技大学 A method for tracking a plurality of moving objects in a monitor video based on on-line study
CN106529461A (en) * 2016-11-07 2017-03-22 湖南源信光电科技有限公司 Vehicle model identifying algorithm based on integral characteristic channel and SVM training device

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Recovering 6D object pose and predicting next-best-view in the crowd; Doumanoglou A, Kouskouridas R, Malassiotis S, et al.; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2016; full text *
Robust object detection with interleaved categorization and segmentation; Leibe B, Leonardis A, Schiele B; International Journal of Computer Vision; 2008; full text *
Object detection based on Hough transform and conditional random field model (in Chinese); 杜本汉; China Masters' Theses Full-text Database, Information Science and Technology; 2015-06-15; full text *
Research on object detection algorithms in complex scenes (in Chinese); 向涛; China Doctoral Dissertations Full-text Database; 2017-02-15; full text *

Also Published As

Publication number Publication date
CN107330432A (en) 2017-11-07

Similar Documents

Publication Publication Date Title
CN104850850B (en) A kind of binocular stereo vision image characteristic extracting method of combination shape and color
CN112257605B (en) Three-dimensional target detection method, system and device based on self-labeling training sample
CN102663411B (en) Recognition method for target human body
GB2532948A (en) Objection recognition in a 3D scene
CN109658442B (en) Multi-target tracking method, device, equipment and computer readable storage medium
CN111882586B (en) Multi-actor target tracking method oriented to theater environment
CN109118528A (en) Singular value decomposition image matching algorithm based on area dividing
CN107330432B (en) Multi-view vehicle detection method based on weighted Hough voting
CN110223310B (en) Line structure light center line and box edge detection method based on deep learning
CN112818905B (en) Finite pixel vehicle target detection method based on attention and spatio-temporal information
CN106446785A (en) Passable road detection method based on binocular vision
CN111695373B (en) Zebra stripes positioning method, system, medium and equipment
Zelener et al. Cnn-based object segmentation in urban lidar with missing points
CN108073940B (en) Method for detecting 3D target example object in unstructured environment
CN111126393A (en) Vehicle appearance refitting judgment method and device, computer equipment and storage medium
CN112150448B (en) Image processing method, device and equipment and storage medium
CN110969212A (en) ISAR image classification method based on spatial transformation three-channel convolution
CN111383286A (en) Positioning method, positioning device, electronic equipment and readable storage medium
CN111325184A (en) Intelligent interpretation and change information detection method for remote sensing image
CN110675442A (en) Local stereo matching method and system combined with target identification technology
CN106446832B (en) Video-based pedestrian real-time detection method
CN113052110A (en) Three-dimensional interest point extraction method based on multi-view projection and deep learning
CN108388854A (en) A kind of localization method based on improvement FAST-SURF algorithms
CN110334703B (en) Ship detection and identification method in day and night image
CN104484647B (en) A kind of high-resolution remote sensing image cloud height detection method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200722

Address after: 224000 in Jiangsu Province in the south of Yancheng City District Xindu street landscape Avenue branch building 22 North Building (CND)

Applicant after: YANCHENG CHANTU INTELLIGENT TECHNOLOGY Co.,Ltd.

Address before: 450016, Zhengzhou City, Henan Province, Second West Avenue, South Road, one South Road Xinghua science and Technology Industrial Park Building 2, 9, 908, -37 room

Applicant before: ZHENGZHOU CHANTU INTELLIGENT TECHNOLOGY Co.,Ltd.

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant