CN107463892A - Pedestrian detection method in images combining context information and multi-level features - Google Patents
- Publication number
- CN107463892A (CN201710624030.3A)
- Authority
- CN
- China
- Prior art keywords
- feature
- pedestrian
- image
- roi
- detection method
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/23—Recognition of whole body movements, e.g. for sport training
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Probability & Statistics with Applications (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Psychiatry (AREA)
- Social Psychology (AREA)
- Human Computer Interaction (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a deep-learning-based pedestrian detection method that combines image context information with multi-level image features. The object detection model Faster R-CNN is applied to pedestrian detection, and context information from the region around each pedestrian is fed to the feature classifier. In addition, the multi-level features of VGG16, the deep feature extraction model in Faster R-CNN, are combined: coarse high-level features are joined with fine low-level features so that the resulting feature carries richer information and small pedestrians are detected better. The method has a low false detection rate and wide applicability. It can be used for pedestrian detection in intelligent surveillance systems or autonomous driving, and has significant practical value.
Description
Technical field
The present invention relates to the field of image analysis, and more particularly to a pedestrian detection method based on deep learning that combines image context information and multi-level image features.
Background
Pedestrian detection refers to having a computer, using image processing and related machine learning algorithms, analyze the content of an image or video to determine whether pedestrians are present and, if so, to localize and mark them accurately. Because a video is composed of individual frames, the performance of image-based pedestrian detection naturally determines the performance of video-based pedestrian detection. The present invention therefore mainly addresses pedestrian detection in still images extracted from video. Such images are typically captured on streets by vehicle-mounted cameras. Their backgrounds are complex, illumination varies, pedestrians' clothing and poses vary widely, and pedestrians are sometimes occluded, so pedestrian detection still faces many challenges. The analysis of this kind of video image is therefore of great significance to the field of pedestrian detection.
Depending on how pedestrian features are extracted, existing pedestrian detection models fall into two classes.
The first class is pedestrian detection based on hand-crafted features. In contrast with recent deep learning methods, these are also called traditional methods. For a given image region, such a method first extracts pedestrian features with a pre-designed hand-crafted feature extraction algorithm, then feeds the features into a support vector machine (SVM) or adaptive boosting (AdaBoost) classifier, which is trained to classify and localize, thereby detecting pedestrians from the features. Common hand-crafted features include Haar-like, HOG, DPM and ICF features. Hand-crafted features are all low-level features. Although these methods perform well under certain assumptions, for challenging video images with complex backgrounds from real scenes, such low-level features cannot effectively extract and represent the pedestrians in the image.
The second class is pedestrian detection based on deep learning. As deep learning has achieved outstanding results in fields such as image, speech and text processing in recent years, many deep-learning-based pedestrian detection methods have emerged. These methods use a deep model to learn pedestrian features automatically: trained on large amounts of data, the model learns features with thousands of parameters from high-dimensional data, and the learned features are then classified and localized to detect pedestrians. At present, deep-learning-based pedestrian detection methods far outperform those based on hand-crafted features, and performance can be improved further by designing better deep detection models.
Existing object detection models include Faster R-CNN. However, Faster R-CNN itself has two shortcomings. First, its feature classifier uses only the features of the pedestrian itself when classifying, whereas the region surrounding a pedestrian often contains additional information that would help the classifier decide. Second, Faster R-CNN does not detect small pedestrians in an image well. As a result, Faster R-CNN performs poorly on pedestrian detection, with a relatively high false detection rate.
Summary of the invention
To overcome the above shortcomings of the prior art, the present invention provides a new method for pedestrian detection in images. It is based on deep learning and combines image context information with multi-level image features. The method can be applied in intelligent surveillance systems or autonomous driving to detect pedestrians in images or video captured by a camera, yielding the likely positions of pedestrians for subsequent analysis and operation by the system.
The principle of the invention is as follows. The method is based on deep learning and combines image context information with multi-level image features to detect pedestrians in images. First, drawing on deep learning research in object detection, it applies a currently outstanding object detection model, the region-based convolutional network Faster R-CNN (Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in Neural Information Processing Systems. 2015), to pedestrian detection to obtain good detection results. Next, it uses the image context around each pedestrian to let the feature classifier in Faster R-CNN "see" a wider area and make more accurate decisions. Finally, it combines the multi-level features of VGG16 (Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014)), the deep feature extraction model in Faster R-CNN, joining coarse high-level features with fine low-level features so that the feature carries richer information and helps Faster R-CNN detect small pedestrians.
The invention performs pedestrian detection mainly on the basis of the deep object detection model Faster R-CNN. Image context information gives the classifier information about the pedestrian's surroundings, and the multi-level features of VGG16 give the classifier richer features. The method was tested on the 4024 test images of the Caltech dataset, and its false detection rate is lower than that of most current methods.
The technical solution provided by the invention is:
A pedestrian detection method combining image context information and multi-level image features. Features are extracted from the image with the VGG16 model (which has 13 convolutional layers), and the feature map produced by each convolutional layer is kept in memory. A region proposal network (RPN) is run on the last feature map, conv5_3, to obtain many high-quality regions of interest (RoIs), i.e. proposal boxes, that may contain pedestrians. For each RoI, a context feature is first extracted at the corresponding position on conv5_3; then the RoI's features at the corresponding positions on the three feature maps conv3_3, conv4_3 and conv5_3 are extracted to form the multi-level features. The context feature and the multi-level features are concatenated along the channel dimension and fed into the classifier for classification and localization. After training, the method detects pedestrians in images accurately. It comprises the following steps:
1) Input: a still video image to be detected.
2) Feature extraction: features are extracted from the input image with the VGG16 deep convolutional network, which has 13 convolutional layers. Each convolutional layer produces a feature map; the last feature map is conv5_3.
3) Proposal extraction: an n × n spatial window slides over the last feature map, conv5_3, along its width and height with a stride of 1. At each position, k reference boxes (anchor boxes) with different scales and aspect ratios are predicted. Each proposal box is given a score according to how likely it is to contain a target. The proposals are sorted by score from high to low, and the top N (e.g. the first 2000) proposal boxes (RoIs) most likely to contain pedestrians are kept.
4) Image context extraction: for each RoI, an RoI pooling operation over a region l (l > 1) times the RoI's proposal region is applied at the corresponding position on the last feature map, conv5_3, to extract the RoI's context feature at that position.
5) Multi-level feature extraction: for each RoI, RoI pooling is applied at the corresponding position on each of the three feature maps conv3_3, conv4_3 and conv5_3 produced by VGG16 to extract the feature at each level.
6) Feature concatenation: the extracted context feature and multi-level features are each L2-normalized and scaled, and are then concatenated along the channel dimension, combining the context feature with the multi-level features so that the feature carries more information.
7) Detection: the combined feature is fed into the classifier for classification and bounding box regression. The detection result consists of the score with which the proposal box is classified as a pedestrian and the coordinates of the proposal box after bounding box regression. The score threshold is set to 0.01; proposal boxes whose scores exceed the threshold are output together with their coordinates, thereby detecting pedestrians.
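The final thresholding in step 7 can be illustrated with a minimal sketch. The `Detection` type, its `(x1, y1, x2, y2)` box layout and the example values are hypothetical and introduced only for illustration; only the threshold of 0.01 comes from the text.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    # Regressed box as (x1, y1, x2, y2); this layout is an assumption.
    box: Tuple[float, float, float, float]
    # Pedestrian-class score produced by the classifier.
    score: float


def filter_detections(dets: List[Detection], threshold: float = 0.01) -> List[Detection]:
    """Keep detections whose score is strictly greater than the threshold."""
    kept = [d for d in dets if d.score > threshold]
    # Sorting is for readability only; the text does not prescribe an output order.
    return sorted(kept, key=lambda d: d.score, reverse=True)


# Made-up candidates, purely for illustration.
candidates = [
    Detection((10.0, 20.0, 40.0, 95.0), 0.92),
    Detection((200.0, 50.0, 215.0, 90.0), 0.005),
    Detection((120.0, 30.0, 150.0, 100.0), 0.30),
]
final = filter_detections(candidates)
```

A threshold this low keeps almost every scored box, which suits an evaluation that sweeps over scores; a deployed system might choose a higher value.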
Compared with the prior art, the beneficial effects of the invention are:
The invention provides a new method for pedestrian detection in images. It draws on the outstanding results deep learning has achieved in object detection and applies them to pedestrian detection to obtain good detection results. Furthermore, to address the shortcomings of the existing Faster R-CNN model, it incorporates image context information and multi-level image features so that the feature carries richer information and helps the classifier classify and localize better. The method can be applied in intelligent surveillance systems or autonomous driving to detect pedestrians in images or video captured by a camera, yielding the likely positions of pedestrians for subsequent analysis and operation by the system.
The invention was evaluated on the test portion of the Caltech dataset, currently the most widely used pedestrian detection dataset. Its false detection rate is lower than that of most current methods and only about 6 percentage points higher than that of the best method, which demonstrates the technical merit of the method.
Brief description of the drawings
Fig. 1 is the overall framework of the pedestrian detection method provided by the invention;
wherein: (1) input the image to be detected; (2) extract features with VGG16; (3) extract proposal boxes (RoIs) with the RPN; (4) extract the context feature with RoI pooling over l (l > 1) times the RoI; (5) extract the multi-level image features on the different feature maps with RoI pooling over 1 times the RoI; (6) L2-normalize and scale each of the multi-level features; (7) L2-normalize and scale the context feature; (8) concatenate the context feature and the multi-level features along the channel dimension; (9) reduce the dimensionality of the combined feature; (10) feed the reduced feature into a fully connected layer for classification; (11) feed the reduced feature into a fully connected layer for bounding box regression.
Fig. 2 is a flowchart of the pedestrian detection method provided by the invention.
Fig. 3 is a schematic diagram of image context extraction in the invention;
wherein the labelled elements are: the multi-layer VGG16 feature maps; RoI pooling of the pedestrian feature; pooling of the pedestrian image context feature; the RoI pooling operation.
Fig. 4 shows example results of applying an embodiment of the invention to some images in the Caltech test set.
Detailed description of the embodiments
The invention is further described below through embodiments with reference to the accompanying drawings, without limiting the scope of the invention in any way.
Pedestrian detection is a sub-problem of object detection and provides key technical support in fields such as intelligent surveillance, intelligent transportation and autonomous driving. Because a video is composed of individual frames, the performance of image-based pedestrian detection naturally determines the performance of video-based pedestrian detection, so the invention proposes a novel pedestrian detection method mainly for still images extracted from video. The method first applies Faster R-CNN, an outstanding deep model from object detection, to pedestrian detection to obtain good detection results. It then uses the image context around each pedestrian to let the classifier "see" a wider area. Next, it combines the multi-level features of VGG16, joining coarse high-level features with fine low-level features so that the feature carries richer information and Faster R-CNN detects small pedestrians better. Finally, tests on the Caltech dataset show that the method reaches a false detection rate of 14.0%, lower than most current advanced algorithms.
Fig. 1 is the overall framework of the pedestrian detection method provided by the invention, and Fig. 2 is its flowchart. The method comprises the following steps:
First, input the image to be detected. If the input is video data, the video is first pre-processed into multiple still images, and each still image is input for detection separately.
Second, feature extraction. Features are extracted from the input image with a deep image classification model pre-trained on ImageNet. Commonly used models include the ZF model with 5 convolutional layers (Zeiler, Matthew D., and Rob Fergus. "Visualizing and understanding convolutional networks." European Conference on Computer Vision. Springer, Cham, 2014), VGG16 with 13 convolutional layers, and the 101-layer residual network ResNet101 (He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016). Because VGG16 performs well in both training speed and accuracy, VGG16 is chosen here as the feature extraction network. The parameters of its 13 convolutional layers are listed in Table 1.
Table 1. Convolutional and pooling layer parameters in VGG16
Layer name | Type | Kernel size | Stride | Padding | Number of kernels |
conv1_1 | Convolutional | 3 | 1 | 1 | 64 |
conv1_2 | Convolutional | 3 | 1 | 1 | 64 |
pool1 | Pooling | 2 | 2 | 0 | 128 |
conv2_1 | Convolutional | 3 | 1 | 1 | 128 |
conv2_2 | Convolutional | 3 | 1 | 1 | 128 |
pool2 | Pooling | 2 | 2 | 0 | 256 |
conv3_1 | Convolutional | 3 | 1 | 1 | 256 |
conv3_2 | Convolutional | 3 | 1 | 1 | 256 |
conv3_3 | Convolutional | 3 | 1 | 1 | 256 |
pool3 | Pooling | 2 | 2 | 0 | 512 |
conv4_1 | Convolutional | 3 | 1 | 1 | 512 |
conv4_2 | Convolutional | 3 | 1 | 1 | 512 |
conv4_3 | Convolutional | 3 | 1 | 1 | 512 |
pool4 | Pooling | 2 | 2 | 0 | 512 |
conv5_1 | Convolutional | 3 | 1 | 1 | 512 |
conv5_2 | Convolutional | 3 | 1 | 1 | 512 |
conv5_3 | Convolutional | 3 | 1 | 1 | 512 |
pool5 | Pooling | 2 | 2 | 0 | 512 |
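To make the spatial resolution of each feature map concrete, the following minimal sketch traces an input size through the kernel size, stride and padding values of Table 1 up to conv5_3. The 480 × 640 input size is an assumption chosen for illustration, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple


def out_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


CONV = (3, 1, 1)  # kernel, stride, padding of every convolutional layer in Table 1
POOL = (2, 2, 0)  # kernel, stride, padding of every pooling layer in Table 1

# Layers in order up to conv5_3, the last feature map used by the method.
VGG16_LAYERS: List[Tuple[str, Tuple[int, int, int]]] = [
    ("conv1_1", CONV), ("conv1_2", CONV), ("pool1", POOL),
    ("conv2_1", CONV), ("conv2_2", CONV), ("pool2", POOL),
    ("conv3_1", CONV), ("conv3_2", CONV), ("conv3_3", CONV), ("pool3", POOL),
    ("conv4_1", CONV), ("conv4_2", CONV), ("conv4_3", CONV), ("pool4", POOL),
    ("conv5_1", CONV), ("conv5_2", CONV), ("conv5_3", CONV),
]


def trace_sizes(height: int, width: int) -> Dict[str, Tuple[int, int]]:
    """Return the (height, width) of the output of every layer."""
    sizes = {}
    for name, (k, s, p) in VGG16_LAYERS:
        height, width = out_size(height, k, s, p), out_size(width, k, s, p)
        sizes[name] = (height, width)
    return sizes


sizes = trace_sizes(480, 640)  # assumed input size
for level in ("conv3_3", "conv4_3", "conv5_3"):
    print(level, sizes[level])
```

Each 3 × 3 convolution with stride 1 and padding 1 preserves the spatial size and each 2 × 2 pooling halves it, so conv3_3, conv4_3 and conv5_3 sit at 1/4, 1/8 and 1/16 of the input resolution respectively.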
Third, proposal extraction. Proposals are extracted by the region proposal network (RPN) module of Faster R-CNN. On VGG16's last feature map, conv5_3, the module slides an n × n spatial window along the width and height with a stride of 1. With p scales and q aspect ratios set in advance, a fixed number k = p × q of reference boxes (anchor boxes) is predicted at each position, so a feature map of size w × h yields w × h × k reference boxes in total. Based on experiments, 10 scales and 1 aspect ratio are used, as shown in Table 2, so each position produces 10 reference boxes of different sizes. Each reference box is assigned a score representing how likely the corresponding proposal box is to contain a target. The boxes are sorted by score from high to low, and the top 2000 proposal boxes (RoIs) most likely to contain pedestrians are kept.
Table 2. Reference box scales and aspect ratios
Scale | Aspect ratio | Scale | Aspect ratio |
2.0² | 2.44 | 5.7122² | 2.44 |
2.6² | 2.44 | 7.4259² | 2.44 |
3.38² | 2.44 | 9.653² | 2.44 |
3.5431² | 2.44 | 12.5497² | 2.44 |
4.394² | 2.44 | 16.3146² | 2.44 |
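The anchor generation and top-N selection of the third step can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the feature stride of 16, the reading of the aspect ratio as height divided by width, the way a scale maps to a box area, and the placeholder scale values are all assumptions.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels


def make_anchors(feat_w: int, feat_h: int, scales: Sequence[float],
                 ratios: Sequence[float], stride: int = 16) -> List[Box]:
    """Generate k = len(scales) * len(ratios) anchors at every feature-map cell."""
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            # Center of this feature-map cell in image coordinates.
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride
            for s in scales:
                for r in ratios:  # r is assumed to mean height / width
                    area = (stride * s) ** 2  # assumed scale-to-area mapping
                    w = math.sqrt(area / r)
                    h = w * r
                    anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def top_n(boxes: Sequence, scores: Sequence[float], n: int = 2000) -> list:
    """Keep the n highest-scoring boxes, sorted from high to low score."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)[:n]
    return [boxes[i] for i in order]


# Placeholder scales: p = 10 values, q = 1 ratio, so k = 10 anchors per position.
scales = [2.0 * 1.3 ** i for i in range(10)]
anchors = make_anchors(feat_w=40, feat_h=30, scales=scales, ratios=[2.44])
print(len(anchors))  # w * h * k
```

For a 40 × 30 feature map this yields 40 × 30 × 10 = 12000 reference boxes, matching the w × h × k count given in the text.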
Fourth, image context extraction. Fig. 3 is a schematic diagram of image context extraction in the invention. For each RoI, the RoI region is first enlarged l (l > 1) times about its center; RoI pooling is then applied to the enlarged RoI at the corresponding position on the last feature map, conv5_3, to extract the RoI's context feature at that position. Here l is set to 1.5. The feature classifier in Faster R-CNN uses only the pedestrian's own features when classifying, yet the region surrounding a pedestrian usually contains additional information that helps the classifier decide. The enlarged RoI contains more information than the original RoI, so the classifier "sees" a wider area and makes more accurate decisions.
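The enlargement of an RoI about its center can be sketched as below. The text does not say unambiguously whether l scales each side or the area; this sketch assumes each side is scaled by l (replace `l` with `math.sqrt(l)` if the area is meant). The clipping to the feature-map bounds and the `(x, y, w, h)` layout are also assumptions.

```python
from typing import Optional, Tuple

Roi = Tuple[float, float, float, float]  # (x, y, w, h): top-left corner, width, height


def expand_roi(roi: Roi, l: float = 1.5,
               feat_w: Optional[float] = None,
               feat_h: Optional[float] = None) -> Roi:
    """Enlarge an RoI about its center; optionally clip to the feature map."""
    x, y, w, h = roi
    cx, cy = x + w / 2, y + h / 2
    new_w, new_h = w * l, h * l  # assumption: l scales each side
    nx, ny = cx - new_w / 2, cy - new_h / 2
    if feat_w is None or feat_h is None:
        return (nx, ny, new_w, new_h)
    # Clip the enlarged box so that it stays inside the feature map.
    x1, y1 = max(0.0, nx), max(0.0, ny)
    x2, y2 = min(feat_w, nx + new_w), min(feat_h, ny + new_h)
    return (x1, y1, x2 - x1, y2 - y1)


context_roi = expand_roi((10, 8, 4, 10))          # interior RoI, no clipping
border_roi = expand_roi((0, 0, 4, 10), 1.5, 40, 30)  # RoI at the map corner
```

The context feature is then obtained by applying the same RoI pooling to `context_roi` instead of the original RoI.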
The RoI pooling operation is performed by an RoI pooling layer, which converts the features inside any valid RoI into a small feature map of fixed spatial extent H × W (for example, 7 × 7), where H and W are layer hyperparameters independent of any particular RoI. Here an RoI is a proposal box produced by the RPN on the feature map; each RoI is defined by a 4-tuple (x, y, w, h) giving its top-left corner (x, y) and its height and width (h, w). The RoI pooling layer divides the h × w RoI window into an H × W grid of sub-windows, each of approximate size $h/H \times w/W$, and takes the maximum of each sub-window as the output of the corresponding grid cell, as in a standard max pooling layer.
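A minimal single-channel sketch of RoI max pooling is given below. The way sub-window boundaries are rounded (floor for the start, ceiling for the end) is an assumption borrowed from common implementations, and the RoI is assumed to have integer coordinates lying inside the feature map.

```python
import math
from typing import List, Tuple


def roi_max_pool(feature: List[List[float]], roi: Tuple[int, int, int, int],
                 out_h: int = 7, out_w: int = 7) -> List[List[float]]:
    """Max-pool the RoI (x, y, w, h) of a 2-D feature map into an out_h x out_w grid."""
    x, y, w, h = roi
    pooled = []
    for i in range(out_h):
        # Row range of the i-th sub-window, roughly h / out_h rows tall.
        r0 = y + (i * h) // out_h
        r1 = y + math.ceil((i + 1) * h / out_h)
        row = []
        for j in range(out_w):
            # Column range of the j-th sub-window, roughly w / out_w columns wide.
            c0 = x + (j * w) // out_w
            c1 = x + math.ceil((j + 1) * w / out_w)
            row.append(max(feature[r][c]
                           for r in range(r0, r1) for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# Toy 6 x 8 feature map whose value encodes its position: feature[r][c] = 8 * r + c.
feature = [[r * 8 + c for c in range(8)] for r in range(6)]
pooled = roi_max_pool(feature, roi=(2, 1, 4, 4), out_h=2, out_w=2)
print(pooled)
```

In a real network the same pooling is applied independently to every channel, so a C-channel RoI becomes a C × H × W feature regardless of the RoI's size.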
Fifth, multi-level feature extraction. For each RoI, RoI pooling is applied at the corresponding position on each of the three feature maps conv3_3, conv4_3 and conv5_3 produced by VGG16 to extract the RoI's feature at each level. By default, Faster R-CNN extracts features only from conv5_3. For small pedestrians, the features on conv5_3 are very coarse, so small pedestrians are not detected well, and Faster R-CNN performs poorly on pedestrian detection with a relatively high false detection rate. Combining fine low-level features with coarse high-level features enriches the features of small pedestrians and thereby improves Faster R-CNN's detection of small pedestrians.
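The "corresponding position" of an RoI on each of the three feature maps can be sketched by dividing image coordinates by each map's cumulative stride. The strides 4, 8 and 16 follow from the number of 2 × 2, stride-2 pooling layers that precede each map in Table 1; the rounding rule and the example box are assumptions made for illustration.

```python
import math
from typing import Dict, Tuple

# Cumulative downsampling factor of each feature map relative to the input image.
STRIDES: Dict[str, int] = {"conv3_3": 4, "conv4_3": 8, "conv5_3": 16}


def project_roi(box: Tuple[float, float, float, float], stride: int) -> Tuple[int, int, int, int]:
    """Map an image-space box (x1, y1, x2, y2) to a feature-map RoI (x, y, w, h)."""
    x1, y1, x2, y2 = box
    fx1, fy1 = math.floor(x1 / stride), math.floor(y1 / stride)
    fx2, fy2 = math.ceil(x2 / stride), math.ceil(y2 / stride)
    # Keep at least one cell so that pooling always has something to read.
    return (fx1, fy1, max(fx2 - fx1, 1), max(fy2 - fy1, 1))


# Hypothetical small pedestrian, 32 pixels wide and 80 pixels tall.
pedestrian = (64.0, 32.0, 96.0, 112.0)
rois = {name: project_roi(pedestrian, s) for name, s in STRIDES.items()}
for name, roi in rois.items():
    print(name, roi)
```

On conv5_3 this pedestrian covers only 2 × 5 cells, while on conv3_3 it covers 8 × 20, which illustrates why the lower-level maps carry finer detail for small pedestrians.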
Sixth, feature concatenation. The extracted context feature and the multi-level features differ from one another in norm and scale. Simply concatenating them along the channel dimension degrades detection performance, because features with large values dominate. To solve this, the four groups of features (the 3 levels of features plus the context feature) are first L2-normalized according to Formula 1.
Suppose a feature is d-dimensional, written $x = (x_1, x_2, \ldots, x_d)$. It is normalized with Formula 1:
$$\hat{x} = \frac{x}{\lVert x \rVert_2}, \qquad \lVert x \rVert_2 = \Big(\sum_{i=1}^{d} |x_i|^2\Big)^{1/2} \tag{1}$$
where $\hat{x}$ is the normalized value of the feature $x$.
However, normalization alone changes the scale of each feature and slows learning. A scaling factor $\gamma_i$ is therefore introduced for each input channel, so that the scaled normalized value is $y_i = \gamma_i \hat{x}_i$. During training, each feature $x$ and each scaling factor $\gamma$ is learned using the back-propagation algorithm and the chain rule, so that after training the features have similar norms and scales.
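The L2 normalization and per-channel scaling can be sketched as follows. The scaling factors are learned during training; here they are fixed to arbitrary values purely for illustration, and the small `eps` term guarding against division by zero is an assumption.

```python
import math
from typing import List, Sequence


def l2_normalize(x: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Divide a d-dimensional feature by its L2 norm."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / (norm + eps) for v in x]


def normalize_and_scale(x: Sequence[float], gamma: Sequence[float]) -> List[float]:
    """Scale each normalized component by its own factor gamma_i."""
    x_hat = l2_normalize(x)
    return [g * v for g, v in zip(gamma, x_hat)]


# Two features with very different magnitudes end up on the same scale.
small = normalize_and_scale([3.0, 4.0], gamma=[10.0, 10.0])
large = normalize_and_scale([300.0, 400.0], gamma=[10.0, 10.0])
print(small, large)
```

Both inputs map to approximately [6.0, 8.0], which is the point of the step: after normalization and scaling, no feature group dominates the concatenated feature merely because its raw values are larger.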
L2 normalization and scaling of the features make training more stable and improve performance. Finally, the L2-normalized and scaled context feature and multi-level features are concatenated along the channel dimension to form the final pedestrian feature, which is passed to the classifier.
Seventh, detection. The combined final pedestrian feature is first compressed with a 1 × 1 convolutional layer to 512 × 7 × 7 dimensions. It is then fed into the classifier for classification and bounding box regression. A threshold is set, and positive samples whose scores exceed the threshold are output together with their coordinates, thereby detecting pedestrians.
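Channel-wise concatenation followed by a 1 × 1 convolution can be sketched on toy data as below. With the channel counts of Table 1, the concatenated feature would have 256 + 512 + 512 + 512 = 1792 channels on a 7 × 7 grid, compressed to 512; the sketch uses far smaller, made-up sizes and weights, and the nested-list layout `[channel][row][column]` is an assumption.

```python
from typing import List, Optional, Sequence

Feature = List[List[List[float]]]  # indexed as [channel][row][column]


def concat_channels(*features: Feature) -> Feature:
    """Concatenate features of equal spatial size along the channel dimension."""
    out: Feature = []
    for f in features:
        out.extend(f)
    return out


def conv1x1(feature: Feature, weights: Sequence[Sequence[float]],
            bias: Optional[Sequence[float]] = None) -> Feature:
    """1 x 1 convolution: each output channel is a weighted sum of input channels."""
    height, width = len(feature[0]), len(feature[0][0])
    out: Feature = []
    for o, w_row in enumerate(weights):
        b = bias[o] if bias is not None else 0.0
        out.append([[b + sum(wc * feature[c][i][j] for c, wc in enumerate(w_row))
                     for j in range(width)] for i in range(height)])
    return out


def const(channels: int, value: float, size: int = 2) -> Feature:
    """Toy feature filled with a constant value (illustration only)."""
    return [[[value] * size for _ in range(size)] for _ in range(channels)]


# Toy stand-ins for the conv3_3, conv4_3, conv5_3 and context features.
fused = concat_channels(const(2, 1.0), const(1, 2.0), const(1, 3.0), const(1, 4.0))
weights = [[1.0, 1.0, 1.0, 1.0, 1.0], [0.5, 0.0, 0.0, 0.0, 0.0]]  # 5 -> 2 channels
compressed = conv1x1(fused, weights)
```

Because the kernel is 1 × 1, the spatial grid is unchanged and only the channel count is reduced, so the classifier receives a feature of the same shape it would get from conv5_3 alone.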
The above is a specific embodiment of the invention, which combines image context information and multi-level features for pedestrian detection. Training and testing in this embodiment are both carried out on single-scale images, without a multi-scale pyramid strategy. The method is trained on the Caltech-10x dataset for 60,000 iterations with learning rate lr = 0.001 and then for 20,000 iterations with lr = 0.0001. It is then tested on the Caltech test set of 4024 images and evaluated under the Reasonable setting (pedestrian height greater than 50 pixels, occluded area no more than 35%), using false detection rate versus FPPI as the evaluation criterion. Table 3 compares the results of the present method with those of seven other algorithms on the Caltech dataset. All values are average false detection rates; lower is better. The results show that the false detection rate of the present method is lower than that of most of the compared methods and only about 6 percentage points higher than that of the best current method, which demonstrates the performance of the method.
Table 3. Evaluation results of different methods on the Caltech dataset
Algorithm | VJ[1] | HOG[2] | ACF++[3] | Checkboards[4] | LDCF[5] | RPN+BF[6] | F-DNN[7] | Ours |
False detection rate | 94.7% | 68.5% | 17.7% | 17.1% | 15.0% | 9.6% | 8.2% | 14.0% |
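The training schedule and the Reasonable evaluation filter described above can be written down as a short sketch. It assumes that "times" in the schedule means training iterations; the function names are hypothetical, and the filter follows the wording of the text (height greater than 50 pixels, occlusion no more than 35%).

```python
def learning_rate(iteration: int) -> float:
    """Step schedule: 0.001 for the first 60,000 iterations, then 0.0001 for 20,000."""
    if iteration < 0:
        raise ValueError("iteration must be non-negative")
    if iteration < 60_000:
        return 0.001
    if iteration < 80_000:
        return 0.0001
    raise ValueError("training ends after 80,000 iterations")


def is_reasonable(height_px: float, occluded_fraction: float) -> bool:
    """Reasonable setting as worded in the text."""
    return height_px > 50 and occluded_fraction <= 0.35


total_iterations = 60_000 + 20_000
print(learning_rate(0), learning_rate(60_000), total_iterations)
```

Only ground-truth pedestrians that pass `is_reasonable` would count toward the figures in Table 3 under this reading.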
Finally, Fig. 4 shows visual examples of detections produced by the invention on some images from the Caltech test set.
The existing methods compared in Table 3 are described in the following references:
[1] Viola, Paul, and Michael J. Jones. "Robust real-time face detection." International Journal of Computer Vision 57.2 (2004): 137-154.
[2] Dalal, Navneet, and Bill Triggs. "Histograms of oriented gradients for human detection." Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on. Vol. 1. IEEE, 2005.
[3] Ohn-Bar, Eshed, and Mohan M. Trivedi. "To boost or not to boost? On the limits of boosted trees for object detection." Pattern Recognition (ICPR), 2016 23rd International Conference on. IEEE, 2016.
[4] Zhang, Shanshan, Rodrigo Benenson, and Bernt Schiele. "Filtered channel features for pedestrian detection." Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on. IEEE, 2015.
[5] Nam, Woonhyun, Piotr Dollár, and Joon Hee Han. "Local decorrelation for improved pedestrian detection." Advances in Neural Information Processing Systems. 2014.
[6] Zhang, Liliang, et al. "Is Faster R-CNN doing well for pedestrian detection?" European Conference on Computer Vision. Springer International Publishing, 2016.
[7] Du, Xianzhi, et al. "Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection." Applications of Computer Vision (WACV), 2017 IEEE Winter Conference on. IEEE, 2017.
It should be noted that the embodiments are disclosed to aid further understanding of the invention. Those skilled in the art will appreciate that various substitutions and modifications are possible without departing from the spirit and scope of the invention and the appended claims. The invention should therefore not be limited to what the embodiments disclose; the scope of protection is defined by the claims.
Claims (9)
1. A pedestrian detection method combining image context information and multi-level image features, wherein: features are extracted from an image to be detected with a deep image classification model, and the feature map produced by each convolutional layer of the deep model is kept in memory; a region proposal network (RPN) is run on the last feature map to obtain multiple high-quality regions of interest (RoIs), i.e. proposal boxes, that may contain pedestrians; for each RoI, a context feature is first extracted at the corresponding position on the last feature map, and multi-level feature extraction is then performed by extracting the RoI's features at the corresponding positions on multiple feature maps to form multi-level features; the context feature and the multi-level features are concatenated along the channel dimension and fed into a classifier for classification training and localization; and the classifier is trained to recognize pedestrians, whereby pedestrians in the image are detected accurately.
2. The pedestrian detection method of claim 1, wherein the deep image classification model is the VGG16 deep convolutional network with 13 convolutional layers described in (Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014)), and the last feature map is conv5_3.
3. The pedestrian detection method as claimed in claim 2, characterized in that candidate-box extraction is specifically: a spatial window of size n × n
is slid over the last feature map conv5_3 with a stride of 1 along its length and width; at each position,
k reference boxes (anchor boxes) of different scales and aspect ratios are predicted at the same time; for each candidate box,
a score is predicted according to the likelihood that the box contains a target; the boxes are sorted by score from high to low, and the top-ranked
candidate boxes (RoIs) most likely to contain pedestrians are retained.
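The sliding-window step and the score-based selection in claim 3 can be illustrated with a minimal sketch. It assumes a feature-map stride of 16 for conv5_3 (the value that follows from the four pooling layers of VGG16 before conv5_3), while the scales, aspect ratios and `top_n` passed in are illustrative values that the source does not specify. The function names `generate_anchors` and `keep_top_proposals` are hypothetical, and the scores are taken as given rather than predicted by a network.

```python
import math


def generate_anchors(feat_h, feat_w, stride, scales, ratios):
    """Return anchor boxes (x1, y1, x2, y2) in image coordinates.

    One group of k = len(scales) * len(ratios) anchors is produced for
    every position of the feature map (window step of 1).
    """
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Centre of the current feature-map cell, mapped back to the image.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for scale in scales:
                for ratio in ratios:  # ratio = height / width
                    w = scale / math.sqrt(ratio)
                    h = scale * math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors


def keep_top_proposals(boxes, scores, top_n):
    """Sort boxes by score from high to low and keep the first top_n."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    return [(boxes[i], scores[i]) for i in order[:top_n]]
```

With three scales and three aspect ratios, k is 9 per position, so a 2 × 3 feature map would yield 54 anchors.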
4. The pedestrian detection method as claimed in claim 2, characterized in that the contextual-information feature is extracted at the corresponding position on conv5_3,
specifically: for each candidate box RoI, on the last feature map conv5_3, an RoI pooling operation is applied over a region whose area is l (l > 1) times
the area of the candidate box RoI, yielding the contextual-information feature of the RoI at the corresponding position.
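As a small worked example of the region used in claim 4, the sketch below enlarges a candidate box so that its area becomes l times the original. The source only states the area ratio; keeping the box centre and aspect ratio fixed (each side multiplied by the square root of l) and clipping to the map bounds are assumptions made here, and `context_box` is a hypothetical helper name.

```python
import math


def context_box(box, l, map_w, map_h):
    """Enlarge box (x1, y1, x2, y2) to l times its area around its centre.

    Assumption: centre and aspect ratio are preserved, so each side grows
    by sqrt(l). The result is clipped to [0, map_w] x [0, map_h].
    """
    if l <= 1:
        raise ValueError("l must be greater than 1")
    x1, y1, x2, y2 = box
    side_factor = math.sqrt(l)
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * side_factor / 2.0
    half_h = (y2 - y1) * side_factor / 2.0
    return (max(0.0, cx - half_w), max(0.0, cy - half_h),
            min(float(map_w), cx + half_w), min(float(map_h), cy + half_h))
```

For a 10 × 20 box and l = 4, the context region is 20 × 40, four times the original area, unless clipping at the border shrinks it.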
5. The pedestrian detection method as claimed in claim 2, characterized in that multi-stage image feature extraction is specifically: for each candidate box RoI,
on each of the three feature maps conv3_3, conv4_3 and conv5_3 produced by VGG16, an RoI pooling operation is applied at the corresponding position
to extract the feature of each stage, and these features together form the multi-stage feature.
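The per-stage RoI pooling of claim 5 can be sketched in plain Python as follows. The strides 4, 8 and 16 follow from the number of pooling layers that precede conv3_3, conv4_3 and conv5_3 in VGG16; the 7 × 7 output size matches the dimensions mentioned in claim 8, but its use at this step is an assumption, as is the simple max-pooling bin layout. Feature maps are nested lists indexed `[channel][row][col]`, and the function names are hypothetical.

```python
import math

# Image-to-feature-map strides of the three VGG16 stages used here.
STRIDES = {"conv3_3": 4, "conv4_3": 8, "conv5_3": 16}


def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def roi_max_pool(fmap, box, stride, out_size=7):
    """Max-pool the part of fmap covered by box into out_size x out_size bins.

    fmap is [C][H][W]; box is (x1, y1, x2, y2) in image coordinates.
    """
    height, width = len(fmap[0]), len(fmap[0][0])
    # Project the box onto the feature map and keep at least one cell.
    x1 = min(max(int(box[0] / stride), 0), width - 1)
    y1 = min(max(int(box[1] / stride), 0), height - 1)
    x2 = min(max(math.ceil(box[2] / stride), x1 + 1), width)
    y2 = min(max(math.ceil(box[3] / stride), y1 + 1), height)
    roi_w, roi_h = x2 - x1, y2 - y1
    pooled = []
    for channel in fmap:
        rows = []
        for i in range(out_size):
            r0 = y1 + (i * roi_h) // out_size
            r1 = max(y1 + ceil_div((i + 1) * roi_h, out_size), r0 + 1)
            row = []
            for j in range(out_size):
                c0 = x1 + (j * roi_w) // out_size
                c1 = max(x1 + ceil_div((j + 1) * roi_w, out_size), c0 + 1)
                row.append(max(channel[r][c]
                               for r in range(r0, r1)
                               for c in range(c0, c1)))
            rows.append(row)
        pooled.append(rows)
    return pooled


def multi_stage_features(feature_maps, box, out_size=7):
    """Pool the same box on conv3_3, conv4_3 and conv5_3."""
    return {name: roi_max_pool(feature_maps[name], box, STRIDES[name], out_size)
            for name in STRIDES}
```

Because every stage is pooled to the same spatial size, the resulting features differ only in channel count and can later be joined along the channel dimension.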
6. The pedestrian detection method as claimed in claim 2, characterized in that feature concatenation is: L2 normalization and scaling are applied separately to the extracted
image context feature and to the image multi-stage features, and the features are then concatenated along the channel dimension,
i.e. the contextual feature is combined with the multi-stage features so that the resulting feature carries more information; this specifically comprises the following process:
1) L2 normalization is applied to the three stage features and the contextual feature according to formula 1:
Let the feature be d-dimensional, denoted x = (x_1, x_2, …, x_d); the feature is normalized with formula 1:
$$\hat{x} = \frac{x}{\lVert x \rVert_2}, \qquad \lVert x \rVert_2 = \left( \sum_{i=1}^{d} |x_i|^2 \right)^{1/2} \tag{1}$$
where $\hat{x}$ is the normalized value of feature x;
2) a scaling factor $\gamma_i$ is introduced for each channel of the input, and the scaled normalized value is $y_i = \gamma_i \hat{x}_i$;
3) the contextual feature and the multi-stage features, after L2 normalization and scaling, are concatenated along the channel dimension
to form the final pedestrian feature, which the classifier uses for classification;
4) in the training stage, the back-propagation algorithm and the chain rule are used to learn each feature x and each scaling factor γ separately,
so that through training the features come to have similar norms and scales.
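Steps 1) to 3) of claim 6 can be illustrated with a short sketch. For brevity it treats each feature as a flat vector of d channel values (for example, the channels at one spatial position), which is a simplifying assumption; the scaling factors passed in are fixed example values, whereas the claim has them learned during training. The names `l2_normalize`, `scale` and `fuse`, and the small `eps` guard against division by zero, are not from the source.

```python
import math


def l2_normalize(x, eps=1e-12):
    """Divide the vector x by its L2 norm (eps avoids division by zero)."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / (norm + eps) for v in x]


def scale(x_hat, gammas):
    """Multiply each normalized channel value by its scaling factor."""
    return [g * v for g, v in zip(gammas, x_hat)]


def fuse(features, gammas_per_feature):
    """Normalize and scale each feature, then concatenate along channels."""
    fused = []
    for x, gammas in zip(features, gammas_per_feature):
        fused.extend(scale(l2_normalize(x), gammas))
    return fused
```

Normalizing first puts features taken from layers with very different activation magnitudes on a common footing, and the per-channel factors then let training choose how strongly each channel contributes.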
7. The pedestrian detection method as claimed in claim 2, characterized in that, when detecting pedestrians, the combined feature is fed into the classifier
for classification and bounding-box regression; the detection result is the score expressing the likelihood that the candidate box belongs to the pedestrian class, together with
the coordinates of the candidate box after bounding-box regression; a score threshold is set, and the candidate boxes whose scores exceed the threshold are output with their corresponding coordinates,
thereby achieving pedestrian detection.
8. The pedestrian detection method as claimed in claim 7, characterized in that, before the combined feature is fed into the classifier,
it is compressed with a 1 × 1 convolutional layer to 512 × 7 × 7 dimensions.
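A 1 × 1 convolution, as used for the compression in claim 8, is a linear map over channels applied independently at every spatial position, so it changes the channel count while leaving the 7 × 7 layout untouched. The sketch below shows this on nested lists indexed `[channel][row][col]`; the function name `conv1x1` and the weight layout `[out_channel][in_channel]` are assumptions for illustration. If the three stages and the context feature keep the VGG16 channel widths (256, 512, 512 and 512), the input would have 1792 channels, although the source does not state that number.

```python
def conv1x1(feature, weights, bias):
    """Apply a 1 x 1 convolution to feature [C_in][H][W].

    weights is [C_out][C_in] and bias is [C_out]; the result is
    [C_out][H][W], with the spatial size unchanged.
    """
    c_in = len(feature)
    height, width = len(feature[0]), len(feature[0][0])
    output = []
    for w_row, b in zip(weights, bias):
        plane = [[b + sum(w_row[c] * feature[c][r][col] for c in range(c_in))
                  for col in range(width)]
                 for r in range(height)]
        output.append(plane)
    return output
```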
9. The pedestrian detection method as claimed in claim 7, characterized in that the score threshold is set to 0.01.
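The thresholding described in claims 7 and 9 amounts to a simple filter over the scored boxes. The sketch below assumes the pedestrian scores and regressed coordinates are already available; `filter_detections` is a hypothetical name, and only the default threshold of 0.01 comes from the claims.

```python
def filter_detections(boxes, scores, score_threshold=0.01):
    """Keep the (box, score) pairs whose score exceeds the threshold."""
    return [(box, score) for box, score in zip(boxes, scores)
            if score > score_threshold]
```

A threshold this low discards only boxes that are almost certainly background, so most scored boxes pass through to the output.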
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710624030.3A CN107463892A (en) | 2017-07-27 | 2017-07-27 | Pedestrian detection method in a kind of image of combination contextual information and multi-stage characteristics |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710624030.3A CN107463892A (en) | 2017-07-27 | 2017-07-27 | Pedestrian detection method in a kind of image of combination contextual information and multi-stage characteristics |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107463892A true CN107463892A (en) | 2017-12-12 |
Family
ID=60547587
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710624030.3A Pending CN107463892A (en) | 2017-07-27 | 2017-07-27 | Pedestrian detection method in a kind of image of combination contextual information and multi-stage characteristics |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107463892A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106096605A (en) * | 2016-06-02 | 2016-11-09 | 史方 | A kind of image obscuring area detection method based on degree of depth study and device |
CN106096561A (en) * | 2016-06-16 | 2016-11-09 | 重庆邮电大学 | Infrared pedestrian detection method based on image block degree of depth learning characteristic |
CN106127173A (en) * | 2016-06-30 | 2016-11-16 | 北京小白世纪网络科技有限公司 | A kind of human body attribute recognition approach based on degree of depth study |
CN106778854A (en) * | 2016-12-07 | 2017-05-31 | 西安电子科技大学 | Activity recognition method based on track and convolutional neural networks feature extraction |
Non-Patent Citations (2)
Title |
---|
CHENCHEN ZHU,ET AL: "CMS-RCNN Contextual Multi-Scale Region-based CNN for Unconstrained Face Detection", 《ARXIV PREPRINT》 * |
SHAOQING REN ET AL: "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", 《IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 * |
Cited By (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111095295B (en) * | 2018-01-31 | 2021-09-03 | 富士通株式会社 | Object detection method and device |
CN111095295A (en) * | 2018-01-31 | 2020-05-01 | 富士通株式会社 | Object detection method and device |
WO2019148362A1 (en) * | 2018-01-31 | 2019-08-08 | 富士通株式会社 | Object detection method and apparatus |
CN108875537A (en) * | 2018-02-28 | 2018-11-23 | 北京旷视科技有限公司 | Method for checking object, device and system and storage medium |
CN108416314A (en) * | 2018-03-16 | 2018-08-17 | 中山大学 | The important method for detecting human face of picture |
CN108416314B (en) * | 2018-03-16 | 2022-03-08 | 中山大学 | Picture important face detection method |
CN108549852A (en) * | 2018-03-28 | 2018-09-18 | 中山大学 | Pedestrian detector's Auto-learning Method under special scenes based on the enhancing of depth network |
CN108549852B (en) * | 2018-03-28 | 2020-09-08 | 中山大学 | Specific scene downlink person detector automatic learning method based on deep network enhancement |
CN108520229B (en) * | 2018-04-04 | 2020-08-07 | 北京旷视科技有限公司 | Image detection method, image detection device, electronic equipment and computer readable medium |
CN108520229A (en) * | 2018-04-04 | 2018-09-11 | 北京旷视科技有限公司 | Image detecting method, device, electronic equipment and computer-readable medium |
CN108545021A (en) * | 2018-04-17 | 2018-09-18 | 济南浪潮高新科技投资发展有限公司 | A kind of auxiliary driving method and system of identification special objective |
CN108898047A (en) * | 2018-04-27 | 2018-11-27 | 中国科学院自动化研究所 | The pedestrian detection method and system of perception are blocked based on piecemeal |
CN108898047B (en) * | 2018-04-27 | 2021-03-19 | 中国科学院自动化研究所 | Pedestrian detection method and system based on blocking and shielding perception |
CN108681718A (en) * | 2018-05-20 | 2018-10-19 | 北京工业大学 | A kind of accurate detection recognition method of unmanned plane low target |
CN108681718B (en) * | 2018-05-20 | 2021-08-06 | 北京工业大学 | Unmanned aerial vehicle low-altitude target accurate detection and identification method |
CN109101908A (en) * | 2018-07-27 | 2018-12-28 | 北京工业大学 | Driving procedure area-of-interest detection method and device |
CN109377474B (en) * | 2018-09-17 | 2021-06-15 | 苏州大学 | Macular positioning method based on improved Faster R-CNN |
CN109377474A (en) * | 2018-09-17 | 2019-02-22 | 苏州大学 | A kind of macula lutea localization method based on improvement Faster R-CNN |
CN113168705A (en) * | 2018-10-12 | 2021-07-23 | 诺基亚技术有限公司 | Method and apparatus for context-embedded and region-based object detection |
CN109697257A (en) * | 2018-12-18 | 2019-04-30 | 天罡网(北京)安全科技有限公司 | It is a kind of based on the network information retrieval method presorted with feature learning anti-noise |
CN109583517A (en) * | 2018-12-26 | 2019-04-05 | 华东交通大学 | A kind of full convolution example semantic partitioning algorithm of the enhancing suitable for small target deteection |
CN109886286A (en) * | 2019-01-03 | 2019-06-14 | 武汉精测电子集团股份有限公司 | Object detection method, target detection model and system based on cascade detectors |
CN110020658A (en) * | 2019-03-28 | 2019-07-16 | 大连理工大学 | A kind of well-marked target detection method based on multitask deep learning |
CN110135243A (en) * | 2019-04-02 | 2019-08-16 | 上海交通大学 | A kind of pedestrian detection method and system based on two-stage attention mechanism |
CN110135243B (en) * | 2019-04-02 | 2021-03-19 | 上海交通大学 | Pedestrian detection method and system based on two-stage attention mechanism |
CN110263712B (en) * | 2019-06-20 | 2021-02-23 | 江南大学 | Coarse and fine pedestrian detection method based on region candidates |
CN110263712A (en) * | 2019-06-20 | 2019-09-20 | 江南大学 | A kind of coarse-fine pedestrian detection method based on region candidate |
CN110334622A (en) * | 2019-06-24 | 2019-10-15 | 电子科技大学 | Based on the pyramidal pedestrian retrieval method of self-adaptive features |
CN110334622B (en) * | 2019-06-24 | 2022-04-19 | 电子科技大学 | Pedestrian retrieval method based on adaptive feature pyramid |
CN110826392A (en) * | 2019-09-17 | 2020-02-21 | 安徽大学 | Cross-modal pedestrian detection method combined with context information |
CN110826392B (en) * | 2019-09-17 | 2023-03-10 | 安徽大学 | Cross-modal pedestrian detection method combined with context information |
CN110751181A (en) * | 2019-09-23 | 2020-02-04 | 华中科技大学 | Target identification method based on sum pooling characteristics |
CN110929668A (en) * | 2019-11-29 | 2020-03-27 | 珠海大横琴科技发展有限公司 | Commodity detection method and device based on unmanned goods shelf |
CN111401418A (en) * | 2020-03-05 | 2020-07-10 | 浙江理工大学桐乡研究院有限公司 | Employee dressing specification detection method based on improved Faster r-cnn |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107463892A (en) | Pedestrian detection method in a kind of image of combination contextual information and multi-stage characteristics | |
CN111259850B (en) | Pedestrian re-identification method integrating random batch mask and multi-scale representation learning | |
Younis et al. | Real-time object detection using pre-trained deep learning models MobileNet-SSD | |
CN110348376B (en) | Pedestrian real-time detection method based on neural network | |
CN108304873A (en) | Object detection method based on high-resolution optical satellite remote-sensing image and its system | |
CN108537824B (en) | Feature map enhanced network structure optimization method based on alternating deconvolution and convolution | |
CN103886308B (en) | A kind of pedestrian detection method of use converging channels feature and soft cascade grader | |
CN112446388A (en) | Multi-category vegetable seedling identification method and system based on lightweight two-stage detection model | |
CN109711262B (en) | Intelligent excavator pedestrian detection method based on deep convolutional neural network | |
CN111680655A (en) | Video target detection method for aerial images of unmanned aerial vehicle | |
CN109902806A (en) | Method is determined based on the noise image object boundary frame of convolutional neural networks | |
CN107688808A (en) | A kind of quickly natural scene Method for text detection | |
CN108154102A (en) | A kind of traffic sign recognition method | |
CN110197152A (en) | A kind of road target recognition methods for automated driving system | |
CN103854016A (en) | Human body behavior classification and identification method and system based on directional common occurrence characteristics | |
CN107315990A (en) | A kind of pedestrian detection algorithm based on XCS LBP features and cascade AKSVM | |
CN111368775A (en) | Complex scene dense target detection method based on local context sensing | |
US20240161315A1 (en) | Accurate and robust visual object tracking approach for quadrupedal robots based on siamese network | |
Wei et al. | Traffic sign detection and recognition using novel center-point estimation and local features | |
CN112329861A (en) | Layered feature fusion method for multi-target detection of mobile robot | |
Ishioka et al. | Single camera worker detection, tracking and action recognition in construction site | |
CN110188811A (en) | Underwater target detection method based on normed Gradient Features and convolutional neural networks | |
Cao et al. | Foreign object debris detection on airfield pavement using region based convolution neural network | |
Kheder et al. | Transfer learning based traffic light detection and recognition using CNN inception-V3 model | |
Yang et al. | Real-time pedestrian detection for autonomous driving |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20171212 |