CN107463892A - Method for detecting pedestrians in images by combining contextual information and multi-level features - Google Patents

Info

Publication number
CN107463892A
Authority
CN
China
Prior art keywords
feature
pedestrian
image
roi
detection method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710624030.3A
Other languages
Chinese (zh)
Inventor
李革
孔伟杰
李楠楠
臧祥浩
王文敏
王荣刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Shenzhen Graduate School
Original Assignee
Peking University Shenzhen Graduate School
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Shenzhen Graduate School
Priority to CN201710624030.3A
Publication of CN107463892A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/23 Recognition of whole body movements, e.g. for sport training
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a pedestrian detection method that is based on deep learning and combines image context information with multi-level image features. The deep object detection model Faster R-CNN is applied to pedestrian detection, and the context information around each pedestrian is fed to the feature classifier. The multi-level features of VGG16, the deep feature extraction model in Faster R-CNN, are then combined, joining coarse high-level features with fine low-level features so that the resulting feature carries richer information and small pedestrians are detected better. The method has a low miss rate and wide applicability, can be used to detect pedestrians in intelligent surveillance systems or autonomous driving, and has important application value.

Description

Method for detecting pedestrians in images by combining contextual information and multi-level features
Technical field
The present invention relates to the field of image analysis, and in particular to a pedestrian detection method that is based on deep learning and combines image context information with multi-level image features.
Background
Pedestrian detection refers to having a computer use image processing and machine learning algorithms to analyze image or video content, decide whether pedestrians are present, and, if so, accurately locate and mark them. Because a video is a sequence of frames, the performance of pedestrian detection on images naturally determines the performance of pedestrian detection on video, so the present invention mainly performs pedestrian detection on still images converted from video. Such images are typically captured on the street by vehicle-mounted cameras: the background is complex, illumination varies, pedestrians differ widely in clothing and posture, and pedestrians are sometimes occluded. Many challenges therefore remain, and the analysis of this kind of video image is of great significance to the field of pedestrian detection.
According to how pedestrian features are extracted, existing pedestrian detection models fall into two classes:
The first class is pedestrian detection based on hand-crafted features. In contrast to recent deep learning methods, these are also called traditional methods. For a given image region, such a method first extracts pedestrian features with a pre-designed hand-crafted feature extraction algorithm, then feeds the features to a support vector machine (SVM) or an adaptive boosting (AdaBoost) classifier for training, classification, and localization, thereby detecting pedestrians from the features. Common hand-crafted features include Haar-like, HOG, DPM, and ICF features. Hand-crafted features are all low-level features. Although these methods perform well under certain assumptions, on challenging real-world video images with complex backgrounds the low-level features cannot effectively extract and represent the pedestrians in the image.
The second class is pedestrian detection based on deep learning. As deep learning has achieved outstanding results in image, speech, and text processing in recent years, many deep-learning-based pedestrian detection methods have emerged. These methods use a deep model to learn pedestrian features automatically: trained on large amounts of data, the model learns features involving thousands of parameters from high-dimensional data, and the resulting features are then classified and localized to detect pedestrians. At present, deep-learning-based methods far outperform methods based on hand-crafted features, and performance can be improved further by designing better deep detection models.
Existing object detection models include Faster R-CNN. However, the Faster R-CNN model itself has two shortcomings. First, its feature classifier uses only the features of the pedestrian when classifying, although the region surrounding a pedestrian often contains additional information that would help the classifier decide. Second, Faster R-CNN cannot detect small pedestrians in an image well. As a result, Faster R-CNN performs poorly on pedestrian detection and has a high miss rate.
Summary of the invention
To overcome the above deficiencies of the prior art, the present invention provides a new method for detecting pedestrians in images, based on deep learning and combining image context information with multi-level image features. The method can be applied in intelligent surveillance systems or autonomous driving to detect pedestrians in images or video captured by a camera, yielding the positions where pedestrians may be present and facilitating subsequent analysis and operation by the system.
The principle of the invention is as follows. The method is based on deep learning and combines image context information with multi-level image features to detect pedestrians in images. First, drawing on deep learning research in object detection, it applies Faster R-CNN, a faster region-based convolutional neural network that is currently one of the leading object detection models (Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in Neural Information Processing Systems. 2015), to pedestrian detection to obtain better detection results. Then, it combines the image context information around the pedestrian so that the feature classifier in Faster R-CNN can "see" a wider area and make more accurate decisions. Finally, it combines the multi-level features of VGG16, the deep feature extraction model in Faster R-CNN (Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014)), joining coarse high-level features with fine low-level features so that the feature carries richer information and Faster R-CNN can detect small pedestrians well.
The invention performs pedestrian detection mainly on the basis of the deep object detection model Faster R-CNN. It uses image context information to give the classifier information about the pedestrian's surroundings, and uses the multi-level features of VGG16 to give the classifier richer features. The method was tested on the 4024 test images of the Caltech dataset, and its miss rate is lower than that of most current methods.
The technical solution provided by the invention is:
A pedestrian detection method combining image context information and multi-level image features. Features are extracted from the image with the VGG16 model (which has a fixed 13 convolutional layers), and the feature map produced by each convolutional layer is kept in memory. A region proposal network (RPN) is run on the last feature map, conv5_3, to obtain many high-quality regions of interest (RoIs) that may contain pedestrians, i.e. candidate boxes. For each RoI, a context feature is first extracted at the corresponding position on conv5_3; the features of the RoI at the corresponding positions on the three feature maps conv3_3, conv4_3, and conv5_3 are then extracted to form the multi-level features. The context feature and the multi-level features are concatenated along the channel dimension and fed to the classifier for classification and localization. With training, pedestrians in the image can be detected accurately. The method comprises the following steps:
1) Input: a still video image to be detected;
2) Feature extraction: features are extracted from the input image with the VGG16 deep convolutional network model, which has 13 convolutional layers; each convolutional layer produces a feature map, and the last feature map is conv5_3;
3) Candidate box extraction: an n × n spatial window slides over the last feature map conv5_3 with stride 1 along both height and width. At each position, k reference boxes (called anchor boxes) of different scales and aspect ratios are predicted. Each candidate box is given a score according to how likely it is to contain a target; the boxes are sorted by score from high to low, and the top N (for example, the top 2000) candidate boxes most likely to contain pedestrians are kept as RoIs;
4) Image context information extraction: for each RoI, RoI pooling is applied at the corresponding position on the last feature map conv5_3 over a region l (l > 1) times the area of the RoI candidate box, to extract the context feature of the RoI at that position;
5) Multi-level image feature extraction: for each RoI, RoI pooling is applied at the corresponding position on each of the three feature maps conv3_3, conv4_3, and conv5_3 produced by VGG16, to extract the feature at each level;
6) Feature concatenation: L2 normalization and scaling are applied separately to the extracted image context feature and to each of the multi-level image features, and the features are then concatenated along the channel dimension, i.e. the context feature is combined with the multi-level features so that the feature carries more information;
7) Detection: the combined feature is fed to the classifier for classification and bounding-box regression. The detection result for each candidate box is its pedestrian-class score and its box coordinates after bounding-box regression. The score threshold is set to 0.01, and candidate boxes whose score exceeds the threshold are output together with their coordinates, thereby achieving pedestrian detection.
Compared with the prior art, the beneficial effects of the invention are as follows:
The invention provides a new method for detecting pedestrians in images. It draws on the outstanding results deep learning has achieved in object detection and applies them to pedestrian detection, obtaining better detection results. In addition, to address the shortcomings of the existing Faster R-CNN model, the invention combines image context information and multi-level image features so that the feature carries richer information and helps the classifier classify and localize better. The method can be applied in intelligent surveillance systems or autonomous driving to detect pedestrians in images or video captured by a camera, yielding the positions where pedestrians may be present and facilitating subsequent analysis and operation by the system.
The invention was evaluated on the test portion of the Caltech dataset, currently the most widely used pedestrian detection dataset. In this evaluation its miss rate is lower than that of most current methods and differs from that of the best method by only about 6 percentage points, which illustrates the technical merit of the method.
Brief description of the drawings
Fig. 1 is the overall framework diagram of the pedestrian detection method provided by the invention;
Wherein: 1 - input the image to be detected; 2 - extract features with VGG16; 3 - extract candidate boxes (RoIs) with the RPN; 4 - extract the context feature by RoI pooling over l (l > 1) times the RoI; 5 - extract the multi-level image features on the different feature maps by RoI pooling over 1 times the RoI; 6 - apply L2 normalization and scaling to each of the multi-level features; 7 - apply L2 normalization and scaling to the context feature; 8 - concatenate the context feature and the multi-level features along the channel dimension; 9 - reduce the dimensionality of the combined feature; 10 - feed the reduced feature to the fully connected layers for classification; 11 - feed the reduced feature to the fully connected layers for bounding-box regression.
Fig. 2 is the flow chart of the pedestrian detection method provided by the invention.
Fig. 3 is a schematic diagram of image context information extraction in the invention;
Wherein the labeled elements are: the VGG16 multi-layer feature maps; RoI pooling of the pedestrian feature; RoI pooling of the pedestrian image context feature; and the RoI pooling operation.
Fig. 4 shows example results obtained in a specific implementation of the invention by running detection on some images in the Caltech test set.
Detailed description of the embodiments
The invention is further described below through embodiments with reference to the accompanying drawings, without limiting the scope of the invention in any way.
Pedestrian detection is a subproblem of object detection and provides key technical support in fields such as intelligent surveillance systems, intelligent transportation systems, and autonomous driving. Because a video is a sequence of frames, the performance of pedestrian detection on images naturally determines the performance of pedestrian detection on video, so the invention proposes a novel pedestrian detection method mainly for still images converted from video. The method first applies Faster R-CNN, a leading deep model in object detection, to pedestrian detection to obtain better detection results. We then combine the image context information around the pedestrian to help the classifier "see" a wider area. Next, we combine the multi-level features of VGG16, joining coarse high-level features with fine low-level features so that the feature carries richer information and Faster R-CNN can detect small pedestrians better. Finally, tests on the Caltech dataset show that our method reaches a miss rate of 14.0%, lower than most current state-of-the-art algorithms.
Fig. 1 is the overall framework diagram of the pedestrian detection method provided by the invention, and Fig. 2 is its flow chart. The method comprises the following steps:
First, input the image to be detected. If the input is video data, the video is first preprocessed into multiple still images, which are then input for detection one by one.
Second, feature extraction. We extract features from the input image with a deep image classification model pre-trained on ImageNet. Commonly used models include the ZF model with 5 convolutional layers (Zeiler, Matthew D., and Rob Fergus. "Visualizing and understanding convolutional networks." European Conference on Computer Vision. Springer, Cham, 2014), VGG16 with 13 convolutional layers, and the 101-layer residual network ResNet101 (He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016). Because VGG16 performs well in both training speed and accuracy, we select VGG16 as the feature extraction network. The parameters of the 13 convolutional layers in VGG16 are shown in Table 1.
Table 1. Convolutional and pooling layer parameters in VGG16
Layer name | Type | Kernel size | Stride | Padding | Number of kernels
conv1_1 | Convolutional | 3 | 1 | 1 | 64
conv1_2 | Convolutional | 3 | 1 | 1 | 64
pool1 | Pooling | 2 | 2 | 0 | 128
conv2_1 | Convolutional | 3 | 1 | 1 | 128
conv2_2 | Convolutional | 3 | 1 | 1 | 128
pool2 | Pooling | 2 | 2 | 0 | 256
conv3_1 | Convolutional | 3 | 1 | 1 | 256
conv3_2 | Convolutional | 3 | 1 | 1 | 256
conv3_3 | Convolutional | 3 | 1 | 1 | 256
pool3 | Pooling | 2 | 2 | 0 | 512
conv4_1 | Convolutional | 3 | 1 | 1 | 512
conv4_2 | Convolutional | 3 | 1 | 1 | 512
conv4_3 | Convolutional | 3 | 1 | 1 | 512
pool4 | Pooling | 2 | 2 | 0 | 512
conv5_1 | Convolutional | 3 | 1 | 1 | 512
conv5_2 | Convolutional | 3 | 1 | 1 | 512
conv5_3 | Convolutional | 3 | 1 | 1 | 512
pool5 | Pooling | 2 | 2 | 0 | 512
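As a rough illustration of what the Table 1 parameters imply for the later steps, the sketch below traces the spatial size and cumulative stride of each layer's output. The tuple encoding, the function name, and the floor-based output-size formula are assumptions made for this example; only the kernel size, stride, and padding values come from Table 1.

```python
# Rows of Table 1 as (layer name, kernel size, stride, padding).
# The tuple layout is an illustrative encoding, not part of the patent.
VGG16_LAYERS = [
    ("conv1_1", 3, 1, 1), ("conv1_2", 3, 1, 1), ("pool1", 2, 2, 0),
    ("conv2_1", 3, 1, 1), ("conv2_2", 3, 1, 1), ("pool2", 2, 2, 0),
    ("conv3_1", 3, 1, 1), ("conv3_2", 3, 1, 1), ("conv3_3", 3, 1, 1),
    ("pool3", 2, 2, 0),
    ("conv4_1", 3, 1, 1), ("conv4_2", 3, 1, 1), ("conv4_3", 3, 1, 1),
    ("pool4", 2, 2, 0),
    ("conv5_1", 3, 1, 1), ("conv5_2", 3, 1, 1), ("conv5_3", 3, 1, 1),
    ("pool5", 2, 2, 0),
]


def trace_feature_maps(input_size):
    """Return {layer name: (output side length, cumulative stride)}."""
    result = {}
    size, total_stride = input_size, 1
    for name, kernel, stride, padding in VGG16_LAYERS:
        # Standard output-size formula, assuming floor rounding.
        size = (size + 2 * padding - kernel) // stride + 1
        total_stride *= stride
        result[name] = (size, total_stride)
    return result


maps = trace_feature_maps(640)
for layer in ("conv3_3", "conv4_3", "conv5_3"):
    print(layer, maps[layer])
```

For a side length of 640 pixels, the three feature maps used later have side lengths 160, 80, and 40, that is, cumulative strides of 4, 8, and 16. This is why conv5_3 is the coarsest of the three.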
Third, candidate box extraction. Candidate boxes are extracted by a module of Faster R-CNN called the region proposal network (RPN). On conv5_3, the last feature map of VGG16, this module slides an n × n spatial window with stride 1 along both height and width. With p preset scales and q preset aspect ratios, k = p × q reference boxes (called anchor boxes) are predicted at each position, so a feature map of size w × h yields w × h × k reference boxes in total. Based on experiments, we set 10 scales and 1 aspect ratio as shown in Table 2, so each position produces 10 reference boxes of different sizes. Each reference box is assigned a score representing the likelihood that the candidate box corresponding to it contains a target. The boxes are sorted by score from high to low, and the top 2000 candidate boxes most likely to contain pedestrians are kept as RoIs.
Table 2. Reference box scales and aspect ratio
Scale | Aspect ratio | Scale | Aspect ratio
2.0^2 | 2.44 | 5.7122^2 | 2.44
2.6^2 | 2.44 | 7.4259^2 | 2.44
3.38^2 | 2.44 | 9.653^2 | 2.44
3.5431^2 | 2.44 | 12.5497^2 | 2.44
4.394^2 | 2.44 | 16.3146^2 | 2.44
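The anchor enumeration and top-N selection described in the third step can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the cell-center convention, the stride of 16, and the pixel heights are hypothetical placeholders (the heights are not the scale values of Table 2), and the aspect ratio 2.44 is assumed to mean height divided by width.

```python
def generate_anchors(feat_w, feat_h, stride, heights, ratios):
    """Return a list of (cx, cy, w, h) reference boxes in image pixels.

    heights: anchor heights in pixels, one per scale (placeholder values).
    ratios:  aspect ratios, assumed here to be height / width.
    """
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Center of this feature-map cell in image coordinates (assumed).
            cx = col * stride + stride / 2.0
            cy = row * stride + stride / 2.0
            for height in heights:
                for ratio in ratios:
                    anchors.append((cx, cy, height / ratio, height))
    return anchors


def keep_top_n(scored_boxes, n=2000):
    """Sort (score, box) pairs by score, high to low, and keep the first n."""
    return sorted(scored_boxes, key=lambda sb: sb[0], reverse=True)[:n]


# Placeholder heights, NOT the scale values of Table 2.
heights = [40.0 * 1.3 ** i for i in range(10)]   # p = 10 scales
ratios = [2.44]                                  # q = 1 aspect ratio
anchors = generate_anchors(feat_w=40, feat_h=30, stride=16,
                           heights=heights, ratios=ratios)
print(len(anchors))  # w * h * k reference boxes, with k = p * q
```

With a 40 × 30 feature map and k = 10, the sketch enumerates 40 × 30 × 10 reference boxes. In a real RPN the scores passed to `keep_top_n` would come from the network rather than being supplied by hand.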
Fourth, image context information extraction. Fig. 3 is a schematic diagram of image context information extraction in the invention. For each RoI, we first enlarge its area about its center by a factor of l (l > 1), and then apply RoI pooling to the enlarged RoI at the corresponding position on the last feature map conv5_3 to extract the context feature of the RoI at that position. Here l is set to 1.5. The feature classifier in Faster R-CNN uses only the pedestrian's own features when classifying, while the region around a pedestrian often contains additional information that helps the classifier decide. The enlarged RoI therefore carries more information than the original RoI, which lets the classifier "see" a wider area and make more accurate decisions.
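The enlargement of an RoI about its center can be sketched as below. The function name and box layout are hypothetical. The text states that the RoI area is enlarged l times, so this sketch scales each side by the square root of l; if the factor were instead meant to apply to the side lengths, `area_factor=False` covers that reading. Which reading the authors used is not confirmed by the source.

```python
import math


def enlarge_roi(x, y, w, h, l=1.5, area_factor=True):
    """Enlarge a box (top-left x, y; width w; height h) about its center.

    area_factor=True  -> the area grows by l (each side by sqrt(l)).
    area_factor=False -> each side grows by l (alternative reading).
    """
    side_scale = math.sqrt(l) if area_factor else l
    cx, cy = x + w / 2.0, y + h / 2.0
    new_w, new_h = w * side_scale, h * side_scale
    return (cx - new_w / 2.0, cy - new_h / 2.0, new_w, new_h)


roi = (10.0, 10.0, 20.0, 40.0)
context_roi = enlarge_roi(*roi, l=1.5)
print(context_roi)
```

The enlarged box can extend past the feature-map border, so a real implementation would normally clip it before pooling; that step is omitted here.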
The RoI pooling operation is performed by an RoI pooling layer, which converts the features inside any valid RoI into a small feature map of fixed spatial size H × W (for example, 7 × 7), where H and W are layer hyperparameters independent of any particular RoI. Here an RoI is a candidate box produced by the RPN on the feature map. Each RoI is defined by a four-tuple (x, y, w, h) giving its top-left corner (x, y) and its height and width (h, w). The RoI pooling layer divides the h × w RoI window into an H × W grid of sub-windows, each of approximate size $h/H \times w/W$, and then takes the maximum value in each sub-window as the output of the corresponding grid cell, as in a standard max pooling layer.
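A minimal single-channel sketch of RoI max pooling is given below. The function name, the use of nested Python lists, and the floor/ceiling rule for sub-window borders are assumptions for illustration; the RoI is assumed to lie fully inside the feature map and to be at least one cell wide and high.

```python
def roi_max_pool(feature, x, y, w, h, out_h=7, out_w=7):
    """Max-pool one RoI of a single-channel feature map to out_h x out_w.

    feature: 2D list indexed as feature[row][col].
    (x, y):  top-left cell of the RoI; w, h: its width and height in cells.
    """
    pooled = []
    for i in range(out_h):
        # Sub-window rows: floor for the start, ceiling for the end,
        # so every sub-window contains at least one cell.
        r0 = y + (i * h) // out_h
        r1 = y + -(-((i + 1) * h) // out_h)
        row = []
        for j in range(out_w):
            c0 = x + (j * w) // out_w
            c1 = x + -(-((j + 1) * w) // out_w)
            row.append(max(feature[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# 4 x 4 toy feature map with values 0..15, pooled to 2 x 2.
toy = [[r * 4 + c for c in range(4)] for r in range(4)]
print(roi_max_pool(toy, x=0, y=0, w=4, h=4, out_h=2, out_w=2))
# prints [[5, 7], [13, 15]]
```

Because the output size is fixed, RoIs of any shape produce features of the same dimensions, which is what allows the context feature and the multi-level features to be concatenated later.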
Fifth, multi-level image feature extraction. For each RoI, RoI pooling is applied at the corresponding position on each of the three feature maps conv3_3, conv4_3, and conv5_3 produced by VGG16, to extract the feature of the RoI at each level. By default, Faster R-CNN extracts features only on conv5_3. For small pedestrians the features on conv5_3 are very coarse, so Faster R-CNN cannot detect small pedestrians in the image well, performs poorly on pedestrian detection, and has a high miss rate. Combining the fine low-level features with the coarse high-level features makes the features of small pedestrians richer as well, which improves the detection of small pedestrians by Faster R-CNN.
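To illustrate why conv5_3 alone is coarse for small pedestrians, the sketch below maps one image-space RoI onto the three feature maps. The strides 4, 8, and 16 are derived from the pooling layers in Table 1; the function name, the rounding rule, and the example RoI are assumptions made for this example.

```python
# Cumulative strides of the three feature maps; each 2x2, stride-2
# pooling layer in Table 1 that precedes a map doubles its stride.
FEATURE_STRIDES = {"conv3_3": 4, "conv4_3": 8, "conv5_3": 16}


def project_roi(roi, stride):
    """Map an image-space RoI (x, y, w, h), given in integer pixels, onto a
    feature map with the given stride; returns (x, y, w, h) in cells."""
    x, y, w, h = roi
    x0, y0 = x // stride, y // stride
    x1 = -(-(x + w) // stride)   # ceiling division
    y1 = -(-(y + h) // stride)
    return (x0, y0, max(1, x1 - x0), max(1, y1 - y0))


small_pedestrian = (64, 32, 48, 112)   # hypothetical RoI in pixels
for name, stride in FEATURE_STRIDES.items():
    print(name, project_roi(small_pedestrian, stride))
```

For this hypothetical 48 × 112 pixel RoI, the region covers 12 × 28 cells on conv3_3 but only 3 × 7 cells on conv5_3, which is consistent with the motivation for adding the lower-level feature maps.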
Sixth, feature concatenation. The extracted image context feature and the multi-level features each have different norms and scales. Simply concatenating them along the channel dimension would degrade detection performance, because features with large values would dominate. To solve this problem, we first apply L2 normalization according to Formula 1 to the four groups of features (the three levels of features plus the context feature).
Assume the feature is d-dimensional, denoted $x = (x_1, x_2, \ldots, x_d)$. We normalize the feature with Formula 1:
$$\hat{x} = \frac{x}{\lVert x \rVert_2}, \qquad \lVert x \rVert_2 = \Big( \sum_{i=1}^{d} |x_i|^2 \Big)^{1/2} \tag{1}$$
where $\hat{x}$ is the normalized value of feature x and $\lVert x \rVert_2$ is its L2 norm.
However, normalization alone changes the scale of each feature and can slow down learning, so a scaling factor $\gamma_i$ is introduced for each input channel, and the scaled normalized value is $y_i = \gamma_i \hat{x}_i$. During training, back-propagation and the chain rule are used to learn with respect to each feature x and each scaling factor $\gamma$ individually, so that after training the features have similar norms and scales.
Applying L2 normalization and scaling to the features makes training more stable and can improve performance. Finally, the context feature and the multi-level features, each after L2 normalization and scaling, are concatenated along the channel dimension to form the final pedestrian feature, which is passed to the classifier for classification.
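The normalize, scale, and concatenate sequence can be sketched on plain vectors as follows. The function names are hypothetical, the scaling factors are fixed by hand here (in the method they are learned during training), and treating each feature group as a flat vector is a simplification for illustration.

```python
import math


def l2_normalize(x, eps=1e-12):
    """Divide a vector by its L2 norm (eps guards against a zero vector)."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / max(norm, eps) for v in x]


def scale(x_hat, gamma):
    """Multiply each normalized channel by its scaling factor."""
    return [g * v for g, v in zip(gamma, x_hat)]


def combine(features, gammas):
    """Normalize and scale each feature group, then concatenate them."""
    combined = []
    for x, gamma in zip(features, gammas):
        combined.extend(scale(l2_normalize(x), gamma))
    return combined


context = [300.0, 400.0]                 # large-magnitude feature group
conv3 = [0.03, 0.04]                     # small-magnitude feature group
gammas = [[10.0, 10.0], [10.0, 10.0]]    # learned in practice; fixed here
print(combine([context, conv3], gammas))
```

Although the two input groups differ in magnitude by a factor of 10,000, both contribute values of comparable size after normalization and scaling, so neither dominates the concatenated feature.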
Seventh, detection. The combined final pedestrian feature is first compressed with a 1 × 1 convolutional layer to 512 × 7 × 7 dimensions, and is then fed to the classifier for classification and bounding-box regression. A threshold is set, and the positive samples whose score exceeds the threshold are output together with their coordinates, thereby achieving pedestrian detection.
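The bookkeeping of this step can be sketched as below: the channel count entering the 1 × 1 convolution and the final score thresholding. The channel numbers are the kernel counts from Table 1, the context feature is pooled from conv5_3, and the threshold 0.01 is the value given in step 7 of the technical solution. The function name and the detection tuple layout are hypothetical.

```python
# Channels of the four concatenated groups (kernel counts from Table 1;
# the context feature is pooled from conv5_3, hence 512 channels).
CHANNELS = {"conv3_3": 256, "conv4_3": 512, "conv5_3": 512, "context": 512}
combined_channels = sum(CHANNELS.values())   # input channels of the 1x1 conv
compressed_shape = (512, 7, 7)               # output of the 1x1 conv


def filter_detections(detections, threshold=0.01):
    """Keep detections whose pedestrian score exceeds the threshold.

    detections: list of (score, (x, y, w, h)) pairs after box regression.
    """
    return [d for d in detections if d[0] > threshold]


detections = [(0.92, (30, 40, 25, 60)),
              (0.005, (200, 80, 10, 24)),
              (0.30, (120, 50, 18, 44))]
print(combined_channels, compressed_shape)
print(filter_detections(detections))
```

A low threshold such as 0.01 keeps almost all scored boxes, which is a common choice when the evaluation sweeps over score thresholds itself, as a miss rate curve does.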
The above is the specific embodiment in which the invention combines image context information and multi-level features for pedestrian detection. Training and testing in this embodiment are both performed on single-scale images, without a scale pyramid strategy. The method is trained on the Caltech-10x dataset for 60,000 iterations with learning rate lr = 0.001 and then for 20,000 iterations with learning rate lr = 0.0001. It is then tested on the 4024-image test set of the Caltech dataset and evaluated under the Reasonable setting (pedestrians taller than 50 pixels, with no more than 35% of the region occluded). The evaluation criterion is miss rate versus false positives per image (FPPI). Table 3 compares the evaluation results of the method of the invention and seven other algorithms on the Caltech dataset. All scores are average miss rates, and a lower score indicates better performance. The results show that the miss rate of the method is lower than that of most methods and only about 6 percentage points higher than that of the best current method, which demonstrates the performance of the method.
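The training schedule and the Reasonable filter described above can be written down as two small functions. This is a sketch under stated assumptions: the function names are hypothetical, and the handling of the exact boundary values (exactly 50 pixels, exactly 35% occlusion) follows the wording of the text rather than any benchmark code.

```python
def learning_rate(iteration):
    """Step schedule from the text: 0.001 for the first 60,000 iterations,
    then 0.0001 for the following 20,000 iterations."""
    if not 0 <= iteration < 80000:
        raise ValueError("the schedule covers iterations 0..79999")
    return 0.001 if iteration < 60000 else 0.0001


def is_reasonable(height_px, occluded_fraction):
    """Reasonable setting as worded in the text: pedestrian taller than
    50 pixels and occluded region no more than 35%."""
    return height_px > 50 and occluded_fraction <= 0.35


print(learning_rate(0), learning_rate(60000))
print(is_reasonable(80, 0.10), is_reasonable(40, 0.0))
```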
Table 3. Evaluation results of different methods on the Caltech dataset
Algorithm | VJ [1] | HOG [2] | ACF++ [3] | Checkboards [4] | LDCF [5] | RPN+BF [6] | F-DNN [7] | Ours
Miss rate | 94.7% | 68.5% | 17.7% | 17.1% | 15.0% | 9.6% | 8.2% | 14.0%
Finally, Fig. 4 shows several visualized examples obtained by applying the invention to some of the images in the Caltech test set.
The existing methods compared in Table 3 are described in the following documents, respectively:
[1] Viola, Paul, and Michael J. Jones. "Robust real-time face detection." International Journal of Computer Vision 57.2 (2004): 137-154.
[2] Dalal, Navneet, and Bill Triggs. "Histograms of oriented gradients for human detection." Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on. Vol. 1. IEEE, 2005.
[3] Ohn-Bar, Eshed, and Mohan M. Trivedi. "To boost or not to boost? On the limits of boosted trees for object detection." Pattern Recognition (ICPR), 2016 23rd International Conference on. IEEE, 2016.
[4] Zhang, Shanshan, Rodrigo Benenson, and Bernt Schiele. "Filtered channel features for pedestrian detection." Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on. IEEE, 2015.
[5] Nam, Woonhyun, Piotr Dollár, and Joon Hee Han. "Local decorrelation for improved pedestrian detection." Advances in Neural Information Processing Systems. 2014.
[6] Zhang, Liliang, et al. "Is Faster R-CNN doing well for pedestrian detection?" European Conference on Computer Vision. Springer International Publishing, 2016.
[7] Du, Xianzhi, et al. "Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection." Applications of Computer Vision (WACV), 2017 IEEE Winter Conference on. IEEE, 2017.
It should be noted that the embodiments are disclosed to aid further understanding of the invention, but those skilled in the art will appreciate that various substitutions and modifications are possible without departing from the spirit and scope of the invention and the appended claims. Therefore, the invention should not be limited to what is disclosed in the embodiments, and the scope of protection claimed is defined by the claims.

Claims (9)

1. the pedestrian detection method of a kind of combination image context information and image multi-stage characteristics, to image to be detected, utilizes figure As the progress feature extraction of depth of assortment model, characteristic pattern caused by each layer of convolutional layer in depth model is stored in internal memory In, extracted region network RPN is performed on last layer of characteristic pattern, obtains multiple high quality region of interest that may include pedestrian Domain RoI, that is, preselect frame;For each RoI, contextual information spy is extracted in relevant position to last layer of characteristic pattern first Sign, image multi-stage characteristics extraction is then carried out, extract the RoI feature of relevant position and the multistage spy of composition on multiple characteristic patterns Sign;Contextual information is linked together with multi-stage characteristics by channel dimension, is transported in grader and is carried out classification based training and determine Position detection;By constantly training identification pedestrian, thus reach the purpose accurately detected to pedestrian in image.
2. pedestrian detection method as claimed in claim 1, it is characterized in that, described image depth of assortment model is by document (Simonyan,Karen,and Andrew Zisserman."Very deep convolutional networks for large-scale image recognition."arXiv preprint arXiv:1409.1556 (the 2014)) tool recorded There are the VGG16 depth convolutional network models of 13 layers of convolutional layer;Last layer of characteristic pattern is conv5_3.
3. The pedestrian detection method as claimed in claim 2, characterized in that the preselection box extraction is specifically: a spatial window of size n × n is slid over the last-layer feature map conv5_3 with a stride of 1 along the length and width directions; at each position, k reference boxes (anchor boxes) of different scales and different aspect ratios are predicted simultaneously; for each preselection box, a score is predicted according to the likelihood that the box contains a target; the boxes are sorted by score from high to low, and the top-ranked preselection boxes (RoIs) most likely to contain pedestrians are retained.
4. The pedestrian detection method as claimed in claim 2, characterized in that extracting the context feature at the corresponding position on conv5_3 is specifically: for each preselection box RoI, on the last-layer feature map conv5_3, an RoI pooling operation is applied over a region whose area is l (l > 1) times that of the preselection box RoI, yielding the context feature of the RoI at the corresponding position.
5. The pedestrian detection method as claimed in claim 2, characterized in that the multi-level image feature extraction is specifically: for each preselection box RoI, an RoI pooling operation is applied at the corresponding position on each of the three feature maps conv3_3, conv4_3 and conv5_3 produced by VGG16, extracting the feature at each level, and the extracted features form the multi-level feature.
6. The pedestrian detection method as claimed in claim 2, characterized in that the feature concatenation is: L2 normalization and scaling are applied separately to the extracted image context feature and to the multi-level image features, and the features are then concatenated along the channel dimension, i.e. the context feature is combined with the multi-level features so that the resulting feature carries more information; specifically comprising the following process:
1) L2 normalization is applied to the three levels of features and to the context feature according to formula 1:
Let the feature be d-dimensional, denoted $x = (x_1, x_2, \ldots, x_d)$; the feature is normalized using formula 1:
$\hat{x} = \dfrac{x}{\lVert x \rVert_2}, \quad \lVert x \rVert_2 = \Bigl(\sum_{i=1}^{d} |x_i|^2\Bigr)^{1/2}$ (formula 1)
where $\hat{x}$ is the normalized value of feature x;
2) a scaling factor $\gamma_i$ is introduced for each channel of the input, and the scaled normalized value is $y_i = \gamma_i \hat{x}_i$;
3) along the channel dimension, the context feature and the multi-level features, after L2 normalization and scaling, are concatenated to form the final pedestrian feature, which is passed to the classifier for classification;
4) in the training stage, the back-propagation algorithm and the chain rule are used to learn each feature x and each scaling factor γ individually, so that after training the features have similar norms and scales.
7. The pedestrian detection method as claimed in claim 2, characterized in that, when detecting pedestrians, the combined feature is fed into the classifier for classification and bounding-box regression; the detection result is the score for the preselection box being classified as the pedestrian class, together with the coordinates of the preselection box after bounding-box regression; a score threshold is set, and the preselection boxes whose scores exceed the threshold are output with their corresponding coordinates, thereby achieving pedestrian detection.
8. The pedestrian detection method as claimed in claim 7, characterized in that, before the combined feature is fed into the classifier, the feature is compressed with a 1 × 1 convolutional layer to a dimension of 512 × 7 × 7.
9. The pedestrian detection method as claimed in claim 7, characterized in that the score threshold is set to 0.01.
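
The per-RoI arithmetic recited in claims 4, 6, 7 and 9 can be illustrated with a minimal pure-Python sketch. It is not the applicant's implementation: the function names (expand_roi, l2_normalize, fuse_features, filter_detections) are hypothetical, each pooled feature is treated as a flat vector with one scaling factor per element, and the context box is assumed to be obtained by scaling both sides of the RoI by sqrt(l) about its centre, since the claims state only the area ratio l > 1.

```python
import math


def expand_roi(box, area_ratio, img_w, img_h):
    """Grow box (x1, y1, x2, y2) about its centre so its area is area_ratio times larger.

    Assumption: both sides are scaled by sqrt(area_ratio) and the result is
    clipped to the image; the claims only give the area ratio l > 1.
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * math.sqrt(area_ratio) / 2.0
    half_h = (y2 - y1) * math.sqrt(area_ratio) / 2.0
    return (max(0.0, cx - half_w), max(0.0, cy - half_h),
            min(float(img_w), cx + half_w), min(float(img_h), cy + half_h))


def l2_normalize(x, eps=1e-12):
    """Return x / ||x||_2; eps guards against an all-zero vector."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / (norm + eps) for v in x]


def fuse_features(features, gammas):
    """L2-normalize each feature, scale it by its gamma, then concatenate.

    features: list of flat feature vectors (e.g. three levels plus context).
    gammas:   list of scale vectors, one per feature, same lengths.
    """
    fused = []
    for x, gamma in zip(features, gammas):
        x_hat = l2_normalize(x)
        fused.extend(g * v for g, v in zip(gamma, x_hat))
    return fused


def filter_detections(detections, threshold=0.01):
    """Keep (score, box) pairs whose score exceeds the threshold."""
    return [(score, box) for score, box in detections if score > threshold]


if __name__ == "__main__":
    print(expand_roi((10, 10, 20, 20), 4, 100, 100))
    print(fuse_features([[3.0, 4.0], [0.0, 5.0]], [[10.0, 10.0], [20.0, 20.0]]))
    print(filter_detections([(0.9, (5, 5, 25, 25)), (0.005, (0, 0, 3, 3))]))
```

In a real detector the scaling factors would be learned by back-propagation and the features would be C × H × W tensors produced by RoI pooling; the sketch fixes the factors by hand only to show the order of operations.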

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710624030.3A CN107463892A (en) 2017-07-27 2017-07-27 Pedestrian detection method in a kind of image of combination contextual information and multi-stage characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710624030.3A CN107463892A (en) 2017-07-27 2017-07-27 Pedestrian detection method in a kind of image of combination contextual information and multi-stage characteristics

Publications (1)

Publication Number Publication Date
CN107463892A true CN107463892A (en) 2017-12-12

Family

ID=60547587

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710624030.3A Pending CN107463892A (en) 2017-07-27 2017-07-27 Pedestrian detection method in a kind of image of combination contextual information and multi-stage characteristics

Country Status (1)

Country Link
CN (1) CN107463892A (en)

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108416314A (en) * 2018-03-16 2018-08-17 中山大学 The important method for detecting human face of picture
CN108520229A (en) * 2018-04-04 2018-09-11 北京旷视科技有限公司 Image detecting method, device, electronic equipment and computer-readable medium
CN108545021A (en) * 2018-04-17 2018-09-18 济南浪潮高新科技投资发展有限公司 A kind of auxiliary driving method and system of identification special objective
CN108549852A (en) * 2018-03-28 2018-09-18 中山大学 Pedestrian detector's Auto-learning Method under special scenes based on the enhancing of depth network
CN108681718A (en) * 2018-05-20 2018-10-19 北京工业大学 A kind of accurate detection recognition method of unmanned plane low target
CN108875537A (en) * 2018-02-28 2018-11-23 北京旷视科技有限公司 Method for checking object, device and system and storage medium
CN108898047A (en) * 2018-04-27 2018-11-27 中国科学院自动化研究所 The pedestrian detection method and system of perception are blocked based on piecemeal
CN109101908A (en) * 2018-07-27 2018-12-28 北京工业大学 Driving procedure area-of-interest detection method and device
CN109377474A (en) * 2018-09-17 2019-02-22 苏州大学 A kind of macula lutea localization method based on improvement Faster R-CNN
CN109583517A (en) * 2018-12-26 2019-04-05 华东交通大学 A kind of full convolution example semantic partitioning algorithm of the enhancing suitable for small target deteection
CN109697257A (en) * 2018-12-18 2019-04-30 天罡网(北京)安全科技有限公司 It is a kind of based on the network information retrieval method presorted with feature learning anti-noise
CN109886286A (en) * 2019-01-03 2019-06-14 武汉精测电子集团股份有限公司 Object detection method, target detection model and system based on cascade detectors
CN110020658A (en) * 2019-03-28 2019-07-16 大连理工大学 A kind of well-marked target detection method based on multitask deep learning
WO2019148362A1 (en) * 2018-01-31 2019-08-08 富士通株式会社 Object detection method and apparatus
CN110135243A (en) * 2019-04-02 2019-08-16 上海交通大学 A kind of pedestrian detection method and system based on two-stage attention mechanism
CN110263712A (en) * 2019-06-20 2019-09-20 江南大学 A kind of coarse-fine pedestrian detection method based on region candidate
CN110334622A (en) * 2019-06-24 2019-10-15 电子科技大学 Based on the pyramidal pedestrian retrieval method of self-adaptive features
CN110751181A (en) * 2019-09-23 2020-02-04 华中科技大学 Target identification method based on sum pooling characteristics
CN110826392A (en) * 2019-09-17 2020-02-21 安徽大学 Cross-modal pedestrian detection method combined with context information
CN110929668A (en) * 2019-11-29 2020-03-27 珠海大横琴科技发展有限公司 Commodity detection method and device based on unmanned goods shelf
CN111401418A (en) * 2020-03-05 2020-07-10 浙江理工大学桐乡研究院有限公司 Employee dressing specification detection method based on improved Faster r-cnn
CN113168705A (en) * 2018-10-12 2021-07-23 诺基亚技术有限公司 Method and apparatus for context-embedded and region-based object detection

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106096561A (en) * 2016-06-16 2016-11-09 重庆邮电大学 Infrared pedestrian detection method based on image block degree of depth learning characteristic
CN106096605A (en) * 2016-06-02 2016-11-09 史方 A kind of image obscuring area detection method based on degree of depth study and device
CN106127173A (en) * 2016-06-30 2016-11-16 北京小白世纪网络科技有限公司 A kind of human body attribute recognition approach based on degree of depth study
CN106778854A (en) * 2016-12-07 2017-05-31 西安电子科技大学 Activity recognition method based on track and convolutional neural networks feature extraction

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106096605A (en) * 2016-06-02 2016-11-09 史方 A kind of image obscuring area detection method based on degree of depth study and device
CN106096561A (en) * 2016-06-16 2016-11-09 重庆邮电大学 Infrared pedestrian detection method based on image block degree of depth learning characteristic
CN106127173A (en) * 2016-06-30 2016-11-16 北京小白世纪网络科技有限公司 A kind of human body attribute recognition approach based on degree of depth study
CN106778854A (en) * 2016-12-07 2017-05-31 西安电子科技大学 Activity recognition method based on track and convolutional neural networks feature extraction

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHENCHEN ZHU,ET AL: "CMS-RCNN Contextual Multi-Scale Region-based CNN for Unconstrained Face Detection", 《ARXIV PREPRINT》 *
SHAOQING REN ET AL: "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", 《IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 *

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111095295B (en) * 2018-01-31 2021-09-03 富士通株式会社 Object detection method and device
CN111095295A (en) * 2018-01-31 2020-05-01 富士通株式会社 Object detection method and device
WO2019148362A1 (en) * 2018-01-31 2019-08-08 富士通株式会社 Object detection method and apparatus
CN108875537A (en) * 2018-02-28 2018-11-23 北京旷视科技有限公司 Method for checking object, device and system and storage medium
CN108416314A (en) * 2018-03-16 2018-08-17 中山大学 The important method for detecting human face of picture
CN108416314B (en) * 2018-03-16 2022-03-08 中山大学 Picture important face detection method
CN108549852A (en) * 2018-03-28 2018-09-18 中山大学 Pedestrian detector's Auto-learning Method under special scenes based on the enhancing of depth network
CN108549852B (en) * 2018-03-28 2020-09-08 中山大学 Specific scene downlink person detector automatic learning method based on deep network enhancement
CN108520229B (en) * 2018-04-04 2020-08-07 北京旷视科技有限公司 Image detection method, image detection device, electronic equipment and computer readable medium
CN108520229A (en) * 2018-04-04 2018-09-11 北京旷视科技有限公司 Image detecting method, device, electronic equipment and computer-readable medium
CN108545021A (en) * 2018-04-17 2018-09-18 济南浪潮高新科技投资发展有限公司 A kind of auxiliary driving method and system of identification special objective
CN108898047A (en) * 2018-04-27 2018-11-27 中国科学院自动化研究所 The pedestrian detection method and system of perception are blocked based on piecemeal
CN108898047B (en) * 2018-04-27 2021-03-19 中国科学院自动化研究所 Pedestrian detection method and system based on blocking and shielding perception
CN108681718A (en) * 2018-05-20 2018-10-19 北京工业大学 A kind of accurate detection recognition method of unmanned plane low target
CN108681718B (en) * 2018-05-20 2021-08-06 北京工业大学 Unmanned aerial vehicle low-altitude target accurate detection and identification method
CN109101908A (en) * 2018-07-27 2018-12-28 北京工业大学 Driving procedure area-of-interest detection method and device
CN109377474B (en) * 2018-09-17 2021-06-15 苏州大学 Macular positioning method based on improved Faster R-CNN
CN109377474A (en) * 2018-09-17 2019-02-22 苏州大学 A kind of macula lutea localization method based on improvement Faster R-CNN
CN113168705A (en) * 2018-10-12 2021-07-23 诺基亚技术有限公司 Method and apparatus for context-embedded and region-based object detection
CN109697257A (en) * 2018-12-18 2019-04-30 天罡网(北京)安全科技有限公司 It is a kind of based on the network information retrieval method presorted with feature learning anti-noise
CN109583517A (en) * 2018-12-26 2019-04-05 华东交通大学 A kind of full convolution example semantic partitioning algorithm of the enhancing suitable for small target deteection
CN109886286A (en) * 2019-01-03 2019-06-14 武汉精测电子集团股份有限公司 Object detection method, target detection model and system based on cascade detectors
CN110020658A (en) * 2019-03-28 2019-07-16 大连理工大学 A kind of well-marked target detection method based on multitask deep learning
CN110135243A (en) * 2019-04-02 2019-08-16 上海交通大学 A kind of pedestrian detection method and system based on two-stage attention mechanism
CN110135243B (en) * 2019-04-02 2021-03-19 上海交通大学 Pedestrian detection method and system based on two-stage attention mechanism
CN110263712B (en) * 2019-06-20 2021-02-23 江南大学 Coarse and fine pedestrian detection method based on region candidates
CN110263712A (en) * 2019-06-20 2019-09-20 江南大学 A kind of coarse-fine pedestrian detection method based on region candidate
CN110334622A (en) * 2019-06-24 2019-10-15 电子科技大学 Based on the pyramidal pedestrian retrieval method of self-adaptive features
CN110334622B (en) * 2019-06-24 2022-04-19 电子科技大学 Pedestrian retrieval method based on adaptive feature pyramid
CN110826392A (en) * 2019-09-17 2020-02-21 安徽大学 Cross-modal pedestrian detection method combined with context information
CN110826392B (en) * 2019-09-17 2023-03-10 安徽大学 Cross-modal pedestrian detection method combined with context information
CN110751181A (en) * 2019-09-23 2020-02-04 华中科技大学 Target identification method based on sum pooling characteristics
CN110929668A (en) * 2019-11-29 2020-03-27 珠海大横琴科技发展有限公司 Commodity detection method and device based on unmanned goods shelf
CN111401418A (en) * 2020-03-05 2020-07-10 浙江理工大学桐乡研究院有限公司 Employee dressing specification detection method based on improved Faster r-cnn

Similar Documents

Publication Publication Date Title
CN107463892A (en) Pedestrian detection method in a kind of image of combination contextual information and multi-stage characteristics
CN111259850B (en) Pedestrian re-identification method integrating random batch mask and multi-scale representation learning
Younis et al. Real-time object detection using pre-trained deep learning models MobileNet-SSD
CN110348376B (en) Pedestrian real-time detection method based on neural network
CN108304873A (en) Object detection method based on high-resolution optical satellite remote-sensing image and its system
CN108537824B (en) Feature map enhanced network structure optimization method based on alternating deconvolution and convolution
CN103886308B (en) A kind of pedestrian detection method of use converging channels feature and soft cascade grader
CN112446388A (en) Multi-category vegetable seedling identification method and system based on lightweight two-stage detection model
CN109711262B (en) Intelligent excavator pedestrian detection method based on deep convolutional neural network
CN111680655A (en) Video target detection method for aerial images of unmanned aerial vehicle
CN109902806A (en) Method is determined based on the noise image object boundary frame of convolutional neural networks
CN107688808A (en) A kind of quickly natural scene Method for text detection
CN108154102A (en) A kind of traffic sign recognition method
CN110197152A (en) A kind of road target recognition methods for automated driving system
CN103854016A (en) Human body behavior classification and identification method and system based on directional common occurrence characteristics
CN107315990A (en) A kind of pedestrian detection algorithm based on XCS LBP features and cascade AKSVM
CN111368775A (en) Complex scene dense target detection method based on local context sensing
US20240161315A1 (en) Accurate and robust visual object tracking approach for quadrupedal robots based on siamese network
Wei et al. Traffic sign detection and recognition using novel center-point estimation and local features
CN112329861A (en) Layered feature fusion method for multi-target detection of mobile robot
Ishioka et al. Single camera worker detection, tracking and action recognition in construction site
CN110188811A (en) Underwater target detection method based on normed Gradient Features and convolutional neural networks
Cao et al. Foreign object debris detection on airfield pavement using region based convolution neural network
Kheder et al. Transfer learning based traffic light detection and recognition using CNN inception-V3 model
Yang et al. Real-time pedestrian detection for autonomous driving

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20171212
