CN110472542A - Infrared image pedestrian detection method and detection system based on deep learning - Google Patents


Info

Publication number
CN110472542A
Authority
CN
China
Prior art keywords
network
infrared image
detection
fidn
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910716970.4A
Other languages
Chinese (zh)
Inventor
孙立坤
林保均
王忠荣
焦玉海
吕建峰
时文忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Beidou Communications Technology Co Ltd
Original Assignee
Shenzhen Beidou Communications Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Beidou Communications Technology Co Ltd
Priority to CN201910716970.4A
Publication of CN110472542A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Abstract

The present invention provides an infrared image pedestrian detection method and detection system based on deep learning, and belongs to the technical field of computer vision. The infrared image pedestrian detection method comprises the following steps: acquiring and preprocessing data; constructing the FIDN target detection network based on a convolutional neural network; training the model; and predicting with the optimal model. The present invention also provides a detection system that implements the infrared image pedestrian detection method. The beneficial effects of the invention are that it maintains high accuracy while meeting real-time requirements, and that it is highly robust.

Description

Infrared image pedestrian detection method and detection system based on deep learning
Technical field
The present invention relates to an image detection method, and more particularly to an infrared image pedestrian detection method and detection system based on deep learning.
Background art
Target detection is an important topic in computer vision. Its main task is to locate the targets of interest in an image: the specific category of each target must be judged accurately and a bounding box given for each target. Factors such as viewing angle, occlusion, and pose deform the appearance of targets, which makes target detection a challenging task.
Conventional target detection methods are broadly divided into six steps: preprocessing, sliding-window search, feature extraction, feature selection, feature classification, and post-processing. They generally rely on well-designed hand-crafted features followed by a classifier. As the requirements on detection accuracy and speed keep rising, conventional methods can no longer meet demand. In recent years deep learning has been widely adopted and has produced a series of target detection algorithms, such as RCNN, Fast-RCNN, Faster-RCNN, YOLO, SSD, and their derivatives, but these techniques cannot be applied well in commercial products, either because their accuracy is low or because detection takes too long. Current target detection algorithms therefore struggle to meet the needs of practical applications. In the research community, most researchers focus only on detection accuracy (measured by mAP, Mean Average Precision): they design very complex networks, add complicated methods and training tricks, and obtain good results on public datasets, but such approaches are difficult to apply directly in practice.
Infrared imaging obtains images through the thermal imaging capability of an infrared sensor and depends only on the temperature of an object and the heat it radiates. Therefore, when illumination is insufficient, for example at night, in rain, or in haze, infrared images have a clear advantage over visible-light images. Human targets, as the main and most active element in a scene, have long been a research focus in target tracking and detection. Because human targets are non-rigid, and because of the inherent shortcomings of infrared images, pedestrian detection based on infrared images is difficult and challenging.
Summary of the invention
To solve the problems of the prior art, the present invention provides an infrared image pedestrian detection method and detection system based on deep learning that maintain high accuracy while meeting real-time requirements.
The infrared image pedestrian detection method based on deep learning of the present invention comprises the following steps:
Step S1: data acquisition and preprocessing: acquire infrared images containing pedestrians, preprocess the infrared images, manually annotate the preprocessed images, and then divide them into a training set and a validation set for the detection model according to a set ratio;
Step S2: construct the FIDN target detection network based on a convolutional neural network: the FIDN target detection network comprises several convolutional layers and max pooling layers, followed by dilated convolutional layers; as convolutional layers are stacked, once the number of channels reaches a set value, the number of channels of the dilated convolutional layers no longer increases;
Step S3: model training: train the FIDN target detection network on the training set, and select the optimal model, namely the one that performs best on the validation set;
Step S4: optimal model prediction: run prediction with the optimal model on a GPU server to perform target detection on video streams.
As a further improvement of the present invention, in step S2 the FIDN target detection network further comprises an adaptive feature map channel weighting module, which is placed at the output of the dilated convolutional layers and weights the channels of the feature map output by the dilated convolutional layers.
As a further improvement of the present invention, the adaptive feature map channel weighting module operates as follows:
A1: compress the feature map to 1*1*C with a global pooling layer, where C is the number of channels of the feature map;
A2: compress the number of channels to C/16 with a fully connected layer;
A3: apply a ReLU activation function, then restore the number of channels to C with a fully connected layer;
A4: pass the output through a sigmoid activation layer to obtain a 1*1*C weight vector; after the sigmoid function, each weight in the vector lies between 0 and 1;
A5: weight the feature map along the channel dimension using these weights.
As a further improvement of the present invention, in step S1 the preprocessing includes median filtering, given by the following formula:
$$g(x, y) = \operatorname{median}\{\, f(x-k,\ y-l),\ (k, l) \in W \,\}$$
where f(x, y) and g(x, y) are the original image and the processed image, respectively, and W is a two-dimensional template.
As a further improvement of the present invention, manual annotation means using an annotation tool to enclose every pedestrian in each picture in a rectangular box, which is the minimum bounding rectangle of the target pedestrian, and generating a corresponding XML file. The XML file records the coordinates of each target in the picture, namely the top-left x coordinate, the top-left y coordinate, the width w, and the height h. Pictures that are blurred or difficult to annotate are deleted. The resulting data are then mixed together and divided into a training set and a validation set for the detection model at a ratio of 9:1.
As a further improvement of the present invention, in step S2 the FIDN target detection network is a fully convolutional network composed of 7 layers of 1*1 or 3*3 convolutions. The candidate boxes are generated directly on the original image, as follows:
B1: divide the original image directly into S*S regions, where S is the size of the feature map of the last convolution;
B2: generate several candidate boxes with different aspect ratios in each region; the specific aspect ratios are obtained by applying the k-means algorithm to the rectangular boxes annotated in the dataset;
B3: compute the size distribution of the prior candidate boxes from the actual dataset, using (1-IoU) as the distance metric, where IoU is the intersection over union of the areas of a prior candidate box and an annotated rectangular box, calculated as:
$$\mathrm{IoU}(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
where A denotes the prior candidate box, B denotes the annotated rectangular box, ∩ denotes the intersection of A and B, and ∪ denotes the union of A and B.
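Steps B1 and B2 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the 416-pixel input with a 13*13 grid (values taken from the embodiment), the choice to center candidate boxes on each region, and the two anchor sizes are all assumptions made for the example.

```python
def cell_of(cx, cy, image_size=416, grid=13):
    """Return (row, col) of the S*S region that contains point (cx, cy)."""
    stride = image_size / grid                  # 32 pixels per region here
    col = min(int(cx // stride), grid - 1)      # clamp points on the far edge
    row = min(int(cy // stride), grid - 1)
    return row, col


def candidate_boxes(row, col, anchors, image_size=416, grid=13):
    """Center one (x, y, w, h) box (top-left format) per anchor on a region."""
    stride = image_size / grid
    cx, cy = (col + 0.5) * stride, (row + 0.5) * stride
    return [(cx - w / 2, cy - h / 2, w, h) for (w, h) in anchors]


anchors = [(20, 60), (40, 110)]                 # hypothetical (w, h) priors
row, col = cell_of(208, 100)
boxes = candidate_boxes(row, col, anchors)
```

In practice the anchor list would come from the k-means clustering of step B2 rather than being written by hand.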
As a further improvement of the present invention, the FIDN target detection network uses a lightweight convolutional neural network as its backbone and, following the target detection algorithm, predicts with a single 1*1 convolution. The localization loss function of the target detection algorithm is:
[formula not reproduced in the source text]
where λ is a coefficient controlling the share of the localization loss in the total loss, S is the size of the feature map of the last convolution, A is the number of anchor boxes generated in each region, and the indicator term is a 0-1 function whose value is 1 if there is a target in the region at row i and column j and 0 otherwise. x, y, h, and w denote the coordinates of the center point and the height and width of the prediction box, respectively; symbols marked with ^ denote true values and those without ^ denote predicted values.
As a further improvement of the present invention, in step S3 the model is trained from scratch: the weight parameters are randomly initialized, the data are augmented by horizontal flipping, random cropping, and color jittering, and the FIDN target detection network is trained while hyperparameters such as the learning rate, batch size, and optimization method are adjusted.
As a further improvement of the present invention, in step S4 the prediction method is as follows: build the forward inference process of the network, whose input is image data and whose return value is the prediction result; when performing target detection on video, a Kalman filter is added for tracking.
The present invention also provides a detection system that implements the infrared image pedestrian detection method, comprising:
Data acquisition module: for acquiring infrared images containing pedestrians;
Data preprocessing module: for preprocessing the infrared images, manually annotating the preprocessed images, and then dividing them into a training set and a validation set for the detection model according to a set ratio;
FIDN target detection network construction module: for constructing the FIDN target detection network based on a convolutional neural network, the FIDN target detection network comprising several convolutional layers and max pooling layers, followed by dilated convolutional layers; as convolutional layers are stacked, once the number of channels reaches a set value, the number of channels of the dilated convolutional layers no longer increases;
Model training module: for training the FIDN target detection network on the training set and selecting the optimal model, namely the one that performs best on the validation set;
Optimal model prediction module: for running prediction with the optimal model on a GPU server to perform target detection on video streams.
Compared with the prior art, the beneficial effects of the present invention are as follows: it takes full advantage of the high accuracy of deep learning, is robust, and adapts to changes in the external environment. The purpose-built FIDN network combines high accuracy with a very low computational cost; it reaches 180 fps on a GPU and about 18 fps on a CPU, which meets real-time requirements and makes it highly practical.
Brief description of the drawings
Fig. 1 is a flowchart of the method of the present invention;
Fig. 2 is a schematic diagram of the structure of the FIDN target detection network;
Fig. 3 is a flowchart of the processing method of the feature map channel weighting module;
Fig. 4 is an original infrared image;
Fig. 5 is the image after detection.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawings and embodiments.
As shown in Fig. 1, the method of the present invention builds the FIDN (Fast-Infared-Detect-Network, fast infrared target detection) deep neural network and comprises the following steps:
Step S1: data acquisition and preprocessing: acquire infrared images containing pedestrians, preprocess the infrared images, manually annotate the preprocessed images, and then divide them into a training set and a validation set for the detection model according to a set ratio.
After a large number of pictures containing pedestrians have been acquired, some preprocessing is needed because the image quality of infrared images is usually poor. The preprocessed infrared images are then manually annotated; each annotation has two parts, the target category and the target bounding box.
Step S2: construct the FIDN target detection network (FIDN network for short) based on a convolutional neural network: the FIDN target detection network comprises several convolutional layers and max pooling layers, followed by dilated convolutional layers; as convolutional layers are stacked, once the number of channels reaches a set value, the number of channels of the dilated convolutional layers no longer increases;
Step S3: model training: train the FIDN target detection network on the training set, and select the optimal model, namely the one that performs best on the validation set;
Step S4: optimal model prediction: run prediction with the optimal model on a GPU server to perform target detection on video streams. On a GPU it can reach 180 fps or more (fps is the real-time video detection speed, that is, the number of frames detected per second); the specific prediction flow is shown in Fig. 3.
In step S1, the preprocessing includes median filtering. Owing to the external environment and the imaging principle of infrared cameras, the infrared imaging process produces considerable noise, which degrades image quality and sharpness and makes pedestrian detection and recognition harder. The images are therefore preprocessed first to filter out noise. The median filtering formula is:
$$g(x, y) = \operatorname{median}\{\, f(x-k,\ y-l),\ (k, l) \in W \,\}$$
where f(x, y) and g(x, y) are the original image and the processed image, respectively, W is a two-dimensional template, and k and l are the offsets along the two dimensions of W.
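The median filtering formula can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the function name `median_filter`, the square 3*3 template, and the border policy (using only the neighbors that fall inside the image) are choices made here, since the text does not specify the template size or the border handling.

```python
from statistics import median


def median_filter(image, k=3):
    """Apply a k*k median filter to a 2-D grayscale image (list of rows).

    Border pixels use only the neighbors that lie inside the image;
    this border policy is an assumption, not taken from the source.
    """
    height, width = len(image), len(image[0])
    r = k // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[y + dy][x + dx]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
                if 0 <= y + dy < height and 0 <= x + dx < width
            ]
            out[y][x] = median(window)
    return out


# A single hot pixel (impulse noise) surrounded by a uniform background.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
filtered = median_filter(noisy)
```

Because the median ignores isolated outliers, the hot pixel in the example is replaced by the background value, which is why this filter suits impulse-like noise.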
In this example, manual annotation means using an annotation tool to enclose every pedestrian in each picture in a rectangular box, which is the minimum bounding rectangle of the target pedestrian, and generating a corresponding XML file. The XML file records the coordinates of each target in the picture, namely the top-left x coordinate, the top-left y coordinate, the width w, and the height h. Pictures that are blurred or difficult to annotate are deleted. The resulting data are then mixed together and divided at a ratio of 9:1 into a training set and a validation set for the detection model. The training set is used for model training; the validation set takes no part in training and is used to verify the training effect of the model.
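The 9:1 division can be sketched as below. The function name, the fixed random seed, and the file names are hypothetical; the text only states that the data are mixed and divided at a ratio of 9:1.

```python
import random


def split_dataset(samples, train_ratio=0.9, seed=0):
    """Shuffle the samples and split them into (training set, validation set)."""
    items = list(samples)
    random.Random(seed).shuffle(items)      # seeded so the split is repeatable
    cut = round(len(items) * train_ratio)
    return items[:cut], items[cut:]


# Hypothetical annotation file names, one XML file per picture.
annotation_files = [f"img_{i:04d}.xml" for i in range(100)]
train_set, val_set = split_dataset(annotation_files)
```

Fixing the seed keeps the validation set stable across runs, so that models trained with different hyperparameters are compared on the same pictures.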
In step S2, the FIDN network is a fully convolutional network composed of 7 layers of 1*1 or 3*3 convolutions. The method as a whole is a single-stage detector and does not generate candidate boxes inside the network; they are generated directly on the original image. The original image is divided directly into S*S parts (where S is the size of the feature map of the last convolution, usually 13*13 for a 416*416 original image), and 5 candidate boxes with different aspect ratios are generated in each region; the specific aspect ratios are obtained by applying the k-means algorithm to the annotated boxes of the dataset. The size distribution of the anchors (prior candidate boxes) is computed from the actual dataset by the k-means algorithm, using (1-IoU) as the distance metric, where IoU is the intersection over union of the areas of a prior candidate box and an annotated box, calculated as:
$$\mathrm{IoU}(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
where A denotes the prior candidate box, B denotes the annotated rectangular box, ∩ denotes the intersection of A and B, and ∪ denotes the union of A and B.
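The anchor clustering with the (1-IoU) distance can be sketched as follows. This is an illustration under assumptions, not the patented procedure: boxes are reduced to (w, h) pairs and compared as if they shared one corner (a common convention for anchor clustering), centroids are updated with the arithmetic mean, and the function names and sample boxes are hypothetical.

```python
import random


def iou_wh(box, centroid):
    """IoU of two (w, h) boxes aligned at a shared corner (an assumption)."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=50, seed=0):
    """Cluster (w, h) boxes into k anchors using d = 1 - IoU as the distance."""
    centroids = random.Random(seed).sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            nearest = min(range(k), key=lambda i: 1 - iou_wh(b, centroids[i]))
            clusters[nearest].append(b)
        updated = [
            (sum(b[0] for b in c) / len(c), sum(b[1] for b in c) / len(c))
            if c else centroids[i]          # keep the centroid of an empty cluster
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:            # assignments have stabilized
            break
        centroids = updated
    return sorted(centroids)


# Hypothetical annotated box sizes: three small and three large pedestrians.
labelled = [(10, 20), (11, 21), (12, 22), (50, 100), (52, 104), (54, 108)]
anchors = kmeans_anchors(labelled, k=2)
```

Using 1-IoU rather than Euclidean distance keeps large boxes from dominating the clustering, since IoU is scale-invariant. The embodiment would call this with k=5.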
In Fig. 2, conv denotes a convolutional layer, Dilated conv denotes a dilated convolution, maxpool denotes max pooling, and the prediction part is a single 1*1 convolution. The FIDN target detection network of this example comprises 5 convolutional layers with max pooling layers, followed by 2 dilated convolutional layers. As convolutional layers are stacked, once the number of channels reaches the set value of 256, the number of channels of the dilated convolutional layers stays at 256 and no longer increases.
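The saving from capping the channel count can be checked with simple arithmetic. The sketch below counts the weights of one 3*3 convolution (biases ignored); comparing 256 with 1024 channels is an illustrative assumption based on the FIDN-1024 variant mentioned in the description, not a per-layer specification of the network.

```python
def conv_weights(c_in, c_out, k=3):
    """Number of weights in a k*k convolution, ignoring biases."""
    return k * k * c_in * c_out


capped = conv_weights(256, 256)      # 589,824 weights
wide = conv_weights(1024, 1024)      # 9,437,184 weights
ratio = wide // capped               # 16
```

The multiply-accumulate count per output position scales the same way, so on a feature map of fixed size the wider layer costs 16 times as much.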
The last two convolutional layers use dilated convolution. The main advantage of dilated convolution is that it enlarges the receptive field without pooling or downsampling, so that each convolution output covers a larger range of information while the spatial information of the larger feature map and of the image is retained as far as possible, which is critical for small target detection. For target detection problems, dilated convolution therefore preserves spatial information to a large extent. However, because the feature map is not reduced when dilated convolution is used, the amount of computation increases greatly. Unlike common network structures, the FIDN network therefore sets the number of channels of all convolutions in the last module to 256. Because the number of layers has been compressed, an adaptive feature map channel weighting module is attached after this convolution layer; it is placed at the output of the dilated convolutional layers and weights the channels of the feature map they output.
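How a dilated convolution widens the receptive field without shrinking the output can be shown with a one-dimensional sketch. The dilation rate of 2, the zero padding, and the function names are assumptions for illustration; the text does not state the dilation rate used in FIDN.

```python
def effective_kernel(k, dilation):
    """Span covered by a k-tap kernel whose taps are `dilation` apart."""
    return k + (k - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Zero-padded 1-D dilated convolution; output length equals input length.

    Assumes an odd kernel length so that the padding is symmetric.
    """
    pad = effective_kernel(len(kernel), dilation) // 2
    padded = [0] * pad + list(signal) + [0] * pad
    return [
        sum(kernel[i] * padded[start + i * dilation] for i in range(len(kernel)))
        for start in range(len(signal))
    ]


span = effective_kernel(3, 2)                       # 3 taps cover 5 positions
out = dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], dilation=2)
```

With three taps and a dilation of 2, each output sees five input positions while the output keeps the input length, which is the property the paragraph relies on for small targets.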
As shown in Fig. 3, the adaptive feature map channel weighting module operates as follows:
A1: compress the feature map to 1*1*C with a global pooling layer, where C is the number of channels of the feature map, here 256;
A2: compress the number of channels to C/16 with a fully connected layer;
A3: apply a ReLU activation function, then restore the number of channels to C with a fully connected layer;
A4: pass the output through a sigmoid activation layer, which yields a 1*1*C weight vector; after the sigmoid function, each weight lies between 0 and 1 and serves as the channel weight for the output feature map of the preceding convolutional layer. This lets the network learn the channel weights itself, because different channels of the feature map play different roles and therefore have different degrees of importance;
A5: weight the feature map along the channel dimension using these weights.
In Fig. 3, conv denotes a convolutional layer, avgpool denotes an average pooling layer, fc denotes a fully connected layer, ReLU denotes the relu activation function, and Sigmoid denotes the sigmoid activation layer. ReWeight denotes weighting the feature map along the channel dimension with the weights obtained from the right-hand branch.
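Steps A1 to A5 can be traced on a toy feature map in plain Python. This is a sketch under assumptions: it uses C=4 with a reduction to 2 channels instead of the 256 and C/16 of the embodiment, the weight matrices are hand-written rather than learned, global average pooling is assumed for A1 (Fig. 3 labels the layer avgpool), and all names are hypothetical.

```python
import math


def channel_reweight(feature_map, w1, b1, w2, b2):
    """feature_map: C channels, each an H*W list of rows.
    w1 is (C/r)*C and w2 is C*(C/r); returns (weights, reweighted map)."""
    # A1: global average pooling gives one value per channel (1*1*C).
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]
    # A2: fully connected layer down to C/r channels, then the ReLU of A3.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)) + b)
              for row, b in zip(w1, b1)]
    # A3: fully connected layer back up to C channels.
    restored = [sum(w * h for w, h in zip(row, hidden)) + b
                for row, b in zip(w2, b2)]
    # A4: sigmoid maps every channel weight into (0, 1).
    weights = [1.0 / (1.0 + math.exp(-z)) for z in restored]
    # A5: scale every value of a channel by that channel's weight.
    scaled = [[[v * wt for v in row] for row in ch]
              for ch, wt in zip(feature_map, weights)]
    return weights, scaled


feature_map = [[[float(c + 1)] * 2 for _ in range(2)] for c in range(4)]  # C=4
w1 = [[0.1, 0.2, 0.3, 0.4], [-0.4, -0.3, -0.2, -0.1]]      # 2 x 4, illustrative
w2 = [[0.5, -0.5], [0.25, 0.0], [0.0, 0.0], [-1.0, 1.0]]   # 4 x 2, illustrative
weights, scaled = channel_reweight(feature_map, w1, [0.0] * 2, w2, [0.0] * 4)
```

In a trained network `w1`, `b1`, `w2`, and `b2` are learned together with the convolutions, so the weights reflect how useful each channel is for the input at hand.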
Experiments show that, on a self-built dataset, the detection accuracy is 80.1% when the number of channels of this convolutional layer is 256 (denoted the FIDN-256 network) and 80.6% when it is 1024 (denoted the FIDN-1024 network). The overall FIDN network structure is shown in Fig. 2. The whole network uses a lightweight convolutional neural network as its backbone. The detection part is similar to that of most common one-stage target detection algorithms, which predict with a fully connected layer; FIDN instead predicts with a single 1*1 convolution. This example improves the loss function of the network. In target detection algorithms, the loss function generally has two parts, the localization loss and the classification loss. For the localization loss, detection boxes of different sizes affect the loss differently, so this example uses the following localization loss function:
[formula not reproduced in the source text]
where λ is a coefficient controlling the share of the localization loss in the total loss, with a default of 5; because the localization loss is more important than the classification loss, it is given a larger share. S is the size of the feature map of the last convolution, A is the number of anchor boxes generated in each region, with a default of 5, and the indicator term is a 0-1 function whose value is 1 if there is a target in the region at row i and column j and 0 otherwise. x, y, h, and w denote the coordinates of the center point and the height and width of the prediction box, respectively; symbols marked with ^ denote true values and those without ^ denote predicted values.
In step S3, the model is trained from scratch. Because the network is small, training from scratch is fast and carries no risk of over-fitting, so the network is trained directly on the dataset from step S1. All weight parameters are randomly initialized; the data are augmented by horizontal flipping, random cropping, color jittering, and similar operations; and the FIDN network is trained while hyperparameters such as the learning rate, batch size (batch_size), and optimization method are adjusted.
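For detection data, a horizontal flip has to move the annotated boxes together with the pixels. The sketch below shows this for the (x, y, w, h) top-left box format used in the annotations; the function name and the assumption that a box spans pixel columns x to x+w are choices made here for illustration.

```python
def hflip(image, boxes):
    """Flip an image (list of pixel rows) left to right and mirror its boxes.

    Boxes are (x, y, w, h) with (x, y) the top-left corner, as in the
    annotation format; a box is assumed to span columns x to x + w.
    """
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    mirrored = [(width - x - w, y, w, h) for (x, y, w, h) in boxes]
    return flipped, mirrored


image = [[1, 2, 3, 4],
         [5, 6, 7, 8]]
flipped, mirrored = hflip(image, [(0, 0, 1, 2)])
```

Random cropping needs the same care: boxes must be shifted by the crop offset and clipped or dropped when they fall outside the crop.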
The optimal model is selected as follows: during training, a model is saved after every epoch (one epoch means that every picture in the dataset has been trained on once); training typically runs for 60 epochs. The saved models are tested on the validation set, and the optimal model is selected according to the pedestrian detection accuracy, mAP.
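The selection rule amounts to taking the saved epoch with the highest validation mAP. A minimal sketch, in which the function name and the tie-breaking rule are assumptions and the mAP values are made up purely for illustration (they are not results from the source):

```python
def select_best(val_map_per_epoch):
    """Return (epoch, mAP) of the best saved model; epochs are numbered from 1.

    On ties the earliest epoch wins, a choice made here for determinism.
    """
    best = max(range(len(val_map_per_epoch)), key=lambda e: val_map_per_epoch[e])
    return best + 1, val_map_per_epoch[best]


val_map = [0.41, 0.63, 0.72, 0.70]   # made-up validation mAP, one per epoch
epoch, score = select_best(val_map)
```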
In step S4, the prediction method is as follows: build the forward inference process of the network. The network structure used for forward inference is the same as during training, except that the loss is neither computed nor back-propagated. The input is image data and the return value is the prediction result. The input image undergoes simple preprocessing and is then passed to the network input; the network adapts to pictures of any size by scaling them automatically internally. Some post-processing can also be applied: when performing target detection on video, a Kalman filter is added for tracking, which makes the detection process smoother and more stable. The result of applying the target detection method of the present invention to Fig. 4 is shown in Fig. 5.
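The smoothing effect of a Kalman filter can be sketched for a single coordinate of a detected box, for example the x coordinate of its center. The source does not say which quantities are tracked or how the filter is tuned, so the constant-velocity model, the noise values `q` and `r`, and the class name are all assumptions; matching detections to tracks across frames is not covered.

```python
class Kalman1D:
    """Constant-velocity Kalman filter for one coordinate (position, velocity)."""

    def __init__(self, pos, q=1e-2, r=1.0):
        self.x = [pos, 0.0]                     # state estimate
        self.P = [[1.0, 0.0], [0.0, 1.0]]       # state covariance
        self.q, self.r = q, r                   # process / measurement noise

    def predict(self, dt=1.0):
        p, v = self.x
        self.x = [p + dt * v, v]
        (a, b), (c, d) = self.P
        # P = F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.x[0]

    def update(self, z):
        (a, b), (c, d) = self.P
        s = a + self.r                          # innovation covariance, H = [1, 0]
        k0, k1 = a / s, c / s                   # Kalman gain
        resid = z - self.x[0]
        self.x = [self.x[0] + k0 * resid, self.x[1] + k1 * resid]
        # P = (I - K H) P
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]
        return self.x[0]


kf = Kalman1D(100.0)
smoothed = []
for detected_x in [100.0, 101.0, 130.0, 103.0, 104.0]:  # 130 is a jittery detection
    kf.predict()
    smoothed.append(kf.update(detected_x))
```

Each estimate blends the prediction with the new detection, so an outlier such as the third measurement moves the track only part of the way, which is the smoothing described in the paragraph above.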
The infrared pedestrian detection method based on deep learning of the present invention takes full advantage of the high accuracy of deep learning, is robust, and adapts to changes in the external environment. The purpose-built FIDN network combines high accuracy with a very low computational cost; it reaches 180 fps on a GPU and about 18 fps on a CPU, which meets real-time requirements and makes it highly practical.
The present invention has the following two main innovations:
(1) Design of a new target detection network, FIDN. The method proposes a new, efficient target detection network for infrared image pedestrian detection. It is a single-stage target detection method: the distribution of the prior candidate boxes of the dataset is obtained with the k-means method, and the target boxes are then located by regression. The whole network has only 7 convolutional layers (not counting the channel weighting part), consisting of some convolutional layers and max pooling layers followed by dilated convolutions, which do not reduce the size of the feature map yet provide a sufficient receptive field and help improve pedestrian detection accuracy. As convolutional layers are stacked, the number of channels is not doubled continually as in conventional networks; once it reaches 256 it no longer increases, which greatly reduces the amount of computation.
(2) Design of an adaptive feature map channel weighting method. Because the network design does not double the number of channels as is normal practice, the number of feature map channels is reduced, which can affect performance to some extent. The present invention therefore provides an adaptive feature map channel weighting method. A feature map usually has many channels, several hundred or even thousands, but the information they provide and their importance differ. The method lets the network itself learn a set of weighting parameters, which are then merged into the feature map. The method is also fairly general: it can be added to many networks, and channel weighting can be freely added after selected convolutional layers.
The specific embodiments described above are preferred embodiments of the present invention and do not limit its scope of implementation. The scope of the present invention includes, but is not limited to, these embodiments, and all equivalent changes made in accordance with the present invention fall within its scope of protection.

Claims (10)

1. An infrared image pedestrian detection method based on deep learning, characterized in that the infrared image pedestrian detection method comprises the following steps:
Step S1: data acquisition and preprocessing: acquire infrared images containing pedestrians, preprocess the infrared images, manually annotate the preprocessed images, and then divide them into a training set and a validation set for the detection model according to a set ratio;
Step S2: construct the FIDN target detection network based on a convolutional neural network: the FIDN target detection network comprises several convolutional layers and max pooling layers, followed by dilated convolutional layers; as convolutional layers are stacked, once the number of channels reaches a set value, the number of channels of the dilated convolutional layers no longer increases;
Step S3: model training: train the FIDN target detection network on the training set, and select the optimal model, namely the one that performs best on the validation set;
Step S4: optimal model prediction: run prediction with the optimal model on a GPU server to perform target detection on video streams.
2. The infrared image pedestrian detection method according to claim 1, characterized in that: in step S2, the FIDN target detection network further comprises an adaptive feature map channel weighting module, which is placed at the output of the dilated convolutional layers and weights the channels of the feature map output by the dilated convolutional layers.
3. The infrared image pedestrian detection method according to claim 2, characterized in that the adaptive feature map channel weighting module operates as follows:
A1: compress the feature map to 1*1*C with a global pooling layer, where C is the number of channels of the feature map;
A2: compress the number of channels to C/16 with a fully connected layer;
A3: apply a ReLU activation function, then restore the number of channels to C with a fully connected layer;
A4: pass the output through a sigmoid activation layer to obtain a 1*1*C weight vector; after the sigmoid function, each weight in the vector lies between 0 and 1;
A5: weight the feature map along the channel dimension using these weights.
4. The infrared image pedestrian detection method according to any one of claims 1-3, characterized in that: in step S1, the preprocessing includes median filtering, with the following median filter formula:
$$g(x, y) = \operatorname{median}\{\, f(x-k,\, y-l),\ (k, l) \in W \,\}$$
Wherein f(x, y) and g(x, y) are the original image and the processed image, respectively, and W is a two-dimensional template.
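The median filter of claim 4 can be illustrated with a short sketch. It assumes a square template W of size (2r+1) x (2r+1) and, since the claim does not specify border handling, simply uses the part of the template that lies inside the image; the function name `median_filter` and the parameter `radius` are hypothetical.

```python
from statistics import median


def median_filter(image, radius=1):
    """Return g, where g(x, y) is the median of f over a square template W.

    W spans (2 * radius + 1) pixels per side. Border pixels use only the
    part of the template inside the image (an assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [image[yy][xx]
                      for yy in range(max(0, y - radius), min(height, y + radius + 1))
                      for xx in range(max(0, x - radius), min(width, x + radius + 1))]
            out[y][x] = median(window)
    return out


# A flat patch with one hot pixel and one dead pixel (impulse noise).
noisy = [
    [10, 10, 10, 10],
    [10, 255, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 0],
]
filtered = median_filter(noisy)
```

Because each template contains at most two outliers among mostly identical neighbours, both impulse pixels are replaced by the background value while the flat region is left unchanged.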
5. The infrared image pedestrian detection method according to claim 4, characterized in that: manual annotation uses an annotation tool to outline every pedestrian in each picture with a rectangular box, the box being the minimum bounding rectangle of the target pedestrian, and generates a corresponding XML file; the XML file records the coordinates of each target in the picture, comprising the top-left x coordinate, the top-left y coordinate, the width w and the height h; blurred pictures and pictures that are difficult to annotate are deleted; the resulting data are mixed and divided at a ratio of 9:1 into the training set and validation set of the detection model.
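The 9:1 division described in claim 5 could look like the sketch below. The record layout (a file name plus a list of `(x, y, w, h)` boxes), the file names, and the function `split_dataset` are all hypothetical; the claim only fixes the box fields and the ratio.

```python
import random


def split_dataset(samples, train_parts=9, val_parts=1, seed=0):
    """Shuffle the samples, then split them train_parts : val_parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(shuffled) * train_parts // (train_parts + val_parts)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical annotation records: each box is (top-left x, top-left y, w, h).
samples = [{"image": f"frame_{i:04d}.png", "boxes": [(12, 30, 20, 48)]}
           for i in range(100)]
train_set, val_set = split_dataset(samples)
```

Fixing the seed keeps the split reproducible, so models trained with different hyperparameters are compared on the same validation images.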
6. The infrared image pedestrian detection method according to claim 5, characterized in that: in step S2, the FIDN target detection network is a fully convolutional network composed of 7 layers of 1*1 or 3*3 convolutions, and the candidate boxes on the image are generated directly on the original image, as follows:
B1: the original image is divided directly into S*S regions, where S is the size of the feature map of the last convolution;
B2: several candidate boxes with different aspect ratios are generated in each region; the specific aspect ratios are obtained by applying the k-means algorithm to the rectangular boxes annotated in the dataset;
B3: the size distribution of the prior candidate boxes is computed from the actual dataset, using (1-IoU) as the distance metric, where IoU denotes the intersection-over-union of the areas of a prior candidate box and an annotated rectangular box, calculated as follows:
$$\mathrm{IoU} = \frac{\operatorname{area}(A \cap B)}{\operatorname{area}(A \cup B)}$$
Wherein A denotes the prior candidate box, B denotes the annotated rectangular box, ∩ denotes the intersection of A and B, and ∪ denotes the union of A and B.
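Steps B2 and B3 can be sketched as k-means clustering over annotated box sizes with (1-IoU) as the distance. The sketch assumes that boxes are compared by width and height only, aligned at a common corner, which is a common convention for anchor clustering but is not stated in the claim; `iou_wh`, `kmeans_anchors` and the sample boxes are hypothetical.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) boxes aligned at the same corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=50, seed=0):
    """Cluster (w, h) boxes with distance 1 - IoU and return k anchor sizes."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Assign each box to the anchor with the smallest (1 - IoU).
            nearest = min(range(k), key=lambda i: 1.0 - iou_wh(box, anchors[i]))
            clusters[nearest].append(box)
        new_anchors = []
        for i, members in enumerate(clusters):
            if members:
                new_anchors.append((sum(b[0] for b in members) / len(members),
                                    sum(b[1] for b in members) / len(members)))
            else:
                new_anchors.append(anchors[i])  # keep the anchor of an empty cluster
        if new_anchors == anchors:
            break  # assignments no longer change
        anchors = new_anchors
    return sorted(anchors)


# Hypothetical annotated box sizes: three distant and three nearby pedestrians.
boxes = [(10, 30), (12, 33), (11, 28), (40, 100), (42, 110), (38, 95)]
anchors = kmeans_anchors(boxes, k=2)
```

Using (1-IoU) instead of Euclidean distance keeps large boxes from dominating the clustering, because the distance depends on overlap rather than on absolute size.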
7. The infrared image pedestrian detection method according to claim 6, characterized in that: the FIDN target detection network uses a lightweight convolutional neural network as its backbone network; following the target detection algorithm, prediction is performed with a single 1*1 convolution; the localization loss function of the target detection algorithm is:
Wherein λ is the coefficient controlling the share of the localization loss in the total loss, S denotes the size of the feature map of the last convolution, A denotes the number of anchor boxes generated in each region, and the indicator term is a 0-1 function whose value is 1 if there is a target in the region at row i, column j, and 0 otherwise; x and y denote the coordinates of the center point, and h and w denote the height and width of the prediction box; symbols carrying ^ denote ground-truth values, and symbols without ^ denote predicted values.
8. The infrared image pedestrian detection method according to any one of claims 1-3, characterized in that: in step S3, the model training refers to training from scratch, with the weight parameters randomly initialized; data augmentation is applied to the data through horizontal flipping, random cropping and color jitter; and the FIDN target detection network is trained by continually adjusting hyperparameters such as the learning rate, batch size and optimization method.
9. The infrared image pedestrian detection method according to claim 8, characterized in that: in step S4, the prediction method is: build the forward inference procedure of the network, whose input parameter is image data and whose return value is the prediction result; when performing target detection on video, a Kalman filter is added for tracking.
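The claim does not describe the Kalman filter's state model, so the following is only a sketch under stated assumptions: a constant-velocity model tracking one coordinate of a detected box center, with hypothetical noise variances and a hypothetical class name `KalmanTrack1D`. A full tracker would run one such filter per box coordinate, or a single filter with a larger state vector.

```python
class KalmanTrack1D:
    """Constant-velocity Kalman filter for one coordinate of a box center."""

    def __init__(self, position, process_var=1e-2, measurement_var=1.0):
        self.x = [position, 0.0]             # state: [position, velocity]
        self.P = [[1.0, 0.0], [0.0, 1.0]]    # state covariance
        self.q = process_var                 # assumed process noise variance
        self.r = measurement_var             # assumed detection noise variance

    def predict(self, dt=1.0):
        p, v = self.x
        self.x = [p + dt * v, v]
        (a, b), (c, d) = self.P
        # P <- F P F^T + Q, with F = [[1, dt], [0, 1]] and Q = q * I.
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.x[0]

    def update(self, measured):
        (a, b), (c, d) = self.P
        s = a + self.r                       # innovation variance, H = [1, 0]
        k0, k1 = a / s, c / s                # Kalman gain
        residual = measured - self.x[0]
        self.x = [self.x[0] + k0 * residual, self.x[1] + k1 * residual]
        # P <- (I - K H) P
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]
        return self.x[0]


track = KalmanTrack1D(position=100.0)
for frame in range(1, 21):
    track.predict()
    track.update(100.0 + 2.0 * frame)   # detections moving 2 px per frame
predicted_next = track.predict()         # frame 21, no detection available
```

Calling `predict` without a following `update` is how such a tracker can bridge frames in which the detector misses the pedestrian: the filter extrapolates the position from its velocity estimate.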
10. A detection system implementing the infrared image pedestrian detection method of any one of claims 1-9, characterized by comprising:
Data acquisition module: for acquiring infrared images containing pedestrians;
Data preprocessing module: for preprocessing the infrared images, manually annotating the preprocessed infrared images, and then dividing them at a set ratio into the training set and validation set of the detection model;
FIDN target detection network construction module: for constructing the FIDN target detection network based on a convolutional neural network, the FIDN target detection network comprising several convolutional layers and max-pooling layers, and dilated convolutional layers placed after the convolutional layers and max-pooling layers; in the stack of convolutional layers, once the channel count reaches a set value, the channel count of the dilated convolutional layers is no longer increased;
Model training module: for training the FIDN target detection network on the training set, and selecting the model that performs best on the validation set as the optimal model;
Optimal-model prediction module: for running prediction on a GPU server based on the optimal model, to perform target detection on a video stream.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910716970.4A CN110472542A (en) 2019-08-05 2019-08-05 A kind of infrared image pedestrian detection method and detection system based on deep learning

Publications (1)

Publication Number Publication Date
CN110472542A (en) 2019-11-19

Family

ID=68509998

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910716970.4A Pending CN110472542A (en) 2019-08-05 2019-08-05 A kind of infrared image pedestrian detection method and detection system based on deep learning

Country Status (1)

Country Link
CN (1) CN110472542A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106096561A (en) * 2016-06-16 2016-11-09 重庆邮电大学 Infrared pedestrian detection method based on image block degree of depth learning characteristic
CN106845430A (en) * 2017-02-06 2017-06-13 东华大学 Pedestrian detection and tracking based on acceleration region convolutional neural networks
CN109086678A (en) * 2018-07-09 2018-12-25 天津大学 A kind of pedestrian detection method extracting image multi-stage characteristics based on depth supervised learning
US20190114511A1 (en) * 2017-10-16 2019-04-18 Illumina, Inc. Deep Learning-Based Techniques for Training Deep Convolutional Neural Networks
CN109902677A (en) * 2019-01-30 2019-06-18 深圳北斗通信科技有限公司 A kind of vehicle checking method based on deep learning
CN109961009A (en) * 2019-02-15 2019-07-02 平安科技(深圳)有限公司 Pedestrian detection method, system, device and storage medium based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
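Zhang Shun et al.: "Development of deep convolutional neural networks and their applications in computer vision", Chinese Journal of Computers *
Geng Lei et al.: "Retinal image vessel segmentation using a fully convolutional neural network combining depthwise separable convolution and channel weighting" *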

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111105372A (en) * 2019-12-10 2020-05-05 北京都是科技有限公司 Thermal infrared image processor, system, method and apparatus
CN111259736A (en) * 2020-01-08 2020-06-09 上海海事大学 Real-time pedestrian detection method based on deep learning in complex environment
CN111259736B (en) * 2020-01-08 2023-04-07 上海海事大学 Real-time pedestrian detection method based on deep learning in complex environment
CN112101434B (en) * 2020-09-04 2022-09-09 河南大学 Infrared image weak and small target detection method based on improved YOLO v3
CN112101434A (en) * 2020-09-04 2020-12-18 河南大学 Infrared image weak and small target detection method based on improved YOLO v3
CN112102394A (en) * 2020-09-17 2020-12-18 中国科学院海洋研究所 Remote sensing image ship size integrated extraction method based on deep learning
CN112307955A (en) * 2020-10-29 2021-02-02 广西科技大学 Optimization method based on SSD infrared image pedestrian detection
CN112733589A (en) * 2020-10-29 2021-04-30 广西科技大学 Infrared image pedestrian detection method based on deep learning
CN112488165A (en) * 2020-11-18 2021-03-12 杭州电子科技大学 Infrared pedestrian identification method and system based on deep learning model
CN112464884A (en) * 2020-12-11 2021-03-09 武汉工程大学 ADAS infrared night vision method and system
CN112949633A (en) * 2021-03-05 2021-06-11 中国科学院光电技术研究所 Improved YOLOv 3-based infrared target detection method
CN112949633B (en) * 2021-03-05 2022-10-21 中国科学院光电技术研究所 Improved YOLOv 3-based infrared target detection method
CN113159277A (en) * 2021-03-09 2021-07-23 北京大学 Target detection method, device and equipment
CN113408471A (en) * 2021-07-02 2021-09-17 浙江传媒学院 Non-green-curtain portrait real-time matting algorithm based on multitask deep learning
CN113408471B (en) * 2021-07-02 2023-03-28 浙江传媒学院 Non-green-curtain portrait real-time matting algorithm based on multitask deep learning
CN114299429A (en) * 2021-12-24 2022-04-08 宁夏广天夏电子科技有限公司 Human body recognition method, system and device based on deep learning

Similar Documents

Publication Publication Date Title
CN110472542A (en) A kind of infrared image pedestrian detection method and detection system based on deep learning
CN113065558B (en) Lightweight small target detection method combined with attention mechanism
CN105069746B (en) Video real-time face replacement method and its system based on local affine invariant and color transfer technology
CN109902677A (en) A kind of vehicle checking method based on deep learning
CN107067415B (en) A kind of object localization method based on images match
CN110889324A (en) Thermal infrared image target identification method based on YOLO V3 terminal-oriented guidance
CN109740665A (en) Shielded image ship object detection method and system based on expertise constraint
CN108460403A (en) The object detection method and system of multi-scale feature fusion in a kind of image
CN107204010A (en) A kind of monocular image depth estimation method and system
CN104794737B (en) A kind of depth information Auxiliary Particle Filter tracking
CN109934862A (en) A kind of binocular vision SLAM method that dotted line feature combines
CN107330357A (en) Vision SLAM closed loop detection methods based on deep neural network
CN110533695A (en) A kind of trajectory predictions device and method based on DS evidence theory
CN114220035A (en) Rapid pest detection method based on improved YOLO V4
CN104573731A (en) Rapid target detection method based on convolutional neural network
CN110795982A (en) Apparent sight estimation method based on human body posture analysis
CN110175504A (en) A kind of target detection and alignment schemes based on multitask concatenated convolutional network
CN106023257A (en) Target tracking method based on rotor UAV platform
CN106599994A (en) Sight line estimation method based on depth regression network
CN106991411B (en) Remote Sensing Target based on depth shape priori refines extracting method
CN110197152A (en) A kind of road target recognition methods for automated driving system
CN109887029A (en) A kind of monocular vision mileage measurement method based on color of image feature
CN109344878A (en) A kind of imitative hawk brain feature integration Small object recognition methods based on ResNet
CN110245587B (en) Optical remote sensing image target detection method based on Bayesian transfer learning
CN114036969B (en) 3D human body action recognition algorithm under multi-view condition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191119
