CN113408410A - Traffic sign detection method based on YOLOv4 algorithm - Google Patents

Traffic sign detection method based on YOLOv4 algorithm Download PDF

Info

Publication number
CN113408410A
CN113408410A
Authority
CN
China
Prior art keywords
algorithm
model
traffic sign
yolov4
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110676065.8A
Other languages
Chinese (zh)
Inventor
彭军
龚宇
李小兵
杨志
谭玉春
许可
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Science and Technology
Original Assignee
Chongqing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Science and Technology filed Critical Chongqing University of Science and Technology
Priority to CN202110676065.8A priority Critical patent/CN113408410A/en
Publication of CN113408410A publication Critical patent/CN113408410A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a traffic sign detection method based on the YOLOv4 algorithm, relating to the technical field of image detection. The method comprises the following steps: model training is carried out on preprocessed traffic sign images through the original YOLOv4 algorithm; through training iteration, the model with the optimal training parameters and the minimum loss function is saved; the images in the test set are then tested with the trained model, and the bounding box with the highest confidence is finally selected for output, completing the detection of the traffic sign.

Description

Traffic sign detection method based on YOLOv4 algorithm
Technical Field
The invention relates to the technical field of image detection, in particular to a traffic sign detection method based on a YOLOv4 algorithm.
Background
Road traffic safety is a common concern worldwide, with approximately 1.25 million deaths resulting from traffic accidents every year. A study [1] showed that warning the driver 1.5 seconds before a traffic accident occurs could reduce accidents by nearly 90%. A target detection algorithm for road traffic signs that detects targets in real time is therefore particularly important. Traditional target detection algorithms are mainly divided into three steps: candidate region segmentation, feature extraction, and candidate region detection. In the prior art there are traffic sign detection algorithms based on SIFT features, which extract characteristic regions from the input image and compute SIFT descriptors over them, as well as algorithms that combine HOG-based deformable part models (DPM) with an SVM classifier to effectively detect traffic signs of different shapes. However, traditional detection algorithms require manual pre-extraction of features, which consumes a lot of time and is prone to missed detections; they also place high demands on image quality and struggle to reach an ideal recognition accuracy. In recent years, with deep learning widely applied in the image processing field, target detection algorithms based on convolutional neural networks have achieved great success, but their real-time performance still needs to be improved.
Therefore, the invention discloses a traffic sign detection method based on the YOLOv4 algorithm. Compared with the prior art, the invention improves detection speed while keeping computation cost under control. The invention improves the original YOLOv4 backbone extraction network through depthwise separable convolution to obtain a new backbone extraction network. The designed experiments show that the mAP of the improved YOLOv4 model on the CTSD traffic sign dataset differs from that of the original YOLOv4 model by only 0.82 percent, while the detection speed is improved by nearly 3 times and the number of model parameters is reduced to a certain extent.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a traffic sign detection method based on a YOLOv4 algorithm.
The invention is realized by the following technical scheme: a traffic sign detection method based on YOLOv4 algorithm, the method comprising the steps of:
model training is carried out on the preprocessed traffic sign images through the original YOLOv4 algorithm;
through training iteration, the model with the optimal training parameters and the minimum loss function is saved;
and the images in the test set are tested with the trained model, and the bounding box with the highest confidence is finally selected for output, completing the detection of the traffic sign.
Preferably, the backbone extraction network CSPDarknet of the original YOLOv4 algorithm is processed by depthwise separable convolution, said processing comprising the following steps:
Step 100: performing depthwise separable convolution on the multi-channel feature map in the original algorithm; specifically, each channel is convolved with a 3 × 3 convolution kernel, decomposing the multi-channel feature map into single-channel feature maps;
Step 200: the single-channel feature maps are convolved again with a 1 × 1 convolution kernel to adjust the number of channels, and a second feature map is output.
Preferably, the traffic sign images comprise indication signs, prohibition signs and warning signs.
The invention discloses a traffic sign detection method based on a YOLOv4 algorithm, which is compared with the prior art that:
the invention discloses a method for detecting actual life traffic signs based on a YOLOv4 target detection algorithm, which is characterized in that on the basis of an original YOLOv4 algorithm, the network structure of a trunk extraction network CSPDarknet53 is improved by means of the idea of deep separable convolution, and input images are respectively subjected to channel-by-channel convolution and point-by-point convolution to obtain a new trunk characteristic extraction network, the mAP value of the improved YOLOv4 network model for detecting three types of traffic signs reaches 92.63 percent, compared with the original YOLOv4 network, the mAP is only reduced by 0.82 percent, through comparison, the improved YOLOv4 algorithm can achieve the detection accuracy rate of remote small traffic signs, the detection speed of the improved YOLOv4 model is improved by nearly 3 times, and the number of model parameters is greatly reduced.
Drawings
FIG. 1 is a diagram of the depthwise separable convolution;
FIG. 2 is a diagram of the improved network structure;
FIG. 3 is a diagram of the traffic sign detection process;
FIG. 4 is a schematic diagram of examples of traffic signs in the embodiment;
FIG. 5a is a comparison graph of the indication sign AP values;
FIG. 5b is a comparison graph of the prohibition sign AP values;
FIG. 5c is a comparison graph of the warning sign AP values.
Detailed Description
The following embodiments describe the detailed implementation and specific operation of the present invention, but the scope of protection of the present invention is not limited to these embodiments.
The invention discloses a traffic sign detection method based on a YOLOv4 algorithm, which comprises the following steps:
model training is carried out on the preprocessed traffic sign images through the original YOLOv4 algorithm;
through training iteration, the model with the optimal training parameters and the minimum loss function is saved;
and the images in the test set are tested with the trained model, and the bounding box with the highest confidence is finally selected for output, completing the detection of the traffic sign.
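The three steps above can be sketched in PyTorch-style pseudocode as follows. This is only an illustration under assumed helpers (compute_loss and detect are hypothetical placeholders for a concrete YOLOv4 implementation), not the patent's actual training script.

```python
import torch

# Illustrative sketch of the three steps; compute_loss() and detect() are
# hypothetical helpers standing in for a concrete YOLOv4 implementation.
def train_and_keep_best(model, train_loader, epochs, lr=1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    best_loss, best_state = float("inf"), None
    for _ in range(epochs):
        epoch_loss = 0.0
        for images, targets in train_loader:
            optimizer.zero_grad()
            loss = compute_loss(model(images), targets)   # box + objectness + class terms
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        if epoch_loss < best_loss:                        # save the model with the minimum loss
            best_loss = epoch_loss
            best_state = {k: v.detach().clone() for k, v in model.state_dict().items()}
    model.load_state_dict(best_state)
    return model

def detect_highest_confidence(model, image):
    boxes = detect(model, image)            # each box: (x1, y1, x2, y2, confidence, class_id)
    return max(boxes, key=lambda b: b[4]) if boxes else None
```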
For ease of understanding, YOLOv4 is introduced in detail here. The YOLOv4 algorithm is an enhanced version obtained by improving the single-stage target detection algorithm YOLOv3. Although it brings no qualitative change to the development of target detection algorithms, it improves accuracy markedly without reducing FPS. As a single-stage target detection algorithm, it uses three feature layers of different scales for classification and regression prediction. The feature map used for detection at each scale is divided into an S × S grid; if the center coordinate of a target's ground-truth labeled box falls within a grid cell, that cell is responsible for detecting the target. The YOLOv4 algorithm improves on the YOLOv3 algorithm in four main ways.
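As an illustration of the grid assignment just described (not part of the patent text), the cell responsible for a ground-truth box can be found from its center coordinates:

```python
# Which cell of an S x S grid is responsible for a ground-truth box centre?
def responsible_cell(cx, cy, img_w, img_h, S):
    """cx, cy: box centre in pixels; returns (row, col) of the responsible grid cell."""
    col = min(int(cx / img_w * S), S - 1)
    row = min(int(cy / img_h * S), S - 1)
    return row, col

# Example: a sign centred at (412, 230) in a 608 x 608 image, on the 19 x 19 grid
print(responsible_cell(412, 230, 608, 608, 19))   # -> (7, 12)
```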
First: improvement of the backbone feature extraction network. Darknet53 in the YOLOv3 algorithm is replaced with CSPDarknet53. The CSPNet structure splits the stack of original residual blocks into two parts: the trunk part continues to stack the original residual blocks, while the other part is connected directly to the end after a small amount of processing.
Second: improvement of the feature enhancement network. The feature pyramid part uses an SPP structure and a PANet structure. The SPP structure greatly increases the receptive field and separates out the most salient context features, while the distinguishing characteristic of the PANet structure is that features are extracted repeatedly to obtain richer feature information.
Third: improvement of training techniques. Mosaic data augmentation is used during training to increase the robustness of the network, and CIoU is used as the loss function, which avoids problems such as divergence during training and makes the regression of the target box more stable.
Fourth: improvement of the activation function. The Mish activation function replaces the LeakyReLU activation function, improving the accuracy and generalization of the model.
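The Mish activation mentioned in the fourth improvement is defined as mish(x) = x · tanh(softplus(x)); a minimal PyTorch sketch:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Mish(nn.Module):
    """Mish activation: x * tanh(ln(1 + e^x)), a smooth alternative to LeakyReLU."""
    def forward(self, x):
        return x * torch.tanh(F.softplus(x))
```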
The method also processes the backbone extraction network CSPDarknet of the original YOLOv4 algorithm with depthwise separable convolution. Compared with standard convolution, depthwise separable convolution decomposes the convolution operation into two parts: channel-by-channel convolution and point-by-point convolution. In the channel-by-channel (Depthwise) convolution, each feature channel is processed by its own convolution kernel to extract features; the point-by-point (Pointwise) convolution then performs secondary feature extraction on the resulting feature map using N 1 × 1 convolution kernels to obtain the final feature map. The depthwise separable convolution structure is shown in FIG. 1.
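A minimal PyTorch sketch of the depthwise separable convolution in FIG. 1: a 3 × 3 channel-by-channel (Depthwise) convolution followed by N 1 × 1 point-by-point (Pointwise) kernels. The batch-normalization and activation choices here are assumptions, not specified in the text.

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        # Depthwise: one 3x3 kernel per input channel (groups = in_channels)
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   stride=stride, padding=1,
                                   groups=in_channels, bias=False)
        # Pointwise: N 1x1 kernels adjust the number of output channels
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)   # assumed; not stated in the patent
        self.act = nn.LeakyReLU(0.1)             # assumed; not stated in the patent

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))
```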
Specifically, assume that in a conventional convolutional layer the size of the input feature map is D_x × D_x × M, the size of the convolution kernel is D_y × D_y × N, and the size of the output feature map is D_z × D_z × N, where D_x and D_z represent the width and height of the input and output feature maps respectively, D_y is the spatial dimension of the convolution kernel, and M and N represent the numbers of channels of the input and output feature maps respectively. The calculation formulas are as follows:
The number of parameters of the conventional convolution is:
D_y × D_y × M × N (1)
The computation amount of the conventional convolution is:
D_x × D_x × M × N × D_y × D_y (2)
The number of parameters of the depthwise separable convolution is:
D_y × D_y × M + M × N (3)
The computation amount of the depthwise separable convolution is:
D_x × D_x × M × D_y × D_y + D_x × D_x × M × N (4)
The ratio of the number of parameters of the depthwise separable convolution to that of the conventional convolution is:
(D_y × D_y × M + M × N) / (D_y × D_y × M × N) = 1/N + 1/D_y²
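A quick numerical check of formulas (1)-(4) for one illustrative layer (the channel counts and feature-map size below are arbitrary, not taken from the experiments):

```python
Dx, Dy, M, N = 52, 3, 64, 128     # feature-map size, kernel size, input/output channels

std_params = Dy * Dy * M * N                          # (1) standard convolution parameters
std_ops    = Dx * Dx * M * N * Dy * Dy                # (2) standard convolution computations
dsc_params = Dy * Dy * M + M * N                      # (3) depthwise + pointwise parameters
dsc_ops    = Dx * Dx * M * Dy * Dy + Dx * Dx * M * N  # (4) depthwise + pointwise computations

print(dsc_params / std_params)    # ~0.119, i.e. 1/N + 1/Dy^2
print(dsc_ops / std_ops)          # same ratio
```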
in order to make the YOLOv4 algorithm more suitable for detecting traffic signs, further improve the real-time performance of detection and meet the requirement of detecting traffic signs for drivers in real life, the trunk extraction network CSPDarknet of the original YOLOv4 network is improved by using the deep separable convolution, the multichannel feature map in the original algorithm is convolved by using the deep separable convolution, each channel is firstly convolved by using a convolution kernel of 3 × 3 to be decomposed into a feature map of a single channel, then the feature map of the single channel is convolved by using a convolution kernel of 1 × 1 to adjust the number of the channels, the feature map is output, and the parameter number and the calculation amount of the model are greatly reduced. The improved network structure is shown in fig. 2 below.
To illustrate the embodiments more rigorously, the invention also discloses the following specific experiments:
the experimental framework structure is as follows: when the improved YOLOv4 detection algorithm is used for detecting the traffic sign, firstly, configuration files in the algorithm are modified, detection categories are modified into three categories according to experimental requirements, then a light-weighted trunk extraction network replaces the original CSPDarknet-53 and is added into the network, network parameters are reasonably modified according to the actual use condition of experimental equipment, and finally, model training is carried out on pictures in a training set, the improved traffic sign detection process is shown in a figure 3, firstly, the improved YOLOv4 algorithm is used for carrying out model training on a preprocessed traffic sign image, and through training iteration, a model with optimal model training parameters and minimum loss functions is stored. And testing the images in the test set by using the trained model, and finally selecting the frame with the highest confidence coefficient for output. The experiment identifies three types of common traffic signs in daily life, specifically an indication sign, a prohibition sign and a warning sign.
The experimental dataset is prepared as follows. The dataset selected for the experiment is the Chinese traffic sign CTSD dataset released by the Chinese Academy of Sciences; three types of traffic signs are detected, namely indication signs (mandatory), prohibition signs (prohibitory) and warning signs (warning). Examples of the three types of traffic signs are shown in FIG. 4. The CTSD dataset consists of 1100 images collected under different scenes and weather conditions. Because the training samples of this dataset are too few to meet the requirements of model training, about 9000 images containing traffic signs were collected from real life and network resources on the basis of the original dataset. The image quality of the dataset is improved through region cropping, median filtering denoising, color histogram equalization and other processing; the traffic signs in the dataset are then labeled according to the annotation format of the Pascal VOC 2007 dataset to generate the corresponding XML files; finally, the dataset is divided into a training set and a test set at a ratio of 9:1.
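A sketch of the 9:1 train/test split of the Pascal VOC-style annotation files; the directory layout and random seed are assumptions, not details from the original experiment.

```python
import random
from pathlib import Path

def split_dataset(annotation_dir, train_ratio=0.9, seed=0):
    """Split the VOC-style XML annotation files into training and test sets."""
    xml_files = sorted(Path(annotation_dir).glob("*.xml"))
    random.Random(seed).shuffle(xml_files)
    n_train = int(len(xml_files) * train_ratio)
    return xml_files[:n_train], xml_files[n_train:]

# e.g. ~10000 annotated images would yield roughly 9000 training and 1000 test samples
train_files, test_files = split_dataset("Annotations")
```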
The experimental procedures and results were analyzed as follows.
The YOLOv4 network and the improved YOLOv4 network are each trained under the Windows 10 operating system to obtain the final models; the hardware parameters of the computer used for the experiment are detailed in Table 1.
TABLE 1 Experimental computer configuration parameters
The initial parameters of model training are shown in Table 2 below; the learning rate is reduced to 10% of the initial learning rate when the number of training epochs reaches 50.
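The learning-rate drop described above (10% of the initial rate after 50 training epochs) could be expressed in PyTorch as follows; the optimizer, initial rate and training-loop helper are assumptions rather than the experiment's actual script.

```python
import torch

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)    # initial LR is illustrative
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[50], gamma=0.1)

for epoch in range(num_epochs):
    train_one_epoch(model, optimizer)    # hypothetical helper for one pass over the training set
    scheduler.step()                     # LR becomes 10% of the initial value after epoch 50
```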
TABLE 2 Experimental initial parameters
The evaluation indexes used in the experiment to measure the performance of the detection model are the AP (average precision) value and the detection rate. The AP value is the area under the curve obtained by combining Precision and Recall. Precision refers to the proportion of samples predicted to be positive that are indeed positive among all samples predicted to be positive. Recall refers to the proportion of samples predicted to be positive that are indeed positive among all positive samples. The Precision and Recall calculation formulas are as follows:
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
In the formulas, TP (true positive) denotes the number of positive samples detected correctly, FP (false positive) denotes the number of negative samples incorrectly detected as positive, and FN (false negative) denotes the number of positive samples that were missed.
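The Precision, Recall and mAP computations can be sketched as follows; the per-class AP values in the example call are illustrative, not the experimental results.

```python
def precision(tp, fp):
    """Proportion of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

def recall(tp, fn):
    """Proportion of all true positives that were detected."""
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0

def mean_average_precision(per_class_ap):
    """mAP: the mean of the per-class AP values (areas under the P-R curves)."""
    return sum(per_class_ap.values()) / len(per_class_ap)

# Example with illustrative values:
print(mean_average_precision({"indication": 0.93, "prohibitory": 0.95, "warning": 0.90}))
```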
The images in the training set are trained with the original YOLOv4 algorithm and the improved YOLOv4 algorithm, and the resulting detection models are used to test the images in the test set; the comparison graphs of the AP values of the three types of traffic signs are shown in FIG. 5.
The average precision and detection rate obtained are shown in Table 3 below; the mAP value is the mean of the AP values of all classes.
TABLE 3 original YOLOv4 Algorithm and improved YOLOv4 Algorithm model test results
As can be seen from FIG. 5 and Table 3, the mAP of the original YOLOv4 model is 93.45% and the mAP of the improved YOLOv4 model is 92.63%; by comparison, the difference in accuracy between the YOLOv4 models before and after the improvement is only 0.82%, which is small, but the detection FPS of the improved model rises from 11 to 31, and the number of model parameters is reduced from 6.17 × 10^7 to 4.12 × 10^7, about two thirds of the original, which greatly reduces the computation of the model. The improved YOLOv4 model can therefore raise the detection speed and lower the computational cost while keeping the detection accuracy of the original YOLOv4 algorithm. FIG. 5 shows the detection effect of the improved traffic sign detection model. According to the detection results, the algorithm can accurately detect the three different types of traffic signs, and the prediction boxes frame the traffic signs correctly. The method can effectively help drivers detect traffic signs according to real-time road conditions in actual driving.
In summary, compared with the prior art, the invention introduces a method for detecting real-life traffic signs based on the YOLOv4 target detection algorithm. On the basis of the original YOLOv4 algorithm, the network structure of the backbone extraction network CSPDarknet53 is improved using the idea of depthwise separable convolution, with channel-by-channel convolution and point-by-point convolution operations performed respectively on the input image to obtain a new backbone feature extraction network. The improved YOLOv4 network model reaches an mAP of 92.63% for the three types of traffic signs, only 0.82% lower than the original YOLOv4 network. By comparison, the improved YOLOv4 algorithm can still achieve good detection accuracy for distant small traffic signs, its detection speed is improved by nearly 3 times, and the number of model parameters is greatly reduced.
The above description covers only preferred embodiments of the present invention, but the scope of protection of the present invention is not limited thereto; any equivalent alternative or modification that a person skilled in the art can conceive according to the technical solution and the inventive concept of the present invention falls within the scope of protection of the present invention.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (3)

1. A traffic sign detection method based on a YOLOv4 algorithm is characterized by comprising the following steps:
model training is carried out on the preprocessed traffic sign images through the original YOLOv4 algorithm;
through training iteration, the model with the optimal training parameters and the minimum loss function is saved;
and the images in the test set are tested with the trained model, and the bounding box with the highest confidence is finally selected for output, completing the detection of the traffic sign.
2. The method of claim 1, wherein the backbone extraction network CSPDarknet of the original YOLOv4 algorithm is processed by depthwise separable convolution, the processing comprising the following steps:
Step 100: performing depthwise separable convolution on the multi-channel feature map in the original algorithm; specifically, each channel is convolved with a 3 × 3 convolution kernel, decomposing the multi-channel feature map into single-channel feature maps;
Step 200: the single-channel feature maps are convolved again with a 1 × 1 convolution kernel to adjust the number of channels, and a second feature map is output.
3. The method as claimed in claim 2, wherein the traffic sign images comprise indication signs, prohibition signs and warning signs.
CN202110676065.8A 2021-06-18 2021-06-18 Traffic sign detection method based on YOLOv4 algorithm Pending CN113408410A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110676065.8A CN113408410A (en) 2021-06-18 2021-06-18 Traffic sign detection method based on YOLOv4 algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110676065.8A CN113408410A (en) 2021-06-18 2021-06-18 Traffic sign detection method based on YOLOv4 algorithm

Publications (1)

Publication Number Publication Date
CN113408410A true CN113408410A (en) 2021-09-17

Family

ID=77685123

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110676065.8A Pending CN113408410A (en) 2021-06-18 2021-06-18 Traffic sign detection method based on YOLOv4 algorithm

Country Status (1)

Country Link
CN (1) CN113408410A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113723377A (en) * 2021-11-02 2021-11-30 南京信息工程大学 Traffic sign detection method based on LD-SSD network
CN114495061A (en) * 2022-01-25 2022-05-13 青岛海信网络科技股份有限公司 Road traffic sign board identification method and device
CN114495061B (en) * 2022-01-25 2024-04-05 青岛海信网络科技股份有限公司 Road traffic sign board identification method and device
CN114830915A (en) * 2022-04-13 2022-08-02 华南农业大学 Litchi vision picking robot based on laser radar navigation and implementation method thereof
CN114830915B (en) * 2022-04-13 2023-09-26 华南农业大学 Litchi vision picking robot based on laser radar navigation and implementation method thereof
CN115810183A (en) * 2022-12-09 2023-03-17 燕山大学 Traffic sign detection method based on improved VFNet algorithm
CN115810183B (en) * 2022-12-09 2023-10-24 燕山大学 Traffic sign detection method based on improved VFNet algorithm

Similar Documents

Publication Publication Date Title
CN113408410A (en) Traffic sign detection method based on YOLOv4 algorithm
CN108664996B (en) Ancient character recognition method and system based on deep learning
CN108446678B (en) Dangerous driving behavior identification method based on skeletal features
CN110348357B (en) Rapid target detection method based on deep convolutional neural network
CN110175649B (en) Rapid multi-scale estimation target tracking method for re-detection
WO2019080203A1 (en) Gesture recognition method and system for robot, and robot
CN110163069B (en) Lane line detection method for driving assistance
CN107092884B (en) Rapid coarse-fine cascade pedestrian detection method
CN110503613A (en) Based on the empty convolutional neural networks of cascade towards removing rain based on single image method
CN111986699B (en) Sound event detection method based on full convolution network
CN111008608B (en) Night vehicle detection method based on deep learning
CN110705558A (en) Image instance segmentation method and device
CN112101219A (en) Intention understanding method and system for elderly accompanying robot
CN112861785B (en) Instance segmentation and image restoration-based pedestrian re-identification method with shielding function
CN105893971A (en) Traffic signal lamp recognition method based on Gabor and sparse representation
WO2024051296A1 (en) Method and apparatus for obstacle detection in complex weather
CN116630932A (en) Road shielding target detection method based on improved YOLOV5
CN116340746A (en) Feature selection method based on random forest improvement
CN110909674B (en) Traffic sign recognition method, device, equipment and storage medium
CN112233105A (en) Road crack detection method based on improved FCN
CN110060221B (en) Bridge vehicle detection method based on unmanned aerial vehicle aerial image
CN106971377A (en) A kind of removing rain based on single image method decomposed based on sparse and low-rank matrix
CN115761834A (en) Multi-task mixed model for face recognition and face recognition method
CN116469020A (en) Unmanned aerial vehicle image target detection method based on multiscale and Gaussian Wasserstein distance
CN114359601A (en) Target similarity calculation method and device, electronic equipment and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication