CN114581904A - End-to-end license plate detection and identification method based on deep learning - Google Patents
- Publication number
- Publication number: CN114581904A (application CN202210332461.3A)
- Authority
- CN
- China
- Prior art keywords
- license plate
- network
- obj
- activation function
- angle
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention discloses an end-to-end license plate detection and recognition method based on deep learning, comprising: step one, collecting license plate images and labeling and sorting a license plate image data set; step two, constructing the first half of the end-to-end license plate detection and recognition network, a detection network divided into a backbone network, a feature-extraction fully connected network that maps the extracted features onto the width, height, coordinate and angle vectors of the prediction frame, and an output layer network that, after activation, outputs the width, height, coordinate and angle information; step three, training the detection network with the labeled data set; step four, building the second half of the network, a recognition network, which locates the prediction frame on the backbone convolution layers, extracts the corresponding features, and performs pooling, splicing and character recognition on the extracted features; and step five, training the whole detection and recognition network. The invention eliminates the segmentation step, improving efficiency while also improving accuracy and speed.
Description
Technical Field
The invention relates to the technical field of pattern recognition and deep learning, in particular to an end-to-end license plate detection and recognition method based on deep learning.
Background
With the continuous development of deep learning and pattern recognition, license plate recognition methods have been continuously updated: from early methods based on image processing, to later HOG + SVM methods based on machine learning, to current methods based on deep learning, both recognition speed and recognition accuracy have improved. However, most existing license plate recognition adopts a two-step or three-step strategy: license plate detection followed by license plate recognition, or license plate detection, character segmentation and character recognition. Performing detection, segmentation and recognition step by step not only places very high demands on the speed and precision of each step, but also on the coupling between steps.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide an end-to-end license plate detection and recognition method based on deep learning. Based on pattern recognition and deep learning, the method first extracts image features with a convolutional neural network, then detects the license plate position and rotation angle from the extracted features, next uses the detection result to cut the license plate region out of the feature maps for ROI pooling and fusion, sends the fused features to a character recognition network for the next stage, and finally outputs the result.
In order to achieve the purpose, the invention adopts the technical scheme that:
an end-to-end license plate detection and identification method based on deep learning comprises the following steps;
acquiring license plate images under various weather conditions and scenes through a digital camera and a mobile phone camera, and labeling and sorting a license plate image data set;
step two, constructing a first half part, namely a detection network, in an end-to-end license plate detection and recognition network, wherein the network is divided into a backbone network and is used for extracting image characteristics of a license plate; the full-connection network is used for mapping the extracted image features of the license plate to the width, height, coordinates and angle vectors of the prediction frame; the output layer network is used for activating and finally outputting width, height, coordinate and angle information;
thirdly, training a detection network by using the license plate image data set which is arranged in the first step, training a basic license plate image data set in order to enable the network to be fitted more quickly, then adjusting the learning rate to train the rest data sets, and finally integrally training all the data sets;
step four, building a second half part-recognition network in the end-to-end license plate detection and recognition network, calculating the position of the prediction frame on the convolution layer of the main network on the basis of the prediction frame obtained in the step two, respectively taking out the characteristics of each characteristic diagram at the position, and performing pooling, splicing and character recognition on the taken out characteristics;
and step five, training, detecting and identifying the whole network.
The labeling and sorting of the license plate image data set in the first step specifically comprises the following steps:
step1, distinguishing all images according to categories, wherein the images comprise a basic license plate image, a license plate image rotating at a small angle and a license plate image rotating at a large angle;
the basic license plate image does not rotate, no complex scene exists, and a clear license plate image is shot on the front side;
the license plate image rotating at a small angle is rotated at a angle of-10 degrees, the license plate image is blurred, the license plate image with dark light and bright light is far away from the camera;
the rotation angle of the license plate image rotating at a large angle is-30 degrees, and the license plate image in rainy days, snowy days and foggy days and the license plate image without license plate are obtained;
step2, for license plate image data sets whose license plate targets are already labeled, proceeding directly to the next processing; for unlabeled license plate images, labeling the license plate targets with roLabelImg;
step3, carrying out normalization processing on the labeled image data set;
step4, encoding the license plate characters as corresponding numbers so that the loss can be conveniently calculated after deep learning inference; the first character (province abbreviation) is encoded as follows:
皖 | 沪 | 津 | 渝 | 冀 | 晋 | 蒙 | 辽 | 吉 | 黑 | 苏 | 浙
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12
京 | 闽 | 赣 | 鲁 | 豫 | 鄂 | 湘 | 粤 | 桂 | 琼 | 川 | 贵
13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24
云 | 藏 | 陕 | 甘 | 青 | 宁 | 新 | 警 | 学 | 0 | |
25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | |
The second character (downtown) encodes the numbers as follows:
the numbers of the third to seventh characters are as follows:
because the digit 0 and the letter O, and the digit 1 and the letter I, are difficult to distinguish, the third to seventh characters of a license plate contain no letters I or O under clause 5.9.1 of the motor vehicle license plate standard GA36-2007 of the People's Republic of China, so these two letters are omitted from the encoding; a second character of 0 generally indicates a police vehicle;
step5, storing the photo address and the label corresponding to each license plate image as one line of a file, in the following format:
photo address, x_obj, y_obj, W_obj, H_obj, Angle_obj, [code1, code2, code3, code4, code5, code6, code7]
where x_obj, y_obj, W_obj, H_obj, Angle_obj are the abscissa, ordinate, length, width and angle parameter values of the prediction frame after the Step3 conversion, and code1–code7 are the code values obtained from the Step4 license plate character encoding.
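As a concrete illustration of this storage format, the sketch below encodes a plate string with a partial version of the Step4 tables and writes one annotation line. The helper names, the partial province table, and the index offsets are assumptions for illustration, not the patent's own code.

```python
# Hypothetical sketch of the annotation-line format described above.
# PROVINCE_CODES is a partial stand-in for the Step4 province table.
PROVINCE_CODES = {"皖": 1, "沪": 2, "津": 3, "渝": 4, "苏": 11, "浙": 12, "京": 13}
LETTERS = "ABCDEFGHJKLMNPQRSTUVWXYZ"  # I and O excluded per GA36-2007

def encode_plate(plate):
    """Encode a 7-character plate string into numeric codes (offsets assumed)."""
    codes = [PROVINCE_CODES[plate[0]]]          # first char: province abbreviation
    codes.append(LETTERS.index(plate[1]) + 1)   # second char: city letter
    alphabet = "0123456789" + LETTERS           # chars 3-7: digits plus letters
    codes += [alphabet.index(c) + 1 for c in plate[2:]]
    return codes

def format_line(photo, box, codes):
    """One line: photo address, x_obj, y_obj, W_obj, H_obj, Angle_obj, [code1..code7]."""
    x, y, w, h, angle = box
    return f"{photo},{x},{y},{w},{h},{angle},{codes}"

line = format_line("img/0001.jpg", (0.41, 0.55, 0.18, 0.06, 0.12),
                   encode_plate("皖A12345"))
```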
The labeling of the license plate targets with roLabelImg is specifically as follows:
1) the data set labeled with the software roLabelImg is converted by the following formula;
c_x, c_y: the abscissa and ordinate of the center point of the license plate target;
w_x, h_x: the width and height of the license plate target;
angle: the rotation angle of the license plate target;
w, h: the width and height of the image containing the license plate target;
x_obj, y_obj: the converted abscissa and ordinate of the license plate target;
W_obj, H_obj: the converted width and height of the license plate target;
Angle_obj: the converted rotation angle of the license plate target, positive clockwise and negative counterclockwise;
For a data set labeled with four vertex coordinates, the following formula is required to convert it to the normalized input, the four points being the top-left corner p1(x1, y1), top-right corner p2(x2, y2), bottom-right corner p3(x3, y3) and bottom-left corner p4(x4, y4); the conversion formula is as follows:
x_obj, y_obj: the converted abscissa and ordinate of the license plate target center point;
W_obj, H_obj: the converted width and height of the license plate target;
k: the slope corresponding to the converted license plate target rotation angle;
Angle_obj: the converted rotation angle of the license plate target, positive clockwise and negative counterclockwise.
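The conversion formula itself is not reproduced in the text above, but a plausible reconstruction from the listed variable definitions might look like the following sketch; the diagonal-midpoint center, edge-length width/height, and top-edge slope are assumptions, and the top edge is assumed not to be vertical.

```python
import math

def corners_to_obj(p1, p2, p3, p4, w, h):
    """Convert four labeled vertices (clockwise from top-left) of a rotated
    plate into normalized (x_obj, y_obj, W_obj, H_obj, Angle_obj).
    A plausible reconstruction, not the patent's exact formula."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    x_obj = (x1 + x3) / 2 / w      # center = midpoint of the p1-p3 diagonal,
    y_obj = (y1 + y3) / 2 / h      # normalized by image width and height
    W_obj = math.dist(p1, p2) / w  # top edge length -> plate width
    H_obj = math.dist(p2, p3) / h  # right edge length -> plate height
    k = (y2 - y1) / (x2 - x1)      # slope of the top edge
    angle = math.atan(k)           # clockwise positive (image y grows downward)
    return x_obj, y_obj, W_obj, H_obj, angle
```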
The structure of the detection network in the second step is specifically as follows:
With VGG16 as the backbone network, the structure of each convolutional layer is as follows:
L1: 64 convolution kernels of 3 × 3, with a LeakyReLU activation function;
L2: 64 convolution kernels of 3 × 3, with a LeakyReLU activation function and max pooling with stride 2;
L3: 128 convolution kernels of 3 × 3, with a LeakyReLU activation function;
L4: 128 convolution kernels of 3 × 3, with a LeakyReLU activation function and max pooling with stride 2;
L5: 256 convolution kernels of 3 × 3, with a LeakyReLU activation function;
L6: 256 convolution kernels of 3 × 3, with a LeakyReLU activation function;
L7: 256 convolution kernels of 3 × 3, with a LeakyReLU activation function and max pooling with stride 2;
L8: 512 convolution kernels of 3 × 3, with a LeakyReLU activation function;
L9: 512 convolution kernels of 3 × 3, with a LeakyReLU activation function;
L10: 512 convolution kernels of 3 × 3, with a LeakyReLU activation function and max pooling with stride 2;
L11: 512 convolution kernels of 3 × 3, with a LeakyReLU activation function;
L12: 512 convolution kernels of 3 × 3, with a LeakyReLU activation function;
L13: 512 convolution kernels of 3 × 3, with a LeakyReLU activation function and max pooling with stride 2; the resulting matrix is then flattened.
The fully connected network structure is as follows:
L14: a fully connected layer with input dimension 184832 and output dimension 4096, with a ReLU activation function;
L15: a fully connected layer with input dimension 4096 and output dimension 128, with a ReLU activation function.
The output layer network structure is as follows:
L16: a fully connected layer with input dimension 128 and output dimension 5, with a sigmoid activation function; the 5 output dimensions are respectively the length, width, abscissa, ordinate and rotation angle of the license plate prediction frame;
Angle_obj = (angle − 0.5) · π
k1 = tan(Angle_obj)
y3 = (x3 − x1) · k1 + y1
x4 = x1 + x2 − x3
y4 = y1 + y2 − y3
(x1, y1), (x2, y2), (x3, y3), (x4, y4): the coordinates of the four vertices of the license plate;
x, y: the abscissa and ordinate of the predicted license plate center point, as ratios relative to the whole image;
w, h: the predicted width and height of the license plate, as ratios relative to the whole image;
W, H: the actual width and height of the whole image;
angle: the predicted rotation angle of the license plate;
Angle_obj: the license plate rotation angle after conversion;
k1: the slope of the license plate after rotation-angle conversion;
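The layer sizes above can be checked with a small shape tracer. This illustrative sketch assumes each 3 × 3 convolution preserves H and W ("same" padding) and each stride-2 max pool halves them, and confirms that the flattened vector length matches the stated 184832 input dimension of L14.

```python
def backbone_shapes(input_hw=608):
    """Trace spatial size and channels through the VGG16-style backbone
    (L1-L13 above); pool flags mark the layers followed by stride-2 pooling."""
    spec = [  # (out_channels, pooled_after)
        (64, False), (64, True),
        (128, False), (128, True),
        (256, False), (256, False), (256, True),
        (512, False), (512, False), (512, True),
        (512, False), (512, False), (512, True),
    ]
    hw, shapes = input_hw, []
    for ch, pool in spec:
        if pool:
            hw //= 2          # stride-2 max pooling halves H and W
        shapes.append((hw, hw, ch))
    return shapes

shapes = backbone_shapes()
flat = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]  # flattened vector length
# 19 * 19 * 512 = 184832, matching the stated input dimension of L14
```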
The training of the detection network in step three is specifically as follows:
First, the basic license plate data set is shuffled and trained for 100 batches;
Next, the remaining license plate image data sets that contain license plates are shuffled and trained for 200 batches;
Then, the data sets without license plates are shuffled and trained for 50 batches;
Finally, all data sets are trained together for 50 batches.
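The four-stage schedule might be sketched as follows; the learning-rate values and the `train_batches` helper are assumptions, since the patent specifies only the subset order and batch counts.

```python
import random

def staged_training(net, datasets, train_batches):
    """Sketch of the step-three schedule: fit the easy base subset first,
    then the harder subsets, then everything together.
    `train_batches(net, data, n, lr)` is an assumed helper that trains
    n batches at learning rate lr; the lr values below are placeholders."""
    plan = [
        ("base", 100, 1e-3),     # basic plates: quick initial fit
        ("rotated", 200, 1e-4),  # remaining plate images, adjusted lr
        ("no_plate", 50, 1e-4),  # images without license plates
        ("all", 50, 1e-5),       # final joint pass over every subset
    ]
    for name, n, lr in plan:
        data = datasets[name][:]
        random.shuffle(data)     # shuffle the subset before each stage
        train_batches(net, data, n, lr)
```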
The identification network in the fourth step is specifically as follows:
Converting the length, width, abscissa and ordinate of the L16 prediction box from step two into concrete feature coordinate information, with the following conversion formula:
(x1, y1), (x2, y2), (x3, y3), (x4, y4) are the four points on the feature map, w and h are the relative width and height, scale_j is the actual size of each feature map, and (x_i, y_i) is the actual coordinate point on each feature map;
The feature maps of L2, L4 and L7 are calculated by the above formula, and the regions requiring license plate recognition are cropped out as L2_crop, L4_crop and L7_crop, with feature sizes (608·w) × (608·h) × 64, (304·w) × (304·h) × 128 and (152·w) × (152·h) × 256, where w and h are the relative lengths of the predicted license plate width and height;
ROI pooling is performed on the cropped feature maps L2_crop, L4_crop and L7_crop to obtain three feature maps L2_pool, L4_pool and L7_pool of sizes 8 × 16 × 64, 8 × 16 × 128 and 8 × 16 × 256 respectively;
The three feature maps are spliced along the third (channel) dimension into a feature of size 8 × 16 × 448 and then flattened into the feature map F of size 1 × 57344.
Seven classifiers are constructed to classify the seven license plate characters:
Classifier 1 (classifier_1): a 57344 × 128 fully connected layer, a 128 × 34 fully connected layer and a sigmoid activation function;
Classifier 2 (classifier_2): a 57344 × 128 fully connected layer, a 128 × 25 fully connected layer and a sigmoid activation function;
Classifiers 3 to 7 (classifier_3 to classifier_7): each a 57344 × 128 fully connected layer, a 128 × 34 fully connected layer and a sigmoid activation function.
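The seven classifier heads can be sketched with plain NumPy as below. The random weights and the ReLU on the hidden layer are placeholders/assumptions, since the patent specifies only the layer dimensions and the sigmoid output.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_head(n_classes, feat_dim=57344, hidden=128):
    """One classifier head: feat_dim->hidden and hidden->n_classes dense
    layers; the weights here are random placeholders, not trained values."""
    return (rng.standard_normal((feat_dim, hidden), dtype=np.float32) * 0.01,
            rng.standard_normal((hidden, n_classes), dtype=np.float32) * 0.01)

def run_head(F, head):
    w1, w2 = head
    h = np.maximum(F @ w1, 0)            # ReLU on the hidden layer (assumed)
    return 1 / (1 + np.exp(-(h @ w2)))   # sigmoid scores, one per class

# 34 classes for heads 1 and 3-7; 25 city-letter classes for head 2:
heads = [make_head(34), make_head(25)] + [make_head(34) for _ in range(5)]
F = rng.standard_normal((1, 57344), dtype=np.float32)   # stand-in feature map F
chars = [int(np.argmax(run_head(F, h))) for h in heads] # 7 predicted indices
```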
The invention has the beneficial effects that:
the method adopts a brand-new deep learning network structure, integrates the license plate detection and the license plate recognition into a large frame, reduces the steps of feature extraction frequency and license plate segmentation, and greatly improves the speed and the precision of the recognition.
Firstly, the traditional license plate recognition needs to be executed in multiple steps, namely license plate detection, license plate segmentation and license plate recognition, although the steps are clear, the continuity is not strong, and the middle of the license plate detection and the license plate recognition is interrupted by the license plate segmentation, so that the characteristics need to be extracted twice, the same steps are operated twice, and much time is wasted. Secondly, most of the traditional license plate recognition application scenes are provided with light supplement lamps and are shot from the front side in a short distance, and the method can collect license plate images for recognition from the side or a long distance scene and has a good recognition effect.
Drawings
FIG. 1 is a schematic view of a sorted data set according to the present invention.
FIG. 2 is a schematic diagram of a labeled data set according to the present invention.
FIG. 3 is a diagram illustrating the network results of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
As shown in fig. 1-3: the invention discloses an end-to-end license plate detection and identification method based on deep learning, which specifically comprises the following steps:
firstly, license plate images under various weather conditions and scenes are collected, as shown in figure 1, and a license plate image data set is marked and sorted as shown in figure 2.
First, the license plate images are uniformly processed into 608 × 608 by adaptive scaling; then the unlabeled license plate images are labeled, as shown in the left diagram of FIG. 2; the coordinate information of all labeled data sets is converted so that their representations are unified; finally, the converted data sets are sorted and stored so that labels and image information can be conveniently extracted in subsequent training, as shown in the right diagram of FIG. 2.
And step two, constructing a first half part, namely a detection network, in the end-to-end license plate detection and identification network, wherein the network structure is shown in figure 3.
Step1, an image of arbitrary size m × n is adaptively scaled: the image is first scaled so that both w and h are no larger than 608, then placed onto a 608 × 608 image filled with pixel value (128, 128, 128); the image is normalized, and the processed data inputData is fed into the network.
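A minimal sketch of this adaptive scaling, assuming the scaled image is centered on the gray canvas (the patent does not state the placement) and using nearest-neighbour resizing to stay dependency-free:

```python
import numpy as np

def letterbox(img, size=608, pad_value=128):
    """Adaptive scaling as in Step1: shrink an m x n image so both sides fit
    within `size`, paste it onto a (128,128,128) canvas, normalize to [0, 1].
    Nearest-neighbour resize; placement on the canvas is an assumption."""
    h, w = img.shape[:2]
    scale = min(size / h, size / w)
    nh, nw = int(h * scale), int(w * scale)
    # Nearest-neighbour index maps for the resize:
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]
    canvas = np.full((size, size, 3), pad_value, dtype=np.uint8)
    y0, x0 = (size - nh) // 2, (size - nw) // 2   # center the image (assumed)
    canvas[y0:y0 + nh, x0:x0 + nw] = resized
    return canvas.astype(np.float32) / 255.0      # normalize to [0, 1]
```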
Step2, features are first extracted with 64 convolution kernels of 3 × 3, then normalized, and finally activated with LeakyReLU; the feature map size is 608 × 608 × 64.
Step3, features are extracted with 64 convolution kernels of 3 × 3, normalized, activated with LeakyReLU, and finally down-sampled with max pooling; the convolution output size is 608 × 608 × 64 (304 × 304 × 64 after pooling).
Step4, features are extracted with 128 convolution kernels of 3 × 3, normalized, and activated with LeakyReLU; the feature map size is 304 × 304 × 128.
Step5, features are extracted with 128 convolution kernels of 3 × 3, normalized, activated with LeakyReLU, and down-sampled with max pooling; the convolution output size is 304 × 304 × 128 (152 × 152 × 128 after pooling).
Step6, features are extracted with 256 convolution kernels of 3 × 3, normalized, and activated with LeakyReLU; the feature map size is 152 × 152 × 256.
Step7, features are extracted with 256 convolution kernels of 3 × 3, normalized, activated with LeakyReLU, and down-sampled with max pooling; the convolution output size is 152 × 152 × 256 (76 × 76 × 256 after pooling).
Step8, features are extracted with 512 convolution kernels of 3 × 3, normalized, and activated with LeakyReLU; the feature map size is 76 × 76 × 512.
Step9, features are extracted with 512 convolution kernels of 3 × 3, normalized, activated with LeakyReLU, and down-sampled with max pooling; the convolution output size is 76 × 76 × 512 (38 × 38 × 512 after pooling).
Step10, features are extracted with 512 convolution kernels of 3 × 3, normalized, and activated with LeakyReLU; the feature map size is 38 × 38 × 512.
Step11, features are extracted with 512 convolution kernels of 3 × 3, normalized, activated with LeakyReLU, and down-sampled with max pooling; the convolution output size is 38 × 38 × 512 (19 × 19 × 512 after pooling).
Step12, the feature map Feature_out produced by Step11 has size 19 × 19 × 512; flattening it yields a vector Vector_1 of size 1 × 184832.
Step13, vector mapping with a fully connected layer maps Vector_1 to another vector Vector_2 of size 1 × 4096, using the LeakyReLU activation function.
Step14, vector mapping with a fully connected layer maps Vector_2 to another vector Vector_3 of size 1 × 128, using the LeakyReLU activation function.
Step15, for the vector mapping output, a sigmoid activation function is applied to Vector_3 to constrain each output value to the range [0, 1]; the output vector Vector_4 has size 1 × 5, and its components x, y, w, h and angle are converted by the following formulas.
Angle_obj = (angle − 0.5) · π
k1 = tan(Angle_obj)
y3 = (x3 − x1) · k1 + y1
x4 = x1 + x2 − x3
y4 = y1 + y2 − y3
(x1, y1), (x2, y2), (x3, y3), (x4, y4): the coordinates of the four vertices of the license plate;
x, y: the abscissa and ordinate of the predicted license plate center point, as ratios relative to the whole image;
w, h: the predicted width and height of the license plate, as ratios relative to the whole image;
W, H: the actual width and height of the image;
angle: the predicted rotation angle of the license plate;
Angle_obj: the license plate rotation angle after conversion;
k1: the slope after rotation-angle conversion;
and step three, training a detection network by using the marked data sets, firstly training a basic license plate image data set in order to enable the network to fit more quickly, then adjusting the learning rate to train the rest data sets, and finally integrally training all the data sets together.
And step four, constructing a second half part, namely an identification network, in the end-to-end license plate detection and identification network.
And on the basis of the prediction frame obtained in the step two, calculating the position of the prediction frame in the main network feature map, and respectively extracting the features of the part.
As shown in FIG. 3, the recognition network calculates the feature maps L2, L5 and L10 of the detection network using the formula of step two and crops out the regions requiring license plate recognition as L2_crop, L5_crop and L10_crop, with feature sizes (608·w) × (608·h) × 64, (304·w) × (304·h) × 128 and (152·w) × (152·h) × 256, where w and h are the relative width and height.
ROI pooling is performed on the cropped feature maps L2_crop, L5_crop and L10_crop to obtain three feature maps L2_pool, L5_pool and L10_pool of sizes 8 × 16 × 64, 8 × 16 × 128 and 8 × 16 × 256 respectively.
The three feature maps are spliced along the third (channel) dimension into a feature of size 8 × 16 × 448 and then flattened into the feature map F of size 1 × 57344.
Seven classifiers are constructed to classify the seven characters on the license plate.
Classifier 1 (classifier_1): a 57344 × 128 fully connected layer, a 128 × 34 fully connected layer and a sigmoid activation function.
Classifier 2 (classifier_2): a 57344 × 128 fully connected layer, a 128 × 25 fully connected layer and a sigmoid activation function.
Classifiers 3 to 7 (classifier_3 to classifier_7): each a 57344 × 128 fully connected layer, a 128 × 34 fully connected layer and a sigmoid activation function.
From each classifier's result, the position of the maximum value in the output vector is found and its index obtained; the corresponding character is then looked up from the tables in Step4 of step one using this index.
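The argmax lookup can be sketched as follows; the inverse tables are partial stand-ins for the Step4 tables, and the +1 index offset for the province head is an assumption.

```python
import numpy as np

# Partial inverse lookup tables for illustration (entries are stand-ins):
PROVINCES = {1: "皖", 2: "沪", 11: "苏", 12: "浙", 13: "京"}
CITY_LETTERS = "ABCDEFGHJKLMNPQRSTUVWXYZ"   # no I or O, per GA36-2007
TAIL = "0123456789" + CITY_LETTERS          # alphabet for characters 3-7

def decode_plate(scores):
    """Map the 7 classifier score vectors back to characters via argmax,
    mirroring the lookup described above (index offsets are assumptions)."""
    idx = [int(np.argmax(s)) for s in scores]
    plate = PROVINCES.get(idx[0] + 1, "?")            # head 1: province code
    plate += CITY_LETTERS[idx[1]]                     # head 2: city letter
    plate += "".join(TAIL[i] for i in idx[2:])        # heads 3-7
    return plate
```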
And step five, training, detecting and identifying the whole network.
The invention was tested on the public CCPD data set and a self-collected new energy license plate data set; the specific hardware and software environment is as follows:
TABLE 1 hardware and software environment parameter table
Table1.Parameters of the Hardware and Software
Environmental parameter | Description
CPU | Intel(R) Core(TM) i5-9600
Memory | 16.00 GB
Graphics card | GeForce GTX 1660 Super
Hard disk | SA400S37/480G
CUDA version | CUDA 10.0
Operating system platform | Ubuntu 16.04
Experiment simulation platform | Python 3.6
At present, the common indexes for evaluating target detection include the number of images processed per second (FPS) and the average precision (AP). Experiments were carried out on the data sets described above, with the results shown in the table:
Method | FPS | AP | Base | Rotate
SSD300 | 40 | 94.4 | 99.1 | 95.6
YOLOv3-416 | 42 | 93.1 | 98 | 94
Fast R-CNN | 15 | 92.9 | 98.1 | 91.8
MyNet | 64 | 90.5 | 99.3 | 93
It can be seen from the experimental results that the speed of the invention reaches 64 frames per second; although its average precision over all data sets is somewhat lower, the invention achieves the best result on the basic license plate data set while running markedly faster than the other methods.
Claims (8)
1. An end-to-end license plate detection and identification method based on deep learning is characterized by comprising the following steps;
acquiring license plate images under various weather conditions and scenes through a digital camera and a mobile phone camera, and labeling and sorting a license plate image data set;
step two, establishing a detection network which is the first half part of an end-to-end license plate detection and identification network, wherein the network is divided into a backbone network and is used for extracting the image characteristics of the license plate; the full-connection network is used for mapping the extracted image features of the license plate to the width, height, coordinates and angle vectors of the prediction frame; the output layer network is used for activating and finally outputting width, height, coordinate and angle information;
training a detection network by using the license plate image data set arranged in the step one, training a basic license plate image data set firstly in order to enable the network to be fitted more quickly, then training the rest data sets by adjusting the learning rate, and finally integrally training by using all the data sets;
step four, building a second half part-recognition network in the end-to-end license plate detection and recognition network, calculating the position of the prediction frame on the convolution layer of the main network on the basis of the prediction frame obtained in the step two, respectively taking out the characteristics of the part, and performing pooling, splicing and character recognition on the taken out characteristics;
and step five, training, detecting and identifying the whole network.
2. The end-to-end license plate detection and identification method based on deep learning of claim 1, wherein the labeling and sorting of the license plate image data set in the first step is specifically as follows:
step1, distinguishing all images according to categories, wherein the images comprise a basic license plate image, a license plate image rotating at a small angle and a license plate image rotating at a large angle;
the basic license plate images have no rotation and no complex scene, and are clear images shot from the front;
the license plate images rotated at a small angle have a rotation angle within ±10 degrees and include blurred license plate images, license plate images in dark or bright light, and license plate images taken far from the camera;
the license plate images rotated at a large angle have a rotation angle within ±30 degrees and include license plate images taken in rainy, snowy and foggy weather, as well as images containing no license plate;
Step 2, for license plate images whose targets are already labeled, proceed directly to the next processing; for license plate images that are not yet labeled, label the license plate target with roLabelImg;
Step 3, normalize the labeled image data set;
Step 4, encode the license plate characters as numbers so that the loss can be computed conveniently after deep-learning inference; the first character (province abbreviation) is encoded as follows:
The second character (city letter) is encoded as follows:
The third to seventh characters are encoded as follows:
Because the digit 0 and the letter O, and the digit 1 and the letter I, are difficult to distinguish, the letters I and O do not appear in the third to seventh characters of a license plate under section 5.9.1 of GA36-2007, the motor vehicle license plate standard of the People's Republic of China; these two letters are therefore omitted from the encoding, and a plate whose second character is 0 generally indicates a police vehicle;
Step 5, store the photo address and the label corresponding to each license plate image as one line of a file, in the following format:
photo address, x_obj, y_obj, W_obj, H_obj, Angle_obj, [code1, code2, code3, code4, code5, code6, code7]
where x_obj, y_obj, W_obj, H_obj, Angle_obj are the abscissa, ordinate, width, height, and angle parameter values of the target box after the Step 3 conversion, and code1 to code7 are the code values obtained from the Step 4 license plate character encoding.
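The encoding tables referenced above appear only in the patent's figures, so the orderings below are illustrative assumptions; only the exclusion of I and O (per GA36-2007 5.9.1) is taken from the text. A minimal Python sketch of the Step 4 encoding:

```python
# Sketch of the Step 4 character encoding. PROVINCES, CITY_LETTERS, and
# TAIL_ALPHABET are assumed orderings (the patent's actual code tables are
# in unreproduced figures); only the exclusion of I and O comes from the text.

PROVINCES = list("京津冀晋蒙辽吉黑沪苏浙皖闽赣鲁豫鄂湘粤桂琼渝川贵云藏陕甘青宁新")
CITY_LETTERS = list("ABCDEFGHJKLMNPQRSTUVWXYZ")            # 24 letters, no I or O
TAIL_ALPHABET = [str(d) for d in range(10)] + CITY_LETTERS  # positions 3-7

def encode_plate(plate: str) -> list:
    """Map a 7-character plate string to the code values stored in the label file."""
    return ([PROVINCES.index(plate[0]), CITY_LETTERS.index(plate[1])]
            + [TAIL_ALPHABET.index(ch) for ch in plate[2:]])
```

With these assumed tables, `encode_plate("京A12345")` yields `[0, 0, 1, 2, 3, 4, 5]`.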
3. The end-to-end license plate detection and identification method based on deep learning of claim 2, wherein labeling the license plate target with roLabelImg is specifically:
1) a data set labeled with the software roLabelImg is converted by the following formula;
c_x, c_y: the abscissa and ordinate of the center point of the license plate target;
w_x, h_x: the width and height of the license plate target;
angle: the rotation angle of the license plate target;
w, h: the width and height of the image containing the license plate target;
x_obj, y_obj: the converted abscissa and ordinate of the license plate target;
W_obj, H_obj: the converted width and height of the license plate target;
Angle_obj: the converted rotation angle of the license plate target, positive clockwise and negative counterclockwise;
2) a data set labeled with four vertex coordinates requires the following formula to convert it to the normalized input, the four points being the top left corner p1(x1, y1), top right corner p2(x2, y2), bottom right corner p3(x3, y3), and bottom left corner p4(x4, y4); the conversion formula is as follows:
x_obj, y_obj: the converted abscissa and ordinate of the license plate target center point;
W_obj, H_obj: the converted width and height of the license plate target;
k: the slope of the converted license plate target rotation angle;
Angle_obj: the converted rotation angle of the license plate target, positive clockwise and negative counterclockwise.
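The conversion formulas themselves appear only as images in the original filing, so the following is a plausible reconstruction from the variable definitions above (center from the mean of the four corners, width and height from the top and right edges, angle from the slope of the top edge), not the patent's exact formula:

```python
import math

def corners_to_rbox(p1, p2, p3, p4, img_w, img_h):
    """Reconstruct (x_obj, y_obj, W_obj, H_obj, Angle_obj) from the four
    labeled corners (top left, top right, bottom right, bottom left) and
    the image size. Assumptions: center = mean of the corners, width along
    edge p1-p2, height along edge p2-p3, angle from the slope k of the top
    edge (clockwise positive in image coordinates)."""
    xs = (p1[0], p2[0], p3[0], p4[0])
    ys = (p1[1], p2[1], p3[1], p4[1])
    x_obj = sum(xs) / 4 / img_w            # normalized center abscissa
    y_obj = sum(ys) / 4 / img_h            # normalized center ordinate
    w_obj = math.dist(p1, p2) / img_w      # normalized width
    h_obj = math.dist(p2, p3) / img_h      # normalized height
    k = (p2[1] - p1[1]) / (p2[0] - p1[0])  # slope of the top edge
    return x_obj, y_obj, w_obj, h_obj, math.atan(k)
```

For an unrotated 100 × 50 plate centered in a 200 × 100 image, this returns a zero angle and center ratios of 0.25.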
4. The end-to-end license plate detection and identification method based on deep learning of claim 1, wherein the structure of the detection network in the second step is specifically as follows:
VGG16 is adopted as the backbone network, whose convolutional layers are structured as follows:
L1: 64 3×3 convolution kernels, with a LeakyReLU activation function;
L2: 64 3×3 convolution kernels, a LeakyReLU activation function, and max pooling with stride 2;
L3: 128 3×3 convolution kernels, with a LeakyReLU activation function;
L4: 128 3×3 convolution kernels, a LeakyReLU activation function, and max pooling with stride 2;
L5: 256 3×3 convolution kernels, with a LeakyReLU activation function;
L6: 256 3×3 convolution kernels, with a LeakyReLU activation function;
L7: 256 3×3 convolution kernels, a LeakyReLU activation function, and max pooling with stride 2;
L8: 512 3×3 convolution kernels, with a LeakyReLU activation function;
L9: 512 3×3 convolution kernels, with a LeakyReLU activation function;
L10: 512 3×3 convolution kernels, a LeakyReLU activation function, and max pooling with stride 2;
L11: 512 3×3 convolution kernels, with a LeakyReLU activation function;
L12: 512 3×3 convolution kernels, with a LeakyReLU activation function;
L13: 512 3×3 convolution kernels, a LeakyReLU activation function, and max pooling with stride 2, yielding a matrix that is then flattened.
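As a sanity check on the layout above: with a 608 × 608 input (consistent with the (608·w) × (608·h) feature size quoted in claim 7), the five stride-2 pooling layers reduce the spatial size to 19 × 19, so the flattened output is 19 × 19 × 512 = 184832, matching the input dimension of the first fully connected layer in claim 5. A small sketch:

```python
# Spatial-size bookkeeping for the backbone: 3x3 convolutions with padding
# preserve the feature map size, and each stride-2 max pooling halves it.
# "M" marks the pooling after L2, L4, L7, L10, and L13.

VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]

def flattened_dim(size, cfg=VGG16_CFG):
    channels = 3
    for item in cfg:
        if item == "M":
            size //= 2        # max pooling with stride 2 halves the size
        else:
            channels = item   # conv layer: channels change, size preserved
    return size * size * channels

# 608 -> 304 -> 152 -> 76 -> 38 -> 19, so the flattened vector is 19*19*512
```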
5. The deep-learning-based end-to-end license plate detection and identification method of claim 1, wherein the fully connected layer network structure is as follows:
L14: a fully connected layer with input dimension 184832 and output dimension 4096, with a ReLU activation function;
L15: a fully connected layer with input dimension 4096 and output dimension 128, with a ReLU activation function.
The output layer network structure is as follows:
L16: a fully connected layer with input dimension 128 and output dimension 5, with a sigmoid activation function; the 5 output dimensions are the width, height, abscissa, ordinate, and rotation angle of the license plate prediction box;
Angle_obj = (angle - 0.5) * π
k1 = tan(Angle_obj)
y3 = (x3 - x1) * k1 + y1
x4 = x1 + x2 - x3
y4 = y1 + y2 - y3
(x1, y1), (x2, y2), (x3, y3), (x4, y4): the coordinates of the four vertices of the license plate;
x, y: the abscissa and ordinate of the predicted license plate center point, as ratios relative to the whole image;
w, h: the predicted width and height of the license plate, as ratios relative to the whole image;
W, H: the actual width and height of the whole image;
angle: the predicted rotation angle of the license plate;
Angle_obj: the converted license plate rotation angle;
k1: the slope corresponding to the converted license plate rotation angle.
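The formulas above can be implemented directly; note the claim gives the recovery of y3, x4, and y4, but not how x1 through x3 are obtained from (x, y, w, h), so those are taken as inputs in this sketch:

```python
import math

def decode_vertices(x1, y1, x2, y2, x3, angle):
    """Apply the claim's formulas: map the sigmoid output angle in (0, 1) to
    a rotation in (-pi/2, pi/2), take its slope k1, and recover y3 and the
    fourth vertex. How x1, y1, x2, y2, x3 themselves come from (x, y, w, h)
    is not reproduced in the text, so they are inputs here."""
    angle_obj = (angle - 0.5) * math.pi
    k1 = math.tan(angle_obj)
    y3 = (x3 - x1) * k1 + y1   # p3 lies on the line through p1 with slope k1
    x4 = x1 + x2 - x3          # fourth vertex per the claim's formulas
    y4 = y1 + y2 - y3
    return (x1, y1), (x2, y2), (x3, y3), (x4, y4)
```

For angle = 0.5 the rotation is zero (k1 = 0), so y3 collapses to y1 and the box is axis-aligned.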
6. The end-to-end license plate detection and recognition method based on deep learning of claim 1, wherein training the detection network in step three is specifically:
first, shuffle the basic license plate data set and train for 100 batches;
second, shuffle the remaining data sets containing license plates and train for 200 batches;
then, shuffle the data sets without license plates and train for 50 batches;
finally, train for 50 batches on all data sets.
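The staged curriculum above can be sketched as follows; `train_batches` and the data set arguments are hypothetical names, and "disturbing" is read as shuffling:

```python
import random

def curriculum(basic, rest_with_plates, no_plate, train_batches):
    """Claim 6 schedule: shuffle each data set, then train the stated number
    of batches. train_batches(dataset, n) is a hypothetical training callback;
    per-stage learning-rate adjustment is left to it."""
    stages = [(basic, 100),                               # basic plates
              (rest_with_plates, 200),                    # remaining plate images
              (no_plate, 50),                             # no-plate images
              (basic + rest_with_plates + no_plate, 50)]  # everything together
    for dataset, n_batches in stages:
        random.shuffle(dataset)
        train_batches(dataset, n_batches)
```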
7. The deep learning-based end-to-end license plate detection and identification method according to claim 1, wherein the identification network in the fourth step is specifically:
convert the width, height, abscissa, and ordinate of the L16 prediction box from step two into concrete feature-map coordinate information, with the conversion formula as follows:
(x1, y1), (x2, y2), (x3, y3), (x4, y4) are the four points on the feature map; w and h are the relative width and height; scale_j is the actual size of each feature map; (x_i, y_i) is the actual coordinate point on each feature map;
compute the above formula on the feature maps of L2, L4, and L7, and crop out the feature maps needed for license plate recognition, L2_crop, L4_crop, and L7_crop, with feature sizes (608·w) × (608·h) × 64, (304·w) × (304·h) × 128, and (152·w) × (152·h) × 256 respectively, where w and h are the predicted relative width and height of the license plate;
perform ROI pooling on the cropped feature maps L2_crop, L4_crop, and L7_crop, finally obtaining three feature maps L2_pooling, L4_pooling, and L7_pooling of sizes 8 × 16 × 64, 8 × 16 × 128, and 8 × 16 × 256 respectively;
concatenate the three feature maps along the third (channel) dimension into size 8 × 16 × 448, then flatten to form the feature F of size 1 × 57344.
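ROI pooling here reduces each cropped map, whatever its spatial size, to a fixed 8 × 16 grid, so the channel-wise concatenation always yields 8 × 16 × (64 + 128 + 256) = 57344 values. A pure-Python sketch of the per-channel adaptive max pooling (a real implementation would operate on tensors):

```python
def roi_pool(fmap, out_h=8, out_w=16):
    """Adaptive max pooling of a 2-D feature map (list of rows) down to a
    fixed out_h x out_w grid, applied per channel."""
    in_h, in_w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(out_h):
        r0 = i * in_h // out_h
        r1 = max((i + 1) * in_h // out_h, r0 + 1)   # at least one row per bin
        row = []
        for j in range(out_w):
            c0 = j * in_w // out_w
            c1 = max((j + 1) * in_w // out_w, c0 + 1)
            row.append(max(fmap[r][c] for r in range(r0, r1)
                                      for c in range(c0, c1)))
        pooled.append(row)
    return pooled
```

Pooling a 16 × 32 single-channel crop this way gives an 8 × 16 grid whose last cell holds the maximum of the bottom-right 2 × 2 bin.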
8. The deep-learning-based end-to-end license plate detection and recognition method of claim 1, wherein the seven characters of the license plate are classified by constructing 7 classifiers:
Classifier 1 (classifier_1): a 57344 × 128 fully connected layer, a 128 × 34 fully connected layer, and a sigmoid activation function;
Classifier 2 (classifier_2): a 57344 × 128 fully connected layer, a 128 × 25 fully connected layer, and a sigmoid activation function;
Classifier 3 (classifier_3): a 57344 × 128 fully connected layer, a 128 × 34 fully connected layer, and a sigmoid activation function;
Classifier 4 (classifier_4): a 57344 × 128 fully connected layer, a 128 × 34 fully connected layer, and a sigmoid activation function;
Classifier 5 (classifier_5): a 57344 × 128 fully connected layer, a 128 × 34 fully connected layer, and a sigmoid activation function;
Classifier 6 (classifier_6): a 57344 × 128 fully connected layer, a 128 × 34 fully connected layer, and a sigmoid activation function;
Classifier 7 (classifier_7): a 57344 × 128 fully connected layer, a 128 × 34 fully connected layer, and a sigmoid activation function.
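A shape check for the seven heads: each consumes the 1 × 57344 feature F through two fully connected layers and a sigmoid. The output width 34 for positions 3 to 7 matches 10 digits plus 24 letters with I and O excluded; the 25 classes of classifier 2 and the 34 of classifier 1 are not broken down in the text, so those counts are simply taken as given:

```python
# Dimension bookkeeping for the seven classifier heads (claim 8).
FEATURE_DIM = 8 * 16 * (64 + 128 + 256)       # flattened feature F: 57344
HEAD_CLASSES = [34, 25, 34, 34, 34, 34, 34]   # output widths, classifiers 1-7

def head_params(c_out, hidden=128, d_in=FEATURE_DIM):
    """Parameter count of one head: two fully connected layers with biases."""
    return (d_in * hidden + hidden) + (hidden * c_out + c_out)
```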
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210332461.3A CN114581904A (en) | 2022-03-31 | 2022-03-31 | End-to-end license plate detection and identification method based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114581904A true CN114581904A (en) | 2022-06-03 |
Family
ID=81778812
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114898352A (en) * | 2022-06-29 | 2022-08-12 | 松立控股集团股份有限公司 | Method for simultaneously realizing image defogging and license plate detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||