CN116994236A - Low-quality image license plate detection method based on deep neural network - Google Patents

Info

Publication number
CN116994236A
CN116994236A (application CN202310972307.7A)
Authority
CN
China
Prior art keywords
license plate
corner
image
feature
low
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310972307.7A
Other languages
Chinese (zh)
Inventor
王天磊
虞结福
曹九稳
刘德康
陈家贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Zhichuang Technology Co ltd
Hangzhou Dianzi University
Original Assignee
Hangzhou Zhichuang Technology Co ltd
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Zhichuang Technology Co ltd, Hangzhou Dianzi University filed Critical Hangzhou Zhichuang Technology Co ltd
Priority to CN202310972307.7A
Publication of CN116994236A
Pending legal-status Critical Current

Links

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/766 - Arrangements using regression, e.g. by projecting features on hyperplanes
    • G06V10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 - Fusion of extracted features
    • G06V10/82 - Arrangements using neural networks
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/54 - Surveillance or monitoring of traffic, e.g. cars on the road, trains or boats
    • G06V20/60 - Type of objects
    • G06V20/62 - Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/625 - License plates
    • G06V30/00 - Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 - Character recognition
    • G06V30/14 - Image acquisition
    • G06V30/146 - Aligning or centring of the image pick-up or image-field
    • G06V30/147 - Determination of region of interest
    • G06V30/19 - Recognition using electronic means
    • G06V30/191 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/1918 - Fusion techniques, i.e. combining data from various sources, e.g. sensor fusion

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a low-quality image license plate detection method based on a deep neural network. The method first acquires training data and constructs a low-quality image license plate detection network, then preprocesses the annotated training data; next, the located corner coordinates are constrained by a corner-difference loss; finally, the low-quality image license plate detection network is trained according to the determined loss function, and the license plate region in an image is detected and located by the trained network. The invention proposes a dual-pixel-type regression strategy that uses both corner points and the center point, and designs a relaxed constraint that implicitly integrates the advantages of the two regression types, achieving superior detection performance.

Description

Low-quality image license plate detection method based on deep neural network
Technical Field
The invention belongs to the field of image target detection, and relates to a license plate detection method in a low-quality image based on a deep neural network.
Background
With the rapid development of computer and communication technology, the ability to process information automatically has steadily increased, and many intelligent systems for traffic management have emerged, such as vehicle navigation systems, electronic police systems, global positioning systems and license plate recognition systems serving the transportation industry. Because the license plate is the only effective identifier of a vehicle's identity, accurately and efficiently extracting license plate information while the vehicle travels on the road is of great significance for regulating road traffic, and automatic license plate recognition (Automatic License Plate Recognition, ALPR) has therefore become an important component of intelligent traffic systems. Although existing license plate recognition technology is relatively mature, license plate detection still suffers from low accuracy on low-quality image inputs typified by blurred imaging and bad weather, which leads to erroneous license plate character recognition and prevents the vehicle identity from being recognized correctly. Owing to the uncertainty of vehicle motion in real scenes and the complex, uncontrollable external environment, license plate recognition faces the following problems: 1) large geometric tilt and spatial distortion of the license plate caused by improper camera positions make it difficult to locate the license plate region accurately; 2) factors such as rain and fog, uneven illumination and motion blur weaken the visual characteristics of the license plate, so that a detector struggles to detect it effectively. Because the visual features of license plates in low-quality images are very weak and existing methods have no dedicated solution for such boundary samples, handling them is key to the practical application of license plate detection.
Disclosure of Invention
In order to overcome the problems of license plate recognition in real scenes, the invention provides a low-quality image license plate detection method based on a deep neural network. The method improves on the above problems in two respects: 1) Different regression strategies each emphasize different types of boundary samples. Specifically, the corner regression mode performs excellently in corner detection of distorted license plates, while a blurred license plate, whose visual features are weakened by the surrounding environment, is handled better by the center-point regression mode, which locks onto the license plate region more efficiently than corner regression. Based on this observation, a dual-pixel-type regression strategy based on corner points and the center point is presented herein to cope with different forms of noisy samples. 2) To resolve the contradiction between the two regression strategies, the two tasks are encoded by decoupled detection heads, and a relaxed constraint is designed to couple the decoupled regressions. This constraint implicitly integrates the advantages of the two regression types and greatly improves the detector's performance on license plate boundary samples.
A low-quality image license plate detection method based on a deep neural network comprises the following steps:
step 1, collecting a low-quality image of a vehicle containing a license plate, respectively labeling four license plate corner points of the license plate in the image from the upper left corner to the clockwise direction, and encoding the labeling result into a file name corresponding to the image as a tag of training data.
Step 2, constructing a low-quality image license plate detection network (LPCDet);
the low-quality image license plate detection network (LPCDet) adopts ResNet-50 as a feature extractor, namely a backbone network, and a feature fusion module (FPEM) is introduced to perform multi-scale feature fusion on the feature map extracted by the backbone network. The multi-scale fused feature maps are input into decoupled network sub-branches, respectively.
And step 3, preprocessing the marked training data.
Step 4, constraining the located corner point coordinates by using the corner point difference loss.
Step 5, training a low-quality image license plate detection network according to the determined loss function:
and 6, detecting and positioning license plate areas in the images through a trained low-quality image license plate detection network.
Further, the step 2 is specifically implemented as follows:
the low-quality image license plate detection network (LPCDet) adopts ResNet-50 as a feature extractor, namely a backbone network, and the size of the output feature map is as followsWhere B is the batch size and W and H are the size of the input image. And a feature fusion module (FPEM) is introduced to perform multi-scale feature fusion on the feature map extracted by the backbone network so as to improve the capture of richer license plate features by the detection network. The feature images after multi-scale fusion are respectively input into decoupled network sub-branches, and the network sub-branches respectively form a heat map positioning license plate corner module and a center point-based offset positioning license plate corner module.
For every network sub-branch, feature extraction is first performed on the input feature map using a convolution block consisting of a 3×3 convolution, a batch normalization layer and a Rectified Linear Unit (ReLU) activation function, while the number of channels of each feature map is kept unchanged. Further, the heat-map license plate corner localization module contains a corner heat-map sub-branch for predicting the corner heat map and a refine (accurate positioning) sub-branch for precisely localizing the corresponding heat-map peaks. The former adjusts the number of channels of the feature map to 4 with a 1×1 convolution and uses a Sigmoid activation function to explicitly represent the predicted heat-map confidence, giving an output of size $B\times 4\times\tfrac{W}{4}\times\tfrac{H}{4}$; the latter directly uses a 1×1 convolution to adjust the number of channels to 8, giving an output of size $B\times 8\times\tfrac{W}{4}\times\tfrac{H}{4}$. The center-point-offset license plate corner localization module contains a center-point heat-map sub-branch for predicting the license plate center-point heat map and an offset sub-branch for predicting the offsets from the center point to the four corners. The former adjusts the number of channels to 1 with a 1×1 convolution and then uses a Sigmoid activation to represent the confidence of the corresponding center-point heat map, with output size $B\times 1\times\tfrac{W}{4}\times\tfrac{H}{4}$; the latter directly uses a 1×1 convolution to adjust the number of channels to 8, with output size $B\times 8\times\tfrac{W}{4}\times\tfrac{H}{4}$.
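As an illustration of the decoupled sub-branches just described, the following is a minimal PyTorch sketch. The class and variable names (LPCDetHeads, conv_block, fused) and the fused channel width are assumptions for illustration, not identifiers from the patent; the channel counts 4/8/1/8 and the Sigmoid activations follow the description above.

```python
import torch
import torch.nn as nn

def conv_block(ch):
    # 3x3 convolution + batch normalization + ReLU, channel count unchanged
    return nn.Sequential(
        nn.Conv2d(ch, ch, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(ch),
        nn.ReLU(inplace=True),
    )

class LPCDetHeads(nn.Module):
    """Decoupled sub-branches on the fused stride-4 feature map:
    corner heat map (4 ch) + refine offsets (8 ch),
    center heat map (1 ch) + center-to-corner offsets (8 ch)."""
    def __init__(self, c_fused=128):
        super().__init__()
        self.stems = nn.ModuleList([conv_block(c_fused) for _ in range(4)])
        self.corner_heatmap = nn.Conv2d(c_fused, 4, 1)   # one channel per corner
        self.corner_refine = nn.Conv2d(c_fused, 8, 1)    # (dx, dy) per corner
        self.center_heatmap = nn.Conv2d(c_fused, 1, 1)   # license plate center point
        self.center_offsets = nn.Conv2d(c_fused, 8, 1)   # center -> four corners

    def forward(self, fused):                            # fused: B x C x H/4 x W/4
        s0, s1, s2, s3 = (stem(fused) for stem in self.stems)
        return {
            "corner_hm": torch.sigmoid(self.corner_heatmap(s0)),
            "corner_refine": self.corner_refine(s1),
            "center_hm": torch.sigmoid(self.center_heatmap(s2)),
            "center_off": self.center_offsets(s3),
        }
```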
Further, the specific method in the step 3 is as follows:
3-1. The image is resized to the size expected by the network (LPCDet) input (512 × 512) using bilinear interpolation. To avoid image distortion and reduce the computation of model inference, the maximum side length of the image is first computed, the shorter-side region is padded with gray bars, and the processed image is then scaled proportionally to the target size;
3-2. The resized license plate image is standardized with the mean and standard deviation computed on the ImageNet dataset: the value of each channel is scaled, the mean is subtracted and the result is divided by the standard deviation, i.e.

$$I_{norm} = \frac{I / \text{pixel} - \text{mean}}{\text{std}}$$

wherein I is the resized image, pixel = 255 is the maximum threshold of the image pixel values, mean denotes the per-channel mean of the images in the ImageNet dataset, and std denotes the per-channel standard deviation of the images in the ImageNet dataset. After standardization the network converges faster during training, and the generalization of the model is effectively improved.
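A minimal sketch of the preprocessing in steps 3-1 and 3-2 (letterbox resizing with gray padding followed by standardization); the helper name, the use of OpenCV, and the placement of the padding are assumptions for illustration, while the mean and standard deviation values are those given later in the embodiment.

```python
import cv2
import numpy as np

NORM_MEAN = np.array([0.40789655, 0.44719303, 0.47026116], np.float32)  # values from the embodiment
NORM_STD = np.array([0.2886383, 0.27408165, 0.27809834], np.float32)

def letterbox_and_normalize(image, target=512, pad_value=128):
    """image: HxWx3 uint8. Returns a (target x target x 3) float32 standardized image."""
    h, w = image.shape[:2]
    side = max(h, w)                                   # longest side defines the square canvas
    canvas = np.full((side, side, 3), pad_value, np.uint8)
    canvas[:h, :w] = image                             # pad the shorter side with gray bars
    resized = cv2.resize(canvas, (target, target), interpolation=cv2.INTER_LINEAR)  # bilinear
    return (resized.astype(np.float32) / 255.0 - NORM_MEAN) / NORM_STD
```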
3-3. For the license plate corner coordinates p in the tag of each image, the scaled low-resolution corner coordinates $\tilde{p} = \left\lfloor \frac{p}{R} \right\rfloor$ are first computed, where R is the scaling factor with value 4; all corner points are then mapped onto the heat map with a Gaussian kernel, the value assigned at location (x, y) of the corresponding corner channel being

$$\exp\!\left(-\frac{(x-\tilde{p}_x)^2+(y-\tilde{p}_y)^2}{2\sigma_p^2}\right)$$

wherein $\sigma_p$ is the adaptive standard deviation of the current license plate. In order to reduce the discretization error caused by the output stride, the offset corresponding to the ith corner is additionally computed as $o_i = \frac{p_i}{R} - \left\lfloor \frac{p_i}{R} \right\rfloor$.

The center point of the license plate region is computed from the license plate corner coordinates as $c = \frac{1}{4}\sum_{i=1}^{4} p_i$, and the offset distances $\delta_i = (\delta_i^x, \delta_i^y) = p_i - c$ are obtained, wherein $\delta_i^x$ and $\delta_i^y$ represent the offsets of the ith corner point from the center point in the x and y directions.
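A minimal sketch of the label preprocessing in step 3-3 (CenterNet-style Gaussian splatting of the corners onto the stride-4 heat map); the function names and the heuristic used for the adaptive σ_p are assumptions for illustration.

```python
import numpy as np

def gaussian_2d(shape, sigma):
    m, n = [(s - 1) / 2 for s in shape]
    y, x = np.ogrid[-m:m + 1, -n:n + 1]
    return np.exp(-(x * x + y * y) / (2 * sigma * sigma))

def build_targets(corners, out_w=128, out_h=128, stride=4):
    """corners: (4, 2) array of corner (x, y) in input-image pixels, clockwise from top-left."""
    heatmap = np.zeros((4, out_h, out_w), np.float32)          # one channel per corner
    low = corners / stride
    low_int = np.floor(low).astype(int)
    refine = (low - low_int).astype(np.float32)                # o_i = p/R - floor(p/R)
    sigma = max(1.0, 0.05 * np.ptp(corners, axis=0).max() / stride)  # adaptive sigma (assumed heuristic)
    radius = int(3 * sigma)
    g = gaussian_2d((2 * radius + 1, 2 * radius + 1), sigma)
    for i, (cx, cy) in enumerate(low_int):
        x0, x1 = max(0, cx - radius), min(out_w, cx + radius + 1)
        y0, y1 = max(0, cy - radius), min(out_h, cy + radius + 1)
        gx0, gy0 = x0 - (cx - radius), y0 - (cy - radius)
        heatmap[i, y0:y1, x0:x1] = np.maximum(
            heatmap[i, y0:y1, x0:x1],
            g[gy0:gy0 + (y1 - y0), gx0:gx0 + (x1 - x0)])
    center = corners.mean(axis=0)                              # c = mean of the four corners
    center_offsets = corners - center                          # delta_i = p_i - c
    return heatmap, refine, center, center_offsets
```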
Further, the specific method in the step 4 is as follows:
and the robustness of angular point positioning is improved through a relaxation constraint strategy. The strategy includes a loss of relaxation constraint that is used to couple the two regression modes.
Step 4-1In the regression process of the license plate corner module by heat map positioning, the feature map is output by branching the corner heat mapAnd performing feature decoding to obtain the position coordinates of the corner points of the license plate. Then the characteristic diagram of the sub-branch output is refined +.>Decoding is carried out, and the deviation amount for accurately locating the corner of the corresponding license plate is calculated. And obtaining the accurate coordinates of the license plate corner according to the obtained position coordinates and the deviation amount of the license plate corner.
Step 4-2, in the regression process of the license plate corner module positioned by the offset of the central point, firstly, outputting a characteristic diagram by branching the heat diagram of the central pointPerforming feature decoding to obtain position coordinates of a license plate center point, and outputting a feature map +.>And (3) decoding to calculate the offset from the center point to each corner point, and adding the corresponding offset to the center point to position the corner points.
Step 4-3. Relaxed constraint loss based on corner difference.

The relaxed constraint loss function couples the regression processes of the two modules so that more effective constraint and guidance can be produced between them. For the ith corner point, the loss compares the coordinate value predicted directly from the heat map with the coordinate value predicted from the offsets of the license plate center point, where α is the modulation factor and β is the radius of the relaxed boundary; the total relaxation loss is the sum of the per-corner losses over the four corner points.
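The exact functional form of the relaxed loss appears in the patent figures and is not recoverable from this text. Purely as an illustration of a relaxed (dead-zone) constraint between the two corner predictions, the sketch below ignores corner differences smaller than the relaxation radius β and penalizes larger ones smoothly, scaled by α; the function and tensor names are assumptions.

```python
import torch

def relaxed_corner_loss(hm_corners, ctr_corners, alpha=0.1, beta=5.0):
    """hm_corners, ctr_corners: (B, 4, 2) corner coordinates predicted by the
    heat-map branch and by the center-point-offset branch, respectively.
    Differences inside the relaxed radius beta contribute no gradient; larger
    differences are penalized with a smooth-L1 on the excess, scaled by alpha."""
    diff = (hm_corners - ctr_corners).norm(dim=-1)                      # per-corner distance
    excess = torch.clamp(diff - beta, min=0.0)                          # dead zone of radius beta
    loss = torch.where(excess < 1.0, 0.5 * excess ** 2, excess - 0.5)   # smooth-L1 shape
    return alpha * loss.sum(dim=1).mean()                               # sum over corners, mean over batch
```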
further, the specific method in step 5 is as follows:
the ResNet-50 adopts pre-trained weights on the ImageNet dataset, adopts the CCPD license plate dataset to train a low-quality image license plate detection network (LPCDet), and then uses the training data processed in the step 3 to adjust. Setting the batch size as 28 on Nvidia3080GPU, iterating the total training for 50 rounds, simultaneously using Adam as an optimizer, setting the initial learning rate as 0.01 and setting the weight attenuation as 10 -5 The learning rate is regulated by cosine function, and gradually decreases to 5×10 along with the increment of the learning rate -4 . In order to improve generalization of the model, data enhancement is realized by using an imgauge library, and 1 to 3 enhancement operations are randomly selected for a single image at a time during training.
Further, the augmentation operations include brightness adjustment, contrast adjustment, color gamut conversion, histogram equalization, random rotation and cropping, horizontal flipping, mean-shift blurring, motion blurring, affine transformation, perspective transformation and rainy-day simulation.
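A minimal sketch of such an online augmentation pipeline with the imgaug library; the specific augmenters and parameter ranges below are assumptions chosen to mirror the listed operations, not values from the patent.

```python
import imgaug.augmenters as iaa

# Randomly apply 1 to 3 of the listed operations per image during training.
augmenter = iaa.SomeOf((1, 3), [
    iaa.Multiply((0.7, 1.3)),                         # brightness adjustment
    iaa.LinearContrast((0.6, 1.4)),                   # contrast adjustment
    iaa.AddToHueAndSaturation((-20, 20)),             # rough color gamut conversion
    iaa.HistogramEqualization(),                      # histogram equalization
    iaa.Affine(rotate=(-10, 10), scale=(0.9, 1.1)),   # random rotation / affine transformation
    iaa.Fliplr(1.0),                                  # horizontal flipping
    iaa.MotionBlur(k=(3, 7)),                         # motion blurring
    iaa.PerspectiveTransform(scale=(0.02, 0.08)),     # perspective transformation
    iaa.Rain(),                                       # rainy-day simulation
], random_order=True)

# images: a list or array of HxWx3 uint8 images; the corner keypoints can be augmented
# jointly (e.g. via imgaug's KeypointsOnImage) so the labels stay consistent with the image.
# aug_images = augmenter(images=images)
```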
Further, the specific method in the step 6 is as follows:
for the input image to be detected, performing image size adjustment and standardization pretreatment, putting the image into a trained model for reasoning, and outputting a characteristic diagram of the angular point heat diagramAnd deviation amount characteristic diagram->By means of->Calculating the vector maximum index to obtain the predicted value with highest confidence, and substituting the predicted value into the vector maximum indexAnd calculating the deviation amount of the corner points in a Cartesian coordinate system. In summary, the calculation formula of license plate corner prediction decoding can be expressed as:
wherein confi A heat map region representing the maximum confidence level for the ith corner,representing the coordinate value corresponding to the ith corner point, mod represents the remainder operation.
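A minimal decoding sketch consistent with the argmax/remainder description above; treating the head outputs exactly as in the earlier head sketch, and the names decode_corners, corner_hm and corner_refine, are assumptions for illustration.

```python
import torch

def decode_corners(corner_hm, corner_refine, stride=4):
    """corner_hm: (B, 4, H, W) sigmoid heat maps; corner_refine: (B, 8, H, W) deviations.
    Returns (B, 4, 2) corner coordinates in input-image pixels and their confidences."""
    b, c, h, w = corner_hm.shape
    conf, idx = corner_hm.view(b, c, -1).max(dim=-1)          # highest-confidence index per corner
    xs = (idx % w).float()                                    # column via the remainder operation (mod)
    ys = torch.div(idx, w, rounding_mode="floor").float()     # row via the integer quotient
    refine = corner_refine.view(b, c, 2, h * w)
    off = torch.gather(refine, 3, idx.view(b, c, 1, 1).expand(b, c, 2, 1)).squeeze(-1)
    x = (xs + off[..., 0]) * stride                           # add sub-pixel deviation, back to input scale
    y = (ys + off[..., 1]) * stride
    return torch.stack([x, y], dim=-1), conf
```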
In the implementation of the invention, images annotated with the four license plate corners are used as the training data of the low-quality image license plate detection network; the model learns invariance information of license plates through online data augmentation, improving the generalization of the model without adding data; the GT coordinates are converted into corresponding heat maps with a two-dimensional Gaussian function, and this soft labeling of the GT with Gaussian heat maps adds directional guidance to the training of the network so that it converges faster; for license plate corner localization, the center-point-offset module determines the license plate region better, while the heat-map module localizes the license plate corners directly, more efficiently and more accurately, and the former guides the latter in a weakly supervised relation, which effectively solves the problem that license plate corners in low-quality images are unclear and difficult to localize; TIoU is introduced as a new evaluation index for license plate detection, which avoids the inflated scores of the traditional IoU metric, makes the detection box match the real license plate region more closely, and provides a guarantee for subsequent license plate character recognition.
The invention has the following beneficial effects:
the existing method only considers single pixel type regression to obtain license plate corner points, and the scheme may not achieve ideal effects in various low-quality license plate images. Through experiments, the corner regression mode has excellent performance in corner detection of distorted license plates, and the center point regression mode can more effectively lock the areas of the license plates in blurred images. Based on the observation, the invention provides a dual-pixel type regression strategy using corner points and center points, designs the advantages of implicitly integrating two regression types by constraint with relaxation, and realizes more excellent detection performance.
Drawings
In order to more clearly illustrate the technical solutions involved in the present invention, the following brief description will be given of the drawings used in the implementation process of the present invention.
FIG. 1 is a flow chart of a method of an embodiment of the present invention;
FIG. 2 is a diagram of a model structure designed by the method of the present invention;
FIG. 3 is a block diagram of a feature fusion module (FPEM);
FIG. 4 is a graph of the relaxation loss function and the corresponding gradient curve according to the present invention.
Detailed description of the preferred embodiments
The invention is further described below with reference to the drawings and examples.
Taking license plate images captured by barrier-gate equipment in places such as residential communities and parking lots as an example, and using a pre-trained ResNet-50 network as the feature-extraction backbone of the model, the invention is further described through the processes of model training and license plate detection. The following description is merely exemplary and explanatory and is not intended to limit the present invention in any manner.
As shown in fig. 1, a low-quality image license plate detection method based on a deep neural network is specifically implemented as follows:
and 1, marking the acquired license plate data set by using an open source marking tool in a way of marking four corner points clockwise from the upper left corner along a license plate region, wherein the formed closed loop region is the marked license plate region, decoding the contents (coordinates of the four corner points in the license plate and license plate numbers) in the generated json marking file as file names of corresponding license plate pictures, and dividing a training set, a verification set and a test set according to the proportion of 7:2:1 for training of a subsequent detection model.
Step 2, as shown in fig. 2, the low-quality image license plate detection network (LPCDet) adopts ResNet-50 as the feature extractor, i.e. the backbone network; its direct residual-block connections avoid the severe degradation of the network as depth increases, so the feature information of the input image can be extracted efficiently, and the backbone outputs multi-scale feature maps whose spatial sizes range from $\tfrac{W}{4}\times\tfrac{H}{4}$ to $\tfrac{W}{32}\times\tfrac{H}{32}$ with batch size B, where W and H are the width and height of the input image. Furthermore, a feature fusion module (FPEM) is introduced to perform multi-scale feature fusion on the feature maps extracted by the backbone network, so that the detection network captures richer license plate features. As shown in fig. 3, the FPEM is divided into two stages, up-scale and down-scale enhancement. In the up-scale stage, the input feature maps are iteratively enhanced at strides of 32, 16, 8 and 4 pixels. In the down-scale stage, the feature maps of the previous stage are used as input and features are enhanced from stride 4 up to stride 32. The feature map output after feature fusion has spatial size $\tfrac{W}{4}\times\tfrac{H}{4}$ with batch size B, where W and H are the size of the input image. The multi-scale fused feature maps are respectively input into decoupled network sub-branches, which form the heat-map license plate corner localization module and the center-point-offset license plate corner localization module, respectively. For every network sub-branch, feature extraction is first performed on the input feature map using a convolution block consisting of a 3×3 convolution, a batch normalization layer and a Rectified Linear Unit (ReLU) activation function, while keeping the number of channels of each feature map unchanged. Further, the heat-map license plate corner localization module contains a corner heat-map sub-branch for predicting the corner heat map and a refine sub-branch for precisely localizing the corresponding heat-map peaks. The former adjusts the number of channels of the feature map to 4 with a 1×1 convolution and uses a Sigmoid activation function to explicitly represent the predicted heat-map confidence, giving an output of size $B\times 4\times\tfrac{W}{4}\times\tfrac{H}{4}$; the latter directly uses a 1×1 convolution to adjust the number of channels to 8, giving an output of size $B\times 8\times\tfrac{W}{4}\times\tfrac{H}{4}$. The center-point-offset license plate corner localization module contains a center-point heat-map sub-branch for predicting the license plate center-point heat map and an offset sub-branch for predicting the offsets from the center point to the four corners. The former adjusts the number of channels to 1 with a 1×1 convolution and then uses a Sigmoid activation to represent the confidence of the corresponding center-point heat map, with output size $B\times 1\times\tfrac{W}{4}\times\tfrac{H}{4}$; the latter directly uses a 1×1 convolution to adjust the number of channels to 8, with output size $B\times 8\times\tfrac{W}{4}\times\tfrac{H}{4}$.
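A rough PyTorch sketch of an FPEM-style cascade over the stride-4 to stride-32 backbone features; the separable-convolution enhancement unit, the simple add-based down-scale stage, the assumption that the backbone features are already projected to a common channel width, and all names are illustrative, since the patent text does not specify the internal structure of its FPEM.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SepConvBNReLU(nn.Module):
    """3x3 depthwise separable convolution + BN + ReLU enhancement unit."""
    def __init__(self, ch, stride=1):
        super().__init__()
        self.dw = nn.Conv2d(ch, ch, 3, stride=stride, padding=1, groups=ch, bias=False)
        self.pw = nn.Conv2d(ch, ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(ch)

    def forward(self, x):
        return F.relu(self.bn(self.pw(self.dw(x))))

class FPEM(nn.Module):
    """One enhancement pass over backbone features at strides 4, 8, 16, 32."""
    def __init__(self, ch=128):
        super().__init__()
        self.up_units = nn.ModuleList(SepConvBNReLU(ch) for _ in range(3))
        self.down_units = nn.ModuleList(SepConvBNReLU(ch, stride=2) for _ in range(3))

    def forward(self, feats):                 # feats: [f4, f8, f16, f32], finest scale first
        f4, f8, f16, f32 = feats
        # up-scale enhancement: propagate context from stride 32 down to stride 4
        f16 = self.up_units[0](f16 + F.interpolate(f32, size=f16.shape[-2:], mode="nearest"))
        f8 = self.up_units[1](f8 + F.interpolate(f16, size=f8.shape[-2:], mode="nearest"))
        f4 = self.up_units[2](f4 + F.interpolate(f8, size=f4.shape[-2:], mode="nearest"))
        # down-scale enhancement: push the refined fine-scale features back up to stride 32
        f8 = f8 + self.down_units[0](f4)
        f16 = f16 + self.down_units[1](f8)
        f32 = f32 + self.down_units[2](f16)
        return [f4, f8, f16, f32]             # the stride-4 map feeds the detection heads
```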
And step 3, preprocessing the marked training data.
3-1. The image is resized to the size expected by the network (LPCDet) input (512 × 512) using bilinear interpolation. To avoid image distortion and reduce the computation of model inference, the maximum side length of the image is first computed, the shorter-side region is padded with gray bars, and the processed image is then scaled proportionally to the target size;
3-2. The resized license plate image is standardized with the mean and standard deviation computed on the ImageNet dataset: the value of each channel is scaled, the mean is subtracted and the result is divided by the standard deviation, i.e.

$$I_{norm} = \frac{I / \text{pixel} - \text{mean}}{\text{std}}$$

wherein I is the resized image, pixel = 255 is the maximum pixel threshold of the image, mean = [0.40789655, 0.44719303, 0.47026116] is the per-channel mean of the images in the ImageNet dataset, and std = [0.2886383, 0.27408165, 0.27809834] is the per-channel standard deviation of the images in the ImageNet dataset. After standardization the network converges faster during training, and the generalization of the model is effectively improved.
3-3. For the corner coordinates p of the license plate tag in each image, the scaled low-resolution corner coordinates $\tilde{p} = \left\lfloor \frac{p}{R} \right\rfloor$ are first computed, where R is the scaling factor with value 4; all corner points are then mapped onto the heat map with a Gaussian kernel, the value assigned at location (x, y) of the corresponding corner channel being

$$\exp\!\left(-\frac{(x-\tilde{p}_x)^2+(y-\tilde{p}_y)^2}{2\sigma_p^2}\right)$$

wherein $\sigma_p$ is the adaptive standard deviation of the current license plate, computed from the relative size of the current license plate. Since the feature map output by the network (LPCDet) has size 128×128, one quarter of the input image (512×512), the offset corresponding to the ith corner point is additionally computed as $o_i = \frac{p_i}{R} - \left\lfloor \frac{p_i}{R} \right\rfloor$ in order to reduce the discretization error caused by the output stride.

The center point of the license plate region is computed from the license plate corner coordinates as $c = \frac{1}{4}\sum_{i=1}^{4} p_i$, and the offset distances $\delta_i = (\delta_i^x, \delta_i^y) = p_i - c$ are obtained, wherein $\delta_i^x$ and $\delta_i^y$ represent the offsets of the ith corner point from the center point in the x and y directions.
Step 4, constraining the located corner point coordinates by using the corner point difference loss.
In order to solve the problem that the corner features of license plates in low-quality images are not obvious and the corners are difficult to detect effectively, a relaxed constraint strategy is proposed to improve the robustness of corner localization. The strategy includes a relaxed constraint loss that is used to couple the two regression modes.
Step 4-1. In the regression process of the heat-map license plate corner localization module, the feature map output by the corner heat-map sub-branch is first decoded to obtain the position coordinates of the license plate corner points. The feature map output by the refine sub-branch is then decoded to compute the deviation used to precisely localize the corresponding license plate corner. The precise coordinates of the license plate corners are obtained from the obtained corner position coordinates and deviations. The heat-map license plate corner localization module is the main module of the model, and the final output of the model is the prediction result of this module.

The number of channels of the feature map in the corner heat-map sub-branch is adjusted to 4, corresponding to the 4 corners of the license plate to be detected, and a Sigmoid activation function is finally applied so that the heat map explicitly represents the probability of corresponding to a real corner. The 8 channels of the refine sub-branch correspond to the 8 coordinate parameters regressed by the heat map in the x and y directions, and compensate for the discretization error caused by the output stride so that the predicted license plate corners are more accurate.

Step 4-2. In the regression process of the center-point-offset license plate corner localization module, the feature map output by the center-point heat-map sub-branch is first decoded to obtain the position coordinates of the license plate center point, then the feature map output by the offset sub-branch is decoded to compute the offsets from the center point to each corner point, and the corner points are localized by adding the corresponding offsets to the center point. The center-point-offset license plate corner localization module is an auxiliary module of the model: it is only used during training to assist the heat-map license plate corner localization module and does not take part in the final output of the model.
Step 4-3. Relaxed constraint loss based on corner difference.

Compared with directly predicting the license plate corner points, the visual features of the license plate center are more obvious in a low-quality image, so the approximate region of the license plate is easier to lock onto through the license plate center point. Specifically, in the initial stage of model training, the center-point-offset module locks onto the license plate region more effectively than the heat-map module, and at this point it can provide positive guidance for the regression process of the heat-map module. However, as the model gradually converges, the heat-map module can accurately localize the corner positions of the license plate, and at this point the constraint imposed by the center-point-offset module on the heat-map module can have a negative influence. The relaxed constraint loss function presented herein couples the regression processes of the two modules so that more effective constraint and guidance can be produced between them. For the ith corner point, the loss compares the coordinate value predicted directly from the heat map with the coordinate value predicted from the offsets of the license plate center point, where the modulation factor is set to α = 0.1 and the radius of the relaxed boundary to β = 5; the total relaxation loss is the sum of the per-corner losses over the four corner points.

As shown in fig. 4, with the relaxed boundary radius β = 5, samples whose corner-coordinate difference is small receive a smaller gradient, which eliminates the negative influence of the center-point-based corner localization module on the heat-map-based module in the later stage of training; meanwhile, samples with very large differences are likely to cause gradient explosion and erroneous learning.
Step 5, training a low-quality image license plate detection network according to the determined loss function:
the main network ResNet-50 in the low-quality image license plate detection network adopts pre-trained weights on an image Net data set, adopts a CCPD license plate data set to train the low-quality image license plate detection network (LPCDet), and then uses the processed data in the step 3And (3) training data are adjusted, namely, after the LPCDet model is loaded with the trained parameters on the CCPD license plate data set, fine adjustment is carried out through the preprocessed marked training data. Setting the batch size as 28 on Nvidia3080GPU, iterating the total training for 50 rounds, simultaneously using Adam as an optimizer, setting the initial learning rate as 0.01 and setting the weight attenuation as 10 -5 The learning rate is regulated by cosine function, and gradually decreases to 5×10 along with the increment of the learning rate -4 . To enhance generalization of the model, data enhancement is achieved using imgauge, 1 to 3 enhancement operations are randomly selected each time, the enhancement operations include brightness adjustment, contrast adjustment, color gamut conversion, histogram equalization, random rotation and clipping, horizontal inversion, mean shift blurring, motion blurring, affine transformation, perspective transformation, rainy day simulation, and the like.
And 6, detecting and positioning license plate areas in the images through a trained low-quality image license plate detection network.
For an input image to be detected, image resizing and standardization preprocessing are performed, and the image is fed into the trained model for inference, which outputs the corner heat-map feature map and the deviation feature map. For each corner, the index of the maximum over the corner heat map is computed to obtain the prediction with the highest confidence, the index is converted into Cartesian coordinates, and the corresponding deviation is added to it. In summary, license plate corner prediction decoding takes the position of maximum confidence conf_i on the ith corner heat map, recovers its column with the remainder operation (mod) and its row with the integer quotient of the flattened index, and adds the predicted deviation to obtain the coordinate value corresponding to the ith corner point.
Step 7, a TIoU metric is introduced as the detection evaluation index. The currently used IoU evaluation metric cannot accurately reflect the quality of a detection result, and sometimes even counts detections with missing characters or excessive background noise as correct samples. The introduced TIoU metric makes the detection result attend to the whole GT region, i.e. it ensures the completeness of the detection box with respect to the GT, and it penalizes the part of the detection box outside the GT, so that detections with higher scores are also tighter. Based on these two points, a detection result with high TIoU is guaranteed to be better than one with low TIoU. The calculation of recall, precision and F1 score under TIoU is described below.
7-1. Calculation of the TIoU recall: first, the undetected area in the GT is computed:

$$C_t = A(G_i) - A(D_j \cap G_i),\quad C_t \in [0, A(G_i)]$$

wherein A(·) denotes the area of a region, $G_i$ is the ith GT region and $D_j$ is the region of the jth detection box. The intersection ratio weighted by the detected proportion of $G_i$ is computed from this quantity, and the TIoU recall of the pair is obtained from the weighted intersection ratio.
calculation of the accuracy of the TIoU: firstly, calculating an abnormal region which is not in a target GT region in a detection frame:
O t =A(D j -D j ∩G i ),O t ∈[0,A(D j )]
the correct detection area intersection ratio is:
similarly, the calculation formula of the accuracy of the TIoU is as follows:
7-3. Metric based on TIoU: to compute the final score, the harmonic mean (F1 score) of recall and precision is used as the main index. The overall recall and precision are computed by accumulating the per-match TIoU recall and precision scores over all ground-truth boxes and all detection boxes respectively, wherein $Num_{gt}$ is the total number of GT boxes and $Num_{dt}$ is the total number of detection boxes.
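An illustrative sketch of TIoU-style scoring for a matched GT/detection pair of axis-aligned boxes; weighting the intersection ratio by the undetected GT area (for recall) and by the detection area outside the GT (for precision) follows the description above, while the exact weighting form and the box matching are assumptions.

```python
def area(box):
    # box: (x1, y1, x2, y2) with x2 >= x1 and y2 >= y1
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])

def inter(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return (x2 - x1) * (y2 - y1) if x2 > x1 and y2 > y1 else 0.0

def tiou_scores(gt, dt):
    """Returns (tiou_recall, tiou_precision) for one matched GT box and detection box."""
    a_g, a_d = area(gt), area(dt)
    i = inter(gt, dt)
    u = a_g + a_d - i
    if a_g == 0.0 or a_d == 0.0 or u == 0.0:
        return 0.0, 0.0
    c_t = a_g - i                                   # undetected GT area
    o_t = a_d - i                                   # detection area outside the GT
    recall = (i / u) * (1.0 - c_t / a_g)            # penalize incomplete coverage of the GT
    precision = (i / u) * (1.0 - o_t / a_d)         # penalize loose, oversized boxes
    return recall, precision

# Dataset level: TIoU recall = sum of per-match recalls / Num_gt,
#                TIoU precision = sum of per-match precisions / Num_dt,
#                F1 = 2 * recall * precision / (recall + precision).
```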
Examples
On the self-built license plate dataset, the detection and recognition accuracy of LPCDet is compared with a baseline model (CenterNet before the improvement), using the same license plate recognition model and feeding it the detection results before and after the improvement. As shown in the table below, under both the conventional IoU and the new TIoU evaluation protocols LPCDet outperforms the baseline model, with detection performance improved by 3.3% and 5.1% respectively and recognition accuracy improved by 0.5%. The advantage of LPCDet over the baseline is further amplified under the stricter TIoU metric, reflecting that the improvement enables the model to detect more efficiently and accurately.
The foregoing is a further detailed description of the invention in connection with specific/preferred embodiments, and it is not intended that the invention be limited to such description. It will be apparent to those skilled in the art that several alternatives or modifications can be made to the described embodiments without departing from the spirit of the invention, and these alternatives or modifications should be considered to be within the scope of the invention.
The invention, in part not described in detail, is within the skill of those skilled in the art.

Claims (7)

1. The low-quality image license plate detection method based on the deep neural network is characterized by comprising the following steps of:
step 1, collecting a low-quality image of a vehicle containing license plates, respectively labeling four license plate corner points of the license plates in the image from the upper left corner to the clockwise direction, and coding labeling results into file names corresponding to the image as labels of training data;
step 2, constructing a low-quality image license plate detection network;
the low-quality image license plate detection network adopts ResNet-50 as a feature extractor, namely a main network, and a feature fusion module is introduced to perform multi-scale feature fusion on a feature map extracted by the main network; the feature images after multi-scale fusion are respectively input into decoupled network subbranches;
step 3, preprocessing the marked training data;
step 4, constraining the positioned angular point coordinates by utilizing angular point difference loss;
step 5, training a low-quality image license plate detection network according to the determined loss function:
and 6, detecting and positioning license plate areas in the images through a trained low-quality image license plate detection network.
2. The method for detecting the license plate of the low-quality image based on the deep neural network according to claim 1, wherein the step 2 is specifically implemented as follows:
the low-quality image license plate detection network adopts ResNet-50 as the feature extractor, namely the backbone network, which outputs multi-scale feature maps whose spatial sizes range from $\tfrac{W}{4}\times\tfrac{H}{4}$ to $\tfrac{W}{32}\times\tfrac{H}{32}$ with batch size B, wherein W and H are the width and height of the input image; a feature fusion module is introduced to perform multi-scale feature fusion on the feature maps extracted by the backbone network, so that the detection network captures richer license plate features; the multi-scale fused feature maps are respectively input into decoupled network sub-branches, which form a heat-map license plate corner localization module and a center-point-offset license plate corner localization module, respectively;
for every network sub-branch, feature extraction is first performed on the input feature map using a convolution block consisting of a 3×3 convolution, a batch normalization layer and a ReLU activation function, while the number of channels of each feature map is kept unchanged; further, the heat-map license plate corner localization module contains a corner heat-map sub-branch for predicting the corner heat map and a refine sub-branch for precisely localizing the corresponding heat-map peaks: the former adjusts the number of channels of the feature map to 4 with a 1×1 convolution and uses a Sigmoid activation function to explicitly represent the predicted heat-map confidence, giving an output of size $B\times 4\times\tfrac{W}{4}\times\tfrac{H}{4}$, and the latter directly uses a 1×1 convolution to adjust the number of channels to 8, giving an output of size $B\times 8\times\tfrac{W}{4}\times\tfrac{H}{4}$; the center-point-offset license plate corner localization module contains a center-point heat-map sub-branch for predicting the license plate center-point heat map and an offset sub-branch for predicting the offsets from the center point to the four corners: the former adjusts the number of channels to 1 with a 1×1 convolution and then uses a Sigmoid activation to represent the confidence of the corresponding center-point heat map, with output size $B\times 1\times\tfrac{W}{4}\times\tfrac{H}{4}$, and the latter directly uses a 1×1 convolution to adjust the number of channels to 8, with output size $B\times 8\times\tfrac{W}{4}\times\tfrac{H}{4}$.
3. The method for detecting the license plate of the low-quality image based on the deep neural network according to claim 2, wherein the specific method in the step 3 is as follows:
3-1. The image is resized to the size expected by the network input (512 × 512) using bilinear interpolation; to avoid image distortion and reduce the computation of model inference, the maximum side length of the image is first computed, the shorter-side region is padded with gray bars, and the processed image is then scaled proportionally to the target size;
3-2. The resized license plate image is standardized with the mean and standard deviation computed on the ImageNet dataset: the value of each channel is scaled, the mean is subtracted and the result is divided by the standard deviation, i.e.

$$I_{norm} = \frac{I / \text{pixel} - \text{mean}}{\text{std}}$$

wherein I is the resized image, pixel = 255 is the maximum pixel threshold of the image, mean denotes the per-channel mean of the images in the ImageNet dataset, and std denotes the per-channel standard deviation of the images in the ImageNet dataset; after standardization the network converges faster during training, and the generalization of the model is effectively improved;
3-3. For the corner coordinates p of the license plate tag in each image, the scaled low-resolution corner coordinates $\tilde{p} = \left\lfloor \frac{p}{R} \right\rfloor$ are first computed, where R is the scaling factor with value 4; all corner points are then mapped onto the heat map with a Gaussian kernel, the value assigned at location (x, y) of the corresponding corner channel being

$$\exp\!\left(-\frac{(x-\tilde{p}_x)^2+(y-\tilde{p}_y)^2}{2\sigma_p^2}\right)$$

wherein $\sigma_p$ is the self-adaptive standard deviation of the current license plate; in order to reduce the discretization error caused by the output stride, the offset corresponding to the ith corner is additionally computed as $o_i = \frac{p_i}{R} - \left\lfloor \frac{p_i}{R} \right\rfloor$;
the center point of the license plate region is computed from the license plate corner coordinates as $c = \frac{1}{4}\sum_{i=1}^{4} p_i$, and the offset distances $\delta_i = (\delta_i^x, \delta_i^y) = p_i - c$ are obtained, wherein $\delta_i^x$ and $\delta_i^y$ represent the offsets of the ith corner point from the center point in the x and y directions.
4. The method for detecting the license plate of the low-quality image based on the deep neural network according to claim 3, wherein the specific method in the step 4 is as follows:
the robustness of corner localization is improved through a relaxed constraint strategy; the strategy includes a relaxed constraint loss that is used to couple the two regression modes;
step 4-1. In the regression process of the heat-map license plate corner localization module, the feature map output by the corner heat-map sub-branch is first decoded to obtain the position coordinates of the license plate corner points; the feature map output by the refine sub-branch is then decoded to compute the deviation used to precisely localize the corresponding license plate corner; the precise coordinates of the license plate corners are obtained from the obtained corner position coordinates and deviations;
step 4-2. In the regression process of the center-point-offset license plate corner localization module, the feature map output by the center-point heat-map sub-branch is first decoded to obtain the position coordinates of the license plate center point, then the feature map output by the offset sub-branch is decoded to compute the offsets from the center point to each corner point, and the corner points are localized by adding the corresponding offsets to the center point;
step 4-3. Relaxed constraint loss based on corner difference;
the relaxed constraint loss function couples the regression processes of the two modules so that more effective constraint and guidance can be produced between them; for the ith corner point, the loss compares the coordinate value predicted directly from the heat map with the coordinate value predicted from the offsets of the license plate center point, wherein α is the modulation factor and β is the radius of the relaxed boundary; the total relaxation loss is the sum of the per-corner losses over the four corner points.
5. The method for detecting the license plate of the low-quality image based on the deep neural network according to claim 4, wherein the specific method in the step 5 is as follows:
the ResNet-50 uses weights pre-trained on the ImageNet dataset; the low-quality image license plate detection network (LPCDet) is trained with the CCPD license plate dataset and then adjusted with the training data processed in the step 3; the batch size is set to 28 on an Nvidia 3080 GPU and training iterates for 50 epochs in total; Adam is used as the optimizer with an initial learning rate of 0.01 and a weight decay of $10^{-5}$, and the learning rate is adjusted by a cosine schedule, gradually reduced to $5\times10^{-4}$ as the training rounds increase; in order to improve the generalization of the model, data augmentation is implemented with the imgaug library, and 1 to 3 augmentation operations are randomly selected for a single image each time during training.
6. The method of claim 5, wherein the augmentation operations comprise brightness adjustment, contrast adjustment, color gamut conversion, histogram equalization, random rotation and cropping, horizontal flipping, mean-shift blurring, motion blurring, affine transformation, perspective transformation, and rainy-day simulation.
7. The method for detecting the license plate of the low-quality image based on the deep neural network according to claim 5 or 6, wherein the specific method in the step 6 is as follows:
for the input image to be detected, image resizing and standardization preprocessing are performed, and the image is fed into the trained model for inference, which outputs the corner heat-map feature map and the deviation feature map; for each corner, the index of the maximum over the corner heat map is computed to obtain the prediction with the highest confidence, the index is converted into Cartesian coordinates, and the corresponding deviation is added to it; in summary, license plate corner prediction decoding takes the position of maximum confidence conf_i on the ith corner heat map, recovers its column with the remainder operation (mod) and its row with the integer quotient of the flattened index, and adds the predicted deviation to obtain the coordinate value corresponding to the ith corner point.
CN202310972307.7A 2023-08-03 2023-08-03 Low-quality image license plate detection method based on deep neural network Pending CN116994236A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310972307.7A CN116994236A (en) 2023-08-03 2023-08-03 Low-quality image license plate detection method based on deep neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310972307.7A CN116994236A (en) 2023-08-03 2023-08-03 Low-quality image license plate detection method based on deep neural network

Publications (1)

Publication Number Publication Date
CN116994236A true CN116994236A (en) 2023-11-03

Family

ID=88533517

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310972307.7A Pending CN116994236A (en) 2023-08-03 2023-08-03 Low-quality image license plate detection method based on deep neural network

Country Status (1)

Country Link
CN (1) CN116994236A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117671473A (en) * 2024-02-01 2024-03-08 中国海洋大学 Underwater target detection model and method based on attention and multi-scale feature fusion
CN117671473B (en) * 2024-02-01 2024-05-07 中国海洋大学 Underwater target detection model and method based on attention and multi-scale feature fusion
CN117994594A (en) * 2024-04-03 2024-05-07 武汉纺织大学 Power operation risk identification method based on deep learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination