CN111079753A - License plate recognition method and device based on deep learning and big data combination


Info

Publication number
CN111079753A
CN111079753A, CN201911325648.5A, CN111079753B
Authority
CN
China
Prior art keywords
training
license plate
deep learning
layer
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911325648.5A
Other languages
Chinese (zh)
Other versions
CN111079753B (en)
Inventor
罗茜
张斯尧
王思远
蒋杰
张�诚
李乾
谢喜林
黄晋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changsha Qianshitong Intelligent Technology Co ltd
Original Assignee
Changsha Qianshitong Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changsha Qianshitong Intelligent Technology Co ltd filed Critical Changsha Qianshitong Intelligent Technology Co ltd
Priority to CN201911325648.5A priority Critical patent/CN111079753B/en
Publication of CN111079753A publication Critical patent/CN111079753A/en
Application granted granted Critical
Publication of CN111079753B publication Critical patent/CN111079753B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/63Scene text, e.g. street names
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/625License plates
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a license plate recognition method and device based on deep learning and big data combination, wherein the method comprises the following steps: carrying out iterative training on the deep learning model of the big data vehicle image through an improved random gradient descent iterative algorithm; batch training the loaded more vehicle image data through an improved linear scaling and preheating strategy to improve the accuracy of training the deep learning model of the big data vehicle image; training a large batch of training layers in the batch training through an improved adaptive rate scaling algorithm to obtain a rapid training model; inputting the vehicle image trained by the deep learning model and the rapid training model into the RNN, and detecting whether the ROI of the vehicle image is a license plate or not; if yes, license plate recognition is carried out on the ROI through the integrated depth network model, and therefore efficiency and accuracy of large-data license plate recognition can be improved.

Description

License plate recognition method and device based on deep learning and big data combination
Technical Field
The invention belongs to the technical field of computer vision and intelligent traffic, and particularly relates to a license plate recognition method and device based on deep learning and big data combination, terminal equipment and a computer readable medium.
Background
At present, distributed training of models based on big data and deep learning is an important research foundation of deep learning networks in the field of computer vision. In general, for deep learning applications, a larger data set and a larger model can result in a significant increase in accuracy, but at the cost of longer training times. With the rise of deep learning in recent years, many researchers have tried to construct deep learning network training models that achieve both accuracy and efficiency, aiming to train on real vehicle images, pedestrian images and the like, so distributed training methods have wide application value in real scenes. The existing training methods for big data vehicle-image deep learning models have the defects of low training speed, high training cost and the like; for example, training a 50-layer residual network (ResNet-50) on millions of vehicle images using an NVIDIA M40 GPU takes nearly 14 days, and this training requires a total of about 10^18 single-precision operations. This is clearly disadvantageous both in terms of time and cost. Moreover, the existing license plate recognition technology requires a large amount of calculation for license plate character segmentation, which causes problems such as slow and inaccurate license plate recognition and poor real-time performance.
Disclosure of Invention
In view of this, embodiments of the present invention provide a license plate recognition method and apparatus based on deep learning and big data combination, a terminal device, and a computer readable medium, which can improve efficiency and accuracy of big data license plate recognition.
The first aspect of the embodiment of the invention provides a license plate recognition method based on deep learning and big data combination, which comprises the following steps:
carrying out iterative training on the deep learning model of the big data vehicle image through an improved random gradient descent iterative algorithm; wherein, each time the deep learning model of the big-data vehicle image is iteratively trained, more vehicle image data is loaded using more processors than the previous iterative training;
performing batch training on the loaded vehicle image data through an improved linear scaling and warm-up strategy to improve the accuracy of training the deep learning model of the big data vehicle images, wherein the improved linear scaling comprises increasing the batch size from B to kB while increasing the learning rate from η to kη, and the improved warm-up strategy comprises, when the relatively large learning rate kη is to be used, starting from the relatively small learning rate η and increasing it to kη over the first several epochs;
training a large batch of training layers in the batch training through an improved adaptive rate scaling algorithm to obtain a rapid training model;
inputting the vehicle images trained by the deep learning model and the rapid training model into a recurrent neural network (RNN) to detect whether the region of interest (RoI) of each vehicle image is a license plate;
if the RoI is a license plate, performing license plate recognition on the RoI through an integrated deep network model; the integrated deep network model comprises a convolutional layer, a BRNN layer, a linear transformation layer and a CTC layer.
A second aspect of the embodiments of the present invention provides a license plate recognition device based on deep learning and big data combination, including:
the iterative training module is used for performing iterative training on the deep learning model of the big data vehicle image through an improved random gradient descent iterative algorithm; wherein, each time the deep learning model of the big data vehicle image is iteratively trained, more processors are used to load more vehicle image data than the previous iterative training;
an accuracy training module, configured to perform batch training on the loaded vehicle image data through an improved linear scaling and warm-up strategy to improve the accuracy of training the deep learning model of the big data vehicle images, wherein the improved linear scaling comprises increasing the batch size from B to kB while increasing the learning rate from η to kη, and the improved warm-up strategy comprises, when the relatively large learning rate kη is to be used, starting from the relatively small learning rate η and increasing it to kη over the first several epochs;
the scaling improvement module is used for training a large batch of training layers in the batch training through an improved adaptive rate scaling algorithm to obtain a fast training model;
the license plate detection module is used for inputting the vehicle image trained by the deep learning model and the rapid training model into the RNN so as to detect whether the RoI of the vehicle image is a license plate or not;
the license plate recognition module is used for recognizing the license plate of the RoI through an integrated deep network model when the license plate is detected; the integrated deep network model comprises a convolution layer, a BRNN layer, a linear transformation layer and a CTC layer.
A third aspect of the embodiments of the present invention provides a terminal device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the license plate recognition method based on deep learning and big data combination when executing the computer program.
A fourth aspect of the embodiments of the present invention provides a computer-readable medium, where a computer program is stored, and when the computer program is processed and executed, the steps of the license plate recognition method based on deep learning and big data combination are implemented.
In the license plate recognition method based on the combination of deep learning and big data provided by the embodiment of the present invention, big data license plate images can be trained with the relevant models through an improved stochastic gradient descent iterative algorithm, a linear scaling and warm-up strategy, and an adaptive rate scaling algorithm; the trained vehicle data is input into an RNN to detect whether the RoI of a vehicle image is a license plate, and when the RoI is detected to be a license plate, license plate recognition is performed on the RoI through an integrated deep network model, so that the efficiency and accuracy of big data license plate recognition can be improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without inventive labor.
Fig. 1 is a flowchart of a license plate recognition method based on deep learning and big data combination according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a process of identifying the RoI of a license plate image through an integrated deep network model according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a license plate recognition device based on deep learning and big data combination according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a detailed structure of the license plate detection module in FIG. 3;
FIG. 5 is a schematic diagram of a detailed structure of the license plate recognition module in FIG. 3;
fig. 6 is a schematic diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
In order to explain the technical means of the present invention, the following description will be given by way of specific examples.
Referring to fig. 1, fig. 1 is a diagram illustrating a license plate recognition method based on deep learning and big data combination according to an embodiment of the present invention. As shown in fig. 1, the license plate recognition method based on deep learning and big data combination of the embodiment includes the following steps:
s101: and carrying out iterative training on the deep learning model of the big data vehicle image through an improved random gradient descent iterative algorithm.
In the embodiments of the present invention, generally speaking, an asynchronous method using a parameter server cannot guarantee stability on a large system. For very large deep neural network (DNN) training, the data-parallel synchronization method is more stable. The idea is also simple: by using large batch sizes for stochastic gradient descent (SGD), the work of each iteration can easily be distributed to multiple processors. In the ideal vehicle-image training case, ResNet-50 requires about 7.72 billion single-precision operations to process one 225x225 vehicle image. If 90 epochs are run over the ImageNet data set, the number of operations is 90 × 1.28 million × 7.72 billion, on the order of 10^18. Currently, the most powerful supercomputers can complete 200 × 10^15 single-precision operations per second. If an algorithm were available that made full use of such a supercomputer, the training of ResNet-50 could theoretically be completed in about 5 seconds. For this reason, it is necessary to have the algorithm use more processors and load more vehicle image data at each iteration, thereby reducing the total training time. Generally, a larger batch makes a single GPU faster (as shown in fig. 2), because the low-level matrix computation libraries become more efficient. For training the ResNet-50 model on ImageNet, the optimal batch size per GPU is 512. If many GPUs are to be used with each GPU kept active, a larger overall batch size is required; for example, with 16 GPUs, the batch size should be set to 16 × 512 = 8192. Ideally, if the total number of data accesses is fixed and the batch size increases linearly with the number of processors, the number of improved SGD iterations decreases linearly while the time cost per iteration remains the same, so the total time decreases linearly with the number of processors.
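As a quick sanity check of the arithmetic above (assuming 1.28 million ImageNet images and 7.72 billion single-precision operations per vehicle image — round figures implied by the text, not verified against the patent's data), the totals work out as follows:

```python
# Sanity check of the training-cost arithmetic quoted above.
# Assumed figures: 90 epochs, 1.28e6 ImageNet images,
# 7.72e9 single-precision operations per vehicle image.
epochs = 90
images = 1.28e6
ops_per_image = 7.72e9

total_ops = epochs * images * ops_per_image
print(f"total operations = {total_ops:.2e}")  # ~8.9e17, i.e. on the order of 10^18

supercomputer_flops = 200e15  # 200 * 10^15 single-precision ops per second
ideal_seconds = total_ops / supercomputer_flops
print(f"ideal training time = {ideal_seconds:.1f} s")  # close to the quoted 5 seconds
```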
The specific improved SGD iterative algorithm is as follows. Let w represent the weights of the deep neural network (DNN), X the training data, n the number of samples in X, and Y the labels of the training data X. Let x_i be a sample in X and l(x_i, y_i, w) the loss computed from x_i and its label y_i (i ∈ {1, 2, …, n}). The embodiment of the present invention uses a loss function such as the cross-entropy function, and the goal of DNN training is to minimize the loss function in equation (1):

    L(w) = (1/n) Σ_{i=1}^{n} l(x_i, y_i, w)    (1)

where w represents the weights of the DNN, X is the training data, n is the number of samples in X, Y represents the labels of the training data X, and x_i is a sample in the training data X.

In the t-th iteration, the embodiment of the present invention uses forward and backward propagation to find the gradient of the loss function with respect to the weights. This gradient is then used to update the weights; equation (2) for updating the weights according to the gradient is as follows:

    w_{t+1} = w_t − η · (1/|B_t|) Σ_{(x,y)∈B_t} ∇l(x, y, w_t)    (2)

where w_t is the weight after the (t−1)-th iteration, w_{t+1} is the weight after the t-th iteration, η is the learning rate, and B_t is the batch of the t-th iteration. In the embodiment of the present invention, the batch size of the t-th iteration is |B_t| = b, so the weights can be updated according to equation (3):

    w_{t+1} = w_t − (η/b) Σ_{(x,y)∈B_t} ∇l(x, y, w_t)    (3)

To simplify the expression, the update rule can be written as equation (4), in which the gradient of the loss with respect to the weights, ∇w_t, is used to update the weight w_t to w_{t+1}:

    w_{t+1} = w_t − η ∇w_t    (4)
By iterating in this way and using as many processors as possible to load more image data, the training time can be reduced nearly linearly and greatly. In addition, the deep learning model is established before the iterative training through the improved stochastic gradient descent iterative algorithm; since the method for establishing the deep learning model is the same as in the prior art, it is not described in detail here.
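The mini-batch update in equations (3) and (4) can be sketched as follows. This is a minimal NumPy illustration on a synthetic least-squares loss; the data, dimensions and hyperparameters are hypothetical stand-ins, not the patent's vehicle-image implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear-regression data standing in for image features and labels.
X = rng.normal(size=(1024, 8))
true_w = rng.normal(size=8)
y = X @ true_w

w = np.zeros(8)   # model weights
eta, b = 0.1, 64  # learning rate and batch size, as in equation (3)

for t in range(200):
    idx = rng.choice(len(X), size=b, replace=False)  # mini-batch B_t
    Xb, yb = X[idx], y[idx]
    grad = (2.0 / b) * Xb.T @ (Xb @ w - yb)  # (1/b) * sum of per-sample gradients
    w -= eta * grad                          # w_{t+1} = w_t - eta * grad

print(float(np.mean((X @ w - y) ** 2)))  # final loss, should be near zero
```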
S102: and training the loaded more image data in batch through an improved linear scaling and preheating strategy so as to improve the accuracy of training the deep learning model.
In the embodiment of the present invention, when training with a large batch, it is necessary to ensure that the test accuracy is as good as that of a small batch while running the same number of epochs. The number of epochs is fixed here because, statistically, one epoch means that the algorithm touches the entire data set once, and, computationally, a fixed number of epochs means a fixed number of floating-point operations. The present invention trains large batches of data using an improved linear scaling and warm-up strategy: 1. linear scaling, which increases the batch size from B to kB while also increasing the learning rate from η to kη; 2. the warm-up strategy, which, if a relatively large learning rate kη is to be used, starts from the relatively small learning rate η and increases it to kη over the first few epochs.
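The linear scaling and warm-up schedule described above can be sketched as a simple schedule function. The values of η, k and the warm-up length below are illustrative assumptions, not the patent's settings:

```python
# Sketch of the linear-scaling and warm-up schedule described above.
# eta is the small-batch learning rate; k is the batch-scaling factor
# (batch size grows from B to k*B, so the target rate is k*eta).
def learning_rate(epoch, eta=0.1, k=16, warmup_epochs=5):
    """Learning rate at a given epoch for a batch scaled from B to k*B."""
    target = k * eta                  # linear scaling: eta -> k*eta
    if epoch < warmup_epochs:         # warm-up: ramp from eta up to k*eta
        ramp = epoch / warmup_epochs
        return eta + (target - eta) * ramp
    return target                     # after warm-up, use the scaled rate

for e in [0, 1, 5, 10]:
    print(e, learning_rate(e))
```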
S103: and training the large-batch training layers in the batch training through an improved adaptive rate scaling algorithm to obtain a fast training model.
Specifically, in order to improve the accuracy of large-batch training, the embodiment of the present invention uses a new rule for updating the learning rate (LR). Consider first the single-machine case, which uses

    w_{t+1} = w_t − η ∇w_t

to update the weights; with the data-parallel approach, multiple machines can be handled in the same manner. Each layer of the deep learning model has its own weight w and gradient ∇w. The standard SGD algorithm uses the same LR η for all layers. However, from routine experiments it can be observed that different layers may require different LRs, because the ratio between the weight norm ||w|| and the weight-gradient norm ||∇w|| differs greatly from layer to layer. The embodiment of the present invention solves this problem using an improved LARS algorithm (a new learning-rate update rule). The scaling factor l in the basic LR rule can be set to 0.001 during AlexNet and ResNet training; γ is a tuning parameter of the user, and a good γ usually takes a value in [1, 50]. With this rule, different layers may have different LRs. Momentum (denoted μ) and weight decay (denoted β) are added to SGD, and the method comprises the following steps: obtaining a local learning rate η for each learnable parameter in the large-batch training layers; obtaining a true learning rate η′ for each layer in the large-batch training layers, the true learning rate being η′ = γ × α × η, where γ is the user's tuning parameter with range of values [1, 50] and α is an acceleration term; updating the weight gradient by

    ∇w ← ∇w + β w

where ∇w is the weight gradient, w is the weight, and β is the weight decay; and then updating the momentum and the weights by

    v ← μ v + η′ ∇w,    w ← w − v

In order to scale to larger batch sizes (e.g., 32k), Local Response Normalization (LRN) needs to be changed to Batch Normalization (BN), so the present method adds BN after each convolutional layer of the deep neural network. The improved LARS provided by the embodiments of the present invention can help ResNet-50 maintain high test accuracy.
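The layer-wise update above can be sketched as follows. The exact form of the acceleration term α is not spelled out in the text, so taking α as the norm ratio ||w|| / ||∇w|| is an assumption of this sketch, as are all numeric values:

```python
import numpy as np

def lars_update(w, grad, eta_local, gamma=10.0, beta=5e-4, mu=0.9, v=None):
    """One LARS-style step for a single layer (hedged sketch).

    alpha = ||w|| / ||grad|| is an ASSUMED form of the acceleration term;
    the true learning rate is eta' = gamma * alpha * eta_local.
    """
    if v is None:
        v = np.zeros_like(w)
    grad = grad + beta * w                                   # fold in weight decay
    alpha = np.linalg.norm(w) / (np.linalg.norm(grad) + 1e-12)
    eta_true = gamma * alpha * eta_local                     # eta' = gamma * alpha * eta
    v = mu * v + eta_true * grad                             # momentum buffer
    return w - v, v                                          # updated weights, momentum

w = np.array([1.0, -2.0, 0.5])
g = np.array([0.1, 0.1, -0.2])
w_new, v = lars_update(w, g, eta_local=0.01)
print(w_new)
```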
S104: inputting the vehicle image trained by the deep learning model and the fast training model into a Recurrent Neural Network (RNN) to detect whether a region of interest (RoI) of the vehicle image is a license plate.
In the embodiment of the invention, the vehicle images trained by the deep learning model and the fast training model can be input into the RNN, and the trained vehicle images are subjected to RoI pooling using the RNN. An extraction layer is then added between two fully connected (FC) layers in the RNN to convert the pooled features (also called region features) into feature vectors. The RoI can then be scored and subjected to bounding-box regression through the feature vectors, and whether the RoI is a license plate is judged according to the score and the box regression. If the RoI is detected to be a license plate, the process goes to S105. It can be understood that, since an extraction layer is added between the two FC layers and the license plate is detected by way of scoring and box regression, a new RNN different from the prior art is constructed in the embodiment of the present invention.
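The detection head described in this step can be sketched as follows. All layer sizes and the random weights are hypothetical stand-ins for illustration only; a trained network would use learned parameters:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical pooled RoI feature, flattened to 512 values.
roi_feat = rng.normal(size=512)

# Two FC layers with an extraction layer between them, as described above;
# the weights here are random stand-ins, not trained parameters.
W1, b1 = rng.normal(size=(256, 512)) * 0.01, np.zeros(256)
W2, b2 = rng.normal(size=(64, 256)) * 0.01, np.zeros(64)
h = np.maximum(0, W1 @ roi_feat + b1)   # FC layer 1 + ReLU
feat_vec = np.maximum(0, W2 @ h + b2)   # extraction layer -> compact feature vector

# Heads: a plate/non-plate score and a 4-value bounding-box regression.
w_cls = rng.normal(size=64) * 0.01
W_box = rng.normal(size=(4, 64)) * 0.01
score = 1.0 / (1.0 + np.exp(-(w_cls @ feat_vec)))  # sigmoid plate probability
box_delta = W_box @ feat_vec                       # (dx, dy, dw, dh) offsets

print(float(score), box_delta.shape, bool(score > 0.5))
```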
S105: and carrying out license plate identification on the RoI through an integrated deep network model.
In the embodiment of the invention, because character segmentation of the license plate image requires a large amount of calculation, in order to improve recognition efficiency, the invention provides an integrated deep network model for license plate recognition, which comprises convolutional layers, a bidirectional recurrent neural network (BRNN) layer, a linear transformation layer and a Connectionist Temporal Classification (CTC) layer. Specifically, the method for recognizing the region of interest of the vehicle image through the integrated deep network model can be understood with reference to fig. 2, as follows:
In the first step, the region of interest of the vehicle image (e.g., a plate such as Gui A·02U10) is subjected to feature extraction after RoI pooling, and the extracted features (e.g., region features of size C × X × Y) are processed through two convolutional layers and a rectangular pooling layer between them, so as to transform the extracted features into a feature sequence of size D × L, where D = 512 and L = 19; the feature sequence is denoted V = (v1, v2, …, vL).
In the second step, the feature sequence V is applied at the BRNN layer to form two separate RNNs: one RNN processes the feature sequence V forward, and the other processes it backward. The two hidden states are concatenated together and input into a linear transformation layer with 37 outputs, which feeds a Softmax layer; the 37 outputs are converted into probabilities corresponding to 26 letters, 10 digits and one non-character class. Encoded by the BRNN layer, the feature sequence V is thus converted into a probability estimate q = (q1, q2, …, qL) of the same length as L. Meanwhile, a long short-term memory (LSTM) network is used to define memory cells comprising three multiplicative gates, so as to selectively store relevant information and alleviate the vanishing-gradient problem in RNN training.
In the third step, the probability estimate q is sequence-decoded through the CTC layer, and the approximately optimal path with the maximum probability is found through the decoded probability estimate q:

    π* = B(arg max_π P(π | q))

where π* is the near-optimal path with the highest probability (e.g., Gui A·02U10), the operator B collapses repeated tokens into one and removes non-character tokens, and P is the probability operation. An example: B(a−ab−) = B(−aa−−abb) = aab. The specific details of the CTC follow the structure of existing CTCs and are not described here.
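The greedy decoding with the B operator can be sketched as follows; the alphabet and per-step probabilities here are illustrative, not the 37-class output of the patent's network:

```python
# Greedy CTC decoding as described above: pick the most probable symbol per
# time step (the approximate best path), then apply the B operator, which
# collapses repeated tokens and removes the non-character (blank) token.
def ctc_greedy_decode(probs, alphabet, blank="-"):
    path = [alphabet[max(range(len(p)), key=p.__getitem__)] for p in probs]
    decoded, prev = [], None
    for ch in path:
        if ch != prev and ch != blank:  # drop repeats first, then drop blanks
            decoded.append(ch)
        prev = ch
    return "".join(decoded)

# Example mirroring B(-aa--abb) = aab, with alphabet (a, b, blank).
alphabet = ["a", "b", "-"]
probs = [
    [0.10, 0.10, 0.80],  # -
    [0.90, 0.05, 0.05],  # a
    [0.90, 0.05, 0.05],  # a
    [0.10, 0.10, 0.80],  # -
    [0.10, 0.10, 0.80],  # -
    [0.90, 0.05, 0.05],  # a
    [0.05, 0.90, 0.05],  # b
    [0.05, 0.90, 0.05],  # b
]
print(ctc_greedy_decode(probs, alphabet))  # aab
```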
In the fourth step, the loss function of the integrated deep network model is determined through the approximately optimal path, and license plate recognition is performed on the RoI through the loss function (for example, the license plate is recognized as Gui A·02U10). The method of recognizing the RoI through the overall loss function of the model is the same as in the prior art, and is therefore not repeated here. It should be noted that, in addition to the main convolutional layers (two), the BRNN layer, the linear transformation layer and the CTC layer, the integrated deep network model may also include a Softmax layer and a rectangular pooling layer between the two convolutional layers; the convolutional layers can also be regarded as a convolutional neural network.
In the license plate recognition method based on the combination of deep learning and big data provided in fig. 1, the deep learning model of big data vehicle images can be iteratively trained through an improved stochastic gradient descent iterative algorithm, with more processors loading more image data at each iterative training than at the previous one; the loaded image data is batch trained through an improved linear scaling and warm-up strategy to adjust the training accuracy; the large-batch training layers in the batch training are trained through an improved adaptive rate scaling algorithm to obtain a fast training model; and the vehicle images trained by the deep learning model and the fast training model undergo license plate detection and recognition through the improved RNN and the integrated deep network model. Therefore, the training cost of license plate recognition can be reduced, character segmentation with a large amount of calculation is avoided, and the efficiency and effectiveness of big data license plate recognition are improved.
Referring to fig. 3, fig. 3 is a block diagram illustrating a license plate recognition device based on deep learning and big data combination according to an embodiment of the present invention. As shown in fig. 3, the license plate recognition device 30 based on deep learning and big data combination of the present embodiment includes an iterative training module 301, a precision training module 302, a scaling improvement module 303, a license plate detection module 304, and a license plate recognition module 305. The iterative training module 301, the accuracy training module 302, the scaling improvement module 303, the license plate detection module 304, and the license plate recognition module 305 are respectively configured to perform the specific methods in S101 to S105 in fig. 1, and details can be referred to the related introduction of fig. 1, which is only briefly described here:
The iterative training module 301 is configured to perform iterative training on the deep learning model of the big-data vehicle image through an improved stochastic gradient descent iterative algorithm; wherein, each time the deep learning model of the big data vehicle image is iteratively trained, more processors are used to load more vehicle image data than in the previous iterative training.
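As a hedged illustration of the iterative training step above (not the patent's actual implementation), one mini-batch stochastic gradient descent update can be sketched in Python; the helper name `sgd_step` and the flat list-of-floats weight representation are assumptions for illustration only:

```python
def sgd_step(weights, per_sample_grads, lr):
    """One mini-batch SGD update: w <- w - (lr / b) * sum of per-sample
    gradients, where b is the batch size. In the patent's scheme both b and
    the number of loader processors grow from one iteration to the next."""
    b = len(per_sample_grads)
    # Sum the per-sample gradients component-wise over the batch.
    summed = [sum(g) for g in zip(*per_sample_grads)]
    return [w - (lr / b) * g for w, g in zip(weights, summed)]
```

A caller would recompute `per_sample_grads` on a progressively larger batch each iteration while scaling `lr` accordingly (see the warm-up strategy below in the text).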
An accuracy training module 302, configured to batch-train the loaded vehicle image data through an improved linear scaling and warm-up strategy to improve the accuracy of training the deep learning model of the big data vehicle images. The improved linear scaling increases the batch size from B to kB while increasing the learning rate from η to kη; the improved warm-up strategy provides that, when the relatively large learning rate kη is to be used, training starts from the relatively small learning rate η and increases it to kη over the first several time periods.
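The linear scaling rule with gradual warm-up described above can be sketched as a simple learning-rate schedule; the function name `warmup_lr` and the linear ramp shape are illustrative assumptions:

```python
def warmup_lr(epoch, warmup_epochs, eta, k):
    """Gradual warm-up for the linear scaling rule: when the batch grows
    from B to k*B, ramp the learning rate linearly from eta to k*eta over
    the first warmup_epochs, then hold it at the scaled rate k*eta."""
    if epoch >= warmup_epochs:
        return k * eta
    return eta + (k * eta - eta) * epoch / warmup_epochs
```

For example, with η = 0.1, k = 8 and a 5-epoch warm-up, the rate starts at 0.1 and reaches 0.8 at epoch 5.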
A scaling improvement module 303, configured to train the large-batch training layers in the batch training through an improved adaptive rate scaling algorithm to obtain a fast training model.
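The patent does not fully specify its "improved adaptive rate scaling" here; as a hedged sketch, the following follows the standard layer-wise adaptive rate scaling (LARS) recipe, which the later claims (local rate, trust coefficient γ, weight decay β, momentum μ, acceleration term α) appear to build on. All names and the 1e-12 stabilizer are illustrative assumptions:

```python
import math

def lars_step(w, grad, eta, gamma, mu, beta, accel):
    """One layer-wise adaptive rate scaling step (standard-LARS sketch):
    the layer's local rate is the ratio of weight norm to gradient norm,
    scaled by the trust coefficient gamma; weight decay beta*w is folded
    into the gradient; the acceleration term accumulates with momentum mu;
    finally the weights are updated as w <- w - alpha."""
    norm_w = math.sqrt(sum(x * x for x in w))
    norm_g = math.sqrt(sum(x * x for x in grad))
    local = norm_w / (norm_g + beta * norm_w + 1e-12)      # layer-wise local rate
    g = [gi + beta * wi for gi, wi in zip(grad, w)]        # gradient + weight decay
    accel = [mu * a + gamma * local * eta * gi for a, gi in zip(accel, g)]
    w = [wi - a for wi, a in zip(w, accel)]                # w <- w - alpha
    return w, accel
```

Because the local rate is computed per layer, layers with small gradients relative to their weights still take usefully sized steps in large-batch training.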
And the license plate detection module 304 is configured to input the vehicle image trained by the deep learning model and the fast training model into the RNN, so as to detect whether the RoI of the vehicle image is a license plate.
The license plate recognition module 305 is used for recognizing the license plate of the RoI through an integrated deep network model when the license plate is detected; the integrated deep network model comprises a convolution layer, a BRNN layer, a linear transformation layer and a CTC layer.
Further, referring to fig. 4, the license plate detection module 304 may specifically include a pooling unit 3041, a conversion unit 3042, and a determination unit 3043:
a pooling unit 3041, configured to input the vehicle images trained by the deep learning model and the fast training model into an RNN, and perform RoI pooling on the trained vehicle images by using the RNN.
A converting unit 3042, configured to convert the pooled features into feature vectors by adding an extraction layer to two fully-connected layers in the RNN.
The determining unit 3043 is configured to perform scoring and bounding-box regression on the RoI by using the feature vector, and to determine whether the RoI is a license plate according to the score and the bounding-box regression.
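The scoring-plus-regression decision above can be sketched as a hypothetical tiny detection head (the two linear branches, the sigmoid squashing, and all parameter names are assumptions, not the patent's actual network):

```python
import math

def detect_plate(feat, w_score, w_boxes, thresh=0.5):
    """Hypothetical minimal detection head: one linear scoring branch
    squashed by a sigmoid into a plate probability, and one linear branch
    producing 4 bounding-box regression offsets; the RoI is accepted as a
    license plate when the score clears the threshold."""
    logit = sum(f * w for f, w in zip(feat, w_score))
    score = 1.0 / (1.0 + math.exp(-logit))
    box = [sum(f * w for f, w in zip(feat, col)) for col in w_boxes]
    return score >= thresh, score, box
```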
Further, referring to fig. 5, the license plate recognition module 305 may specifically include a feature extraction unit 3051, a probability estimation unit 3052, an optimal path unit 3053, and a recognition unit 3054:
A feature extraction unit 3051, configured to perform feature extraction after the RoI pooling of the RoI, and to process the extracted features through two convolutional layers and a rectangular pooling layer between them, so as to transform the extracted features into a feature sequence of size D × L, where D = 512 and L = 19; the feature sequence is denoted V = (v1, v2, ..., vL).
A probability estimation unit 3052, configured to apply the feature sequence V at the BRNN layer, which is formed of two mutually separated recurrent neural networks RNN: one RNN processes the feature sequence V forward and the other processes it backward. The two hidden states are concatenated, input into a linear transformation layer with 37 outputs, and passed to a Softmax layer, which converts the 37 outputs into probabilities corresponding to 26 letters, 10 digits and one non-character class. Through this encoding by the BRNN layer, the feature sequence V is converted into a probability estimate q = (q1, q2, ..., qL) of the same length L. At the same time, an LSTM is used to define a memory cell containing three multiplicative gates, so as to selectively store relevant information and thereby solve the vanishing-gradient problem in RNN training.
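The 37-way Softmax conversion described above, applied independently at each of the L time steps, can be sketched as follows (the `ALPHABET` layout placing the non-character class at index 36 is an assumption for illustration):

```python
import math

# 26 letters + 10 digits; index 36 is reserved for the non-character class.
ALPHABET = [chr(ord('A') + i) for i in range(26)] + [str(d) for d in range(10)]

def to_char_probs(logits):
    """Numerically stable softmax over the 37 linear outputs at one time
    step, yielding probabilities over 26 letters, 10 digits and the
    non-character class."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]
```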
An optimal path unit 3053, configured to perform sequence decoding on the probability estimate q through a CTC layer, and find an approximate optimal path with a maximum probability through the decoded probability estimate q:
π ≈ B(argmax_π P(π|q))

where π is the approximately optimal path with maximum probability, the operator B collapses repeated labels and removes non-character labels at each position, and P denotes the probability.
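The best-path approximation above amounts to a greedy CTC decode: take the argmax label at each step, then apply the B operator. A sketch (the blank-index convention and one-hot test probabilities are illustrative assumptions):

```python
def ctc_greedy_decode(q, alphabet, blank):
    """Approximate best-path CTC decoding: take the argmax label at each of
    the L steps of the probability estimate q, then apply the B operator -
    collapse consecutive repeats and drop the non-character (blank) label -
    to obtain the decoded license plate string."""
    path = [max(range(len(step)), key=step.__getitem__) for step in q]
    out, prev = [], None
    for idx in path:
        if idx != prev and idx != blank:
            out.append(alphabet[idx])
        prev = idx
    return "".join(out)
```

Note that a blank between two identical labels separates them, so "A, A, blank, A, B" decodes to "AAB" rather than "AB".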
And the identification unit 3054 is configured to determine a loss function of the integrated deep network model according to the approximate optimal path, and perform license plate identification on the RoI through the loss function.
The license plate recognition device based on deep learning and big data combination provided by fig. 3 can iteratively train a deep learning model of big-data vehicle images through an improved stochastic gradient descent iterative algorithm, using more processors in each iteration to load more image data than in the previous one. The loaded image data are batch-trained through an improved linear scaling and warm-up strategy to adjust the training accuracy, and license plate detection and recognition are performed on the vehicle images through an improved RNN and an integrated deep network model. The training cost of license plate recognition is thereby reduced, computation-heavy character segmentation is avoided, and the efficiency and effectiveness of big-data license plate recognition are improved.
Fig. 6 is a schematic diagram of a terminal device according to an embodiment of the present invention. As shown in fig. 6, the terminal device 6 of this embodiment includes: a processor 60, a memory 61 and a computer program 62 stored in said memory 61 and executable on said processor 60, such as a program for license plate recognition based on deep learning combined with big data. The processor 60, when executing the computer program 62, implements the steps in the above-described method embodiments, e.g., S101 to S105 shown in fig. 1. Alternatively, the processor 60, when executing the computer program 62, implements the functions of the modules/units in the above-mentioned device embodiments, such as the functions of the modules 301 to 305 shown in fig. 3.
Illustratively, the computer program 62 may be partitioned into one or more modules/units that are stored in the memory 61 and executed by the processor 60 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 62 in the terminal device 6. For example, the computer program 62 may be partitioned into an iterative training module 301, an accuracy training module 302, a scaling improvement module 303, a license plate detection module 304, and a license plate recognition module 305. The specific functions of each module are as follows:
The iterative training module 301 is configured to perform iterative training on the deep learning model of the big-data vehicle image through an improved stochastic gradient descent iterative algorithm; wherein, each time the deep learning model of the big data vehicle image is iteratively trained, more processors are used to load more vehicle image data than in the previous iterative training.
An accuracy training module 302, configured to batch-train the loaded vehicle image data through an improved linear scaling and warm-up strategy to improve the accuracy of training the deep learning model of the big data vehicle images. The improved linear scaling increases the batch size from B to kB while increasing the learning rate from η to kη; the improved warm-up strategy provides that, when the relatively large learning rate kη is to be used, training starts from the relatively small learning rate η and increases it to kη over the first several time periods.
And a scaling improvement module 303, configured to train a large batch of training layers in the batch training through an improved adaptive scaling algorithm to obtain a fast training model.
And the license plate detection module 304 is configured to input the vehicle image trained by the deep learning model and the fast training model into the RNN, so as to detect whether the RoI of the vehicle image is a license plate.
The license plate recognition module 305 is used for recognizing the license plate of the RoI through an integrated deep network model when the license plate is detected; the integrated deep network model comprises a convolution layer, a BRNN layer, a linear transformation layer and a CTC layer.
The terminal device 6 may be a desktop computer, a notebook, a palm computer, a cloud server, or other computing device. The terminal device 6 may include, but is not limited to, a processor 60 and a memory 61. Those skilled in the art will appreciate that fig. 6 is merely an example of the terminal device 6 and does not constitute a limitation thereof; the terminal device may include more or fewer components than those shown, combine some components, or use different components, and may for example also include input/output devices, network access devices, buses, etc.
The processor 60 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 61 may be an internal storage unit of the terminal device 6, such as a hard disk or a memory of the terminal device 6. The memory 61 may also be an external storage device of the terminal device 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the terminal device 6. Further, the memory 61 may also include both an internal storage unit of the terminal device 6 and an external storage device. The memory 61 is used for storing the computer programs and other programs and data required by the terminal device 6. The memory 61 may also be used to temporarily store data that has been output or is to be output.
It will be clear to those skilled in the art that, for convenience and simplicity of description, the foregoing functional units and modules are merely illustrated in terms of their division, and in practical applications, the foregoing functional allocation may be performed by different functional units and modules as needed, that is, the internal structure of the device is divided into different functional units or modules to perform all or part of the above described functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the system can refer to the corresponding process in the foregoing method embodiments, and is not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in jurisdictions; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunications signals in accordance with legislation and patent practice.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (10)

1. A license plate recognition method based on deep learning and big data combination is characterized by comprising the following steps:
carrying out iterative training on the deep learning model of the big data vehicle image through an improved random gradient descent iterative algorithm; wherein, each time the deep learning model of the big data vehicle image is iteratively trained, more processors are used to load more vehicle image data than the previous iterative training;
batch training the loaded vehicle image data through an improved linear scaling and warm-up strategy to improve the accuracy of training the deep learning model of the big data vehicle images, wherein the improved linear scaling comprises increasing the batch size from B to kB while increasing the learning rate from η to kη, and the improved warm-up strategy comprises, when the relatively large learning rate kη is used, starting from the relatively small learning rate η and increasing it to kη over the first several time periods;
training a large batch of training layers in the batch training through an improved adaptive rate scaling algorithm to obtain a rapid training model;
inputting the vehicle images trained by the deep learning model and the rapid training model into a Recurrent Neural Network (RNN) to detect whether the region of interest (RoI) of the vehicle images is a license plate;
if the RoI is a license plate, performing license plate recognition on the RoI through an integrated deep network model; the integrated deep network model comprises a convolutional layer, a bidirectional recurrent neural network BRNN layer, a linear transformation layer and a connectionist temporal classification CTC layer.
2. The license plate recognition method based on deep learning and big data combination of claim 1, wherein the iterative training of the deep learning model of the big data vehicle image through the improved stochastic gradient descent iterative algorithm comprises:
constructing a loss function L(w) of the deep learning model of the big data vehicle image:

L(w) = (1/n) Σ_{i=1}^{n} l(x_i, y_i, w)

wherein w represents the weights of the deep neural network DNN, X is the training data, n is the number of samples in X, Y represents the labels of the training data X, x_i is a sample in the training data X, and l(x_i, y_i, w) is the loss for x_i and its label y_i (i ∈ {1, 2, ..., n});
updating the weights of the DNN according to the gradient of the loss function with respect to the weights at each iterative training of the deep learning model of the big data vehicle image:

w_{t+1} = w_t − (η/b) Σ_{(x,y)∈B_t} ∇l(x, y, w_t)

wherein w_t is the weight after the (t−1)-th iteration, w_{t+1} is the weight after the t-th iteration, η is the learning rate, B_t is the batch of the t-th iteration, and the size of B_t is b; more processors are used to load more image data in each iterative training than in the previous one.
3. The license plate recognition method based on deep learning and big data combination of claim 1, wherein the training of the large batch of training layers in the batch training through the improved adaptive scaling algorithm to obtain the fast training model comprises:
obtaining a local learning rate η for each learnable parameter in the large-batch training layers of the batch training;

acquiring a real learning rate η′ of each layer in the large-batch training layers, wherein η′ = γ × α × η, γ is a user adjustment parameter with value range [1, 50], and α is the acceleration term;
updating the weight gradient by the formula

∇w ← ∇w + β·w

wherein ∇w is the weight gradient, w is the weight, and β is the weight decay;
updating the acceleration term α by the formula

α ← μ·α + η′·∇w

wherein μ is the momentum;
updating the weights by the formula w ← w − α to obtain the fast training model.
4. The license plate recognition method based on deep learning and big data combination of claim 1, wherein the step of inputting the vehicle image trained by the deep learning model and the fast training model into the RNN to detect whether the RoI of the vehicle image is a license plate comprises:
inputting the vehicle images trained by the deep learning model and the fast training model into an RNN, and performing RoI pooling on the trained vehicle images by using the RNN,
converting the pooled features into feature vectors by adding an extraction layer to two fully-connected layers in the RNN;
performing scoring and bounding-box regression on the RoI through the feature vector, and determining whether the RoI is a license plate according to the score and the bounding-box regression.
5. The license plate recognition method based on deep learning and big data combination of claim 1, wherein the license plate recognition of the ROI through an integrated deep network model comprises:
performing feature extraction after the RoI pooling of the region of interest, and processing the extracted features through two convolutional layers and a rectangular pooling layer between them, so as to transform the extracted features into a feature sequence of size D × L, wherein D = 512 and L = 19, and the feature sequence is denoted V = (v1, v2, ..., vL);
applying the feature sequence V at the BRNN layer, which is formed of two mutually separated recurrent neural networks RNN, wherein one RNN processes the feature sequence V forward and the other processes it backward; concatenating the two hidden states, inputting them into a linear transformation layer with 37 outputs, and passing the outputs to a Softmax layer that converts the 37 outputs into probabilities corresponding to 26 letters, 10 digits and one non-character class, whereby the feature sequence V is converted into a probability estimate q = (q1, q2, ..., qL) of the same length L; and using a long short-term memory network LSTM to define a memory cell containing three multiplicative gates, so as to selectively store relevant information and solve the vanishing-gradient problem in RNN training;
performing sequence decoding on the probability estimate q through the CTC layer, and finding the approximately optimal path with maximum probability from the decoded probability estimate q:

π ≈ B(argmax_π P(π|q))

wherein π is the approximately optimal path with maximum probability, the operator B collapses repeated labels and removes non-character labels at each position, and P denotes the probability;
and determining a loss function of the integrated depth network model through the approximate optimal path, and identifying the license plate of the RoI through the loss function.
6. A license plate recognition device based on deep learning and big data combination, characterized by includes:
the iterative training module is used for performing iterative training on the deep learning model of the big data vehicle image through an improved random gradient descent iterative algorithm; wherein, each time the deep learning model of the big data vehicle image is iteratively trained, more processors are used to load more vehicle image data than the previous iterative training;
an accuracy training module, configured to batch-train the loaded vehicle image data through an improved linear scaling and warm-up strategy to improve the accuracy of training the deep learning model of the big data vehicle images, wherein the improved linear scaling comprises increasing the batch size from B to kB while increasing the learning rate from η to kη, and the improved warm-up strategy comprises, when the relatively large learning rate kη is used, starting from the relatively small learning rate η and increasing it to kη over the first several time periods;
the scaling improvement module is used for training a large batch of training layers in the batch training through an improved adaptive rate scaling algorithm to obtain a fast training model;
the license plate detection module is used for inputting the vehicle image trained by the deep learning model and the rapid training model into the RNN so as to detect whether the RoI of the vehicle image is a license plate or not;
the license plate recognition module is used for recognizing the license plate of the ROI through an integrated depth network model when the license plate is detected; the integrated deep network model comprises a convolution layer, a BRNN layer, a linear transformation layer and a CTC layer.
7. The deep learning and big data combination-based license plate recognition device of claim 6, wherein the license plate detection module comprises:
a pooling unit for inputting the vehicle images trained by the deep learning model and the fast training model into an RNN, and performing RoI pooling on the trained vehicle images by using the RNN,
a conversion unit, configured to add an extraction layer to two fully-connected layers in the RNN to convert the pooled features into feature vectors;
and a judging unit, configured to perform scoring and bounding-box regression on the RoI through the feature vector, and to determine whether the RoI is a license plate according to the score and the bounding-box regression.
8. The deep learning and big data based license plate recognition device of claim 6, wherein the license plate recognition module comprises:
a feature extraction unit, configured to perform feature extraction after the RoI pooling of the region of interest, and to process the extracted features through two convolutional layers and a rectangular pooling layer between them, so as to transform the extracted features into a feature sequence of size D × L, wherein D = 512 and L = 19, and the feature sequence is denoted V = (v1, v2, ..., vL);
a probability estimation unit, configured to apply the feature sequence V at the BRNN layer, which is formed of two mutually separated recurrent neural networks RNN, wherein one RNN processes the feature sequence V forward and the other processes it backward; to concatenate the two hidden states, input them into a linear transformation layer with 37 outputs, and pass the outputs to a Softmax layer that converts the 37 outputs into probabilities corresponding to 26 letters, 10 digits and one non-character class, whereby the feature sequence V is converted into a probability estimate q = (q1, q2, ..., qL) of the same length L; and to use an LSTM to define a memory cell containing three multiplicative gates, so as to selectively store relevant information and solve the vanishing-gradient problem in RNN training;
an optimal path unit, configured to perform sequence decoding on the probability estimate q through the CTC layer, and to find the approximately optimal path with maximum probability from the decoded probability estimate q:

π ≈ B(argmax_π P(π|q))

wherein π is the approximately optimal path with maximum probability, the operator B collapses repeated labels and removes non-character labels at each position, and P denotes the probability;
and the identification unit is used for determining a loss function of the integrated depth network model through the approximate optimal path and identifying the license plate of the RoI through the loss function.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1-5 when executing the computer program.
10. A computer-readable medium, in which a computer program is stored which, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
CN201911325648.5A 2019-12-20 2019-12-20 License plate recognition method and device based on combination of deep learning and big data Active CN111079753B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911325648.5A CN111079753B (en) 2019-12-20 2019-12-20 License plate recognition method and device based on combination of deep learning and big data


Publications (2)

Publication Number Publication Date
CN111079753A true CN111079753A (en) 2020-04-28
CN111079753B CN111079753B (en) 2023-08-22

Family

ID=70316200

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911325648.5A Active CN111079753B (en) 2019-12-20 2019-12-20 License plate recognition method and device based on combination of deep learning and big data

Country Status (1)

Country Link
CN (1) CN111079753B (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109461495A (en) * 2018-11-01 2019-03-12 腾讯科技(深圳)有限公司 A kind of recognition methods of medical image, model training method and server
US20190095730A1 (en) * 2017-09-25 2019-03-28 Beijing University Of Posts And Telecommunications End-To-End Lightweight Method And Apparatus For License Plate Recognition
CN109784476A (en) * 2019-01-12 2019-05-21 福州大学 A method of improving DSOD network
CN110096982A (en) * 2019-04-22 2019-08-06 长沙千视通智能科技有限公司 A kind of video frequency vehicle big data searching method based on deep learning


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhao Shan; Huang Qiangqiang; Qu Hongshan; Liu Xiangli: "Research on improved multi-label deep learning vehicle attribute recognition" *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111931801A (en) * 2020-05-28 2020-11-13 Zhejiang University Dynamic routing network learning method based on path diversity and consistency
CN111931801B (en) * 2020-05-28 2024-03-12 Zhejiang University Dynamic routing network learning method based on path diversity and consistency
CN111832568A (en) * 2020-06-12 2020-10-27 Beijing Baidu Netcom Science and Technology Co., Ltd. License plate recognition method, and training method and device of license plate recognition model
CN111832568B (en) * 2020-06-12 2024-01-12 Beijing Baidu Netcom Science and Technology Co., Ltd. License plate recognition method, and training method and device of license plate recognition model
CN112069717A (en) * 2020-08-19 2020-12-11 Wuyi University Magnetic storm prediction method and device based on multi-mode representation learning and storage medium
WO2022267000A1 (en) * 2021-06-25 2022-12-29 Intel Corporation Methods and apparatus for scale recovery from monocular video

Also Published As

Publication number Publication date
CN111079753B (en) 2023-08-22

Similar Documents

Publication Publication Date Title
CN111079753A (en) License plate recognition method and device based on deep learning and big data combination
CN112184508B (en) Student model training method and device for image processing
CN111382868B (en) Neural network structure searching method and device
US20170061279A1 (en) Updating an artificial neural network using flexible fixed point representation
CN111275107A (en) Multi-label scene image classification method and device based on transfer learning
CN111832440B (en) Face feature extraction model construction method, computer storage medium and equipment
CN110175641B (en) Image recognition method, device, equipment and storage medium
US11934949B2 (en) Composite binary decomposition network
CN111026544B (en) Node classification method and device for graph network model and terminal equipment
CN110689045A (en) Distributed training method and device for deep learning model
CN111553215A (en) Personnel association method and device, and graph convolution network training method and device
WO2021253941A1 (en) Neural network model training method, image classification method, text translation method and apparatus, and device
CN113298152B (en) Model training method, device, terminal equipment and computer readable storage medium
CN113435499B (en) Label classification method, device, electronic equipment and storage medium
WO2023231954A1 (en) Data denoising method and related device
EP4060526A1 (en) Text processing method and device
Chen et al. HRCP: High-ratio channel pruning for real-time object detection on resource-limited platform
CN112085175A (en) Data processing method and device based on neural network calculation
CN114298289A (en) Data processing method, data processing equipment and storage medium
CN116432608A (en) Text generation method and device based on artificial intelligence, computer equipment and medium
WO2021081854A1 (en) Convolution operation circuit and convolution operation method
EP4361843A1 (en) Neural network searching method and related device
CN114155388A (en) Image recognition method and device, computer equipment and storage medium
Wang et al. Lightweight real-time object detection model for UAV platform
CN116580063B (en) Target tracking method, target tracking device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant