CN115601741A - Non-motor vehicle retrograde detection incremental learning and license plate recognition method - Google Patents


Info

Publication number: CN115601741A
Application number: CN202211399484.2A
Authority: CN (China)
Prior art keywords: license plate, motor vehicle, frame, module, retrograde
Legal status: Pending (status is assumed, not a legal conclusion)
Other languages: Chinese (zh)
Inventors: 郑艳伟, 高杨, 孙钦平, 于东晓, 马嘉林, 崔方剑, 张春雨
Assignees: Qingdao Hisense Information Technology Co., Ltd.; Shandong University
Application filed by Qingdao Hisense Information Technology Co., Ltd. and Shandong University

Classifications

    • G06V 20/625: Scenes; scene-specific elements; text, e.g. of license plates; license plates
    • G06N 3/08: Computing arrangements based on biological models; neural networks; learning methods
    • G06V 10/25: Image or video recognition or understanding; image preprocessing; determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/82: Image or video recognition or understanding using pattern recognition or machine learning; using neural networks


Abstract

The invention discloses a non-motor vehicle retrograde detection incremental learning and license plate recognition method comprising a training module, a camera scheduling module, a motion detection module, a GPU scheduling module, a non-motor vehicle retrograde detection module, a license plate detection module, an OCR license plate recognition module, a risk storage and reporting module and a log module. The invention pulls streams from multiple cameras over RTSP and extracts frames from the collected video streams. Batch information of the training data added to the data set is recorded, learning rules are determined according to batch, and catastrophic forgetting is avoided. Non-motor vehicles riding illegally against traffic are detected in the video and the license plate information of the offending vehicles is recognized. The invention combines target detection and OCR recognition with non-motor vehicle retrograde detection, supervising non-motor vehicles effectively by means of artificial intelligence, which improves efficiency and reduces labor cost.

Description

Non-motor vehicle retrograde detection incremental learning and license plate recognition method
Technical Field
The invention relates to the technical field of retrograde detection incremental learning and license plate recognition, and in particular to a retrograde detection incremental learning and license plate recognition method for non-motor vehicles.
Background
With the expansion of cities and the growth of their populations, non-motor vehicles of all kinds have become essential means of transportation in people's daily travel. Their large numbers, high speeds, unpredictable riding behaviour and the difficulty of managing them create serious hidden dangers for traffic safety. Improving the rate at which non-motor vehicle riders obey the law has therefore become an important part of traffic control work. In practice, however, management departments must deploy large amounts of police manpower to individual intersections for on-site control: on the one hand it is difficult to cover every intersection and to exert a lasting, broad deterrent effect, and on the other hand it is difficult to evaluate the effect of on-site enforcement accurately. Relying on manpower to detect non-motor vehicles riding against traffic is therefore unstable and subjective, makes full-angle real-time monitoring difficult, and consumes enormous manpower and material costs.
With the rapid development of computer vision and deep learning, target detection and OCR recognition are applied ever more widely in traditional fields. Convolutional neural networks, deep belief networks and other neural network architectures make it possible to build comprehensive, multi-level learning systems and to exploit the accuracy, efficiency and convenience of artificial intelligence more fully. In particular, the emergence of the LPRNet network has improved OCR recognition accuracy, and the appearance of YOLOv5 has brought a major breakthrough in the real-time performance of target detection.
In practical applications of vision algorithms, a detection algorithm often fails when new categories appear (such as new types of non-motor vehicle) or when completely different scenes are encountered. Retraining a model from scratch is very expensive in time and resources, so incremental learning is required; in incremental learning, catastrophic forgetting is the condition that must be avoided.
Disclosure of Invention
The invention aims to provide a non-motor vehicle retrograde detection incremental learning and license plate recognition method that solves the instability, subjectivity, incompleteness and heavy manpower and material consumption of detecting wrong-way non-motor vehicles manually, avoids catastrophic forgetting in incremental learning, and improves the efficiency and quality of non-motor vehicle supervision.
In order to achieve the purpose, the invention provides the following technical scheme:
the invention mainly uses the technologies of target detection based on deep learning, OCR recognition, incremental learning and real-time streaming protocol RTSP. Specifically, the non-motor vehicle retrograde motion detection incremental learning and license plate recognition method comprises a training module, a camera scheduling module, a motion detection module, a GPU scheduling module, a non-motor vehicle retrograde motion detection module, a license plate detection module, an OCR license plate recognition module, a risk storage and reporting module and a log module, wherein:
1.1 Training module: collects training data, establishes learning rules, realizes incremental learning while avoiding catastrophic forgetting, and finally produces a YOLOv5 non-motor vehicle retrograde detection model and a YOLOv5 license plate detection model;
1.2 Camera scheduling module: based on the RTSP real-time streaming protocol, selects the camera group to connect through a polling algorithm, acquires the video streams and extracts frames from them;
1.3 Motion detection module: detects whether a moving object exists in the picture;
1.4 GPU scheduling module: allocates the corresponding GPU resources to the models of the non-motor vehicle retrograde detection module, the license plate detection module and the OCR module, ensures that inference is performed once a batch of video frames has accumulated, and improves the efficiency and accuracy of the models;
1.5 Non-motor vehicle retrograde detection module: detects wrong-way non-motor vehicles in the camera video based on the YOLOv5 non-motor vehicle retrograde detection model;
1.6 License plate detection module: detects the license plate of a wrong-way non-motor vehicle based on the YOLOv5 license plate detection model;
1.7 OCR license plate recognition module: recognizes the information on the license plate based on an improved LPRNet model;
1.8 Risk storage and reporting module: stores the violation pictures and reports the information of the offending vehicles;
1.9 Log module: stores error and warning information generated while the system is running.
Preferably, the training module (1.1) is implemented as follows: a YOLOv5x network model from the YOLOv5 target detection family is trained to obtain the YOLOv5 non-motor vehicle retrograde detection model, and a YOLOv5s network model is trained to obtain the YOLOv5 license plate detection model. The specific training method for the two models proceeds according to the following steps:
(1) On the basis of the original training set, collect pictures of the newly added categories, annotate them to generate labeling frames, divide the data into a training set, a validation set and a test set, and then build the model for training. Each labeling frame is labeled with the tuple (I, X, Y, w, h, c, b), where I is the picture ID, X is the X coordinate of the upper-left corner of the labeling frame, Y is the Y coordinate of the upper-left corner, w is the frame width, h is the frame height, c is the category label of the frame and b is the batch of the frame. When training the YOLOv5x network model, c = 0 is a bicycle, c = 1 an electric vehicle, c = 2 a motorcycle, c = 3 a pedestrian and c = 4 other (including motor vehicles); when training the YOLOv5s network model, c = 0 is a license plate;
(2) Regarding the determination of the batch of the labeling frame, the batch is set to 1 when the training set is initialized, and the batch of the subsequent newly added pictures is increased by 1;
(3) In incremental learning, fine-tuning is performed on the basis of the original YOLOv5x or YOLOv5s network model, the batch information of the training data added to the data set is recorded, the learning rule is determined according to batch, and catastrophic forgetting is prevented through the following loss function (an illustrative sketch of one possible batch-aware weighting is given after step (4)):
(batch-aware classification loss; the equation appears only as an image in the original publication)
where I^obj is an indicator function meaning that an anchor window participates in the calculation only when it is responsible for the category; b_i is the batch information recorded in step (1), c_i is the class label, and p_j(c_i) is the predicted probability of the category;
(4) For detection box regression, the following coordinate parameterization is used:
t_x = (x - x_a) / w_a,  t_y = (y - y_a) / h_a,  t_w = log(w / w_a),  t_h = log(h / h_a),
t*_x = (x* - x_a) / w_a,  t*_y = (y* - y_a) / h_a,  t*_w = log(w* / w_a),  t*_h = log(h* / h_a),
where x, x* and x_a denote the upper-left X coordinates of the labeling frame, the prediction frame and the anchor frame respectively; y, y* and y_a the upper-left Y coordinates; w, w* and w_a the frame widths; and h, h* and h_a the frame heights. (t_x, t_y, t_w, t_h) are the box regression parameters obtained from the labels during training; the CIoU loss function is adopted to obtain the box regression parameters (t*_x, t*_y, t*_w, t*_h) during prediction, and the optimal (t*_x, t*_y, t*_w, t*_h) are obtained by regression when the CIoU loss is minimized;
finally, the trained YOLOv5 non-motor vehicle retrograde detection model and YOLOv5 license plate detection model are obtained.
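As an illustration of the batch-aware rule in step (3), whose exact loss appears only as an image in the original publication, the following minimal PyTorch sketch assumes one possible weighting: the per-anchor classification loss is scaled by a factor that grows with the age of the sample's batch so that earlier batches are not overwritten. This weighting scheme is an assumption, not the published formula.

```python
# Illustrative sketch only: the weighting below is an assumed batch-aware rule,
# not the patent's published loss function.
import torch
import torch.nn.functional as F

def batch_aware_cls_loss(pred_logits, target_cls, batch_idx, current_batch, alpha=0.5):
    """pred_logits: (N, C) class scores for anchors with I^obj = 1.
    target_cls:  (N,) ground-truth class labels c_i.
    batch_idx:   (N,) batch index b_i recorded with each annotation.
    current_batch: index of the newest data batch (hypothetical parameter)."""
    # per-anchor cross-entropy, i.e. -log p_j(c_i)
    ce = F.cross_entropy(pred_logits, target_cls, reduction="none")
    # assumed rule: samples from older batches receive larger weights so that
    # fine-tuning on new batches does not erase previously learned classes
    age = (current_batch - batch_idx).clamp(min=0).float()
    weights = 1.0 + alpha * age
    return (weights * ce).mean()
```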
preferably, the implementation method of the camera scheduling module (1.2) is as follows:
(1) Write the information of the multiple cameras into a configuration file, including each camera's id, name and frame-grabbing interval;
(2) Connect the corresponding cameras through a polling algorithm and read the video streams based on the RTSP real-time streaming protocol. The polling algorithm is realized as follows:
(2.1) Suppose the cameras are x_1, x_2, x_3, …, x_i and each camera has an initial score s. All cameras are traversed for one round, each being connected for 1 minute; if a wrong-way non-motor vehicle is detected, m points are added, otherwise n points are subtracted. After one round of traversal the cameras are sorted by score, the scores being s_1, s_2, s_3, …, s_i;
(2.2) Then connect the 30 cameras with the highest scores among s_1, s_2, s_3, …, s_i, create a thread for each, disconnect after 3 minutes, and update each camera's score according to the rule in (2.1);
(2.3) Repeat the operation in (2.2); if the number of consecutive connections of a camera exceeds the maximum threshold C, its score is reduced by z, and if a camera has not been connected within a given time T, the next round forcibly connects it;
Through this reward-and-punishment polling algorithm, cameras in whose views many wrong-way vehicles appear are connected preferentially, and CPU and GPU resources are used more effectively.
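For illustration, a minimal Python sketch of the reward-and-punishment polling in (2.1)-(2.3) follows; the concrete values of s, m, n, z, C, T and the group size are assumptions.

```python
import time

class CameraPoller:
    """Sketch of the score-based polling in (2.1)-(2.3); parameter values are assumptions."""

    def __init__(self, camera_ids, init_score=100, reward=5, penalty=1,
                 overuse_penalty=10, max_consecutive=5, max_idle_seconds=3600):
        self.scores = {cam: init_score for cam in camera_ids}
        self.consecutive = {cam: 0 for cam in camera_ids}
        self.last_connected = {cam: time.time() for cam in camera_ids}
        self.reward, self.penalty = reward, penalty
        self.overuse_penalty = overuse_penalty
        self.max_consecutive = max_consecutive
        self.max_idle_seconds = max_idle_seconds

    def update_score(self, cam, found_wrong_way):
        # (2.1) add m points when a wrong-way non-motor vehicle is detected, otherwise subtract n
        self.scores[cam] += self.reward if found_wrong_way else -self.penalty
        self.last_connected[cam] = time.time()

    def next_group(self, group_size=30):
        # (2.3) punish cameras that stayed connected for too many rounds in a row
        for cam, count in self.consecutive.items():
            if count > self.max_consecutive:
                self.scores[cam] -= self.overuse_penalty
                self.consecutive[cam] = 0
        # cameras idle longer than T are forced into the next round
        now = time.time()
        forced = [c for c in self.scores if now - self.last_connected[c] > self.max_idle_seconds]
        # (2.2) otherwise take the highest-scoring cameras
        ranked = sorted(self.scores, key=self.scores.get, reverse=True)
        group = (forced + [c for c in ranked if c not in forced])[:group_size]
        for cam in self.scores:
            self.consecutive[cam] = self.consecutive[cam] + 1 if cam in group else 0
        return group
```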
(3) A disconnection-reconnection mechanism is added to the camera connection process, implemented as follows:
(3.1) Set an error reconnection interval time_error_waiting = 10 and a maximum reconnection interval max_error_waiting = 600 for the connection process; each error multiplies the delay by 4, capped at the maximum:
time_error_waiting = min(max_error_waiting, time_error_waiting * 4);
(3.2) Set a failure reconnection interval timeout = 1 and a maximum failure reconnection interval timeout_max = 60 for the opening process; each failure multiplies the delay by 2, capped at the maximum:
timeout=min(timeout_max,timeout*2)。
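A small sketch of the reconnection back-off in (3.1)-(3.2) follows; using OpenCV's VideoCapture as the RTSP client is an assumption.

```python
import time
import cv2  # assumed RTSP client; any client with the same open/read pattern works

def open_with_backoff(rtsp_url):
    """Reconnect on failure with the capped exponential delays from (3.1)-(3.2)."""
    time_error_waiting, max_error_waiting = 10, 600   # connection-error back-off
    timeout, timeout_max = 1, 60                      # open-failure back-off
    while True:
        try:
            cap = cv2.VideoCapture(rtsp_url)
            if cap.isOpened():
                return cap
            cap.release()
            time.sleep(timeout)
            timeout = min(timeout_max, timeout * 2)                              # (3.2)
        except Exception:
            time.sleep(time_error_waiting)
            time_error_waiting = min(max_error_waiting, time_error_waiting * 4)  # (3.1)
```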
Preferably, the motion detection module (1.3) is implemented as follows:
(1) Whether the camera picture contains motion is detected by the frame-difference method. Because the video sequence collected by a camera is continuous, consecutive frames change very little when there is no moving object in the scene, whereas an obvious change appears between frames when a moving object is present. The method is realized as follows:
(1.1) Let the n-th frame and the (n-1)-th frame of the video sequence be f_n and f_{n-1}, and denote the gray values of corresponding pixels in the two frames by f_n(x, y) and f_{n-1}(x, y). Subtract the gray values of corresponding pixels of the two frames and take the absolute value to obtain the difference image D_n:
D_n(x, y) = |f_n(x, y) - f_{n-1}(x, y)|;
(1.2) Set a threshold T and binarize the pixels one by one according to the difference image to obtain the binarized image R_n, in which a point with gray value 255 is a foreground (moving object) point and a point with gray value 0 is a background point; connectivity analysis of R_n then yields an image containing the complete moving object:
R_n(x, y) = 255 if D_n(x, y) > T, and R_n(x, y) = 0 otherwise;
(2) Only in the case of motion will the GPU be used for reasoning.
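A minimal OpenCV sketch of the frame-difference test in (1.1)-(1.2) follows; the threshold T and the minimum foreground ratio used to declare motion are assumptions.

```python
import cv2
import numpy as np

def has_motion(prev_frame, curr_frame, T=25, min_foreground_ratio=0.001):
    """Frame-difference motion test: D_n = |f_n - f_{n-1}|, binarised at threshold T."""
    f_prev = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    f_curr = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY)
    d_n = cv2.absdiff(f_curr, f_prev)                       # D_n(x, y)
    _, r_n = cv2.threshold(d_n, T, 255, cv2.THRESH_BINARY)  # R_n(x, y)
    # declare motion when enough foreground pixels survive (assumed criterion)
    return np.count_nonzero(r_n) > min_foreground_ratio * r_n.size
```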
Preferably, the GPU scheduling module (1.4) is implemented as follows:
(1) Store the current frame I in the global image linked list Is;
(2) Detect whether the length of the global image linked list Is exceeds a given threshold MinLen;
(3) When it exceeds MinLen, lock the global image linked list Is, take MinLen images from it, delete them from the list and unlock;
(4) Form the acquired images into a batch and load the batch onto the GPU for inference. The specific procedure is as follows:
if Is.Length >= MinLen          // enough frames queued for one batch
{
    Lock(Is);                   // protect the shared list
    Batch = Is.Get(MinLen);     // take MinLen frames
    Is.Remove(MinLen);          // remove them from the list
    Unlock(Is);
    Pack(Batch);                // pack the batch and hand it to the model
}.
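For reference, the pseudocode above can be rendered with a thread-safe list in Python; the sketch below mirrors Lock/Get/Remove/Pack and leaves the actual packing and GPU hand-off to the caller.

```python
import threading

class BatchCollector:
    """Python rendering of the Lock/Get/Remove/Pack pseudocode above."""

    def __init__(self, min_len=8):
        self.min_len = min_len     # MinLen
        self.images = []           # global image list Is
        self.lock = threading.Lock()

    def push(self, frame):
        with self.lock:
            self.images.append(frame)

    def try_get_batch(self):
        # returns a batch of MinLen frames, or None if not enough frames have arrived yet
        with self.lock:                           # Lock(Is)
            if len(self.images) < self.min_len:
                return None
            batch = self.images[:self.min_len]    # Is.Get(MinLen)
            del self.images[:self.min_len]        # Is.Remove(MinLen)
        return batch                              # Pack(Batch) is left to the caller
```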
preferably, the non-motor vehicle reverse running detection module (1.5) is realized by the following method:
(1) Receive a batch of images from the GPU scheduling module;
(2) Because the resolutions of different cameras may differ, adaptive picture scaling is adopted and imgsize is set to 1280 × 1280. In practice the aspect ratios of the pictures differ, so the black borders added at the two ends after scaling and padding differ in size; excessive padding introduces redundant information and slows down inference, so the original image is adaptively padded with as few black borders as possible;
(3) Inference is carried out through the YOLOv5 non-motor vehicle retrograde detection model to obtain predicted categories and prediction box information, and the obtained prediction boxes are screened by NMS (non-maximum suppression) according to the following principle:
(3.1) Obtain the prediction result for the non-motor vehicle retrograde picture; the n prediction boxes are σ_k = (x_k, y_k, w_k, h_k, c_k, p_k), k = 1, 2, …, n, where c is the predicted category (c = 0 bicycle, c = 1 electric vehicle, c = 2 motorcycle, c = 3 pedestrian, c = 4 other, including motor vehicles) and p is the probability of the predicted category, 0 < p < 1;
(3.2) Calculate the intersection-over-union IoU(σ_i, σ_j) of any two prediction boxes:
intersection of the two prediction boxes:
Inter(σ_i, σ_j) = max(min(x_i + w_i, x_j + w_j) - max(x_i, x_j) + 1, 0) × max(min(y_i + h_i, y_j + h_j) - max(y_i, y_j) + 1, 0);
intersection-over-union:
IoU(σ_i, σ_j) = Inter(σ_i, σ_j) / (w_i·h_i + w_j·h_j - Inter(σ_i, σ_j));
(3.3) Set a threshold τ; if IoU(σ_i, σ_j) ≥ τ and c_i = c_j, compare p_i and p_j and delete the prediction box with the lower probability;
(3.4) The prediction boxes with c = 0, 1, 2, 3 remaining in the non-motor vehicle retrograde picture are the prediction boxes of wrong-way non-motor vehicles; crop each of them from the picture according to (x, y, w, h) and save it.
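As an illustration of (3.1)-(3.4), a minimal Python sketch of class-wise non-maximum suppression over boxes in the (x, y, w, h, c, p) layout described above; the threshold value is an assumption.

```python
def iou_xywh(a, b):
    """IoU of two boxes given as (x, y, w, h, ...) with top-left corners, per (3.2)."""
    inter_w = max(min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]) + 1, 0)
    inter_h = max(min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]) + 1, 0)
    inter = inter_w * inter_h
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)

def class_wise_nms(boxes, tau=0.45):
    """boxes: list of (x, y, w, h, c, p); keeps the higher-probability box whenever
    two boxes of the same class c overlap with IoU >= tau, per (3.3)."""
    boxes = sorted(boxes, key=lambda s: s[5], reverse=True)  # sort by probability p
    kept = []
    for cand in boxes:
        if all(cand[4] != k[4] or iou_xywh(cand, k) < tau for k in kept):
            kept.append(cand)
    return kept
```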
Preferably, the license plate detection module (1.6) is implemented as follows:
(1) Receive the screened non-motor vehicle retrograde pictures obtained in step (3.4) of the non-motor vehicle retrograde detection module (1.5);
(2) Obtain the screened license plate pictures through the YOLOv5 license plate detection model according to the following principle:
(2.1) Obtain the prediction result for the wrong-way non-motor vehicle license plate picture; the n prediction boxes are σ_k = (x_k, y_k, w_k, h_k, c_k, p_k), k = 1, 2, …, n, where c is the predicted category (c = 0 is a license plate) and p is the probability of the predicted category, 0 < p < 1;
(2.2) Calculate the intersection-over-union IoU(σ_i, σ_j) of any two prediction boxes:
intersection of the two prediction boxes:
Inter(σ_i, σ_j) = max(min(x_i + w_i, x_j + w_j) - max(x_i, x_j) + 1, 0) × max(min(y_i + h_i, y_j + h_j) - max(y_i, y_j) + 1, 0);
intersection-over-union:
IoU(σ_i, σ_j) = Inter(σ_i, σ_j) / (w_i·h_i + w_j·h_j - Inter(σ_i, σ_j));
(2.3) Set a threshold τ; if IoU(σ_i, σ_j) ≥ τ and c_i = c_j, compare p_i and p_j and delete the prediction box with the lower probability;
(2.4) The prediction boxes with c = 0 remaining in the picture are the prediction boxes of the wrong-way non-motor vehicles' license plates; crop each of them from the picture according to (x, y, w, h) and save it to obtain the screened license plate pictures.
Preferably, the OCR license plate recognition module (1.7) is implemented as follows:
(1) Setting a self-defined dictionary for OCR license plate recognition, wherein the self-defined dictionary comprises letters, numbers and Chinese characters, and the dictionary contains all contents to be recognized;
(2) Gray the screened license plate picture using the currently common weighted-average method. Denote the gray value after graying by g, and let R, G and B be the red, green and blue components of the original true-color picture; then:
g = 0.114B + 0.587G + 0.299R;
(3) Enhance the contrast of the picture obtained in the previous step. License plate recognition runs around the clock, and without ideal supplementary lighting the day-night variation of natural illumination leaves the plate image seriously lacking in contrast and therefore blurred. A linear gray-scale stretching method is therefore adopted to highlight the targets or gray intervals of interest and suppress the uninteresting gray intervals. Let the gray range of the original image f(x, y) be a ≤ f(x, y) ≤ b; after the linear transformation the range of the image g(x, y) is 0 ≤ g(x, y) ≤ M_f, and the transformation between f(x, y) and g(x, y) is:
g(x, y) = (M_f / (b - a)) · (f(x, y) - a);
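A small NumPy sketch of steps (2)-(3) follows: weighted-average graying and linear gray-scale stretching. Taking M_f = 255 and using the observed minimum and maximum as a and b are assumptions.

```python
import numpy as np

def gray_and_stretch(bgr_image, m_f=255.0):
    """Steps (2)-(3): g = 0.114B + 0.587G + 0.299R, then linear stretch of [a, b] to [0, M_f]."""
    b, g, r = bgr_image[..., 0], bgr_image[..., 1], bgr_image[..., 2]
    gray = 0.114 * b + 0.587 * g + 0.299 * r
    a, bmax = float(gray.min()), float(gray.max())       # observed gray range [a, b] (assumption)
    if bmax <= a:                                        # flat image: nothing to stretch
        return gray.astype(np.uint8)
    stretched = (gray - a) * (m_f / (bmax - a))          # g(x, y) = M_f/(b-a) * (f(x, y) - a)
    return np.clip(stretched, 0, 255).astype(np.uint8)
```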
(4) Apply median filtering to the picture obtained in the previous step. Median filtering is a nonlinear filtering technique that needs no image statistics during computation, so it is convenient; under certain conditions it avoids the blurring of image detail caused by linear filters and is particularly effective for filtering impulse noise in scanned images. The principle is as follows: given a one-dimensional sequence f_1, f_2, …, f_n and a window of odd length m, median filtering successively extracts m numbers f_{i-v}, …, f_i, …, f_{i+v} from the input sequence, where f_i is the value at the centre of the filter window and v = (m - 1)/2; the m values are sorted by magnitude and the middle one is taken as the filter output:
Y_i = Med{f_{i-v}, …, f_i, …, f_{i+v}},  i ∈ Z,  v = (m - 1)/2;
median filtering of a digital image is median filtering of a two-dimensional sequence, so the filter window is also two-dimensional: a sliding window W scans the image, the pixels inside the window are sorted by gray level in ascending or descending order, and the middle gray value is taken as the gray value of the pixel at the centre of the window:
X(m, n) = Median{ f(m - k, n - l), (k, l) ∈ W };
the median filter window adopted by the method is 3 × 3;
(5) Perform edge detection on the picture obtained in the previous step. Edge detection finds the places where the gray level changes abruptly, which mark the end of one region and the beginning of another. The method adopts the LoG operator; the basic characteristics of the LoG edge detector are:
(5.1) the smoothing filter is a Gaussian filter;
(5.2) the enhancement step uses the second derivative (the two-dimensional Laplacian);
(5.3) the edge detection criterion is a zero crossing of the second derivative corresponding to a large peak of the first derivative;
(5.4) linear interpolation is used;
the image is first convolved with a Gaussian filter; this smoothing step reduces noise and filters out isolated noise points and small structures. Because smoothing spreads the edges, the edge detector keeps only points where the local gradient is maximal, which can be found from the zero crossings of the second derivative; the Laplacian is used as an approximation of the two-dimensional second derivative. Since the Laplacian is a non-directional operator, and to avoid detecting insignificant edges, only zero crossings whose first derivative exceeds a given threshold are selected as edges. The LoG operator performs edge detection on the image f(x, y), and the output h(x, y) is obtained by the convolution operation:
h(x, y) = ∇²[G(x, y) ∗ f(x, y)];
smoothing the image blurs edges and other sharp discontinuities, and the amount of blurring depends on the value of σ: the larger σ is, the better the noise suppression, but more edge information is lost and the performance of the edge detector suffers; if σ is small, smoothing may be insufficient and too much noise may remain. Balancing the two, the method takes σ = 2;
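A short OpenCV sketch of steps (4)-(5) follows: 3 × 3 median filtering, Gaussian smoothing with σ = 2, the Laplacian as second derivative, and a simple zero-crossing test gated by the gradient magnitude. The zero-crossing approximation and the gradient threshold are assumptions.

```python
import cv2
import numpy as np

def median_then_log_edges(gray, sigma=2.0, grad_thresh=20):
    """Steps (4)-(5): 3x3 median filter, then a Laplacian-of-Gaussian edge map."""
    smoothed = cv2.medianBlur(gray, 3)                       # 3x3 median window
    blurred = cv2.GaussianBlur(smoothed, (0, 0), sigma)      # Gaussian smoothing, sigma = 2
    lap = cv2.Laplacian(blurred, cv2.CV_64F)                 # second derivative
    # crude zero-crossing test: sign change between neighbouring pixels (assumed approximation)
    sign = np.sign(lap)
    zero_cross = (np.abs(np.diff(sign, axis=0, prepend=sign[:1])) > 0) | \
                 (np.abs(np.diff(sign, axis=1, prepend=sign[:, :1])) > 0)
    # keep only zero crossings whose first derivative (gradient magnitude) is large enough
    gx = cv2.Sobel(blurred, cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(blurred, cv2.CV_64F, 0, 1)
    strong = np.hypot(gx, gy) > grad_thresh
    return (zero_cross & strong).astype(np.uint8) * 255
```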
(6) Perform license plate tilt correction on the picture obtained in the previous step. Horizontal correction: use the Hough transform to detect the inclination angles of the two obvious straight lines formed by the upper and lower edges of the plate, and then correct the horizontal tilt; vertical correction: use the Hough transform to detect the left and right edges of the plate, obtain the inclination angle and correct it;
(7) Perform character segmentation and recognition on the picture obtained in the previous step using the LPRNet network model. LPRNet is essentially a lightweight variant of CRNN whose loss function is CTC loss; it infers quickly and accurately, supports deployment on various embedded platforms, and is robust, being only slightly affected by camera parameters, viewing angle and illumination;
the LPRNet input picture size is 94 × 24, a 1 × 13 convolution replaces the LSTM of the original CRNN at the end of the network, and training uses CTC loss. At test time either greedy search or beam search is used for decoding: greedy search decodes by taking the most probable prediction at each position, while beam search decodes by selecting the sequence with the highest overall probability.
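For illustration, a minimal sketch of the greedy decoding described above: take the arg-max class at each time step, merge repeated symbols and drop the CTC blank. The character dictionary and blank index below are assumptions, not the patent's actual custom dictionary.

```python
import numpy as np

# Assumed custom dictionary: a few province abbreviations, letters and digits, plus a CTC blank.
CHARS = list("京津冀鲁") + list("ABCDEFGHJKLMNPQRSTUVWXYZ") + list("0123456789")
BLANK = len(CHARS)  # index of the CTC blank symbol (assumption)

def greedy_ctc_decode(logits):
    """logits: (T, C) per-timestep class scores from the LPRNet head.
    Greedy search: arg-max per position, merge repeated symbols, remove blanks."""
    best = np.argmax(logits, axis=1)
    out, prev = [], None
    for idx in best:
        if idx != prev and idx != BLANK:
            out.append(CHARS[idx])
        prev = idx
    return "".join(out)
```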
Preferably, the risk saving and reporting module (1.8) is implemented as follows:
(1) Receive the screened non-motor vehicle retrograde pictures, the screened license plate pictures and the license plate information of the wrong-way non-motor vehicle;
(2) Store the screened non-motor vehicle retrograde pictures and the screened license plate pictures in MinIO storage;
(3) Send the license plate information of the wrong-way non-motor vehicle to Kafka through a producer.
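A sketch of steps (2)-(3) using the MinIO and kafka-python client libraries; the endpoints, credentials, bucket and topic names are placeholders, not values from the patent.

```python
import io
import json
from minio import Minio
from kafka import KafkaProducer

# Endpoints, credentials, bucket and topic names below are placeholders/assumptions.
minio_client = Minio("minio.example.local:9000", access_key="...", secret_key="...", secure=False)
producer = KafkaProducer(bootstrap_servers="kafka.example.local:9092",
                         value_serializer=lambda v: json.dumps(v, ensure_ascii=False).encode("utf-8"))

def report_violation(image_bytes, object_name, plate_text, camera_id):
    # (2) store the wrong-way picture / license plate crop in MinIO
    minio_client.put_object("retrograde-evidence", object_name,
                            io.BytesIO(image_bytes), length=len(image_bytes),
                            content_type="image/jpeg")
    # (3) send the violation record to Kafka through a producer
    producer.send("retrograde-violations",
                  {"camera_id": camera_id, "plate": plate_text, "image": object_name})
    producer.flush()
```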
By adopting the above non-motor vehicle retrograde detection and license plate recognition method, which combines target detection with OCR recognition, the invention supervises non-motor vehicles effectively by means of artificial intelligence, improving efficiency and reducing labor cost. Incremental learning is added to avoid costly training from scratch, and the batch mechanism avoids catastrophic forgetting during incremental learning. Two target detection models are used to detect the license plates of wrong-way non-motor vehicles: the first detects the wrong-way non-motor vehicle and the second detects the offending license plate, giving higher accuracy.
Drawings
FIG. 1 is a block diagram of the system of the present invention;
FIG. 2 is a system flow chart of the modules other than the training module of the present invention;
FIG. 3 is a flow chart of the training module of the present invention.
Detailed Description
The technical scheme of the invention is further explained by combining the drawings and the embodiment.
As shown in FIG. 1, a non-motor vehicle retrograde detection incremental learning and license plate recognition method comprises a training module, a camera scheduling module, a motion detection module, a GPU scheduling module, a non-motor vehicle retrograde detection module, a license plate detection module, an OCR license plate recognition module, a risk storage and reporting module and a log module. The training module is responsible for model training, and the resulting data models provide the basis for recognition in the non-motor vehicle retrograde detection module. The camera scheduling module connects the camera group to be read through a polling algorithm, obtains the video streams from the cameras according to the RTSP protocol, extracts frames at a fixed time interval, and sends the video frames of each camera to the motion detection module for detection. The motion detection module detects whether the camera picture contains motion: running uninterrupted 24-hour deep-learning video analysis on every camera would place an extremely heavy load on the GPU, so motion detection is introduced, image analysis is performed only when image motion exists, and expensive GPU resources are traded for cheap CPU cores. The GPU scheduling module receives the video frames screened by the motion detection module and places them in the inference queue of the data model obtained by the training module; it also allocates GPU resources to the models of the non-motor vehicle retrograde detection module, the license plate detection module and the OCR license plate recognition module. When the number of queued elements exceeds a set threshold they are packed into a batch and handed to the non-motor vehicle retrograde detection module for inference; when the number of elements is still insufficient after a certain time, the remaining elements are pushed to the non-motor vehicle retrograde detection module anyway, guaranteeing efficient use of the GPU. The non-motor vehicle retrograde detection module detects video frames based on the YOLOv5 non-motor vehicle retrograde detection model, draws a rectangular box around each wrong-way non-motor vehicle in the frame, crops the wrong-way non-motor vehicle picture from the original picture and sends it to the license plate detection model. The license plate detection module detects the wrong-way non-motor vehicle picture based on the YOLOv5 license plate detection model, finds the non-motor vehicle's license plate and passes the license plate picture to the OCR license plate recognition model, which recognizes the license plate information. The risk storage and reporting module reports the risk information and sends a warning signal to the front-end platform for front-end display. The log module records errors to provide a basis for later debugging of the system.
1.1 training module: the method is responsible for collecting training data, establishing learning rules, realizing incremental learning, avoiding catastrophic forgetting and finally obtaining a YOLOv5 non-motor vehicle reverse driving detection model and a YOLOv5 license plate detection model.
The implementation method of the training module is as follows: a YOLOv5x network model from the YOLOv5 target detection family is trained to obtain the YOLOv5 non-motor vehicle retrograde detection model, and a YOLOv5s network model is trained to obtain the YOLOv5 license plate detection model. The specific training method for the two models proceeds according to the following steps:
(1) On the basis of the original training set, collect pictures of the newly added categories. The pictures may be collected from existing video and image data of multiple cameras, picking out video and image data taken from different angles, in different scenes and under heavy traffic, or the cameras may be frame-extracted directly over the RTSP protocol and the acquired video and image data transmitted to the system. During video frame extraction the system skips frames in the obtained video so as to reflect actual road conditions as accurately as possible; frames are selected at equal time intervals and the resulting images are then filtered as necessary. Images on which the non-motor vehicles change little or no non-motor vehicle appears, and images on which the non-motor vehicles are blurred, contribute nothing to model training and are discarded accordingly. What is kept is a comprehensive set of images with obvious traffic flow, a variety of non-motor vehicles, and both close-range and long-range shots.
Annotate the pictures to generate labeling frames. Each labeling frame is labeled with the tuple (I, X, Y, w, h, c, b), where I is the picture ID, X is the X coordinate of the upper-left corner of the labeling frame, Y is the Y coordinate of the upper-left corner, w is the frame width, h is the frame height, c is the category label of the frame and b is the batch of the frame; when training the YOLOv5x network model, c = 0 is a bicycle, c = 1 an electric vehicle, c = 2 a motorcycle, c = 3 a pedestrian and c = 4 other (including motor vehicles), and when training the YOLOv5s network model, c = 0 is a license plate. Because the acquired data is limited, it is preprocessed: Mosaic data augmentation, translation, rotation and scaling augmentation, adaptive anchor box calculation and adaptive picture scaling are applied to improve the accuracy of the method. Annotation relies on manual work together with annotation software; once annotation is finished the data is divided into a training set, a validation set and a test set, and the model is then built for training. An illustrative conversion of this label tuple to the YOLOv5 training format is sketched below.
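For illustration only, a small sketch converting the (I, X, Y, w, h, c, b) tuple into the normalized label line used by YOLOv5 training; the in-code tuple layout is an assumption about how the annotation is stored, not part of the patent.

```python
def to_yolo_label(ann, img_w, img_h):
    """ann: (I, X, Y, w, h, c, b) with a top-left corner in pixels, as described above.
    Returns the 'c x_center y_center w h' line (normalised to [0, 1]) used by YOLOv5;
    the batch index b is kept separately for the batch-aware training rule."""
    _, x, y, w, h, c, b = ann
    xc = (x + w / 2) / img_w
    yc = (y + h / 2) / img_h
    return f"{c} {xc:.6f} {yc:.6f} {w / img_w:.6f} {h / img_h:.6f}", b
```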
(2) Regarding the determination of the batch of the labeling frame, the batch is set to 1 when the training set is initialized, and the batch of the subsequent newly added pictures is increased by 1;
(3) In incremental learning, fine-tuning is performed on the basis of the original YOLOv5x or YOLOv5s network model, the batch information of the training data added to the data set is recorded, the learning rule is determined according to batch, and catastrophic forgetting is prevented through the following loss function:
(batch-aware classification loss; the equation appears only as an image in the original publication)
where I^obj is an indicator function meaning that an anchor window participates in the calculation only when it is responsible for the category; b_i is the batch information recorded in step (1), c_i is the class label, and p_j(c_i) is the predicted probability of the category;
(4) For detection box regression, the following coordinate parameterization is used:
t_x = (x - x_a) / w_a,  t_y = (y - y_a) / h_a,  t_w = log(w / w_a),  t_h = log(h / h_a),
t*_x = (x* - x_a) / w_a,  t*_y = (y* - y_a) / h_a,  t*_w = log(w* / w_a),  t*_h = log(h* / h_a),
where x, x* and x_a denote the upper-left X coordinates of the labeling frame, the prediction frame and the anchor frame respectively; y, y* and y_a the upper-left Y coordinates; w, w* and w_a the frame widths; and h, h* and h_a the frame heights. (t_x, t_y, t_w, t_h) are the box regression parameters obtained from the labels during training; the CIoU loss function is adopted to obtain the box regression parameters (t*_x, t*_y, t*_w, t*_h) during prediction, and the optimal (t*_x, t*_y, t*_w, t*_h) are obtained by regression when the CIoU loss is minimized.
Finally, the trained YOLOv5 non-motor vehicle retrograde detection model and the YOLOv5 license plate detection model are obtained. The two models are supplied to the non-motor vehicle retrograde detection module and the license plate detection module respectively.
1.2 Camera scheduling module: loads the camera information according to the configuration file, divides the cameras into groups of thirty, selects the camera group to connect through a polling algorithm, acquires the video streams based on the RTSP real-time streaming protocol, extracts frames from the video streams and sends them to the motion detection module for detection.
The implementation method of the camera scheduling module comprises the following steps:
(1) Writing information of multiple paths of cameras into a configuration file, wherein the information comprises the id of the cameras, the names of the cameras and the frame taking time interval;
(2) Loading camera information according to the configuration file information, connecting a corresponding camera through a polling algorithm, reading a video stream based on an RTSP (real time streaming protocol), wherein the polling algorithm is realized as follows:
(2.1) Suppose the cameras are x_1, x_2, x_3, …, x_i and each camera has an initial score s. All cameras are traversed for one round, each being connected for 1 minute; if a wrong-way non-motor vehicle is detected, m points are added, otherwise n points are subtracted. After one round of traversal the cameras are sorted by score, the scores being s_1, s_2, s_3, …, s_i;
(2.2) Then connect the 30 cameras with the highest scores among s_1, s_2, s_3, …, s_i, create a thread for each, disconnect after 3 minutes, and update each camera's score according to the rule in (2.1);
(2.3) Repeat the operation in (2.2); if the number of consecutive connections of a camera exceeds the maximum threshold C, its score is reduced by z; if a camera has not been connected within a given time T, the next round forcibly connects it;
Through this reward-and-punishment polling algorithm, cameras in whose views many wrong-way vehicles appear are connected preferentially, and CPU and GPU resources are used more effectively.
(3) When a camera is connected through RTSP, timeouts can occur because of the network or cabling, so a disconnection-reconnection mechanism is added to the camera connection process: when a connection times out, the system returns to the camera-connection step, and once the connection succeeds it checks whether the camera opens normally. The concrete implementation is as follows:
(3.1) Set an error reconnection interval time_error_waiting = 10 and a maximum reconnection interval max_error_waiting = 600 for the connection process; each error multiplies the delay by 4, capped at the maximum:
time_error_waiting = min(max_error_waiting, time_error_waiting * 4);
(3.2) Set a failure reconnection interval timeout = 1 and a maximum failure reconnection interval timeout_max = 60 for the opening process; each failure multiplies the delay by 2, capped at the maximum:
timeout=min(timeout_max,timeout*2)。
1.3 motion detection module: detecting whether a moving object exists in the picture;
the motion detection module is implemented as follows:
(1) Whether the camera picture contains motion is detected by the frame-difference method, whose principle is as follows: when there is a moving object in the video, the gray levels of adjacent frames (or three adjacent frames) differ; taking the absolute value of the gray difference between two frames, a static object appears as 0 in the difference image, while a moving object, and in particular its outline, produces non-zero gray changes, so a pixel is judged to belong to a moving object when the absolute difference exceeds a certain threshold, which realizes the object detection function. Because the video sequence collected by a camera is continuous, consecutive frames change very little when there is no moving object in the scene, whereas an obvious change appears between frames when a moving object is present. The method is realized as follows:
(1.1) Let the n-th frame and the (n-1)-th frame of the video sequence be f_n and f_{n-1}, and denote the gray values of corresponding pixels in the two frames by f_n(x, y) and f_{n-1}(x, y). Subtract the gray values of corresponding pixels of the two frames and take the absolute value to obtain the difference image D_n:
D_n(x, y) = |f_n(x, y) - f_{n-1}(x, y)|;
(1.2) Set a threshold T and binarize the pixels one by one according to the difference image to obtain the binarized image R_n, in which a point with gray value 255 is a foreground (moving object) point and a point with gray value 0 is a background point; connectivity analysis of R_n then yields an image containing the complete moving object:
R_n(x, y) = 255 if D_n(x, y) > T, and R_n(x, y) = 0 otherwise;
(2) Only in the case of motion will the GPU be used for reasoning.
1.4GPU scheduling module: the method mainly allocates corresponding GPU resources for models of the non-motor vehicle retrograde motion detection module, the license plate detection module and the OCR module, guarantees that reasoning is carried out after a video frame reaches a batch, and improves efficiency and accuracy of the models.
The GPU scheduling module (1.4) is implemented as follows:
and (3) taking frames of the video stream passing through the motion detection module according to a certain time interval, attaching a unique time stamp to each video frame, packaging the video frame, the time stamp and a camera picture queue into elements, and sending the elements into an element queue to be inferred of each model. When the number of the elements meets a batch, the elements of the batch are handed to the non-motor vehicle retrograde motion detection module for reasoning, and when the number of the elements does not meet the batch number after a given time threshold value is exceeded, the system can forcibly push the rest elements to the non-motor vehicle retrograde motion detection module for reasoning. The purpose of giving the time stamp is to ensure that the front and back time sequence of each camera when sharing a model inference can not cause problems, and the specific steps are as follows:
(1) Storing the current frame I in a global image linked list Is;
(2) Detecting whether the length of the global image linked list Is exceeds a certain threshold MinLen;
(3) When it exceeds MinLen, lock the global image linked list Is, take MinLen images from it, delete them from the list and unlock;
(4) Form the acquired images into a batch and load the batch onto the GPU for inference.
The specific algorithm code is as follows:
if Is.Length >= MinLen          // enough frames queued for one batch
{
    Lock(Is);                   // protect the shared list
    Batch = Is.Get(MinLen);     // take MinLen frames
    Is.Remove(MinLen);          // remove them from the list
    Unlock(Is);
    Pack(Batch);                // pack the batch and hand it to the model
}.
1.5 Non-motor vehicle retrograde detection module: detects wrong-way non-motor vehicles in the camera video based on the YOLOv5 non-motor vehicle retrograde detection model. If wrong-way non-motor vehicles exist in a video frame, the model detects them, frames their outlines with rectangular boxes, saves the position and category information, crops the framed wrong-way non-motor vehicles from the original image and passes them to the license plate detection model, which then detects the license plates on the non-motor vehicles.
The realization method of the non-motor vehicle retrograde motion detection module comprises the following steps:
(1) Receive a batch of images from the GPU scheduling module;
(2) Because the resolutions of different cameras may differ, adaptive picture scaling is adopted and imgsize is set to 1280 × 1280. In practice the aspect ratios of the pictures differ, so the black borders added at the two ends after scaling and padding differ in size; excessive padding introduces redundant information and slows down inference, so the original image is adaptively padded with as few black borders as possible;
(3) Inference is carried out through the YOLOv5 non-motor vehicle retrograde detection model to obtain predicted categories and prediction box information, and the obtained prediction boxes are screened by NMS (non-maximum suppression) according to the following principle:
(3.1) Obtain the prediction result for the non-motor vehicle retrograde picture; the n prediction boxes are σ_k = (x_k, y_k, w_k, h_k, c_k, p_k), k = 1, 2, …, n, where c is the predicted category (c = 0 bicycle, c = 1 electric vehicle, c = 2 motorcycle, c = 3 pedestrian, c = 4 other, including motor vehicles) and p is the probability of the predicted category, 0 < p < 1;
(3.2) Calculate the intersection-over-union IoU(σ_i, σ_j) of any two prediction boxes:
intersection of the two prediction boxes:
Inter(σ_i, σ_j) = max(min(x_i + w_i, x_j + w_j) - max(x_i, x_j) + 1, 0) × max(min(y_i + h_i, y_j + h_j) - max(y_i, y_j) + 1, 0);
intersection-over-union:
IoU(σ_i, σ_j) = Inter(σ_i, σ_j) / (w_i·h_i + w_j·h_j - Inter(σ_i, σ_j));
(3.3) Set a threshold τ; if IoU(σ_i, σ_j) ≥ τ and c_i = c_j, compare p_i and p_j and delete the prediction box with the lower probability;
(3.4) The prediction boxes with c = 0, 1, 2, 3 remaining in the non-motor vehicle retrograde picture are the prediction boxes of wrong-way non-motor vehicles; crop each of them from the picture according to (x, y, w, h) and save it.
1.6 license plate detection module: the license plate detection method is mainly based on a YOLOv5 license plate detection model and used for detecting the license plate in the retrograde non-motor vehicle.
The license plate detection module is realized by the following method:
(1) Receive the screened non-motor vehicle retrograde pictures obtained in step (3.4) above;
(2) Obtain the screened license plate pictures through the YOLOv5 license plate detection model according to the following principle:
(2.1) Obtain the prediction result for the wrong-way non-motor vehicle license plate picture; the n prediction boxes are σ_k = (x_k, y_k, w_k, h_k, c_k, p_k), k = 1, 2, …, n, where c is the predicted category (c = 0 is a license plate) and p is the probability of the predicted category, 0 < p < 1;
(2.2) Calculate the intersection-over-union IoU(σ_i, σ_j) of any two prediction boxes:
intersection of the two prediction boxes:
Inter(σ_i, σ_j) = max(min(x_i + w_i, x_j + w_j) - max(x_i, x_j) + 1, 0) × max(min(y_i + h_i, y_j + h_j) - max(y_i, y_j) + 1, 0);
intersection-over-union:
IoU(σ_i, σ_j) = Inter(σ_i, σ_j) / (w_i·h_i + w_j·h_j - Inter(σ_i, σ_j));
(2.3) Set a threshold τ; if IoU(σ_i, σ_j) ≥ τ and c_i = c_j, compare p_i and p_j and delete the prediction box with the lower probability;
(2.4) The prediction boxes with c = 0 remaining in the picture are the prediction boxes of the wrong-way non-motor vehicles' license plates; crop each of them from the picture according to (x, y, w, h) and save it to obtain the screened license plate pictures.
1.7OCR license plate recognition module: the method mainly identifies information in the license plate based on an improved LPRnet model, and transmits the license plate picture of the non-motor vehicle and risk information to a risk storage and reporting module.
The OCR license plate recognition module is realized by the following steps:
(1) Setting a self-defined dictionary for OCR license plate recognition, wherein the self-defined dictionary comprises letters, numbers and Chinese characters, and the dictionary contains all contents to be recognized;
(2) Gray the screened license plate picture using the currently common weighted-average method. Denote the gray value after graying by g, and let R, G and B be the red, green and blue components of the original true-color picture; then:
g = 0.114B + 0.587G + 0.299R;
(3) Enhance the contrast of the picture obtained in the previous step. License plate recognition runs around the clock, and without ideal supplementary lighting the day-night variation of natural illumination leaves the plate image seriously lacking in contrast and therefore blurred. A linear gray-scale stretching method is therefore adopted to highlight the targets or gray intervals of interest and suppress the uninteresting gray intervals. Let the gray range of the original image f(x, y) be a ≤ f(x, y) ≤ b; after the linear transformation the range of the image g(x, y) is 0 ≤ g(x, y) ≤ M_f, and the transformation between f(x, y) and g(x, y) is:
g(x, y) = (M_f / (b - a)) · (f(x, y) - a);
(4) Apply median filtering to the picture obtained in the previous step. Median filtering is a nonlinear filtering technique that needs no image statistics during computation, so it is convenient; under certain conditions it avoids the blurring of image detail caused by linear filters and is particularly effective for filtering impulse noise in scanned images. The principle is as follows: given a one-dimensional sequence f_1, f_2, …, f_n and a window of odd length m, median filtering successively extracts m numbers f_{i-v}, …, f_i, …, f_{i+v} from the input sequence, where f_i is the value at the centre of the filter window and v = (m - 1)/2; the m values are sorted by magnitude and the middle one is taken as the filter output:
Y_i = Med{f_{i-v}, …, f_i, …, f_{i+v}},  i ∈ Z,  v = (m - 1)/2;
median filtering of a digital image is median filtering of a two-dimensional sequence, so the filter window is also two-dimensional: a sliding window W scans the image, the pixels inside the window are sorted by gray level in ascending or descending order, and the middle gray value is taken as the gray value of the pixel at the centre of the window:
X(m, n) = Median{ f(m - k, n - l), (k, l) ∈ W };
the median filter window adopted by the method is 3 × 3;
(5) Perform edge detection on the picture obtained in the previous step. Edge detection finds the places where the gray level changes abruptly, which mark the end of one region and the beginning of another. The method adopts the LoG operator; the basic characteristics of the LoG edge detector are:
(5.1) the smoothing filter is a Gaussian filter;
(5.2) the enhancement step uses the second derivative (the two-dimensional Laplacian);
(5.3) the edge detection criterion is a zero crossing of the second derivative corresponding to a large peak of the first derivative;
(5.4) linear interpolation is used;
the image is first convolved with a Gaussian filter; this smoothing step reduces noise and filters out isolated noise points and small structures. Because smoothing spreads the edges, the edge detector keeps only points where the local gradient is maximal, which can be found from the zero crossings of the second derivative; the Laplacian is used as an approximation of the two-dimensional second derivative. Since the Laplacian is a non-directional operator, and to avoid detecting insignificant edges, only zero crossings whose first derivative exceeds a given threshold are selected as edges. The LoG operator performs edge detection on the image f(x, y), and the output h(x, y) is obtained by the convolution operation:
h(x, y) = ∇²[G(x, y) ∗ f(x, y)];
smoothing the image blurs edges and other sharp discontinuities, and the amount of blurring depends on the value of σ: the larger σ is, the better the noise suppression, but more edge information is lost and the performance of the edge detector suffers; if σ is small, smoothing may be insufficient and too much noise may remain. Balancing the two, the method takes σ = 2;
(6) License plate inclination correction is performed on the picture obtained in the previous step. Horizontal correction: Hough transformation is used to detect the inclination angles of the two obvious straight lines at the upper and lower edges of the license plate, and horizontal inclination correction is then applied to the license plate. Vertical correction: Hough transformation is used to detect the left and right edges of the license plate, the inclination angle is obtained, and the plate is corrected;
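A possible sketch of the horizontal correction by Hough transformation is given below; OpenCV is assumed, and the Canny thresholds, Hough parameters and the ±30° angle filter are illustrative choices rather than values taken from the method:

import cv2
import numpy as np

def horizontal_deskew(plate_bgr: np.ndarray) -> np.ndarray:
    # Detect near-horizontal lines along the upper/lower plate edges with the
    # probabilistic Hough transform, estimate the dominant tilt angle, then rotate.
    gray = cv2.cvtColor(plate_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=30,
                            minLineLength=plate_bgr.shape[1] // 2, maxLineGap=10)
    if lines is None:
        return plate_bgr
    angles = []
    for x1, y1, x2, y2 in lines[:, 0]:
        angle = np.degrees(np.arctan2(y2 - y1, x2 - x1))
        if abs(angle) < 30:          # keep near-horizontal candidates only
            angles.append(angle)
    if not angles:
        return plate_bgr
    tilt = float(np.median(angles))  # estimated horizontal inclination angle
    h, w = plate_bgr.shape[:2]
    rot = cv2.getRotationMatrix2D((w / 2, h / 2), tilt, 1.0)
    return cv2.warpAffine(plate_bgr, rot, (w, h), borderMode=cv2.BORDER_REPLICATE)

Vertical correction would follow the same pattern with near-vertical lines and a shear or rotation derived from the detected left and right edges.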
(7) Character segmentation and recognition are performed on the picture obtained in the last step, using an LPRnet network model for recognition. The LPRnet network model is essentially a lightweight modification of CRNN, the loss function used is CTC loss, the reasoning speed is high, the accuracy is high, and running on various embedded platforms is supported; at the same time the method has good robustness and is only slightly affected by camera parameters, viewing angle and illumination;
the LPRnet network model input picture size is 94 x 24, the LPRnet network model finally uses a convolution of 1 x 13 to replace the Istm in the original crnn, and finally adopts ctc loss to carry out training, in the testing stage, a greedy search or a beam search decoding mode is adopted, the greedy search selects the prediction result of the maximum probability of each prediction position to carry out decoding, and the beam search selects the maximum probability of the whole prediction sequence to carry out decoding.
1.8 risk saving and reporting module: storing the illegal picture and reporting the information of the illegal vehicle. When the risk information and pictures are received, the risk information is transmitted to kafka through the producer, the risk pictures are stored in the minio server, and an alarm signal is sent to the front-end platform for front-end display.
The risk saving and reporting module (1.8) is realized by the following steps:
(1) Receiving the screened non-motor vehicle retrograde motion pictures, the screened license plate pictures and the screened license plate information of the retrograde motion non-motor vehicle;
(2) Storing the screened non-motor vehicle reverse-running picture and the screened license plate picture into minio storage;
(3) The wrong-way non-motor vehicle license plate information is transmitted to kafka through the producer.
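A minimal sketch of steps (1)-(3) is given below; it assumes the kafka-python and minio Python client libraries, and the broker address, bucket name and topic name are illustrative values that the method does not specify:

import json
from kafka import KafkaProducer   # assumed client library
from minio import Minio           # assumed client library

producer = KafkaProducer(bootstrap_servers="kafka:9092",
                         value_serializer=lambda v: json.dumps(v, ensure_ascii=False).encode("utf-8"))
minio_client = Minio("minio:9000", access_key="...", secret_key="...", secure=False)

def report_risk(vehicle_img_path: str, plate_img_path: str, plate_text: str) -> None:
    # (2) store the screened wrong-way picture and license plate picture in minio
    bucket = "retrograde-risks"
    if not minio_client.bucket_exists(bucket):
        minio_client.make_bucket(bucket)
    for path in (vehicle_img_path, plate_img_path):
        minio_client.fput_object(bucket, path.rsplit("/", 1)[-1], path)
    # (3) push the wrong-way non-motor vehicle license plate information to kafka
    producer.send("non-motor-retrograde-risk", {"plate": plate_text,
                                                "vehicle_img": vehicle_img_path,
                                                "plate_img": plate_img_path})
    producer.flush()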
1.9 Log Module: and storing error information and warning information in the running process of the system.
Therefore, the invention adopts the non-motor vehicle retrograde motion detection incremental learning and license plate recognition method with the above structure, combining target detection and OCR recognition to supervise non-motor vehicles effectively by means of artificial intelligence, thereby improving efficiency and reducing labor cost. Incremental learning is added to avoid high-cost learning from scratch, and a batch mechanism is adopted to avoid catastrophic forgetting during incremental learning. The invention uses two target detection models to detect the license plate of the retrograde non-motor vehicle: the first model, the YOLOv5 non-motor vehicle retrograde motion detection model, detects the retrograde non-motor vehicle, and the second model, the YOLOv5 license plate detection model, detects the offending license plate, so the accuracy is higher.
The above is a specific embodiment of the present invention, but the scope of the present invention should not be limited thereto. Any changes or substitutions that can be easily made by those skilled in the art within the technical scope of the present invention are included in the protection scope of the present invention, and therefore, the protection scope of the present invention is subject to the protection scope defined by the appended claims.

Claims (9)

1. A non-motor vehicle retrograde motion detection incremental learning and license plate recognition method is characterized in that: it comprises a training module, a camera scheduling module, a motion detection module, a GPU scheduling module, a non-motor vehicle retrograde driving detection module, a license plate detection module, an OCR license plate recognition module, a risk saving and reporting module, and a log module, wherein:
1.1 training module: the system is responsible for collecting training data, establishing learning rules, realizing incremental learning, avoiding catastrophic forgetting and finally obtaining a YOLOv5 non-motor vehicle retrograde motion detection model and a YOLOv5 license plate detection model;
1.2 camera scheduling module: based on real-time streaming protocol of RTSP, selecting a camera group to be connected through a polling algorithm, acquiring video stream and extracting frames from the video stream;
1.3 motion detection module: detecting whether a moving object exists in the picture;
1.4GPU scheduling module: corresponding GPU resources are mainly distributed for models of a non-motor vehicle retrograde motion detection module, a license plate detection module and an OCR module, inference is carried out after a video frame reaches a batch, and the efficiency and the accuracy of the models are improved;
1.5 the non-motor vehicle converse detection module: the method mainly comprises the steps of detecting a retrograde non-motor vehicle in a video of a camera based on a YOLOv5 retrograde detection model of the non-motor vehicle;
1.6 license plate detection module: detecting a license plate in a retrograde non-motor vehicle based on a YOLOv5 license plate detection model;
1.7OCR license plate recognition module: identifying information in the license plate mainly based on an improved LPRnet model;
1.8 risk saving and reporting module: storing the illegal picture and reporting the information of the illegal vehicle;
1.9 Log Module: and storing error information and warning information in the running process of the system.
2. The method for incremental learning of non-motor vehicle retrograde motion detection and license plate recognition according to claim 1, wherein: the implementation method of the training module is as follows: a YOLOv5x network model in the YOLOv5 target detection model is trained to obtain the YOLOv5 non-motor vehicle retrograde motion detection model, and a YOLOv5s network model in the YOLOv5 target detection model is trained to obtain the YOLOv5 license plate detection model; the specific training method of the YOLOv5 non-motor vehicle retrograde motion detection model and the YOLOv5 license plate detection model is carried out according to the following steps:
(1) Collecting pictures of newly added categories on the basis of the original training set, labeling the pictures to generate labeling frames, dividing them into a training set, a verification set and a test set, and then constructing a model for training, wherein each labeling frame adopts a label of the form (I, X, Y, w, h, c, b), wherein I is the ID of the picture, X is the X coordinate of the upper left corner of the labeling frame, Y is the Y coordinate of the upper left corner of the labeling frame, w is the width of the labeling frame, h is the height of the labeling frame, c is the category label of the labeling frame, and b is the batch of the labeling frame; when training the YOLOv5x network model, c=0 is a bicycle, c=1 is an electric vehicle, c=2 is a motorcycle, c=3 is a pedestrian, c=4 is other (including motor vehicles), and when training the YOLOv5s network model, c=0 is a license plate;
(2) Regarding the determination of the batch of the labeling frame, the batch is set to 1 when the training set is initialized, and the batch of the subsequently added pictures is increased by 1;
(3) In incremental learning, fine-tuning is performed on the basis of the original YOLOv5x network model or YOLOv5s network model, the batch information of the training data added to the data set is recorded, the learning rule is determined according to the batch, and catastrophic forgetting is prevented through the following loss function:
Figure FDA0003934408150000021
wherein I_obj is an indicator function, which indicates that a certain anchor window is responsible for the category and therefore participates in the calculation; b_i is the batch information recorded in step (1), c_i is the category label, and p_j(c_i) is the probability of the predicted category;
(4) For the detection box regression, the following coordinate parameterization is used for calculation:
t_x = (x − x_a)/w_a, t_y = (y − y_a)/h_a, t_w = log(w/w_a), t_h = log(h/h_a),
t*_x = (x* − x_a)/w_a, t*_y = (y* − y_a)/h_a, t*_w = log(w*/w_a), t*_h = log(h*/h_a),
wherein x, x* and x_a respectively represent the upper-left X coordinates of the labeling frame, the prediction frame and the anchor frame; y, y* and y_a respectively represent the upper-left Y coordinates of the labeling frame, the prediction frame and the anchor frame; w, w* and w_a respectively represent the widths of the labeling frame, the prediction frame and the anchor frame; h, h* and h_a respectively represent the heights of the labeling frame, the prediction frame and the anchor frame; t_x, t_y, t_w, t_h are the frame regression parameters obtained during training, and t*_x, t*_y, t*_w, t*_h are the frame regression parameters obtained in prediction; the CIOU loss function is adopted to regress t*_x, t*_y, t*_w, t*_h, and the best t*_x, t*_y, t*_w, t*_h are obtained when the CIOU loss function is minimized.
Finally, the well-trained YOLOv5 non-motor vehicle reverse driving detection model and the YOLOv5 license plate detection model are obtained.
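For illustration, the coordinate parameterization and the CIOU loss named in step (4) of claim 2 can be sketched in plain Python as follows; this is a stand-alone numerical sketch under the (x, y, w, h) top-left convention defined above, not the tensor-level YOLOv5 training code, and the small epsilon constants are assumptions added for numerical stability:

import math

def encode_box(box, anchor):
    # t parameters relating a (x, y, w, h) box to its anchor, as in step (4).
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return ((x - xa) / wa, (y - ya) / ha, math.log(w / wa), math.log(h / ha))

def ciou(box_a, box_b):
    # Complete-IoU between two (x, y, w, h) boxes; the CIOU loss is 1 - ciou(pred, label)
    # and is minimised during box regression.
    xa, ya, wa, ha = box_a
    xb, yb, wb, hb = box_b
    inter_w = max(min(xa + wa, xb + wb) - max(xa, xb), 0.0)
    inter_h = max(min(ya + ha, yb + hb) - max(ya, yb), 0.0)
    inter = inter_w * inter_h
    union = wa * ha + wb * hb - inter
    iou = inter / union if union > 0 else 0.0
    # squared centre distance over squared diagonal of the smallest enclosing box
    cxa, cya, cxb, cyb = xa + wa / 2, ya + ha / 2, xb + wb / 2, yb + hb / 2
    rho2 = (cxa - cxb) ** 2 + (cya - cyb) ** 2
    cw = max(xa + wa, xb + wb) - min(xa, xb)
    ch = max(ya + ha, yb + hb) - min(ya, yb)
    c2 = cw ** 2 + ch ** 2 + 1e-9
    # aspect-ratio consistency term
    v = (4 / math.pi ** 2) * (math.atan(wa / ha) - math.atan(wb / hb)) ** 2
    alpha = v / (1 - iou + v + 1e-9)
    return iou - rho2 / c2 - alpha * v

def ciou_loss(pred_box, label_box):
    return 1.0 - ciou(pred_box, label_box)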
3. The method for incremental learning of non-motor vehicle retrograde motion detection and license plate recognition according to claim 2, wherein: the implementation method of the camera scheduling module comprises the following steps:
(1) Writing information of multiple paths of cameras into a configuration file, wherein the information comprises the id of the cameras, the names of the cameras and the frame taking time interval;
(2) The method comprises the following steps of connecting corresponding cameras through a polling algorithm, reading video streams based on an RTSP (real time streaming protocol), and realizing the following polling algorithm:
(2.1) Assume the cameras are respectively x_1, x_2, x_3, …, x_i, and the initial score of each camera is s. All the cameras are traversed for one round, each camera being connected for 1 minute; if a non-motor vehicle travelling in the wrong direction is detected, m points are added to the score, and if not, n points are subtracted. After one round of traversal the cameras are sorted according to their scores s_1, s_2, s_3, …, s_i;
(2.2) After sorting, connect the 30 cameras with the highest scores among s_1, s_2, s_3, …, s_i, establish a thread for each camera, and disconnect the connection after 3 minutes; (2.3) the score of each camera is counted according to the rule in (2.1) and the operation in (2.2) is repeated; if the number of consecutive connections of a certain camera exceeds the maximum threshold C, its score is decreased by z; if a certain camera has not been connected within a given time T, the next round of connection will be forced to connect this camera;
(3) A disconnection reconnection mechanism is added in the camera connection process, and the method is specifically realized as follows:
(3.1) In the connection process, an error reconnection interval time_error_waiting = 10 and a longest reconnection interval max_error_waiting = 600 are set, and the interval is multiplied by 4 after each error;
(3.2) In the opening process, a failure reconnection interval timeout = 1 and a longest failure reconnection interval time_error_waiting = 60 are set, and the interval is multiplied by 2 after each failure.
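The scoring rule of (2.1) and the multiplicative back-off of (3.1)-(3.2) can be sketched as follows; the values of m and n, the open_stream callable and the parameter names are illustrative assumptions:

import time

def update_score(score: float, detected_wrong_way: bool, m: float = 5, n: float = 1) -> float:
    # Rule (2.1): +m points if a wrong-way non-motor vehicle was detected during
    # the 1-minute probe, otherwise -n points.
    return score + m if detected_wrong_way else score - n

def open_stream_with_retry(open_stream, initial_wait: float = 1, max_wait: float = 60):
    # Rule (3.2): start with a short wait, multiply it by 2 after each failed
    # attempt, and cap it at the longest reconnection interval.
    wait = initial_wait
    while True:
        try:
            return open_stream()   # e.g. a callable that opens the RTSP stream
        except Exception:
            time.sleep(wait)
            wait = min(wait * 2, max_wait)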
4. The method for incremental learning of retrograde detection and license plate recognition of non-motor vehicles according to claim 3, wherein: the motion detection module is implemented as follows:
(1) Whether a camera picture moves or not is detected through a frame difference method, because a video sequence acquired by a camera has the characteristic of continuity, if no moving object exists in a scene, the change of continuous frames is weak, and if the moving object exists, the continuous frames and the frames can obviously change, and the method is specifically realized as follows:
(1.1) Let the image of the n-th frame and the image of the (n-1)-th frame in the video sequence be f_n and f_(n-1) respectively, and let the gray values of the corresponding pixel points of the two frames be f_n(x, y) and f_(n-1)(x, y); the gray values of the corresponding pixel points of the two frames of images are subtracted and the absolute value is taken to obtain the differential image D_n, i.e. D_n(x, y) = |f_n(x, y) − f_(n-1)(x, y)|;
(1.2) A threshold value T is set, and the pixel points are binarized one by one according to the difference image formula to obtain the binarized image R_n, wherein a point with gray value 255 is a foreground (moving object) point and a point with gray value 0 is a background point; connectivity analysis is performed on the image R_n to obtain an image R_n containing the complete moving target;
(2) Only in the case of motion will the GPU be used for reasoning.
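A minimal sketch of this frame-difference test is given below, assuming OpenCV; the binarization threshold and the minimum blob area used for the connectivity check are illustrative values:

import cv2
import numpy as np

def has_motion(frame_prev: np.ndarray, frame_curr: np.ndarray,
               thresh: int = 25, min_area: int = 500) -> bool:
    # D_n = |f_n - f_(n-1)|, binarise with threshold T, then a connectivity
    # analysis; motion is reported only if a foreground blob is large enough.
    g_prev = cv2.cvtColor(frame_prev, cv2.COLOR_BGR2GRAY)
    g_curr = cv2.cvtColor(frame_curr, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(g_curr, g_prev)
    _, binary = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return any(cv2.contourArea(c) >= min_area for c in contours)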
5. The method for incremental learning of retrograde motion detection and license plate recognition of the non-motor vehicle according to claim 4, wherein: the GPU scheduling module is realized by the following steps:
(1) Storing the current frame I in a global image linked list Is;
(2) Detecting whether the length of the global image linked list Is exceeds a certain threshold MinLen;
(3) When the image exceeds a given threshold value MinLen, locking the global image linked list Is to obtain images with a given threshold value number, deleting the images from the global image linked list Is, and unlocking;
(4) The acquired images are formed into a batch, and the batch is loaded into the GPU for inference.
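The global image list with locking described in steps (1)-(3) can be sketched as a small thread-safe collector; MinLen is shown as an illustrative constructor argument, and the batch returned by pop_batch would then be stacked and sent to the GPU:

import threading
from collections import deque

class BatchCollector:
    # Camera threads push frames into the global image list; once its length
    # exceeds MinLen, a batch of MinLen frames is removed under a lock.
    def __init__(self, min_len: int = 8):
        self.min_len = min_len
        self._frames = deque()
        self._lock = threading.Lock()

    def push(self, frame) -> None:
        with self._lock:                 # step (1): store the current frame
            self._frames.append(frame)

    def pop_batch(self):
        with self._lock:                 # steps (2)-(3): lock, take MinLen frames, unlock
            if len(self._frames) <= self.min_len:
                return None
            return [self._frames.popleft() for _ in range(self.min_len)]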
6. The method for incremental learning of retrograde detection and license plate recognition of non-motor vehicles according to claim 5, wherein: the realization method of the non-motor vehicle retrograde motion detection module comprises the following steps:
(1) Receiving an image of a batch of a GPU scheduling module;
(2) Adopting self-adaptive picture scaling to set imgsize to 1280 x 1280;
(3) Reasoning is carried out through a YOLOv5 non-motor vehicle retrograde motion detection model to obtain prediction categories and prediction frame information, and the obtained prediction frames are screened through NMS non-maximum value inhibition according to the following principle:
(3.1) Obtain the result of the predicted non-motor vehicle retrograde motion picture, in which the n prediction frames are respectively σ_k(x_k, y_k, w_k, h_k, c_k, p_k), k = 1, 2, …, n, where c is the predicted category, c=0 is a bicycle, c=1 is an electric vehicle, c=2 is a motorcycle, c=3 is a pedestrian, c=4 is other (including motor vehicles), and p is the probability of the predicted category, 0 < p < 1;
(3.2) Calculate the intersection-over-union IoU(σ_i, σ_j) of any two prediction frames:
The intersection of the two prediction boxes:
Inter(σ_i, σ_j) = max(min(x_i + w_i, x_j + w_j) − max(x_i, x_j) + 1, 0) × max(min(y_i + h_i, y_j + h_j) − max(y_i, y_j) + 1, 0);
Intersection-over-union:
IoU(σ_i, σ_j) = Inter(σ_i, σ_j) / (w_i·h_i + w_j·h_j − Inter(σ_i, σ_j));
(3.3) Set a threshold τ; if IoU(σ_i, σ_j) ≥ τ and c_i = c_j, compare p_i and p_j and delete the prediction frame with the lower probability;
(3.4) drawing a prediction frame of c =0,1,2,3, namely the prediction frame of the retrograde non-motor vehicle in the retrograde image of the non-motor vehicle, and dividing and storing the prediction frame from the retrograde image of the non-motor vehicle according to (x, y, w, h) to obtain the screened retrograde image of the non-motor vehicle.
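The screening of steps (3.1)-(3.4) can be sketched in plain Python as follows; the threshold value τ = 0.45 is an illustrative choice, and prediction frames are kept in the (x, y, w, h, c, p) form used above:

def iou(a, b):
    # a, b: prediction frames in (x, y, w, h, ...) form, (x, y) being the top-left corner
    inter_w = max(min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]) + 1, 0)
    inter_h = max(min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]) + 1, 0)
    inter = inter_w * inter_h
    return inter / (a[2] * a[3] + b[2] * b[3] - inter + 1e-9)

def nms(preds, tau=0.45):
    # preds: list of (x, y, w, h, c, p) prediction frames; same-class frames whose
    # IoU is at least tau are suppressed, keeping the one with the higher probability p.
    preds = sorted(preds, key=lambda s: s[5], reverse=True)
    kept = []
    for cand in preds:
        if all(k[4] != cand[4] or iou(cand, k) < tau for k in kept):
            kept.append(cand)
    return kept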
7. The method for incremental learning of retrograde motion detection and license plate recognition of the non-motor vehicle according to claim 6, wherein: the license plate detection module is realized by the following method:
(1) Receiving a screened non-motor vehicle retrograde view picture as in (3.4) in claim 6;
(2) Obtaining a screened license plate picture through a YOLOv5 license plate detection model according to the following principle:
(2.1) Obtain the result of the predicted license plate picture of the retrograde non-motor vehicle, in which the n prediction frames are respectively σ_k(x_k, y_k, w_k, h_k, c_k, p_k), k = 1, 2, …, n, where c is the predicted category, c=0 is the license plate, and p is the probability of the predicted category, 0 < p < 1;
(2.2) Calculate the intersection-over-union IoU(σ_i, σ_j) of any two prediction frames:
Intersection of two prediction boxes:
Inter(σ_i, σ_j) = max(min(x_i + w_i, x_j + w_j) − max(x_i, x_j) + 1, 0) × max(min(y_i + h_i, y_j + h_j) − max(y_i, y_j) + 1, 0);
Intersection-over-union:
IoU(σ_i, σ_j) = Inter(σ_i, σ_j) / (w_i·h_i + w_j·h_j − Inter(σ_i, σ_j));
(2.3) Set a threshold τ; if IoU(σ_i, σ_j) ≥ τ and c_i = c_j, compare p_i and p_j and delete the prediction frame with the lower probability;
and (2.4) drawing a prediction frame with c =0, namely the prediction frame of the license plate of the retrograde non-motor vehicle in the license plate picture of the retrograde non-motor vehicle, and segmenting and storing the prediction frame from the license plate picture of the retrograde non-motor vehicle according to (x, y, w, h) to obtain the screened license plate picture.
8. The method for incremental learning of retrograde detection and license plate recognition of non-motor vehicles according to claim 7, wherein: the OCR license plate recognition module is realized by the following steps:
(1) Setting a self-defined dictionary for OCR license plate recognition, wherein the self-defined dictionary comprises letters, numbers and Chinese characters, and the dictionary contains all contents to be recognized;
(2) Graying processing is carried out on the screened license plate picture; graying adopts the currently universal weighted-average method, where the gray value after conversion is denoted by g, and R, G and B denote the red, green and blue components of the original true-color picture:
g=0.110B+0.588G+0.322R;
(3) Image contrast enhancement is performed on the image obtained in the last step, adopting an image gray-scale linear stretching method to highlight the targets or gray-scale intervals of interest and suppress the gray-scale intervals that are not of interest; let the gray-scale range of the original image f(x, y) be a ≤ f(x, y) ≤ b, and after the linear transformation the range of the image g(x, y) is 0 ≤ g(x, y) ≤ M_f; the transformation relation between f(x, y) and g(x, y) is:
g(x, y) = (M_f / (b − a)) · [f(x, y) − a];
(4) Carrying out image median filtering on the picture obtained in the last step, wherein the window of the median filtering adopted in the method is 3*3;
(5) Performing image edge detection on the picture obtained in the previous step, performing edge detection on an image f (x, y), and obtaining an output h (x, y) through convolution operation:
h(x, y) = ∇²[G(x, y, σ) * f(x, y)] = [∇²G(x, y, σ)] * f(x, y), where G(x, y, σ) is the Gaussian kernel, ∇² is the Laplace operator and * denotes convolution,
the value of sigma in the method is 2;
(6) License plate inclination correction is performed on the picture obtained in the previous step. Horizontal correction: Hough transformation is used to detect the inclination angles of the two obvious straight lines at the upper and lower edges of the license plate, and horizontal inclination correction is then applied to the license plate. Vertical correction: Hough transformation is used to detect the left and right edges of the license plate, the inclination angle is obtained, and the plate is corrected;
(7) Character segmentation and recognition are performed on the picture obtained in the last step, using an LPRnet network model for recognition; the LPRnet network model input picture size is 94 x 24, a 1 x 13 convolution finally replaces the LSTM in the original CRNN, and training is carried out with CTC loss; in the testing stage, a greedy search or beam search decoding mode is adopted, where greedy search selects the prediction result with the maximum probability at each prediction position for decoding, and beam search selects the sequence with the maximum probability over the whole prediction for decoding.
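Steps (2) and (3) of this claim, weighted graying followed by linear gray-scale stretching, can be sketched as follows; the weights follow the formula in step (2), M_f is taken as 255, and reading a and b as the minimum and maximum gray values of the plate image is an assumption:

import numpy as np

def gray_and_stretch(bgr: np.ndarray) -> np.ndarray:
    # Weighted grayscale conversion with the coefficients given in step (2).
    b = bgr[..., 0].astype(np.float64)
    g = bgr[..., 1].astype(np.float64)
    r = bgr[..., 2].astype(np.float64)
    gray = 0.110 * b + 0.588 * g + 0.322 * r
    # Linear gray-scale stretch of step (3): map [a, b] onto [0, M_f] with M_f = 255.
    a_lo, b_hi = gray.min(), gray.max()
    if b_hi == a_lo:
        return np.clip(gray, 0, 255).astype(np.uint8)
    stretched = 255.0 * (gray - a_lo) / (b_hi - a_lo)
    return np.clip(stretched, 0, 255).astype(np.uint8)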
9. The method for incremental learning of retrograde detection and license plate recognition of a non-motor vehicle according to claim 8, wherein: the realization method of the risk saving and reporting module comprises the following steps:
(1) Receiving the screened non-motor vehicle retrograde motion pictures, the screened license plate pictures and the screened license plate information of the retrograde motion non-motor vehicle;
(2) Storing the screened non-motor vehicle reverse-running picture and the screened license plate picture into minio storage;
(3) The wrong-way non-motor vehicle license plate information is transmitted to kafka through the producer.
CN202211399484.2A 2022-11-09 2022-11-09 Non-motor vehicle retrograde detection incremental learning and license plate recognition method Pending CN115601741A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211399484.2A CN115601741A (en) 2022-11-09 2022-11-09 Non-motor vehicle retrograde detection incremental learning and license plate recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211399484.2A CN115601741A (en) 2022-11-09 2022-11-09 Non-motor vehicle retrograde detection incremental learning and license plate recognition method

Publications (1)

Publication Number Publication Date
CN115601741A true CN115601741A (en) 2023-01-13

Family

ID=84852724

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211399484.2A Pending CN115601741A (en) 2022-11-09 2022-11-09 Non-motor vehicle retrograde detection incremental learning and license plate recognition method

Country Status (1)

Country Link
CN (1) CN115601741A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116503344A (en) * 2023-04-21 2023-07-28 南京邮电大学 Crack instance segmentation method based on deep learning
CN116824859A (en) * 2023-07-21 2023-09-29 佛山市新基建科技有限公司 Intelligent traffic big data analysis system based on Internet of things
CN116824859B (en) * 2023-07-21 2024-04-05 佛山市新基建科技有限公司 Intelligent traffic big data analysis system based on Internet of things

Similar Documents

Publication Publication Date Title
CN111368687B (en) Sidewalk vehicle illegal parking detection method based on target detection and semantic segmentation
CN115601741A (en) Non-motor vehicle retrograde detection incremental learning and license plate recognition method
Xu et al. Towards end-to-end license plate detection and recognition: A large dataset and baseline
DE112013001858B4 (en) Multiple-hint object recognition and analysis
CN110378332A (en) A kind of container terminal case number (CN) and Train number recognition method and system
CN110866593B (en) Highway severe weather identification method based on artificial intelligence
CN109190444B (en) Method for realizing video-based toll lane vehicle feature recognition system
CN106886778B (en) License plate character segmentation and recognition method in monitoring scene
US20230153698A1 (en) Methods and systems for accurately recognizing vehicle license plates
CN104978567A (en) Vehicle detection method based on scenario classification
CN103824081A (en) Method for detecting rapid robustness traffic signs on outdoor bad illumination condition
Yang et al. A vehicle license plate recognition system based on fixed color collocation
CN110096945B (en) Indoor monitoring video key frame real-time extraction method based on machine learning
CN115841649A (en) Multi-scale people counting method for urban complex scene
CN112084928A (en) Road traffic accident detection method based on visual attention mechanism and ConvLSTM network
Damavandi et al. Speed limit traffic sign detection and recognition
Zhang et al. Joint license plate super-resolution and recognition in one multi-task GAN framework
CN112861700A (en) DeepLabv3+ based lane line network identification model establishment and vehicle speed detection method
Tariq et al. Real time vehicle detection and colour recognition using tuned features of Faster-RCNN
CN110503049B (en) Satellite video vehicle number estimation method based on generation countermeasure network
CN116052090A (en) Image quality evaluation method, model training method, device, equipment and medium
Lin et al. Airborne moving vehicle detection for video surveillance of urban traffic
CN108921147B (en) Black smoke vehicle identification method based on dynamic texture and transform domain space-time characteristics
Zhang et al. A robust chinese license plate detection and recognition systemin natural scenes
CN116188779A (en) Lane-Detection-based lane line Detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination