CN117037085A - Vehicle identification and quantity statistics monitoring method based on improved YOLOv5


Info

Publication number
CN117037085A
CN117037085A (application CN202311022687.4A)
Authority
CN
China
Prior art keywords
motor vehicle
model
yolov5
frame
vehicle identification
Prior art date
Legal status
Pending
Application number
CN202311022687.4A
Other languages
Chinese (zh)
Inventor
陈大龙
魏东迎
刘振洋
Current Assignee
Nanjing Howso Technology Co ltd
Original Assignee
Nanjing Howso Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Howso Technology Co ltd filed Critical Nanjing Howso Technology Co ltd
Priority to CN202311022687.4A priority Critical patent/CN117037085A/en
Publication of CN117037085A publication Critical patent/CN117037085A/en
Pending legal-status Critical Current

Classifications

    • G08G1/0175 — Detecting movement of traffic to be counted or controlled; identifying vehicles by photographing vehicles, e.g. when violating traffic rules
    • G08G1/065 — Traffic control systems for road vehicles by counting the vehicles in a section of the road or in a parking area
    • G06N3/045 — Combinations of networks
    • G06N3/0464 — Convolutional networks [CNN, ConvNet]
    • G06N3/08 — Learning methods
    • G06T7/246 — Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06V10/774 — Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/82 — Image or video recognition or understanding using neural networks
    • G06V20/41 — Higher-level, semantic clustering, classification or understanding of video scenes
    • G06V20/54 — Surveillance or monitoring of activities of traffic, e.g. cars on the road, trains or boats
    • G06T2207/20081 — Training; Learning
    • G06T2207/20084 — Artificial neural networks [ANN]
    • Y02T10/40 — Engine management systems


Abstract

The invention discloses a monitoring method for vehicle identification and quantity statistics based on improved YOLOv5, which comprises the following steps: S1: generating a motor vehicle data set, dividing it and preprocessing it; S2: training with the preprocessed data set to obtain a YOLOv5 motor vehicle identification statistical model; S3: performing INT8 quantization and calibration on the YOLOv5 motor vehicle identification statistical model to obtain a quantized engine model; S4: reading data from the image acquisition equipment in the monitoring area to acquire each frame of image data; S5: inputting each frame of image data into the engine model for detection and result analysis; S6: tracking each motor vehicle ID and updating its track state; S7: judging whether each motor vehicle ID passes through the target area and counting. The method reduces the loss caused by INT8 quantization of the model and improves the model's inference speed.

Description

Vehicle identification and quantity statistics monitoring method based on improved YOLOv5
Technical Field
The invention relates to the technical field of visual positioning, in particular to a monitoring method for vehicle identification and quantity statistics based on improved YOLOv5.
Background
With the rapid development of the economy and society, people's demand for transportation, especially long-distance travel, keeps growing. Statistics show that domestic vehicle ownership reached 261.5 million in 2019, an increase of 21.22 million over the previous year. The conflict between growing traffic demand and existing road conditions is increasingly prominent. Reasonable traffic management can effectively reduce the occurrence of traffic congestion; to improve the traffic management level, accurate analysis of the driving behavior of vehicles on the road is required.
Deep-learning object detection algorithms are currently the common means of identifying motor vehicles and counting their number. This approach depends entirely on the accuracy of the detection algorithm, and some identification-and-counting scenarios remain unsolved: such algorithms count a motor vehicle every time it appears in the area, which is obviously unreasonable and causes repeated counting in many scenarios.
Conventional deep-learning object detection algorithms therefore have limitations in motor vehicle identification and quantity statistics tasks. They typically rely on detecting and locating the bounding boxes of motor vehicles in the image and using those boxes for counting. In some scenarios, however, this leads to repeated or missed counts, especially when vehicles are dense, overlapping or fast-moving.
Meanwhile, directly converting an ONNX model into an INT8-precision model with TensorRT causes relatively large loss: model accuracy drops considerably, false detections and missed detections easily occur during motor vehicle identification and counting, and the approach has serious shortcomings in production scenarios. Converting a model from floating-point to INT8 precision is a common optimization that can improve inference performance and reduce storage requirements in some cases. However, converting a model to INT8 with TensorRT or other tools may introduce information loss, because quantization maps the floating-point parameters to integer representations of lower bit precision. The reduced precision may prevent the model from accurately representing certain features and details, causing false and missed detections. Particularly for tasks requiring high-precision identification and quantity statistics, plain INT8 quantization may not be satisfactory.
Chinese patent literature discloses a method for detecting and counting vehicles in expressway surveillance video based on YOLOv3. That method likewise uses a deep-learning object detection algorithm: as can be seen from its flow, YOLOv3 performs detection, target tracking is carried out on the detection results, and a vehicle is counted when it is judged to enter the target area. Tracking is done mainly with a Kalman filter, which fuses the value predicted by a mathematical model with the measured observation to find the optimal estimate and locate the target most accurately in the next frame, thereby reducing the error in counting the number of motor vehicles.
For the above reasons, that method also depends on the accuracy of the object detection algorithm, and some identification-and-counting scenarios remain unresolved: a vehicle is classified and counted whenever it appears in the area, which is obviously unreasonable and leads to repeated detection and counting within a short time in many scenarios.
For the above reasons, that method may also reduce the accuracy of the model, so that motor vehicles cannot be identified accurately. It is therefore necessary to propose a monitoring method for vehicle identification and quantity statistics based on improved YOLOv5 that reduces the loss caused by INT8 quantization of the model and increases the model's inference speed.
Disclosure of Invention
The invention aims to solve the technical problem of providing a monitoring method for vehicle identification and quantity statistics based on improved YOLOv5 that reduces the loss caused by INT8 quantization of the model and improves the model's inference speed.
In order to solve the above technical problem, the invention adopts the following technical scheme. The monitoring method based on improved YOLOv5 vehicle identification and quantity statistics specifically comprises the following steps:
s1: generating and dividing a motor vehicle data set, and preprocessing to obtain a preprocessed data set;
s2: training by adopting the preprocessed data set to obtain a YOLOv5 motor vehicle identification statistical model;
s3: performing INT8 quantization and calibration on the YOLOv5 motor vehicle identification statistical model to obtain a quantized engine model;
s4: reading data of image acquisition equipment in a monitoring area, and acquiring image data of each frame;
s5: inputting each frame of image data into an engine model for detection and result analysis;
s6: tracking the target of the motor vehicle ID and updating the track state;
s7: it is determined whether the motor vehicle ID passes through the target area and statistics are made.
By adopting this technical scheme, motor vehicles are identified and counted by adding a criss-cross attention mechanism and a target tracking algorithm to YOLOv5. The improved YOLOv5 network reduces the model's memory footprint and improves computing performance, further raising the model's detection speed and identification accuracy. A deep-learning object tracking algorithm is combined with the object detection algorithm to check whether a motor vehicle has passed through the target area before it is counted; the combination of the three makes identification and counting more accurate and real-time, and covers richer scenes. INT8 quantization of the model reduces the loss caused by model conversion, while the combined use of deep-learning object detection and target tracking draws on the strengths of both and avoids the repeated or missed counting that occurs when a conventional detection-only algorithm counts motor vehicles; that is, the combination of several techniques solves the problem of motor vehicle identification and quantity statistics in complex scenes.
Preferably, the specific steps of the step S1 are as follows:
s11: collecting videos of motor vehicles, and of different motor vehicle types, shot so as to simulate the deployment scene, and performing frame extraction on the captured videos to generate a motor vehicle data set;
s12: dividing the motor vehicle dataset into a training set and a validation set;
s13: and respectively carrying out data enhancement on the training set and the verification set by adopting a data enhancement mode to obtain an enhanced training set and an enhanced verification set.
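The division in step S12 can be sketched as follows; a minimal helper, assuming the extracted frames are image paths (the 80/20 ratio and all names are illustrative, not from the patent):

```python
import random

def split_dataset(frame_paths, train_ratio=0.8, seed=42):
    """Shuffle the extracted frame paths reproducibly and split them
    into a training set and a validation set."""
    paths = list(frame_paths)
    random.Random(seed).shuffle(paths)
    cut = int(len(paths) * train_ratio)
    return paths[:cut], paths[cut:]

frames = [f"frame_{i:04d}.jpg" for i in range(100)]
train_set, val_set = split_dataset(frames)
```

The fixed seed keeps the split reproducible across runs, which matters when the validation set is later reused to build the INT8 calibration set.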
Preferably, the specific steps of the step S2 are as follows:
s21: constructing a YOLOv5 algorithm model and adding a criss-cross (Criss-Cross) attention mechanism to the YOLOv5 network structure, where the Criss-Cross attention aggregation is:

H′_u = Σ_{i=1..T} A_{i,u} · Φ_{i,u} + H_u, with T = H + W − 1;

where H′_u is the output vector at the u-th position; T is the length of the criss-cross sequence, H the feature-map height and W its width; A_{i,u} is the attention weight that weights information from different positions and represents the importance of the i-th criss-cross position to position u; Φ_{i,u} is the feature vector obtained through an affine transformation; and H_u is the input vector at the u-th position;
s22: inputting data into the improved YOLOv5 algorithm model and training to obtain algorithm model weight, thereby obtaining the YOLOv5 motor vehicle identification statistical model.
By adopting this technical scheme, adding the Criss-Cross attention mechanism and tracking motor vehicles in real time with a target tracking algorithm improves identification accuracy and reduces the model's memory footprint.
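The aggregation formula above can be illustrated per position on a toy feature map. A minimal NumPy sketch, assuming dot-product affinities and using the raw features in place of the learned affine projection Φ (so this shows only the criss-cross aggregation pattern, not the full CCNet module):

```python
import numpy as np

def criss_cross_positions(u_row, u_col, height, width):
    """All positions sharing u's row or column: T = H + W - 1 of them."""
    column = [(r, u_col) for r in range(height)]
    row = [(u_row, c) for c in range(width) if c != u_col]
    return column + row

def criss_cross_attend(feat, u_row, u_col):
    """H'_u = sum_i A_{i,u} * Phi_{i,u} + H_u, with softmax-normalised
    dot-product affinities A and, in this sketch, Phi taken as the raw
    features at the criss-cross positions (no learned projection)."""
    positions = criss_cross_positions(u_row, u_col, feat.shape[0], feat.shape[1])
    q = feat[u_row, u_col]                               # query vector at u
    keys = np.stack([feat[r, c] for r, c in positions])  # criss-cross features
    logits = keys @ q                                    # affinity of u with each position
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                             # attention weights A_{i,u}
    return (weights[:, None] * keys).sum(axis=0) + feat[u_row, u_col]
```

Because each position attends only to its own row and column (T = H + W − 1 positions instead of H × W), the memory cost drops sharply, which is the saving the scheme attributes to the mechanism.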
Preferably, the specific steps of the step S3 are as follows:
s31: taking a number of samples out of the validation set generated in step S1 to produce a calibration data set;
s32: writing a calibration data reader and creating an IInt8 entropy (relative-entropy) calibrator;
s33: configuring the parameters required to build the INT8 quantization model, performing INT8 quantization, continuously adjusting the threshold, and computing the relative entropy to obtain the optimal solution;
s34: performing INT8 quantization of the YOLOv5 motor vehicle identification statistical model according to the computed relative entropy, while reading the calibration data set from step S31, collecting a histogram of each layer's activation values, computing the threshold that minimizes the KL divergence with the KL-divergence calibration method, and calibrating the model to obtain the INT8-quantized engine model. Directly converting an ONNX model to an INT8-precision model with TensorRT currently causes large model loss; quantizing the model with the KL-divergence calibration method in this scheme remedies that conversion loss and avoids the information loss that would prevent motor vehicles from being accurately identified. To increase identification speed while keeping the loss of accuracy minimal, the KL-divergence calibration method is used for INT8 quantization and calibration of the trained YOLOv5 model; the quantized model's identification speed can be raised to three times that of the original YOLOv5 model, greatly improving detection real-time performance.
Preferably, the formula of the KL divergence calibration method in step S34 is:
KL(P||Q)=ΣP(x)*log(P(x)/Q(x));
where P is the actual probability distribution, Q is the probability distribution output by the model, KL(P||Q) is the KL divergence measuring the difference between the two distributions P and Q, P(x) and Q(x) are the probabilities that P and Q assign to event x, and Σ denotes summation. The calibration algorithm reduces model loss and improves detection real-time performance; a complex-scene result is obtained by combining several models in joint detection, each detection step reflecting the algorithm's accuracy. This solves the accuracy drop caused by conventional model conversion and improves both the accuracy and the speed of motor vehicle identification.
Preferably, the specific steps of step S5 are: inputting each frame of image data acquired in step S4 into the engine model obtained in step S3 and detecting each frame; if a motor vehicle is detected in the current frame, storing the detection result; the four corner coordinates of each detected vehicle's rectangular frame are stored in an array and the frame's centre point is computed as: c_x = (x_left + x_right) / 2, c_y = (y_left + y_right) / 2, where (x_left, y_left) and (x_right, y_right) are the coordinates of the upper-left and lower-right corners of the rectangular frame and (c_x, c_y) is its centre point.
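The centre-point computation can be sketched as a trivial helper (the name is illustrative):

```python
def box_center(x_left, y_left, x_right, y_right):
    """Centre of a detection rectangle given its upper-left (x_left, y_left)
    and lower-right (x_right, y_right) corners."""
    return (x_left + x_right) / 2, (y_left + y_right) / 2
```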
Preferably, the specific steps of the step S6 are:
s61: detecting the motor vehicles appearing in each frame of image data with a pre-trained multi-target tracking model (DeepSORT), extracting the features of each motor vehicle, and assigning each an ID;
s62: using the detection results of the engine model as the target-frame input of the DeepSORT multi-target tracking model, and taking the resulting track segments as the current frame's tracks;
s63: matching the target frames of the current frame with the tracks by intersection-over-union (IoU), predicting the target-frame state of the next frame from the track state with a Kalman filter, and updating all track states with the Kalman filter's observed and estimated values, thus completing motor vehicle ID tracking.
The DeepSORT multi-target tracking algorithm used here is an improvement on the SORT algorithm, adding cascade matching, judgment of track states and other refinements. During matching, three situations in which a prediction frame, a track and its state can represent a target are considered: targets that continue to appear in the video, newly appearing targets, and old targets that have disappeared. For a continuously appearing target, Kalman filter prediction is performed from the current frame's result, and matching continues in the next frame between the detection result and the prediction. A newly appearing target is handled like the first frame: it is converted directly into track information, retained temporarily, and matched in subsequent frames. For an old target that has disappeared, its track information is still kept temporarily and the track is deleted only after it has been missing a certain number of times.

SORT builds on the Faster R-CNN object detector and uses the Kalman filter and the Hungarian algorithm, greatly increasing the speed of multi-target tracking while reaching state-of-the-art accuracy; it is widely used in practice. Its core consists of two algorithms: Kalman filtering and the Hungarian algorithm. The Kalman filter proceeds in two stages, prediction and update, and models the motion state of a target as an 8-dimensional normally distributed vector. Prediction: as the target moves, the position, velocity and other parameters of the current frame's target frame are predicted from the previous frame's target frame and velocity. Update: the predicted value and the observed value, two normally distributed states, are linearly weighted to obtain the state predicted by the current system. Hungarian algorithm: it solves a bipartite-graph assignment problem; in the main MOT step an IoU cost matrix is computed as the similarity matrix between the previous and current frames, and the Hungarian algorithm solves this matrix to find the true matching between the two frames.
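The predict/update cycle described above can be illustrated in one dimension. This is a hedged sketch with a constant-position motion model and illustrative noise parameters; DeepSORT's actual filter runs an 8-dimensional analogue over box centre, aspect ratio, height and their velocities:

```python
def kalman_1d(x, p, z, q=1e-3, r=1e-1):
    """One predict/update cycle of a 1-D Kalman filter.
    x: state estimate, p: estimate variance, z: new observation,
    q: process noise, r: observation noise."""
    # predict (constant model): state unchanged, uncertainty grows
    p = p + q
    # update: weight prediction and observation linearly via the Kalman gain
    k = p / (p + r)
    x = x + k * (z - x)
    p = (1 - k) * p
    return x, p
```

Repeated cycles with a stable observation drive the estimate toward the observed value while shrinking its variance, which is exactly the "linear weighting of predicted and observed values" the update step describes.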
DeepSORT's main characteristics are: appearance information is added on top of the SORT algorithm, with appearance features (the Deep Association Metric in the title) extracted by a ReID-domain model, reducing the number of ID switches; and the matching mechanism is changed from matching on the IoU cost matrix alone to cascade matching plus IoU matching.
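The IoU cost computation and the track-detection association can be sketched as follows. This is a simplification: a greedy assignment stands in for the Hungarian algorithm proper, and DeepSORT additionally uses appearance features and cascade matching; all names are illustrative:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def match_tracks(tracks, detections, iou_threshold=0.3):
    """Greedy IoU association of existing tracks with new detections:
    highest-overlap pairs are matched first, each side used at most once."""
    scored = sorted(((iou(t, d), ti, di)
                     for ti, t in enumerate(tracks)
                     for di, d in enumerate(detections)), reverse=True)
    pairs, matched_t, matched_d = [], set(), set()
    for score, ti, di in scored:
        if score < iou_threshold or ti in matched_t or di in matched_d:
            continue
        pairs.append((ti, di))
        matched_t.add(ti)
        matched_d.add(di)
    return pairs
```

Unmatched detections would become new tracks and unmatched tracks would age toward deletion, mirroring the three target situations described above.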
Preferably, the specific steps of the step S7 are as follows:
s71: judging, from the centre-point coordinates of each motor vehicle detected in step S5, whether the vehicle has entered the specified area, drawing the specified area with OpenCV and obtaining the target area from the polygon's coordinate points as parameters;
s72: judging whether the motor vehicle passes through the target area for the first time within a given period; if so, marking the vehicle as counted and incrementing the vehicle count by one, then continuing to track it with the multi-target tracking model (DeepSORT) so that it is not counted again when it passes through the target area once more; if not, not counting the vehicle;
s73: saving the information on the number of motor vehicles passing through the target area to a database, and ending. OpenCV (Open Source Computer Vision Library) is a cross-platform computer vision library.
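Steps S71-S72 amount to a point-in-polygon test plus first-crossing bookkeeping. A minimal pure-Python sketch (cv2.pointPolygonTest provides an equivalent test when OpenCV is available; all names here are illustrative):

```python
def point_in_polygon(pt, polygon):
    """Ray-casting test: is the vehicle centre point inside the target area?"""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y) and x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
            inside = not inside
    return inside

def count_crossing(id_centres, polygon, counted):
    """Count each vehicle ID whose centre enters the area for the first
    time; 'counted' records IDs that must never be counted again."""
    total = 0
    for vid, centre in id_centres:
        if vid not in counted and point_in_polygon(centre, polygon):
            counted.add(vid)
            total += 1
    return total
```

The persistent `counted` set is what prevents the repeated counting that the scheme attributes to detection-only approaches: a tracked ID contributes to the total at most once.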
Preferably, in step S4 the video of the image acquisition equipment is read as a video file or as an RTSP pull stream. After INT8 quantization of the model, the camera's video is read over RTSP (real-time streaming), the motor vehicles in the video are detected continuously, each vehicle's ID is determined by the target tracking algorithm, and during continuous detection it is judged whether each motor vehicle passes through the target area, realizing the count of motor vehicles; the vehicles passing through the target area and the count information are stored in a database, and the algorithm ends.
Preferably, in step S1 the Mosaic mode is used for data enhancement. Enhancing the data set improves the generalization ability of the model.
Compared with the prior art, the invention has the following beneficial effects:
(1) The KL-divergence calibration method greatly reduces the accuracy loss brought by INT8 quantization of the model; solving the accuracy drop after INT8 quantization lets the model identify three times faster than the original without losing accuracy;
(2) Training is performed with the improved network: a criss-cross (Criss-Cross) attention mechanism is added to the original YOLOv5s network structure, saving GPU memory and achieving higher computing performance;
(3) A method of reasonably combining multiple models as required to solve complex problems is provided, while effectively avoiding repeated counting of a motor vehicle within a time period; that is, the improved YOLOv5 + DeepSORT tracking algorithm completes the algorithm development for complex scenes through a multi-model hybrid approach.
Drawings
FIG. 1 is a flow chart of a method for monitoring vehicle identification and quantity statistics based on improved YOLOv5 of the present invention;
FIG. 2 is a flow chart of INT8 quantization and calibration in step S3 of the improved YOLOv5 based vehicle identification and quantity statistics monitoring method of the present invention;
FIG. 3 is a flowchart of steps S3-S7 in the improved YOLOv5 based vehicle identification and quantity statistics monitoring method of the present invention;
FIG. 4 is a block diagram of the Criss-Cross attention mechanism added in step S2 in the improved YOLOv5 based vehicle identification and quantity statistics monitoring method of the present invention;
fig. 5 is a diagram showing the structure of the improved YOLOv5 network in step S2 in the method for monitoring vehicle identification and quantity statistics based on improved YOLOv5 according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments.
Examples: as shown in fig. 1, the method for monitoring vehicle identification and quantity statistics based on improved YOLOv5 specifically includes the following steps:
s1: generating and dividing a motor vehicle data set, and preprocessing to obtain a preprocessed data set;
the specific steps of the step S1 are as follows:
s11: collecting a large number of motor vehicle videos, and videos of different motor vehicle types shot so as to simulate the deployment scene, and performing frame extraction on the captured videos to generate a motor vehicle data set; the data set contains the motor vehicle types to be detected, such as cars, muck trucks and vans;
s12: dividing the motor vehicle dataset into a training set and a validation set;
s13: performing data enhancement on the training set and the validation set respectively to obtain an enhanced training set and an enhanced validation set; in step S1 the Mosaic mode is used for data enhancement, together with operations such as rotation, cropping, and increasing or decreasing image brightness;
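The brightness and flip operations mentioned can be sketched on a nested-list grayscale image; this is illustrative only (a production pipeline would use OpenCV or an augmentation library, and Mosaic itself additionally stitches four images together):

```python
def adjust_brightness(image, delta):
    """Shift every pixel by delta, clamped to the valid [0, 255] range."""
    return [[max(0, min(255, px + delta)) for px in row] for row in image]

def horizontal_flip(image):
    """Mirror every row of the image left-to-right."""
    return [row[::-1] for row in image]
```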
s2: training by adopting the preprocessed data set to obtain a YOLOv5 motor vehicle identification statistical model;
the specific steps of the step S2 are as follows:
s21: constructing a YOLOv5 algorithm model and adding a criss-cross (Criss-Cross) attention mechanism to the YOLOv5 network structure, where the Criss-Cross attention aggregation is:

H′_u = Σ_{i=1..T} A_{i,u} · Φ_{i,u} + H_u, with T = H + W − 1;

where H′_u is the output vector at the u-th position; T is the length of the criss-cross sequence, H the feature-map height and W its width; A_{i,u} is the attention weight that weights information from different positions and represents the importance of the i-th criss-cross position to position u; Φ_{i,u} is the feature vector obtained through an affine transformation; and H_u is the input vector at the u-th position;
s22: inputting data into the improved YOLOv5 algorithm model and training to obtain algorithm model weight, thereby obtaining the YOLOv5 motor vehicle identification statistical model.
S3: carrying out INT8 quantization and calibration on the YOLOv5 motor vehicle identification statistical model to obtain a quantized engine model;
the specific steps of step S3 are as follows:
S31: taking a number of samples from the verification set generated in step S1 to build a calibration data set; in this embodiment, about 500 samples are taken from the verification set of the data set used for training, so that the calibration set has good sample representativeness;
S32: writing a calibration data loader to construct the IInt8 relative-entropy calibrator used during model quantization;
S33: configuring the parameters required for building the INT8 quantization model, carrying out INT8 quantization, continuously adjusting the threshold value, and calculating the relative entropy to obtain an optimal solution;
S34: carrying out INT8 quantization on the YOLOv5 motor vehicle identification statistical model according to the calculated relative entropy, simultaneously reading the calibration data set of step S31, running inference and collecting the histogram of each layer's activation values under the FP32-precision network, calculating the minimum-KL-divergence threshold by the KL divergence calibration method, and carrying out model calibration to obtain the INT8-quantized engine model. At present, converting an ONNX model into an INT8-precision model with TensorRT suffers from large model loss; by quantizing the model with the KL divergence calibration method, the technical scheme alleviates this loss during conversion, and avoids the situation in which information lost during TensorRT INT8 model conversion prevents motor vehicles from being accurately identified;
the formula of the KL divergence calibration method in step S34 is as follows:
KL(P||Q)=ΣP(x)*log(P(x)/Q(x));
wherein P represents the actual probability distribution, Q represents the probability distribution output by the model, KL (P||Q) represents the KL divergence, which is used to measure the difference between the two probability distributions P and Q, P (x) represents the probability of the probability distribution P over event x, Q (x) represents the probability of the probability distribution Q over event x, and Σ represents the summation.
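The relative-entropy calculation and threshold search of steps S33–S34 can be sketched as follows. `best_threshold` is a hypothetical, heavily simplified stand-in for what TensorRT's entropy calibrator does internally over activation histograms, shown only to make the KL formula concrete:

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) = sum P(x) * log(P(x) / Q(x)) over events with P(x) > 0."""
    return sum(px * math.log(px / max(qx, eps)) for px, qx in zip(p, q) if px > 0)

def best_threshold(hist, candidates):
    """Pick the saturation threshold (number of histogram bins kept) whose
    clipped and renormalized distribution Q is closest, in KL divergence,
    to the reference activation histogram P."""
    total = sum(hist)
    p = [h / total for h in hist]
    best, best_kl = None, float("inf")
    for t in candidates:
        clipped = hist[:t]               # keep the first t bins (slice copies)
        clipped[-1] += sum(hist[t:])     # fold the clipped tail into the last bin
        s = sum(clipped)
        q = [c / s for c in clipped] + [0.0] * (len(hist) - t)
        kl = kl_divergence(p, q)
        if kl < best_kl:
            best, best_kl = t, kl
    return best, best_kl
```

Keeping every bin reproduces P exactly (KL = 0); a lower threshold trades clipping error against quantization resolution, which is the adjustment described in step S33.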
S4: reading data of image acquisition equipment in a monitoring area, and acquiring image data of each frame;
in step S4, the video of the image acquisition equipment is read as a video file or via an RTSP (Real Time Streaming Protocol) pull stream;
S5: inputting each frame of image data into the engine model for detection and result analysis;
the specific steps of step S5 are as follows: inputting each frame of image data acquired in step S4 into the engine model obtained in step S3, and detecting each frame of image data; if a motor vehicle is detected in the current frame of image data, storing the detection result; the coordinates of the four corners of the rectangular frame of the detected motor vehicle are stored in an array, and the coordinates of the center point of the rectangular frame are calculated as c_x = (x_left + x_right)/2 and c_y = (y_left + y_right)/2, wherein (x_left, y_left) and (x_right, y_right) represent the coordinates of the upper-left and lower-right corners of the rectangular frame, and (c_x, c_y) represents the center point of the rectangular frame;
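The center-point formula above as a one-line helper (the function name is hypothetical):

```python
def rect_center(x_left, y_left, x_right, y_right):
    """Center point of a detection rectangle given its upper-left and
    lower-right corners: c = ((x_left + x_right)/2, (y_left + y_right)/2)."""
    return (x_left + x_right) / 2, (y_left + y_right) / 2
```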
s6: tracking the target of the motor vehicle ID and updating the track state;
the specific steps of step S6 are as follows:
S61: detecting the motor vehicles appearing in each frame of image data with a pre-trained multi-target tracking model (DeepSORT model), extracting the features of each motor vehicle, and assigning each motor vehicle an ID;
S62: using the detection result of the engine model as the target-frame input of the multi-target tracking model (DeepSORT model), and taking the obtained track as the current-frame track;
S63: matching the target frame of the current frame of image data with the track by intersection over union (IOU), predicting the target-frame state of the next frame of image data from the track state by Kalman filtering, and updating all track states with the Kalman-filter observation and estimation values, thereby completing motor vehicle ID tracking.
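The IOU association of step S63 can be sketched as follows. A greedy matcher is used here for brevity, whereas DeepSORT performs Hungarian matching on the cost matrix, so this is an illustration of the association idea rather than the tracker's actual logic:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_by_iou(tracks, detections, thresh=0.3):
    """Greedily associate track boxes with detection boxes; returns a list
    of (track_index, detection_index) pairs whose IOU exceeds thresh."""
    pairs, used = [], set()
    for ti, t in enumerate(tracks):
        best, best_iou = None, thresh
        for di, d in enumerate(detections):
            if di in used:
                continue
            v = iou(t, d)
            if v > best_iou:
                best, best_iou = di, v
        if best is not None:
            pairs.append((ti, best))
            used.add(best)
    return pairs
```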
The Deep SORT multi-target tracking algorithm used in the multi-target tracking model (DeepSORT model) adopted by the invention is an improvement on the SORT algorithm, adding cascade matching and judgment of the track state on the basis of SORT. In the matching process, three situations that the prediction frame, the track and their states can represent are considered: targets that continue to appear in the video, newly appearing targets, and old targets that have disappeared. For a continuously appearing target, Kalman-filter prediction is carried out according to the result of the current frame, and matching continues in the next frame between the detection result and the prediction result. A newly appearing target is handled like the first frame: it is directly converted into track information, which is temporarily retained and matched in subsequent frames. For an old target that has disappeared, its track information is still kept temporarily; only after it has been missing a certain number of times is the track deleted. SORT builds on a Faster R-CNN-based target detector and uses the Kalman filtering algorithm and the Hungarian algorithm, greatly improving the speed of multi-target tracking while reaching SOTA accuracy. The algorithm is widely used in practical applications, and its core consists of two algorithms: Kalman filtering and the Hungarian algorithm. The Kalman filtering algorithm is divided into two processes, prediction and updating, and defines the motion state of the object as an 8-dimensional normally distributed vector. Prediction: as the target moves, the position, speed and other parameters of the current frame's target frame are predicted from the target frame, speed and other parameters of the previous frame.
Updating: the predicted value and the observed value, two normally distributed states, are linearly weighted to obtain the state predicted by the current system. Hungarian algorithm: it solves a bipartite-graph assignment problem; in the main MOT step, an IOU cost matrix is calculated as the similarity matrix between the previous and current frames, and the Hungarian algorithm solves this similarity matrix to obtain the true matching between the two frames.
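The predict/update linear weighting described above, reduced to a one-dimensional Kalman filter. The real tracker maintains an 8-dimensional constant-velocity state; this sketch only shows the weighting mechanics, with a static motion model assumed for brevity:

```python
def kalman_predict(x, p, q=1e-2):
    """Prediction step: propagate the state estimate and grow its variance
    by the process noise q (static motion model assumed here)."""
    return x, p + q

def kalman_update(x, p, z, r=1e-1):
    """Update step: linearly weight the predicted value x and the
    observation z by the Kalman gain k, shrinking the variance."""
    k = p / (p + r)             # Kalman gain: how much to trust the observation
    x_new = x + k * (z - x)     # weighted mix of prediction and observation
    p_new = (1 - k) * p
    return x_new, p_new
```

With a large observation noise r the filter leans on its prediction; with a small r it follows the detections, which is exactly the trade-off the tracker relies on when detections are noisy.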
DeepSORT is mainly characterized in that appearance information is added on the basis of the SORT algorithm: appearance features (the Deep Association Metric in the title) are extracted with a ReID-domain model, reducing the number of ID switches, and the matching mechanism changes the original matching based on the IOU cost matrix into cascade matching plus IOU matching.
S7: judging whether the motor vehicle ID passes through the target area and counting;
the specific steps of step S7 are as follows:
S71: judging whether the motor vehicle enters a specified area according to the center-point coordinates of the motor vehicle detected in step S5; the specified area is drawn with opencv, and the target area is obtained by taking the coordinate points of a polygon as parameters; opencv, in full Open Source Computer Vision Library, is a cross-platform computer vision library;
S72: judging whether the motor vehicle passes through the target area for the first time within a certain time period; if so, marking the motor vehicle as counted and adding one to the counted number of motor vehicles, then continuing to track the motor vehicle with the multi-target tracking model (DeepSORT model) to prevent the vehicle from being counted again when it passes through the target area another time; if not, not counting the vehicle;
S73: saving the number of motor vehicles passing through the target area to a database, and ending.
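The region test and first-pass counting of steps S71–S72 can be sketched without OpenCV, using a ray-casting point-in-polygon test in place of cv2.pointPolygonTest (function names are hypothetical):

```python
def point_in_polygon(pt, poly):
    """Ray-casting test: does point (x, y) lie inside the polygon given as
    a list of (x, y) vertices? Stands in for cv2.pointPolygonTest."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Count edge crossings of a horizontal ray cast to the right of pt.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def count_vehicle(vehicle_id, center, region, counted_ids, count):
    """Count a tracked vehicle the first time its center enters the region;
    the counted-ID set prevents double counting on later frames (step S72)."""
    if vehicle_id not in counted_ids and point_in_polygon(center, region):
        counted_ids.add(vehicle_id)
        count += 1
    return count
```

Persisting `count` to the database per step S73 would then be a single insert keyed on the monitoring region and time window.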
The foregoing description of the preferred embodiments is not intended to limit the invention; any modification, equivalent replacement, improvement or the like made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims (10)

1. The monitoring method for vehicle identification and quantity statistics based on improved YOLOv5 is characterized by comprising the following steps of:
S1: generating and dividing a motor vehicle data set, and preprocessing to obtain a preprocessed data set;
S2: training with the preprocessed data set to obtain a YOLOv5 motor vehicle identification statistical model;
S3: carrying out INT8 quantization and calibration on the YOLOv5 motor vehicle identification statistical model to obtain a quantized engine model;
S4: reading data of image acquisition equipment in a monitoring area, and acquiring each frame of image data;
S5: inputting each frame of image data into the engine model for detection and result analysis;
S6: tracking the target of the motor vehicle ID and updating the track state;
S7: judging whether the motor vehicle ID passes through the target area and making statistics.
2. The method for monitoring the vehicle identification and quantity statistics based on the improved YOLOv5 according to claim 1, wherein the specific steps of step S1 are as follows:
S11: collecting videos of motor vehicles and videos of different types of motor vehicles shot in a simulated deployment scene, and performing frame extraction on the shot videos to generate a motor vehicle data set;
S12: dividing the motor vehicle data set into a training set and a verification set;
S13: performing data enhancement on the training set and the verification set respectively to obtain an enhanced training set and an enhanced verification set.
3. The method for monitoring the vehicle identification and quantity statistics based on the improved YOLOv5 according to claim 2, wherein the specific steps of step S2 are as follows:
S21: building a YOLOv5 algorithm model, and adding a crisscross (Criss-Cross) attention mechanism to the YOLOv5 network structure, wherein the formula of the crisscross attention mechanism is as follows:
H′_u = Σ_{i=1}^{|T|} A_{i,u}·Φ_{i,u} + H_u, with |T| = H + W − 1;
wherein H′_u represents the output vector at the u-th position; T represents the set of positions on the criss-cross path of position u, whose length is H + W − 1, H representing the height and W the width of the feature map; A_{i,u} is the attention weight used to weight information from different positions, representing the importance of the i-th position to the u-th position; Φ_{i,u} represents the feature vector obtained through affine transformation; H_u represents the input vector at the u-th position;
S22: inputting data into the improved YOLOv5 algorithm model and training it to obtain the algorithm model weights, thereby obtaining the YOLOv5 motor vehicle identification statistical model.
4. The method for monitoring the identification and quantity statistics of vehicles based on improved YOLOv5 according to claim 3, wherein the specific steps of step S3 are as follows:
S31: taking a number of samples from the verification set generated in step S1 to build a calibration data set;
S32: writing a calibration data loader to generate an IInt8 relative-entropy calibrator;
S33: configuring the parameters required for building the INT8 quantization model, carrying out INT8 quantization, continuously adjusting the threshold value, and calculating the relative entropy to obtain an optimal solution;
S34: carrying out INT8 quantization on the YOLOv5 motor vehicle identification statistical model according to the calculated relative entropy, simultaneously reading the calibration data set of step S31, collecting the histogram of each layer's activation values, calculating the minimum-KL-divergence threshold by the KL divergence calibration method, and carrying out model calibration to obtain the INT8-quantized engine model.
5. The method for monitoring the identification and quantity statistics of vehicles based on the improved YOLOv5 according to claim 4, wherein the formula of the KL divergence calibration method in step S34 is:
KL(P||Q)=ΣP(x)*log(P(x)/Q(x));
wherein P represents the actual probability distribution, Q represents the probability distribution output by the model, KL (P||Q) represents the KL divergence, which is used to measure the difference between the two probability distributions P and Q, P (x) represents the probability of the probability distribution P over event x, Q (x) represents the probability of the probability distribution Q over event x, and Σ represents the summation.
6. The method for monitoring the vehicle identification and quantity statistics based on the improved YOLOv5 according to claim 4, wherein the specific steps of step S5 are as follows: inputting each frame of image data acquired in step S4 into the engine model obtained in step S3, and detecting each frame of image data; if a motor vehicle is detected in the current frame of image data, storing the detection result; the coordinates of the four corners of the rectangular frame of the detected motor vehicle are stored in an array, and the coordinates of the center point of the rectangular frame are calculated as c_x = (x_left + x_right)/2 and c_y = (y_left + y_right)/2, wherein (x_left, y_left) and (x_right, y_right) represent the coordinates of the upper-left and lower-right corners of the rectangular frame, and (c_x, c_y) represents the center point of the rectangular frame.
7. The method for monitoring the vehicle identification and quantity statistics based on the improved YOLOv5 of claim 6, wherein the specific steps of step S6 are as follows:
S61: detecting the motor vehicles appearing in each frame of image data with a pre-trained multi-target tracking model, extracting the features of each motor vehicle, and assigning each motor vehicle an ID;
S62: using the detection result of the engine model as the target-frame input of the multi-target tracking model, and taking the obtained track as the current-frame track;
S63: matching the target frame of the current frame of image data with the track by intersection over union (IOU), predicting the target-frame state of the next frame of image data from the track state by Kalman filtering, and updating all track states with the Kalman-filter observation and estimation values, thereby completing motor vehicle ID tracking.
8. The method for monitoring the vehicle identification and quantity statistics based on the improved YOLOv5 of claim 7, wherein the specific steps of step S7 are as follows:
S71: judging whether the motor vehicle enters a specified area according to the center-point coordinates of the motor vehicle detected in step S5; the specified area is drawn with opencv, and the target area is obtained by taking the coordinate points of a polygon as parameters;
S72: judging whether the motor vehicle passes through the target area for the first time within a certain time period; if so, marking the motor vehicle as counted and adding one to the counted number of motor vehicles, then continuing to track the motor vehicle with the multi-target tracking model to prevent the vehicle from being counted again when it passes through the target area another time; if not, not counting the vehicle;
S73: saving the number of motor vehicles passing through the target area to a database, and ending.
9. The method for monitoring the vehicle identification and quantity statistics based on the improved YOLOv5 according to claim 8, wherein the video of the image capturing device is read in step S4 in the form of video or RTSP pull stream.
10. The method for monitoring the identification and quantity statistics of vehicles based on improved YOLOv5 according to claim 5, wherein step S1 uses the Mosaic mode for data enhancement.
CN202311022687.4A 2023-08-15 2023-08-15 Vehicle identification and quantity statistics monitoring method based on improved YOLOv5 Pending CN117037085A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311022687.4A CN117037085A (en) 2023-08-15 2023-08-15 Vehicle identification and quantity statistics monitoring method based on improved YOLOv5

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311022687.4A CN117037085A (en) 2023-08-15 2023-08-15 Vehicle identification and quantity statistics monitoring method based on improved YOLOv5

Publications (1)

Publication Number Publication Date
CN117037085A true CN117037085A (en) 2023-11-10

Family

ID=88633243

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311022687.4A Pending CN117037085A (en) 2023-08-15 2023-08-15 Vehicle identification and quantity statistics monitoring method based on improved YOLOv5

Country Status (1)

Country Link
CN (1) CN117037085A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117994987A (en) * 2024-04-07 2024-05-07 东南大学 Traffic parameter extraction method and related device based on target detection technology
CN117994987B (en) * 2024-04-07 2024-06-11 东南大学 Traffic parameter extraction method and related device based on target detection technology

Similar Documents

Publication Publication Date Title
Li et al. Traffic light recognition for complex scene with fusion detections
CN107563372B (en) License plate positioning method based on deep learning SSD frame
CN109558823B (en) Vehicle identification method and system for searching images by images
US7519197B2 (en) Object identification between non-overlapping cameras without direct feature matching
CN111667512A (en) Multi-target vehicle track prediction method based on improved Kalman filtering
CN111914911B (en) Vehicle re-identification method based on improved depth relative distance learning model
CN114627447A (en) Road vehicle tracking method and system based on attention mechanism and multi-target tracking
CN114170580A (en) Highway-oriented abnormal event detection method
CN110781785A (en) Traffic scene pedestrian detection method improved based on fast RCNN algorithm
CN111915583A (en) Vehicle and pedestrian detection method based on vehicle-mounted thermal infrared imager in complex scene
CN114898326A (en) Method, system and equipment for detecting reverse running of one-way vehicle based on deep learning
CN117037085A (en) Vehicle identification and quantity statistics monitoring method based on improved YOLOv5
CN116402850A (en) Multi-target tracking method for intelligent driving
CN112434566A (en) Passenger flow statistical method and device, electronic equipment and storage medium
CN113256731A (en) Target detection method and device based on monocular vision
CN113744316A (en) Multi-target tracking method based on deep neural network
CN114049610B (en) Active discovery method for motor vehicle reversing and reverse driving illegal behaviors on expressway
CN115761888A (en) Tower crane operator abnormal behavior detection method based on NL-C3D model
CN114926796A (en) Bend detection method based on novel mixed attention module
CN114842285A (en) Roadside berth number identification method and device
CN112560799B (en) Unmanned aerial vehicle intelligent vehicle target detection method based on adaptive target area search and game and application
CN109063543B (en) Video vehicle weight recognition method, system and device considering local deformation
CN115100249B (en) Intelligent factory monitoring system based on target tracking algorithm
CN110889347A (en) Density traffic flow counting method and system based on space-time counting characteristics
CN115565157A (en) Multi-camera multi-target vehicle tracking method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination