CN113077496A - Real-time vehicle detection and tracking method and system based on lightweight YOLOv3 and medium - Google Patents
Real-time vehicle detection and tracking method and system based on lightweight YOLOv3 and medium
- Publication number
- CN113077496A (application number CN202110413744.6A)
- Authority
- CN
- China
- Prior art keywords
- tracking
- algorithm
- vehicle
- vehicles
- lightweight
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/277—Analysis of motion involving stochastic approaches, e.g. using Kalman filters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/08—Detecting or categorising vehicles
Abstract
The invention provides a real-time vehicle detection and tracking method based on lightweight YOLOv3, which comprises the following steps: step 1: detecting the vehicles in the traffic video by adopting a lightweight YOLOv3-based algorithm, and marking the positions of the vehicles in the video with prior frames; step 2: tracking the vehicle position at the next moment by applying a Kalman filtering algorithm; step 3: on the basis of Kalman filtering algorithm tracking, determining the unique tag ID of each detected target by using the Hungarian matching algorithm, realizing accurate positioning and tracking of multiple targets. The whole scheme of the invention has strong robustness and a low miss rate, is easy to extend to various vehicle categories, and meets the requirements of vehicle detection and continuous tracking in surveillance video.
Description
Technical Field
The invention relates to the technical field of deep learning, in particular to a real-time vehicle detection and tracking method and system based on lightweight YOLOv3 and a medium.
Background
With the rapid growth in the number of automobiles, many researchers have intensively studied advanced driver-assistance systems (ADAS), and vehicle detection has become a key topic in ADAS research. Effective detection and tracking of vehicles on the road ahead is an essential component of the judgment and early-warning functions of a safety-assisted driving system, and miniaturization of the detection model is a prerequisite for fast, real-time operation on vehicle-mounted embedded devices.
In target detection algorithms based on traditional methods, the classifier built on a hand-crafted feature extractor generalizes poorly: different features must be designed and selected for different scenes, reasonable feature extraction is difficult, the computational complexity is high, and practical application is limited.
Detection algorithms based on deep learning fall into two classes: region-based methods and regression-based methods. Region-based methods generate candidate regions with a selective-search algorithm and then classify them with a convolutional neural network; the main methods include R-CNN, Fast R-CNN and the like. Region-based methods detect in two steps and achieve high detection accuracy, but their networks are complex and their detection speed is low. Regression methods such as SSD and YOLO treat target detection as a regression problem and directly regress the object class probability and coordinate position. However, their network structures are still large, so porting and deploying them to in-vehicle embedded devices suffers from low running speed, high deployment cost and other drawbacks.
Disclosure of Invention
In view of the defects in the prior art, the present invention aims to provide a real-time vehicle detection and tracking method and system and medium based on lightweight YOLOv 3.
The invention provides a real-time vehicle detection and tracking method based on lightweight YOLOv3, which comprises the following steps:
step 1: detecting the vehicles in the traffic video by adopting a lightweight YOLOv 3-based algorithm, and marking the positions of the vehicles in the video by using a priori frames;
step 2: tracking the vehicle position at the next moment by applying a Kalman filtering algorithm;
step 3: on the basis of Kalman filtering algorithm tracking, the unique tag ID of the detected target is determined by using the Hungarian matching algorithm, and accurate positioning and tracking of a plurality of targets are realized.
Preferably, the backbone network of the lightweight YOLOv3 algorithm employs 7 convolutional layers, and in the residual network structure of each convolutional layer, fewer repeated residual units are employed.
Preferably, the step 1 comprises the steps of:
step 1.1: performing K-means++ clustering on the vehicle frames in the training set, selecting three types of vehicle frames with different sizes, selecting three vehicle frames with different shapes in each type of size, and taking nine vehicle frame shapes as prior frames;
step 1.2: acquiring a single-frame video image in a traffic video;
step 1.3: and predicting the coordinates of the candidate vehicles by adopting a target detection algorithm based on the lightweight YOLOv3, and framing all the candidate vehicles by using a priori frames according with the sizes of the vehicles.
Preferably, in step 2, a kalman filter algorithm is used to predict the position of the frame at the next time, and the state of the kalman filter is updated.
The invention also provides a real-time vehicle detection and tracking system based on the lightweight YOLOv3, which comprises the following modules:
module M1: detecting vehicles in the traffic video by adopting a lightweight YOLOv 3-based algorithm, and marking the positions of the vehicles by using a priori frame;
module M2: tracking the vehicle position at the next moment by applying a Kalman filtering algorithm;
module M3: on the basis of Kalman filtering algorithm tracking, the unique tag ID of the detected target is determined by using the Hungarian matching algorithm, and accurate positioning and tracking of a plurality of targets are realized.
Preferably, the backbone network of the lightweight YOLOv3 algorithm employs 7 convolutional layers, and in the residual network structure of each convolutional layer, fewer repeated residual units are employed.
Preferably, the module M1 includes the following modules:
module M1.1: performing K-means++ clustering on the vehicle frames in the training set, selecting three types of vehicle frames with different sizes, selecting three vehicle frames with different shapes in each type of size, and taking nine vehicle frame shapes as prior frames;
module M1.2: acquiring a single-frame video image in a traffic video;
module M1.3: and predicting the coordinates of the candidate vehicles by adopting a target detection algorithm based on the lightweight YOLOv3, and framing all the candidate vehicles by using a priori frames according with the sizes of the vehicles.
Preferably, the module M2 uses a kalman filter algorithm to predict the position of the box at the next time, and updates the state of the kalman filter.
The invention also provides a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the method as set forth above.
Compared with the prior art, the invention has the following beneficial effects:
1. according to the invention, the lightweight YOLOv3 algorithm is used for vehicle detection; vehicles of different sizes can be detected and their positions framed, features do not need to be designed manually, the feature-selection process is omitted, the extracted features are of better quality, the method is more robust in complex scenes, and the miss rate is low;
2. the invention improves the traditional YOLOv3 algorithm; with the average accuracy basically unchanged, the network parameters are reduced, the network model shrinks to 1/4 of the original YOLOv3, and the detection speed is doubled;
3. the method adopts a Kalman filtering algorithm to track the vehicle position at the next moment after target detection;
4. on the basis of adopting a Kalman filtering algorithm, a Hungarian matching algorithm is utilized to carry out association matching on vehicles in adjacent frames of a video, the unique label ID of a detected target is determined, accurate positioning and tracking of a plurality of targets are realized, and the unstable detection conditions such as detection discontinuity, omission, target occlusion and the like are improved;
5. the detection tracking algorithm has stronger robustness in a complex road environment, and can meet the requirements on the precision and the speed of vehicle detection tracking in the actual intelligent driving process.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a diagram of a neural network architecture of the present invention;
FIG. 3 is a diagram of the residual network architecture of the present invention;
FIG. 4 is a diagram illustrating the effect of the present invention on the KITTI training set;
FIG. 5 is a diagram of the effect of the model of the present invention on the validation set.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that various changes and modifications can be made by those skilled in the art without departing from the spirit of the invention, all of which fall within the scope of the present invention.
Referring to fig. 1, a real-time vehicle detection and tracking method in a traffic video is divided into a vehicle detection module and a vehicle tracking module.
Referring to fig. 3, in the vehicle detection stage a target detection algorithm based on lightweight YOLOv3 is proposed. The algorithm improves on the traditional YOLOv3 network structure: Darknet-53 is not used as the backbone network; instead, the design borrows from the YOLOv3-tiny structure, which is similar to Darknet-19. The backbone network uses 7 convolutional layers, and in the residual network structure attached to each convolutional layer, fewer repeated residual units are used, so the network depth is reduced by cutting convolutional layers. For the prior-frame sizes, a modified K-means++ algorithm selects suitable sizes. On the final network output, logistic regression is used so that the output prediction boxes better cover the vehicles to be detected.
The improved lightweight YOLOv3 network structure is shown in fig. 2 and can be divided into a backbone network and a detection head. First, the backbone network extracts features from the image; its input size is 416 × 416 × 3, and a convolutional layer with 16 convolution kernels of size 3 × 3 performs the initial feature extraction. The convolution formula for feature extraction is:
a_{i,j} = \sum_c \sum_m \sum_n w_{c,m,n} \, x_{c,i+m,j+n} + w_b,
where a_{i,j} is the value at coordinate (i, j) in the feature map; w_{c,m,n} is the kernel value at coordinate (m, n) of channel c of the convolution kernel; x_{c,i+m,j+n} is the input value at coordinate (i + m, j + n) of channel c; and w_b is the bias term of the convolution kernel.
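As an illustrative sketch (not part of the patent text), the convolution sum above can be evaluated at a single output position in NumPy; the toy input and kernel values are assumptions for the check:

```python
import numpy as np

def conv_at(x, w, w_b, i, j):
    """Value a_{i,j} of the feature map: sum over channels c and kernel
    offsets (m, n) of w[c, m, n] * x[c, i+m, j+n], plus bias w_b."""
    C, K, _ = w.shape
    return sum(
        w[c, m, n] * x[c, i + m, j + n]
        for c in range(C) for m in range(K) for n in range(K)
    ) + w_b

# Tiny check: 1 channel, 2x2 kernel of ones over a 3x3 input of ones -> 4 + bias
x = np.ones((1, 3, 3))
w = np.ones((1, 2, 2))
print(conv_at(x, w, 0.5, 0, 0))  # 4.5
```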
The output of the previous feature-extraction layer is used as the input of the next layer; filtering is then applied with 32 3 × 3 convolution kernels with stride 2, and a residual block is used to strengthen the network's feature extraction. As shown in fig. 3, the residual block connects 16 1 × 1 convolution kernels and 32 3 × 3 convolution kernels, both with stride 1. The residual network formula is:
out=f2[f1(in)]+in,
where in is the input, f_1 is the 1 × 1 convolutional layer, f_2 is the 3 × 3 convolutional layer, and out is the output.
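A minimal sketch of the skip connection, assuming simple stand-in callables in place of the actual 1 × 1 and 3 × 3 convolutions (this illustrates only the out = f2(f1(in)) + in structure, not the patent's layer weights):

```python
import numpy as np

def residual_block(x, f1, f2):
    """out = f2(f1(x)) + x  -- the skip connection from the formula above.
    f1 stands in for the 1x1 conv, f2 for the 3x3 conv (both hypothetical)."""
    return f2(f1(x)) + x

x = np.array([1.0, 2.0, 3.0])
double = lambda v: 2 * v      # stand-in for the 1x1 conv
add_one = lambda v: v + 1     # stand-in for the 3x3 conv
print(residual_block(x, double, add_one))  # elementwise: 4, 7, 10
```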
Following the same principle, the subsequent network stages are filtered in turn with 64, 128, 256, 512 and 1024 3 × 3 convolution kernels with stride 2, so the width and height of the feature-extraction layers shrink while the depth grows, and deeper features can be extracted. Meanwhile, 2, 4 and 4 residual blocks are attached in turn after the first of these subsequent convolution stages to strengthen feature extraction. Compared with YOLOv3, a large number of network layers are removed, the network structure is miniaturized, and both training and network forward inference are accelerated.
Referring to fig. 4 and 5, next is a detection head for performing target detection on a feature map given by the backbone network and giving the position of a detection frame.
The detection head separately extracts 3 feature-extraction convolutional layers, with sizes 52 × 52, 26 × 26 and 13 × 13 and depths 128, 256 and 1024 respectively. The 13 × 13 feature layer is convolved in turn with 512 1 × 1 convolution kernels, 1024 3 × 3 convolution kernels, and 18 1 × 1 convolution kernels, so that the output of this feature layer is 13 × 13 × 18. Each 1 × 1 × 18 vector is responsible for judging whether the object in its receptive field is a vehicle, and if so, the center coordinates and size of the prediction box are given at the same time.
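As a hedged shape check (the split of the 18 channels into 3 anchors × 6 values, i.e. 4 box values, 1 objectness score and 1 class score, is an assumption consistent with the single vehicle class described above):

```python
import numpy as np

# Hypothetical head output: 18 = 3 anchors x (4 box values + 1 objectness + 1 class)
head = np.zeros((13, 13, 18))
per_anchor = head.reshape(13, 13, 3, 6)   # one 6-vector per anchor per cell
print(per_anchor.shape)  # (13, 13, 3, 6)
```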
The 13 × 13 feature extraction layer does not only output its own result: the 13 × 13 features are also enlarged to 26 × 26 by upsampling, the upsampled features are concatenated with the original 26 × 26 feature layer, and the fused 26 × 26 layer is convolved and output simultaneously. The original 26 × 26 layer is likewise connected with the 52 × 52 layer for output in the same way, so as to obtain multi-scale target detection.
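The upsample-and-concatenate fusion can be sketched in NumPy as a shape check (the channel depths here are assumptions taken from the layer depths mentioned above, and nearest-neighbour upsampling is assumed):

```python
import numpy as np

# 13x13 feature map (128 channels, hypothetical) upsampled 2x and concatenated
# with the original 26x26 layer (256 channels) along the channel axis.
small = np.zeros((13, 13, 128))
up = small.repeat(2, axis=0).repeat(2, axis=1)   # nearest-neighbour upsample to 26x26
mid = np.zeros((26, 26, 256))
fused = np.concatenate([up, mid], axis=-1)
print(fused.shape)  # (26, 26, 384)
```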
For the size of the output prediction boxes, a clustering algorithm selects suitable prior-frame sizes from the training set to predict vehicles at different scales in different videos. The invention uses the K-means++ algorithm to compute the clusters; K-means++ improves the initial-point selection of the K-means algorithm so that the cluster centers are as far apart as possible. The key steps of the K-means++ algorithm are as follows:
Step S1: randomly take a sample from the data set as the initial clustering center u_1;
Step S2: calculate the distance D(x_i) of each sample x_i in the data set X from its nearest cluster center:
D(x_i) = min_r ||x_i − u_r||;
Step S3: calculate the probability of each sample being selected as the next cluster center, P(x_i) = D(x_i)² / Σ_{x∈X} D(x)², and select the next clustering center accordingly;
Step S4: repeat steps S2 and S3 until k clustering centers have been selected.
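The seeding steps S1 to S4 can be sketched as follows; this is an illustrative 1-D version with an assumed fixed random seed, not the patent's implementation (which clusters 2-D box shapes):

```python
import random

def kmeanspp_centers(points, k, rng=random.Random(0)):
    """K-means++ seeding: first center uniform at random (S1); each next
    center drawn with probability proportional to D(x)^2 (S2, S3),
    repeated until k centers are chosen (S4). Fixed seed for reproducibility."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min((p - c) ** 2 for c in centers) for p in points]  # D(x)^2
        total = sum(d2)
        r, acc = rng.random() * total, 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc >= r:          # weighted sampling by D(x)^2
                centers.append(p)
                break
    return centers

centers = kmeanspp_centers([0.0, 0.1, 5.0, 5.1, 10.0], 3)
print(len(centers))  # 3
```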
After the 9 prior frames are obtained from the cluster centers, the larger 52 × 52 feature-extraction convolutional layer uses the 3 smallest prior frames and has the smallest receptive field, the middle 26 × 26 layer uses the 3 middle prior frames, and the smaller 13 × 13 layer uses the 3 largest prior frames and has the largest receptive field.
After the neural network outputs the prediction frame, in order to enable the prediction frame to better cover the detected target, a logistic regression function is used for carrying out confidence regression on each prior frame on different scales, the frame value of the object is predicted, and the most appropriate target category area is selected according to the confidence. The logistic regression function prediction formula is:
b_x = σ(t_x) + c_x, b_y = σ(t_y) + c_y, b_w = p_w e^{t_w}, b_h = p_h e^{t_h},
where c_x, c_y are the coordinate offsets of the grid cell within the image; p_w, p_h are the side lengths (width and height) of the prior frame; t_x, t_y, t_w, t_h are the target values learned by the deep network; and b_x, b_y, b_w, b_h are the coordinate values of the prediction box finally computed by the formula.
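A minimal sketch of this box decoding, with illustrative grid and prior values (the function name and inputs are assumptions for the demonstration):

```python
import math

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """YOLOv3-style decoding: sigmoid-squashed center offsets added to the
    grid cell position; prior size scaled by exp of the predicted log-ratios."""
    sig = lambda t: 1.0 / (1.0 + math.exp(-t))
    bx = sig(tx) + cx
    by = sig(ty) + cy
    bw = pw * math.exp(tw)
    bh = ph * math.exp(th)
    return bx, by, bw, bh

# Zero logits land the center in the middle of cell (3, 4), prior unscaled.
print(decode_box(0.0, 0.0, 0.0, 0.0, cx=3, cy=4, pw=10, ph=20))
# (3.5, 4.5, 10.0, 20.0)
```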
In a vehicle tracking module, the invention uses Kalman filtering algorithm and Hungarian algorithm to realize the accurate positioning and tracking of a plurality of targets.
And calculating the centroid coordinate of the frame according to the prior frame detected by the lightweight YOLOv3 target detection algorithm, predicting the position of the frame at the next moment by using a Kalman filtering algorithm, and updating the state of a Kalman filter.
The Kalman filtering algorithm predicts the coordinate position of the target at the current time from the coordinate position of the vehicle detected at the previous time. First, the centroid coordinates (x, y) of the detected object are computed from the vehicle box given by the deep-learning detection algorithm and expressed as the current target state X_{t|t}; X_{t-1|t-1} is the target state at the previous time, X_{t|t-1} is the predicted target state at the current time, the observation Z_t is the centroid coordinate of the actually detected object, P_{t|t} is the estimation-error covariance at the current time, and P_{t|t-1} is the estimation-error covariance at the current time predicted from the previous time. A is the state-transition matrix, H is the observation matrix, K_t is the Kalman gain matrix, W_{t-1|t-1} is the excitation (process) noise at the previous time, and Q, R are the covariance matrices of the excitation noise and the observation noise respectively. The collected Kalman filtering and tracking formulas are:
X_{t|t-1} = A X_{t-1|t-1} + W_{t-1|t-1}, (1)
P_{t|t-1} = A P_{t-1|t-1} Aᵀ + Q, (2)
X_{t|t} = X_{t|t-1} + K_t (Z_t − H X_{t|t-1}), (3)
P_{t|t} = P_{t|t-1} − K_t H P_{t|t-1}, (4)
K_t = P_{t|t-1} Hᵀ (H P_{t|t-1} Hᵀ + R)⁻¹. (5)
the position of the vehicle detected at the previous time at the current time is predicted using equations (1), (2), and the state of the kalman filter is updated using equations (3), (4), (5).
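The predict/update cycle of equations (1) to (5) can be sketched in NumPy. This is a minimal constant-position model for a 2-D centroid; the matrices A, H, Q, R below are illustrative assumptions, not the patent's tuned motion model:

```python
import numpy as np

A = np.eye(2)          # state-transition matrix (constant-position assumption)
H = np.eye(2)          # observation matrix
Q = 0.01 * np.eye(2)   # process-noise covariance
R = 0.1 * np.eye(2)    # observation-noise covariance

def predict(x, P):
    """Eqs (1), (2): project the state and covariance forward one step."""
    return A @ x, A @ P @ A.T + Q

def update(x_pred, P_pred, z):
    """Eqs (3), (4), (5): fold the detected centroid z into the estimate."""
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)  # eq (5)
    x = x_pred + K @ (z - H @ x_pred)                       # eq (3)
    P = P_pred - K @ H @ P_pred                             # eq (4)
    return x, P

x, P = np.array([10.0, 20.0]), np.eye(2)
x_pred, P_pred = predict(x, P)
x_new, P_new = update(x_pred, P_pred, z=np.array([11.0, 21.0]))
# The updated state moves most of the way toward the new detection.
print(bool(np.all(np.abs(x_new - np.array([11.0, 21.0])) < 1.0)))  # True
```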
On the basis of tracking the vehicle position at the next moment after the target is detected by adopting a Kalman filtering algorithm, vehicles in adjacent frames of the video are associated and matched by utilizing a Hungarian matching algorithm, the unique ID label of the detected target is determined, and accurate positioning and tracking of a plurality of targets are realized.
When performing association matching, the Hungarian optimal matching algorithm uses the Euclidean distance between the two different coordinate sets as the cost matrix, and then performs feature association with the Hungarian algorithm: the minimum value d_min of the Euclidean distance between each centroid coordinate predicted at the previous time and the detected coordinates at the current time is solved, and the predicted coordinates of the previous time are assigned to and associated with the detected coordinates of the current time. The Euclidean distance is computed as:
d = √((x_p − x_d)² + (y_p − y_d)²),
where (x_p, y_p) belongs to the set of box centroid coordinates predicted from the previous times, and (x_d, y_d) belongs to the set of box centroid coordinates detected at the current time.
When a detection at the current time is not assigned to any predicted value from the previous time, that is, when the number of predicted coordinates at the previous time is smaller than the number of detected coordinates at the current time, the detection is tracked as a new target. The specific formula is:
n_{t-1} < n_t,
where n_{t-1} is the number of predicted coordinates at time t − 1 and n_t is the number of actually detected coordinates at time t.
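The distance-cost assignment and the new-target condition can be sketched with SciPy's Hungarian solver; the centroid values below are hypothetical:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical centroids: two predictions from t-1, three detections at t.
preds = np.array([[10.0, 10.0], [50.0, 50.0]])
dets = np.array([[51.0, 49.0], [11.0, 10.0], [90.0, 90.0]])

# Cost matrix = pairwise Euclidean distances, then Hungarian assignment.
cost = np.linalg.norm(preds[:, None, :] - dets[None, :, :], axis=-1)
rows, cols = linear_sum_assignment(cost)
print(dict(zip(rows.tolist(), cols.tolist())))  # {0: 1, 1: 0}

# Detections left unassigned (n_{t-1} < n_t) start new tracks.
new_tracks = set(range(len(dets))) - set(cols.tolist())
print(new_tracks)  # {2}
```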
In practical tracking, considering missed detections and tracking failures, tracking is judged lost when the computed Euclidean distance exceeds a set threshold or the vehicle object fails to be detected for several consecutive frames. The specific formula is:
f > f_max ∨ d > d_max,
where f is the number of consecutive frames in which the target was not detected; f_max is the maximum number of lost frames; d is the Euclidean distance; and d_max is the maximum distance threshold.
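The track-loss test above reduces to a one-line predicate; the threshold values here are illustrative assumptions, not taken from the patent:

```python
def track_lost(missed_frames, dist, f_max=5, d_max=80.0):
    """Declare a track lost when the target went undetected for more than
    f_max consecutive frames OR the match distance exceeds d_max
    (both thresholds are hypothetical defaults)."""
    return missed_frames > f_max or dist > d_max

print(track_lost(6, 10.0))   # True  (too many missed frames)
print(track_lost(2, 10.0))   # False (still tracked)
```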
The invention also provides a real-time vehicle detection and tracking system based on the lightweight YOLOv3, which comprises the following modules: module M1: detecting vehicles in the traffic video by adopting a lightweight YOLOv 3-based algorithm, and marking the positions of the vehicles by using a priori frame; module M2: tracking the vehicle position at the next moment by applying a Kalman filtering algorithm; module M3: on the basis of Kalman filtering algorithm tracking, the unique tag ID of the detected target is determined by using Hungarian matching algorithm, and accurate positioning and tracking of a plurality of targets are realized.
The backbone network of the lightweight YOLOv3 algorithm employs 7 convolutional layers, and in the residual network structure of each convolutional layer, fewer repeated residual units are employed.
Module M1 includes the following modules: module M1.1: performing K-means++ clustering on the vehicle frames in the training set, selecting three types of vehicle frames with different sizes, selecting three vehicle frames with different shapes in each type of size, and taking nine vehicle frame shapes as prior frames; module M1.2: acquiring a single-frame video image in a traffic video; module M1.3: predicting the coordinates of the candidate vehicles by adopting a target detection algorithm based on the lightweight YOLOv3, and framing all the candidate vehicles with prior frames according with the sizes of the vehicles.
The kalman filter algorithm is used in block M2 to predict the position of the box at the next time while updating the state of the kalman filter.
The invention also provides a computer-readable storage medium having a computer program stored thereon, which, when being executed by a processor, carries out the steps of the method as described above.
Those skilled in the art will appreciate that, in addition to implementing the system and its various devices, modules, units provided by the present invention as pure computer readable program code, the system and its various devices, modules, units provided by the present invention can be fully implemented by logically programming method steps in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system and various devices, modules and units thereof provided by the invention can be regarded as a hardware component, and the devices, modules and units included in the system for realizing various functions can also be regarded as structures in the hardware component; means, modules, units for performing the various functions may also be regarded as structures within both software modules and hardware components for performing the method.
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.
Claims (9)
1. A real-time vehicle detection and tracking method based on lightweight YOLOv3 is characterized by comprising the following steps:
step 1: detecting the vehicles in the traffic video by adopting a lightweight YOLOv 3-based algorithm, and marking the positions of the vehicles in the video by using a priori frames;
step 2: tracking the vehicle position at the next moment by applying a Kalman filtering algorithm;
and step 3: on the basis of Kalman filtering algorithm tracking, the unique tag ID of the detected target is determined by using Hungarian matching algorithm, and accurate positioning and tracking of a plurality of targets are realized.
2. The method as claimed in claim 1, wherein the backbone network of the lightweight YOLOv3 algorithm employs 7 convolutional layers, and in the residual network structure of each convolutional layer, fewer repeated residual units are employed.
3. The method for detecting and tracking the vehicle in real time based on the lightweight YOLOv3 as claimed in claim 1, wherein the step 1 comprises the following steps:
step 1.1: performing K-means++ clustering on the vehicle frames in the training set, selecting three types of vehicle frames with different sizes, selecting three vehicle frames with different shapes in each type of size, and taking nine vehicle frame shapes as prior frames;
step 1.2: acquiring a single-frame video image in a traffic video;
step 1.3: and predicting the coordinates of the candidate vehicles by adopting a target detection algorithm based on the lightweight YOLOv3, and framing all the candidate vehicles by using a priori frames according with the sizes of the vehicles.
4. The method as claimed in claim 1, wherein the step 2 uses kalman filter algorithm to predict the position of the frame at the next time, and updates the state of the kalman filter.
5. A real-time vehicle detection and tracking system based on lightweight YOLOv3 is characterized by comprising the following modules:
module M1: detecting vehicles in the traffic video by adopting a lightweight YOLOv 3-based algorithm, and marking the positions of the vehicles by using a priori frame;
module M2: tracking the vehicle position at the next moment by applying a Kalman filtering algorithm;
module M3: on the basis of Kalman filtering algorithm tracking, the unique tag ID of the detected target is determined by using Hungarian matching algorithm, and accurate positioning and tracking of a plurality of targets are realized.
6. The system of claim 5, wherein the backbone network of the lightweight YOLOv3 algorithm employs 7 convolutional layers, and fewer repeating residual units are employed in the residual network structure of each convolutional layer.
7. The system for real-time vehicle detection and tracking based on the lightweight YOLOv3 of claim 5, wherein the module M1 comprises the following modules:
module M1.1: performing K-means++ clustering on the vehicle frames in the training set, selecting three types of vehicle frames with different sizes, selecting three vehicle frames with different shapes in each type of size, and taking nine vehicle frame shapes as prior frames;
module M1.2: acquiring a single-frame video image in a traffic video;
module M1.3: predicting the coordinates of candidate vehicles with the target detection algorithm based on the lightweight YOLOv3, and framing all candidate vehicles with the prior boxes that match the vehicle sizes.
8. The system for real-time vehicle detection and tracking based on the lightweight YOLOv3 of claim 5, wherein the module M2 uses a Kalman filter algorithm to predict the position of the bounding box at the next time step and to update the state of the Kalman filter.
9. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110413744.6A CN113077496A (en) | 2021-04-16 | 2021-04-16 | Real-time vehicle detection and tracking method and system based on lightweight YOLOv3 and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110413744.6A CN113077496A (en) | 2021-04-16 | 2021-04-16 | Real-time vehicle detection and tracking method and system based on lightweight YOLOv3 and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113077496A true CN113077496A (en) | 2021-07-06 |
Family
ID=76618134
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110413744.6A Pending CN113077496A (en) | 2021-04-16 | 2021-04-16 | Real-time vehicle detection and tracking method and system based on lightweight YOLOv3 and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113077496A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113129336A (en) * | 2021-03-31 | 2021-07-16 | 同济大学 | End-to-end multi-vehicle tracking method, system and computer readable medium |
CN116778224A (en) * | 2023-05-09 | 2023-09-19 | 广州华南路桥实业有限公司 | Vehicle tracking method based on video stream deep learning |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110472467A (en) * | 2019-04-08 | 2019-11-19 | 江西理工大学 | The detection method for transport hub critical object based on YOLO v3 |
CN111126152A (en) * | 2019-11-25 | 2020-05-08 | 国网信通亿力科技有限责任公司 | Video-based multi-target pedestrian detection and tracking method |
CN111476826A (en) * | 2020-04-10 | 2020-07-31 | 电子科技大学 | Multi-target vehicle tracking method based on SSD target detection |
CN112241969A (en) * | 2020-04-28 | 2021-01-19 | 北京新能源汽车技术创新中心有限公司 | Target detection tracking method and device based on traffic monitoring video and storage medium |
Non-Patent Citations (2)
Title |
---|
HE DANNI: "Research on Multi-Vehicle Detection and Tracking Algorithms Based on Deep Learning", China Masters' Theses Full-text Database (Electronic Journals) * |
LIU JUN et al.: "Real-time Vehicle Detection and Tracking Based on the Enhanced Tiny YOLOV3 Algorithm", Transactions of the Chinese Society of Agricultural Engineering * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10733755B2 (en) | Learning geometric differentials for matching 3D models to objects in a 2D image | |
CN108805083B (en) | Single-stage video behavior detection method | |
CN110033002B (en) | License plate detection method based on multitask cascade convolution neural network | |
CN108509859B (en) | Non-overlapping area pedestrian tracking method based on deep neural network | |
Dewangan et al. | RCNet: road classification convolutional neural networks for intelligent vehicle system | |
CN112101221B (en) | Method for real-time detection and identification of traffic signal lamp | |
CN108197326B (en) | Vehicle retrieval method and device, electronic equipment and storage medium | |
CN110175649B (en) | Rapid multi-scale estimation target tracking method for re-detection | |
CN109800692B (en) | Visual SLAM loop detection method based on pre-training convolutional neural network | |
CN108171112A (en) | Vehicle identification and tracking based on convolutional neural networks | |
CN111667512B (en) | Multi-target vehicle track prediction method based on improved Kalman filtering | |
CN113378890B (en) | Lightweight pedestrian vehicle detection method based on improved YOLO v4 | |
US11741368B2 (en) | Image segmentation | |
CN103984948B (en) | A kind of soft double-deck age estimation method based on facial image fusion feature | |
CN112052802B (en) | Machine vision-based front vehicle behavior recognition method | |
CN111640136B (en) | Depth target tracking method in complex environment | |
CN113077496A (en) | Real-time vehicle detection and tracking method and system based on lightweight YOLOv3 and medium | |
CN113762209A (en) | Multi-scale parallel feature fusion road sign detection method based on YOLO | |
CN112738470B (en) | Method for detecting parking in highway tunnel | |
CN111626120B (en) | Target detection method based on improved YOLO-6D algorithm in industrial environment | |
CN112990065A (en) | Optimized YOLOv5 model-based vehicle classification detection method | |
Mayr et al. | Self-supervised learning of the drivable area for autonomous vehicles | |
CN113963333B (en) | Traffic sign board detection method based on improved YOLOF model | |
CN113792631B (en) | Aircraft detection and tracking method based on multi-scale self-adaption and side-domain attention | |
CN116630932A (en) | Road shielding target detection method based on improved YOLOV5 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210706 |