CN118155012A - YOLOv5s-based lightweight vehicle-mounted model training method - Google Patents
YOLOv5s-based lightweight vehicle-mounted model training method
- Publication number
- CN118155012A (application CN202311806698.1A)
- Authority
- CN
- China
- Prior art keywords
- model
- yolov5s
- network
- training
- network model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a YOLOv5s-based lightweight vehicle-mounted model training method, and relates to the technical field of lightweight vehicle-mounted models. The method comprises at least the following steps. S1: acquire an original image dataset and preprocess it to obtain a target image dataset, where the original image dataset is a large collection of images containing the targets to be detected, captured by an automobile camera, and the preprocessing comprises data labeling and data enhancement. S2: train a YOLOv5s network model with the target image dataset to obtain a first network model. S3: improve the YOLOv5s network based on the lightweight MobileOne network and the Slim-Neck structure to obtain a lightweight YOLOv5s network model. By preprocessing the original image set with data labeling and data enhancement to obtain the target image dataset, the method broadens the breadth and distribution space of the data, which in turn effectively ensures the detection precision and generalization capability of the lightweight YOLOv5s network model.
Description
Technical Field
The invention relates to the technical field of lightweight vehicle-mounted models, and in particular to a YOLOv5s-based lightweight vehicle-mounted model training method.
Background
The task of object detection is to find all objects of interest in an image and determine their category and location. It is the premise and basis of many computer vision tasks, and has long been one of the most challenging problems in the field of computer vision.
With extensive experimental research, more and more image processing and recognition technologies have continued to emerge. In particular, the application and popularization in recent years of artificial intelligence technologies represented by deep learning have provided an important new paradigm for object detection: suitable target features can be computed automatically simply by building an appropriate network model and training it on a dataset.
With the rapid development of deep learning, automatic driving and new-energy vehicles, more and more models need to run on the vehicle-mounted side. However, as network layers continue to deepen, object detection models become increasingly complex and their computational requirements keep growing, which makes them difficult to run on the vehicle-mounted side; it is also difficult to balance detection accuracy and speed. The YOLO series of networks has the advantages of high detection speed and strong real-time performance, and is widely applied in the field of real-time object detection, but existing YOLO algorithms still cannot satisfy vehicle-mounted application scenarios in terms of accuracy and speed.
There is therefore a need for a new solution to the above problems.
Disclosure of Invention
The invention aims to provide a YOLOv5s-based lightweight vehicle-mounted model training method.
In order to achieve the above purpose, the present invention provides the following technical solution: a YOLOv5s-based lightweight vehicle-mounted model training method comprising at least the following steps:
S1: acquiring an original image dataset and preprocessing it to obtain a target image dataset, wherein the original image dataset is a large collection of images containing the targets to be detected, captured by an automobile camera, and the preprocessing comprises data labeling and data enhancement;
S2: training a YOLOv5s network model with the target image dataset to obtain a first network model;
S3: improving the YOLOv5s network based on the lightweight MobileOne network and the Slim-Neck structure to obtain a lightweight YOLOv5s network model;
S4: taking the first network model as a teacher model and the lightweight YOLOv5s network model as a student model, and performing knowledge distillation to obtain a target network model.
Preferably, the data labeling tags each image in the dataset with the LabelImg tool, converts the annotations into a format that YOLOv5s can read for training, and finally divides the dataset into a training set and a validation set.
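As an illustrative sketch of the labeling step (not part of the claimed method; the function names and the 80/20 split ratio are assumptions), LabelImg pixel-coordinate annotations can be converted into the normalized `class x_center y_center width height` text lines that YOLOv5 reads, and the labeled images divided into training and validation sets:

```python
import random

def to_yolo_format(box, img_w, img_h):
    """Convert a (class_id, x_min, y_min, x_max, y_max) pixel-space box
    to the normalized 'class x_center y_center width height' line YOLOv5 reads."""
    cls, x_min, y_min, x_max, y_max = box
    x_c = (x_min + x_max) / 2 / img_w
    y_c = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{cls} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"

def split_dataset(image_names, val_ratio=0.2, seed=0):
    """Shuffle the labeled images reproducibly and divide them into
    a training set and a validation set."""
    names = sorted(image_names)
    random.Random(seed).shuffle(names)
    n_val = int(len(names) * val_ratio)
    return names[n_val:], names[:n_val]  # (train, val)
```

In practice each returned line would be written to a `.txt` file alongside its image, which is the layout YOLOv5's dataloader expects.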
Preferably, the data enhancement methods comprise the Mosaic algorithm and the Mixup algorithm.
The Mosaic algorithm splices 4 pictures into one large picture by random scaling, random cropping and random arrangement, thereby increasing the diversity of the training data.
The Mixup algorithm randomly selects two pictures and blends them in a certain proportion to generate a new image as enhanced data.
Data labeling is performed on the original image set to obtain a first image set, and data enhancement is performed on the first image set to obtain the target image dataset; this broadens the breadth and distribution space of the data, thereby effectively ensuring the detection precision and generalization capability of the lightweight YOLOv5s network model.
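The Mixup step above can be sketched as follows (an illustrative example, not the patent's implementation; the Beta-distribution parameter `alpha` and the convention of pooling both images' boxes are assumptions borrowed from common YOLOv5 practice):

```python
import numpy as np

def mixup(img1, img2, labels1, labels2, alpha=8.0):
    """Blend two same-sized images as in the Mixup augmentation.
    The mixing ratio lam is drawn from a Beta(alpha, alpha) distribution;
    for detection, the boxes of both source images are kept."""
    lam = np.random.beta(alpha, alpha)
    mixed = (lam * img1.astype(np.float32)
             + (1.0 - lam) * img2.astype(np.float32)).astype(img1.dtype)
    return mixed, labels1 + labels2
```

With a large `alpha`, lam concentrates near 0.5, so both source images stay clearly visible in the blend, which is why a high value is typically used for detection.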
Preferably, step S2 comprises at least the following steps: after configuring the relevant training parameters according to the computing resources of the server, the YOLOv5s network model is trained with the training set of the target image dataset; after training is completed, the trained YOLOv5s network model is evaluated with the validation set of the target image dataset until the result meets a preset index, yielding the first network model.
The evaluation uses the mainstream object-detection metrics in deep learning: mAP (mean average precision over classes), Precision, Recall and FLOPs (the number of floating-point operations performed by the model).
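The Precision and Recall metrics can be illustrated with a minimal matching routine (a sketch under the usual IoU-threshold convention; the greedy matching and the 0.5 threshold are assumptions, and real evaluations such as YOLOv5's additionally sweep confidence thresholds to compute mAP):

```python
def iou(a, b):
    """Intersection-over-union of two (x_min, y_min, x_max, y_max) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def precision_recall(preds, gts, iou_thr=0.5):
    """Greedily match each predicted box to an unused ground-truth box at a
    given IoU threshold; return (precision, recall)."""
    matched = set()
    tp = 0
    for p in preds:
        best, best_iou = None, iou_thr
        for gi, g in enumerate(gts):
            if gi in matched:
                continue
            v = iou(p, g)
            if v >= best_iou:
                best, best_iou = gi, v
        if best is not None:
            matched.add(best)
            tp += 1
    fp = len(preds) - tp
    fn = len(gts) - tp
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if gts else 0.0
    return precision, recall
```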
When the YOLOv5s network model is trained, the original CIoU loss function is replaced with the MPDIoU loss function. MPDIoU contains all the relevant factors considered by existing loss functions, namely the overlapping or non-overlapping area, the center-point distance and the width-height deviation, while simplifying the calculation process.
The MPDIoU loss function can be expressed as:
L_MPDIoU = 1 − IoU + d1² / (w² + h²) + d2² / (w² + h²)
where IoU denotes the conventional intersection over union, d1 denotes the distance between the top-left corner of the ground-truth box and that of the predicted box, d2 denotes the distance between the bottom-right corner of the ground-truth box and that of the predicted box, w denotes the width of the input image, and h denotes the height of the input image.
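The loss for a single predicted/ground-truth box pair can be sketched as follows (illustrative only; the (x_min, y_min, x_max, y_max) box layout is an assumption):

```python
def mpdiou_loss(pred, gt, img_w, img_h):
    """MPDIoU loss: 1 - IoU + d1^2/(w^2+h^2) + d2^2/(w^2+h^2), where d1/d2
    are the top-left / bottom-right corner distances between the two boxes
    and (w, h) is the input-image size."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    # plain intersection-over-union
    ix = max(0.0, min(px2, gx2) - max(px1, gx1))
    iy = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = ix * iy
    union = (px2 - px1) * (py2 - py1) + (gx2 - gx1) * (gy2 - gy1) - inter
    iou_val = inter / union if union else 0.0
    # squared corner distances, normalized by the squared image diagonal
    d1_sq = (px1 - gx1) ** 2 + (py1 - gy1) ** 2
    d2_sq = (px2 - gx2) ** 2 + (py2 - gy2) ** 2
    norm = img_w ** 2 + img_h ** 2
    return 1.0 - iou_val + d1_sq / norm + d2_sq / norm
```

A perfectly matching prediction gives a loss of 0, and both a shrinking overlap and a growing corner distance increase the loss, which is what drives the faster convergence claimed for MPDIoU.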
Preferably, the MobileOne network in S3 is an efficient mobile-side backbone network recently released by Apple's research team.
The lightweight MobileOne network in S3 is the MobileOne network after its original SE attention mechanism module has been replaced with the CoordAttention attention mechanism module.
The Slim-Neck structure is a Neck structure in which the lightweight convolution technique GSConv is introduced to replace standard convolution.
CoordAttention is a lightweight attention mechanism module based on coordinate attention. Replacing the SE attention mechanism modules in the MobileOne network with CoordAttention modules further reduces the computational complexity of the MobileOne network and also helps it locate and identify the position of the target object in the input image more accurately.
The computational cost of the lightweight convolution technique GSConv is about 60%-70% of that of standard convolution, so the amount of calculation can be effectively reduced while the output stays as close to that of standard convolution as possible.
The lightweight YOLOv5s network model obtained in this way not only reduces the parameter count of the model and its hardware resource consumption, but also effectively ensures the detection precision and generalization capability of the network model.
Preferably, step S4 comprises at least the following steps:
the first network model obtained in step S2 after training on the target image dataset is used as the teacher model, the lightweight YOLOv5s network model obtained by the improvement in step S3 is used as the student model, the student model is fine-tuned with knowledge distillation, and the evaluation method of step S2 is finally applied to obtain a target network model that meets expectations.
During distillation, the FGFI (fine-grained feature imitation) method is adopted. Its core idea is that the teacher model should pass more of the key effective information to the student model, rather than ineffective background information.
In general, the feature maps near the key positions of the target areas contain important information from the teacher model, so the key positions near the target areas can be estimated first, and the student model can then be made to imitate the teacher model's feature maps at these positions to obtain better performance.
The specific operation of distillation is as follows:
For each ground-truth box, the IoU between the box and each candidate location is computed, yielding a W×H×K IoU map (W is the width of the feature map, H is the height of the feature map, and K is the number of key positions), denoted M. The maximum value m = max(M) is taken, and the threshold F is computed from m:
F = ψ · m
where ψ is a filtering factor (ψ = 0.5 in the original fine-grained feature imitation method).
Positions whose IoU value is below F are filtered out, and the remaining positions are combined by an OR operation to obtain a W×H mask.
After this operation has been performed for all ground-truth boxes, the per-box masks are combined to obtain the final FGFI mask, which contains the information of the key positions of the target areas.
The distance between the teacher-model and student-model feature maps is computed as:
d(i, j) = Σ_c ( f_adap(s)_{i,j,c} − t_{i,j,c} )²
where (i, j) denotes a spatial position, c denotes a channel, s and t are the network feature maps of the student model and the teacher model respectively, and f_adap is an adaptation function that projects the student features to the channel dimension of the teacher features.
For all estimated key positions, the distillation objective is to minimize the distance between the student-model and teacher-model feature maps at these positions, i.e. to minimize:
L_imitation = (1 / (2·N_p)) · Σ_{i,j} Σ_c I_{i,j} · ( f_adap(s)_{i,j,c} − t_{i,j,c} )²
where I is the mask obtained by the above operations and N_p is the number of positive positions in the mask. The loss function of the final student model is:
L = L_MPDIoU + L_imitation
where L_MPDIoU is the MPDIoU loss function used when training the teacher model.
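The masking and imitation-loss steps above can be sketched in NumPy (an illustrative reconstruction, not the patent's code; the array shapes, the psi factor and the single-ground-truth-box handling are assumptions):

```python
import numpy as np

def fgfi_mask(iou_map, psi=0.5):
    """Build the FGFI imitation mask from a W×H×K IoU map for one
    ground-truth box: threshold F = psi * max(iou_map), then keep locations
    where any of the K candidates reaches F (OR over the last axis)."""
    thr = psi * iou_map.max()
    return (iou_map >= thr).any(axis=-1)  # W×H boolean mask

def imitation_loss(student, teacher, mask):
    """Masked squared distance between (already adapted) student and teacher
    W×H×C feature maps, normalized by twice the number of positive positions."""
    n_pos = mask.sum()
    if n_pos == 0:
        return 0.0
    diff = (student - teacher) ** 2
    return float((diff * mask[..., None]).sum() / (2 * n_pos))
```

For several ground-truth boxes, the per-box masks would be combined with a logical OR before computing the loss, matching the description above.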
By adopting the model distillation technique, knowledge is distilled from the trained YOLOv5s network model into the lightweight YOLOv5s network model, which ensures the detection precision of the target network model. The FGFI method applies fine-grained features to the model distillation process, helping the student model capture richer feature information; by introducing the supervision signal of the original object detection task during training, FGFI better preserves the detail features of the complex model, thereby improving the performance of the lightweight model.
Compared with the prior art, the invention has the following beneficial effects:
1. By preprocessing the original image set with data labeling and data enhancement to obtain the target image dataset, the method broadens the breadth and distribution space of the data, thereby effectively ensuring the detection precision and generalization capability of the lightweight YOLOv5s network model.
2. During training, the MPDIoU loss function is adopted in place of the original CIoU loss function. MPDIoU contains all the relevant factors considered by existing loss functions, namely the overlapping or non-overlapping area, the center-point distance and the width-height deviation, while simplifying the calculation process; this enables the model to reach higher precision through training while accelerating its convergence.
3. Replacing the SE attention mechanism module in the MobileOne network with the CoordAttention attention mechanism module further reduces the computational complexity of the MobileOne network and helps it locate and identify the position of the target object in the input image more accurately. The computational cost of the lightweight convolution technique GSConv is about 60%-70% of that of standard convolution, so the amount of calculation can be effectively reduced while the output stays as close to that of standard convolution as possible. The lightweight YOLOv5s network model obtained in this way not only reduces the parameter count of the model and its hardware resource consumption, but also effectively ensures the detection precision and generalization capability of the network model.
4. By adopting the model distillation technique, knowledge is distilled from the trained YOLOv5s network model into the lightweight YOLOv5s network model, which ensures the detection precision of the target network model. The FGFI method applies fine-grained features to the model distillation process, helping the student model capture richer feature information; by introducing the supervision signal of the original object detection task during training, FGFI better preserves the detail features of the complex model, thereby improving the performance of the lightweight model.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the YOLOv5s-based lightweight vehicle-mounted model training method;
FIG. 2 is a schematic diagram of the structure of the CoordAttention attention mechanism module introduced in the present invention;
FIG. 3 is a schematic diagram of the structure of the MobileOne network introduced in the present invention;
FIG. 4 is a schematic diagram of the structure of the GSConv network introduced in the present invention;
FIG. 5 is a schematic diagram of the model distillation structure employing the fine-grained feature imitation method of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments.
Referring to FIGS. 1-5, a YOLOv5s-based lightweight vehicle-mounted model training method comprises at least the following steps:
S1: acquiring an original image dataset and preprocessing it to obtain a target image dataset, wherein the original image dataset refers to a large collection of images containing the targets to be detected, captured by an automobile camera, and the preprocessing comprises data labeling and data enhancement;
S2: training a YOLOv5s network model with the target image dataset to obtain a first network model;
S3: improving the YOLOv5s network based on the lightweight MobileOne network and the Slim-Neck structure to obtain a lightweight YOLOv5s network model;
S4: taking the first network model as a teacher model and the lightweight YOLOv5s network model as a student model, and performing knowledge distillation to obtain a target network model.
The data labeling tags each image in the dataset with the LabelImg tool, converts the annotations into a format that YOLOv5s can read for training, and finally divides the dataset into a training set and a validation set.
The data enhancement methods comprise the Mosaic algorithm and the Mixup algorithm.
The Mosaic algorithm splices 4 pictures into one large picture by random scaling, random cropping and random arrangement, thereby increasing the diversity of the training data.
The Mixup algorithm randomly selects two pictures and blends them in a certain proportion to generate a new image as enhanced data.
Data labeling is performed on the original image set to obtain a first image set, and data enhancement is performed on the first image set to obtain the target image dataset; this broadens the breadth and distribution space of the data, thereby effectively ensuring the detection precision and generalization capability of the lightweight YOLOv5s network model.
Step S2 comprises at least the following steps: after configuring the relevant training parameters according to the computing resources of the server, the YOLOv5s network model is trained with the training set of the target image dataset; after training is completed, the trained YOLOv5s network model is evaluated with the validation set of the target image dataset until the result meets a preset index, yielding the first network model.
During evaluation, the mainstream object-detection metrics in deep learning are used: mAP (mean average precision over classes), Precision, Recall and FLOPs (the number of floating-point operations performed by the model).
When the YOLOv5s network model is trained, the original CIoU loss function is replaced with the MPDIoU loss function. MPDIoU contains all the relevant factors considered by existing loss functions, namely the overlapping or non-overlapping area, the center-point distance and the width-height deviation, while simplifying the calculation process.
The MPDIoU loss function can be expressed as:
L_MPDIoU = 1 − IoU + d1² / (w² + h²) + d2² / (w² + h²)
where IoU denotes the conventional intersection over union, d1 denotes the distance between the top-left corner of the ground-truth box and that of the predicted box, d2 denotes the distance between the bottom-right corner of the ground-truth box and that of the predicted box, w denotes the width of the input image, and h denotes the height of the input image.
The MobileOne network in S3 is an efficient mobile-side backbone network recently released by Apple's research team.
The lightweight MobileOne network in S3 is the MobileOne network after its original SE attention mechanism module has been replaced with the CoordAttention attention mechanism module.
The Slim-Neck structure is a Neck structure in which the lightweight convolution technique GSConv is introduced to replace standard convolution.
CoordAttention is a lightweight attention mechanism module based on coordinate attention. Replacing the SE attention mechanism modules in the MobileOne network with CoordAttention modules further reduces the computational complexity of the MobileOne network and also helps it locate and identify the position of the target object in the input image more accurately.
The computational cost of the lightweight convolution technique GSConv is about 60%-70% of that of standard convolution, so the amount of calculation can be effectively reduced while the output stays as close to that of standard convolution as possible.
The lightweight YOLOv5s network model obtained in this way not only reduces the parameter count of the model and its hardware resource consumption, but also effectively ensures the detection precision and generalization capability of the network model.
Step S4 comprises at least the following steps:
the first network model obtained in step S2 after training on the target image dataset is used as the teacher model, the lightweight YOLOv5s network model obtained by the improvement in step S3 is used as the student model, the student model is fine-tuned with knowledge distillation, and the evaluation method of step S2 is finally applied to obtain a target network model that meets expectations.
During distillation, the FGFI (fine-grained feature imitation) method is adopted. Its core idea is that the teacher model should pass more of the key effective information to the student model, rather than ineffective background information.
In general, the feature maps near the key positions of the target areas contain important information from the teacher model, so the key positions near the target areas can be estimated first, and the student model can then be made to imitate the teacher model's feature maps at these positions to obtain better performance.
The specific operation of distillation is as follows:
For each ground-truth box, the IoU between the box and each candidate location is computed, yielding a W×H×K IoU map (W is the width of the feature map, H is the height of the feature map, and K is the number of key positions), denoted M. The maximum value m = max(M) is taken, and the threshold F is computed from m:
F = ψ · m
where ψ is a filtering factor (ψ = 0.5 in the original fine-grained feature imitation method).
Positions whose IoU value is below F are filtered out, and the remaining positions are combined by an OR operation to obtain a W×H mask.
After this operation has been performed for all ground-truth boxes, the per-box masks are combined to obtain the final FGFI mask, which contains the information of the key positions of the target areas.
The distance between the teacher-model and student-model feature maps is computed as:
d(i, j) = Σ_c ( f_adap(s)_{i,j,c} − t_{i,j,c} )²
where (i, j) denotes a spatial position, c denotes a channel, s and t are the network feature maps of the student model and the teacher model respectively, and f_adap is an adaptation function.
For all estimated key positions, the distillation objective is to minimize the distance between the student-model and teacher-model feature maps at these positions, i.e. to minimize:
L_imitation = (1 / (2·N_p)) · Σ_{i,j} Σ_c I_{i,j} · ( f_adap(s)_{i,j,c} − t_{i,j,c} )²
where I is the mask obtained by the above operations and N_p is the number of positive positions in the mask. The loss function of the final student model is:
L = L_MPDIoU + L_imitation
where L_MPDIoU is the MPDIoU loss function used when training the teacher model.
By adopting the model distillation technique, knowledge is distilled from the trained YOLOv5s network model into the lightweight YOLOv5s network model, which ensures the detection precision of the target network model. The FGFI method applies fine-grained features to the model distillation process, helping the student model capture richer feature information; by introducing the supervision signal of the original object detection task during training, FGFI better preserves the detail features of the complex model, thereby improving the performance of the lightweight model.
In summary, the working flow of the invention is as follows:
1. Based on the lightweight MobileOne network and the Slim-Neck structure, improve the YOLOv5s network to obtain a lightweight YOLOv5s network model;
2. Acquire an image dataset of the targets to be detected and preprocess it to obtain a target image dataset; train the YOLOv5s network model with the target image dataset, replacing the training loss function with the MPDIoU loss function, to obtain a first network model;
3. Take the first network model as the teacher model and the lightweight YOLOv5s network model as the student model, and perform knowledge distillation on the target image dataset with the fine-grained feature imitation method to obtain the target network model.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Claims (6)
1. A YOLOv5s-based lightweight vehicle-mounted model training method, characterized by comprising at least the following steps:
S1: acquiring an original image dataset and preprocessing it to obtain a target image dataset, wherein the original image dataset is a large collection of images containing the targets to be detected, captured by an automobile camera, and the preprocessing comprises data labeling and data enhancement;
S2: training a YOLOv5s network model with the target image dataset to obtain a first network model;
S3: improving the YOLOv5s network based on the lightweight MobileOne network and the Slim-Neck structure to obtain a lightweight YOLOv5s network model;
S4: taking the first network model as a teacher model and the lightweight YOLOv5s network model as a student model, and performing knowledge distillation to obtain a target network model.
2. The YOLOv5s-based lightweight vehicle-mounted model training method as claimed in claim 1, wherein: the data labeling tags each image in the dataset with the LabelImg tool, converts the annotations into a format that YOLOv5s can read for training, and finally divides the dataset into a training set and a validation set.
3. The YOLOv5s-based lightweight vehicle-mounted model training method as claimed in claim 2, wherein: the data enhancement methods comprise the Mosaic algorithm and the Mixup algorithm;
the Mosaic algorithm splices 4 pictures into one large picture by random scaling, random cropping and random arrangement, thereby increasing the diversity of the training data;
the Mixup algorithm randomly selects two pictures and blends them in a certain proportion to generate a new image as enhanced data;
data labeling is performed on the original image set to obtain a first image set, and data enhancement is performed on the first image set to obtain the target image dataset, which broadens the breadth and distribution space of the data, thereby effectively ensuring the detection precision and generalization capability of the lightweight YOLOv5s network model.
4. The YOLOv5s-based lightweight vehicle-mounted model training method as claimed in claim 1, wherein step S2 comprises at least the following steps: after configuring the relevant training parameters according to the computing resources of the server, the YOLOv5s network model is trained with the training set of the target image dataset; after training is completed, the trained YOLOv5s network model is evaluated with the validation set of the target image dataset until the result meets a preset index, yielding the first network model;
the evaluation uses the mainstream object-detection metrics in deep learning: mAP, Precision, Recall and FLOPs;
when the YOLOv5s network model is trained, the original CIoU loss function is replaced with the MPDIoU loss function; MPDIoU contains all the relevant factors considered by existing loss functions, namely the overlapping or non-overlapping area, the center-point distance and the width-height deviation, while simplifying the calculation process;
the MPDIoU loss function can be expressed as:
L_MPDIoU = 1 − IoU + d1² / (w² + h²) + d2² / (w² + h²)
where IoU denotes the conventional intersection over union, d1 denotes the distance between the top-left corner of the ground-truth box and that of the predicted box, d2 denotes the distance between the bottom-right corner of the ground-truth box and that of the predicted box, w denotes the width of the input image, and h denotes the height of the input image.
5. The YOLOv5s-based lightweight vehicle-mounted model training method as claimed in claim 1, wherein: the lightweight MobileOne network in S3 is the MobileOne network after its original SE attention mechanism module has been replaced with the CoordAttention attention mechanism module;
the Slim-Neck structure is a Neck structure in which the lightweight convolution technique GSConv is introduced to replace standard convolution.
6. The YOLOv s-based lightweight vehicle model training method as claimed in claim 1, wherein: the step S4 at least comprises the following steps:
Using a first network model obtained after training the target image dataset in the step S2 as a teacher model, using a lightweight YOLOv S network model obtained by improvement in the step S3 as a student model, performing fine tuning training on the student model by using knowledge distillation, and finally obtaining a target network model which meets expectations by adopting an evaluation method in the step S2;
the FGFI (fine-grained feature imitation) technique is adopted during distillation;
the specific operation of distillation is as follows:
For each ground-truth box, the IoU between the box and every anchor is calculated, yielding a W×H×K IoU map (W is the width of the feature map, H is its height, and K is the number of anchors per position), denoted m; the maximum value M = max(m) is taken, and the threshold F is computed from M;
positions whose IoU values are lower than F are filtered out, and the IoU maps of the remaining positions are combined by an OR operation into a W×H mask;
After this operation has been applied to all ground-truth boxes, the individual masks are merged to obtain the final FGFI mask, which encodes the key positions of the target regions;
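The mask construction above can be sketched in NumPy as follows. The claim does not give the formula relating F to M; a filter-ratio hyperparameter `psi` with F = psi · M (one common choice in fine-grained imitation methods) is assumed here for illustration:

```python
import numpy as np

def fgfi_mask(iou_maps, psi=0.5):
    """Build the W x H imitation mask from per-ground-truth-box IoU maps.

    iou_maps: list of (W, H, K) arrays, one per ground-truth box, giving the
    IoU of that box with the K anchors at each spatial position.
    psi: assumed filter ratio; the threshold is F = psi * max(m).
    """
    mask = None
    for m in iou_maps:
        f = psi * m.max()                 # threshold F from this box's maximum IoU
        keep = (m >= f).any(axis=2)       # OR over the K anchors -> (W, H) bool
        mask = keep if mask is None else (mask | keep)
    return mask
```

Each ground-truth box contributes the positions where at least one anchor overlaps it strongly, and the per-box masks are OR-merged into the final mask.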
The distance between the teacher-model and student-model feature maps is computed as:

d(i, j) = Σ_c ( f_adap(s)_{ijc} − t_{ijc} )²

where (i, j) denotes a spatial position, c a channel, s and t are the network feature maps of the student model and the teacher model respectively, and f_adap is the adaptation function that matches the student's channel dimension to the teacher's;
for all estimated key positions, the distillation objective is to minimize the distance between the student-model and teacher-model feature maps at these positions, i.e. to minimize:

L_imitation = 1/(2N_p) · Σ_{i,j} I_{ij} Σ_c ( f_adap(s)_{ijc} − t_{ijc} )²

where I is the mask obtained by the above operation and N_p is the number of positive positions in the mask; the loss function of the final student model is then:
L = L_MPDIoU + L_imitation
where L_MPDIoU is the MPDIoU loss function used in training the teacher model.
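The masked imitation term can be sketched in NumPy as below (an illustration of the masked squared-distance loss, not the patent's code; the 1×1-convolution adaptation layer f_adap is assumed to have already been applied to the student features so both maps share the channel dimension):

```python
import numpy as np

def imitation_loss(student_feat, teacher_feat, mask):
    """Fine-grained imitation loss over the masked key positions.

    student_feat, teacher_feat: (C, H, W) feature maps (student already
    channel-adapted); mask: (H, W) boolean FGFI mask.
    Returns 1/(2*N_p) * sum over masked positions of the squared
    channel-wise distance between the two feature maps.
    """
    n_p = int(mask.sum())                 # number of positive positions
    if n_p == 0:
        return 0.0
    diff_sq = (student_feat - teacher_feat) ** 2          # (C, H, W)
    return float((diff_sq * mask[None]).sum() / (2 * n_p))

# The student's total loss then adds this to the detection loss:
#   L = L_MPDIoU + imitation_loss(...)
```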
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311806698.1A CN118155012A (en) | 2023-12-26 | 2023-12-26 | YOLOv5 s-based lightweight vehicle model training method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118155012A true CN118155012A (en) | 2024-06-07 |
Family
ID=91292574
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311806698.1A Pending CN118155012A (en) | 2023-12-26 | 2023-12-26 | YOLOv5 s-based lightweight vehicle model training method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118155012A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination |