CN115205667A - Dense target detection method based on YOLOv5s - Google Patents

Dense target detection method based on YOLOv5s

Info

Publication number
CN115205667A
CN115205667A (application CN202210920891.7A)
Authority
CN
China
Prior art keywords
convolution
module
training
channel
fish
Prior art date
Legal status
Pending
Application number
CN202210920891.7A
Other languages
Chinese (zh)
Inventor
宋雪桦
顾寅武
张舜尧
王昌达
金华
袁昕
Current Assignee
Jiangsu University
Original Assignee
Jiangsu University
Priority date
Filing date
Publication date
Application filed by Jiangsu University
Priority to CN202210920891.7A
Publication of CN115205667A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/05 Underwater scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Quality & Reliability (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a dense target detection method based on YOLOv5s. A spatial attention mechanism and a channel attention mechanism are added to different branches of the CSP module; RepVGG Block modules are used in the Backbone to improve recognition accuracy for targets of different scales and to increase inference speed; an SA attention module is added to strengthen the feature-extraction capability of the algorithm; CARAFE upsampling is used in the Neck to obtain a larger receptive field; and a Varifocal Loss function is introduced so that training on dense target samples focuses more on high-quality positive samples. The method is trained on a fish dataset, and the trained model weights are used for detection, which effectively reduces the consumption of manpower and material resources, improves detection accuracy, and better meets the requirements of dense target detection tasks.

Description

Dense target detection method based on YOLOv5s
Technical Field
The invention relates to the field of computer-vision target detection, and in particular to a dense target detection method based on YOLOv5s.
Background
Visual target detection aims to locate and identify objects in images. It is one of the classic tasks in computer vision, a prerequisite and foundation for many other vision tasks, and it has important theoretical and practical value in fields such as autonomous driving, video surveillance, aquaculture, and smart agriculture. With the rapid development of deep learning, target detection has made great progress. Traditional manual inspection is inaccurate, inefficient, and consumes time and labor. Classical machine-learning approaches classify and identify targets with support vector machines, but their detection accuracy is low and they are prone to missed and false detections. In recent years, as dense targets appear in many application scenarios, detection based on computer vision and deep learning has gradually become mainstream: target detection and recognition algorithms extract target features automatically through convolutional neural networks and achieve higher detection speed and accuracy than conventional methods.
Disclosure of Invention
Aiming at the above problems, a dense target detection model based on YOLOv5s is provided. The model can better meet the requirements of dense target detection tasks.
In order to achieve this purpose, the technical scheme adopted by the invention is as follows: a dense target detection method based on YOLOv5s, comprising the following steps:
1) Place a detection device at the front end of a bait-casting boat to detect the number of fish in a school. The detection device comprises a camera device and an illuminating device; the camera device photographs the fish school for counting, and the illuminating device is kept on for underwater lighting;
2) Construct a fish dataset D2 and divide it into a training set D_train and a validation set D_test;
3) Construct the YOLOv5s network model, which comprises Input, Backbone, Neck, and Prediction. The Input stage includes Mosaic data augmentation, adaptive anchor-box computation, and adaptive image scaling; the Backbone includes the Focus, SPP, and C3 modules; the Neck includes the FPN, PAN, and C3 modules; the Prediction stage includes the bounding-box loss function and NMS;
4) Modify the backbone-network convolution modules, replacing them with RepVGG Block modules;
5) Modify the backbone-network structure, inserting an SA attention mechanism between the RepVGG module and the SPP module;
6) Modify the upsampling mode of the YOLOv5s neck network, changing nearest-neighbor upsampling to CARAFE upsampling;
7) Replace the Focal Loss function used for the class loss and confidence loss between target boxes and prediction boxes with the Varifocal Loss function;
8) Perform transfer training on the fish dataset D2 to obtain training weights w. GIoU_Loss is used as the loss function; training stops, yielding the weights w, when the model loss curve approaches 0 without obvious fluctuation; otherwise training continues;
9) Input images and detect fish schools: feed the captured fish-school images into the model with training weights w, and the model automatically identifies the number of fish according to the weights.
Further, the step 2) includes the following steps:
2.1) Select N public fish images to construct a dataset D1;
2.2) Label the fish in each image of dataset D1 with the labeling tool Labelimg to construct a fish dataset D2;
2.3) Divide the fish dataset D2 proportionally into a training set D_train and a validation set D_test.
Further, the step 4) includes the following steps:
4.1) Train the multi-branch model: during training, add a parallel 1×1 convolution branch and an identity-mapping branch to each 3×3 convolution layer;
4.2) Equivalently convert the multi-branch model into a single-path model: a 1×1 convolution can be regarded as a 3×3 convolution whose kernel is mostly zeros, and the identity mapping as a special 1×1 convolution; by the additivity of convolution, the three branches of each RepVGG Block module can be merged into a single 3×3 convolution;
4.3) Structural re-parameterization: transfer the weights of the multi-branch network into the single-path network through the actual data flow.
Further, the step 5) includes the following steps:
5.1) Feature grouping: suppose the input feature is X ∈ R^(C×H×W), where C, H, and W denote the number of channels, the height, and the width respectively. Feature grouping splits the input X into G groups along the channel dimension, so that each sub-feature gradually captures a specific semantic response during training;
5.2) A channel attention mechanism is used to capture channel correlation; the calculation is:
s = F_gp(X_k1) = (1 / (H × W)) · Σ_{i=1}^{H} Σ_{j=1}^{W} X_k1(i, j)
X'_k1 = σ(W_1 · s + b_1) · X_k1
where s denotes the channel statistics obtained by global average pooling, X_k1 is one branch of the split along the channel dimension, X'_k1 is the final output of the channel attention, σ is the sigmoid activation function, and W_1 and b_1 are parameters of shape C/2G × 1 × 1.
5.3) A spatial attention mechanism is used to capture spatial correlation; the calculation is:
X'_k2 = σ(W_2 · GN(X_k2) + b_2) · X_k2
where X_k2 is the other branch of the split along the channel dimension, X'_k2 is the final output of the spatial attention, W_2 and b_2 are parameters of shape C/2G × 1 × 1, and GN denotes group normalization;
5.4) Aggregation: after the channel attention and the spatial attention have been computed, the two outputs are fused by Concat: X'_k = [X'_k1, X'_k2] ∈ R^(C/G×H×W), and a channel-shuffle operation is used for inter-group communication.
Further, the step 6) includes the following steps:
6.1) Feature-map channel compression: suppose the upsampling ratio is σ. For an input feature map of shape C × H × W, where C, H, and W denote the number of channels, the height, and the width respectively, a 1 × 1 convolution compresses the number of channels to C_m;
6.2) Content encoding and upsampling-kernel prediction: for the compressed feature map from step 6.1), a convolution layer of kernel size k_encoder × k_encoder is used to predict the upsampling kernels. Suppose the reassembly kernel size is k_up × k_up; the prediction layer has C_m input channels and σ²·k_up² output channels. The channel dimension is then unfolded into the spatial dimension, giving an upsampling kernel of shape σH × σW × k_up²;
6.3) Upsampling-kernel normalization: each k_up × k_up kernel obtained in step 6.2) is normalized channel-wise with softmax so that its weights sum to 1. Each position in the output feature map is mapped back to the input feature map, the k_up × k_up region centered on it is extracted, and its dot product with the predicted upsampling kernel at that point gives the output value; different channels at the same position share the same upsampling kernel.
Further, in the step 7), the Varifocal Loss function is:
VFL(p, q) = −q · (q · log(p) + (1 − q) · log(1 − p)) if q > 0
VFL(p, q) = −α · p^γ · log(1 − p) if q = 0
where p is the predicted IACS (IoU-aware classification score) and q is the target IoU score: for positive samples, q is the IoU between the prediction box and the ground-truth box; for negative samples, q is 0. α and γ are the focal down-weighting parameters for negative samples.
Further, in the step 8), the GIoU_Loss function is:
GIoU_Loss = 1 − GIoU = 1 − (IoU − |A_c − U| / |A_c|), with IoU = I / U
where IoU denotes the intersection-over-union of the two overlapping rectangular boxes; I denotes the area of the overlap of the two rectangles; U denotes the sum of the areas of the two rectangles, A_p + A_g, minus their intersection area I; and A_c is the area of the smallest enclosing rectangle of the two.
The invention provides a dense target detection method based on YOLOv5s, adopting a detection model that integrates the RepVGG module, an attention mechanism, and the CARAFE upsampling module. The method effectively improves overall performance in dense-target image detection tasks, greatly improves detection accuracy, and is of great significance to the development of autonomous driving, video surveillance, and the aquaculture industry.
Drawings
FIG. 1 is a flow chart of the dense target detection method based on YOLOv5s according to the invention.
FIG. 2 is a diagram of the YOLOv5s network structure.
FIG. 3 is a structure diagram of the backbone-network RepVGG Block module.
FIG. 4 is a structure diagram of the SA attention mechanism.
Detailed Description
The present invention is further described below with reference to the drawings and a specific embodiment. It should be noted that the technical solution and design principle of the invention are described in detail with reference to a preferred embodiment only; the scope of the invention is not limited thereto, and any obvious improvement, substitution, or modification made by those skilled in the art without departing from the spirit of the invention remains within its scope.
The flow of the dense target detection method based on YOLOv5s provided by the invention is shown in FIG. 1; the method comprises the following steps:
1) Place a detection device at the front end of a bait-casting boat to detect the number of fish in a school. The detection device comprises a camera device and an illuminating device; the camera device photographs the fish school for counting, and the illuminating device is kept on for underwater lighting;
2) Construct a fish dataset D2 and divide it into a training set D_train and a validation set D_test;
3) Construct the YOLOv5s network model; its structure is shown in FIG. 2. The model comprises Input, Backbone, Neck, and Prediction. The Input stage includes Mosaic data augmentation, adaptive anchor-box computation, and adaptive image scaling; the Backbone includes the Focus, SPP, and C3 modules; the Neck includes the FPN, PAN, and C3 modules; the Prediction stage includes the bounding-box loss function and NMS. The backbone C3 module is divided into two branches: one branch passes through a stack of Bottleneck modules and 3 standard convolution layers, the other passes through a single basic convolution module, and the two branches are finally concatenated (a minimal sketch of this module is given after this list);
4) Modify the backbone-network convolution modules, replacing them with RepVGG Block modules;
5) Modify the backbone-network structure, inserting an SA attention mechanism between the RepVGG module and the SPP module;
6) Modify the upsampling mode of the YOLOv5s neck network, changing nearest-neighbor upsampling to CARAFE upsampling;
7) Replace the Focal Loss function used for the class loss and confidence loss between target boxes and prediction boxes with the Varifocal Loss function;
8) Perform transfer training on the fish dataset D2 to obtain training weights w. GIoU_Loss is used as the loss function; training stops, yielding the weights w, when the model loss curve approaches 0 without obvious fluctuation; otherwise training continues;
9) Input images and detect fish: feed the captured fish images into the model with training weights w, and the model automatically identifies the number of fish according to the weights.
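To make step 3) concrete, the following is a minimal PyTorch sketch of the C3 module. PyTorch itself, the hidden-channel split c2 // 2, and the SiLU activation are assumptions drawn from the public YOLOv5 code, not details fixed by the patent.

```python
import torch
import torch.nn as nn

class Conv(nn.Module):
    """Basic YOLOv5-style convolution block: Conv2d + BN + SiLU."""
    def __init__(self, c1, c2, k=1):
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, padding=k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class Bottleneck(nn.Module):
    """1x1 then 3x3 convolution with a residual shortcut."""
    def __init__(self, c):
        super().__init__()
        self.cv1 = Conv(c, c, 1)
        self.cv2 = Conv(c, c, 3)

    def forward(self, x):
        return x + self.cv2(self.cv1(x))

class C3(nn.Module):
    """One branch stacks n Bottlenecks; the other is a single basic
    convolution; the two branches are concatenated and fused."""
    def __init__(self, c1, c2, n=1):
        super().__init__()
        c_ = c2 // 2
        self.cv1 = Conv(c1, c_, 1)
        self.cv2 = Conv(c1, c_, 1)
        self.cv3 = Conv(2 * c_, c2, 1)
        self.m = nn.Sequential(*(Bottleneck(c_) for _ in range(n)))

    def forward(self, x):
        return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))
```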
As a preferred embodiment of the present invention, the step 2) comprises the steps of:
2.1) Select N public fish images to construct a dataset D1;
2.2) Label the fish in each image of dataset D1 with the labeling tool Labelimg to construct a fish dataset D2;
2.3) Divide the fish dataset D2 proportionally into a training set D_train and a validation set D_test.
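As an illustration of step 2.3), a minimal sketch of a proportional split follows; the 8:2 ratio, the .jpg extension, and the flat directory layout are assumptions, since the patent does not fix the proportion.

```python
import random
from pathlib import Path

def split_dataset(image_dir, train_ratio=0.8, seed=0):
    """Split the labelled fish dataset D2 into D_train and D_test."""
    images = sorted(Path(image_dir).glob("*.jpg"))
    random.Random(seed).shuffle(images)  # deterministic shuffle
    n_train = int(len(images) * train_ratio)
    return images[:n_train], images[n_train:]
```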
As a preferred embodiment of the present invention, the RepVGG Block convolution structure is shown in FIG. 3, and the step 4) includes the following steps:
4.1) Train the multi-branch model. During training, a parallel 1×1 convolution branch and an identity-mapping branch are added to each 3×3 convolution layer.
4.2) Equivalently convert the multi-branch model into a single-path model. A 1×1 convolution can be regarded as a 3×3 convolution whose kernel is mostly zeros, and the identity mapping as a special 1×1 convolution. By the additivity of convolution, the three branches of each RepVGG Block module can be merged into a single 3×3 convolution.
4.3) Structural re-parameterization. Transfer the weights of the multi-branch network into the single-path network through the actual data flow.
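The branch merging of step 4.2) can be checked numerically. The sketch below assumes stride-1 convolutions with bias and omits the batch-norm folding that a full RepVGG re-parameterization also performs:

```python
import torch
import torch.nn.functional as F

def fuse_repvgg_branches(w3, b3, w1, b1, channels):
    """Merge 3x3, 1x1 and identity branches into one 3x3 convolution."""
    # A 1x1 kernel is a 3x3 kernel that is zero except at the centre.
    w1_as_3x3 = F.pad(w1, [1, 1, 1, 1])
    # The identity mapping is a special 1x1 convolution (a unit kernel).
    w_id = torch.eye(channels).reshape(channels, channels, 1, 1)
    w_id_as_3x3 = F.pad(w_id, [1, 1, 1, 1])
    # Additivity of convolution: summing aligned kernels merges branches.
    return w3 + w1_as_3x3 + w_id_as_3x3, b3 + b1

c = 8
w3, b3 = torch.randn(c, c, 3, 3), torch.randn(c)
w1, b1 = torch.randn(c, c, 1, 1), torch.randn(c)
w, b = fuse_repvgg_branches(w3, b3, w1, b1, c)

x = torch.randn(1, c, 16, 16)
multi = F.conv2d(x, w3, b3, padding=1) + F.conv2d(x, w1, b1) + x
single = F.conv2d(x, w, b, padding=1)
assert torch.allclose(multi, single, atol=1e-5)
```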
As a preferred embodiment of the present invention, the SA module structure is shown in fig. 4, and the step 5) includes the following steps:
5.1) Feature grouping: suppose the input feature is X ∈ R^(C×H×W), where C, H, and W denote the number of channels, the height, and the width respectively. Feature grouping splits the input X into G groups along the channel dimension, so that each sub-feature gradually captures a specific semantic response during training;
5.2) A channel attention mechanism is used. Channel correlation is captured by:
s = F_gp(X_k1) = (1 / (H × W)) · Σ_{i=1}^{H} Σ_{j=1}^{W} X_k1(i, j)
X'_k1 = σ(W_1 · s + b_1) · X_k1
where s denotes the channel statistics obtained by global average pooling, X_k1 is one branch of the split along the channel dimension, X'_k1 is the final output of the channel attention, σ is the sigmoid activation function, and W_1 and b_1 are parameters of shape C/2G × 1 × 1.
5.3) A spatial attention mechanism is used. Spatial correlation is captured by:
X'_k2 = σ(W_2 · GN(X_k2) + b_2) · X_k2
where X_k2 is the other branch of the split along the channel dimension, X'_k2 is the final output of the spatial attention, W_2 and b_2 are parameters of shape C/2G × 1 × 1, and GN denotes group normalization.
5.4) Aggregation. After the two attention computations are complete, their outputs are first fused by a simple Concat: X'_k = [X'_k1, X'_k2] ∈ R^(C/G×H×W). Finally, a channel-shuffle operation is used for inter-group communication.
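A compact PyTorch sketch of steps 5.1) to 5.4) follows; the group count, the parameter initialization, and the final shuffle over 2 groups follow the public SA-Net implementation and are assumptions here:

```python
import torch
import torch.nn as nn

class ShuffleAttention(nn.Module):
    """SA module: per-group channel + spatial attention, then channel
    shuffle. Assumes channels is divisible by 4 * groups."""

    def __init__(self, channels, groups=8):
        super().__init__()
        self.groups = groups
        c = channels // (2 * groups)  # channels per branch, C/2G
        self.cweight = nn.Parameter(torch.zeros(1, c, 1, 1))  # W_1
        self.cbias = nn.Parameter(torch.ones(1, c, 1, 1))     # b_1
        self.sweight = nn.Parameter(torch.zeros(1, c, 1, 1))  # W_2
        self.sbias = nn.Parameter(torch.ones(1, c, 1, 1))     # b_2
        self.gn = nn.GroupNorm(c, c)
        self.sigmoid = nn.Sigmoid()

    @staticmethod
    def channel_shuffle(x, groups):
        b, c, h, w = x.shape
        x = x.reshape(b, groups, c // groups, h, w).transpose(1, 2)
        return x.reshape(b, c, h, w)

    def forward(self, x):
        b, c, h, w = x.shape
        g = x.reshape(b * self.groups, c // self.groups, h, w)
        x_c, x_s = g.chunk(2, dim=1)  # branches X_k1, X_k2
        s = x_c.mean(dim=(2, 3), keepdim=True)  # global average pooling
        x_c = x_c * self.sigmoid(self.cweight * s + self.cbias)            # channel attention
        x_s = x_s * self.sigmoid(self.sweight * self.gn(x_s) + self.sbias)  # spatial attention
        out = torch.cat([x_c, x_s], dim=1).reshape(b, c, h, w)  # X'_k = [X'_k1, X'_k2]
        return self.channel_shuffle(out, 2)  # inter-group communication
```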
As a preferred embodiment of the present invention, the step 6) above includes the steps of:
6.1) Feature-map channel compression: suppose the upsampling ratio is σ. For an input feature map of shape C × H × W, where C, H, and W denote the number of channels, the height, and the width respectively, a 1 × 1 convolution compresses the number of channels to C_m, reducing the computation of the subsequent steps.
6.2) Content encoding and upsampling-kernel prediction: for the compressed feature map from step 6.1), a convolution layer of kernel size k_encoder × k_encoder is used to predict the upsampling kernels. Suppose the reassembly kernel size is k_up × k_up; the prediction layer has C_m input channels and σ²·k_up² output channels. The channel dimension is then unfolded into the spatial dimension, giving an upsampling kernel of shape σH × σW × k_up².
6.3) Upsampling-kernel normalization: each k_up × k_up kernel obtained in step 6.2) is normalized channel-wise with softmax so that its weights sum to 1. Each position in the output feature map is mapped back to the input feature map, the k_up × k_up region centered on it is extracted, and its dot product with the predicted upsampling kernel at that point gives the output value. Different channels at the same position share the same upsampling kernel.
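The three CARAFE steps above can be sketched as follows; the choices of c_mid and k_encoder, and the pixel_shuffle/unfold-based reassembly, are implementation assumptions rather than details prescribed by the patent:

```python
import torch.nn as nn
import torch.nn.functional as F

class CARAFE(nn.Module):
    """Sketch of CARAFE: compress channels, predict per-position
    reassembly kernels, softmax-normalise them, then reassemble."""

    def __init__(self, c, scale=2, k_up=5, c_mid=64, k_encoder=3):
        super().__init__()
        self.scale, self.k_up = scale, k_up
        self.compress = nn.Conv2d(c, c_mid, 1)  # step 6.1: C -> C_m
        self.encoder = nn.Conv2d(c_mid, (scale * k_up) ** 2,
                                 k_encoder, padding=k_encoder // 2)  # step 6.2

    def forward(self, x):
        b, c, h, w = x.shape
        kernels = self.encoder(self.compress(x))        # B, s^2*k^2, H, W
        kernels = F.pixel_shuffle(kernels, self.scale)  # B, k^2, sH, sW
        kernels = F.softmax(kernels, dim=1)             # step 6.3: weights sum to 1
        # Gather each k_up x k_up neighbourhood of the input feature map.
        feats = F.unfold(x, self.k_up, padding=self.k_up // 2)  # B, c*k^2, H*W
        feats = feats.reshape(b, c * self.k_up ** 2, h, w)
        feats = F.interpolate(feats, scale_factor=self.scale, mode="nearest")
        feats = feats.reshape(b, c, self.k_up ** 2, self.scale * h, self.scale * w)
        # Dot product between each neighbourhood and its predicted kernel;
        # the same kernel is shared by all channels at a position.
        return (feats * kernels.unsqueeze(1)).sum(dim=2)
```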
As a preferred embodiment of the present invention, the Varifocal Loss function in step 7) is:
VFL(p, q) = −q · (q · log(p) + (1 − q) · log(1 − p)) if q > 0
VFL(p, q) = −α · p^γ · log(1 − p) if q = 0
where p is the predicted IACS (IoU-aware classification score) and q is the target IoU score: for positive samples, q is the IoU between the prediction box and the ground-truth box; for negative samples, q is 0. α and γ are the focal down-weighting parameters for negative samples.
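A direct sketch of this formula; the defaults α = 0.75 and γ = 2.0 follow the VarifocalNet paper and are assumptions here:

```python
import torch
import torch.nn.functional as F

def varifocal_loss(logits, q, alpha=0.75, gamma=2.0):
    """q: target IoU score (IoU with the gt box for positives, 0 for
    negatives). Positives are weighted by q itself; negatives are
    down-weighted by alpha * p^gamma."""
    p = logits.sigmoid()
    bce = F.binary_cross_entropy_with_logits(logits, q, reduction="none")
    pos = (q > 0).float()
    weight = pos * q + (1 - pos) * alpha * p.pow(gamma)
    return (weight * bce).sum()
```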
As a preferred embodiment of the present invention, the GIoU_Loss function in step 8) is:
GIoU_Loss = 1 − GIoU = 1 − (IoU − |A_c − U| / |A_c|), with IoU = I / U
where IoU denotes the intersection-over-union of the two overlapping rectangular boxes; I denotes the area of the overlap of the two rectangles; U denotes the sum of the areas of the two rectangles, A_p + A_g, minus their intersection area I; and A_c is the area of the smallest enclosing rectangle of the two.
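The GIoU_Loss formula translates directly into code; the (x1, y1, x2, y2) box format is an assumption:

```python
import torch

def giou_loss(box_p, box_g):
    """GIoU_Loss = 1 - GIoU, GIoU = IoU - |A_c - U| / |A_c|."""
    # Intersection I of the two rectangles.
    x1 = torch.max(box_p[..., 0], box_g[..., 0])
    y1 = torch.max(box_p[..., 1], box_g[..., 1])
    x2 = torch.min(box_p[..., 2], box_g[..., 2])
    y2 = torch.min(box_p[..., 3], box_g[..., 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    # Union U = A_p + A_g - I.
    area_p = (box_p[..., 2] - box_p[..., 0]) * (box_p[..., 3] - box_p[..., 1])
    area_g = (box_g[..., 2] - box_g[..., 0]) * (box_g[..., 3] - box_g[..., 1])
    union = (area_p + area_g - inter).clamp(min=1e-9)
    iou = inter / union
    # A_c: area of the smallest enclosing rectangle.
    cw = torch.max(box_p[..., 2], box_g[..., 2]) - torch.min(box_p[..., 0], box_g[..., 0])
    ch = torch.max(box_p[..., 3], box_g[..., 3]) - torch.min(box_p[..., 1], box_g[..., 1])
    area_c = (cw * ch).clamp(min=1e-9)
    giou = iou - (area_c - union) / area_c
    return 1 - giou
```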

Claims (7)

1. A dense target detection method based on YOLOv5s, characterized by comprising the following steps:
1) Place a detection device at the front end of a bait-casting boat to detect the number of fish in a school. The detection device comprises a camera device and an illuminating device; the camera device photographs the fish school for counting, and the illuminating device is kept on for underwater lighting;
2) Construct a fish dataset D2 and divide it into a training set D_train and a validation set D_test;
3) Construct the YOLOv5s network model, which comprises Input, Backbone, Neck, and Prediction. The Input stage includes Mosaic data augmentation, adaptive anchor-box computation, and adaptive image scaling; the Backbone includes the Focus, SPP, and C3 modules; the Neck includes the FPN, PAN, and C3 modules; the Prediction stage includes the bounding-box loss function and NMS;
4) Modify the backbone-network convolution modules, replacing them with RepVGG Block modules;
5) Modify the backbone-network structure, inserting an SA attention mechanism between the RepVGG module and the SPP module;
6) Modify the upsampling mode of the YOLOv5s neck network, changing nearest-neighbor upsampling to CARAFE upsampling;
7) Replace the Focal Loss function used for the class loss and confidence loss between target boxes and prediction boxes with the Varifocal Loss function;
8) Perform transfer training on the fish dataset D2 to obtain training weights w. GIoU_Loss is used as the loss function; training stops, yielding the weights w, when the model loss curve approaches 0 without obvious fluctuation; otherwise training continues;
9) Input images and detect fish schools: feed the captured fish-school images into the model with training weights w, and the model automatically identifies the number of fish according to the weights.
2. The YOLOv5s-based dense target detection method of claim 1, wherein step 2) comprises the following steps:
2.1) Select N public fish images to construct a dataset D1;
2.2) Label the fish in each image of dataset D1 with the labeling tool Labelimg to construct a fish dataset D2;
2.3) Divide the fish dataset D2 proportionally into a training set D_train and a validation set D_test.
3. The YOLOv5s-based dense target detection method of claim 1, wherein step 4) comprises the following steps:
4.1) Train the multi-branch model: during training, add a parallel 1×1 convolution branch and an identity-mapping branch to each 3×3 convolution layer;
4.2) Equivalently convert the multi-branch model into a single-path model: a 1×1 convolution can be regarded as a 3×3 convolution whose kernel is mostly zeros, and the identity mapping as a special 1×1 convolution; by the additivity of convolution, the three branches of each RepVGG Block module can be merged into a single 3×3 convolution;
4.3) Structural re-parameterization: transfer the weights of the multi-branch network into the single-path network through the actual data flow.
4. The YOLOv5s-based dense target detection method of claim 1, wherein step 5) comprises the following steps:
5.1) Feature grouping: suppose the input feature is X ∈ R^(C×H×W), where C, H, and W denote the number of channels, the height, and the width respectively; feature grouping splits the input X into G groups along the channel dimension, so that each sub-feature gradually captures a specific semantic response during training;
5.2) A channel attention mechanism is used to capture channel correlation; the calculation is:
s = F_gp(X_k1) = (1 / (H × W)) · Σ_{i=1}^{H} Σ_{j=1}^{W} X_k1(i, j)
X'_k1 = σ(W_1 · s + b_1) · X_k1
where s denotes the channel statistics obtained by global average pooling, X_k1 is one branch of the split along the channel dimension, X'_k1 is the final output of the channel attention, σ is the sigmoid activation function, and W_1 and b_1 are parameters of shape C/2G × 1 × 1;
5.3) A spatial attention mechanism is used to capture spatial correlation; the calculation is:
X'_k2 = σ(W_2 · GN(X_k2) + b_2) · X_k2
where X_k2 is the other branch of the split along the channel dimension, X'_k2 is the final output of the spatial attention, W_2 and b_2 are parameters of shape C/2G × 1 × 1, and GN denotes group normalization;
5.4) Aggregation: after the channel attention and the spatial attention have been computed, the two outputs are fused by Concat: X'_k = [X'_k1, X'_k2] ∈ R^(C/G×H×W), and a channel-shuffle operation is used for inter-group communication.
5. The YOLOv5s-based dense target detection method of claim 1, wherein step 6) comprises the following steps:
6.1) Feature-map channel compression: suppose the upsampling ratio is σ. For an input feature map of shape C × H × W, where C, H, and W denote the number of channels, the height, and the width respectively, a 1 × 1 convolution compresses the number of channels to C_m;
6.2) Content encoding and upsampling-kernel prediction: for the compressed feature map from step 6.1), a convolution layer of kernel size k_encoder × k_encoder is used to predict the upsampling kernels. Suppose the reassembly kernel size is k_up × k_up; the prediction layer has C_m input channels and σ²·k_up² output channels. The channel dimension is then unfolded into the spatial dimension, giving an upsampling kernel of shape σH × σW × k_up²;
6.3) Upsampling-kernel normalization: each k_up × k_up kernel obtained in step 6.2) is normalized channel-wise with softmax so that its weights sum to 1. Each position in the output feature map is mapped back to the input feature map, the k_up × k_up region centered on it is extracted, and its dot product with the predicted upsampling kernel at that point gives the output value; different channels at the same position share the same upsampling kernel.
6. The YOLOv5s-based dense target detection method of claim 1, wherein in step 7) the Varifocal Loss function is:
VFL(p, q) = −q · (q · log(p) + (1 − q) · log(1 − p)) if q > 0
VFL(p, q) = −α · p^γ · log(1 − p) if q = 0
where p is the predicted IACS (IoU-aware classification score) and q is the target IoU score: for positive samples, q is the IoU between the prediction box and the ground-truth box; for negative samples, q is 0.
7. The YOLOv5s-based dense target detection method of claim 1, wherein in step 8) the GIoU_Loss function is:
GIoU_Loss = 1 − GIoU = 1 − (IoU − |A_c − U| / |A_c|), with IoU = I / U
where IoU denotes the intersection-over-union of the two overlapping rectangular boxes; I denotes the area of the overlap of the two rectangles; U denotes the sum of the areas of the two rectangles, A_p + A_g, minus their intersection area I; and A_c is the area of the smallest enclosing rectangle of the two.
CN202210920891.7A 2022-08-02 2022-08-02 Dense target detection method based on YOLOv5s Pending CN115205667A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210920891.7A CN115205667A (en) 2022-08-02 2022-08-02 Dense target detection method based on YOLOv5s

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210920891.7A CN115205667A (en) 2022-08-02 2022-08-02 Dense target detection method based on YOLOv5s

Publications (1)

Publication Number Publication Date
CN115205667A true CN115205667A (en) 2022-10-18

Family

ID=83586088

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210920891.7A Pending CN115205667A (en) 2022-08-02 2022-08-02 Dense target detection method based on YOLOv5s

Country Status (1)

Country Link
CN (1) CN115205667A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116343045A (en) * 2023-03-30 2023-06-27 南京理工大学 Lightweight SAR image ship target detection method based on YOLO v5
CN116343045B (en) * 2023-03-30 2024-03-19 南京理工大学 Lightweight SAR image ship target detection method based on YOLO v5
CN116958907A (en) * 2023-09-18 2023-10-27 四川泓宝润业工程技术有限公司 Method and system for inspecting surrounding hidden danger targets of gas pipeline
CN116958907B (en) * 2023-09-18 2023-12-26 四川泓宝润业工程技术有限公司 Method and system for inspecting surrounding hidden danger targets of gas pipeline
CN117274192A (en) * 2023-09-20 2023-12-22 重庆市荣冠科技有限公司 Pipeline magnetic flux leakage defect detection method based on improved YOLOv5
CN117132767A (en) * 2023-10-23 2023-11-28 中国铁塔股份有限公司湖北省分公司 Small target detection method, device, equipment and readable storage medium
CN117132767B (en) * 2023-10-23 2024-03-19 中国铁塔股份有限公司湖北省分公司 Small target detection method, device, equipment and readable storage medium
CN117496475A (en) * 2023-12-29 2024-02-02 武汉科技大学 Target detection method and system applied to automatic driving
CN117496475B (en) * 2023-12-29 2024-04-02 武汉科技大学 Target detection method and system applied to automatic driving

Similar Documents

Publication Publication Date Title
CN115205667A (en) Dense target detection method based on YOLOv5s
CN108805070A (en) A kind of deep learning pedestrian detection method based on built-in terminal
CN112884064B (en) Target detection and identification method based on neural network
CN107358257B (en) Under a kind of big data scene can incremental learning image classification training method
CN109949316A (en) A kind of Weakly supervised example dividing method of grid equipment image based on RGB-T fusion
CN114220035A (en) Rapid pest detection method based on improved YOLO V4
CN108154102A (en) A kind of traffic sign recognition method
CN112633277A (en) Channel ship board detection, positioning and identification method based on deep learning
CN113420643B (en) Lightweight underwater target detection method based on depth separable cavity convolution
CN110647802A (en) Remote sensing image ship target detection method based on deep learning
CN109784278A (en) The small and weak moving ship real-time detection method in sea based on deep learning
CN111507275B (en) Video data time sequence information extraction method and device based on deep learning
CN109948696A (en) A kind of multilingual scene character recognition method and system
CN113128335B (en) Method, system and application for detecting, classifying and finding micro-living ancient fossil image
CN114022408A (en) Remote sensing image cloud detection method based on multi-scale convolution neural network
CN113743505A (en) Improved SSD target detection method based on self-attention and feature fusion
CN109903339A (en) A kind of video group personage's position finding and detection method based on multidimensional fusion feature
CN116168240A (en) Arbitrary-direction dense ship target detection method based on attention enhancement
CN115631407A (en) Underwater transparent biological detection based on event camera and color frame image fusion
CN115393635A (en) Infrared small target detection method based on super-pixel segmentation and data enhancement
CN113421222B (en) Lightweight coal gangue target detection method
CN114596480A (en) Yoov 5 optimization-based benthic organism target detection method and system
CN112990066B (en) Remote sensing image solid waste identification method and system based on multi-strategy enhancement
Li et al. Object detection for uav images based on improved yolov6
CN113361496A (en) City built-up area statistical method based on U-Net

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination