CN114140665A - Dense small target detection method based on improved YOLOv5 - Google Patents

Dense small target detection method based on improved YOLOv5

Info

Publication number
CN114140665A
CN114140665A
Authority
CN
China
Prior art keywords
image
training
yolov5
network
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111474306.7A
Other languages
Chinese (zh)
Inventor
陆声链
刘晓宇
李帼
陈明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi Normal University
Original Assignee
Guangxi Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangxi Normal University filed Critical Guangxi Normal University
Priority to CN202111474306.7A
Publication of CN114140665A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a dense small target detection method based on an improved YOLOv5 algorithm, which further improves the YOLOv5 algorithm. The ideas are: (1) a coordinate attention (CA) mechanism is added to the YOLOv5 backbone feature-extraction network, embedding positional information into channel attention so that the mobile network obtains information over a larger region without incurring large overhead; (2) in the feature-fusion network of YOLOv5, BiFPN replaces PANet, introducing learnable weights to better balance feature information at different scales; (3) for dense, mutually occluded small targets, the network is trained with Varifocal Loss, so that the network model can accurately identify targets that cluster and overlap over large areas. The method is robust to variation in object color, complex natural environmental conditions, and the like.

Description

Dense small target detection method based on improved YOLOv5
Technical Field
The invention relates to the technical field of target detection, in particular to a dense small target detection method based on improved YOLOv5.
Background
Target detection is a research hotspot in machine vision and artificial intelligence, and a core technology for applications such as face recognition, object classification and automatic sorting. Many researchers have studied target detection and proposed solutions. Early methods mainly extracted hand-crafted image features, including color, texture, shape and spatial-relationship features. Color-histogram matching methods, such as histogram intersection and reference color tables, cannot extract local image features well, because color conveys neither the orientation nor the scale of image content. Common texture-based methods include the gray-level co-occurrence matrix and the semi-variogram, with random-field and fractal models as common models; since texture is a regional concept, these methods tend to over-regionalize and ignore global features. Shape-based methods, such as boundary-feature and geometric-parameter methods, perform poorly on deformable targets. Some researchers have proposed automatic fruit identification based on machine vision: images are acquired according to machine-vision principles; after smoothing, sharpening and other preprocessing, color sample values of the fruit are computed in RGB color space, the image is segmented according to these sample values, and features are finally extracted from the segmentation result. The main problem with such traditional feature-based methods is limited extensibility: different targets often require different hand-crafted features.
In recent years, machine learning, particularly deep learning, has brought breakthrough changes to the field of computer vision. Researchers have proposed convolutional-neural-network-based fruit identification methods. These generally proceed as follows: RGB images of fruit are acquired, preprocessed and labeled to build a dataset; a convolutional neural network is constructed and its parameters are set; the training set is fed into the network for training; and a fruit recognition model is finally obtained. Owing to its strong applicability, deep learning has been popularized and applied to many target-detection tasks in recent years.
In general, the deep-learning target detection methods in wide use today achieve good results on targets that are large in area and volume and not severely occluded. However, accurate automatic detection of small, dense, severely occluded objects, such as leaves, fruits and flowers on trees, or wild animals photographed from high altitude, remains challenging. Target detection under such outdoor natural conditions must also overcome environmental factors such as light, rain and fog. Although some convolutional-neural-network-based methods focus on small-target detection, they have two drawbacks: first, when facing large numbers of dense, overlapping targets, they cannot identify them accurately and the misidentification rate is high; second, they emphasize accuracy on small targets while ignoring the model size and detection speed of the convolutional neural network, so the resulting detection models are difficult to deploy and use on mobile devices.
In real application environments, the characteristics and scenarios of specific targets must also be considered. For example, when detecting fruit on orchard trees, traits such as individual size and color vary over the growth cycle; even fruits of the same variety differ in appearance, pose and degree of occlusion, and fruits of different varieties differ further. Moreover, during growth, complex environmental factors such as light intensity, fertilization and irrigation affect the identification of fruit on the tree. A target detection algorithm therefore needs to accommodate variation of the target object in size, color and ambient conditions.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a dense small target detection method based on an improved YOLOv5 algorithm, which further improves the YOLOv5 algorithm. The ideas are: (1) a coordinate attention (CA) mechanism is added to the YOLOv5 backbone feature-extraction network, embedding positional information into channel attention so that the mobile network obtains information over a larger region without incurring large overhead; (2) in the feature-fusion network of YOLOv5, BiFPN replaces PANet, introducing learnable weights to better balance feature information at different scales; (3) for dense, mutually occluded small targets, the network is trained with Varifocal Loss, so that the network model can accurately identify targets that cluster and overlap over large areas. The method is robust to variation in object color, complex natural environmental conditions, and the like.
The technical scheme for realizing the purpose of the invention is as follows:
a dense small target detection method based on an improved YOLOv5 algorithm comprises the following steps:
S1, image acquisition: the user collects images of the target object with image acquisition equipment, names the collected images according to the Pascal VOC dataset format, and creates three folders named Annotations, ImageSets and JPEGImages;
s2, image preprocessing:
S2-1, image labeling: in the images collected in step S1, label the targets with the image annotation tool LabelImg, marking the position and the category name of each target;
S2-2, image augmentation: if the images collected by the user in step S1 cannot meet the requirement of 2000 images per target category for recognition, augment the images with the Augmentor image data enhancement library: the user selects the storage path of the images and the path of the annotation XML files, specifies output paths for the augmented images and XML files, selects the required image enhancers (e.g., enhancers for brightness, cropping and Gaussian noise), and chooses the augmentation quantity and mode (sequential, combined, random, etc.) to augment the images until the recognition requirement is met, as sketched below;
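By way of illustration, a minimal Augmentor sketch is given below; it is a sketch under assumptions, not the exact pipeline of the invention: the folder name and the 2000-image target are taken from this description, the enhancers and probabilities are illustrative, and since Augmentor's standard pipeline transforms images only, the Pascal VOC XML annotations would have to be updated separately (Augmentor also has no built-in Gaussian-noise operation, so that enhancer would need a custom operation).

```python
# Minimal augmentation sketch using the Augmentor library.
# Assumptions: the folder layout (JPEGImages), the sample count (2000) and the
# specific enhancers/probabilities are illustrative, not fixed by the method.
import Augmentor

p = Augmentor.Pipeline("JPEGImages", output_directory="augmented")
p.random_brightness(probability=0.5, min_factor=0.7, max_factor=1.3)  # brightness enhancer
p.crop_random(probability=0.3, percentage_area=0.9)                   # cropping enhancer
p.rotate(probability=0.5, max_left_rotation=10, max_right_rotation=10)
p.sample(2000)  # generate samples toward the ~2000-image requirement
```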
S2-3, dataset division: divide the augmented images and annotation files into a training set, a test set, a validation set and a train-val set; the training, test and validation sets account for 50%, 25% and 25% of the data respectively, and the train-val set is the union of the training and validation sets, i.e. 75% of the total (a split sketch follows);
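A minimal split sketch in the Pascal VOC convention is shown below; the ImageSets/Main location and the list-file names are assumptions following the standard VOC layout, since the method only fixes the 50/25/25 proportions.

```python
# Sketch: split image IDs 50/25/25 into train/test/val and write
# Pascal VOC-style list files under ImageSets/Main (assumed layout).
import os
import random

ids = [f[:-4] for f in os.listdir("JPEGImages") if f.endswith(".jpg")]
random.shuffle(ids)
n = len(ids)
train, test, val = ids[: n // 2], ids[n // 2 : 3 * n // 4], ids[3 * n // 4 :]
trainval = train + val  # train-val set: union of training and validation (75%)

os.makedirs("ImageSets/Main", exist_ok=True)
for name, subset in [("train", train), ("test", test),
                     ("val", val), ("trainval", trainval)]:
    with open(os.path.join("ImageSets/Main", name + ".txt"), "w") as fh:
        fh.write("\n".join(subset))
```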
S3, setting network model parameters: in the yaml configuration file of the YOLOv5 network model, set the input image size of the convolutional neural network, the number of recognition classes and the number of iterations according to the computer's RAM and GPU memory and the recognition quality and training speed required by the user; the user needs a graphics card model that supports CUDA acceleration;
S3-1, when the selected input image size is 608 × 608 (independent of the original image size), the batch parameter is 8, the number of epochs is 300 and there are 2 classes of detected objects, the user trains the model on a single GPU with at least 6 GB of GPU memory;
S3-2, when the selected input image size is 640 × 640 (independent of the original image size), the batch parameter is 8, the number of epochs is 300 and there are 2 classes of detected objects, the user trains the model on a single GPU with at least 8 GB of GPU memory;
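For reference, a hedged configuration sketch is given below; the file layout and command-line flags follow the public ultralytics/yolov5 repository (flag names vary slightly across releases), and the paths and class names are placeholders rather than values fixed by this method.

```python
# Sketch: write the dataset yaml consumed by YOLOv5's train.py.
# Assumptions: paths and class names are placeholders; the yaml keys
# (train/val/nc/names) follow the public ultralytics/yolov5 convention.
data_yaml = """\
train: images/train
val: images/val
nc: 2                      # number of detected object classes
names: ['class0', 'class1']
"""
with open("data.yaml", "w") as fh:
    fh.write(data_yaml)

# A typical single-GPU run matching the S3-2 settings (640 x 640, batch 8, 300 epochs):
#   python train.py --img 640 --batch 8 --epochs 300 \
#       --data data.yaml --cfg yolov5_improved.yaml --weights yolov5s.pt --device 0
```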
s4, improving the original YOLOv5 network structure to obtain an improved YOLOv5 network structure, wherein the improvement process is as follows:
S4-1: in the original YOLOv5 network structure, a CA coordinate attention mechanism is added after the 3rd, 6th and 9th layers; two 1D global pooling operations aggregate the input features along the vertical and horizontal directions into two separate direction-aware feature maps; these two feature maps, each embedding direction-specific information, are then encoded into two attention maps, each of which captures the long-range dependencies of the input feature map along one spatial direction, so the positional information is preserved in the generated attention maps; both attention maps are then applied to the input feature map by multiplication to emphasize the regions of interest (a module sketch follows);
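A minimal PyTorch sketch of such a coordinate attention block is given below, following the published CA design; the reduction ratio and the ReLU nonlinearity (the original CA paper uses h-swish) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CoordAtt(nn.Module):
    """Coordinate attention sketch: two 1D global poolings, a shared 1x1
    transform, and two direction-wise attention maps applied by multiplication."""
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # pool along width  -> (n, c, h, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # pool along height -> (n, c, 1, w)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)  # the CA paper uses h-swish here
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        x_h = self.pool_h(x)                      # (n, c, h, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)  # (n, c, w, 1)
        y = self.act(self.bn1(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # attention along height
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # attention along width
        return x * a_h * a_w  # emphasize attended positions
```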
S4-2: in the original YOLOv5 network structure, the enhanced feature-fusion network BiFPN is used: starting from a simplified PANet, an extra edge is added wherever the input and output nodes are at the same level, so more features are fused at little additional cost; P5_in is upsampled and stacked with P4_in by Concat_bifpn to obtain P4_td; P4_td is upsampled and stacked with P3_in by Concat_bifpn to obtain P3_out; P3_out is downsampled and stacked with P4_in and P4_td by Concat_bifpn to obtain P4_out; P4_out is downsampled and stacked with P5_in by Concat_bifpn to obtain P5_out (a fusion sketch follows);
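A minimal sketch of the weighted fusion behind a Concat_bifpn-style node is given below, using the fast normalized fusion from the BiFPN paper; the module name, channel handling and fuse convolution are assumptions, since the internals of Concat_bifpn are not spelled out here.

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """BiFPN-style fast normalized fusion sketch: learnable non-negative
    weights balance same-shape feature maps before a 3x3 fuse convolution."""
    def __init__(self, num_inputs, channels, eps=1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(num_inputs))  # one weight per input scale
        self.eps = eps
        self.fuse = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.SiLU(inplace=True),
        )

    def forward(self, inputs):  # inputs: list of tensors with identical shapes
        w = torch.relu(self.w)
        w = w / (w.sum() + self.eps)  # normalized weights balance the scales
        fused = sum(wi * x for wi, x in zip(w, inputs))
        return self.fuse(fused)

# e.g. P4_out = WeightedFusion(3, c)([p4_in, p4_td, downsample(p3_out)])
```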
S4-3: the loss function is computed with Varifocal Loss to address the class imbalance problem, where p is the predicted IoU-aware classification score (IACS) and q is the target score; for positive samples in training, q is set to the IoU between the generated bbox and the ground-truth box (gt IoU); for negative samples in training, the training target q for all classes is 0, so training concentrates on candidate detections with higher IACS; α and γ are hyper-parameters, α being an adjustable scale factor used to balance the loss between positive and negative examples, with 0 ≤ α ≤ 1, so that training avoids over-attending to negative examples; the loss function weighs hard samples against easy samples and reduces the loss contribution of easy samples, since 0 ≤ p ≤ 1 and γ is set greater than 1 (a sketch follows);
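A minimal sketch of the Varifocal Loss just described is shown below, following the published formulation; the defaults α = 0.75 and γ = 2.0 are the values from the Varifocal Loss paper, not values fixed by this method.

```python
import torch

def varifocal_loss(p, q, alpha=0.75, gamma=2.0, eps=1e-8):
    """Varifocal Loss sketch.
    p: predicted IoU-aware classification score (IACS) in [0, 1], after sigmoid
    q: target score (gt IoU for positive samples, 0 for negatives)
    """
    p = p.clamp(eps, 1.0 - eps)
    bce = -(q * torch.log(p) + (1.0 - q) * torch.log(1.0 - p))
    # positives are weighted by the target score q itself;
    # easy negatives are down-weighted by alpha * p**gamma (gamma > 1)
    weight = torch.where(q > 0, q, alpha * p.pow(gamma))
    return (weight * bce).sum()
```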
S5, training the network model: set the parameters in the improved YOLOv5 configuration files (train.py and yolov5.yaml), place the configured yaml file and the improved YOLOv5 network structure on a computer with the environment set up, train on the labeled images of the training and validation sets, and during training feed the held-out test-set images to the computer to evaluate the training effect at each stage; run tensorboard --logdir runs/train to monitor the mAP of the training in real time; after training finishes, save the trained network model weights (.pt file);
S6, detection with the trained network model weights: prepare the images to be detected on the computer, update the configuration file yaml, the trained weights and the path of the images to be detected in detect.py, and run it to obtain the detection results.
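As a usage illustration, a hedged inference sketch with trained weights is shown below; loading through torch.hub works for stock YOLOv5 weights, whereas weights containing the improved modules (CA, Concat_bifpn) would need the modified repository on the path so the custom layers can be deserialized; the paths and threshold are placeholders.

```python
import torch

# Sketch: run detection with trained weights via the public ultralytics/yolov5
# hub entry. Paths and the confidence threshold are illustrative placeholders;
# weights with custom modules (CA, Concat_bifpn) need the modified repo instead.
model = torch.hub.load("ultralytics/yolov5", "custom",
                       path="runs/train/exp/weights/best.pt")
model.conf = 0.25  # confidence threshold

results = model("test_image.jpg")  # image to be detected
results.print()                    # summary of detections per class
results.save()                     # annotated images saved under runs/detect
```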
According to the dense small target detection method based on the improved YOLOv5 algorithm, the YOLOv5 network structure is improved with a coordinate attention mechanism and enhanced feature fusion, so that smaller individual targets are better identified; with the Varifocal Loss function, targets that cluster and overlap densely over large areas can be accurately identified; compared with the prior art, the invention has the following advantages:
(1) For on-tree fruit recognition, the image dataset is trained with the improved YOLOv5 network structure; the trained model accurately identifies dense small targets, including targets clustered and overlapping over large areas.
(2) For target detection, the dataset is trained with the improved YOLOv5 network structure; the detection model obtained by training is small and can be adapted to various embedded devices.
(3) The method can be applied in outdoor natural environments, offers high recognition accuracy and speed, and meets real-time recognition requirements.
Drawings
FIG. 1 is a flow chart of a dense small target detection method based on the improved YOLOv5 algorithm;
FIG. 2 is a diagram of the CA coordinate attention mechanism;
FIG. 3 is a diagram of a BiFPN enhanced feature extraction network structure;
FIG. 4 is a graph showing the recognition effect of the improved YOLOv5 network model on mature citrus fruits;
FIG. 5 is a graph showing the recognition effect of the improved YOLOv5 network model on mature Nanfeng mandarin oranges;
FIG. 6 is a graph showing the recognition effect of the improved YOLOv5 network model on citrus in the growth period;
FIG. 7 is a graph showing the recognition effect of the improved YOLOv5 network model on Nanfeng mandarin orange in the growth period;
FIG. 8 is a graph showing the recognition effect of the improved YOLOv5 network model on large-area clustered citrus fruits.
Detailed Description
The invention will be further elucidated with reference to the drawings and examples, without however being limited thereto.
Example:
in this embodiment, citrus and Nanfeng mandarin orange are taken as examples to identify fruits on citrus trees in an orchard.
A dense small target detection method based on an improved YOLOv5 algorithm is shown in FIG. 1 and comprises the following steps:
S1, image acquisition: the user uses a digital camera or other image acquisition equipment to collect images of citrus trees bearing fruit, names the images according to the Pascal VOC dataset format, and creates three folders named Annotations, ImageSets and JPEGImages;
s2, image preprocessing:
S2-1, image labeling: in the images collected in step S1, the image annotation tool LabelImg is used to label the citrus fruit, marking the position and the category name of each fruit. In this example, two varieties (categories), citrus and Nanfeng mandarin orange, are selected; they serve only as an illustration, and the method is not limited to these two varieties.
(1) when framing citrus, the label can be named orange; when framing Nanfeng mandarin orange, the label can be named sweet_orange;
(2) when framing densely clustered and overlapping citrus fruits, frame them one by one, drawing each bounding box accurately by hand;
(3) when a citrus fruit to be framed is occluded by more than 95%, the current target is discarded;
(4) when the pixel area of a framed fruit target is smaller than 8 × 8, the current target is discarded (an annotation-check sketch follows);
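A small sketch that checks rule (4) over the Pascal VOC annotations is given below; the 8 × 8 = 64-pixel threshold comes from the rule above, the folder name follows the layout of step S1, and the occlusion rule (3) still has to be judged by the annotator.

```python
# Sketch: flag VOC boxes whose pixel area is below the 8 x 8 threshold of rule (4).
import glob
import xml.etree.ElementTree as ET

MIN_AREA = 8 * 8
for xml_path in glob.glob("Annotations/*.xml"):
    for obj in ET.parse(xml_path).getroot().iter("object"):
        b = obj.find("bndbox")
        w = int(b.find("xmax").text) - int(b.find("xmin").text)
        h = int(b.find("ymax").text) - int(b.find("ymin").text)
        if w * h < MIN_AREA:
            print(f"{xml_path}: discard '{obj.find('name').text}' box ({w}x{h} px)")
```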
S2-2, image augmentation: if the images collected by the user in step S1 cannot meet the requirement of 2000 images per citrus variety for recognition, the user can augment the images with the Augmentor image data enhancement library; the user selects the image storage path and the annotation XML file path, specifies output paths for the augmented images and XML files, selects the required image enhancers (e.g., enhancers for brightness, cropping and Gaussian noise), and chooses the augmentation quantity and mode (e.g., sequential, combined, random) to augment the images until the recognition requirement is met.
S2-3, dataset division: divide the augmented images and annotation files into a training set, a test set, a validation set and a train-val set; the training, test and validation sets account for 50%, 25% and 25% of the data respectively, and the train-val set is the union of the training and validation sets, i.e. 75% of the total;
S3, setting network model parameters: in the yaml configuration file of the YOLOv5 network model, set the input image size of the convolutional neural network, the number of recognition classes, the number of iterations, etc., according to the computer's RAM and GPU memory and the recognition quality and training speed required by the user; the user needs a graphics card model that supports CUDA acceleration;
S3-1, when the selected input image size is 608 × 608 (independent of the original image size), the batch parameter is 8, the number of epochs is 300 and there are 2 classes of detected objects, the user trains the model on a single GPU with at least 6 GB of GPU memory;
S3-2, when the selected input image size is 640 × 640 (independent of the original image size), the batch parameter is 8, the number of epochs is 300 and there are 2 classes of detected objects, the user trains the model on a single GPU with at least 8 GB of GPU memory;
s4, improving the original YOLOv5 network structure to obtain an improved YOLOv5 network structure, wherein the improvement process is as follows:
S4-1, in the original YOLOv5 network structure, a CA coordinate attention mechanism is added after the 3rd, 6th and 9th layers; the CA coordinate attention mechanism is shown in FIG. 2. Two 1D global pooling operations aggregate the input features along the vertical and horizontal directions into two separate direction-aware feature maps. These two feature maps, each embedding direction-specific information, are then encoded into two attention maps, each of which captures the long-range dependencies of the input feature map along one spatial direction, so the positional information is preserved in the generated attention maps. Both attention maps are then applied to the input feature map by multiplication to emphasize the regions of interest.
S4-2, in the original YOLOv5 network structure, the enhanced feature-fusion network BiFPN is used; the BiFPN enhanced feature-fusion structure is shown in FIG. 3. Starting from a simplified PANet, an extra edge is added wherever the input and output nodes are at the same level, so more features are fused at little additional cost. P5_in is upsampled and stacked with P4_in by Concat_bifpn to obtain P4_td; P4_td is upsampled and stacked with P3_in by Concat_bifpn to obtain P3_out; P3_out is downsampled and stacked with P4_in and P4_td by Concat_bifpn to obtain P4_out; P4_out is downsampled and stacked with P5_in by Concat_bifpn to obtain P5_out.
S4-3, Varifocal Loss is used to compute the loss function and address the class imbalance problem, where p is the predicted IoU-aware classification score (IACS) and q is the target score; for positive samples in training, q is set to the IoU between the generated bbox and the ground-truth box (gt IoU); for negative samples in training, the training target q for all classes is 0, so training concentrates on candidate detections with higher IACS; α and γ are hyper-parameters, α being an adjustable scale factor used to balance the loss between positive and negative examples, with 0 ≤ α ≤ 1, so that training avoids over-attending to negative examples; the loss function weighs hard samples against easy samples and reduces the loss contribution of easy samples, since 0 ≤ p ≤ 1 and γ is set greater than 1:
VFL(p, q) = −q · (q · log(p) + (1 − q) · log(1 − p)),  if q > 0
VFL(p, q) = −α · p^γ · log(1 − p),                     if q = 0
S5, training the network model: set the parameters in the improved YOLOv5 configuration files (train.py and yolov5.yaml), place the configured yaml file and the improved YOLOv5 network structure on a computer with the environment set up, train on the labeled images of the training and validation sets, and during training feed the held-out test-set images to the computer to evaluate the training effect at each stage; run tensorboard --logdir runs/train to monitor the mAP of the training in real time; after training finishes, save the trained network model weights (.pt file).
S6, recognition with the trained network model weights: prepare the fruit images to be detected on the computer, update the configuration file yaml, the trained weights and the path of the images to be detected in detect.py, and run it to obtain the detection results.
The above scheme was used to identify fruits on citrus and Nanfeng mandarin orange trees at different growth stages; the recognition results are shown in FIG. 4, FIG. 5, FIG. 6, FIG. 7 and FIG. 8. The result graphs show that, for on-tree fruit recognition, training the two fruit datasets with this method yields a model that accurately identifies individual fruit targets of small size as well as fruit targets clustered and overlapping over large areas.

Claims (2)

1. A dense small target detection method based on an improved YOLOv5 algorithm is characterized by comprising the following steps:
S1, image acquisition: the user collects images of the target object with image acquisition equipment, names the collected images according to the Pascal VOC dataset format, and creates three folders named Annotations, ImageSets and JPEGImages;
s2, image preprocessing:
S2-1, image labeling: in the images collected in step S1, label the targets with the image annotation tool LabelImg, marking the position and the category name of each target;
S2-2, image augmentation: if the images collected by the user in step S1 cannot meet the requirement of 2000 images per target category for recognition, augment the images with the Augmentor image data enhancement library: the user selects the storage path of the images and the path of the annotation XML files, specifies output paths for the augmented images and XML files, selects the required image enhancers, and chooses the augmentation quantity and mode to augment the images until the recognition requirement is met;
S2-3, dataset division: divide the augmented images and annotation files into a training set, a test set, a validation set and a train-val set; the training, test and validation sets account for 50%, 25% and 25% of the data respectively, and the train-val set is the union of the training and validation sets, i.e. 75% of the total;
S3, setting network model parameters: in the yaml configuration file of the YOLOv5 network model, set the input image size of the convolutional neural network, the number of recognition classes and the number of iterations according to the computer's RAM and GPU memory and the recognition quality and training speed required by the user; the user needs a graphics card model that supports CUDA acceleration;
S3-1, when the selected input image size is 608 × 608, the batch parameter is 8, the number of epochs is 300 and there are 2 classes of detected objects, the user trains the model on a single GPU with at least 6 GB of GPU memory;
S3-2, when the selected input image size is 640 × 640, the batch parameter is 8, the number of epochs is 300 and there are 2 classes of detected objects, the user trains the model on a single GPU with at least 8 GB of GPU memory;
s4, improving the original YOLOv5 network structure to obtain an improved YOLOv5 network structure, wherein the improvement process is as follows:
S4-1: in the original YOLOv5 network structure, a CA coordinate attention mechanism is added after the 3rd, 6th and 9th layers; two 1D global pooling operations aggregate the input features along the vertical and horizontal directions into two separate direction-aware feature maps; these two feature maps, each embedding direction-specific information, are then encoded into two attention maps, each of which captures the long-range dependencies of the input feature map along one spatial direction, so the positional information is preserved in the generated attention maps; both attention maps are then applied to the input feature map by multiplication to emphasize the regions of interest;
S4-2: in the original YOLOv5 network structure, the enhanced feature-fusion network BiFPN is used: starting from a simplified PANet, an extra edge is added wherever the input and output nodes are at the same level, so more features are fused at little additional cost; P5_in is upsampled and stacked with P4_in by Concat_bifpn to obtain P4_td; P4_td is upsampled and stacked with P3_in by Concat_bifpn to obtain P3_out; P3_out is downsampled and stacked with P4_in and P4_td by Concat_bifpn to obtain P4_out; P4_out is downsampled and stacked with P5_in by Concat_bifpn to obtain P5_out;
S4-3: the loss function is computed with Varifocal Loss to address the class imbalance problem, where p is the predicted IoU-aware classification score (IACS) and q is the target score; for positive samples in training, q is set to the IoU between the generated bbox and the ground-truth box (gt IoU); for negative samples in training, the training target q for all classes is 0, so training concentrates on candidate detections with higher IACS; α and γ are hyper-parameters, α being an adjustable scale factor used to balance the loss between positive and negative examples, with 0 ≤ α ≤ 1, so that training avoids over-attending to negative examples; the loss function weighs hard samples against easy samples and reduces the loss contribution of easy samples, since 0 ≤ p ≤ 1 and γ is set greater than 1;
S5, training the network model: set the parameters in the improved YOLOv5 configuration files (train.py and yolov5.yaml), place the configured yaml file and the improved YOLOv5 network structure on a computer with the environment set up, train on the labeled images of the training and validation sets, and during training feed the held-out test-set images to the computer to evaluate the training effect at each stage; run tensorboard --logdir runs/train to monitor the mAP of the training in real time; after training finishes, save the trained network model weights (.pt file);
S6, detection with the trained network model weights: prepare the images to be detected on the computer, update the configuration file yaml, the trained weights and the path of the images to be detected in detect.py, and run it to obtain the detection results.
2. The dense small target detection method based on the improved YOLOv5 algorithm according to claim 1, wherein in step S2-2, the enhancers comprise a brightness enhancer, a cropping enhancer and a Gaussian noise enhancer, and the augmentation modes comprise sequential, combined and random modes.
CN202111474306.7A 2021-12-06 2021-12-06 Dense small target detection method based on improved YOLOv5 Pending CN114140665A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111474306.7A CN114140665A (en) 2021-12-06 2021-12-06 Dense small target detection method based on improved YOLOv5

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111474306.7A CN114140665A (en) 2021-12-06 2021-12-06 Dense small target detection method based on improved YOLOv5

Publications (1)

Publication Number Publication Date
CN114140665A true CN114140665A (en) 2022-03-04

Family

ID=80383824

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111474306.7A Pending CN114140665A (en) 2021-12-06 2021-12-06 Dense small target detection method based on improved YOLOv5

Country Status (1)

Country Link
CN (1) CN114140665A (en)


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114332849A (en) * 2022-03-16 2022-04-12 科大天工智能装备技术(天津)有限公司 Crop growth state combined monitoring method and device and storage medium
CN114332849B (en) * 2022-03-16 2022-08-16 科大天工智能装备技术(天津)有限公司 Crop growth state combined monitoring method and device and storage medium
CN114998605A (en) * 2022-05-10 2022-09-02 北京科技大学 Target detection method for image enhancement guidance under severe imaging condition
CN114998605B (en) * 2022-05-10 2023-01-31 北京科技大学 Target detection method for image enhancement guidance under severe imaging condition
US11790640B1 (en) * 2022-06-22 2023-10-17 Ludong University Method for detecting densely occluded fish based on YOLOv5 network
CN115063795A (en) * 2022-08-17 2022-09-16 西南民族大学 Urinary sediment classification detection method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
Jia et al. Detection and segmentation of overlapped fruits based on optimized mask R-CNN application in apple harvesting robot
CN107016405B (en) A kind of pest image classification method based on classification prediction convolutional neural networks
CN114140665A (en) Dense small target detection method based on improved YOLOv5
CN103049763B (en) Context-constraint-based target identification method
CN107346420A (en) Text detection localization method under a kind of natural scene based on deep learning
CN107273832B (en) License plate recognition method and system based on integral channel characteristics and convolutional neural network
CN109522889A (en) Hydrological ruler water level identification and estimation method based on image analysis
CN105160310A (en) 3D (three-dimensional) convolutional neural network based human body behavior recognition method
CN105574550A (en) Vehicle identification method and device
CN113160062B (en) Infrared image target detection method, device, equipment and storage medium
CN104156734A (en) Fully-autonomous on-line study method based on random fern classifier
CN113128335B (en) Method, system and application for detecting, classifying and finding micro-living ancient fossil image
CN108596038A Method for recognizing erythrocytes in feces combining morphological segmentation and a neural network
CN113191334B (en) Plant canopy dense leaf counting method based on improved CenterNet
CN111178177A (en) Cucumber disease identification method based on convolutional neural network
CN110059539A (en) A kind of natural scene text position detection method based on image segmentation
CN109977899B (en) Training, reasoning and new variety adding method and system for article identification
CN110827312A (en) Learning method based on cooperative visual attention neural network
CN110599463A (en) Tongue image detection and positioning algorithm based on lightweight cascade neural network
CN108734200A (en) Human body target visible detection method and device based on BING features
CN114758132B (en) Fruit tree disease and pest identification method and system based on convolutional neural network
CN109615610B (en) Medical band-aid flaw detection method based on YOLO v2-tiny
Zheng et al. Single shot multibox detector for urban plantation single tree detection and location with high-resolution remote sensing imagery
CN104008374B (en) Miner's detection method based on condition random field in a kind of mine image
CN112364687A (en) Improved Faster R-CNN gas station electrostatic sign identification method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination