CN113191237A - Improved YOLOv3-based fruit tree image small target detection method and device - Google Patents


Info

Publication number
CN113191237A
CN113191237A
Authority
CN
China
Prior art keywords: image, small target, target detection, detection, yolov3
Prior art date
Legal status
Pending
Application number
CN202110434149.0A
Other languages
Chinese (zh)
Inventor
毛亮 (Mao Liang)
郭子豪 (Guo Zihao)
陈鹏飞 (Chen Pengfei)
杨晓帆 (Yang Xiaofan)
Current Assignee
Shenzhen Polytechnic
Original Assignee
Shenzhen Polytechnic
Priority date
Filing date
Publication date
Application filed by Shenzhen Polytechnic
Priority to CN202110434149.0A
Publication of CN113191237A
Legal status: Pending

Classifications

    • G06V 20/00 Scenes; Scene-specific elements
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/253 Fusion techniques of extracted features
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06V 10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V 2201/07 Target detection


Abstract

The invention discloses a method and device for detecting small targets in fruit tree images based on an improved YOLOv3. The method comprises the following steps: preprocessing an original image annotated with the small targets to be detected to obtain training images, and collecting the training images into a training image set; replacing the original transmission layer and part of the down-sampling layers of YOLOv3 with DenseNet and adding a new feature extraction layer to construct an improved YOLOv3 small target detection model; training the small target detection model with the training image set so that it outputs the category and position of the small targets to be detected; and inputting a detection image into the trained small target detection model to obtain the category and position of the small targets in that image. The method fully accounts for the characteristics of small targets in fruit tree images, including occlusion, and improves small target detection accuracy.

Description

Improved YOLOv3-based fruit tree image small target detection method and device
Technical Field
The invention relates to the technical field of computer vision, and in particular to a method and device for detecting small targets in fruit tree images based on an improved YOLOv3.
Background
In recent years, as target detection has been widely applied in agriculture, deep-learning-based target detection methods have gradually replaced traditional sampling or visual inspection for detecting fruit in fruit tree images in order to estimate fruit tree yield. Deep-learning-based target detection methods generally fall into candidate-box-based methods and regression-based methods: common candidate-box-based methods include Fast R-CNN, Faster R-CNN and R-FCN, while common regression-based methods include YOLO and SSD. Compared with candidate-box-based methods, regression-based methods need no candidate-box extraction and detect efficiently, but they impose strong constraints on input image size and achieve low accuracy on small targets such as fruit, which occupy few pixels, have weak texture and edge features, and are often occluded in fruit tree images.
Therefore, currently proposed deep-learning-based target detection methods transfer poorly to small targets in fruit tree images, particularly occluded ones, and their small target detection accuracy is difficult to improve.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a fruit tree image small target detection method and device based on an improved YOLOv3, which fully account for the characteristics of small targets in fruit tree images, including occlusion, and improve small target detection accuracy.
In order to solve the above technical problem, in a first aspect, an embodiment of the present invention provides a method for detecting a small target of a fruit tree image based on improved YOLOv3, including:
preprocessing an original image marked with a small target to be detected to obtain a training image, and collecting the training image in a training image set;
respectively replacing the original transmission layer and part of the down-sampling layers of YOLOv3 with DenseNet, and adding a new feature extraction layer, to construct an improved YOLOv3 small target detection model;
training the small target detection model by using the training image set, so that the small target detection model outputs the category and the position of the small target to be detected;
and inputting the detection image into the trained small target detection model to obtain the category and the position of the small target in the detection image.
Further, before the inputting the detection image into the trained small target detection model and obtaining the category and the position of the small target in the detection image, the method further includes:
and preprocessing the detection image.
Further, the preprocessing comprises any one or more image processing of image cropping, image flipping and image scaling.
Further, the original transmission layer and the partial down-sampling layer of YOLOv3 are respectively replaced by DenseNet, and a new feature extraction layer is added to construct a small target detection model of improved YOLOv3, specifically:
replacing an original transmission layer of YOLOv3 with DenseNet, enabling the original transmission layer to adjust the size of an input image to 512 x 512, replacing a 32 x 32 down-sampling layer and a 16 x 16 down-sampling layer of YOLOv3 with DenseNet, adding a feature extraction layer after a first residual block of YOLOv3, enabling the feature extraction layer to extract a feature map with the size of 128 x 128, and constructing the small target detection model.
Further, the inputting of the detection image into the trained small target detection model to obtain the category and the position of the small target in the detection image specifically includes:
and inputting the detection image into the trained small target detection model, and enabling the trained small target detection model to perform non-maximum suppression operation on the predicted target in the detection image to obtain the type and the position of the small target in the detection image.
In a second aspect, an embodiment of the present invention provides an apparatus for detecting a small target in a fruit tree image based on improved YOLOv3, including:
the image processing module is used for preprocessing an original image marked with a small target to be detected to obtain a training image and collecting the training image in a training image set;
the model construction module is used for replacing the original transmission layer and part of the down-sampling layers of YOLOv3 with DenseNet and adding a new feature extraction layer to construct an improved YOLOv3 small target detection model;
the model training module is used for training the small target detection model by using the training image set so as to enable the small target detection model to output the category and the position of the small target to be detected;
and the target detection module is used for inputting the detection image into the trained small target detection model to obtain the category and the position of the small target in the detection image.
Further, the target detection module is further configured to perform preprocessing on the detection image before the detection image is input into the trained small target detection model to obtain the type and the position of the small target in the detection image.
Further, the preprocessing comprises any one or more image processing of image cropping, image flipping and image scaling.
Further, the original transmission layer and the partial down-sampling layer of YOLOv3 are respectively replaced by DenseNet, and a new feature extraction layer is added to construct a small target detection model of improved YOLOv3, specifically:
replacing an original transmission layer of YOLOv3 with DenseNet, enabling the original transmission layer to adjust the size of an input image to 512 x 512, replacing a 32 x 32 down-sampling layer and a 16 x 16 down-sampling layer of YOLOv3 with DenseNet, adding a feature extraction layer after a first residual block of YOLOv3, enabling the feature extraction layer to extract a feature map with the size of 128 x 128, and constructing the small target detection model.
Further, the inputting of the detection image into the trained small target detection model to obtain the category and the position of the small target in the detection image specifically includes:
and inputting the detection image into the trained small target detection model, and enabling the trained small target detection model to perform non-maximum suppression operation on the predicted target in the detection image to obtain the type and the position of the small target in the detection image.
The embodiment of the invention has the following beneficial effects:
the method comprises the steps of preprocessing an original image marked with a small target to be detected to obtain a training image, collecting the training image in a training image set, respectively replacing an original transmission layer and a partial down-sampling layer of YOLOv3 with DenseNet, adding a new feature extraction layer, constructing an improved YOLOv3 small target detection model, training the small target detection model by using the training image set, enabling the small target detection model to output the type and the position of the small target to be detected, inputting the detection image into the trained small target detection model to obtain the type and the position of the small target in the detection image, and completing small target detection of the detection image. Compared with the prior art, the small target detection model is constructed based on the improved YOLOv3 network, the feature propagation of the image is enhanced, the feature fusion is promoted, one more feature graph with one scale is extracted by utilizing the newly added feature extraction layer, the detection capability of the small target is improved, the characteristics of the small target in the fruit tree image and the shielding condition of the small target can be fully considered, and the small target detection precision is improved.
Drawings
Fig. 1 is a schematic flow chart of a fruit tree image small target detection method based on improved YOLOv3 in a first embodiment of the present invention;
FIG. 2 is a schematic diagram of a prior art YOLOv3 network;
FIG. 3 is a schematic structural diagram of an improved YOLOv3 network according to a first embodiment of the present invention;
FIG. 4 is a data flow diagram of a training small target detection network according to a first embodiment of the present invention;
fig. 5 is a schematic structural diagram of a fruit tree image small target detection device based on the improved YOLOv3 in a second embodiment of the present invention.
Detailed Description
The technical solutions in the present invention will be described clearly and completely with reference to the accompanying drawings, and it is obvious that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that, the step numbers in the text are only for convenience of explanation of the specific embodiments, and do not serve to limit the execution sequence of the steps.
The first embodiment:
as shown in fig. 1, the first embodiment provides a fruit tree image small target detection method based on the improved YOLOv3, which includes steps S1 to S4:
s1, preprocessing the original image marked with the small target to be detected to obtain a training image, and collecting the training image in a training image set;
s2, replacing an original transmission layer and a partial down-sampling layer of the YOLOv3 with DenseNet, adding a new feature extraction layer, and constructing a small target detection model of improved YOLOv 3;
s3, training a small target detection model by using the training image set, and enabling the small target detection model to output the category and the position of the small target to be detected;
and S4, inputting the detection image into the trained small target detection model to obtain the type and position of the small target in the detection image.
Before the detection image is input to the trained small target detection model, the size of the detection image should be made equal to the size of the training image.
As an example, in step S1, an original image is acquired, the small targets to be detected in it are annotated manually or with an image labeling tool, and the annotated original image is preprocessed, for example by any one or more of image cropping, image flipping and image scaling, to obtain a training image. Preprocessing the annotated original image increases both the number of images and their randomness, yielding a more stable small target detection model. To improve small target detection, a high-resolution original image is cropped into a series of block images that follow a fixed naming specification and serve as training images.
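The crop/flip/scale preprocessing described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the crop size, output size and random seed are assumptions, and box annotations would need the same geometric transforms applied.

```python
import numpy as np

def preprocess(image, rng, crop=448, out=512):
    """Toy augmentation: random crop, random horizontal flip, then
    nearest-neighbour rescale to the model input size (out x out)."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - crop + 1)    # random crop offsets
    left = rng.integers(0, w - crop + 1)
    image = image[top:top + crop, left:left + crop]
    if rng.random() < 0.5:                 # random horizontal flip
        image = image[:, ::-1]
    idx = np.arange(out) * crop // out     # nearest-neighbour resample indices
    return image[idx][:, idx]

rng = np.random.default_rng(0)
img = np.arange(600 * 600).reshape(600, 600)
train_img = preprocess(img, rng)           # always (512, 512)
```

Randomizing crop position and flip direction is what "increases the randomness of the image" in practice: the same original image yields many distinct training samples.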
In step S2, the backbone network of YOLOv3, DarkNet-53, is selected as the basic network architecture. DarkNet-53 consists mainly of 1×1 and 3×3 convolution kernels and contains 53 convolutional layers. During YOLOv3 training, the repeated convolution and down-sampling operations cause feature information of the input image to be lost during forward propagation. Replacing the lower-resolution original transmission layer in the YOLOv3 network with DenseNet strengthens feature propagation and promotes feature fusion, and DenseNet also eases backward propagation of gradients, making the network easier to train. Meanwhile, replacing part of the down-sampling layers in the YOLOv3 network with DenseNet and adding a new feature extraction layer allows a feature map at one additional scale to be extracted, which improves the network's feature extraction capability and small target detection accuracy.
In step S3, a small target detection model is trained using the training image set, the training images in the training image set are input to the small target detection model for training, and during training, the type and position of the small target to be detected are regressed, so that the small target detection model outputs the type and position of the small target to be detected, and the trained small target detection model is obtained.
In step S4, the trained small target detection model is deployed and initialized with the Caffe deep learning framework, and a detection image whose size matches the training images is input into it, so that the trained model performs target detection on the detection image and outputs the category and position of the small targets it contains.
In a preferred embodiment, before inputting the detection image into the trained small target detection model and obtaining the category and the position of the small target in the detection image, the method further includes: and preprocessing the detection image.
Wherein the preprocessing comprises any one or more image processing of image cropping, image turning and image scaling.
In this embodiment, before the detection image is input into the trained small target detection model, any one or more of image cropping, image flipping and image scaling is applied to it, which keeps the detection image the same size as the training images, increases image randomness, and improves small target detection accuracy.
In a preferred embodiment, an original transmission layer and a partial downsampling layer of YOLOv3 are respectively replaced by DenseNet, and a feature extraction layer is added to construct a small target detection model of improved YOLOv3, specifically: replacing an original transmission layer of YOLOv3 with DenseNet, enabling the original transmission layer to adjust the size of an input image to 512 x 512, replacing a 32 x 32 down-sampling layer and a 16 x 16 down-sampling layer of YOLOv3 with DenseNet, adding a new feature extraction layer after a first residual block of YOLOv3, enabling the feature extraction layer to extract a feature map with the size of 128 x 128, and constructing a small target detection model.
Illustratively, a schematic structure diagram of a YOLOv3 network in the prior art is shown in fig. 2, and a schematic structure diagram of an improved YOLOv3 network is shown in fig. 3.
By replacing the lower-resolution original transmission layer in the YOLOv3 network with DenseNet, the transmission layer adjusts the input image size from 256×256 to 512×512, which strengthens feature propagation, promotes feature fusion, and effectively avoids losing feature information of the input image during forward propagation. Meanwhile, the 32×32 and 16×16 down-sampling layers in the YOLOv3 network are replaced with DenseNet, and a new feature extraction layer is added after the first residual block so that it extracts a feature map at the 128×128 scale, in addition to the 64×64, 32×32 and 16×16 feature maps the original YOLOv3 network extracts. The Feature Pyramid Network (FPN) algorithm then fuses the high resolution of the low-level features with the rich semantic information of the high-level features, which improves the network's feature extraction capability and small target detection accuracy.
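The scale bookkeeping here is simple arithmetic: with a 512×512 input, a detection head at stride s sees a (512/s)×(512/s) feature map, so the added head yields the 128×128 map while the three original YOLOv3 heads yield 64×64, 32×32 and 16×16. The stride values below are inferred from the stated map sizes, not quoted from the patent:

```python
INPUT_SIZE = 512
STRIDES = [4, 8, 16, 32]  # added 128x128 head plus the three original YOLOv3 heads

# each head predicts on an (INPUT_SIZE / stride) x (INPUT_SIZE / stride) grid
scales = [INPUT_SIZE // s for s in STRIDES]
```

The extra stride-4 head is what gives the network a grid fine enough to separate densely packed small fruit.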
The introduction of DenseNet mitigates the vanishing-gradient problem of deep networks while strengthening feature propagation. The output of layer l of the network is expressed by the formula:

x_l = H_l(x_{l-1});

for ResNet, the identity from the previous layer's output is added:

x_l = H_l(x_{l-1}) + x_{l-1};

in DenseNet, all preceding layers are concatenated as input:

x_l = H_l([x_0, x_1, ..., x_{l-1}]).

In these formulas, H_l(·) is a nonlinear composite transfer function that may comprise a series of BN (batch normalization), ReLU, pooling and convolution operations; for the DenseNet structure, the transfer function H_l is BN-ReLU-Conv(1×1)-BN-ReLU-Conv(3×3).
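The three connectivity patterns can be contrasted with a toy numpy sketch. The composite function H is stood in for by a random linear map plus ReLU, and the layer widths are illustrative; only the wiring (previous output vs. residual sum vs. concatenation of all earlier outputs) mirrors the formulas above.

```python
import numpy as np

def H(x, w):
    # stand-in for the composite BN-ReLU-Conv transform: linear map + ReLU
    return np.maximum(w @ x, 0.0)

def resnet_layer(x, w):
    # x_l = H_l(x_{l-1}) + x_{l-1}: the identity shortcut preserves width
    return H(x, w) + x

def densenet_layer(xs, w):
    # x_l = H_l([x_0, ..., x_{l-1}]): concatenate every earlier output
    return H(np.concatenate(xs), w)

rng = np.random.default_rng(0)
c = 4                              # channels per layer ("growth rate")
x0 = rng.normal(size=c)

# ResNet: each layer sees only the previous layer's output
r = resnet_layer(x0, rng.normal(size=(c, c)))

# DenseNet: layer l consumes l*c input channels, one c-wide slice per earlier layer
xs = [x0]
for l in range(1, 4):
    xs.append(densenet_layer(xs, rng.normal(size=(c, l * c))))
```

The growing input width of `densenet_layer` is the mechanism behind "all previous layers connected as inputs": every earlier feature map remains directly reachable, which is why gradients propagate more easily.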
In this embodiment, improving YOLOv3 and constructing the small target detection model on the improved YOLOv3 network improves the model's ability to detect small targets and raises small target detection accuracy.
Illustratively, the training process of the small target detection model is shown in fig. 4.
Python utilities are written to preprocess and post-process the training images.
Preprocessing the high-resolution training image: the training image is cropped into a series of block images that follow a fixed naming specification, and the block images are input into the small target detection model as training images. The blocks are produced by traversing the image with a sliding window; to ensure every region can be detected, the sliding window has a configurable crop size and overlap ratio. A picture cropped by the sliding window is named ImageName_Row_Column_height_width.
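A minimal sliding-window cropper following this naming specification might look like the following. The stride rule (crop size times one minus the overlap ratio) and the lack of trailing-edge padding are simplifying assumptions; a production version would also cover the image border.

```python
import numpy as np

def crop_blocks(image, name, size=512, overlap=0.2):
    """Traverse a high-resolution image with an overlapping sliding window
    and name each block ImageName_Row_Column_height_width, where Row and
    Column are the block's top-left offsets in the original image."""
    stride = int(size * (1.0 - overlap))
    h, w = image.shape[:2]
    blocks = {}
    for top in range(0, max(h - size, 0) + 1, stride):
        for left in range(0, max(w - size, 0) + 1, stride):
            block = image[top:top + size, left:left + size]
            key = f"{name}_{top}_{left}_{block.shape[0]}_{block.shape[1]}"
            blocks[key] = block
    return blocks

# a 1024x1024 image with 512x512 blocks and 20% overlap gives a 2x2 grid
blocks = crop_blocks(np.zeros((1024, 1024)), "orchard")
```

Encoding the Row/Column offsets in the file name is what later lets post-processing shift each block's predictions back into the coordinate frame of the uncut image.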
Post-processing the high-resolution block images: the predicted bounding-box coordinates in each block image are offset by the Row and Column values in the block's name to recover the predicted coordinates of the small target in the uncut image. Note, however, that an overlap region is detected twice and therefore yields two bounding-box predictions; non-maximum suppression can be applied to the global matrix of bounding-box predictions to resolve such duplicate detections.
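The coordinate remapping and the de-duplication by non-maximum suppression can be sketched in plain Python. The block names and box values are made up for illustration; boxes are (x1, y1, x2, y2), with the Column offset applied to x and the Row offset applied to y.

```python
def to_global(box, block_name):
    """Shift a block-local box back into the uncut image using the
    Row/Column offsets encoded in the block's name."""
    row, col = (int(v) for v in block_name.split("_")[1:3])
    x1, y1, x2, y2 = box
    return (x1 + col, y1 + row, x2 + col, y2 + row)

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring box and
    drop any box overlapping a kept box by more than thresh."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep

# the same fruit straddles the overlap of two 512x512 blocks cut at stride 409,
# so it is detected once in each block at different local coordinates
b1 = to_global((10, 419, 50, 459), "orchard_0_0_512_512")
b2 = to_global((10, 10, 50, 50), "orchard_409_0_512_512")
kept = nms([b1, b2], [0.9, 0.8])   # the duplicate is suppressed
```

After remapping, both detections land on the same global box, so NMS keeps only the higher-scoring one, which is exactly the duplicate-detection mitigation the text describes.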
The small target detection model uses the same loss function as the YOLOv3 network, regressing the category and position of the small target simultaneously during training. The loss is the sum of the localization loss, the confidence loss and the classification loss:

Loss = Error_coord + Error_iou + Error_cls;

where Error_coord is the localization loss, Error_iou is the confidence loss, and Error_cls is the classification loss.
In a preferred embodiment, the detection image is input into the trained small target detection model to obtain the category and position of the small target in the detection image, specifically: and inputting the detection image into the trained small target detection model, and enabling the trained small target detection model to perform non-maximum suppression operation on the prediction target in the detection image to obtain the type and the position of the small target in the detection image.
In the embodiment, the trained small target detection model performs non-maximum suppression operation on the predicted target in the detection image, so that the optimal small target can be selected from a plurality of small targets repeatedly marked in the detection image overlapping region, and the small target detection accuracy is improved.
Second embodiment:
as shown in fig. 5, the second embodiment provides a fruit tree image small target detection device based on the improved YOLOv3, including: the image processing module 21, configured to preprocess an original image annotated with the small targets to be detected to obtain training images and collect them into a training image set; the model construction module 22, configured to replace the original transmission layer and part of the down-sampling layers of YOLOv3 with DenseNet and add a new feature extraction layer to construct an improved YOLOv3 small target detection model; the model training module 23, configured to train the small target detection model with the training image set so that it outputs the category and position of the small targets to be detected; and the target detection module 24, configured to input the detection image into the trained small target detection model to obtain the category and position of the small targets in the detection image.
The object detection module 24 should make the size of the detection image and the size of the training image consistent before inputting the detection image into the trained small object detection model.
Illustratively, an original image is acquired by the image processing module 21, the small targets to be detected in it are annotated manually or with an image labeling tool, and the annotated original image is preprocessed, for example by any one or more of image cropping, image flipping and image scaling, to obtain a training image. Preprocessing the annotated original image increases both the number of images and their randomness, yielding a more stable small target detection model. To improve small target detection, a high-resolution original image is cropped into a series of block images that follow a fixed naming specification and serve as training images.
Through the model construction module 22, the backbone network of YOLOv3, DarkNet-53, is selected as the basic network architecture. DarkNet-53 consists mainly of 1×1 and 3×3 convolution kernels and contains 53 convolutional layers. During YOLOv3 training, the repeated convolution and down-sampling operations cause feature information of the input image to be lost during forward propagation. Replacing the lower-resolution original transmission layer in the YOLOv3 network with DenseNet strengthens feature propagation and promotes feature fusion, and DenseNet also eases backward propagation of gradients, making the network easier to train. Meanwhile, replacing part of the down-sampling layers in the YOLOv3 network with DenseNet and adding a feature extraction layer allows a feature map at one additional scale to be extracted, which improves the network's feature extraction capability and small target detection accuracy.
The small target detection model is trained by the training image set through the model training module 23, the training images in the training image set are input into the small target detection model for training, and the type and the position of the small target to be detected are regressed during training, so that the small target detection model outputs the type and the position of the small target to be detected, and the trained small target detection model is obtained.
Through the target detection module 24, the trained small target detection model is deployed and initialized with the Caffe deep learning framework, and a detection image whose size matches the training images is input into it, so that the trained model performs target detection on the detection image and outputs the category and position of the small targets it contains.
In a preferred embodiment, the target detection module 24 is further configured to perform preprocessing on the detection image before inputting the detection image into the trained small target detection model to obtain the type and position of the small target in the detection image.
Wherein the preprocessing comprises any one or more image processing of image cropping, image turning and image scaling.
In this embodiment, the target detection module 24 applies any one or more of image cropping, image flipping and image scaling to the detection image before inputting it into the trained small target detection model, which keeps the detection image the same size as the training images, increases image randomness, and improves small target detection accuracy.
In a preferred embodiment, an original transmission layer and a partial downsampling layer of YOLOv3 are respectively replaced by DenseNet, and a feature extraction layer is added to construct a small target detection model of improved YOLOv3, specifically: replacing an original transmission layer of YOLOv3 with DenseNet, enabling the original transmission layer to adjust the size of an input image to 512 x 512, replacing a 32 x 32 down-sampling layer and a 16 x 16 down-sampling layer of YOLOv3 with DenseNet, adding a new feature extraction layer after a first residual block of YOLOv3, enabling the feature extraction layer to extract a feature map with the size of 128 x 128, and constructing a small target detection model.
Illustratively, by replacing the lower-resolution original transmission layer in the YOLOv3 network with DenseNet, the transmission layer adjusts the input image size from 256×256 to 512×512, which strengthens feature propagation, promotes feature fusion, and effectively avoids losing feature information of the input image during forward propagation. Meanwhile, the 32×32 and 16×16 down-sampling layers in the YOLOv3 network are replaced with DenseNet, and a feature extraction layer is added after the first residual block so that it extracts a feature map at the 128×128 scale, giving four scales in total together with the 64×64, 32×32 and 16×16 feature maps the original YOLOv3 network extracts. The Feature Pyramid Network (FPN) algorithm then fuses the high resolution of the low-level features with the rich semantic information of the high-level features, which improves the network's feature extraction capability and small target detection accuracy.
Wherein, the introduction of DenseNet can alleviate the vanishing-gradient problem of deep networks while enhancing feature propagation. The output of the l-th layer of the YOLOv3 network is expressed by the formula:

x_l = H_l(x_{l-1});

for ResNet, an identity mapping of the previous layer's output is added:

x_l = H_l(x_{l-1}) + x_{l-1};

in DenseNet, all preceding layers are concatenated as the input:

x_l = H_l([x_0, x_1, ..., x_{l-1}]);

in the above formulas, H_l(·) is a non-linear transformation function, a composite operation that may include a series of BN (batch normalization), ReLU, Pooling, and Conv operations; in the DenseNet structure, the transformation function H_l is BN-ReLU-Conv(1 × 1)-BN-ReLU-Conv(3 × 3).
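The difference between the plain layer chain and the dense connectivity above can be sketched in a few lines of Python. This is an illustrative toy, not the patent's network: H here is a stand-in ReLU rather than the actual BN-ReLU-Conv composite, and it only shows how the feature width grows as all earlier outputs are concatenated and reused:

```python
import numpy as np

def H(x):
    # Placeholder nonlinearity; in DenseNet, H_l is the composite
    # BN-ReLU-Conv(1x1)-BN-ReLU-Conv(3x3).
    return np.maximum(x, 0.0)

def chain_forward(x0, num_layers):
    """Plain chain: x_l = H(x_{l-1})."""
    x = x0
    for _ in range(num_layers):
        x = H(x)
    return x

def dense_forward(x0, num_layers):
    """Dense connectivity: x_l = H([x_0, x_1, ..., x_{l-1}])."""
    outputs = [x0]
    for _ in range(num_layers):
        outputs.append(H(np.concatenate(outputs, axis=-1)))
    return outputs[-1]

x0 = np.ones(4)                     # a toy 4-channel feature vector
print(chain_forward(x0, 3).shape)   # (4,)  - width stays fixed
print(dense_forward(x0, 3).shape)   # (16,) - width grows with feature reuse
```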
In this embodiment, the model construction module 22 improves YOLOv3 and builds the small target detection model on the improved YOLOv3 network, which is beneficial to improving the model's ability to detect small targets and its detection accuracy.
Illustratively, the training process of the small target detection model is specifically as follows:
Several Python utilities are written to pre-process and post-process the training images.
Pre-processing the high-resolution training images: each training image is cut into a series of block images following a fixed naming specification, and the block images are input into the small target detection model as training images. The image is traversed block by block with a sliding window; to ensure that every region can be detected, the sliding window has a definable crop size and overlap ratio, and each picture cropped by the sliding window is named according to the specification: ImageName_Row_Column_height_width.
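A minimal sketch of such a sliding-window cropper, assuming NumPy arrays for images and the naming specification above; the function name `crop_blocks` and the default crop size and overlap ratio are illustrative, not taken from the patent:

```python
import numpy as np

def crop_blocks(image, name, crop=512, overlap=0.2):
    """Traverse `image` with an overlapping sliding window and return a
    dict mapping ImageName_Row_Column_height_width -> block array."""
    step = int(crop * (1 - overlap))
    h, w = image.shape[:2]
    rows = list(range(0, max(h - crop, 0) + 1, step))
    cols = list(range(0, max(w - crop, 0) + 1, step))
    # Clamp a final window to each edge so every region is covered.
    if rows[-1] != max(h - crop, 0):
        rows.append(max(h - crop, 0))
    if cols[-1] != max(w - crop, 0):
        cols.append(max(w - crop, 0))
    blocks = {}
    for r in rows:
        for c in cols:
            block = image[r:r + crop, c:c + crop]
            blocks[f"{name}_{r}_{c}_{block.shape[0]}_{block.shape[1]}"] = block
    return blocks
```

For example, a 600 × 600 image cropped to 512 × 512 with 20% overlap yields four blocks whose names encode their offsets, e.g. `img_88_88_512_512`.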
Post-processing the high-resolution block images: the predicted bounding-box coordinates of each block image are offset by the Row and Column values in the image name to obtain the predicted coordinates of the small target in the uncut image. Note, however, that the overlap region is detected twice, producing two sets of bounding-box predictions; non-maximum suppression can therefore be applied to the global matrix of bounding-box predictions to eliminate such duplicate detections.
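The post-processing above can be sketched as follows; `to_global` and `nms` are hypothetical helper names, and the NMS shown is the standard greedy formulation rather than the patent's exact implementation:

```python
import numpy as np

def to_global(block_name, boxes):
    """Shift block-local boxes (N, 4) of (x1, y1, x2, y2) into the
    coordinates of the uncut image, using the Row/Column offsets
    encoded in the name ImageName_Row_Column_height_width."""
    row, col = (int(v) for v in block_name.split("_")[-4:-2])
    return boxes + np.array([col, row, col, row], dtype=float)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression over the global matrix of
    bounding-box predictions; removes duplicates from overlap regions."""
    order = np.argsort(scores)[::-1]          # highest score first
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # intersection of the kept box with all remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + areas - inter)
        order = rest[iou <= iou_thresh]       # drop duplicates of box i
    return keep
```

For example, two nearly identical boxes predicted on both sides of an overlap region collapse to a single detection after `nms`.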
The small target detection model uses the same loss function as the YOLOv3 network, and the category and position of the small target are regressed simultaneously during training. The loss function Loss is the sum of the localization loss, the confidence loss, and the classification loss, with the expression:

Loss = Error_coord + Error_iou + Error_cls;

in the above formula, Error_coord is the localization loss, Error_iou is the confidence loss, and Error_cls is the classification loss.
In a preferred embodiment, the detection image is input into the trained small target detection model to obtain the category and position of the small target in the detection image, specifically: and inputting the detection image into the trained small target detection model, and enabling the trained small target detection model to perform non-maximum suppression operation on the prediction target in the detection image to obtain the type and the position of the small target in the detection image.
In this embodiment, the target detection module 24 enables the trained small target detection model to perform non-maximum suppression operation on the predicted target in the detection image, so that the optimal small target can be selected from a plurality of small targets repeatedly marked in the detection image overlapping region, which is beneficial to improving the small target detection accuracy.
In summary, the embodiment of the present invention has the following advantages:
the method comprises the steps of preprocessing an original image marked with a small target to be detected to obtain a training image, collecting the training image in a training image set, respectively replacing an original transmission layer and a partial down-sampling layer of YOLOv3 with DenseNet, adding a new feature extraction layer, constructing an improved YOLOv3 small target detection model, training the small target detection model by using the training image set, enabling the small target detection model to output the type and the position of the small target to be detected, inputting the detection image into the trained small target detection model to obtain the type and the position of the small target in the detection image, and completing small target detection of the detection image. Compared with the prior art, the small target detection model is constructed based on the improved YOLOv3 network, the feature propagation of the image is enhanced, the feature fusion is promoted, one more feature graph with one scale is extracted by utilizing the newly added feature extraction layer, the detection capability of the small target is improved, the characteristics of the small target in the fruit tree image and the shielding condition of the small target can be fully considered, and the small target detection precision is improved.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.
It will be understood by those skilled in the art that all or part of the processes of the above embodiments may be implemented by hardware related to instructions of a computer program, and the computer program may be stored in a computer readable storage medium, and when executed, may include the processes of the above embodiments. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

Claims (10)

1. A fruit tree image small target detection method based on improved YOLOv3 is characterized by comprising the following steps:
preprocessing an original image marked with a small target to be detected to obtain a training image, and collecting the training image in a training image set;
respectively replacing an original transmission layer and a partial down-sampling layer of the YOLOv3 with DenseNet, and newly adding a feature extraction layer to construct a small target detection model of improved YOLOv 3;
training the small target detection model by using the training image set, so that the small target detection model outputs the category and the position of the small target to be detected;
and inputting the detection image into the trained small target detection model to obtain the category and the position of the small target in the detection image.
2. The improved YOLOv 3-based fruit tree image small target detection method according to claim 1, wherein before the inputting the detection image into the trained small target detection model to obtain the category and position of the small target in the detection image, the method further comprises:
and preprocessing the detection image.
3. The improved YOLOv 3-based fruit tree image small target detection method as claimed in claim 1 or 2, wherein the preprocessing includes any one or more of image cropping, image flipping, and image scaling.
4. The method for detecting the small target of the fruit tree image based on the improved YOLOv3 as claimed in claim 1, wherein the original transmission layer and the partial down-sampling layer of YOLOv3 are replaced by DenseNet, and a feature extraction layer is added to construct the small target detection model of the improved YOLOv3, specifically:
replacing an original transmission layer of YOLOv3 with DenseNet, enabling the original transmission layer to adjust the size of an input image to 512 x 512, replacing a 32 x 32 down-sampling layer and a 16 x 16 down-sampling layer of YOLOv3 with DenseNet, adding a feature extraction layer after a first residual block of YOLOv3, enabling the feature extraction layer to extract a feature map with the size of 128 x 128, and constructing the small target detection model.
5. The improved YOLOv 3-based fruit tree image small target detection method as claimed in claim 1, wherein the detection image is input into a trained small target detection model to obtain the type and position of the small target in the detection image, specifically:
and inputting the detection image into the trained small target detection model, and enabling the trained small target detection model to perform non-maximum suppression operation on the predicted target in the detection image to obtain the type and the position of the small target in the detection image.
6. A fruit tree image small target detection device based on improved YOLOv3 is characterized by comprising:
the image processing module is used for preprocessing an original image marked with a small target to be detected to obtain a training image and collecting the training image in a training image set;
the model construction module is used for replacing an original transmission layer and a partial down-sampling layer of the YOLOv3 with DenseNet, adding a new feature extraction layer and constructing a small target detection model of improved YOLOv 3;
the model training module is used for training the small target detection model by using the training image set so as to enable the small target detection model to output the category and the position of the small target to be detected;
and the target detection module is used for inputting the detection image into the trained small target detection model to obtain the category and the position of the small target in the detection image.
7. The improved YOLOv 3-based fruit tree image small target detection device according to claim 6, wherein the target detection module is further configured to pre-process the detection image before inputting the detection image into the trained small target detection model to obtain the type and position of the small target in the detection image.
8. The improved YOLOv 3-based fruit tree image small target detection device as claimed in claim 6 or 7, wherein the preprocessing includes any one or more of image cropping, image flipping, and image scaling.
9. The device for detecting the small target of the fruit tree image based on the improved YOLOv3 as claimed in claim 6, wherein the original transmission layer and the partial down-sampling layer of YOLOv3 are replaced by DenseNet, and a feature extraction layer is added to construct the small target detection model of the improved YOLOv3, specifically:
replacing an original transmission layer of YOLOv3 with DenseNet, enabling the original transmission layer to adjust the size of an input image to 512 x 512, replacing a 32 x 32 down-sampling layer and a 16 x 16 down-sampling layer of YOLOv3 with DenseNet, adding a feature extraction layer after a first residual block of YOLOv3, enabling the feature extraction layer to extract a feature map with the size of 128 x 128, and constructing the small target detection model.
10. The improved YOLOv 3-based fruit tree image small target detection device as claimed in claim 6, wherein the input of the detection image into the trained small target detection model results in the type and position of the small target in the detection image, specifically:
and inputting the detection image into the trained small target detection model, and enabling the trained small target detection model to perform non-maximum suppression operation on the predicted target in the detection image to obtain the type and the position of the small target in the detection image.
CN202110434149.0A 2021-04-21 2021-04-21 Improved YOLOv 3-based fruit tree image small target detection method and device Pending CN113191237A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110434149.0A CN113191237A (en) 2021-04-21 2021-04-21 Improved YOLOv 3-based fruit tree image small target detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110434149.0A CN113191237A (en) 2021-04-21 2021-04-21 Improved YOLOv 3-based fruit tree image small target detection method and device

Publications (1)

Publication Number Publication Date
CN113191237A true CN113191237A (en) 2021-07-30

Family

ID=76978089

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110434149.0A Pending CN113191237A (en) 2021-04-21 2021-04-21 Improved YOLOv 3-based fruit tree image small target detection method and device

Country Status (1)

Country Link
CN (1) CN113191237A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116030366A (en) * 2023-02-21 2023-04-28 中国电建集团山东电力建设第一工程有限公司 Power line inspection detection method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110766098A (en) * 2019-11-07 2020-02-07 中国石油大学(华东) Traffic scene small target detection method based on improved YOLOv3
CN110826379A (en) * 2018-08-13 2020-02-21 中国科学院长春光学精密机械与物理研究所 Target detection method based on feature multiplexing and YOLOv3
CN112288700A (en) * 2020-10-23 2021-01-29 西安科锐盛创新科技有限公司 Rail defect detection method


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
FENG, JIN: "Research and Optimization of a Deep-Learning-Based Detection Method for Light-Trapped Rice Pests", China Master's Theses Full-text Database, Agricultural Science and Technology, no. 2021, pages 3 *
ZHANG, GUANGSHI ET AL.: "Gear Defect Detection Based on Improved YOLOv3 Network", Laser & Optoelectronics Progress, vol. 57, no. 12, pages 3 *
XUE, YUEJU: "Improved YOLOv2 Recognition Method for Immature Mangoes", Transactions of the Chinese Society of Agricultural Engineering, vol. 34, no. 7, pages 1-3 *


Similar Documents

Publication Publication Date Title
CN110310264B (en) DCNN-based large-scale target detection method and device
US10229346B1 (en) Learning method, learning device for detecting object using edge image and testing method, testing device using the same
CN110738207B (en) Character detection method for fusing character area edge information in character image
CN108230329B (en) Semantic segmentation method based on multi-scale convolution neural network
CN111681273B (en) Image segmentation method and device, electronic equipment and readable storage medium
CN110032998B (en) Method, system, device and storage medium for detecting characters of natural scene picture
CN112288008B (en) Mosaic multispectral image disguised target detection method based on deep learning
CN111652217A (en) Text detection method and device, electronic equipment and computer storage medium
CN111091123A (en) Text region detection method and equipment
CN113159120A (en) Contraband detection method based on multi-scale cross-image weak supervision learning
CN112800964A (en) Remote sensing image target detection method and system based on multi-module fusion
CN107784288A (en) A kind of iteration positioning formula method for detecting human face based on deep neural network
CN111353544A (en) Improved Mixed Pooling-Yolov 3-based target detection method
CN112800955A (en) Remote sensing image rotating target detection method and system based on weighted bidirectional feature pyramid
CN112861915A (en) Anchor-frame-free non-cooperative target detection method based on high-level semantic features
CN113591719A (en) Method and device for detecting text with any shape in natural scene and training method
CN114998756A (en) Yolov 5-based remote sensing image detection method and device and storage medium
CN116645592A (en) Crack detection method based on image processing and storage medium
CN115375999A (en) Target detection model, method and device applied to dangerous chemical vehicle detection
CN116486393A (en) Scene text detection method based on image segmentation
Shit et al. An encoder‐decoder based CNN architecture using end to end dehaze and detection network for proper image visualization and detection
CN113191237A (en) Improved YOLOv 3-based fruit tree image small target detection method and device
CN112884755B (en) Method and device for detecting contraband
CN111767919A (en) Target detection method for multi-layer bidirectional feature extraction and fusion
CN114663654B (en) Improved YOLOv4 network model and small target detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination