CN113313128B - SAR image target detection method based on improved YOLOv3 network - Google Patents

SAR image target detection method based on improved YOLOv3 network

Info

Publication number: CN113313128B (application CN202110613778.XA; earlier publication CN113313128A)
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: SAR image, frame, network, target, box
Legal status: Active (granted)
Inventors: 蒋忠进, 王强, 曾祥书
Original and current assignee: Southeast University
Application CN202110613778.XA filed by Southeast University; granted and published as CN113313128B.

Classifications

  • G06F18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
  • G06F18/23213 — Pattern recognition; non-hierarchical clustering techniques using statistics or function optimisation, with a fixed number of clusters, e.g. k-means clustering
  • G06N3/084 — Neural networks; learning methods; backpropagation, e.g. using gradient descent
  • G06V10/44 — Image or video recognition; local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
  • G06V2201/07 — Indexing scheme relating to image or video recognition or understanding; target detection


Abstract

The invention discloses a SAR image target detection method based on an improved YOLOv3 network, which comprises the following steps: input a SAR image training data set into a Darknet53 network and perform feature extraction to obtain a basic feature map of each SAR image; corresponding to three grid divisions of the SAR image, perform feature extraction and feature fusion on the basic feature map at large, medium and small scales to obtain a multi-scale feature map of the SAR image; input the multi-scale feature map into a prediction network, and adjust and optimize the candidate-box parameters; substitute the optimized candidate-box parameters and the label-box parameters into a loss function, and calculate the loss value of the current network; based on the obtained loss value, update the network parameters through back propagation, and train repeatedly until the network parameters converge; input a SAR image test data set, run the network test, and output detection boxes matched one-to-one with targets, together with various detection indices.

Description

SAR image target detection method based on improved YOLOv3 network
Technical Field
The invention relates to a SAR image target detection method based on an improved YOLOv3 network, and belongs to the technical field of deep learning and machine vision.
Background
In recent years, artificial intelligence has developed rapidly and has been widely applied in fields such as the military, geophysical prospecting, medicine and urban planning, with good results. In particular, deep learning frameworks with the Convolutional Neural Network (CNN) at their core have shown strong capability in image processing and machine vision. CNNs excel not only at optical image processing but also at automatic SAR (Synthetic Aperture Radar) image interpretation, where they can perform target detection and recognition efficiently and accurately.
At present, target detection methods based on deep learning fall mainly into two types: two-stage detection models and one-stage detection models. Two-stage algorithms, chiefly Fast R-CNN, Faster R-CNN and the like, offer high precision but low detection speed, which makes real-time target detection difficult. One-stage detection is based on parameter regression and merges candidate-box generation with classification and regression into a single step; it mainly comprises the YOLO (You Only Look Once) series. The YOLO algorithm was proposed by Redmon et al. in 2015, and the YOLOv2 algorithm followed a year later, greatly reducing computational complexity and increasing detection speed. In April 2018 the authors introduced the YOLOv3 algorithm, again with significant improvements in accuracy and speed. However, when a conventional YOLOv3 network is used directly for target detection in SAR images, frequent false detections and missed detections occur. How to improve the YOLOv3 network so that it is better suited to SAR image target detection, thereby improving precision and recall, is a problem facing researchers.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention aims to provide a SAR image target detection method based on an improved YOLOv3 network. To better optimize the candidate-box parameters, a k-means clustering method is adopted to generate 9 groups of prior anchor boxes, which serve as initial values of the candidate-box sizes. To better describe the goodness of fit between two bounding boxes, r_GIOU is introduced in place of the intersection-over-union r_IOU; the box regression loss and confidence loss are recalculated and the overall loss function is optimized, so that the network parameters are updated more effectively through back propagation.
To achieve this purpose, the invention adopts the following technical scheme:
A SAR image target detection method based on an improved YOLOv3 network comprises the following steps:
Step 1: prepare a SAR image training data set in which every known target carries label-box parameters, comprising: box center coordinates, box width and height, object category and confidence; input the SAR image training data set into a Darknet53 network and perform feature extraction on each SAR image to obtain its basic feature map;
Step 2: corresponding to the three grid divisions of the SAR image at large, medium and small sizes, perform feature extraction and feature fusion on the basic feature map at the three scales to obtain a multi-scale feature map of the SAR image; each SAR image grid cell is given candidate boxes of three sizes, each candidate box having the following parameters: box center coordinates, box width and height, object category and confidence;
Step 3: input the multi-scale feature map into a prediction network, and adjust and optimize the candidate-box parameters;
Step 4: substitute the optimized candidate-box parameters and the label-box parameters into a loss function, and calculate the loss value of the current network;
Step 5: based on the obtained loss value, update the network parameters by back propagation, and train repeatedly until the network parameters converge;
Step 6: input a SAR image test data set, run the network test, and output detection boxes matched one-to-one with targets, together with various detection indices.
In step 3, the candidate-box parameters must be initialized before they are optimized; the initial values of the box width and height parameters are computed as follows:
(1) Align the centers of all label boxes in the training data set, and list the size data of all label boxes;
(2) Cluster the label-box sizes with the k-means clustering method, using the distance metric d computed as:

d(A, B) = 1 − r_IOU(A, B)

r_IOU(A, B) = |A ∩ B| / |A ∪ B|

where A and B are two different bounding boxes, r_IOU denotes the intersection-over-union operation, r_IOU(A, B) is the intersection-over-union between bounding boxes A and B, ∩ denotes the intersection operation, and ∪ denotes the union operation;
(3) Extract the 9 most densely populated sizes to obtain 9 groups of prior anchor boxes as initial values of the candidate-box sizes; grids exist at three scales (large, medium and small), each scale of grid is given candidate boxes of 3 sizes, so there are candidate boxes of 9 sizes in total.
In step 4, the loss function is expressed as follows:

l_total = l_box + l_cls + l_obj

where l_box denotes the box regression loss, l_cls the classification loss, and l_obj the confidence loss.

The box regression loss l_box is expressed as follows:

l_box = λ_coord · Σ_{i=0}^{S²−1} Σ_{j=0}^{J−1} I_{i,j}^{obj} · (2 − ŵ_i · ĥ_i) · (1 − r_GIOU(B_{i,j}, B̂_i))

where λ_coord is a weight factor; each image is divided into S × S grids, and each grid has J candidate boxes; ŵ_i and ĥ_i are respectively the width and height of the label box corresponding to the i-th grid; I_{i,j}^{obj} indicates whether the j-th candidate box of the i-th grid contains a target, taking the value 1 if it does and 0 otherwise; r_GIOU(B_{i,j}, B̂_i) denotes the r_GIOU value between B_{i,j} and B̂_i, where B_{i,j} is the j-th candidate box of the i-th grid and B̂_i is the label box corresponding to the i-th grid.
The classification loss l_cls is expressed as follows:

l_cls = Σ_{i=0}^{S²−1} Σ_{j=0}^{J−1} I_{i,j}^{obj} · f_etrp(p_{i,j}, p̂_i)

where p_{i,j} is the probability that the j-th candidate box of the i-th grid is predicted to contain a target, p̂_i is the probability that the label box corresponding to the i-th grid contains a target, and f_etrp() denotes the binary cross-entropy function.
The confidence loss l_obj is expressed as follows:

l_obj = Σ_{i=0}^{S²−1} Σ_{j=0}^{J−1} [ I_{i,j}^{obj} · f_etrp(C_{i,j}, Ĉ_i) + λ_noobj · I_{i,j}^{noobj} · f_etrp(C_{i,j}, Ĉ_i) ]

where λ_noobj is a weight factor; I_{i,j}^{noobj} indicates whether the j-th candidate box of the i-th grid contains no target, taking the value 1 if it contains no target and 0 otherwise; C_{i,j} is the confidence score of the j-th candidate box of the i-th grid, and Ĉ_i is the confidence score of the label box corresponding to the i-th grid.
The above r_GIOU is calculated as follows:

r_GIOU(A, B) = r_IOU(A, B) − |D \ (A ∪ B)| / |D|

where D is the smallest closed convex region containing bounding box A and bounding box B, i.e. the smallest box that can enclose them both; D \ (A ∪ B) denotes the part of D that remains after removing the union of A and B.
the above-mentioned binary cross entropy function f etrp () Is shown below:
Figure BDA00030971534000000313
Where u is the candidate frame parameter,
Figure BDA00030971534000000314
is the label box parameter.
The above confidence score C_{i,j} is calculated as follows:

C_{i,j} = p_{i,j} × r_GIOU(B_{i,j}, B̂_i)

where p_{i,j} is the probability that the j-th candidate box of the i-th grid is predicted to contain a target, and r_GIOU(B_{i,j}, B̂_i) is the r_GIOU value between B_{i,j} and B̂_i, where B_{i,j} denotes the j-th candidate box of the i-th grid and B̂_i the label box corresponding to the i-th grid.
The improved YOLOv3 network of the invention uses the k-means clustering method, with 1 − r_IOU as the distance measure, to cluster the label-box sizes in the SAR image training data set, obtaining 9 groups of prior anchor boxes as initial values for the candidate-box size optimization.
The improved YOLOv3 network of the invention improves the box regression loss function and the confidence loss function, thereby optimizing the overall loss function.
The improved YOLOv3 network of the invention introduces an r_GIOU-based similarity measure to update the calculation of the confidence score, which better captures the goodness of fit between two bounding boxes.
Beneficial effects: compared with the conventional YOLOv3 network, the improved YOLOv3 network provided by the invention converges faster in training. In terms of SAR image target detection indices, it produces fewer false detections and missed detections, and is better suited to target detection when the image background is complex and target sizes vary.
Drawings
FIG. 1 is a block diagram of an improved YOLOv3 network architecture;
FIG. 2 is a comparison of false alarms in ship target detection in a harbor scene; (a) detection result of the conventional YOLOv3 network; (b) detection result of the improved YOLOv3 network;
FIG. 3 is a comparison of missed detections in ship target detection in a harbor scene; (a) detection result of the conventional YOLOv3 network; (b) detection result of the improved YOLOv3 network;
FIG. 4 is a comparison of the loss function curves of the conventional YOLOv3 network and the improved YOLOv3 network.
Detailed Description
The invention will be further elucidated below with reference to the drawings and specific embodiments. It should be understood that these examples are intended only to illustrate the invention and not to limit its scope. After reading this disclosure, various equivalent modifications made by those skilled in the art all fall within the scope defined by the appended claims of this application.
The invention discloses a SAR image target detection method based on an improved YOLOv3 network, whose structure is shown in FIG. 1. The SAR image training data set is input into a Darknet53 network, which performs feature extraction on each SAR image to obtain its basic feature map. Feature extraction and feature fusion are applied to the basic feature map at three scales (large, medium and small) to obtain a multi-scale feature map. The multi-scale feature map is input into a prediction network, and the candidate-box parameters are adjusted and optimized. The optimized candidate-box parameters and the label-box parameters are substituted into a loss function to compute the loss value of the current network. Based on the obtained loss value, the network parameters are updated through back propagation, and training is repeated until the network parameters converge. Finally, a SAR image test data set is input for the network test, which outputs detection boxes matched one-to-one with targets, together with various detection indices.
The following specific embodiment takes SAR image ship target detection as a concrete example; the specific steps are as follows:
Step 1: prepare a SAR image training data set, resizing all images to 416 × 416; attach label-box parameters to each known target in the training data set, comprising: box center coordinates, box width and height, object category and confidence; input the SAR image training data set into a Darknet53 network and perform feature extraction on each SAR image to obtain its basic feature map.
Step 2: corresponding to the three grid divisions of the SAR image (13 × 13, 26 × 26 and 52 × 52), perform feature extraction and feature fusion on the basic feature map at the large, medium and small scales to obtain a multi-scale feature map of the SAR image; each SAR image grid cell is given candidate boxes of three sizes, each candidate box having the following parameters: box center coordinates, box width and height, object category and confidence.
Step 3: input the multi-scale feature map into the prediction network, and adjust and optimize the candidate-box parameters.
Before the candidate-box parameters are optimized they must be initialized; the initial values of the box width and height are computed as follows (a minimal clustering sketch is given after this list):
1) Align the centers of all label boxes in the training data set, and list the size data of all label boxes.
2) Cluster the label-box sizes with the k-means clustering method, using the distance metric d computed as:

d(A, B) = 1 − r_IOU(A, B)

r_IOU(A, B) = |A ∩ B| / |A ∪ B|

where A and B are two different bounding boxes, r_IOU denotes the intersection-over-union operation, r_IOU(A, B) is the intersection-over-union between bounding boxes A and B, ∩ denotes the intersection operation, and ∪ denotes the union operation.
3) Extract the 9 most densely populated sizes to obtain 9 groups of prior anchor boxes serving as initial values of the candidate-box sizes. Grids exist at three scales (large, medium and small), each scale of grid needs candidate boxes of 3 sizes, so there are 9 candidate-box sizes in total.
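To make steps 1)–3) concrete, the following is a minimal NumPy sketch of k-means over center-aligned label-box sizes with d = 1 − r_IOU as the distance. It is only an illustration of the clustering technique, not the patent's implementation: the function names are invented for this sketch, and the box sizes are random stand-ins for a real list of training label boxes.

```python
import numpy as np

def iou_centered(wh, centers):
    """IoU between center-aligned boxes given only (width, height).

    wh:      (N, 2) label-box sizes
    centers: (k, 2) current cluster centers
    returns: (N, k) IoU matrix
    """
    inter = (np.minimum(wh[:, None, 0], centers[None, :, 0]) *
             np.minimum(wh[:, None, 1], centers[None, :, 1]))
    union = ((wh[:, 0] * wh[:, 1])[:, None] +
             (centers[:, 0] * centers[:, 1])[None, :] - inter)
    return inter / union

def kmeans_anchors(wh, k=9, iters=100, seed=0):
    """k-means with distance d(A, B) = 1 - IoU(A, B), as described above."""
    rng = np.random.default_rng(seed)
    centers = wh[rng.choice(len(wh), k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_centered(wh, centers), axis=1)  # min distance = max IoU
        new_centers = np.array([wh[assign == j].mean(axis=0) if np.any(assign == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers[np.argsort(centers[:, 0] * centers[:, 1])]  # sort by area

# Stand-in label-box sizes (pixels); a real run would list every training label box.
wh = np.random.default_rng(1).uniform(8, 200, size=(500, 2))
anchors = kmeans_anchors(wh)
# Smallest 3 anchors go to the 52x52 grid, middle 3 to 26x26, largest 3 to 13x13.
print(anchors.reshape(3, 3, 2))
```

Because the boxes are center-aligned, the IoU depends only on widths and heights. The 9 resulting anchors, sorted by area, would then be split three per scale, smallest to the finest grid and largest to the coarsest.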
Step 4: substitute the optimized candidate-box parameters and the label-box parameters into the loss function, and calculate the loss value of the current network.
The loss function used is expressed as follows:

l_total = l_box + l_cls + l_obj

where l_box denotes the box regression loss, l_cls the classification loss, and l_obj the confidence loss.

The box regression loss l_box is expressed as follows:

l_box = λ_coord · Σ_{i=0}^{S²−1} Σ_{j=0}^{J−1} I_{i,j}^{obj} · (2 − ŵ_i · ĥ_i) · (1 − r_GIOU(B_{i,j}, B̂_i))

where λ_coord is a weight factor; each image is divided into S × S grids, and each grid has J candidate boxes; ŵ_i and ĥ_i are respectively the width and height of the label box corresponding to the i-th grid; I_{i,j}^{obj} indicates whether the j-th candidate box of the i-th grid contains a target, taking the value 1 if it does and 0 otherwise; r_GIOU(B_{i,j}, B̂_i) denotes the r_GIOU value between B_{i,j} and B̂_i, where B_{i,j} is the j-th candidate box of the i-th grid and B̂_i is the label box corresponding to the i-th grid.
The classification loss l_cls is expressed as follows:

l_cls = Σ_{i=0}^{S²−1} Σ_{j=0}^{J−1} I_{i,j}^{obj} · f_etrp(p_{i,j}, p̂_i)

where p_{i,j} is the probability that the j-th candidate box of the i-th grid is predicted to contain a target, p̂_i is the probability that the label box corresponding to the i-th grid contains a target, and f_etrp() denotes the binary cross-entropy function.
The confidence loss l_obj is expressed as follows:

l_obj = Σ_{i=0}^{S²−1} Σ_{j=0}^{J−1} [ I_{i,j}^{obj} · f_etrp(C_{i,j}, Ĉ_i) + λ_noobj · I_{i,j}^{noobj} · f_etrp(C_{i,j}, Ĉ_i) ]

where λ_noobj is a weight factor; I_{i,j}^{noobj} indicates whether the j-th candidate box of the i-th grid contains no target, taking the value 1 if it contains no target and 0 otherwise; C_{i,j} is the confidence score of the j-th candidate box of the i-th grid, and Ĉ_i is the confidence score of the label box corresponding to the i-th grid.
The above r_GIOU is calculated as follows:

r_GIOU(A, B) = r_IOU(A, B) − |D \ (A ∪ B)| / |D|

where D is the smallest closed convex region containing bounding box A and bounding box B, i.e. the smallest box that can enclose them both; D \ (A ∪ B) denotes the part of D that remains after removing the union of A and B. A sketch of this computation for axis-aligned boxes follows.
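For axis-aligned boxes, the smallest closed convex region D is simply the smallest enclosing rectangle, so r_GIOU can be computed directly. Below is a minimal pure-Python sketch; the (x1, y1, x2, y2) box format and function name are assumptions of this illustration, not mandated by the patent.

```python
def r_iou_giou(a, b):
    """IoU and GIoU of two axis-aligned boxes a, b = (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union
    # D: smallest axis-aligned box enclosing both a and b
    dx1, dy1 = min(a[0], b[0]), min(a[1], b[1])
    dx2, dy2 = max(a[2], b[2]), max(a[3], b[3])
    area_d = (dx2 - dx1) * (dy2 - dy1)
    giou = iou - (area_d - union) / area_d   # r_GIOU = r_IOU - |D \ (A∪B)| / |D|
    return iou, giou

# Disjoint boxes still get a meaningful (negative) GIoU, unlike IoU, which is 0:
print(r_iou_giou((0, 0, 2, 2), (3, 3, 5, 5)))   # -> (0.0, -0.68)
```

Two disjoint boxes always have r_IOU = 0 regardless of how far apart they are, whereas r_GIOU keeps decreasing toward −1 with distance, which is what makes it a better goodness-of-fit measure for both the loss and the confidence score.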
The above binary cross-entropy function f_etrp() is expressed as follows:

f_etrp(u, û) = −[û · ln(u) + (1 − û) · ln(1 − u)]

where u is the candidate-box parameter and û is the corresponding label-box parameter.
The above confidence score C_{i,j} is calculated as follows:

C_{i,j} = p_{i,j} × r_GIOU(B_{i,j}, B̂_i)

where the symbols are as defined above. A worked numerical illustration of the loss terms and this score follows.
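As a worked illustration of the three loss terms and the confidence score for a single grid cell, consider the sketch below. It is an assumption-laden toy: the weight factors λ_coord = 5 and λ_noobj = 0.5 are conventional YOLO values not specified in the patent, the coordinates are normalized, and giou stands in for the r_GIOU value computed as in the previous sketch.

```python
import math

def f_etrp(u, u_hat, eps=1e-9):
    """Binary cross entropy: f_etrp(u, u_hat) = -[u_hat*ln(u) + (1-u_hat)*ln(1-u)]."""
    u = min(max(u, eps), 1.0 - eps)           # clamp to avoid log(0)
    return -(u_hat * math.log(u) + (1.0 - u_hat) * math.log(1.0 - u))

# One (i, j) pair where the candidate box contains a target (I_obj = 1, I_noobj = 0).
w_hat, h_hat = 0.30, 0.36       # normalized width/height of the label box B_hat_i
giou = 0.78                     # r_GIOU(B_ij, B_hat_i), e.g. from r_iou_giou() above
p_ij = 0.90                     # predicted probability that B_ij contains a target
c_ij = p_ij * giou              # confidence score C_ij = p_ij * r_GIOU(B_ij, B_hat_i)

lambda_coord, lambda_noobj = 5.0, 0.5   # assumed weight factors, not from the patent
i_obj, i_noobj = 1, 0

l_box = lambda_coord * i_obj * (2 - w_hat * h_hat) * (1 - giou)
l_cls = i_obj * f_etrp(p_ij, 1.0)                    # label box contains a target
l_obj = i_obj * f_etrp(c_ij, 1.0) + lambda_noobj * i_noobj * f_etrp(c_ij, 0.0)
print(f"l_box={l_box:.4f}  l_cls={l_cls:.4f}  l_obj={l_obj:.4f}")
```

Summing such contributions over all S² grid cells and J candidate boxes gives l_total = l_box + l_cls + l_obj for the image.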
Step 5: update the network parameters by back propagation based on the loss value, and train repeatedly until the network parameters converge.
Step 6: prepare a SAR image test data set, resizing all images to 416 × 416, for testing the improved YOLOv3 network. Input the test data set into the network and process it until candidate boxes with optimized parameters are obtained. Using the non-maximum suppression (NMS) algorithm, delete candidate boxes that belong to the same target class but have high overlap and lower confidence, obtaining detection boxes matched one-to-one with the targets. Finally, compute detection indices such as recall and precision, and output the detection results.
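A minimal pure-Python sketch of this class-wise NMS step is given below. The 0.5 IoU threshold matches the setting used in this embodiment; the box format and names are this sketch's own, not the patent's code.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes a, b = (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression within one target class.

    Keeps the highest-scoring boxes and drops any box whose IoU with an
    already-kept box exceeds iou_thresh; returns indices of kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep

# Three overlapping ship detections; the weaker duplicate is suppressed.
boxes = [(10, 10, 50, 40), (12, 11, 52, 42), (200, 80, 240, 110)]
scores = [0.92, 0.85, 0.77]
print(nms(boxes, scores))   # -> [0, 2]
```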
The performance evaluation indices include the precision r_P, the recall r_R, and their harmonic mean F_1, calculated respectively as:

r_P = N_TP / (N_TP + N_FP)

r_R = N_TP / (N_TP + N_FN)

F_1 = 2 · r_P · r_R / (r_P + r_R)

where N_TP is the number of correctly detected targets, N_FP the number of falsely detected targets, and N_FN the number of missed targets.
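These three indices reduce to simple counts; a minimal sketch (the counts below are illustrative, not results from the patent):

```python
def detection_metrics(n_tp, n_fp, n_fn):
    """Precision r_P, recall r_R and harmonic mean F1 from detection counts."""
    r_p = n_tp / (n_tp + n_fp)          # r_P = N_TP / (N_TP + N_FP)
    r_r = n_tp / (n_tp + n_fn)          # r_R = N_TP / (N_TP + N_FN)
    f1 = 2 * r_p * r_r / (r_p + r_r)    # F1 = 2 r_P r_R / (r_P + r_R)
    return r_p, r_r, f1

# Illustrative counts: 90 ships found, 5 false alarms, 8 ships missed.
print(detection_metrics(n_tp=90, n_fp=5, n_fn=8))
```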
Embodiment:
In this embodiment, the high-resolution SAR ship target detection data set AIR-SARShip-2.0 is used to verify the improved YOLOv3 network proposed by the invention. The data set, released by the Journal of Radars in 2020, comprises 300 SAR images in total, with image resolutions of 1 m and 3 m. The images are about 1000 × 1000 pixels, in single-channel Tiff format with 8/16-bit depth, and each annotation file provides the length and width of the corresponding image, the class of each labeled target, and the position of its labeled rectangular box. The SAR images in the data set cover different scenes such as harbors, island reefs and open sea, and include different ship targets such as transport ships, oil tankers and fishing vessels; each image contains a varying number of ship targets.
In this embodiment, the data set is first expanded using methods such as flipping, translation and brightness adjustment, yielding 1500 SAR images in total, and a corresponding label file is created for each SAR image. 1200 randomly drawn SAR images are used as the training data set, and the remaining 300 as the test data set.
The network parameters of the improved YOLOv3 network are then set. The model is initialized with COCO pretrained weights; the moving-average decay rate is set to 0.9995, the initial learning rate to 1e-4 and the final learning rate to 1e-6; during upsampling, feature maps are enlarged by a factor of two using nearest-neighbor interpolation. The batch size is set to 6 and training is performed in two stages: the first stage trains part of the network for 20 epochs, and the second stage trains the whole network for 30 epochs. The IoU threshold is set to 0.5, and the number of classes is 1, i.e. the ship class. A schematic sketch of this schedule follows.
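The hyperparameter settings above can be laid out schematically as below. This is a hedged sketch, not the patent's training code: the decay from 1e-4 to 1e-6 is assumed linear over the 50 total epochs (the patent does not state the schedule shape), and train_one_epoch is a hypothetical stub.

```python
def lr_at(epoch, total_epochs=50, lr_init=1e-4, lr_final=1e-6):
    """Assumed linear interpolation between the stated initial and final rates."""
    t = epoch / (total_epochs - 1)
    return lr_init + t * (lr_final - lr_init)

def train_one_epoch(stage, epoch, lr):
    # Hypothetical stub standing in for one epoch of forward/backward passes.
    print(f"stage={stage:6s} epoch={epoch:2d} lr={lr:.2e} batch_size=6 ema_decay=0.9995")

# Two-stage schedule from the embodiment: 20 epochs of local (partial-network)
# training followed by 30 epochs of global (whole-network) training.
for epoch in range(20):
    train_one_epoch("local", epoch, lr_at(epoch))
for epoch in range(20, 50):
    train_one_epoch("global", epoch, lr_at(epoch))
```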
The experimental results show that the precision of the improved YOLOv3 network is 1.4 percentage points higher than that of the conventional YOLOv3 network, indicating that the improved network reduces the false detection rate. FIG. 2 shows the ship target detection results of the two networks against a complex harbor background, where (a) is the detection result of the conventional YOLOv3 network and (b) is that of the improved YOLOv3 network. It can be seen that, in the same scene, the conventional YOLOv3 network produces two false alarms while the improved YOLOv3 network produces none.
In addition, the recall of the improved YOLOv3 network is 1.17 percentage points higher than that of the conventional YOLOv3 network, indicating a lower missed-detection rate and a stronger ability to detect small ship targets that are easily missed. As shown in FIG. 3, panel (a) shows the detection result of the conventional YOLOv3 network and panel (b) that of the improved YOLOv3 network; the circle in panel (a) marks a ship target that was not detected, which is detected in panel (b).
FIG. 4 shows the loss function curves obtained when training and testing with the conventional YOLOv3 network and the improved YOLOv3 network. It can be seen that the improved YOLOv3 network converges faster than the conventional one, and its loss function settles at a lower value after stabilizing.
The above description covers only the preferred embodiments of the present invention. It should be noted that those skilled in the art can make various modifications and adaptations without departing from the principles of the invention, and these shall also be regarded as falling within the scope of the invention.

Claims (4)

1. A SAR image target detection method based on an improved YOLOv3 network, characterized in that the method comprises the following steps:
Step 1: prepare a SAR image training data set in which every known target carries label-box parameters, comprising: box center coordinates, box width and height, object category and confidence; input the SAR image training data set into a Darknet53 network and perform feature extraction on each SAR image to obtain its basic feature map;
Step 2: corresponding to the three grid divisions of the SAR image at large, medium and small sizes, perform feature extraction and feature fusion on the basic feature map at the three scales to obtain a multi-scale feature map of the SAR image; each SAR image grid cell is given candidate boxes of three sizes, each candidate box having the following parameters: box center coordinates, box width and height, object category and confidence;
Step 3: input the multi-scale feature map into a prediction network, and adjust and optimize the candidate-box parameters;
Step 4: substitute the optimized candidate-box parameters and the label-box parameters into a loss function, and calculate the loss value of the current network;
Step 5: based on the obtained loss value, update the network parameters by back propagation, and train repeatedly until the network parameters converge;
Step 6: input a SAR image test data set, run the network test, and output detection boxes matched one-to-one with targets, together with various detection indices.
2. The SAR image target detection method based on the improved YOLOv3 network according to claim 1, characterized in that: in step 3, the candidate-box parameters must be initialized before they are optimized; the initial values of the box width and height parameters are computed as follows:
(1) Align the centers of all label boxes in the training data set, and list the size data of all label boxes;
(2) Cluster the label-box sizes with the k-means clustering method, using the distance metric d computed as:

d(A, B) = 1 − r_IOU(A, B)

r_IOU(A, B) = |A ∩ B| / |A ∪ B|

where A and B are two different bounding boxes, r_IOU denotes the intersection-over-union operation, r_IOU(A, B) is the intersection-over-union between bounding boxes A and B, ∩ denotes the intersection operation, and ∪ denotes the union operation;
(3) Extract the 9 most densely populated sizes to obtain 9 groups of prior anchor boxes as initial values of the candidate-box sizes; grids exist at three scales (large, medium and small), each scale of grid is given candidate boxes of 3 sizes, so there are candidate boxes of 9 sizes in total.
3. The SAR image target detection method based on the improved YOLOv3 network according to claim 1, characterized in that: in step 4, the loss function is expressed as follows:

l_total = l_box + l_cls + l_obj

where l_box denotes the box regression loss, l_cls the classification loss, and l_obj the confidence loss;

the box regression loss l_box is expressed as follows:

l_box = λ_coord · Σ_{i=0}^{S²−1} Σ_{j=0}^{J−1} I_{i,j}^{obj} · (2 − ŵ_i · ĥ_i) · (1 − r_GIOU(B_{i,j}, B̂_i))

where λ_coord is a weight factor; each image is divided into S × S grids, and each grid has J candidate boxes; ŵ_i and ĥ_i are respectively the width and height of the label box corresponding to the i-th grid; I_{i,j}^{obj} indicates whether the j-th candidate box of the i-th grid contains a target, taking the value 1 if it does and 0 otherwise; r_GIOU(B_{i,j}, B̂_i) denotes the r_GIOU value between B_{i,j} and B̂_i, where B_{i,j} is the j-th candidate box of the i-th grid and B̂_i is the label box corresponding to the i-th grid;

the classification loss l_cls is expressed as follows:

l_cls = Σ_{i=0}^{S²−1} Σ_{j=0}^{J−1} I_{i,j}^{obj} · f_etrp(p_{i,j}, p̂_i)

where p_{i,j} is the probability that the j-th candidate box of the i-th grid is predicted to contain a target, p̂_i is the probability that the label box corresponding to the i-th grid contains a target, and f_etrp() denotes the binary cross-entropy function;

the confidence loss l_obj is expressed as follows:

l_obj = Σ_{i=0}^{S²−1} Σ_{j=0}^{J−1} [ I_{i,j}^{obj} · f_etrp(C_{i,j}, Ĉ_i) + λ_noobj · I_{i,j}^{noobj} · f_etrp(C_{i,j}, Ĉ_i) ]

where λ_noobj is a weight factor; I_{i,j}^{noobj} indicates whether the j-th candidate box of the i-th grid contains no target, taking the value 1 if it contains no target and 0 otherwise; C_{i,j} is the confidence score of the j-th candidate box of the i-th grid, and Ĉ_i is the confidence score of the label box corresponding to the i-th grid;

the above r_GIOU is calculated as follows:

r_GIOU(A, B) = r_IOU(A, B) − |D \ (A ∪ B)| / |D|

where D is the smallest closed convex region containing bounding box A and bounding box B, i.e. the smallest box that can enclose them both, and D \ (A ∪ B) denotes the part of D that remains after removing the union of A and B;

the above binary cross-entropy function f_etrp() is expressed as follows:

f_etrp(u, û) = −[û · ln(u) + (1 − û) · ln(1 − u)]

where u is the candidate-box parameter and û is the corresponding label-box parameter.
4. The SAR image target detection method based on the improved YOLOv3 network according to claim 3, characterized in that: the confidence score C_{i,j} is calculated as follows:

C_{i,j} = p_{i,j} × r_GIOU(B_{i,j}, B̂_i)

where p_{i,j} is the probability that the j-th candidate box of the i-th grid is predicted to contain a target; r_GIOU(B_{i,j}, B̂_i) denotes the r_GIOU value between B_{i,j} and B̂_i, where B_{i,j} is the j-th candidate box of the i-th grid, and B̂_i is the label box corresponding to the i-th grid.
CN202110613778.XA (priority 2021-06-02, filed 2021-06-02) — SAR image target detection method based on improved YOLOv3 network — Active — granted as CN113313128B (en)

Priority Applications / Applications Claiming Priority (1)

CN202110613778.XA — priority 2021-06-02, filed 2021-06-02 — SAR image target detection method based on improved YOLOv3 network

Publications (2)

CN113313128A — published 2021-08-27
CN113313128B — granted 2022-10-28

Family

ID: 77377202

Family Applications (1)

CN202110613778.XA — SAR image target detection method based on improved YOLOv3 network — priority 2021-06-02, filed 2021-06-02 — status: Active

Country Status (1)

CN — CN113313128B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party

CN114092739B * — priority 2021-11-02, granted 2023-06-30 — 北京百度网讯科技有限公司 — Image processing method, apparatus, device, storage medium, and program product
CN113903009B * — priority 2021-12-10, granted 2022-07-05 — 华东交通大学 — Railway foreign matter detection method and system based on improved YOLOv3 network

Patent Citations (2)

* Cited by examiner, † Cited by third party

CN111401148A * — priority 2020-02-27, published 2020-07-10 — 江苏大学 — Road multi-target detection method based on improved multilevel YOLOv3
CN111723748A * — priority 2020-06-22, published 2020-09-29 — 电子科技大学 — Infrared remote sensing image ship detection method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party

张筱晗 et al., "双向特征融合的数据自适应SAR图像舰船目标检测模型" [Data-adaptive SAR image ship target detection model with bidirectional feature fusion], 《中国图象图形学报》 [Journal of Image and Graphics], No. 09, 2020-09-16, full text *
刘洁瑜 et al., "基于RetinaNet的SAR图像舰船目标检测" [SAR image ship target detection based on RetinaNet], 《湖南大学学报(自然科学版)》 [Journal of Hunan University (Natural Sciences)], No. 02, 2020-02-25, full text *

Also Published As

CN113313128A — published 2021-08-27

Similar Documents

Publication Publication Date Title
US11429818B2 (en) Method, system and device for multi-label object detection based on an object detection network
CN109977918B (en) Target detection positioning optimization method based on unsupervised domain adaptation
CN109871902B (en) SAR small sample identification method based on super-resolution countermeasure generation cascade network
Mahmoud et al. Object detection using adaptive mask RCNN in optical remote sensing images
CN111898432B (en) Pedestrian detection system and method based on improved YOLOv3 algorithm
CN113313128B (en) SAR image target detection method based on improved YOLOv3 network
CN109033944B (en) Method and system for classifying all-sky aurora images and positioning key local structure
CN110991257B (en) Polarized SAR oil spill detection method based on feature fusion and SVM
CN112507896B (en) Method for detecting cherry fruits by adopting improved YOLO-V4 model
CN111931953A (en) Multi-scale characteristic depth forest identification method for waste mobile phones
CN113159215A (en) Small target detection and identification method based on fast Rcnn
CN115019039A (en) Example segmentation method and system combining self-supervision and global information enhancement
CN116071389A (en) Front background matching-based boundary frame weak supervision image segmentation method
Zhang et al. Nearshore vessel detection based on Scene-mask R-CNN in remote sensing image
CN113657414B (en) Object identification method
Li et al. Insect detection and counting based on YOLOv3 model
Chen et al. Ship detection with optical image based on attention and loss improved YOLO
Yang et al. SAR image target detection and recognition based on deep network
Zhang et al. Traffic Sign Detection and Recognition Based on Deep Learning.
Li et al. Exploring label probability sequence to robustly learn deep convolutional neural networks for road extraction with noisy datasets
CN117218545A (en) LBP feature and improved Yolov 5-based radar image detection method
CN117173547A (en) Underwater target detection method based on improved YOLOv6 algorithm
CN116385876A (en) Optical remote sensing image ground object detection method based on YOLOX
Wang et al. FPA-DNN: a forward propagation acceleration based deep neural network for ship detection
CN114283323A (en) Marine target recognition system based on image deep learning

Legal Events

PB01 — Publication
SE01 — Entry into force of request for substantive examination
GR01 — Patent grant