CN115272846A - Improved Oriented RCNN-based rotating target detection method - Google Patents

Improved Oriented RCNN-based rotating target detection method

Info

Publication number
CN115272846A
CN115272846A
Authority
CN
China
Prior art keywords
improved
inputting
module
training
oriented rcnn
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210827268.7A
Other languages
Chinese (zh)
Inventor
王友伟
郭颖
邵香迎
鲍正位
王季宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology filed Critical Nanjing University of Information Science and Technology
Priority to CN202210827268.7A priority Critical patent/CN115272846A/en
Publication of CN115272846A publication Critical patent/CN115272846A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/32Normalisation of the pattern dimensions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of rotating-image target detection, and in particular to a rotating target detection method based on an improved Oriented RCNN, comprising the following steps: inputting an image; image preprocessing: adjusting each picture to a fixed size, normalizing the fixed-size pictures, and dividing them into a training set, a validation set, and a test set; inputting a network model for training: inputting the training set into the improved Oriented RCNN model for training; and inputting a test set and outputting detection results. The method defines the rotated anchor box with a six-parameter scheme different from the prior art, uses different polarization functions to extract the distinct features required by the classification and localization tasks, and introduces an SPP module to fuse local and global features. It can thus overcome the detection interference caused by the mismatch between the features required for classification and regression, effectively extract the different features required by different tasks, and classify and localize remote sensing image targets more accurately.

Description

Improved Oriented RCNN-based rotating target detection method
Technical Field
The invention relates to the technical field of rotating-image target detection, and in particular to a rotating target detection method based on an improved Oriented RCNN.
Background
Remote sensing image target detection is a fundamental task in remote sensing image processing; its goal is to automatically locate regions of interest in a given remote sensing image data set and assign each target a specific category. However, remote sensing targets appear in arbitrary orientations and at small scales, and conventional detectors use horizontal anchor boxes that cannot fit such targets tightly.
To predict the target orientation accurately, many researchers have proposed rotated target detection, which introduces orientation parameters into the RPN module and generates oriented anchor boxes for regression and classification. For example, the Rotation RPN places 54 anchor boxes of different angles, scales, and aspect ratios at each anchor point, which can improve accuracy when oriented objects are sparsely distributed. Xie X. et al proposed Oriented RCNN to reduce the computational cost and further improve accuracy.
To obtain accurate rotated-target information, researchers have introduced rotated anchor boxes into existing detection models for more precise localization. For example, the patent application published as CN112800955A, entitled "Method and system for detecting remote sensing image rotating targets based on a weighted bidirectional feature pyramid", discloses such a method and introduces a BiFPN to enhance the model's cross-scale feature fusion. The patent application published as CN110378242A, entitled "A remote sensing target detection method with a dual attention mechanism", discloses a method that re-weights the feature map using a dual attention mechanism.
However, because classification and regression require different features, existing rotated-image target detection methods struggle to accurately extract the distinct features each task needs.
Disclosure of Invention
The present invention aims to provide a rotating target detection method based on an improved Oriented RCNN, so as to solve the problems described in the background art.
The technical scheme of the invention is as follows: a rotating target detection method based on an improved Oriented RCNN comprises the following steps:
step 1, inputting an image: selecting a remote sensing image data set whose annotation files contain orientation information as the input images, and randomly flipping and padding the input images;
step 2, image preprocessing: adjusting each picture to a fixed 1024 × 1024 size, normalizing the fixed-size pictures, and dividing them into a training set, a validation set, and a test set;
step 3, inputting a network model for training: inputting the training set of step 2 into the improved Oriented RCNN model for training;
step 4, inputting a test set and outputting detection results: detecting remote sensing images with the trained improved Oriented RCNN model to obtain result images in which targets are framed by rotated boxes;
wherein training the improved Oriented RCNN model in step 3 comprises: inputting the training set into the backbone network ResNet50 for feature extraction to obtain features C2-C5 of different sizes; and inputting the extracted features into the SPP-FPN module for feature fusion to obtain a feature map, wherein the SPP-FPN module passes C5, the deepest output of the backbone network, through the SPP module to obtain M5.
Preferably, the specific operation of step 3 includes:
step 3.1, inputting the training set into the backbone network ResNet50 for feature extraction to obtain features C2-C5 of different sizes;
step 3.2, inputting the extracted features into the SPP-FPN module for feature fusion to obtain a feature map;
step 3.3, inputting the feature map into the rotated-proposal generation module Oriented RPN, which outputs proposal regions (proposals) after encoding and decoding;
and step 3.4, inputting the feature map obtained in step 3.2 and the proposals obtained in step 3.3 into the improved detection head module PAM-head, performing the final classification and localization operations, and outputting the remote sensing target recognition and localization result.
Preferably, in step 3.2, the specific workflow of the SPP-FPN module includes: passing C5, the deepest output of the backbone network, through the SPP module to obtain M5; summing element-wise the upsampled M5 and the lateral connection of C4 to obtain M4; summing element-wise the upsampled M4 and the lateral connection of C3 to obtain M3; repeating this process to obtain M2-M5; and applying a 3 × 3 convolution to each of M2-M5 to obtain the improved FPN outputs P2-P5.
Preferably, the SPP module fuses local and global features: it processes the feature map with pooling windows of different sizes and finally concatenates the results to obtain the output.
Preferably, in step 3.3, the specific operation of the rotated-proposal generation module Oriented RPN includes: convolving the feature map output in step 3.2 so that its channel number becomes 6A, where A is the number of anchor boxes generated at each anchor point and 6 indicates that six parameters are needed to define a rotated anchor box. The six parameters are (x, y, w, h, Δα, Δβ), where x and y are the center coordinates of the generated horizontal anchor box, w and h are its width and height, and Δα and Δβ are the offsets between two adjacent vertices of the rotated anchor box and the midpoints of two adjacent edges of the horizontal anchor box.
Preferably, the specific operation of the improved detection head PAM-head module in step 3.4 includes: the input feature map is processed by the polarized attention module PAM, which generates different feature pyramids for the classification and localization tasks; this avoids feature interference between tasks and effectively extracts the distinct key features each task requires. The resulting features are sent to fully connected layers for classification and regression, and the final classification and localization results are output.
Preferably, the polarized attention module PAM has a dual-branch structure: the input feature map passes through an attention module (a channel attention module in parallel with a spatial attention module) and then through different feature representation functions. The classification branch uses an excitation function to obtain high-response global features, while the localization branch uses a suppression function to focus only on boundary features and suppress irrelevant highly activated regions.
Preferably, the experimental configuration for training the model of the rotating target detection method includes: the experimental environment is Python 3.8, PyTorch 1.7.0, and Torchvision 0.7.0 with a batch size of 2; the initial learning rate is set to 0.001; the maximum number of training epochs is 12; and the learning rate is reduced to 1 × 10⁻⁴ and 1 × 10⁻⁵ after the 9th and 11th epochs, respectively.
Preferably, the experimental hardware for training the model of the rotating target detection method is an Intel® Core™ i9-10900X CPU with an NVIDIA RTX 3080Ti graphics card.
Preferably, the input image size is adjusted to 1024 × 1024 pixels, and the per-class accuracy AP and the mean accuracy mAP over all target classes in the data set are used as the evaluation metrics of the experiment.
The invention provides a rotating target detection method based on an improved Oriented RCNN which, compared with the prior art, offers the following improvements and advantages:
the method uses a six-parameter method different from the prior art to define the rotating anchor frame, uses different polarization functions to respectively extract different characteristics required by a classification task and a positioning task, and introduces the SPP module to realize the fusion between local characteristics and global characteristics, so that the detection interference caused by the inconsistency of the required characteristics between classification and regression can be overcome, different characteristics required by different tasks are effectively extracted, and the remote sensing image target can be classified and positioned more accurately.
Drawings
The invention is further explained below with reference to the figures and examples:
FIG. 1 is a flow chart of the overall network framework of the present invention.
FIG. 2 is a diagram showing the structure of SPP-FPN in the present invention.
FIG. 3 is a block diagram of the SPP module in the present invention.
FIG. 4 is a diagram of the 6 parameters representing anchor boxes in the Oriented RPN of the present invention.
FIG. 5 is a structural diagram of PAM-head in the present invention.
FIG. 6 is a schematic diagram of a detection result of a remote sensing image obtained by the present invention.
Detailed Description
The present invention is described in detail below, and the technical solutions in its embodiments are described clearly and completely. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The invention provides a rotating target detection method based on an improved Oriented RCNN, whose technical scheme is as follows:
As shown in FIG. 1, a rotating target detection method based on an improved Oriented RCNN comprises the following steps:
step 1, inputting an image: selecting a remote sensing image data set whose annotation files contain orientation information as the input images, and randomly flipping and padding the input images;
step 2, image preprocessing: adjusting each picture to a fixed 1024 × 1024 size, normalizing the fixed-size pictures, and dividing them into a training set, a validation set, and a test set;
step 3, inputting a network model for training: inputting the training set of step 2 into the improved Oriented RCNN model for training;
step 4, inputting a test set and outputting detection results: detecting remote sensing images with the trained improved Oriented RCNN model to obtain result images in which targets are framed by rotated boxes, as shown in FIG. 6;
wherein training the improved Oriented RCNN model in step 3 comprises: inputting the training set into the backbone network ResNet50 for feature extraction to obtain features C2-C5 of different sizes; and inputting the extracted features into the SPP-FPN module for feature fusion to obtain a feature map, wherein the SPP-FPN module passes C5, the deepest output of the backbone network, through the SPP module to obtain M5.
The specific operation of step 3 includes:
step 3.1, inputting the training set into the backbone network ResNet50 for feature extraction to obtain features C2-C5 of different sizes;
step 3.2, inputting the extracted features into the SPP-FPN module for feature fusion to obtain a feature map;
step 3.3, inputting the feature map into the rotated-proposal generation module Oriented RPN, which outputs proposal regions (proposals) after encoding and decoding;
and step 3.4, inputting the feature map obtained in step 3.2 and the proposals obtained in step 3.3 into the improved detection head module PAM-head, performing the final classification and localization operations, and outputting the remote sensing target recognition and localization result.
In step 3.2, the specific workflow of the SPP-FPN module includes: passing C5, the deepest output of the backbone network, through the SPP module to obtain M5; summing element-wise the upsampled M5 and the lateral connection of C4 to obtain M4; summing element-wise the upsampled M4 and the lateral connection of C3 to obtain M3; and so on to obtain M2-M5; a 3 × 3 convolution is then applied to each of M2-M5 to obtain the improved FPN outputs P2-P5, as shown in FIG. 2.
Further, the SPP module fuses local and global features: it processes the feature map with pooling windows of different sizes and finally concatenates the results to obtain the output, as shown in FIG. 3.
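To make the data flow concrete, the following is a minimal PyTorch sketch of the SPP-FPN described above. The pooling window sizes, channel widths, and module names are illustrative assumptions; the patent fixes only the overall structure (M5 = SPP(C5), top-down element-wise summation with lateral connections, and 3 × 3 output convolutions). The default in_channels assume ResNet50's C2-C5 channel widths of 256, 512, 1024, and 2048.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SPP(nn.Module):
    """Spatial pyramid pooling: pool the input at several window sizes and
    concatenate the results with the input, fusing local and global context."""
    def __init__(self, in_channels, out_channels, pool_sizes=(5, 9, 13)):
        super().__init__()
        self.pools = nn.ModuleList(
            nn.MaxPool2d(k, stride=1, padding=k // 2) for k in pool_sizes
        )
        # 1x1 convolution restores the channel count after concatenation
        self.fuse = nn.Conv2d(in_channels * (len(pool_sizes) + 1), out_channels, 1)

    def forward(self, x):
        return self.fuse(torch.cat([x] + [pool(x) for pool in self.pools], dim=1))


class SPPFPN(nn.Module):
    """Top-down FPN whose deepest level comes from SPP: M5 = SPP(C5),
    M4 = lateral(C4) + up(M5), M3 = lateral(C3) + up(M4), M2 = lateral(C2) + up(M3);
    a 3x3 convolution on each M level then gives P2-P5."""
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        self.spp = SPP(in_channels[-1], out_channels)
        self.laterals = nn.ModuleList(
            nn.Conv2d(c, out_channels, 1) for c in in_channels[:-1]
        )
        self.smooth = nn.ModuleList(
            nn.Conv2d(out_channels, out_channels, 3, padding=1) for _ in in_channels
        )

    def forward(self, c2, c3, c4, c5):
        m5 = self.spp(c5)
        m4 = self.laterals[2](c4) + F.interpolate(m5, scale_factor=2)
        m3 = self.laterals[1](c3) + F.interpolate(m4, scale_factor=2)
        m2 = self.laterals[0](c2) + F.interpolate(m3, scale_factor=2)
        return [conv(m) for conv, m in zip(self.smooth, (m2, m3, m4, m5))]
```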
In step 3.3, the specific operation of the rotated-proposal generation module Oriented RPN includes: the feature map output in step 3.2 is convolved so that its channel number becomes 6A, where A is the number of anchor boxes generated at each anchor point and 6 indicates that six parameters are needed to define a rotated anchor box. The six parameters are (x, y, w, h, Δα, Δβ), where x and y are the center coordinates of the generated horizontal anchor box, w and h are its width and height, and Δα and Δβ are the offsets between two adjacent vertices of the rotated anchor box and the midpoints of two adjacent edges of the horizontal anchor box, as shown in FIG. 4. The formula for regressing the anchor box from these six parameters is:
$$v_1 = \left(x + \Delta\alpha,\; y - \frac{h}{2}\right),\quad v_2 = \left(x + \frac{w}{2},\; y + \Delta\beta\right),\quad v_3 = \left(x - \Delta\alpha,\; y + \frac{h}{2}\right),\quad v_4 = \left(x - \frac{w}{2},\; y - \Delta\beta\right) \tag{1}$$

where v1, v2, v3, and v4 are the four vertices of the rotated anchor box.
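As an illustration, the decoding of Equation 1 can be written as a small PyTorch helper (a hypothetical function, not code from the patent):

```python
import torch

def decode_midpoint_offset(boxes):
    """Decode (x, y, w, h, da, db) -> the four vertices v1..v4 of Equation 1.
    v1/v3 are offset by +/-da from the midpoints of the top/bottom edges of the
    horizontal box; v2/v4 by +/-db from the midpoints of the right/left edges."""
    x, y, w, h, da, db = boxes.unbind(dim=-1)
    v1 = torch.stack((x + da, y - h / 2), dim=-1)  # top-edge midpoint + (da, 0)
    v2 = torch.stack((x + w / 2, y + db), dim=-1)  # right-edge midpoint + (0, db)
    v3 = torch.stack((x - da, y + h / 2), dim=-1)  # bottom-edge midpoint - (da, 0)
    v4 = torch.stack((x - w / 2, y - db), dim=-1)  # left-edge midpoint - (0, db)
    return torch.stack((v1, v2, v3, v4), dim=-2)   # shape (..., 4, 2)
```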
Further, since the Oriented RPN generates a large number of anchor boxes, the N highest-scoring anchor boxes must be selected as inputs to the subsequent stage. The present invention uses the DIoU score as the positive-sample assignment strategy; the DIoU expression is given in Equation 2:
$$\mathrm{DIoU} = \mathrm{IoU} - \frac{d^2}{c^2} \tag{2}$$

where d² is the squared distance between the center points of the predicted box and the ground-truth box, and c² is the squared length of the diagonal of the minimum enclosing rectangle of the predicted box and the ground-truth box.
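A sketch of this score for axis-aligned (x1, y1, x2, y2) boxes follows; the rotated case additionally requires a polygon IoU, which is omitted here for brevity:

```python
import torch

def diou(pred, target):
    """DIoU = IoU - d^2 / c^2 (Equation 2) for (x1, y1, x2, y2) boxes."""
    # intersection and union
    lt = torch.max(pred[..., :2], target[..., :2])
    rb = torch.min(pred[..., 2:], target[..., 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    iou = inter / (area_p + area_t - inter + 1e-7)
    # d^2: squared distance between the two box centers
    d2 = ((pred[..., :2] + pred[..., 2:]) / 2
          - (target[..., :2] + target[..., 2:]) / 2).pow(2).sum(-1)
    # c^2: squared diagonal of the minimum enclosing rectangle
    c_lt = torch.min(pred[..., :2], target[..., :2])
    c_rb = torch.max(pred[..., 2:], target[..., 2:])
    c2 = (c_rb - c_lt).pow(2).sum(-1)
    return iou - d2 / (c2 + 1e-7)
```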
The specific operation of the improved detection head PAM-head module in step 3.4 includes: the input feature map is processed by the polarized attention module PAM, which generates different feature pyramids for the classification and localization tasks, avoiding feature interference between tasks and effectively extracting the distinct key features each task requires; the resulting features are sent to fully connected layers for classification and regression, and the final classification and localization results are output. The PAM-head structure is shown in FIG. 5. The total loss function of the model is given in Equation 3:
$$L = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*) \tag{3}$$

where L_cls uses the cross-entropy loss and L_reg uses the Smooth L1 loss. L_cls is given in Equation 4 and L_reg in Equation 5:
$$L_{cls}(p_i, p_i^*) = -\left[p_i^* \log p_i + (1 - p_i^*)\log(1 - p_i)\right] \tag{4}$$

$$L_{reg}(t_i, t_i^*) = \mathrm{smooth}_{L1}(t_i - t_i^*) \tag{5}$$

where p_i is the output of the RPN classification branch, i.e. the probability that proposal i is foreground, p_i* is the label of the i-th ground truth, t_i is the offset value regressed by the localization branch, and t_i* is the offset value to the ground-truth box. The smooth_L1 function is defined in Equation 6:

$$\mathrm{smooth}_{L1}(x) = \begin{cases} 0.5\,x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases} \tag{6}$$
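Equations 3-6 correspond to standard detection-loss building blocks; a minimal PyTorch sketch follows (the tensor shapes and the balancing weight λ are assumptions):

```python
import torch
import torch.nn.functional as F

def rpn_loss(p, p_star, t, t_star, lam=1.0):
    """Total loss of Eq. 3: cross-entropy (Eq. 4) plus Smooth L1 (Eq. 5-6).

    p:      (N,) predicted foreground probabilities
    p_star: (N,) ground-truth labels in {0, 1}
    t, t_star: (N, 6) predicted / target box offsets
    """
    l_cls = F.binary_cross_entropy(p, p_star.float())  # Eq. 4
    pos = p_star > 0                                   # regress positives only
    l_reg = F.smooth_l1_loss(t[pos], t_star[pos]) if pos.any() else p.sum() * 0
    return l_cls + lam * l_reg
```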
Furthermore, the polarized attention module PAM has a dual-branch structure: the input feature map passes through an attention module (a channel attention module in parallel with a spatial attention module) and then through different feature representation functions. The classification branch uses an excitation function to obtain high-response global features, while the localization branch uses a suppression function to focus only on boundary features and suppress irrelevant highly activated regions. The excitation function is expressed as follows:
[Equation 7: excitation function — reproduced only as an image in the source document]
where η is the excitation coefficient. The suppression function expression is as follows:
[Equation 8: suppression function — reproduced only as an image in the source document]
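Because the exact excitation and suppression expressions survive only as images, the sketch below shows just the dual-branch structure described above, with deliberately simple placeholder functions (attention raised to the power η for excitation, 1 − attention for suppression). Every module layout detail and both placeholder functions are assumptions:

```python
import torch
import torch.nn as nn

class PAM(nn.Module):
    """Dual-branch polarized attention (structural sketch, assumed details).
    Channel and spatial attention run in parallel; the classification branch
    excites high-response global features, while the localization branch
    suppresses irrelevant highly activated regions."""
    def __init__(self, channels, eta=2.0):
        super().__init__()
        self.eta = eta  # excitation coefficient (Equation 7)
        self.channel_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, channels, 1), nn.Sigmoid()
        )
        self.spatial_att = nn.Sequential(
            nn.Conv2d(channels, 1, 7, padding=3), nn.Sigmoid()
        )

    def forward(self, x):
        att = self.channel_att(x) * self.spatial_att(x)  # parallel attention
        cls_feat = x * att.pow(self.eta)  # placeholder excitation function
        loc_feat = x * (1.0 - att)        # placeholder suppression function
        return cls_feat, loc_feat
```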
In the invention, the experimental configuration for training the model of the rotating target detection method is based on the MMDetection V2 framework; the experimental environment is Python 3.8, PyTorch 1.7.0, and Torchvision 0.7.0 with a batch size of 2; the initial learning rate is set to 0.001; the maximum number of training epochs is 12; and the learning rate is reduced to 1 × 10⁻⁴ and 1 × 10⁻⁵ after the 9th and 11th epochs, respectively.
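This stepped schedule matches a standard multi-step decay; a minimal PyTorch sketch (the stand-in model and the bare training loop are placeholders):

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 16, 3)  # stand-in for the detector
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
# decay by 0.1 after epochs 9 and 11: 1e-3 -> 1e-4 -> 1e-5
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[9, 11], gamma=0.1)

for epoch in range(12):  # maximum of 12 training epochs
    # ... one training epoch over 1024x1024 inputs with batch size 2 ...
    optimizer.step()
    scheduler.step()
```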
The experimental hardware for training the model of the rotating target detection method is an Intel® Core™ i9-10900X CPU with an NVIDIA RTX 3080Ti graphics card.
Considering the large size of remote sensing images, the input image size is adjusted to 1024 × 1024 pixels, and the per-class accuracy AP and the mean accuracy mAP over all target classes in the data set are used as the evaluation metrics of the experiment.
The method of the invention defines the rotated anchor box with a six-parameter scheme different from the prior art, uses different polarization functions to extract the distinct features required by the classification and localization tasks, and introduces an SPP module to fuse local and global features. It can overcome the detection interference caused by the mismatch between the features required for classification and regression, enhances the feature representation of small remote sensing targets, and yields good network performance and strong model generalization.
The previous description is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A rotating target detection method based on an improved Oriented RCNN, characterized in that the method comprises the following steps:
step 1, inputting an image: selecting a remote sensing image data set whose annotation files contain orientation information as the input images, and randomly flipping and padding the input images;
step 2, image preprocessing: adjusting each picture to a fixed 1024 × 1024 size, normalizing the fixed-size pictures, and dividing them into a training set, a validation set, and a test set;
step 3, inputting a network model for training: inputting the training set of step 2 into the improved Oriented RCNN model for training;
step 4, inputting a test set and outputting detection results: detecting remote sensing images with the trained improved Oriented RCNN model to obtain result images in which targets are framed by rotated boxes;
wherein training the improved Oriented RCNN model in step 3 comprises: inputting the training set into the backbone network ResNet50 for feature extraction to obtain features C2-C5 of different sizes; and inputting the extracted features into the SPP-FPN module for feature fusion to obtain a feature map, wherein the SPP-FPN module passes C5, the deepest output of the backbone network, through the SPP module to obtain M5.
2. The improved Oriented RCNN-based rotating target detection method according to claim 1, wherein the specific operation of step 3 comprises:
step 3.1, inputting the training set into the backbone network ResNet50 for feature extraction to obtain features C2-C5 of different sizes;
step 3.2, inputting the extracted features into the SPP-FPN module for feature fusion to obtain a feature map;
step 3.3, inputting the feature map into the rotated-proposal generation module Oriented RPN, which outputs proposal regions (proposals) after encoding and decoding;
and step 3.4, inputting the feature map obtained in step 3.2 and the proposals obtained in step 3.3 into the improved detection head module PAM-head, performing the final classification and localization operations, and outputting the remote sensing target recognition and localization result.
3. The improved Oriented RCNN-based rotating target detection method according to claim 2, wherein in step 3.2 the specific workflow of the SPP-FPN module comprises: passing C5, the deepest output of the backbone network, through the SPP module to obtain M5; summing element-wise the upsampled M5 and the lateral connection of C4 to obtain M4; summing element-wise the upsampled M4 and the lateral connection of C3 to obtain M3; repeating this process to obtain M2-M5; and applying a 3 × 3 convolution to each of M2-M5 to obtain the improved FPN outputs P2-P5.
4. The improved Oriented RCNN-based rotating target detection method according to claim 3, wherein the SPP module fuses local and global features, processing the feature map with pooling windows of different sizes and finally concatenating the results to obtain the output.
5. The improved Oriented RCNN-based rotating target detection method according to claim 2, wherein in step 3.3 the specific operation of the rotated-proposal generation module Oriented RPN comprises: convolving the feature map output in step 3.2 so that its channel number becomes 6A, where A is the number of anchor boxes generated at each anchor point and 6 indicates that six parameters (x, y, w, h, Δα, Δβ) are needed to define a rotated anchor box, x and y being the center coordinates of the generated horizontal anchor box, w and h its width and height, and Δα and Δβ the offsets between two adjacent vertices of the rotated anchor box and the midpoints of two adjacent edges of the horizontal anchor box.
6. The improved Oriented RCNN-based rotating target detection method according to claim 2, wherein the specific operation of the improved detection head PAM-head module in step 3.4 comprises: processing the input feature map with the polarized attention module PAM to generate different feature pyramids for the classification and localization tasks, thereby avoiding feature interference between tasks and effectively extracting the distinct key features each task requires; sending the resulting features to fully connected layers for classification and regression; and outputting the final classification and localization results.
7. The improved Oriented RCNN-based rotating target detection method according to claim 6, wherein the polarized attention module PAM has a dual-branch structure: the input feature map passes through an attention module (a channel attention module in parallel with a spatial attention module) and then through different feature representation functions, the classification branch using an excitation function to obtain high-response global features and the localization branch using a suppression function to focus only on boundary features and suppress irrelevant highly activated regions.
8. The improved Oriented RCNN-based rotating target detection method according to any one of claims 1-7, wherein the experimental configuration for training the model comprises: based on the MMDetection V2 framework, the experimental environment is Python 3.8, PyTorch 1.7.0, and Torchvision 0.7.0 with a batch size of 2; the initial learning rate is set to 0.001; the maximum number of training epochs is 12; and the learning rate is reduced to 1 × 10⁻⁴ and 1 × 10⁻⁵ after the 9th and 11th epochs, respectively.
9. The improved Oriented RCNN-based rotating target detection method according to claim 8, wherein the experimental hardware for training the model is an Intel® Core™ i9-10900X CPU with an NVIDIA RTX 3080Ti graphics card.
10. The improved Oriented RCNN-based rotating target detection method according to claim 9, wherein the input image size is adjusted to 1024 × 1024 pixels, and the per-class accuracy AP and the mean accuracy mAP over all target classes in the data set are used as the evaluation metrics of the experiment.
CN202210827268.7A 2022-07-13 2022-07-13 Improved Oriented RCNN-based rotating target detection method Pending CN115272846A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210827268.7A CN115272846A (en) 2022-07-13 2022-07-13 Improved Oriented RCNN-based rotating target detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210827268.7A CN115272846A (en) 2022-07-13 2022-07-13 Improved Oriented RCNN-based rotating target detection method

Publications (1)

Publication Number Publication Date
CN115272846A true CN115272846A (en) 2022-11-01

Family

ID=83765467

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210827268.7A Pending CN115272846A (en) 2022-07-13 2022-07-13 Improved Orientdrcnn-based rotating target detection method

Country Status (1)

Country Link
CN (1) CN115272846A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115908908A (en) * 2022-11-14 2023-04-04 北京卫星信息工程研究所 Remote sensing image gathering type target identification method and device based on graph attention network
CN115908908B (en) * 2022-11-14 2023-09-15 北京卫星信息工程研究所 Remote sensing image aggregation type target recognition method and device based on graph attention network

Similar Documents

Publication Publication Date Title
CN111639692B (en) Shadow detection method based on attention mechanism
Guo et al. Data‐driven flood emulation: Speeding up urban flood predictions by deep convolutional neural networks
CN111191654B (en) Road data generation method and device, electronic equipment and storage medium
CN115240121B (en) Joint modeling method and device for enhancing local features of pedestrians
CN103324753B (en) Based on the image search method of symbiotic sparse histogram
CN116363526A (en) MROCNet model construction and multi-source remote sensing image change detection method and system
CN109492610A (en) A kind of pedestrian recognition methods, device and readable storage medium storing program for executing again
Zheng et al. Feature enhancement for multi-scale object detection
CN115601562A (en) Fancy carp detection and identification method using multi-scale feature extraction
Guo et al. Fully convolutional DenseNet with adversarial training for semantic segmentation of high-resolution remote sensing images
CN115272846A (en) Improved Oriented RCNN-based rotating target detection method
CN115063833A (en) Machine room personnel detection method based on image layered vision
CN111368637A (en) Multi-mask convolution neural network-based object recognition method for transfer robot
Yuan et al. A cross-scale mixed attention network for smoke segmentation
Zhao et al. ST-YOLOA: a Swin-transformer-based YOLO model with an attention mechanism for SAR ship detection under complex background
Cyganek An analysis of the road signs classification based on the higher-order singular value decomposition of the deformable pattern tensors
CN109284752A (en) A kind of rapid detection method of vehicle
CN112365508A (en) SAR remote sensing image water area segmentation method based on visual attention and residual error network
CN110826478A (en) Aerial photography illegal building identification method based on countermeasure network
CN116704324A (en) Target detection method, system, equipment and storage medium based on underwater image
CN115578599A (en) Polarized SAR image classification method based on superpixel-hypergraph feature enhancement network
Thirumaladevi et al. Multilayer feature fusion using covariance for remote sensing scene classification
Wang et al. Extraction of main urban roads from high resolution satellite images by machine learning
Liang et al. Transformer-based multi-scale feature fusion network for remote sensing change detection
Jabshetti et al. Object detection using Regionlet transform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination