CN113657214B - Building damage assessment method based on Mask RCNN - Google Patents
- Publication number: CN113657214B (application CN202110876141.XA)
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/241: Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06N3/045: Neural networks; architecture, e.g. interconnection topology; combinations of networks
- G06N3/08: Neural networks; learning methods
Abstract
A building damage assessment method based on Mask RCNN relates to the field of building damage assessment and solves the problem of low accuracy of neural network models in building damage assessment tasks. The invention uses ResNet50-vd as the backbone of a feature pyramid network for the feature extraction part, realizing extraction of strongly semantic, high-resolution features from post-disaster images. The strong semantic features are input into a region proposal network to obtain proposal boxes and their categories, generating a plurality of feature matrices that form a shared feature layer, while the high-resolution features are input into a RoIAlign layer. Each feature matrix is input to the RoIAlign layer and scaled to k × k, yielding a plurality of feature maps. The feature maps are processed simultaneously, the Mask RCNN output is generated through a fully convolutional network, and redundancy is removed with a non-maximum suppression algorithm. Finally, the position of each building in the satellite image is obtained, together with its degree of damage and a confidence score for the prediction result. The method mainly realizes graded evaluation of building damage and is applied to disaster loss assessment.
Description
Technical Field
The invention relates to the field of building damage assessment, in particular to a building damage assessment method based on Mask RCNN.
Background
Traditional building damage assessment is mostly carried out by collecting post-disaster field data and analyzing it manually against established evaluation standards. When a disaster occurs, collecting building image information on site is very difficult and time-consuming, and does little to advance post-disaster emergency response and assessment work.
With the development of theory related to artificial intelligence, researchers at home and abroad have begun to apply deep learning and machine learning methods to building damage assessment from aerial and satellite images. However, the accuracy of the baseline model, the Mask RCNN-based instance segmentation model and the semantic segmentation model in the building damage assessment task is mediocre, which hinders post-disaster assessment work.
Disclosure of Invention
The invention aims to provide a building damage assessment method based on Mask RCNN, which solves the problem of low accuracy of a neural network model in a building damage assessment task.
A building damage assessment method based on Mask RCNN comprises the following steps:
step one: ResNet50-vd is used as the backbone of a feature pyramid network (Feature Pyramid Networks) to serve as the feature extraction part, realizing extraction of strongly semantic, high-resolution features from post-disaster images;
wherein the strongly semantic features and the high-resolution features together are called shared features;
step two: the strong semantic features among the obtained shared features are input into a region proposal network (Region Proposal Network), and the high-resolution features are input into a RoIAlign layer; proposal boxes and proposal box categories are obtained through the region proposal network, and a plurality of feature matrices are generated to form a shared feature layer;
step three: each feature matrix is input into the RoIAlign layer, and all feature matrices are scaled into k × k feature maps, obtaining a plurality of feature maps;
step four: the plurality of feature maps are processed simultaneously to realize classification and bounding-box regression; the mask output of Mask RCNN is generated through a fully convolutional network (Fully Convolutional Network), and redundancy is removed with a non-maximum suppression algorithm;
step five: and obtaining the position of each building in the satellite image, the damage degree of the building and the reliability score of the prediction result.
In step one, using ResNet50-vd as the backbone network architecture of the feature pyramid network (Feature Pyramid Networks) as the feature extraction part includes the steps of:
the feature extraction network is ResNet50-vd; the output of the last layer of each stage is recorded and used as the pyramid level of that stage, obtaining features with different degrees of semantics;
the higher pyramid levels are up-sampled to obtain high-resolution features, which are laterally connected with the strongly semantic outputs obtained through the bottom-up pathway, so that features are constructed that are strongly semantic while retaining high resolution;
up-sampling the spatial resolution by a factor of 2 keeps the feature-map sizes of the lateral connections consistent;
multi-layer features with different resolutions can be obtained through the FPN, and the relation between the selected feature level and the scale of the target to be detected is:

k = ⌊k0 + log2(√(w·h) / 224)⌋

wherein w is the width of the RoI on the input image, h is the height of the RoI on the input image, k0 denotes the feature level corresponding to an object of size 224 × 224, and ⌊·⌋ denotes the floor (round-down) function.
In step two, the candidate boxes are screened by combining their category scores to obtain a group of scored rectangular proposal boxes, whose parameters are as follows:
at each sliding-window position, at most k proposal boxes, called anchor boxes, can be obtained;
the bounding-box proposal layer reg outputs 4k parameters, and the bounding-box classification layer cls outputs 2k parameters;
the anchor boxes are chosen with area 32 and aspect ratio 0.5; area 64 and aspect ratio 1.0; area 128 and aspect ratio 1.5; area 256 and aspect ratio 2.0;
the regression parameters of the candidate boxes in the RPN are:

t_x = (x - x_a) / w_a,  t_y = (y - y_a) / h_a,  t_w = log(w / w_a),  t_h = log(h / h_a)
t_x* = (x* - x_a) / w_a,  t_y* = (y* - y_a) / h_a,  t_w* = log(w* / w_a),  t_h* = log(h* / h_a)

wherein (x, y) represents the center point of a rectangular box, w represents the width of the rectangular box, and h represents the height of the rectangular box; x denotes the prediction box of the region proposal network Region Proposal Network, x_a denotes the anchor box, and x* denotes the ground-truth box; t_x and t_y represent the offsets of the RPN prediction box in the x-axis and y-axis directions, and t_x* and t_y* the offsets of the ground-truth box in the x-axis and y-axis directions; t_w and t_h represent the width and height scale offsets of the rectangular box, and t_w* and t_h* represent the true width and height scale offsets of the rectangular box.
The process of inputting a feature matrix into the RoIAlign layer and scaling it to a k×k feature map in the third step is as follows:
the RoIAlign layer uses bilinear interpolation when generating the feature map, so that the alignment problem is avoided;
for a pooled output of 2 × 2 grid cells, 4 sampling points are selected within each cell; the pixel coordinates of the sampling points are floating-point values, and the value of each sampling point is obtained by bilinear interpolation of the feature values of the four surrounding pixels;
the maximum of the values at the 4 sampling points is taken as the output value of the grid cell, finally obtaining a feature map of fixed size k × k, where k = 7, which is used to predict the final result.
The operation procedure for removing redundancy through a non-maximum suppression algorithm (NMS) in the fourth step is as follows:
The NMS sorts the bounding boxes by their confidence scores and selects the bounding box with the largest confidence score as a prediction result; if the intersection over union (IoU) of another prediction result with this bounding box is larger than a set threshold, the two are regarded as the same prediction result and the corresponding bounding box is treated as a redundant box; after all redundant boxes are deleted, the next prediction result is judged, and once all candidate boxes have been traversed, the operation of removing redundancy is finished.
The invention solves the problem of low precision of the neural network model in the building damage assessment task in the prior art, and has the technical effects that:
The graded evaluation of the damage degree of buildings can clearly determine building boundaries and achieve higher segmentation accuracy, and the evaluation result can be applied as input to disaster loss assessment, providing an important reference for subsequent disaster evaluation work.
The method provided by the invention is suitable for grading evaluation of the damage degree of the building.
Drawings
FIG. 1 is a Mask-RCNN based building damage assessment model;
FIG. 2 is a feature pyramid network architecture;
FIG. 3 is a structure of a regional recommendation network;
fig. 4 is a schematic diagram of the data preprocessing effect.
Detailed Description
Embodiment one: the present embodiment will be described with reference to fig. 1. The building damage assessment method based on Mask RCNN of the embodiment comprises the following steps:
step one: ResNet50-vd is used as the backbone of a feature pyramid network (Feature Pyramid Networks) to serve as the feature extraction part, realizing extraction of strongly semantic, high-resolution features from post-disaster images;
wherein the strongly semantic features and the high-resolution features together are called shared features;
step two: the strong semantic features among the obtained shared features are input into a region proposal network (Region Proposal Network), and the high-resolution features are input into a RoIAlign layer; proposal boxes and proposal box categories are obtained through the region proposal network, and a plurality of feature matrices are generated to form a shared feature layer;
step three: each feature matrix is input into the RoIAlign layer, and all feature matrices are scaled into k × k feature maps, obtaining a plurality of feature maps;
step four: the plurality of feature maps are processed simultaneously to realize classification and bounding-box regression; the mask output of Mask RCNN is generated through a fully convolutional network (Fully Convolutional Network), and redundancy is removed with a non-maximum suppression algorithm;
step five: and obtaining the position of each building in the satellite image, the damage degree of the building and the reliability score of the prediction result.
In step one, using ResNet50-vd as the backbone network architecture of the feature pyramid network (Feature Pyramid Networks) as the feature extraction part includes the steps of:
the feature extraction network is ResNet50-vd; the output of the last layer of each stage is recorded and used as the pyramid level of that stage, obtaining features with different degrees of semantics;
as shown in fig. 2, the higher pyramid levels are up-sampled to obtain high-resolution features, which are laterally connected with the strongly semantic outputs obtained through the bottom-up pathway, so that features are constructed that are strongly semantic while retaining high resolution;
up-sampling the spatial resolution by a factor of 2 keeps the feature-map sizes of the lateral connections consistent;
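The top-down merge described above can be sketched in a few lines; this is an illustrative NumPy sketch under assumed conventions (nearest-neighbour upsampling, element-wise addition), not the patent's implementation, and `upsample2x_nearest` / `topdown_merge` are assumed helper names:

```python
import numpy as np

def upsample2x_nearest(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def topdown_merge(coarse, lateral):
    """Merge a coarser pyramid level with the lateral (bottom-up) feature
    of the next finer stage: upsample the coarse map by 2x so the spatial
    sizes match, then add element-wise."""
    up = upsample2x_nearest(coarse)
    assert up.shape == lateral.shape  # lateral connection sizes must agree
    return up + lateral
```

In the real network the lateral branch would first pass through a 1 × 1 convolution to align channel counts; that step is omitted here for brevity.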
multi-layer features with different resolutions can be obtained through the FPN, and the relation between the selected feature level and the scale of the target to be detected is:

k = ⌊k0 + log2(√(w·h) / 224)⌋

wherein w is the width of the RoI on the input image, h is the height of the RoI on the input image, k0 denotes the feature level corresponding to an object of size 224 × 224, and ⌊·⌋ denotes the floor (round-down) function.
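The level-selection rule above can be written as a short function. This is a sketch following the common FPN convention; the value `k0 = 4` and the clamping range `[2, 5]` are assumed defaults, not stated in the text:

```python
import math

def fpn_level(w, h, k0=4, k_min=2, k_max=5):
    """Select the pyramid level for an RoI of width w and height h:
    k = floor(k0 + log2(sqrt(w*h) / 224)), clamped to the available
    levels. A 224x224 RoI maps to level k0; larger RoIs go to coarser
    levels, smaller ones to finer levels."""
    k = math.floor(k0 + math.log2(math.sqrt(w * h) / 224))
    return max(k_min, min(k_max, k))
```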
The generating method of the regional recommendation network Region Proposal Network includes:
a sliding window with a 3 × 3 structure is applied to the input shared feature layer; the result at each sliding-window position is a one-dimensional feature vector;
the feature vector is used as the input of two sibling fully connected layers: a bounding-box proposal layer reg and a bounding-box classification layer cls; for each anchor box, the corresponding bounding-box regression parameters and the probabilities of being predicted as foreground or background are finally obtained, and combining all preset anchor boxes yields all of the initially obtained candidate boxes;
wherein one anchor box corresponds to one candidate box;
the candidate boxes are screened according to their category scores; both the scoring and the screening of the candidate boxes are mainly realized based on a non-maximum suppression algorithm, and a group of scored rectangular proposal boxes is obtained as output; the structure of the region proposal network is shown in figure 3.
In step three, the candidate boxes are screened by combining their category scores to obtain a group of scored rectangular proposal boxes, whose parameters are as follows:
at each sliding-window position, at most k proposal boxes, called anchor boxes, can be obtained;
the bounding-box proposal layer reg outputs 4k parameters, and the bounding-box classification layer cls outputs 2k parameters;
the anchor boxes are chosen with area 32 and aspect ratio 0.5; area 64 and aspect ratio 1.0; area 128 and aspect ratio 1.5; area 256 and aspect ratio 2.0;
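Under the assumption that the translated term describing each anchor denotes its aspect ratio w/h (the values 0.5 to 2.0 suggest this), the four anchor shapes listed above can be derived as follows; this is an illustrative sketch, and `ANCHOR_SPECS` / `make_anchor` are assumed names:

```python
import math

# (area, aspect_ratio) pairs as listed in the text; interpreting the
# second value as the aspect ratio w/h is an assumption.
ANCHOR_SPECS = [(32, 0.5), (64, 1.0), (128, 1.5), (256, 2.0)]

def make_anchor(area, ratio):
    """Width and height of an anchor box with the given area and
    w/h aspect ratio: w*h = area and w/h = ratio."""
    w = math.sqrt(area * ratio)
    h = math.sqrt(area / ratio)
    return w, h
```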
the regression parameters of the candidate boxes in the RPN are:

t_x = (x - x_a) / w_a,  t_y = (y - y_a) / h_a,  t_w = log(w / w_a),  t_h = log(h / h_a)
t_x* = (x* - x_a) / w_a,  t_y* = (y* - y_a) / h_a,  t_w* = log(w* / w_a),  t_h* = log(h* / h_a)

wherein (x, y) represents the center point of a rectangular box, w represents the width of the rectangular box, and h represents the height of the rectangular box; x denotes the prediction box of the region proposal network Region Proposal Network, x_a denotes the anchor box, and x* denotes the ground-truth box; t_x and t_y represent the offsets of the RPN prediction box in the x-axis and y-axis directions, and t_x* and t_y* the offsets of the ground-truth box in the x-axis and y-axis directions; t_w and t_h represent the width and height scale offsets of the rectangular box, and t_w* and t_h* represent the true width and height scale offsets of the rectangular box.
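The offsets described above follow the standard Faster R-CNN box encoding; a minimal sketch, with boxes given as (cx, cy, w, h) tuples:

```python
import math

def encode_box(box, anchor):
    """Regression targets of a box relative to an anchor, both given as
    (cx, cy, w, h): centre offsets normalised by the anchor size and
    log-scale width/height offsets."""
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    tx = (x - xa) / wa
    ty = (y - ya) / ha
    tw = math.log(w / wa)
    th = math.log(h / ha)
    return tx, ty, tw, th
```

Applying the same function to the ground-truth box yields the starred targets t_x*, t_y*, t_w*, t_h*.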
In the fourth step, the process of inputting a feature matrix to the RoIAlign layer and scaling the feature matrix to a k×k feature map is as follows:
the RoIAlign layer uses bilinear interpolation when generating the feature map, so that the alignment problem is avoided;
for a pooled output of 2 × 2 grid cells, 4 sampling points are selected within each cell; the pixel coordinates of the sampling points are floating-point values, and the value of each sampling point is obtained by bilinear interpolation of the feature values of the four surrounding pixels;
the maximum of the values at the 4 sampling points is taken as the output value of the grid cell, finally obtaining a feature map of fixed size k × k, where k = 7, which is used to predict the final result.
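The sampling scheme for one output cell can be sketched as follows; this is illustrative NumPy code with assumed helper names (`bilinear`, `roi_align_bin`) and an assumed regular 2 × 2 sampling pattern inside each cell:

```python
import numpy as np

def bilinear(feat, y, x):
    """Bilinearly interpolate a (H, W) feature map at floating-point (y, x)."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1 = min(y0 + 1, feat.shape[0] - 1)
    x1 = min(x0 + 1, feat.shape[1] - 1)
    dy, dx = y - y0, x - x0
    return (feat[y0, x0] * (1 - dy) * (1 - dx) + feat[y0, x1] * (1 - dy) * dx
            + feat[y1, x0] * dy * (1 - dx) + feat[y1, x1] * dy * dx)

def roi_align_bin(feat, y0, x0, y1, x1):
    """One RoIAlign output cell: sample 4 regularly spaced floating-point
    points inside the cell with bilinear interpolation and take their max."""
    pts = [(y0 + (y1 - y0) * fy, x0 + (x1 - x0) * fx)
           for fy in (0.25, 0.75) for fx in (0.25, 0.75)]
    return max(bilinear(feat, y, x) for y, x in pts)
```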
The operation procedure for removing redundancy through a non-maximum suppression algorithm (NMS) is as follows:
The NMS sorts the bounding boxes by their confidence scores and selects the bounding box with the largest confidence score as a prediction result; if the intersection over union (IoU) of another prediction result with this bounding box is larger than a set threshold, the two are regarded as the same prediction result and the corresponding bounding box is treated as a redundant box; after all redundant boxes are deleted, the next prediction result is judged, and once all candidate boxes have been traversed, the operation of removing redundancy is finished.
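A minimal greedy NMS consistent with the procedure above (an illustrative sketch; boxes are (x1, y1, x2, y2) tuples):

```python
def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) form."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring box,
    drop every remaining box whose IoU with it exceeds thresh, then
    repeat on the remainder. Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) <= thresh]
    return keep
```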
In this embodiment, a Mask RCNN-based building damage assessment model is trained:
converting the labeling format of the original data set into the labeling format of the COCO data set, and generating corresponding labeling files for the pre-disaster image and the post-disaster image respectively; as shown in fig. 4, the positioning and classification accuracy of the small-area building is improved by performing preprocessing of random overturn and random clipping on the COCO data set; model overfitting is prevented by batch regularization of the data in the COCO dataset.
Random clipping of the COCO dataset:
(1) Set a series of IoU thresholds and randomly shuffle their order, to be used to judge whether a cropped candidate region is valid;
(2) Traverse all IoU thresholds; if the current IoU threshold is 0, return the original image and the corresponding labeling data without cropping, otherwise randomly select a scaling of the short side of the preset rectangular box to obtain the width, height and starting-point coordinates of the candidate cropping region;
(3) Compare the IoU of the obtained candidate cropping region with each ground-truth labeling box against the threshold; if all are smaller than the maximum threshold, continue traversing the thresholds and repeat the previous step;
(4) Screen all ground-truth labeling boxes inside the cropping region; if the number of valid boxes is 0, repeat this step, otherwise proceed to the next step;
(5) Calculate the coordinates of the valid ground-truth labeling boxes relative to the cropping region;
(6) Calculate the coordinates of the valid segmentation regions relative to the cropping region.
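The cropping loop above can be sketched loosely as follows. The retry count, the window-size range and the centre-inside validity test are assumed simplifications for illustration, not the patent's exact procedure:

```python
import random

def random_crop(img_w, img_h, gt_boxes, rng=random):
    """Loose sketch of the random-crop loop: shuffle a series of IoU
    thresholds; a threshold of 0 returns the uncropped image, otherwise
    a random window is drawn and accepted only if at least one
    ground-truth box centre falls inside it. Boxes are (x1, y1, x2, y2)."""
    threshes = [0.0, 0.1, 0.3, 0.5, 0.7]
    rng.shuffle(threshes)
    for t in threshes:
        if t == 0.0:
            return (0, 0, img_w, img_h), gt_boxes
        for _ in range(10):  # limited retries per threshold
            cw = rng.randint(img_w // 2, img_w)
            ch = rng.randint(img_h // 2, img_h)
            cx = rng.randint(0, img_w - cw)
            cy = rng.randint(0, img_h - ch)
            kept = [b for b in gt_boxes
                    if cx <= (b[0] + b[2]) / 2 <= cx + cw
                    and cy <= (b[1] + b[3]) / 2 <= cy + ch]
            if kept:
                return (cx, cy, cx + cw, cy + ch), kept
    return (0, 0, img_w, img_h), gt_boxes  # fallback: no valid crop found
```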
This embodiment introduces a Sigmoid global loss as the classification loss function. To optimize the training process of the model, a warm-up mechanism is used: after 10000 iterations the learning rate is increased to 0.001, and then a learning-rate decay strategy is applied.
The training environment and parameter configuration of this embodiment are shown in table 1.
Table 1 training configuration
In the embodiment, the F1 value is taken as a main evaluation measurement index, and the performances of different network models in a building damage evaluation task are compared and analyzed:
wherein F1 is the harmonic mean of precision and recall:

F1 = 2 × precision × recall / (precision + recall)

In general, precision and recall interact: when recall is high, precision tends to be low. Measuring the F1 value ensures that both remain high.
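The harmonic-mean definition can be written as a one-line function (with the degenerate zero case handled explicitly):

```python
def f1_score(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```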
Table 2 model performance comparison
The building damage assessment model based on Mask RCNN obtained through training in this embodiment achieves the best F1 value. The model not only realizes end-to-end training, but also makes full use of the labeling information of the data set to better identify building boundaries, finally realizing instance segmentation of building damage conditions.
Claims (8)
1. A method for assessing building damage based on Mask RCNN, the method comprising:
step one, a main network architecture of a feature pyramid network Feature Pyramid Networks is used as a feature extraction part by using ResNet50-vd as an auxiliary, so that the feature extraction of strong semantics and strong resolution of post-disaster images is realized;
wherein the features of strong semantics and the features of strong resolution are called shared features;
step two, strong semantic features in the obtained shared features are input into a region recommendation network Region Proposal Network, features with strong resolution are input into a RoIAlign layer, a suggestion frame and suggestion frame categories are obtained through the region recommendation network Region Proposal Network, and a plurality of feature matrixes are generated to form a shared feature layer;
step three, inputting each feature matrix into the RoIAlign layer, and scaling all feature matrices to k × k to obtain a plurality of feature maps;
step four, simultaneously processing the plurality of feature maps to realize classification and bounding box regression, generating Mask RCNN through a full convolution neural network Fully Convolutional Network, and removing redundancy through a non-maximum suppression algorithm;
step five, obtaining the position of each building in the satellite image, the degree of damage of the building and the reliability scoring of the prediction result;
the application ResNet50-vd is assisted by a backbone network architecture of a feature pyramid network Feature Pyramid Networks as a feature extraction part, and the steps include:
the feature extraction network is ResNet50-vd, and the feature extraction network is recorded by the output of the last layer of each stage and is used as a pyramid layer of the stage to obtain features of semantics with different degrees;
the pyramid level above the middle layer is up-sampled to obtain high-resolution features, and the high-resolution features are connected with the output of the strong semantics obtained through a bottom-up structure in the transverse direction, so that the strong semantics are constructed and the high-resolution features are also obtained;
the feature mapping size of the transverse connection is consistent by upsampling the spatial resolution to 2 times;
the multi-layer features with different resolutions can be obtained through the FPN, and the relation between the selected feature level and the scale of the target to be detected is as follows:

k = ⌊k0 + log2(√(w·h) / 224)⌋

wherein the width of the RoI on the input image is w, the height of the RoI on the input image is h, the feature layer corresponding to an object of size 224 × 224 is denoted as the k0 layer, and the floor (round-down) function is denoted ⌊·⌋;
The operation procedure for removing redundancy by the non-maximum suppression algorithm NMS is as follows:
and sorting the boundary frames according to the confidence scores of the boundary frames by the NMS, selecting the boundary frame with the largest confidence score as a prediction result, if the intersection ratio IoU of other prediction results and the boundary frame is larger than the set threshold, considering the prediction results as the same prediction result, regarding the boundary frame corresponding to the prediction result as a redundant frame, deleting all the redundant frames, judging the next prediction result, traversing all the candidate frames, and finishing the operation of removing the redundancy.
2. The building damage assessment method based on Mask RCNN according to claim 1, wherein the generating method of the area recommendation network Region Proposal Network is as follows:
applying a sliding window mode with a 3 x 3 network structure to the input shared feature layer, wherein the application result of each sliding window is a one-dimensional feature vector;
the feature vector is used as an input vector by the full-connection layers of the two same level, one full-connection layer is a boundary frame recommendation layer reg, the other full-connection layer is a boundary frame classification layer cls, corresponding boundary frame regression parameters and probabilities of being predicted as a foreground and a background are finally obtained for each anchor frame anchor, and all initially obtained candidate frames are obtained by combining all preset anchor frames;
wherein one anchor block corresponds to one candidate block;
and screening the candidate frames according to the category scores of the candidate frames, wherein the screening of the category scores and the candidate frames is mainly realized based on a non-maximum suppression algorithm, and a group of rectangular suggestion frames with scores are obtained.
3. The method for evaluating building damage based on Mask RCNN according to claim 1, wherein the process of screening the candidate frames by combining the category scores thereof to obtain a set of rectangular suggestion frames with scores is characterized in that parameters of the rectangular suggestion frames are as follows:
at each sliding window, at most k proposal boxes, called anchor point boxes, can be obtained;
the proposal layer reg output parameters for each bounding box number 4k, and the output parameters of each bounding-box classification layer cls number 2k;
the area of an anchor point box is selected as 32 with aspect ratio 0.5; area 64 with aspect ratio 1.0; area 128 with aspect ratio 1.5; area 256 with aspect ratio 2.0;
the regression parameters of the candidate boxes in the RPN are:

t_x = (x - x_a) / w_a,  t_y = (y - y_a) / h_a,  t_w = log(w / w_a),  t_h = log(h / h_a)
t_x* = (x* - x_a) / w_a,  t_y* = (y* - y_a) / h_a,  t_w* = log(w* / w_a),  t_h* = log(h* / h_a)

wherein (x, y) represents the center point of a rectangular box, w represents the width of the rectangular box, and h represents the height of the rectangular box; x denotes the prediction box of the region proposal network Region Proposal Network, x_a denotes the anchor box, and x* denotes the ground-truth box; t_x and t_y represent the offsets of the RPN prediction box in the x-axis and y-axis directions, and t_x* and t_y* the offsets of the ground-truth box in the x-axis and y-axis directions; t_w and t_h represent the width and height scale offsets of the rectangular box, and t_w* and t_h* represent the true width and height scale offsets of the rectangular box.
4. The Mask RCNN-based building damage assessment method of claim 1, wherein the process of inputting the feature matrix into the RoIAlign layer and scaling it to a fixed-size feature map is as follows:
the RoIAlign layer uses bilinear interpolation in generating the feature map;
for a pooled output of 2 x 2 grid cells, 4 sampling points are selected in each cell; the pixel coordinates of the sampling points are floating-point values, and the value of each sampling point is obtained by bilinear interpolation of the feature values of the four surrounding pixels;
the maximum of the 4 sampled values in each cell is taken as the output value of that cell, finally obtaining a feature map of fixed size.
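A minimal sketch of the two operations RoIAlign combines: bilinear sampling at floating-point coordinates, then a max over the sampling points of a grid cell. The 2-D-list feature map and function names are illustrative assumptions.

```python
def bilinear_sample(feature, x, y):
    """Sample a 2-D feature map at floating-point (x, y) by bilinear
    interpolation of the four surrounding integer-coordinate pixels."""
    x0, y0 = int(x), int(y)
    x1 = min(x0 + 1, len(feature[0]) - 1)
    y1 = min(y0 + 1, len(feature) - 1)
    dx, dy = x - x0, y - y0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bot = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bot * dy

def pool_cell(feature, points):
    """RoIAlign-style output for one grid cell: the maximum over its
    (floating-point) sampling points."""
    return max(bilinear_sample(feature, x, y) for x, y in points)
```

Because the sampling coordinates are never rounded, no quantization error is introduced, which is the advantage of RoIAlign over RoIPool.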
5. The Mask RCNN-based building damage assessment method of claim 1, further comprising an assessment model training process, wherein the assessment model training process is:
converting the labeling format of the original dataset into the labeling format of the COCO dataset, and generating corresponding annotation files for the pre-disaster and post-disaster images respectively;
preprocessing the COCO dataset with random flipping and random cropping to improve the localization and classification accuracy for small-area buildings;
preventing model overfitting by batch normalization of the data in the COCO dataset.
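The random-flipping preprocessing step can be sketched as follows; the flip probability, 2-D-list image representation, and corner-format boxes are illustrative assumptions, not the patent's own specification.

```python
import random

def random_hflip(image, boxes, p=0.5):
    """With probability p, flip an image (2-D list of rows) horizontally
    and mirror its (x1, y1, x2, y2) annotation boxes accordingly."""
    if random.random() >= p:
        return image, boxes            # leave the sample unchanged
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    # A box's x-extent [x1, x2] maps to [width - x2, width - x1].
    new_boxes = [(width - x2, y1, width - x1, y2)
                 for (x1, y1, x2, y2) in boxes]
    return flipped, new_boxes
```

Mirroring the annotation boxes together with the pixels keeps the augmented labels consistent, which is what lets the augmentation improve localization accuracy rather than corrupt it.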
6. The Mask RCNN-based building damage assessment method according to claim 5, wherein the random cropping of the COCO dataset is performed as follows:
(1) Setting a series of intersection-over-union (IoU) thresholds and randomly shuffling their order, to be used for judging whether a cropped candidate region is valid;
(2) Traversing all IoU thresholds; if the current IoU threshold is 0, returning the original image and corresponding annotation data without cropping; otherwise, randomly selecting a scaling of the short side of the preset rectangular box to obtain the width, height and starting coordinates of the candidate cropping region;
(3) Comparing the IoU between the candidate cropping region and the real annotation boxes against the threshold; if all IoU values are smaller than the maximum threshold, continuing to traverse the thresholds and repeating the previous step;
(4) Screening all real annotation boxes within the cropping region; if the number of valid boxes is 0, repeating this step, otherwise proceeding to the next step;
(5) Calculating the coordinates of the valid real annotation boxes relative to the cropping region;
(6) Calculating the coordinates of the valid segmentation regions relative to the cropping region.
7. The Mask RCNN-based building damage assessment method according to claim 5, wherein the assessment model training process is further optimized as follows: a warm-up mechanism is used, the learning rate is increased to 0.001 after the mechanism iterates for 10000 steps, and a learning-rate decay strategy is then applied.
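A sketch of such a warm-up-then-decay schedule. Only the peak rate (0.001) and the warm-up length (10000 steps) come from the claim; the linear ramp shape, the decay milestones, and the decay factor are illustrative assumptions.

```python
def learning_rate(step, base_lr=0.001, warmup_steps=10000,
                  decay_steps=(60000, 80000), decay_factor=0.1):
    """Linear warm-up to base_lr over warmup_steps, then step decay at
    each milestone in decay_steps (milestones are assumed values)."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps   # linear ramp-up
    lr = base_lr
    for milestone in decay_steps:
        if step >= milestone:
            lr *= decay_factor                       # multiplicative decay
    return lr
```

Warming up avoids large, destabilizing gradient updates while the randomly initialized heads are still inaccurate; the later decay lets training settle into a finer optimum.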
8. A computer device, characterized by: comprising a memory and a processor, the memory having stored therein a computer program which, when executed by the processor, performs the Mask RCNN-based building damage assessment method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110876141.XA CN113657214B (en) | 2021-07-30 | 2021-07-30 | Building damage assessment method based on Mask RCNN |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113657214A CN113657214A (en) | 2021-11-16 |
CN113657214B true CN113657214B (en) | 2024-04-02 |
Family
ID=78478203
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113657214B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111210443A (en) * | 2020-01-03 | 2020-05-29 | 吉林大学 | Deformable convolution mixing task cascading semantic segmentation method based on embedding balance |
CN111461110A (en) * | 2020-03-02 | 2020-07-28 | 华南理工大学 | Small target detection method based on multi-scale image and weighted fusion loss |
CN112560671A (en) * | 2020-12-15 | 2021-03-26 | 哈尔滨工程大学 | Ship detection method based on rotary convolution neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||