CN116071607B - Reservoir aerial image classification and image segmentation method and system based on residual network - Google Patents

Info

Publication number
CN116071607B
Authority
CN
China
Prior art keywords
image
classification
segmentation
network
reservoir
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310212759.5A
Other languages
Chinese (zh)
Other versions
CN116071607A (en)
Inventor
霍吉东
张�杰
王际朝
杨俊钢
阮宗利
王娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China University of Petroleum East China
Original Assignee
China University of Petroleum East China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China University of Petroleum East China
Priority to CN202310212759.5A
Publication of CN116071607A
Application granted
Publication of CN116071607B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G06V20/17 Terrestrial scenes taken from planes or by drones
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/52 Scale-space analysis, e.g. wavelet analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/70 Labelling scene content, e.g. deriving syntactic or semantic representations
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A10/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
    • Y02A10/40 Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Remote Sensing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a reservoir aerial image classification and image segmentation method and system based on a residual network, and relates to the technical field of artificial intelligence. The method comprises the following steps: acquiring a reservoir aerial image to be detected; inputting the preprocessed reservoir aerial image to be detected into an image classification model, extracting features of the reservoir aerial image to be detected through a ResNet feature extraction network with dilated convolution, and outputting a category prediction from the image features through a classification network based on global average pooling; inputting reservoir aerial images whose predicted category is land-water intersection image into an image segmentation model, extracting candidate region features of the land-water intersection image through a segmentation network based on multi-scale fusion, obtaining category prediction values of the candidate regions through a classification network based on global average pooling, and outputting the water areas and land areas segmented in the land-water intersection image based on the category prediction values. The invention can classify reservoir aerial images and segment land and water regions with high accuracy.

Description

Reservoir aerial image classification and image segmentation method and system based on residual network
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a reservoir aerial image classification and image segmentation method and system based on a residual network.
Background
A reservoir is an artificial lake formed by building a dam across the narrow mouth of a mountain valley or river. Once built, a reservoir serves flood control, water storage and irrigation, water supply, power generation, fish farming and similar purposes. Acquiring reservoir aerial images with unmanned aerial vehicles (UAVs) and other data acquisition equipment, and monitoring the water surface area of the reservoir from those images, benefits fields such as fishery management and flood control monitoring. It is therefore of great significance to classify reservoir aerial images, segment them into land and water, and identify the water area of the reservoir.
At present, schemes that classify and segment images with a traditional convolutional neural network can perform both tasks, but with poor results. Owing to natural environmental factors such as season and illumination, the water regions in aerial images of different reservoirs differ greatly, so the features extracted by existing methods also differ greatly, which degrades the final classification and segmentation. That is, existing feature extraction network models cannot fully mine land and water region feature information from data acquired under complex conditions.
A traditional convolutional neural network generally uses a fully connected layer for classification, but the fully connected layer has redundant parameters and is prone to overfitting. In addition, in common image segmentation tasks the region to be segmented generally occupies only a small part of the image, so such schemes are difficult to apply when the proportion of water area in a reservoir aerial image varies greatly, and cannot achieve well-performing semantic segmentation results.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a reservoir aerial image classification and image segmentation method and system based on a residual network, which uses an independently constructed reservoir aerial image data set, makes full use of deep learning to mine the differing visual feature information of water areas and land areas, and combines image recognition and semantic segmentation networks to achieve highly accurate and robust classification of reservoir aerial images and segmentation of the land and water regions in them.
In a first aspect, the present disclosure provides a reservoir aerial image classification and image segmentation method based on a residual network, including:
acquiring a reservoir aerial image to be detected, and preprocessing the reservoir aerial image to be detected;
inputting the preprocessed reservoir aerial image to be detected into an image classification model, extracting features of the reservoir aerial image to be detected through a ResNet feature extraction network with dilated convolution, and outputting a category prediction from the image features through a classification network based on global average pooling;
inputting reservoir aerial images whose predicted category is land-water intersection image into an image segmentation model, extracting candidate region features of the land-water intersection image through a segmentation network based on multi-scale fusion, obtaining category prediction values of the candidate regions through a classification network based on global average pooling, and outputting the water areas and land areas segmented in the land-water intersection image based on the category prediction values.
According to a further technical scheme, the training process of the image classification model and the image segmentation model comprises the following steps:
acquiring reservoir aerial images to construct an original data set, and labeling the reservoir aerial images in the original data set with land-water categories to construct a classification training sample set; labeling the land and water regions of the land-water intersection images among the reservoir aerial images to construct a land-water intersection segmentation training sample set;
constructing an image classification model and an image segmentation model; the image classification model comprises a ResNet feature extraction network with dilated convolution and a classification network based on global average pooling, and the image segmentation model comprises a segmentation network based on multi-scale fusion and a classification network based on global average pooling;
and training the image classification model and the image segmentation model with the classification training sample set and the land-water intersection segmentation training sample set to complete model training.
According to a further technical scheme, the method further comprises preprocessing the reservoir aerial images before labeling the reservoir aerial images in the original data set; the preprocessing includes unifying image sizes and deleting duplicate images.
According to a further technical scheme, the ResNet feature extraction network with dilated convolution is used for extracting features of reservoir aerial images; the classification network based on global average pooling is used for category prediction from the input features; the segmentation network based on multi-scale fusion is used for extracting candidate region features of the land-water intersection image.
According to a further technical scheme, the ResNet feature extraction network with dilated convolution is constructed on the basis of the first 5 convolution layers of the residual network ResNet-50, as follows:
introducing dilated convolution kernels into the 4th and 5th convolution layers of the ResNet-50 base network, and setting the convolution strides and dilation rates of the 4th and 5th convolution layers, to obtain a ResNet feature extraction network with dilated convolution suited to the semantic segmentation of land-water intersection images.
According to a further technical scheme, the classification network based on global average pooling is a global average pooling layer: after a feature map of size H×W with D channels is input into the global average pooling layer, the H×W×D input is convolved with C convolution kernels of size 1×1×D to obtain an H×W×C feature map, where C is the total number of categories; finally, average pooling with a pooling kernel of the same size as the feature map yields the final category prediction values.
According to a further technical scheme, the segmentation network based on multi-scale fusion is used for extracting candidate region features of the land-water intersection image, and the feature extraction comprises:
acquiring water area candidate regions in the land-water intersection image using a selective search algorithm, wherein each generated candidate region comprises a bounding box, a foreground mask and a foreground size;
scaling each candidate region to a plurality of different scales, and inputting the water area candidate regions at the different scales, together with the feature maps extracted by the feature extraction network, into the segmentation network based on multi-scale fusion;
and mapping the features obtained through the feature extraction network onto each candidate region using the ROI pooling layer to obtain initial region features, and multiplying the initial region features by the corresponding foreground mask on each channel to obtain the candidate region features.
According to a further technical scheme, after the candidate region features of the land-water intersection image are extracted, global average pooling is applied to the candidate region features to obtain the category prediction value of each candidate region; the region category prediction values are then mapped to each pixel in the candidate region through the region-to-pixel layer, completing pixel-level classification of the land-water intersection image and segmenting the water area in it.
In a second aspect, the present disclosure provides a reservoir aerial image classification and image segmentation system based on a residual network, comprising:
the image acquisition and preprocessing module is used for acquiring the reservoir aerial image to be detected and preprocessing it;
the image classification module is used for inputting the preprocessed reservoir aerial image to be detected into an image classification model, extracting features of the reservoir aerial image to be detected through a ResNet feature extraction network with dilated convolution, and outputting a category prediction from the image features through a classification network based on global average pooling;
the image segmentation module is used for inputting reservoir aerial images whose predicted category is land-water intersection image into the image segmentation model, extracting candidate region features of the land-water intersection image through a segmentation network based on multi-scale fusion, obtaining category prediction values of the candidate regions through a classification network based on global average pooling, and outputting the water areas and land areas segmented in the land-water intersection image based on the category prediction values.
According to a further technical scheme, the training process of the image classification model and the image segmentation model comprises the following steps:
acquiring reservoir aerial images to construct an original data set, and labeling the reservoir aerial images in the original data set with land-water categories to construct a classification training sample set; labeling the land and water regions of the land-water intersection images among the reservoir aerial images to construct a land-water intersection segmentation training sample set;
constructing an image classification model and an image segmentation model; the image classification model comprises a ResNet feature extraction network with dilated convolution and a classification network based on global average pooling, and the image segmentation model comprises a segmentation network based on multi-scale fusion and a classification network based on global average pooling;
and training the image classification model and the image segmentation model with the classification training sample set and the land-water intersection segmentation training sample set to complete model training.
One or more of the above technical schemes have the following beneficial effects:
1. The invention provides a reservoir aerial image classification and image segmentation method and system based on a residual network, which use an independently constructed reservoir aerial image data set, make full use of deep learning to mine the differing visual feature information of water areas and land areas, and combine image recognition and semantic segmentation networks, so that the model detects water areas in reservoir aerial images with high accuracy and strong robustness.
2. In the invention, a residual network with dilated convolution is used for feature extraction, yielding a high-resolution feature map that contains more information and enhancing the adaptability and reliability of the model; the classification method based on global average pooling and the segmentation network based on multi-scale fusion improve the generalization ability and performance of the model; and the land-water image classification results and the land-water intersection image segmentation results are fully combined, achieving high-quality, high-accuracy water area detection and segmentation and solving the problem of accurately detecting water areas in reservoir aerial images.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
FIG. 1 is a flow chart of a reservoir aerial image classification and image segmentation method based on a residual network according to an embodiment of the invention;
FIG. 2 is a flowchart of an image classification model and image segmentation model training process according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a feature extraction network according to an embodiment of the invention;
FIG. 4 is a schematic diagram of a classification network according to an embodiment of the invention;
FIG. 5 is a schematic overall flow chart of reservoir aerial image classification and image segmentation based on a residual network according to an embodiment of the invention.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
Example 1
The embodiment provides a reservoir aerial image classification and image segmentation method based on a residual network, as shown in fig. 1, comprising the following steps:
acquiring a reservoir aerial image to be detected, and preprocessing the reservoir aerial image to be detected;
inputting the preprocessed reservoir aerial image to be detected into an image classification model, extracting features of the reservoir aerial image to be detected through a ResNet feature extraction network with dilated convolution, and outputting a category prediction from the image features through a classification network based on global average pooling;
inputting reservoir aerial images whose predicted category is land-water intersection image into an image segmentation model, extracting candidate region features of the land-water intersection image through a segmentation network based on multi-scale fusion, obtaining category prediction values of the candidate regions through a classification network based on global average pooling, and outputting the water areas and land areas segmented in the land-water intersection image based on the category prediction values.
Further, after the reservoir aerial image to be detected is acquired, it is preprocessed; the preprocessing includes unifying the image size, deleting repeated images and the like.
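The two-stage flow above (classify first, segment only the land-water intersection images) can be sketched as follows. This is a minimal illustration, not the patented implementation: the category label strings, the `run_pipeline` function and the `toy_classify` / `toy_segment` stand-ins are hypothetical names introduced only to show the control flow.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical category labels; the text names the three categories but not their encoding.
PURE_WATER = "pure_water"
PURE_LAND = "pure_land"
INTERSECTION = "land_water_intersection"

Image = List[List[float]]  # placeholder image type: rows of pixel values
Mask = List[List[int]]     # 1 = water, 0 = land


def run_pipeline(
    image: Image,
    classify: Callable[[Image], str],
    segment: Callable[[Image], Mask],
) -> Tuple[str, Optional[Mask]]:
    """Classify the image; run segmentation only for land-water intersection images."""
    category = classify(image)
    if category == INTERSECTION:
        return category, segment(image)
    return category, None  # pure water / pure land images skip the segmentation model


# Stand-in models (assumptions), used only to exercise the control flow.
def toy_classify(image: Image) -> str:
    flat = [v for row in image for v in row]
    water = sum(1 for v in flat if v > 0.5)
    if water == len(flat):
        return PURE_WATER
    if water == 0:
        return PURE_LAND
    return INTERSECTION


def toy_segment(image: Image) -> Mask:
    return [[1 if v > 0.5 else 0 for v in row] for row in image]
```

In a real system the two callables would wrap the trained image classification model and image segmentation model.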
The training process of the image classification model and the image segmentation model, as shown in fig. 2, includes the following steps:
S1, acquiring reservoir aerial images to construct an original data set, and labeling the reservoir aerial images in the original data set with land-water categories to construct a classification training sample set; labeling the land and water regions of the land-water intersection images among the reservoir aerial images to construct a land-water intersection segmentation training sample set.
In this embodiment, a UAV is used for autonomous aerial photography to acquire reservoir aerial images and construct an original data set. On the basis of the original data set, a reservoir aerial image classification data set and a land-water intersection segmentation data set are constructed through classification labeling and segmentation labeling; these serve as the classification training sample set and the land-water intersection segmentation training sample set, respectively, for training the classification model and the segmentation model.
Specifically, first, reservoir aerial images are acquired by UAV under different shooting angles and background environments, and an original data set of reservoir aerial images is constructed independently; next, the original data set is divided according to the land-water ratio into three categories of images (pure water area, pure land area, and land-water intersection) to construct the reservoir aerial image classification data set; finally, on the basis of the land-water intersection images obtained by classification, the land and water regions in each image are segmented and labeled using the labelme labeling tool to construct the land-water intersection segmentation data set of reservoir aerial images.
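As an illustration of dividing images into three categories by land-water ratio, the sketch below derives a category from a binary water mask. It assumes such a mask is already available, which the text does not state; the function names, the label strings and the tolerance `eps` are hypothetical.

```python
from typing import List


def water_ratio(mask: List[List[int]]) -> float:
    """Fraction of pixels marked as water (1) in a binary mask."""
    total = sum(len(row) for row in mask)
    water = sum(sum(row) for row in mask)
    return water / total


def category_from_ratio(ratio: float, eps: float = 0.0) -> str:
    """Map a water ratio to one of the three categories.

    eps is an assumed tolerance: ratios within eps of 0 or 1 count as pure land or pure water.
    """
    if ratio >= 1.0 - eps:
        return "pure_water"
    if ratio <= eps:
        return "pure_land"
    return "land_water_intersection"
```

For example, `category_from_ratio(water_ratio([[1, 0]]))` yields the land-water intersection category, since half of the pixels are water.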
Preferably, the reservoir aerial images are preprocessed before the reservoir aerial images in the original data set are labeled; the preprocessing includes unifying the image size, deleting repeated images and the like.
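A minimal sketch of the two preprocessing operations named above, under stated assumptions: images are small grayscale nested lists, "repeated" means byte-identical content, and resizing uses nearest-neighbour sampling. The text specifies none of these details, and the helper names are hypothetical.

```python
import hashlib
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values (simplification)


def resize_nearest(image: Image, out_h: int, out_w: int) -> Image:
    """Resize by nearest-neighbour sampling (an assumed interpolation choice)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def drop_exact_duplicates(images: List[Image]) -> List[Image]:
    """Keep the first occurrence of each image, comparing content hashes."""
    seen = set()
    unique = []
    for img in images:
        digest = hashlib.sha256(repr(img).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(img)
    return unique


def preprocess(images: List[Image], size: int) -> List[Image]:
    """Delete repeated images, then unify all remaining images to size x size."""
    return [resize_nearest(img, size, size) for img in drop_exact_duplicates(images)]
```

Near-duplicate frames from consecutive UAV shots would need a perceptual similarity measure instead of an exact hash; that is outside this sketch.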
S2, constructing an image classification model and an image segmentation model, wherein the image classification model comprises a ResNet feature extraction network with dilated convolution and a classification network based on global average pooling; the image segmentation model comprises a segmentation network based on multi-scale fusion and a classification network based on global average pooling.
First, the ResNet feature extraction network with dilated convolution is used for extracting features of reservoir aerial images. In this embodiment, the ResNet-50 feature extraction network is improved according to the land and water region characteristics of reservoir aerial images, and a ResNet feature extraction network with dilated convolution, suited to the semantic segmentation of land-water intersection images, is constructed.
Specifically, the ResNet feature extraction network with dilated convolution is constructed on the basis of the first 5 convolution layers of the residual network ResNet-50. Its structure is shown in fig. 3: a normalized reservoir aerial image (RGB image) of size 224×224 is input into the network, features of the reservoir aerial image are extracted, and a feature map of size 28×28 is output. Dilated convolution kernels are introduced into the 4th layer ConV4 and the 5th layer ConV5 of the ResNet-50 base network; the convolution stride of both ConV4 and ConV5 is set to 1, the dilation rate of ConV4 is set to 2, and the dilation rate of ConV5 is set to 4, yielding the ResNet feature extraction network with dilated convolution suited to the semantic segmentation of land-water intersection images.
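The 224×224 to 28×28 figure can be checked with the standard convolution output-size formula, out = floor((n + 2p - d(k - 1) - 1) / s) + 1. The sketch below is a simplification under stated assumptions: each ResNet-50 stage is represented only by the single convolution that carries its stride and dilation, and the padding of each dilated 3×3 convolution is assumed equal to its dilation rate. The stage tuples and function names are illustrative.

```python
from typing import List, Tuple

Stage = Tuple[str, int, int, int, int]  # (name, kernel, stride, padding, dilation)


def conv_out(n: int, k: int, s: int = 1, p: int = 0, d: int = 1) -> int:
    """Spatial output size of a convolution or pooling layer."""
    return (n + 2 * p - d * (k - 1) - 1) // s + 1


def trace(stages: List[Stage], n: int = 224) -> List[Tuple[str, int]]:
    """Propagate the spatial size n through the listed stages."""
    sizes = []
    for name, k, s, p, d in stages:
        n = conv_out(n, k, s, p, d)
        sizes.append((name, n))
    return sizes


STEM: List[Stage] = [
    ("conv1", 7, 2, 3, 1),
    ("maxpool", 3, 2, 1, 1),
    ("conv2", 3, 1, 1, 1),
    ("conv3", 3, 2, 1, 1),
]
# Original ResNet-50: ConV4 and ConV5 each halve the resolution.
ORIGINAL = STEM + [("conv4", 3, 2, 1, 1), ("conv5", 3, 2, 1, 1)]
# Modified network: stride 1, dilation 2 and 4 (padding = dilation is an assumption).
DILATED = STEM + [("conv4", 3, 1, 2, 2), ("conv5", 3, 1, 4, 4)]
```

Calling `trace(ORIGINAL)` ends at 14 and 7 for ConV4 and ConV5, while `trace(DILATED)` stays at 28 for both.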
In this embodiment, the ResNet-50 feature extraction network is improved according to the land and water region characteristics of reservoir aerial images. The output resolutions of the improved ConV4 and ConV5 layers are 2 times and 4 times those of the corresponding layers in the original residual network ResNet-50, respectively. Because the dilated convolution operation does not change the scale of the feature map, the final output feature map is 28×28, which retains more spatial information of the image.
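To illustrate why a dilated convolution keeps the feature map size while widening the receptive field, here is a small pure-Python sketch of a stride-1, single-channel dilated convolution with zero padding. It is a didactic stand-in, not the network's actual layer; the function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def effective_kernel(k: int, d: int) -> int:
    """Extent covered by a k x k kernel with dilation rate d."""
    return d * (k - 1) + 1


def dilated_conv2d(x: Matrix, w: Matrix, d: int) -> Matrix:
    """Stride-1 dilated cross-correlation with zero padding d*(k-1)//2.

    With an odd kernel size the output has the same height and width as x.
    """
    h, wd = len(x), len(x[0])
    k = len(w)
    pad = d * (k - 1) // 2
    out = [[0.0] * wd for _ in range(h)]
    for i in range(h):
        for j in range(wd):
            acc = 0.0
            for a in range(k):
                for b in range(k):
                    y = i + a * d - pad
                    z = j + b * d - pad
                    if 0 <= y < h and 0 <= z < wd:  # positions outside x act as zeros
                        acc += w[a][b] * x[y][z]
            out[i][j] = acc
    return out
```

A 3×3 kernel spans 5×5 input pixels at dilation rate 2 and 9×9 at rate 4, without adding weights or shrinking the output.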
Second, the classification network based on global average pooling is used for category prediction from the input features. In this embodiment, a classification network based on global average pooling is constructed to achieve accurate land-water classification of reservoir aerial images; the same network is also suited to the semantic segmentation of land-water intersection images.
Specifically, as shown in fig. 4, a global average pooling layer applicable to both image classification and semantic segmentation is constructed to replace the traditional fully connected layer for classification. The global average pooling layer takes as input a feature map of size (H, W) with D channels, convolves the H×W×D input with C convolution kernels of size 1×1×D to obtain an H×W×C feature map, where C is the total number of categories, and finally applies average pooling with a pooling kernel of the same size as the feature map to obtain the final category prediction values.
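The head described above (C kernels of size 1×1×D followed by average pooling over the whole H×W map) can be sketched in pure Python as follows. The feature layout `[h][w][d]`, the bias term and the function names are assumptions made for illustration.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # indexed [h][w][channel]


def conv1x1(features: FeatureMap, weights: List[List[float]], bias: List[float]) -> FeatureMap:
    """Apply C kernels of size 1x1xD: weights is C x D, output is H x W x C."""
    return [
        [
            [sum(wk * v for wk, v in zip(w_c, pixel)) + b for w_c, b in zip(weights, bias)]
            for pixel in row
        ]
        for row in features
    ]


def global_average_pool(score_maps: FeatureMap) -> List[float]:
    """Average each of the C score maps over all H x W positions."""
    h, w = len(score_maps), len(score_maps[0])
    c = len(score_maps[0][0])
    return [
        sum(score_maps[i][j][k] for i in range(h) for j in range(w)) / (h * w)
        for k in range(c)
    ]


def gap_head(features: FeatureMap, weights: List[List[float]], bias: List[float]) -> List[float]:
    """Category prediction values for an H x W x D input of any spatial size."""
    return global_average_pool(conv1x1(features, weights, bias))
```

Because the pooling window always matches the map size, the same weights work for any H and W, which is what lets one head serve both whole images and candidate regions.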
In this embodiment, when constructing the global average pooling layer applicable to both image classification and semantic segmentation, convolving the H×W×D input features with 1×1×D convolution kernels integrates information across channels, which improves the segmentation accuracy of land-water intersection regions.
The global average pooling layer constructed in this embodiment accepts input features of any scale and applies global average pooling to each feature map to obtain one output per map; each output represents a category prediction value, which realizes classification prediction for reservoir aerial images. Compared with an ordinary fully connected layer, the global average pooling layer constructed in this embodiment is better matched to the convolutional structure and strengthens the correspondence between feature maps and categories. Moreover, because the pooling itself has no parameters to optimize, the number of parameters is greatly reduced, which accelerates training and lowers the risk of overfitting.
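A rough parameter count makes the reduction concrete. The sketch compares a fully connected layer applied to the flattened H×W×D feature map with the C 1×1×D kernels used before pooling. The values are illustrative assumptions: H = W = 28 follows the stated output size, C = 3 follows the three image categories, and D = 2048 is the channel width of the last ResNet-50 stage, which the text does not state explicitly.

```python
def fc_params(h: int, w: int, d: int, c: int) -> int:
    """Weights plus biases of a fully connected layer on a flattened H x W x D input."""
    return h * w * d * c + c


def gap_head_params(d: int, c: int) -> int:
    """Weights plus biases of C kernels of size 1x1xD; the pooling adds none."""
    return d * c + c


fc = fc_params(28, 28, 2048, 3)   # 4_816_899
gap = gap_head_params(2048, 3)    # 6_147
```

Under these assumptions the pooled head has 784 times fewer weights (28 × 28 = 784), since the spatial dimensions no longer multiply the weight count.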
Finally, the segmentation network based on multi-scale fusion is used for extracting candidate region features of the land-water intersection image. In this embodiment, for the land-water intersection images obtained by classification, a segmentation network based on multi-scale fusion is constructed to obtain a highly robust semantic segmentation model.
Specifically, the candidate region features of the land-water intersection image are extracted by the segmentation network based on multi-scale fusion in the following steps:
S2.1, acquiring water area candidate regions in the land-water intersection image using a selective search algorithm, wherein each generated candidate region comprises a bounding box, a foreground mask and a foreground size; the bounding box gives the position of the candidate region in the original image, and the foreground mask is a binary mask that covers the candidate region and marks the water area.
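One possible in-memory representation of a candidate region with the three components named in S2.1 is sketched below. The class name, field names, the (top, left, bottom, right) convention and the reading of "foreground size" as a pixel count are all assumptions; the selective search algorithm itself is not implemented here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CandidateRegion:
    bbox: Tuple[int, int, int, int]  # (top, left, bottom, right), bottom/right exclusive
    mask: List[List[int]]            # binary mask cropped to bbox, 1 = foreground (water)
    size: int                        # number of foreground pixels (assumed meaning)


def region_from_mask(full_mask: List[List[int]]) -> CandidateRegion:
    """Build a candidate region from a full-image binary mask of one proposal."""
    coords = [(y, x) for y, row in enumerate(full_mask) for x, v in enumerate(row) if v]
    if not coords:
        raise ValueError("mask has no foreground pixels")
    top = min(y for y, _ in coords)
    bottom = max(y for y, _ in coords) + 1
    left = min(x for _, x in coords)
    right = max(x for _, x in coords) + 1
    cropped = [[1 if v else 0 for v in row[left:right]] for row in full_mask[top:bottom]]
    return CandidateRegion((top, left, bottom, right), cropped, len(coords))
```

The bounding box locates the region in the original image, while the cropped mask distinguishes foreground from the rest of the box.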
S2.2, scaling each candidate region to a plurality of different scales, mapping the features obtained from the original image through the feature extraction network onto each candidate region at each scale using the ROI pooling layer to obtain initial region features at the different scales, and, for each initial region feature, multiplying it by the corresponding foreground mask on each channel to obtain the water area candidate region features.
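The ROI pooling and mask multiplication in S2.2 can be sketched for a single output scale as follows. Max pooling within each bin, the `[channel][h][w]` feature layout and the function names are assumptions; the foreground mask is taken to be already resized to the output scale.

```python
from typing import List, Tuple

Channel = List[List[float]]


def ceil_div(a: int, b: int) -> int:
    return -(-a // b)


def roi_max_pool(channel: Channel, bbox: Tuple[int, int, int, int], out_size: int) -> Channel:
    """Max-pool the bbox area of one channel into an out_size x out_size grid."""
    top, left, bottom, right = bbox
    h, w = bottom - top, right - left
    pooled = []
    for i in range(out_size):
        y0 = top + (i * h) // out_size
        y1 = top + ceil_div((i + 1) * h, out_size)
        row = []
        for j in range(out_size):
            x0 = left + (j * w) // out_size
            x1 = left + ceil_div((j + 1) * w, out_size)
            row.append(max(channel[y][x] for y in range(y0, y1) for x in range(x0, x1)))
        pooled.append(row)
    return pooled


def region_features(
    feature_map: List[Channel],
    bbox: Tuple[int, int, int, int],
    mask_at_scale: List[List[int]],
    out_size: int,
) -> List[Channel]:
    """Pool every channel over the ROI, then zero out background cells with the mask."""
    features = []
    for channel in feature_map:
        pooled = roi_max_pool(channel, bbox, out_size)
        features.append(
            [[f * m for f, m in zip(f_row, m_row)] for f_row, m_row in zip(pooled, mask_at_scale)]
        )
    return features
```

Multiplying by the mask on every channel keeps only responses that fall on the foreground of the candidate region.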
Because water areas in the land-water intersection image dataset are small and vary in size, a single scale is not suitable. This embodiment therefore provides a multi-scale fusion scheme with four scales: 3×3, 5×5, 7×7 and 11×11. The original feature map is fused with these four scales, so that one feature map carries boxes of several scales; that is, the detection boxes in the feature map can be regarded as having several sizes. The multi-scale fusion is implemented in a manner similar to the Fast R-CNN network.
In this embodiment, to fully preserve the spatial detail of candidate regions in the image, and considering that feature maps of different sizes contain different detail information, each candidate region is scaled to 4 different scales (3×3, 5×5, 7×7, 11×11) and input into the segmentation network, and candidate region features of the corresponding scales are obtained in the ROI pooling layer. The feature map then contains water-area candidate region features at several different scales.
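The ROI pooling and mask multiplication of step S2.2 at the four scales can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: max pooling is used inside each bin, the foreground mask is assumed to be given at feature-map resolution and is pooled with the same bins, and the names `roi_max_pool` and `masked_region_features` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Map2D = List[List[float]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), ends exclusive


def roi_max_pool(channel: Map2D, box: Box, size: int) -> Map2D:
    """Max-pool the boxed area of one channel into a size x size grid."""
    top, left, bottom, right = box
    height, width = bottom - top, right - left
    pooled = []
    for i in range(size):
        r0 = top + (i * height) // size                   # floor of bin start
        r1 = top + ((i + 1) * height + size - 1) // size  # ceiling of bin end
        row = []
        for j in range(size):
            c0 = left + (j * width) // size
            c1 = left + ((j + 1) * width + size - 1) // size
            row.append(max(channel[r][c]
                           for r in range(r0, r1) for c in range(c0, c1)))
        pooled.append(row)
    return pooled


def masked_region_features(channels: Sequence[Map2D], mask: Map2D, box: Box,
                           scales: Sequence[int] = (3, 5, 7, 11)
                           ) -> Dict[int, List[Map2D]]:
    """Pool every channel at each scale and multiply by the pooled mask."""
    features = {}
    for size in scales:
        pooled_mask = roi_max_pool(mask, box, size)  # assumption: same bins
        features[size] = [
            [[v * m for v, m in zip(f_row, m_row)]
             for f_row, m_row in zip(roi_max_pool(channel, box, size), pooled_mask)]
            for channel in channels
        ]
    return features


# One 6x6 channel; the foreground (water) occupies the four left-hand columns.
feature = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
mask = [[1.0 if c < 4 else 0.0 for c in range(6)] for r in range(6)]
feats = masked_region_features([feature], mask, (0, 0, 6, 6))
print(feats[3][0])  # [[7.0, 9.0, 0.0], [19.0, 21.0, 0.0], [31.0, 33.0, 0.0]]
```

The bins overlap when the target grid is larger than the region (for example 11×11 on a 6×6 box), so every output cell still covers at least one input cell.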
After the candidate region features of the land-water intersection image are extracted, they are passed through global average pooling to obtain the category prediction value of each candidate region. The region-to-pixel layer then maps the region category prediction values to each pixel in the candidate region, realizing pixel-level classification of the land-water intersection image and segmenting out its water area.
Specifically, for a candidate region $r$, the activation value $a_{r,i}$ for category $i$ is obtained after processing by the global average pooling layer of the image segmentation model. The region-to-pixel layer selects, from all regions containing pixel $p$, the largest activation value as the activation value of pixel $p$, i.e.
$$a_{p,i} = \max_{r \,:\, p \in r} a_{r,i}$$
Softmax regression is used to classify the semantic segmentation targets of the land-water intersection image. The Softmax layer gives the probability that pixel $p$ belongs to water or land:
$$P_{p,i} = \frac{\exp(a_{p,i})}{\sum_{j=1}^{C} \exp(a_{p,j})}$$
where $i$ denotes a category to which pixel $p$ may belong, $P_{p,i}$ denotes the probability that pixel $p$ is of category $i$, and $C$ denotes the number of all categories.
Since the goal of semantic segmentation of the land-water intersection image is to assign a semantic label to each pixel, which amounts to a binary classification problem, this embodiment uses Softmax regression for the classification, which can improve the robustness and accuracy of reservoir aerial image classification.
Finally, the semantic category of pixel $p$ is predicted according to $P_{p,i}$:
$$\hat{y}_p = \arg\max_{i} P_{p,i}$$
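The region-to-pixel mapping, the Softmax step, and the final category choice can be sketched together in plain Python. The sketch assumes that the maximum over regions is taken separately for each category, and that pixels covered by no candidate region receive the marker -1; both choices, and all function names, are illustrative assumptions rather than details stated in the patent.

```python
import math
from typing import List, Sequence, Set, Tuple

Region = Tuple[Set[Tuple[int, int]], Sequence[float]]  # (pixels, class scores)
NEG_INF = float("-inf")


def region_to_pixel(regions: Sequence[Region], height: int, width: int,
                    num_classes: int) -> List[List[List[float]]]:
    """Give each pixel the largest activation among the regions containing it."""
    act = [[[NEG_INF] * num_classes for _ in range(width)] for _ in range(height)]
    for pixels, scores in regions:
        for r, c in pixels:
            act[r][c] = [max(a, s) for a, s in zip(act[r][c], scores)]
    return act


def softmax(values: Sequence[float]) -> List[float]:
    """Numerically stable Softmax over one pixel's activations."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def predict_labels(act: List[List[List[float]]]) -> List[List[int]]:
    """Pick the most probable category per pixel (-1 if no region covers it)."""
    labels = []
    for row in act:
        out = []
        for pixel in row:
            if max(pixel) == NEG_INF:  # assumption: uncovered pixels stay unlabeled
                out.append(-1)
            else:
                probs = softmax(pixel)
                out.append(probs.index(max(probs)))
        labels.append(out)
    return labels


# Categories: 0 = water, 1 = land. Two overlapping regions on a 2x2 image.
regions = [({(0, 0), (0, 1)}, [2.0, 0.5]),
           ({(0, 1), (1, 1)}, [0.1, 3.0])]
labels = predict_labels(region_to_pixel(regions, 2, 2, 2))
print(labels)  # [[0, 1], [-1, 1]]
```

Pixel (0, 1) lies in both regions, so it takes the larger activation for each category, (2.0, 3.0), and is labeled land.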
Referring to fig. 5, in this embodiment, a specific flow for implementing image classification and image segmentation based on an image classification model and an image segmentation model is as follows:
First, the reservoir aerial image is normalized and input into the ResNet feature extraction network with dilated convolution in the image classification model for feature extraction, yielding the corresponding feature map. The classification network based on global average pooling, i.e. the global average pooling layer, then performs category prediction on the feature map and assigns the original reservoir aerial image to one of three classes: pure water area, pure land area, or land-water intersection.
If the image is classified as a pure water area image, it is treated as data to be rejected; if it is classified as a land-water intersection image, it is input into the image segmentation model of the next stage.
Then, a set of water-area candidate regions is generated on the input land-water intersection image. The feature map and the candidate regions scaled to different scales are input into the segmentation network based on multi-scale fusion, and the candidate region features are obtained through the ROI pooling layer. Finally, the candidate regions are classified by the global average pooling classification layer, and the region-to-pixel layer maps the region category information to each pixel in the region to obtain a pixel-level prediction result, segmenting out the water area in the land-water intersection image.
S3: Train the image classification model and the image segmentation model using the classification training sample set and the land-water intersection segmentation training sample set until the model loss falls below a set value or no longer changes, then obtain the model parameters to complete the training of both models.
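The stopping rule in S3 can be written as a small check on the recorded loss history. The sketch below is an assumption-laden illustration: the patent does not state how "no longer changes" is measured, so the tolerance `tol`, the window length `patience`, and the function name `should_stop` are hypothetical.

```python
from typing import Sequence


def should_stop(losses: Sequence[float], threshold: float,
                tol: float = 1e-4, patience: int = 3) -> bool:
    """Stop when the loss is below the set value or has stopped changing."""
    if not losses:
        return False
    if losses[-1] < threshold:  # loss is smaller than the set value
        return True
    if len(losses) > patience:  # loss has plateaued over the recent window
        recent = losses[-(patience + 1):]
        return max(recent) - min(recent) < tol
    return False


print(should_stop([0.9, 0.5, 0.05], threshold=0.1))      # True: below threshold
print(should_stop([0.9, 0.5], threshold=0.1))            # False: still improving
print(should_stop([0.5, 0.5, 0.5, 0.5], threshold=0.1))  # True: no change
```

A training loop would append the loss of each epoch to the list and call this check once per epoch.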
Based on the trained models, the reservoir aerial image to be tested is input into the image classification model and the image segmentation model, thereby realizing land-water classification of the reservoir aerial image to be tested and land-water segmentation of the land-water intersection images.
In summary, the reservoir aerial image classification and image segmentation method based on the residual network provided in this embodiment uses a self-built reservoir aerial image dataset and applies deep learning to mine the visual features that distinguish water areas from land areas. By combining an image recognition network with a semantic segmentation network, the model achieves higher accuracy and stronger robustness for water-area detection in reservoir aerial images. Extracting features with the residual network with dilated convolution yields a high-resolution feature map containing more information, which strengthens the adaptability and reliability of the model. Using the global average pooling classification method together with the segmentation network based on multi-scale fusion improves the generalization ability and performance of the model. The land-water image classification results and the land-water intersection image segmentation results are fully fused, achieving high-quality, high-accuracy water-area detection and segmentation and solving the problem of accurately detecting water areas in reservoir aerial images.
Example two
This embodiment provides a reservoir aerial image classification and image segmentation system based on a residual network, comprising:
the image acquisition and preprocessing module is used for acquiring the reservoir aerial image to be detected and preprocessing the reservoir aerial image to be detected;
the image classification module is used for inputting the preprocessed reservoir aerial image to be detected into an image classification model, extracting the features of the reservoir aerial image to be detected through the ResNet feature extraction network with expansion convolution, and outputting category prediction through a classification network based on global average pooling according to the image features;
the image segmentation module is used for inputting the reservoir aerial image with the predicted category of the amphibious intersection image into the image segmentation model, extracting the candidate region characteristics of the amphibious intersection image through a segmentation network based on multi-scale fusion, obtaining the category predicted value of the candidate region through a classification network based on global average pooling, and outputting the water area and land segmented in the amphibious intersection image based on the category predicted value.
The system or the device provided in the embodiment of the present application may be specifically used to execute the scheme provided in the embodiment of the method corresponding to the embodiment of fig. 1, and specific functions and technical effects that can be achieved are not described herein again.
The steps involved in the second embodiment correspond to those of the first embodiment of the method, and the detailed description of the second embodiment can be found in the related description section of the first embodiment.
It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented by general-purpose computer means, alternatively they may be implemented by program code executable by computing means, whereby they may be stored in storage means for execution by computing means, or they may be made into individual integrated circuit modules separately, or a plurality of modules or steps in them may be made into a single integrated circuit module. The present invention is not limited to any specific combination of hardware and software.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
While the foregoing description of the embodiments of the present invention has been presented in conjunction with the drawings, it should be understood that it is not intended to limit the scope of the invention, but rather, it is intended to cover all modifications or variations within the scope of the invention as defined by the claims of the present invention.

Claims (5)

1. A reservoir aerial image classification and image segmentation method based on a residual network is characterized by comprising the following steps:
acquiring an aerial image of a reservoir to be detected, and preprocessing the aerial image of the reservoir to be detected;
inputting the preprocessed reservoir aerial image to be detected into an image classification model, extracting the characteristics of the reservoir aerial image to be detected through a ResNet characteristic extraction network with expansion convolution, and outputting category prediction through a classification network based on global average pooling according to the image characteristics;
inputting a reservoir aerial image with a predicted class of land-water crossing images into an image segmentation model, extracting candidate region characteristics of the land-water crossing images through a segmentation network based on multi-scale fusion, obtaining class predicted values of candidate regions through a classification network based on global average pooling, and outputting water areas and land segmented in the land-water crossing images based on the class predicted values;
the training process of the image classification model and the image segmentation model comprises the following steps:
acquiring reservoir aerial images to construct an original data set, and labeling the reservoir aerial images in the original data set with land-water categories to construct a classification training sample set; carrying out land and water region labeling on land and water intersection images in the reservoir aerial image to construct a land and water intersection segmentation training sample set;
constructing an image classification model and an image segmentation model; the image classification model comprises a ResNet feature extraction network with dilation convolution and a classification network based on global average pooling, and the image segmentation model comprises a segmentation network based on multi-scale fusion and a classification network based on global average pooling; the ResNet feature extraction network with the expansion convolution is used for extracting features of reservoir aerial images; the classification network based on global average pooling is used for carrying out class prediction according to input features; the multi-scale fusion-based segmentation network is used for extracting candidate region features of the amphibious intersection image;
training an image classification model and an image segmentation model by using the classification training sample set and the amphibious cross segmentation training sample set to finish training of the model;
the multi-scale fusion-based segmentation network is used for extracting candidate region features of the amphibious intersection image, and the feature extraction comprises:
acquiring a water area candidate region in the amphibious intersection image by using a selective search algorithm, wherein the generated candidate region comprises a bounding box, a foreground mask and a foreground size;
scaling each candidate region to a plurality of different scales, and inputting the water area candidate regions with the different scales and the feature images extracted by the feature extraction network into a segmentation network based on multi-scale fusion;
mapping the features obtained through the feature extraction network onto each candidate region by utilizing the ROI pooling layer to obtain initial region features, and multiplying the initial region features by corresponding foreground masks on each channel to obtain candidate region features;
the method for extracting the candidate region characteristics of the amphibious intersecting image, obtaining a category predicted value of the candidate region through a classification network based on global average pooling, and outputting the water area and land separated from the amphibious intersecting image based on the category predicted value comprises the following steps:
extracting candidate region characteristics of the amphibious intersection image, obtaining a category predicted value of the candidate region through a classification network based on global average pooling, mapping the regional category predicted value to each pixel in the candidate region through a region-to-pixel layer, completing pixel-level prediction classification of the amphibious intersection image, dividing a water area and a land in the amphibious intersection image, and outputting the water area and the land divided in the amphibious intersection image.
2. The method for classifying and segmenting reservoir aerial images based on the residual network as claimed in claim 1, wherein the method further comprises preprocessing reservoir aerial images before labeling the reservoir aerial images in the original dataset; the preprocessing includes unifying image sizes and deleting duplicate images.
3. The reservoir aerial image classification and image segmentation method based on the residual network as set forth in claim 1, wherein constructing the ResNet feature extraction network with expansion convolution based on the first 5 convolution layers of the residual network ResNet-50 comprises:
introducing an expansion convolution kernel into a 4 th layer convolution layer and a 5 th layer convolution layer of the ResNet-50 basic network, setting convolution step sizes and expansion rates of the 4 th layer convolution layer and the 5 th layer convolution layer, and obtaining the ResNet characteristic extraction network with expansion convolution, which is suitable for the task of semantic segmentation of amphibious cross images.
4. The method for classifying and segmenting aerial images of reservoir based on residual network as claimed in claim 1, wherein the global average pooling-based classification network serves as a global average pooling layer; after features with height H, width W and D channels are input into the global average pooling layer, the input H×W×D features are convolved with C convolution kernels of size 1×1×D to obtain an H×W×C feature map, where C is the total number of categories; finally, the feature map is average-pooled with a pooling kernel of the same size as the feature map to obtain the final category prediction values.
5. Reservoir aerial image classification and image segmentation system based on residual error network, characterized by comprising:
the image acquisition and preprocessing module is used for acquiring the reservoir aerial image to be detected and preprocessing the reservoir aerial image to be detected;
the image classification module is used for inputting the preprocessed reservoir aerial image to be detected into an image classification model, extracting the characteristics of the reservoir aerial image to be detected through a ResNet characteristic extraction network with expansion convolution, and outputting category prediction through a classification network based on global average pooling according to the image characteristics;
the image segmentation module is used for inputting the reservoir aerial image with the predicted category of the amphibious intersection image into the image segmentation model, extracting the candidate region characteristics of the amphibious intersection image through a segmentation network based on multi-scale fusion, obtaining the category predicted value of the candidate region through a classification network based on global average pooling, and outputting the water area and land segmented in the amphibious intersection image based on the category predicted value;
the training process of the image classification model and the image segmentation model comprises the following steps:
acquiring reservoir aerial images to construct an original data set, and labeling the reservoir aerial images in the original data set with land-water categories to construct a classification training sample set; carrying out land and water region labeling on land and water intersection images in the reservoir aerial image to construct a land and water intersection segmentation training sample set;
constructing an image classification model and an image segmentation model; the image classification model comprises a ResNet feature extraction network with dilation convolution and a classification network based on global average pooling, and the image segmentation model comprises a segmentation network based on multi-scale fusion and a classification network based on global average pooling; the ResNet feature extraction network with the expansion convolution is used for extracting features of reservoir aerial images; the classification network based on global average pooling is used for carrying out class prediction according to input features; the multi-scale fusion-based segmentation network is used for extracting candidate region features of the amphibious intersection image;
training an image classification model and an image segmentation model by using the classification training sample set and the amphibious cross segmentation training sample set to finish training of the model;
the multi-scale fusion-based segmentation network is used for extracting candidate region features of the amphibious intersection image, and the feature extraction comprises:
acquiring a water area candidate region in the amphibious intersection image by using a selective search algorithm, wherein the generated candidate region comprises a bounding box, a foreground mask and a foreground size;
scaling each candidate region to a plurality of different scales, and inputting the water area candidate regions with the different scales and the feature images extracted by the feature extraction network into a segmentation network based on multi-scale fusion;
mapping the features obtained through the feature extraction network onto each candidate region by utilizing the ROI pooling layer to obtain initial region features, and multiplying the initial region features by corresponding foreground masks on each channel to obtain candidate region features;
the method for extracting the candidate region characteristics of the amphibious intersecting image, obtaining a category predicted value of the candidate region through a classification network based on global average pooling, and outputting the water area and land separated from the amphibious intersecting image based on the category predicted value comprises the following steps:
extracting candidate region characteristics of the amphibious intersection image, obtaining a category predicted value of the candidate region through a classification network based on global average pooling, mapping the regional category predicted value to each pixel in the candidate region through a region-to-pixel layer, completing pixel-level prediction classification of the amphibious intersection image, dividing a water area and a land in the amphibious intersection image, and outputting the water area and the land divided in the amphibious intersection image.
CN202310212759.5A 2023-03-08 2023-03-08 Reservoir aerial image classification and image segmentation method and system based on residual error network Active CN116071607B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310212759.5A CN116071607B (en) 2023-03-08 2023-03-08 Reservoir aerial image classification and image segmentation method and system based on residual error network

Publications (2)

Publication Number Publication Date
CN116071607A CN116071607A (en) 2023-05-05
CN116071607B true CN116071607B (en) 2023-08-08

Family

ID=86178631

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310212759.5A Active CN116071607B (en) 2023-03-08 2023-03-08 Reservoir aerial image classification and image segmentation method and system based on residual error network

Country Status (1)

Country Link
CN (1) CN116071607B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109685067A (en) * 2018-12-26 2019-04-26 江西理工大学 A kind of image, semantic dividing method based on region and depth residual error network
CN110853057A (en) * 2019-11-08 2020-02-28 西安电子科技大学 Aerial image segmentation method based on global and multi-scale full-convolution network
CN111209972A (en) * 2020-01-09 2020-05-29 中国科学院计算技术研究所 Image classification method and system based on hybrid connectivity deep convolution neural network
CN111340047A (en) * 2020-02-28 2020-06-26 江苏实达迪美数据处理有限公司 Image semantic segmentation method and system based on multi-scale feature and foreground and background contrast
CN111640116A (en) * 2020-05-29 2020-09-08 广西大学 Aerial photography graph building segmentation method and device based on deep convolutional residual error network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7103240B2 (en) * 2019-01-10 2022-07-20 日本電信電話株式会社 Object detection and recognition devices, methods, and programs

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Semantic Segmentation of Lotus Leaves in UAV Aerial Images via U-Net and DeepLab-based Networks;Wen-Tse Chiu et al;《 2020 International Computer Symposium (ICS)》;第535-540页 *

Also Published As

Publication number Publication date
CN116071607A (en) 2023-05-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant