CN110443248B - Method and system for eliminating semantic segmentation blocking effect of large-amplitude remote sensing image - Google Patents

Method and system for eliminating semantic segmentation blocking effect of large-amplitude remote sensing image Download PDF

Info

Publication number
CN110443248B
CN110443248B CN201910560692.8A CN201910560692A CN110443248B CN 110443248 B CN110443248 B CN 110443248B CN 201910560692 A CN201910560692 A CN 201910560692A CN 110443248 B CN110443248 B CN 110443248B
Authority
CN
China
Prior art keywords
image
batch
current
remote sensing
calculating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910560692.8A
Other languages
Chinese (zh)
Other versions
CN110443248A (en
Inventor
张觅
胡翔云
赵丽科
魏域君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN201910560692.8A priority Critical patent/CN110443248B/en
Publication of CN110443248A publication Critical patent/CN110443248A/en
Application granted granted Critical
Publication of CN110443248B publication Critical patent/CN110443248B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method and a system for eliminating a semantic segmentation blocking effect of a large-amplitude remote sensing image. And eliminating the extended boundary by using the size of the sliding window and the fusion factor so as to obtain a final fusion result and eliminate the semantic segmentation blocking effect of the large-amplitude remote sensing image.

Description

Method and system for eliminating semantic segmentation blocking effect of large-amplitude remote sensing image
Technical Field
The invention relates to the field of computer vision and remote sensing, in particular to a method and a system for eliminating a semantic segmentation blocking effect of a remote sensing image.
Background
In recent years, with the promotion of large-scale applications such as deep learning, big data and Graphic Processing Unit (GPU), the intelligent interpretation of remote sensing images faces many opportunities and challenges. The high-resolution remote sensing image semantic segmentation technology is a technology for endowing each pixel with a category attribute, can be widely applied to emergency processing tasks such as homeland change detection, geographical national situation general survey, earthquake prevention and disaster reduction and the like, and has great economic and social values.
Semantic segmentation methods for images are commonly used in indoor/outdoor image processing. With the advent of large-scale data sets such as ImageNet and MS-COCO, semantic segmentation methods have been rapidly developed, and methods represented by Deep Convolutional Neural Networks (DCNN) have been widely studied. Compared with the traditional method, such as TextonBoost and the like, the DCNN-based semantic segmentation method has better robustness, can find the optimal approximation function of image semantic segmentation through the combination of linear and nonlinear mapping under the condition of having enough labeled data, and is the method with the best effect in the current indoor/outdoor image semantic segmentation task. In the field of remote sensing image processing, a semantic segmentation technology is also called as a remote sensing image classification technology, and is different from indoor/outdoor images, the problem faced by the semantic segmentation of the remote sensing images is more complex, and the problems of semantic segmentation scale, direction, spatial context information expression and the like exist, and the problems of local information loss, same-spectrum foreign matter, same-object different-spectrum and the like of a training sample can also occur. The consistency of the block prediction results of large-image remote sensing images (usually, the size is larger than 15000 multiplied by 15000 pixels) becomes a bottleneck of applying the indoor/outdoor image semantic segmentation method to remote sensing image processing.
The DCNN-based semantic segmentation method can be divided into three categories according to processing levels and fusion units: the first kind of characteristics is that based on image classification network, such as VGGNet, GoogleNet, ResNet, etc., multiple strategies are fused, and semantic information extraction of end-to-end (input to output) is completed by adjusting the structure of the image classification network. For example, a "hole" convolution kernel is used in a dilated convolution (DilatedConv) network to maintain the perceptual view of a Full Convolution Network (FCN); RefineNet represents semantic segmentation information through multipath and multiresolution; the ExFuse model fuses bottom-layer and high-layer features and the like. The second category is the method using object detection to assist the semantic segmentation task. The method integrates the tasks of target detection and semantic segmentation, and segments the examples in the image by using the outsourcing rectangle and the target mask branch. For example, the Mask-RCNN framework has achieved champions on the MS-COCO dataset using this strategy. However, for some specific categories in the remote sensing image, the labeling data are not single instances enclosed by rectangular frames, and the labeling data have large subjective differences. The third type of scene constraint semantic segmentation aims to blend scene information in a semantic segmentation task so as to inhibit interference of irrelevant scenes. Semantically segmented scene information generally originates from two aspects: one is scene category information of the image block; another aspect is the combination of different hierarchical features of DCNN. For the former, the scene information constraint is derived from statistics of each category information in the semantic segmentation labeling data, and the dominant category information is used as the scene constraint information; for the latter, mainly through integration of different hierarchical features in the network structure, it may lead to an increasingly complex structural design and an excessive consumption of GPU computing resources.
Although the above-described semantic segmentation method based on DCNN can perform segmentation using local image blocks (patch) as processing units, the processing objects are limited to small images due to GPU resource limitations and model design, and information fusion between predicted local image blocks is not considered. The remote sensing image has a large image size which is usually 3-4 times of the size of the natural image, and by using a sliding window or weighted overlapping sliding window method, optimal stitching is still difficult to achieve between prediction results of each local image block after block prediction, and a blocking effect cannot be eliminated (as shown in fig. 1(a) - (c)). Therefore, when the semantic segmentation prediction of the large-scale remote sensing image is required, a blocking effect elimination method is introduced, so that the transition of the prediction result of the large-scale image is smoother, and the semantic segmentation result keeps better global consistency.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a global weighted fusion (GWFose) method (shown in figure 1(d)) which is used for eliminating the semantic segmentation blocking effect of the remote sensing image greatly, so that the semantic segmentation result of the remote sensing image is transited more smoothly, and the optimal stitching among the segmentation results of local image blocks is achieved.
In order to achieve the purpose, the invention provides the technical scheme that: the method for eliminating the semantic segmentation blocking effect of the large-amplitude remote sensing image comprises the following steps:
step 1, calculating weighted fusion preprocessing parameters of an image to be semantically segmented, comprising the following substeps;
step 1.1, expanding the boundary of the remote sensing image to be interpreted;
step 1.2, a window weighting function;
step 1.3, calculating the total step length of batch processing;
step 2, a semantic segmentation method based on a Convolutional Neural Network (CNN) introduces a batch processing mode in a prediction stage, and uses a window weighting function to fuse batch processing image prediction results to obtain a final interpretation result, and the method comprises the following substeps;
step 2.1, initializing the current step length s to be 0;
step 2.2, judging whether the current step length is smaller than the total step length, if so, initializing the parameter lambda of the current batch to be 0, and turning to the step 2.3; otherwise, the image is processed, and the interpretation result image is output;
step 2.3, judging whether the parameter lambda of the current batch is smaller than the total batch bs, if so, calculating a weighted interpretation result of the current batch by using a window weighting function, and turning to the step 2.4;
step 2.4, updating the current step length parameter s ═ s + bs until the current step length is larger than the total step length, indicating that the whole image is processed, and turning to the step 2.5;
step 2.5, according to the fusion factor beta, calculating the extended boundary image Ip×qIs interpreted to be a result Mb
Step 2.6, according to the window size k and the fusion factor beta, obtaining the interpretation result M after eliminating the expansion boundaryT
Further, the specific implementation manner of step 1.1 is as follows,
the input remote sensing image to be interpreted is assumed to be Im×nWherein m and n are the width and height of the image to be interpreted respectively; the window size during image interpretation is k, and the fusion factor is beta; the extended image is Ip×qWherein p and q are the width and height of the extended image respectively, and the extended width w and height h are calculated by the size of the interpretation window and the fusion factor:
Figure BDA0002108193830000031
wherein the symbols
Figure BDA0002108193830000032
Indicating rounding up, so the width and height of the extended image are:
Figure BDA0002108193830000033
Figure BDA0002108193830000034
and finally, expanding the image to be interpreted in a mirror image boundary expansion mode according to the expansion width and the expansion height.
Further, the specific implementation manner of step 1.2 is as follows,
using a window weighting function for calculating a global weight value between overlapping windows,
Figure BDA0002108193830000035
wherein the window size is k, and when processing the image blocks in the window, the overlap degree between each divided blocks is kept to be
Figure BDA0002108193830000036
Expanding the formula (5) to two dimensions to obtain a second-order smooth function W under the two-dimensional conditionk×k=[f(x) f(y)]T
Further, the specific implementation manner of step 1.3 is as follows,
according to the window size k, the image width and height p and q after the boundary is expanded and the fusion factor beta, the total step length of batch processing in the x direction and the y direction is respectively calculated as follows:
Figure BDA0002108193830000037
Figure BDA0002108193830000041
wherein sm represents a scaling factor, and the calculation mode is as follows:
Figure BDA0002108193830000042
the total step size of the final batch process is therefore
Figure BDA0002108193830000043
Furthermore, the specific implementation manner of calculating the current batch weighting interpretation result in step 2.3 is as follows,
Figure BDA0002108193830000044
wherein the symbol denotes a two-dimensional convolution operator; the function F (-) represents a CNN semantic segmentation prediction network; i isλRepresenting the block image to be processed in the current batch by the image I after the boundary expansionp×qObtaining:
Iλ=Ip×q[xλ:xλ+k,yλ:yλ+k,:] (10)
Figure BDA0002108193830000045
yλ=(index%ly)×sm (12)
x in the formula (10)λ:xλ+ k represents the extended image matrix Ip×qThe range taken along the x-axis is [ x ]λ,xλ+k];yλ:yλ+ k represents the extended image matrix Ip×qThe range taken along the y-axis is [ y ]λ,yλ+k](ii) a In the formulas (11) and (12), index ∈ [ s, s + bs) represents a positive integer vernier, s is a current step length parameter, and if the current batch parameter λ is smaller than the total batch bs, the current batch image I continues to be obtainedλAnd updating the current batch parameter lambda + 1.
Further, the interpretation result M in step 2.5bThe way of calculating (a) is as follows,
Figure BDA0002108193830000046
in the formula (13), the first and second groups,
Figure BDA0002108193830000047
represents the ith lot weighted interpretation result, all the lot results together form the image I with the size of p × qp×qIs interpreted to be a result Mb
Further, the interpretation result M in step 2.6TIn a manner such asIn the following, the first and second parts of the material,
MT=Mb[w:p-w,h:q-h,:] (14)
in the formula (14), w: p-w represents the image matrix M for the interpretation resultbThe range taken along the x-axis is [ w, p-w ]](ii) a Q-h represents the image matrix M for the interpretation resultbTaken along the y-axis, is in the range [ h, q-h]。
The invention also provides a system for eliminating the semantic segmentation blocking effect of the large-amplitude remote sensing image, which comprises the following modules:
the parameter calculation module is used for calculating weighted fusion preprocessing parameters of the image to be semantically segmented and comprises the following sub-modules;
the boundary expansion submodule is used for expanding the boundary of the remote sensing image to be interpreted;
the window weighting function calculation submodule is used for calculating a window weighting function;
the batch processing total step length calculation submodule is used for calculating the batch processing total step length;
the interpretation result output module is used for introducing a batch processing mode in a prediction stage based on a semantic segmentation method of a Convolutional Neural Network (CNN), fusing a batch processing image prediction result by using a window weighting function to obtain a final interpretation result, and comprises the following sub-modules;
a first sub-module, configured to initialize a current step size β equal to 0;
the second submodule is used for judging whether the current step length is smaller than the total step length or not, if so, initializing the parameter lambda of the current batch to be 0, and transferring to the third submodule; otherwise, the image is processed, and the interpretation result image is output;
the third sub-module is used for judging whether the parameter lambda of the current batch is smaller than the total batch bs, if so, calculating a weighted interpretation result of the current batch by using a window weighting function, and transferring the weighted interpretation result to the fourth sub-module;
the fourth sub-module is used for updating the current step length parameter s ═ s + bs until the current step length parameter s is larger than the total step length, and then the whole image is processed and the fourth sub-module is switched to the fifth sub-module;
a fifth sub-module for calculating an extended boundary shadow according to the fusion factor betaLike Ip×qIs interpreted to be a result Mb
A sixth sub-module for obtaining the interpretation result M after eliminating the extended boundary according to the window size k and the fusion factor betaT
Compared with the existing sliding window or weighted overlapping sliding window fusion technology, the global weighted fusion GWFose method is provided, a second-order global weighted smoothing function is adopted, the high-efficiency calculation advantage of the GPU is considered, and a GPU batch processing mode is introduced in the semantic segmentation prediction stage of the remote sensing image, so that batch processing images under different step lengths are subjected to global weighted fusion. And eliminating the extended boundary by using the size of the sliding window and the fusion factor, thereby eliminating the semantic segmentation blocking effect of the large-amplitude remote sensing image and obtaining the optimal fusion result.
Drawings
FIG. 1 is a block diagram of a semantic segmentation blocking effect of local image blocks of a large-amplitude remote sensing image and a processing result thumbnail of the block effect. Wherein, the image (a) is an original image; FIG. (b) shows the result of the sliding window process; FIG. (c) shows the result of weighted overlap sliding window processing; FIG. d shows the results of the patented process of the present invention.
Fig. 2 is a flowchart of a global weighted fusion (GWFuse) blocking effect elimination method adopted in the present invention.
FIG. 3 is a schematic diagram of a remote sensing image boundary expansion to be interpreted.
Fig. 4 is a diagram illustrating a global window weighted smoothing function. Wherein, diagram (a) is a schematic diagram of the case of extending the second order smoothing function to two dimensions; graph (b) is a two-dimensional weighted smoothing function visualization.
FIG. 5 is a more semantic segmentation blocking effect and processing result thumbnail thereof. Wherein, (a) is the original image; (b) listing as a sliding window processing result; (c) the columns are weighted overlapping sliding window processing results; (d) listed as the patent processing result of the present invention. The original image size floats in the range of 7000 × 7000 pixels to 80000 × 80000 pixels.
Detailed Description
The invention adopts a global weighting fusion method to solve the problem of eliminating the blocking effect of the semantic segmentation result of the large-amplitude remote sensing image, the method utilizes the size of a sliding window and a global fusion factor to comprehensively calculate a window weighting function, simultaneously considers the advantages of batch processing prediction of a Convolutional Neural Network (CNN), and carries out global weighting fusion on batch processing images under different step sizes by taking an image block processed by the total batch processing size as a processing unit. And eliminating the extended boundary by using the size of the sliding window and the fusion factor so as to obtain a final fusion result and eliminate the semantic segmentation blocking effect of the large-amplitude remote sensing image.
In order to better understand the technical solution of the present invention, the following description will be made with reference to the accompanying drawings. The global weighting fusion (GWFose) semantic segmentation blocking effect elimination method provided by the invention is shown in figure 2, and the core of the method lies in window weighting function calculation and batch processing image weighting fusion. The specific implementation steps are as follows:
step 1, calculating weighted fusion preprocessing parameters of the image to be semantically segmented.
The calculation of the preprocessing parameters is a precondition for eliminating the semantic segmentation blocking effect, and comprises three aspects of image boundary expansion, window weighting function calculation and batch processing total step length calculation. The method comprises the following specific steps:
1.1 remote sensing image boundary expansion to be interpreted
The input remote sensing image to be interpreted is assumed to be Im×nWherein m and n are the width and height of the image to be interpreted respectively; the window size during image interpretation is k, and the fusion factor is beta; the extended image is Ip×qWherein p and q are the width and height of the extended image, respectively. As shown in fig. 3, from the interpretation window size and the fusion factor, the extended width w and height h can be calculated as:
Figure BDA0002108193830000061
wherein the symbols
Figure BDA0002108193830000062
Indicating rounding up. Therefore, the width and height of the expanded image are:
Figure BDA0002108193830000063
Figure BDA0002108193830000071
the invention adopts a mirror image boundary expansion mode to expand the image to be interpreted according to the expansion width and height. Wherein the fusion factor is chosen to be β -2 and the window size is chosen to be k-512.
1.2 calculation of Window weighting function
In fig. 1(c), the semantic segmentation result is weighted by using overlapping window voting, and the smoothness of the boundary of the window overlapping is not considered fundamentally. In order to make the transition between predicted semantic segmentation blocks more natural, the patent designs a second-order window weighting smoothing function for calculating a global weight value between overlapping windows.
The reason why the window size is k, which causes the blocking effect of the semantic segmentation result, is that the voting function is a first-order linear function, the weight is higher at a position closer to the center of the window, and a step is generated at the window boundary:
Figure BDA0002108193830000072
in order to overcome the disadvantage of the step function at the boundary, the second-order window weighted smoothing function designed by the patent is as follows:
Figure BDA0002108193830000073
in formula (5), x represents a coordinate position in the x direction. When processing image blocks in the window, the overlapping degree between each divided block is kept as
Figure BDA0002108193830000074
The formula (5) is expanded to two dimensions, namely a second-order smooth function W under the condition of two dimensions is obtainedk×k=[f(x) f(y)]TWhere f (x), f (y) represent the window weighting functions in the x-direction and y-direction, respectively. Fig. 4 is a diagram illustrating a case where the second-order smoothing function is extended to two dimensions.
1.3, batch Total step size calculation
According to the window size k, the image width and height p and q after the boundary is expanded, and the fusion factor beta, the total step length of batch processing in the x direction and the y direction can be calculated and respectively:
Figure BDA0002108193830000075
Figure BDA0002108193830000076
wherein sm represents a scaling factor, and the calculation mode is as follows:
Figure BDA0002108193830000081
the total step size of the final batch process is therefore
Figure BDA0002108193830000082
And 2, eliminating the blocking effect of the batch processing semantic segmentation.
The semantic segmentation method based on Convolutional Neural Network (CNN) usually uses a single image block as a processing unit in the prediction stage. The invention introduces a batch processing mode in a prediction stage of a semantic segmentation method by virtue of the advantages of GPU batch processing in a CNN training stage so as to conveniently use a weighted window function to fuse batch processing image prediction results. The method specifically comprises the following steps:
step 2.1, initialize the current step size s to 0.
Step 2.2, determine whether the current step is less than the total step l ═ lx×ly. If the current batch length is less than the total step length, initializing the parameter lambda of the current batch to be 0, and turning to the step 2.3; otherwise, it indicates that the image has been processed,outputting the interpretation result image MT
And 2.3, judging whether the parameter lambda of the current batch is smaller than the total batch bs. If the total batch size is larger than the total batch size, a window weighting function W is utilizedk×k=[f(x) f(y)]TCalculating the current batch weighting interpretation result, and turning to the step 2.4:
Figure BDA0002108193830000083
wherein the symbol denotes a two-dimensional convolution operator; the function F (-) represents a CNN semantic segmentation prediction network, and a Full Convolution Network (FCN) with a dense connection structure is adopted as a prediction network function; i isλRepresenting the block image to be processed in the current batch by the image I after the boundary expansionp×qObtaining:
Iλ=Ip×q[xλ:xλ+k,yλ:yλ+k,:] (10)
Figure BDA0002108193830000084
yλ=(index%ly)×sm (12)
x in the formula (10)λ:xλ+ k represents the extended image matrix Ip×qThe range taken along the x-axis is [ x ]λ,xλ+k];yλ:yλ+ k represents the extended image matrix Ip×qThe range taken along the y-axis is [ y ]λ,yλ+k]. Index ∈ [ s, s + bs) in equations (11) and (12) represents a positive integer vernier, s being the current stride parameter. If the parameter lambda of the current batch is smaller than the total batch bs, continuously obtaining the image I of the current batchλAnd updating the current batch parameter lambda + 1.
And 2.4, updating the current step length parameter s + bs until the current step length is larger than the total step length lx×ly. At this time, the whole image is processed by the semantic segmentation network of the dense connection structure, and the step 2.5 is carried out;
in the step 2.5, the step of the method,calculating an extended boundary image I according to the fusion factor betap×qIs interpreted to be a result MbThe calculation method is as follows:
Figure BDA0002108193830000091
in the formula (13), the first and second groups,
Figure BDA0002108193830000092
representing the ith lot weighted interpretation result, all the lots of prediction results together form the image I with the size of p × qp×qIs interpreted to be a result Mb
Step 2.6, according to the window size k and the fusion factor beta, obtaining the interpretation result M after eliminating the expansion boundaryT. The process of eliminating the extended boundary is opposite to the process of expanding the boundary in the step 1.1, the width and the height of the clipping are calculated by the formula (1), and finally, the interpretation result after the extended boundary is fused and eliminated is as follows:
MT=Mb[w:p-w,h:q-h,:] (14)
in the formula (14), w: p-w represents the image matrix M for the interpretation resultbThe range taken along the x-axis is [ w, p-w ]](ii) a Q-h represents the image matrix M for the interpretation resultbTaken along the y-axis, is in the range [ h, q-h]. Fig. 5 is an example of a thumbnail applied to large-format remote sensing image semantic segmentation blocking effect elimination by using the method provided by the present patent, where an original image includes types of GeoEye, ZY-3, and the like, and the image size floating range is 7000 × 7000 pixels to 80000 × 80000 pixels.
The embodiment of the invention also provides a system for eliminating the semantic segmentation blocking effect of the large-amplitude remote sensing image, which comprises the following modules:
the parameter calculation module is used for calculating weighted fusion preprocessing parameters of the image to be semantically segmented and comprises the following sub-modules;
the boundary expansion submodule is used for expanding the boundary of the remote sensing image to be interpreted;
the window weighting function calculation submodule is used for calculating a window weighting function;
the batch processing total step length calculation submodule is used for calculating the batch processing total step length;
the interpretation result output module is used for introducing a batch processing mode in a prediction stage based on a semantic segmentation method of a Convolutional Neural Network (CNN), fusing a batch processing image prediction result by using a window weighting function to obtain a final interpretation result, and comprises the following sub-modules;
a first sub-module, configured to initialize a current step size β equal to 0;
the second submodule is used for judging whether the current step length is smaller than the total step length or not, if so, initializing the parameter lambda of the current batch to be 0, and transferring to the third submodule; otherwise, the image is processed, and the interpretation result image is output;
the third sub-module is used for judging whether the parameter lambda of the current batch is smaller than the total batch bs, if so, calculating a weighted interpretation result of the current batch by using a window weighting function, and transferring the weighted interpretation result to the fourth sub-module;
the fourth sub-module is used for updating the current step length parameter s ═ s + bs until the current step length parameter s is larger than the total step length, and then the whole image is processed and the fourth sub-module is switched to the fifth sub-module;
a fifth sub-module for calculating an extended boundary image I according to the fusion factor betap×qIs interpreted to be a result Mb
A sixth sub-module for obtaining the interpretation result M after eliminating the extended boundary according to the window size k and the fusion factor betaT
The specific implementation of each module corresponds to each step, and the invention is not described.
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.

Claims (5)

1. The method for eliminating the blocking effect of the semantic segmentation of the large-amplitude remote sensing image is characterized by comprising the following steps of:
step 1, calculating weighted fusion preprocessing parameters of a remote sensing image to be interpreted, wherein the method comprises the following substeps;
step 1.1, expanding the boundary of the remote sensing image to be interpreted;
step 1.2, determining a window weighting function;
step 1.3, calculating the total step length of batch processing;
step 2, a semantic segmentation method based on a convolutional neural network CNN introduces a batch processing mode in a prediction stage, and uses a window weighting function to fuse batch processing image prediction results to obtain a final interpretation result, and the method comprises the following substeps;
step 2.1, initializing the current step length s to be 0;
step 2.2, judging whether the current step length is smaller than the total step length, if so, initializing the parameter lambda of the current batch to be 0, and turning to the step 2.3; otherwise, the image is processed, and the interpretation result image is output;
step 2.3, judging whether the parameter lambda of the current batch is smaller than the total batch bs, if so, calculating a weighted interpretation result of the current batch by using a window weighting function, and turning to the step 2.4;
the specific implementation of calculating the current lot weighting interpretation result in step 2.3 is as follows,
Figure FDA0003193810700000011
wherein the symbol denotes a two-dimensional convolution operator; the function F (-) represents a CNN semantic segmentation prediction network; i isλRepresenting the block image to be processed in the current batch by the image I after the boundary expansionp×qObtaining:
Iλ=Ip×q[xλ:xλ+k,yλ:yλ+k,:] (10)
Figure FDA0003193810700000012
yλ=(index%ly)×sm (12)
x in the formula (10)λ:xλ+ k represents the extended image matrix Ip×qThe range taken along the x-axis is [ x ]λ,xλ+k];yλ:yλ+ k represents the extended image matrix Ip×qThe range taken along the y-axis is [ y ]λ,yλ+k](ii) a In the formulas (11) and (12), index ∈ [ s, s + bs) represents a positive integer vernier, s is a current step length parameter, and if the current batch parameter λ is smaller than the total batch bs, the current batch image I continues to be obtainedλUpdating the parameter lambda of the current batch to lambda + 1;
step 2.4, updating the current step length parameter s ═ s + bs until the current step length is larger than the total step length, indicating that the whole image is processed, and turning to the step 2.5;
step 2.5, according to the fusion factor beta, calculating the extended boundary image Ip×qIs interpreted to be a result Mb
Interpretation of the results M in step 2.5bThe way of calculating (a) is as follows,
Figure FDA0003193810700000021
in the formula (13), the first and second groups,
Figure FDA0003193810700000022
represents the ith lot weighted interpretation result, all the lot results together form the image I with the size of p × qp×qIs interpreted to be a result Mb
Step 2.6, according to the window size k and the fusion factor beta, obtaining the interpretation result M after eliminating the expansion boundaryT
Interpretation of the results M in step 2.6TThe way of calculating (a) is as follows,
MT=Mb[w:p-w,h:q-h,:] (14)
in the formula (14), w: p-w represents the image matrix M for the interpretation resultbTaken along the x-axisThe range is [ w, p-w](ii) a Q-h represents the image matrix M for the interpretation resultbTaken along the y-axis, is in the range [ h, q-h]。
2. The method for eliminating the semantic segmentation blocking effect of the large-format remote sensing image according to claim 1, characterized by comprising the following steps: the specific implementation of step 1.1 is as follows,
the input remote sensing image to be interpreted is assumed to be Im×nWherein m and n are the width and height of the image to be interpreted respectively; the window size during image interpretation is k, and the fusion factor is beta; the extended image is Ip×qWherein p and q are the width and height of the extended image respectively, and the extended width w and height h are calculated by the size of the interpretation window and the fusion factor:
Figure FDA0003193810700000023
wherein the symbols
Figure FDA0003193810700000024
Indicating rounding up, so the width and height of the extended image are:
Figure FDA0003193810700000025
Figure FDA0003193810700000026
and finally, expanding the image to be interpreted in a mirror image boundary expansion mode according to the expansion width and the expansion height.
3. The method for eliminating the semantic segmentation blocking effect of the large-format remote sensing image according to claim 2, characterized by comprising the following steps: the specific implementation of step 1.2 is as follows,
using a window weighting function for calculating a global weight value between overlapping windows,
Figure FDA0003193810700000031
wherein the window size is k, and when processing the image blocks in the window, the overlap degree between each divided blocks is kept to be
Figure FDA0003193810700000032
Expanding the formula (5) to two dimensions to obtain a second-order smooth function W under the two-dimensional conditionk×k=[f(x) f(y)]T
4. The method for eliminating the semantic segmentation blocking effect of the large-format remote sensing image according to claim 3, characterized by comprising the following steps: the specific implementation of step 1.3 is as follows,
according to the window size k, the image width and height p and q after the boundary is expanded and the fusion factor beta, the total step length of batch processing in the x direction and the y direction is respectively calculated as follows:
Figure FDA0003193810700000033
Figure FDA0003193810700000034
wherein sm represents a scaling factor, and the calculation mode is as follows:
Figure FDA0003193810700000035
the total step size of the final batch process is therefore
Figure FDA0003193810700000036
5. A large-amplitude remote sensing image semantic segmentation blocking effect elimination system for realizing the method of any one of claims 1 to 4 is characterized by comprising the following modules:
the parameter calculation module is used for calculating the weighted fusion preprocessing parameters of the remote sensing image to be interpreted and comprises the following sub-modules;
the boundary expansion submodule is used for expanding the boundary of the remote sensing image to be interpreted;
the window weighting function calculation submodule is used for calculating a window weighting function;
the batch processing total step length calculation submodule is used for calculating the batch processing total step length;
the interpretation result output module is used for introducing a batch processing mode in a prediction stage based on a semantic segmentation method of a Convolutional Neural Network (CNN), fusing a batch processing image prediction result by using a window weighting function to obtain a final interpretation result, and comprises the following sub-modules;
a first sub-module, configured to initialize a current step size β equal to 0;
the second submodule is used for judging whether the current step length is smaller than the total step length or not, if so, initializing the parameter lambda of the current batch to be 0, and transferring to the third submodule; otherwise, the image is processed, and the interpretation result image is output;
the third sub-module is used for judging whether the parameter lambda of the current batch is smaller than the total batch bs, if so, calculating a weighted interpretation result of the current batch by using a window weighting function, and transferring the weighted interpretation result to the fourth sub-module;
the fourth sub-module is used for updating the current step length parameter s ═ s + bs until the current step length parameter s is larger than the total step length, and then the whole image is processed and the fourth sub-module is switched to the fifth sub-module;
a fifth sub-module for calculating an extended boundary image I according to the fusion factor betap×qIs interpreted to be a result Mb
A sixth sub-module for obtaining the interpretation result M after eliminating the extended boundary according to the window size k and the fusion factor betaT
CN201910560692.8A 2019-06-26 2019-06-26 Method and system for eliminating semantic segmentation blocking effect of large-amplitude remote sensing image Active CN110443248B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910560692.8A CN110443248B (en) 2019-06-26 2019-06-26 Method and system for eliminating semantic segmentation blocking effect of large-amplitude remote sensing image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910560692.8A CN110443248B (en) 2019-06-26 2019-06-26 Method and system for eliminating semantic segmentation blocking effect of large-amplitude remote sensing image

Publications (2)

Publication Number Publication Date
CN110443248A CN110443248A (en) 2019-11-12
CN110443248B true CN110443248B (en) 2021-12-03

Family

ID=68428394

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910560692.8A Active CN110443248B (en) 2019-06-26 2019-06-26 Method and system for eliminating semantic segmentation blocking effect of large-amplitude remote sensing image

Country Status (1)

Country Link
CN (1) CN110443248B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111222453B (en) * 2020-01-03 2022-06-14 武汉大学 Remote sensing image change detection method based on dense connection and geometric structure constraint

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101742355A (en) * 2009-12-24 2010-06-16 厦门大学 Method for partial reference evaluation of wireless videos based on space-time domain feature extraction
WO2014168587A1 (en) * 2013-04-12 2014-10-16 Agency For Science, Technology And Research Method and system for processing an input image
CN105335966A (en) * 2015-10-14 2016-02-17 南京信息工程大学 Multi-scale remote-sensing image segmentation method based on local homogeneity index
CN107194912A (en) * 2017-04-20 2017-09-22 中北大学 The brain CT/MR image interfusion methods of improvement coupling dictionary learning based on rarefaction representation
CN109145920A (en) * 2018-08-21 2019-01-04 电子科技大学 A kind of image, semantic dividing method based on deep neural network

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101742355A (en) * 2009-12-24 2010-06-16 厦门大学 Method for partial reference evaluation of wireless videos based on space-time domain feature extraction
WO2014168587A1 (en) * 2013-04-12 2014-10-16 Agency For Science, Technology And Research Method and system for processing an input image
CN105335966A (en) * 2015-10-14 2016-02-17 南京信息工程大学 Multi-scale remote-sensing image segmentation method based on local homogeneity index
CN107194912A (en) * 2017-04-20 2017-09-22 中北大学 The brain CT/MR image interfusion methods of improvement coupling dictionary learning based on rarefaction representation
CN109145920A (en) * 2018-08-21 2019-01-04 电子科技大学 A kind of image, semantic dividing method based on deep neural network

Also Published As

Publication number Publication date
CN110443248A (en) 2019-11-12

Similar Documents

Publication Publication Date Title
CN112884064B (en) Target detection and identification method based on neural network
CN111768432B (en) Moving target segmentation method and system based on twin deep neural network
CN110570371B (en) Image defogging method based on multi-scale residual error learning
CN110782490B (en) Video depth map estimation method and device with space-time consistency
CN109299274B (en) Natural scene text detection method based on full convolution neural network
CN111914698B (en) Human body segmentation method, segmentation system, electronic equipment and storage medium in image
CN112084859B (en) Building segmentation method based on dense boundary blocks and attention mechanism
CN111310593B (en) Ultra-fast lane line detection method based on structure perception
CN110705412A (en) Video target detection method based on motion history image
CN111476710A (en) Video face changing method and system based on mobile platform
CN113610050A (en) Mask wearing real-time detection method based on YOLOv5
CN113139896A (en) Target detection system and method based on super-resolution reconstruction
Yu et al. Progressive refined redistribution pyramid network for defect detection in complex scenarios
CN110443248B (en) Method and system for eliminating semantic segmentation blocking effect of large-amplitude remote sensing image
Wang et al. High-resolution remote sensing image semantic segmentation based on a deep feature aggregation network
CN111462132A (en) Video object segmentation method and system based on deep learning
US20230072445A1 (en) Self-supervised video representation learning by exploring spatiotemporal continuity
Liang et al. Hybrid transformer-CNN networks using superpixel segmentation for remote sensing building change detection
Tian et al. HPM-TDP: An efficient hierarchical PatchMatch depth estimation approach using tree dynamic programming
CN117197530A (en) Insulator defect identification method based on improved YOLOv8 model and cosine annealing learning rate decay method
CN111160262A (en) Portrait segmentation method fusing human body key point detection
CN113223006B (en) Lightweight target semantic segmentation method based on deep learning
CN116030364A (en) Unmanned aerial vehicle lightweight target detection method, system, medium, equipment and terminal
CN116363447A (en) Wafer defect detection method, defect detection model training method and device
CN111553921B (en) Real-time semantic segmentation method based on channel information sharing residual error module

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant