CN109948697B - Method for extracting urban built-up area by using multi-source data to assist remote sensing image classification - Google Patents


Info

Publication number
CN109948697B
CN109948697B CN201910208202.8A
Authority
CN
China
Prior art keywords
data
remote sensing
area
training sample
image
Prior art date
Legal status
Active
Application number
CN201910208202.8A
Other languages
Chinese (zh)
Other versions
CN109948697A (en
Inventor
苗则朗
史文中
贺跃光
肖粤龙
Current Assignee
Central South University
Original Assignee
Central South University
Priority date
Filing date
Publication date
Application filed by Central South University filed Critical Central South University
Priority to CN201910208202.8A
Publication of CN109948697A
Application granted
Publication of CN109948697B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A30/00: Adapting or protecting infrastructure or their operation
    • Y02A30/60: Planning or developing urban green infrastructure

Landscapes

  • Image Analysis (AREA)

Abstract

The invention belongs to the field of remote sensing image processing and discloses a method for extracting urban built-up areas by using multi-source data to assist remote sensing image classification. The method first preprocesses the remote sensing data; then performs preprocessing operations such as coordinate conversion, redundant-data elimination, and filtering on the acquired multi-source data, and calculates the frequency and spectral similarity of the training samples within a single pixel; and finally extracts the urban built-up area by using a one-class support vector machine. The method achieves rapid extraction of the urban built-up area, automatically generates real-time and nearly zero-cost training samples from the multi-source data, fuses multi-source geographic data and multi-source remote sensing image data to extract the urban built-up area, and greatly improves the extraction accuracy of the urban built-up area.

Description

Method for extracting urban built-up area by using multi-source data to assist remote sensing image classification
Technical Field
The invention relates to a method for extracting urban built-up areas by using multi-source data to assist remote sensing image classification, and belongs to the field of remote sensing image processing.
Background
The area of the urban built-up area is an important index for measuring the urbanization process. The accurate extraction of the boundary of the built-up area is beneficial to reasonably planning the land and provides important reference for relevant researches such as urban waterlogging, heat island effect and the like. The expansion of a built-up area also has a significant impact on regional climate change.
Remote sensing technology plays a very important role in monitoring land-cover changes over wide areas. At present, techniques for extracting urban built-up areas from multi-source remote sensing images, such as DMSP/OLS nighttime light data, MODIS, Landsat, ENVISAT ASAR, and SPOT, have been widely applied. To date, many methods have been developed for unsupervised or supervised satellite image interpretation. Supervised classification in particular has gained wide application due to its higher accuracy. Although using remote sensing images to extract urban built-up areas has many advantages, the acquisition of the training samples required by supervised classification is often an important factor affecting extraction accuracy. Supervised classification needs a large number of field-collected training samples and consumes a great deal of manpower, material, and financial resources, especially when the classification area rises to a global scale; this creates an urgent need for real-time, low-cost training sample collection techniques. Meanwhile, selecting training samples requires operators to have prior knowledge of the land types; manual selection is time-consuming and labor-intensive and limits the efficiency of urban built-up area extraction.
Disclosure of Invention
The invention aims to provide a method for extracting urban built-up areas by using multi-source data to assist remote sensing image classification.
In order to achieve this purpose, the invention provides a method for extracting urban built-up areas by using multi-source data to assist remote sensing image classification, which comprises the following steps:
(1) carrying out data preprocessing on the obtained remote sensing image;
(2) acquiring crowd-sourced data, selecting training samples from the crowd-sourced data, calculating the number of training samples in a single pixel according to the spatial resolution of the remote sensing image, and further calculating the frequency and spectral similarity;
(3) extracting the impervious layer by using a single-class support vector machine;
(4) and performing EM clustering on the impervious layer to obtain the urban built-up area to be extracted.
Further, the remote sensing image in the step (1) is taken from Landsat-8, and the data preprocessing comprises data correction, abnormal value detection and cloud masking.
Further, the data correction converts the terrain-corrected data into radiance values using the provided calibration values in the case of multispectral data; the outlier detection identifies abnormal values caused by acquisition inconsistencies or calibration errors; the cloud masking selects multispectral images with cloud cover below 10% to avoid cloud-coverage effects.
Further, the crowd-sourced data in step (2) includes two open data sources, namely social media data and OpenStreetMap.
Further, in step (2), the social media data is taken from Twitter; the number of tweets within a single pixel of the Landsat image is first calculated, and then only pixels containing at least one tweet are taken as training samples; Ω = {x_i | i = 1, ..., l} denotes the training sample set, where x_i is a spectral vector whose dimension equals the number of bands of the multispectral image;
The frequency and spectral similarity of the Twitter data are measured as follows:
1) measuring the frequency of the Twitter data: the tweet frequency of the ith training sample is defined as
F_i = T_i / max_{j=1,...,l} T_j (1)
where l is the number of Landsat pixels containing tweets and T_i is the number of tweets in the corresponding pixel; a larger value of T_i indicates a higher probability that the training sample is located on the impervious layer;
2) measuring the spectral similarity of the Twitter data: assuming the training samples derived from Twitter form a cluster, the distance from the impervious class to the cluster center of the training samples is smaller than the distance from the pervious class to that center, and the distance from a training sample to the cluster center is measured quantitatively with the Minimum Covariance Determinant (MCD); when the point cloud is symmetrically distributed around a center, the probability density function of the ith training sample is expressed as
f(x_i) = (2π)^(-d/2) |Σ|^(-1/2) exp(-(1/2)(x_i - u)^T Σ^(-1) (x_i - u)) (2)
where x_i is the spectral vector of the ith training sample, u is the sample mean, Σ is the covariance matrix, and d is the number of bands; the distance between the ith training sample and the mean is defined as the spectral similarity S_i:
S_i = (x_i - û)^T Σ̂^(-1) (x_i - û) (3)
where û and Σ̂ are the robust sample mean and sample covariance matrix estimated by the MCD, respectively; a smaller value of S_i indicates a higher probability that the ith training sample falls in the impervious area;
based on the frequency and spectral similarity of the Twitter data, the weight of the ith training sample is expressed as:
w_i = w(F_i, S_i) (4) (closed form shown only as an image in the original; the weight increases with the frequency F_i and decreases with the spectral similarity S_i)
Further, in step (2), for the data taken from OpenStreetMap, the impervious layer is first converted into two OSM raster images with spatial resolutions of 1 m and 30 m, respectively; the ith impervious pixel of the 30 m OSM raster image covers a region R_i formed by 900 pixels of the 1 m OSM raster image. Thus, the frequency of the ith impervious pixel is defined as:
F_i^O = l_i^O / 900 (5)
where l_i^O is the number of impervious pixels within region R_i of the 1 m resolution OSM raster image and l^o is the number of impervious-layer pixels of the 30 m resolution OSM raster image;
The spectral similarity S_i^O of the impervious pixels taken from OSM as training samples is expressed as follows:
S_i^O = (y_i - û)^T Σ̂^(-1) (y_i - û) (6)
where y_i is the spectral vector of the ith training sample and û and Σ̂ are the sample mean and sample covariance matrix, respectively; a smaller value of S_i^O indicates a higher probability that the ith training sample falls in the impervious area; the weight w_i^O of a training sample generated by OSM is expressed as follows:
w_i^O = w(F_i^O, S_i^O) (7) (closed form shown only as an image in the original; obtained by substituting F_i^O and S_i^O into equation (4))
Further, in step (3), the remote sensing image is classified by using a one-class support vector machine (OCSVM) to extract the impervious-layer information;
the OCSVM trains a hypersphere of minimum volume that contains as many training samples of the single class as possible; the distance from a sample to the boundary is interpreted as a degree of similarity and provides a reference for judging whether the sample belongs to the specific class;
The OCSVM is a constrained convex optimization problem:
min_{R, a, ξ} R^2 + C Σ_{i=1}^{l} ξ_i
s.t. ||x_i - a||^2 ≤ R^2 + ξ_i,
ξ_i ≥ 0, i = 1, ..., l (8)
where a and R are the center and radius of the hypersphere, respectively; ||·|| is the Euclidean norm, ξ_i is a slack variable controlling the degree of slack, and C is a trade-off parameter; the dual form of equation (8) is as follows:
max_α Σ_{i=1}^{l} α_i <x_i, x_i> - Σ_{i=1}^{l} Σ_{j=1}^{l} α_i α_j <x_i, x_j>
s.t. 0 ≤ α_i ≤ C, i = 1, ..., l
Σ_{i=1}^{l} α_i = 1 (9)
where <·,·> is the inner product and {α_1, ..., α_l} are the Lagrange multipliers;
The weight of training samples from the impervious area is made greater than that of training samples from the pervious area, and the weights of the training samples in equation (4) are introduced into equation (8), as shown below:
min_{R, a, ξ} R^2 + C Σ_{i=1}^{l} w_i ξ_i
s.t. ||x_i - a||^2 ≤ R^2 + ξ_i,
ξ_i ≥ 0, i = 1, ..., l (10)
Equation (9) accordingly becomes:
max_α Σ_{i=1}^{l} α_i <x_i, x_i> - Σ_{i=1}^{l} Σ_{j=1}^{l} α_i α_j <x_i, x_j>
s.t. 0 ≤ α_i ≤ w_i C, i = 1, ..., l
Σ_{i=1}^{l} α_i = 1 (11)
By replacing the inner product <x_i, x_j> in equations (9) and (11) with a kernel function K<x_i, x_j>, kernelized versions of the OCSVM and the weighted one-class support vector machine (WOCSVM) are obtained; the most appropriate parameters are determined by five-fold cross-validation, and the free parameters are selected by the following formula:
θ̂ = argmax_θ [criterion combining OA and nSV] (12) (closed form shown only as an image in the original; it selects the parameter set maximizing the overall accuracy while keeping the number of support vectors small)
where θ is the entire set of free parameters, θ̂ is the optimal parameter set obtained by five-fold cross-validation, OA is the overall accuracy, and nSV is the number of support vectors.
Further, the EM clustering algorithm in step (4) clusters the impervious-layer probability map obtained in step (3) to remove image noise from the probability map and fill holes in the image.
Further, the EM clustering algorithm divides the probability map into five classes, and the class or classes belonging to the impervious layer are extracted from the five classes.
The invention fuses crowd-sourced data and remote sensing image data and achieves high-accuracy urban built-up area extraction with a one-class support vector machine: first, open and free crowd-sourced geographic data are obtained and preprocessed; the remote sensing data are preprocessed; and finally a one-class support vector machine is used to extract the urban built-up area. In this way, real-time and nearly zero-cost training samples are generated automatically from the crowd-sourced data, multi-source geographic data and multi-source remote sensing image data are fused to extract the urban built-up area, the urban built-up area is extracted rapidly, and the extraction accuracy is greatly improved.
Additional features and advantages of embodiments of the invention will be set forth in the detailed description which follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the embodiments of the invention without limiting the embodiments of the invention. In the drawings:
FIG. 1 is a flow chart of one embodiment of the present invention.
Fig. 2 is Landsat-8 remote sensing data of Tokyo, Japan, in an embodiment of the present invention.
Fig. 3 is a partial image of fig. 2.
Fig. 4 is a distribution diagram of social media data Twitter as a training sample.
FIG. 5 is a graph of the results of extracting impermeable layers using Twitter data as training samples.
Fig. 6 is a distribution diagram of open source data OSM as a training sample.
Fig. 7 is a graph of the results of extracting impermeable layers using OSM data as a training sample.
Detailed Description
The following detailed description of embodiments of the invention refers to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating embodiments of the invention, are given by way of illustration and explanation only, not limitation.
In an embodiment of the present invention, as shown in fig. 1, a flowchart of a method for extracting a built-up city area from remote sensing image data by using a crowd-sourced data automatic generation training sample includes the following specific steps:
s1, preprocessing original data.
The remote sensing data selected in the embodiment of the invention is a Landsat-8 image. The study area is Tokyo, Japan, and several preprocessing steps are employed to reduce the temporal and spatial radiometric differences of the different Landsat datasets used in the examples, specifically data correction, outlier detection, and cloud masking.
(a) Data correction: in the case of multispectral data, the terrain-corrected data are converted into radiance values using the provided calibration values.
(b) Outlier detection: acquisition inconsistencies or calibration errors produce outliers, especially in multispectral data (labeled 0 or NaN), which may lead to classification errors. This often occurs at the borders of the image, because each band captured by the sensor is slightly delayed, resulting in some missing position information. As previously mentioned, this phenomenon affects the multispectral dataset and enlarges the edge portions that must be excluded from processing. To prevent this problem, an automatic masking method is used.
(c) Cloud masking: the presence of cloud cover reduces classification accuracy, so multispectral images are selected to avoid cloud-coverage effects. In practice, images with cloud cover below 10% are selected.
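The cloud-masking selection above can be sketched as a simple metadata filter. This is a minimal illustration, not the patent's implementation: the scene dictionaries and the CLOUD_COVER field mimic the cloud-cover percentage reported in Landsat scene metadata, and the scene identifiers are hypothetical.

```python
# Hedged sketch: keep only Landsat-8 scenes whose reported cloud cover
# is below the 10% threshold described in step (c).
MAX_CLOUD_COVER = 10.0  # percent

def select_clear_scenes(scenes):
    """Return the scenes with cloud cover strictly below the threshold."""
    return [s for s in scenes if s["CLOUD_COVER"] < MAX_CLOUD_COVER]

# Illustrative scene list (identifiers and values are made up).
scenes = [
    {"id": "LC08_L1TP_107035_20150509", "CLOUD_COVER": 3.2},
    {"id": "LC08_L1TP_107035_20150626", "CLOUD_COVER": 41.7},
    {"id": "LC08_L1TP_107035_20151016", "CLOUD_COVER": 8.9},
]
print([s["id"] for s in select_clear_scenes(scenes)])
```

In practice the threshold would be applied to the CLOUD_COVER value parsed from each scene's metadata file before any pixels are read.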
FIG. 2 shows a preprocessed Landsat-8 OLI image of Tokyo, Japan, acquired in 2015, with 7 bands at 30 m spatial resolution. The embodiment of the present invention selects a local area of 2666 (rows) × 2728 (columns) from the whole scene for description (fig. 3).
S2. Generate training samples from open data. The embodiment of the invention selects two open data sources: social media data (Twitter data as an example) and OpenStreetMap (OSM). These two datasets illustrate the strategy of automatically generating training samples from crowd-sourced data. Twitter is treated here as the only source of geo-located social media. First, the number of tweets within a single pixel of the Landsat image is calculated; then only pixels containing at least one tweet are taken as training samples. Ω = {x_i | i = 1, ..., l} denotes the training sample set, where x_i is a spectral vector whose dimension equals the number of bands of the multispectral image. It should be noted that geo-referenced Twitter data has two basic features: 1) tweets can be sent repeatedly within a 30 m pixel; 2) tweets can be sent from either an impervious surface or a pervious surface. The embodiment of the invention provides two measures (the frequency and the spectral similarity of the Twitter data) to describe these characteristics.
1) Frequency of the Twitter data. Twitter data typically recur within the same coordinate range (the Landsat-8 ground spatial resolution is 30 m). On this basis, the tweet frequency of the ith training sample is defined as F_i:
F_i = T_i / max_{j=1,...,l} T_j (1)
where l is the number of Landsat pixels containing tweets and T_i is the number of tweets in the corresponding pixel. A larger value of T_i means the training sample is more likely to be located on the impervious layer. For example, if 5 tweets are sent within a certain pixel of the Landsat-8 image and the maximum number of tweets sent within any pixel of the image is 10, that pixel has a frequency of F_i = 5/10 = 0.5.
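The frequency measure of equation (1) reduces to a single normalization over per-pixel tweet counts. A minimal sketch, with illustrative counts rather than real Twitter data:

```python
import numpy as np

def tweet_frequency(counts):
    """Equation (1): per-pixel tweet counts normalized by the maximum count,
    giving frequencies in (0, 1] over pixels that contain at least one tweet."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.max()

# Toy counts: tweets per Landsat pixel (only pixels with >= 1 tweet kept).
T = np.array([5, 10, 1, 2])
F = tweet_frequency(T)
print(F[0])  # 0.5: a pixel with 5 tweets when the image maximum is 10
```

The output for the first pixel reproduces the worked example in the text (5 tweets against a maximum of 10 gives F_i = 0.5).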
2) Spectral similarity of the Twitter data. As mentioned above, although a small number of tweets come from pervious areas such as the sea/beach, farmland, and mountainous areas, most tweets are concentrated in impervious areas. This means that training samples from Twitter have similar spectral values. Assume that the training samples derived from Twitter form a cluster: the distance from the impervious class to the cluster center of the training samples is small, while the distance from the pervious class to that center is large. The distance of a training sample to the cluster center can be measured quantitatively with the Minimum Covariance Determinant (MCD). The core idea of the MCD algorithm is to find the h observations with minimum spatial divergence; the determinant of the covariance matrix is a good measure of the divergence of the point cloud when the point cloud (here the h most central observations) is symmetrically distributed around a center. Assuming that the ith training sample follows a multivariate normal distribution, its probability density function can be expressed as
f(x_i) = (2π)^(-d/2) |Σ|^(-1/2) exp(-(1/2)(x_i - u)^T Σ^(-1) (x_i - u)) (2)
where x_i is the spectral vector of the ith training sample, u is the sample mean, Σ is the covariance matrix, and d is the number of bands. The distance between the ith training sample and the mean is defined as the spectral similarity S_i:
S_i = (x_i - û)^T Σ̂^(-1) (x_i - û) (3)
where û and Σ̂ are the robust sample mean and sample covariance matrix estimated by the MCD, respectively. A smaller value of S_i means the ith training sample is more likely to fall in the impervious area. Given the frequency and spectral similarity of the Twitter data, the weight of the ith training sample is expressed as:
w_i = w(F_i, S_i) (4) (closed form shown only as an image in the original; the weight increases with F_i and decreases with S_i)
For example, when the frequency of the ith training sample is F_i = 0.5 and S_i = 2, the weight is w_i = 0.8244. Fig. 4 shows the distribution of the social media data (Twitter) used as training samples.
For OSM, the impervious layers (buildings, traffic, roads, etc.) are first converted into two raster images with spatial resolutions of 1 m and 30 m, respectively. The ith impervious pixel of the 30 m OSM raster image covers 900 pixels of the 1 m OSM raster image (region R_i). Thus, the frequency of the ith impervious pixel is defined as:
F_i^O = l_i^O / 900 (5)
where l_i^O is the number of impervious pixels within region R_i of the 1 m resolution OSM raster image and l^o is the number of impervious-layer pixels of the 30 m resolution OSM raster image. When region R_i contains 300 impervious pixels, the frequency in the region is F_i^O = 300/900 = 0.3333.
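The 1 m to 30 m aggregation of equation (5) is a block sum over 30×30 windows. A minimal sketch, assuming the binary OSM impervious mask is already rasterized at 1 m and its dimensions are multiples of 30; the toy mask is illustrative, not real OSM data.

```python
import numpy as np

def osm_frequency(mask_1m):
    """Equation (5): aggregate a binary 1 m impervious mask to 30 m.
    mask_1m has shape (30*r, 30*c); returns an (r, c) array of
    F_i^O = (impervious 1 m pixel count in R_i) / 900."""
    r, c = mask_1m.shape[0] // 30, mask_1m.shape[1] // 30
    blocks = mask_1m.reshape(r, 30, c, 30)
    return blocks.sum(axis=(1, 3)) / 900.0

# One 30 m cell containing 300 impervious 1 m pixels, as in the example.
mask = np.zeros((30, 30), dtype=int)
mask.flat[:300] = 1
print(osm_frequency(mask))  # approximately 0.3333
```

The single-cell result matches the text's worked example (300/900 = 0.3333).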
Similar to Twitter, the spectral similarity of the OSM impervious pixels can be calculated in the same way. Substituting F_i^O for F_i and updating the spectral similarity in equation (4) yields the weight of the training samples generated by OSM. The spectral similarity S_i^O of the impervious pixels taken from OSM as training samples is expressed as follows:
S_i^O = (y_i - û)^T Σ̂^(-1) (y_i - û) (6)
where y_i is the spectral vector of the ith training sample and û and Σ̂ are the sample mean and sample covariance matrix, respectively; a smaller value of S_i^O indicates a higher probability that the ith training sample falls in the impervious area. The weight w_i^O of a training sample generated by OSM is expressed as follows:
w_i^O = w(F_i^O, S_i^O) (7) (closed form shown only as an image in the original; obtained by substituting F_i^O and S_i^O into equation (4))
Fig. 6 shows the distribution of the other open data source (OSM) used as training samples.
S3. Extract the impervious layer by using a one-class support vector machine.
The embodiment of the invention classifies the satellite images with a one-class support vector machine (OCSVM) to extract the impervious-layer information. In contrast to a standard support vector machine, which finds the optimal boundary separating two classes, the OCSVM trains a hypersphere of minimum volume that contains as many training samples of the single class as possible. The distance from a sample to the boundary may be interpreted as a degree of similarity, which provides a reference for whether the sample belongs to the specific class. The OCSVM is a constrained convex optimization problem:
min_{R, a, ξ} R^2 + C Σ_{i=1}^{l} ξ_i
s.t. ||x_i - a||^2 ≤ R^2 + ξ_i,
ξ_i ≥ 0, i = 1, ..., l (8)
where a and R are the center and radius of the hypersphere, respectively; ||·|| is the Euclidean norm, ξ_i is a slack variable, and C is a trade-off parameter that controls the degree of slack. The dual form of equation (8) is as follows:
max_α Σ_{i=1}^{l} α_i <x_i, x_i> - Σ_{i=1}^{l} Σ_{j=1}^{l} α_i α_j <x_i, x_j>
s.t. 0 ≤ α_i ≤ C, i = 1, ..., l
Σ_{i=1}^{l} α_i = 1 (9)
where <·,·> is the inner product and {α_1, ..., α_l} are the Lagrange multipliers.
Twitter data used as training samples may come from both impervious and pervious areas; in other words, the training samples contain a certain proportion of mislabeled pixels. As described in step S2, training samples from impervious areas are given higher weights and those from pervious areas lower weights. We therefore assume that a training sample with a small weight has a weak influence on the parameters (such as the center and radius) of the fitted hypersphere. To this end, the weights of the training samples in equation (4) are introduced into equation (8), as shown below:
min_{R, a, ξ} R^2 + C Σ_{i=1}^{l} w_i ξ_i
s.t. ||x_i - a||^2 ≤ R^2 + ξ_i,
ξ_i ≥ 0, i = 1, ..., l (10)
Equation (9) accordingly becomes:
max_α Σ_{i=1}^{l} α_i <x_i, x_i> - Σ_{i=1}^{l} Σ_{j=1}^{l} α_i α_j <x_i, x_j>
s.t. 0 ≤ α_i ≤ w_i C, i = 1, ..., l
Σ_{i=1}^{l} α_i = 1 (11)
By replacing the inner product <x_i, x_j> in equations (9) and (11) with a kernel function K<x_i, x_j>, the kernelized versions of the OCSVM and WOCSVM are obtained. The most suitable parameters are determined by five-fold cross-validation. In general, the free parameters are difficult to tune when only samples of the target class are available: in this case only the true positive rate (sensitivity) can be calculated, while the complementary error measure (specificity) cannot. The free parameters are therefore selected by the following equation:
θ̂ = argmax_θ [criterion combining OA and nSV] (12) (closed form shown only as an image in the original; it selects the parameter set that maximizes the overall accuracy while keeping the number of support vectors small)
where θ is the entire set of free parameters, θ̂ is the optimal parameter set obtained by five-fold cross-validation, OA is the overall accuracy, and nSV is the number of support vectors. This performance constraint limits the complexity of the model by keeping the number of support vectors low, thereby achieving higher overall accuracy. To reduce the computational cost of the large training sample size, the OCSVM classification is implemented with a GPU-accelerated LIBSVM package.
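The weighted one-class classification step can be sketched with scikit-learn's OneClassSVM, the nu-parameterized relative of the SVDD in equations (8)-(11). This is a hedged substitute for the GPU-accelerated LIBSVM setup the text mentions: the per-sample weights passed to fit() stand in for the w_i of equation (4), and the spectra are synthetic.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)
X_train = rng.normal(0.3, 0.03, size=(200, 7))   # impervious-like 7-band spectra
weights = rng.uniform(0.5, 1.0, size=200)        # stand-in for the weights w_i

# RBF-kernelized one-class SVM; sample_weight down-weights suspect samples,
# mimicking the weighted formulation in equations (10)-(11).
clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1)
clf.fit(X_train, sample_weight=weights)

X_test = np.vstack([
    rng.normal(0.3, 0.03, size=(5, 7)),  # spectra similar to the target class
    rng.normal(0.9, 0.03, size=(5, 7)),  # clearly dissimilar spectra
])
pred = clf.predict(X_test)  # +1 for in-class, -1 for outliers
print(pred)
```

In the patent's pipeline the decision values over the whole image would form the impervious-layer probability map passed to step S4; here the dissimilar spectra are rejected as outliers.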
S4. Cluster the probability map obtained in step S3.
The impervious-layer probability map obtained in the previous step is clustered with the expectation-maximization (EM) clustering algorithm; the purpose of the clustering is to remove image noise and fill holes in the image. The general steps of the EM algorithm are:
E-step: Q_i(z^(i)) := p(z^(i) | x^(i); θ)
M-step: θ := argmax_θ Σ_i Σ_{z^(i)} Q_i(z^(i)) log [ p(x^(i), z^(i); θ) / Q_i(z^(i)) ]
The probability map is divided into 5 classes with the EM algorithm, the class or classes belonging to the impervious layer are extracted, and the urban built-up area (impervious layer) to be extracted is finally obtained.
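The five-class EM clustering of step S4 can be sketched with a Gaussian mixture fitted by EM. This is a minimal illustration under stated assumptions: a 1-D synthetic probability map stands in for the real output of step S3, and selecting the highest-mean component as "impervious" is one plausible reading of extracting the impervious class(es).

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic per-pixel impervious probabilities (flattened probability map).
rng = np.random.default_rng(2)
prob_map = np.concatenate([
    rng.uniform(0.0, 0.2, 500),   # clearly pervious pixels
    rng.uniform(0.8, 1.0, 300),   # clearly impervious pixels
    rng.uniform(0.3, 0.7, 200),   # ambiguous pixels
]).reshape(-1, 1)

# EM-fitted mixture with 5 components, as in the patent's 5-class clustering.
gmm = GaussianMixture(n_components=5, random_state=0).fit(prob_map)
labels = gmm.predict(prob_map)

# Keep the component with the highest mean probability as the impervious class.
impervious_class = int(np.argmax(gmm.means_.ravel()))
built_up = labels == impervious_class
print(int(built_up.sum()))
```

On real data the cluster labels would be reshaped back to the image grid, which is how the clustering suppresses isolated noise pixels and fills holes.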
Fig. 5 and 7 are graphs showing the result of extracting an impervious layer using Twitter data as a training sample and OSM data as a training sample, respectively.
Although the embodiments of the present invention have been described in detail with reference to the accompanying drawings, the embodiments of the present invention are not limited to the details of the above embodiments, and various simple modifications can be made to the technical solutions of the embodiments of the present invention within the technical idea of the embodiments of the present invention, and the simple modifications all belong to the protection scope of the embodiments of the present invention.
It should be noted that the various features described in the above embodiments may be combined in any suitable manner without departing from the scope of the invention. In order to avoid unnecessary repetition, the embodiments of the present invention do not describe every possible combination.
In addition, any combination of various different implementation manners of the embodiments of the present invention can be made, and the embodiments of the present invention should also be regarded as the disclosure of the embodiments of the present invention as long as the combination does not depart from the spirit of the embodiments of the present invention.

Claims (6)

1. A method for extracting urban built-up areas by using multi-source data to assist remote sensing image classification is characterized by comprising the following steps:
(1) carrying out data preprocessing on the obtained remote sensing image;
(2) acquiring crowd-sourced data, selecting training samples from the crowd-sourced data, calculating the number of training samples in a single pixel according to the spatial resolution of the remote sensing image, and further calculating the frequency and spectral similarity; the crowd-sourced data comprises two open data sources, namely social media data and OpenStreetMap, wherein the social media data is taken from Twitter; the number of tweets within a single pixel of the Landsat image is first calculated, and then only pixels containing at least one tweet are taken as training samples; Ω = {x_i | i = 1, ..., l} denotes the training sample set, where x_i is a spectral vector whose dimension equals the number of bands of the multispectral image;
the frequency and spectral similarity of the Twitter data are measured as follows:
1) measuring the frequency of the Twitter data: the tweet frequency of the ith training sample is defined as
F_i = T_i / max_{j=1,...,l} T_j (1)
where l is the number of Landsat pixels containing tweets and T_i is the number of tweets in the corresponding pixel; a larger value of T_i indicates a higher probability that the training sample is located on the impervious layer;
2) measuring the spectral similarity of the Twitter data: assuming the training samples derived from Twitter form a cluster, the distance from the impervious class to the cluster center of the training samples is smaller than the distance from the pervious class to that center, and the distance from a training sample to the cluster center is measured quantitatively with the Minimum Covariance Determinant (MCD); when the point cloud is symmetrically distributed around a center, the probability density function of the ith training sample is expressed as
f(x_i) = (2π)^(-d/2) |Σ|^(-1/2) exp(-(1/2)(x_i - u)^T Σ^(-1) (x_i - u)) (2)
where x_i is the spectral vector of the ith training sample, u is the sample mean, Σ is the covariance matrix, and d is the number of bands; the distance of the ith training sample from the mean is defined as the spectral similarity S_i:
S_i = (x_i - û)^T Σ̂^(-1) (x_i - û) (3)
where û and Σ̂ are the robust sample mean and covariance estimated by the MCD, and a smaller value of S_i indicates a higher probability that the ith training sample falls in the impervious area;
based on the frequency and spectral similarity of the Twitter data, the weight of the ith training sample is expressed as:
w_i = w(F_i, S_i) (4) (closed form shown only as an image in the original; the weight increases with F_i and decreases with S_i)
for the data taken from OpenStreetMap, the impervious layer is first converted into two OSM raster images with spatial resolutions of 1 m and 30 m, respectively, and the ith impervious pixel of the 30 m OSM raster image covers a region R_i formed by 900 pixels of the 1 m OSM raster image; thus, the frequency of the ith impervious pixel is defined as:
F_i^O = l_i^O / 900 (5)
where l_i^O is the number of impervious pixels within region R_i of the 1 m resolution OSM raster image and l^o is the number of impervious-layer pixels of the 30 m resolution OSM raster image;
the spectral similarity S_i^O of the impervious pixels taken from OSM as training samples is expressed as follows:
S_i^O = (y_i - û)^T Σ̂^(-1) (y_i - û) (6)
where y_i is the spectral vector of the ith training sample and û and Σ̂ are the sample mean and sample covariance matrix, respectively; a smaller value of S_i^O indicates a higher probability that the ith training sample falls in the impervious area; the weight w_i^O of a training sample generated by OSM is expressed as follows:
w_i^O = w(F_i^O, S_i^O) (7) (closed form shown only as an image in the original; obtained by substituting F_i^O and S_i^O into equation (4))
(3) extracting the impervious layer by using a one-class support vector machine;
(4) performing EM clustering on the impervious layer to obtain the urban built-up area to be extracted.
2. The method for extracting urban built-up areas using crowd-sourced data to assist remote sensing image classification according to claim 1, wherein the remote sensing image in step (1) is taken from Landsat-8, and the data preprocessing comprises data correction, outlier detection and cloud masking.
3. The method for extracting urban built-up areas using crowd-sourced data to assist remote sensing image classification according to claim 2, wherein the data correction converts the terrain-corrected data into radiance values using the provided calibration coefficients in the case of multispectral data; the outlier detection removes abnormal values caused by acquisition inconsistencies or calibration errors; and the cloud masking selects multispectral images with a cloud cover of less than 10% to avoid cloud-coverage effects.
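The three preprocessing steps of claims 2–3 can be sketched as below. The gain/offset values stand in for the per-band radiance rescaling coefficients shipped in Landsat-8 metadata, and the scene list, thresholds, and helper names are illustrative assumptions, not part of the patent:

```python
import numpy as np

def dn_to_radiance(dn, mult, add):
    """Convert quantized DN values to radiance: L = mult * DN + add."""
    return mult * dn.astype(np.float64) + add

def mask_outliers(radiance, low, high):
    """Flag values outside a plausible range (acquisition/calibration errors)."""
    return np.where((radiance < low) | (radiance > high), np.nan, radiance)

def select_clear_scenes(scenes, max_cloud=10.0):
    """Keep only scenes whose reported cloud cover is below the threshold."""
    return [s for s in scenes if s["cloud_cover"] < max_cloud]

# Hypothetical band with one saturated value; coefficients are placeholders.
dn = np.array([[100, 200], [65535, 150]], dtype=np.uint16)
rad = mask_outliers(dn_to_radiance(dn, mult=0.01, add=0.0), low=0.0, high=500.0)
scenes = select_clear_scenes(
    [{"id": "sceneA", "cloud_cover": 4.2}, {"id": "sceneB", "cloud_cover": 35.0}]
)
print(np.isnan(rad).sum(), [s["id"] for s in scenes])  # → 1 ['sceneA']
```

In practice the coefficients come from the scene metadata file and the cloud-cover figure from the scene catalog; here both are hard-coded for the sketch.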
4. The method for extracting urban built-up areas using crowd-sourced data to assist remote sensing image classification according to claim 1, wherein in step (3) the remote sensing image is classified with a one-class support vector machine (OCSVM) to extract the impervious-layer information;
the OCSVM trains a hypersphere of minimum volume that contains the maximum number of training samples of a single class; the distance from a sample to the boundary is interpreted as a degree of similarity and provides a reference for judging whether the sample belongs to the specific class;
the OCSVM is formulated as the following constrained convex optimization problem:

$\min_{R,\,a,\,\xi}\; R^2 + C\sum_{i=1}^{l}\xi_i \quad \text{s.t.} \quad \|x_i - a\|^2 \le R^2 + \xi_i,\;\; \xi_i \ge 0,\;\; i = 1, \dots, l \quad (8)$
where $a$ and $R$ are the center and the radius of the hypersphere, respectively, $\|\cdot\|$ is the Euclidean norm, $\xi_i$ is a slack variable controlling the degree of relaxation, and $C$ is a trade-off parameter; the dual form of equation (8) is as follows:
$\max_{\alpha}\; \sum_{i=1}^{l} \alpha_i \langle x_i, x_i \rangle - \sum_{i=1}^{l}\sum_{j=1}^{l} \alpha_i \alpha_j \langle x_i, x_j \rangle \quad \text{s.t.} \quad \sum_{i=1}^{l} \alpha_i = 1,\;\; 0 \le \alpha_i \le C \quad (9)$
where $\langle \cdot, \cdot \rangle$ is the inner product and $\{\alpha_1, \dots, \alpha_l\}$ are the Lagrange multipliers;
to make the weight of the training samples from the impervious area larger than that of the training samples from the pervious area, the weights of equation (4) are introduced into equation (8), as shown below:

$\min_{R,\,a,\,\xi}\; R^2 + C\sum_{i=1}^{l} w_i \xi_i \quad \text{s.t.} \quad \|x_i - a\|^2 \le R^2 + \xi_i,\;\; \xi_i \ge 0,\;\; i = 1, \dots, l \quad (10)$
equation (9) accordingly becomes:

$\max_{\alpha}\; \sum_{i=1}^{l} \alpha_i \langle x_i, x_i \rangle - \sum_{i=1}^{l}\sum_{j=1}^{l} \alpha_i \alpha_j \langle x_i, x_j \rangle \quad \text{s.t.} \quad \sum_{i=1}^{l} \alpha_i = 1,\;\; 0 \le \alpha_i \le w_i C \quad (11)$
by replacing the inner products $\langle x_i, x_j \rangle$ in equations (9) and (11) with a kernel function $K(x_i, x_j)$, the kernelized versions of the OCSVM and of the weighted one-class support vector machine (WOCSVM) are obtained; the most appropriate parameters are determined by five-fold cross-validation, and the free parameters are selected by the following formula:
[Equation (12): the criterion selecting the optimal free parameters $\hat{\theta}$ over the parameter set $\Theta$ based on the overall accuracy OA and the number of support vectors nSV; equation image not reproduced]

where $\Theta$ is the entire set of free parameters, $\hat{\theta}$ is the optimal parameter set obtained by five-fold cross-validation, OA is the overall accuracy, and nSV is the number of support vectors.
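A weighted one-class SVM in the spirit of claim 4 can be sketched with scikit-learn. Note the library implements the ν-formulation of the OCSVM rather than the hypersphere form of equation (8); with an RBF kernel the two are equivalent, and `sample_weight` plays the role of the per-sample weights in the weighted primal problem. The data and weights below are synthetic:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(2)
# Synthetic single-class training set: spectral vectors of candidate
# impervious pixels, some more trustworthy than others.
X = rng.normal(0.5, 0.05, size=(200, 6))
w = rng.uniform(0.2, 1.0, size=200)   # stand-in for the per-sample weights

# nu-OCSVM with RBF kernel; sample_weight rescales each slack term,
# mirroring the weighted slack penalty of the weighted primal problem.
clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1)
clf.fit(X, sample_weight=w)

test_points = np.vstack([X.mean(axis=0), np.full(6, 2.0)])
print(clf.predict(test_points))  # inlier → 1, far-away point → -1
```

In the method itself the kernel width and ν would be chosen by five-fold cross-validation rather than fixed as here.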
5. The method for extracting urban built-up areas using crowd-sourced data to assist remote sensing image classification according to claim 1, wherein the EM clustering algorithm in step (4) clusters the impervious-layer probability map obtained in step (3), so as to remove image noise from the probability map and fill holes existing in the image.
6. The method for extracting urban built-up areas using crowd-sourced data to assist remote sensing image classification according to claim 5, wherein the EM clustering algorithm classifies the probability map into five classes and extracts from them at least one class belonging to the impervious layer.
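Claims 5–6 can be sketched with scikit-learn's `GaussianMixture`, which is fitted by EM; the five-class setting follows claim 6, while the probability map below is synthetic (in the method it would come from the one-class SVM of step (3)), and picking the highest-mean component as impervious is an illustrative assumption:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
# Synthetic probability map (values in [0, 1]); high values suggest
# impervious surface, low values pervious surface.
prob_map = np.clip(
    np.concatenate([rng.normal(0.9, 0.05, 500), rng.normal(0.2, 0.1, 1500)]),
    0.0, 1.0,
).reshape(-1, 1)

# EM clustering into five classes, as in claim 6.
gmm = GaussianMixture(n_components=5, random_state=0).fit(prob_map)
labels = gmm.predict(prob_map)

# Take the component with the highest mean probability as impervious.
impervious_class = int(np.argmax(gmm.means_.ravel()))
built_up_mask = labels == impervious_class
print(built_up_mask.sum() > 0)  # → True
```

On a real probability map the pixel grid would be flattened before clustering and the mask reshaped back, after which small holes could be filled morphologically.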
CN201910208202.8A 2019-03-19 2019-03-19 Method for extracting urban built-up area by using multi-source data to assist remote sensing image classification Active CN109948697B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910208202.8A CN109948697B (en) 2019-03-19 2019-03-19 Method for extracting urban built-up area by using multi-source data to assist remote sensing image classification

Publications (2)

Publication Number Publication Date
CN109948697A CN109948697A (en) 2019-06-28
CN109948697B true CN109948697B (en) 2022-08-26

Family

ID=67008433

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110781948A (en) * 2019-10-22 2020-02-11 Beijing SenseTime Technology Development Co., Ltd. Image processing method, apparatus, device and storage medium
CN111125553B (en) * 2019-11-22 2022-05-31 Institute of Urban Environment, Chinese Academy of Sciences Intelligent urban built-up area extraction method supporting multi-source data
CN115270904B (en) * 2022-04-13 2023-04-18 Guangzhou Urban Planning & Design Survey Research Institute Method and system for spatialization of the school-age permanent population in the compulsory-education stage

Citations (4)

Publication number Priority date Publication date Assignee Title
CN109034233A (en) * 2018-07-18 2018-12-18 Wuhan University Multi-classifier combination classification method for high-resolution remote sensing images combining OpenStreetMap
CN109063754A (en) * 2018-07-18 2018-12-21 Wuhan University Multi-feature joint classification method for remote sensing images based on OpenStreetMap
CN109299673A (en) * 2018-09-05 2019-02-01 Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences Method and medium for extracting the greenness space of urban agglomerations
CN109325085A (en) * 2018-08-08 2019-02-12 Central South University Urban land-use function identification and change detection method

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US20160379388A1 (en) * 2014-07-16 2016-12-29 Digitalglobe, Inc. System and method for combining geographical and economic data extracted from satellite imagery for use in predictive modeling

Non-Patent Citations (2)

Title
Semisupervised One-Class Support Vector Machines for Classification of Remote Sensing Data; Jordi Muñoz-Marí et al.; IEEE Transactions on Geoscience and Remote Sensing; August 2010; Vol. 48, No. 8; pp. 3188-3197 *
Research and application of a weighted SVDD algorithm in human pose estimation; Han Guijin; Computer Engineering and Applications; 2017-08-01; Vol. 53, No. 15; pp. 132-136 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant