CN104820774A - Space complexity based mapsheet sampling method - Google Patents


Info

Publication number
CN104820774A
Authority
CN
China
Prior art keywords
sampling
map sheet
parameter
sample
space complexity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510181877.XA
Other languages
Chinese (zh)
Other versions
CN104820774B (en)
Inventor
童小华
谢欢
孟雯
***
刘世杰
陈鹏
张松林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University
Priority to CN201510181877.XA
Publication of CN104820774A
Application granted
Publication of CN104820774B
Legal status: Active
Anticipated expiration

Landscapes

  • Length Measuring Devices By Optical Means (AREA)
  • Complex Calculations (AREA)

Abstract

The invention relates to a space complexity based mapsheet sampling method. The method comprises a first step of acquiring overall prior information of mapsheet data sampling; a second step of acquiring a sample size of a mapsheet sampling scheme according to a probability statistics theory and the prior information; a third step of acquiring global kernel density estimation parameters of each mapsheet; a fourth step of acquiring the effective area ratio parameter of each mapsheet; a fifth step of according to the global kernel density estimation parameters, the effective area ratio parameter and the sample size of the mapsheet sampling scheme, performing layered mapsheet sampling based on the space complexity; and a sixth step of further performing sample arrangement and adjustment and outputting a result. Compared with the prior art, the method breaks the limitation that a conventional sampling method does not take the spatial relationship of a sample object into account and lacks a unified sample size quantitative model, and realizes quantitative estimation of the sample size by adopting a probability statistics based sample model; the sample arrangement method takes the spatial complexity character and distribution of the sample object into account, and guarantees the spatial representativeness of the sample.

Description

A map sheet sampling method based on space complexity
Technical field
The present invention relates to a map sheet sampling method, and in particular to a map sheet sampling method based on space complexity.
Background art
Sampling inspection provides reliable information for product quality management and is a basic means of quality control. Traditional sampling methods include simple random sampling, stratified sampling, systematic sampling and cluster sampling, and they target simple, one-dimensional, homogeneous data. Surveying and mapping products are spatial data, characterized by massive volume, multiple dimensions and heterogeneity. Traditional sampling methods are clearly inadequate for such complex spatial data: they lack a unified sampling model, and they neither express nor exploit the spatial information of the data. To address this problem, some scholars have improved the traditional sampling methods and sampling models. For example, Duarte and Saraiva used nonlinear integer programming to minimize the sum of deviations and proposed an optimization model for attribute sampling; Wang Jingfeng et al. established the sandwich spatial sampling model and studied, through a series of error estimates, how spatial features affect the accuracy of the results. These studies combine spatial data with error theory to propose new methods, but the sampling models remain unstable and give no consideration to the concrete spatial arrangement of the samples.
Summary of the invention
The object of the present invention is to overcome the above defects of the prior art by providing a map sheet sampling method based on space complexity. Samples drawn by this method are more representative, the resulting accuracy estimates are more credible, and the method offers a new approach to key problems in spatial data quality evaluation.
The object of the present invention can be achieved through the following technical solution:
A map sheet sampling method based on space complexity comprises the following steps:
1) acquire the overall prior information for sampling the map sheet data;
2) obtain the sample size of the map sheet sampling scheme from probability and statistics theory and the prior information;
3) obtain the global kernel density estimation parameter of each map sheet;
4) obtain the effective area ratio parameter of each map sheet;
5) perform stratified map sheet sampling based on space complexity, according to the global kernel density estimation parameters, the effective area ratio parameters and the sample size of the map sheet sampling scheme;
6) further adjust the sample arrangement and output the result.
The prior information comprises the coverage, geographic location, sheet division rule, sampling unit, production time, production method, storage format and quantity of the map sheet data.
Step 2) is specifically: according to the prior information, and using probability and statistics theory to control the sampling error, the total sample size of the map sheet sampling scheme is calculated quantitatively by the following formula:
$$n = \frac{\dfrac{\mu_{1-\alpha/2}^{2}\,(1-p)}{r^{2}\,p}}{1 + \dfrac{1}{N}\left(\dfrac{\mu_{1-\alpha/2}^{2}\,(1-p)}{r^{2}\,p} - 1\right)} \qquad (1)$$
where n is the sample size to be calculated; N is the population size (the total number of map sheets); α is taken as 5%; $\mu_{1-\alpha/2}$ is the critical value of the standard normal distribution at level 1-α/2; p is the estimated error rate; and r is the relative error.
The value of p is taken as the acceptance quality limit (AQL).
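As a reading aid, formula (1) can be split into two stages. The symbol $n_0$ below is introduced here for illustration only and does not appear in the patent text: it is the sample size that would be needed for an unbounded population, and the denominator of formula (1) then acts as a finite-population correction.

```latex
n_0 = \frac{\mu_{1-\alpha/2}^{2}\,(1-p)}{r^{2}\,p},
\qquad
n = \frac{n_0}{1 + \dfrac{n_0 - 1}{N}},
\qquad
\lim_{N \to \infty} n = n_0 .
```

With α = 5%, the critical value is $\mu_{0.975} \approx 1.96$. Because the correction factor is at least 1 whenever $n_0 \ge 1$, the corrected n never exceeds either $n_0$ or N.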
Step 3) is specifically: following probability theory, let $X_i$ (i = 1, ..., M) be the sampling features of a map sheet, regarded as independent, identically distributed samples drawn from a population with density function f, where M is the total number of sampling features in a single map sheet. The estimate $f_M(x)$ of f at a point x of the map sheet is given by:
$$f_M(x) = \frac{1}{Mh} \sum_{i=1}^{M} k\!\left(\frac{x - X_i}{h}\right) \qquad (2)$$
where k(·) is the kernel function, h > 0 is the bandwidth, and $(x - X_i)$ is the distance from the estimation point x to $X_i$; $f_M(x)$ is used to represent the global kernel density estimation parameter of a single map sheet.
The kernel function is the quartic polynomial kernel or the normal (Gaussian) kernel.
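The patent names the two kernels without writing them out. For reference, their usual textbook forms are given below, with $u = (x - X_i)/h$; the normalizing constants are the standard ones for the one-dimensional case and are an assumption here, since the patent does not state which normalization it uses.

```latex
k_{\text{quartic}}(u) =
\begin{cases}
\dfrac{15}{16}\,(1 - u^{2})^{2}, & |u| \le 1 \\[6pt]
0, & |u| > 1
\end{cases}
\qquad
k_{\text{normal}}(u) = \frac{1}{\sqrt{2\pi}}\, e^{-u^{2}/2}
```

The quartic kernel has compact support, so features farther than one bandwidth from x contribute nothing, whereas the normal kernel gives every feature a small non-zero weight.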
The effective area ratio parameter of each map sheet is the proportion of the sheet's area occupied by the sampling features in that sheet.
Step 5) is specifically:
501: normalize the global kernel density estimation parameter from step 3) and the effective area ratio parameter from step 4) separately; with 0.5 as the cut point, label parameter values in the interval [0, 0.5) as low and values in the interval [0.5, 1] as high;
502: divide the map sheets into four layers according to the two parameters, and set a sampling rate for each layer;
503: multiply the sample size of the map sheet sampling scheme from step 2) by each layer's sampling rate to obtain the number of map sheet samples to be placed in that layer, and place the samples at random layer by layer.
In step 502, the sampling rates are set as follows:
when the global kernel density estimation parameter is low and the effective area ratio parameter is low, the sampling rate of the layer is 10%;
when the global kernel density estimation parameter is low and the effective area ratio parameter is high, the sampling rate of the layer is 20%;
when the global kernel density estimation parameter is high and the effective area ratio parameter is low, the sampling rate of the layer is 30%;
when the global kernel density estimation parameter is high and the effective area ratio parameter is high, the sampling rate of the layer is 40%.
Step 6) is specifically: the sample arrangement may be further adjusted according to regional heterogeneity and/or expert knowledge, and the map sheet sampling result is output.
Compared with the prior art, the present invention has the following advantages:
1) The invention overcomes the limitations of traditional sampling methods, which ignore the spatial relationships of the sampled objects and lack a unified quantitative model for sample size. It uses a sampling model based on probability and statistics to estimate the sample size quantitatively, and its sample arrangement method takes the space complexity and distribution of the sampled objects into account, ensuring that the samples are spatially representative.
2) Samples drawn by combining the global kernel density estimation parameter with the effective area ratio parameter are more representative, and the resulting accuracy estimates are more credible. The method also offers a new approach to key problems in spatial data quality evaluation, is applicable to spatial data, and ensures that the spatial information of the samples is properly expressed and used.
Brief description of the drawings
Fig. 1 is a flow chart of the map sheet sampling method of the present invention, which takes space complexity into account;
Fig. 2 is a schematic diagram of the global sheet division of the experimental data GlobeLand30;
Fig. 3 is a schematic diagram of the global space complexity of the experimental data GlobeLand30;
Fig. 4 is a schematic diagram of the distribution of the global GlobeLand30 map sheet samples.
Detailed description of the embodiments
The present invention is described in detail below with reference to the drawings and a specific embodiment. The embodiment is implemented on the basis of the technical solution of the invention, and a detailed implementation and concrete operating procedure are given, but the protection scope of the invention is not limited to the following embodiment.
In this method, a map sheet is defined as the smallest unit in which the results of collecting, producing, editing and drawing spatial data are stored, for example the global standard sheet division or the national 1:1,000,000 standard sheet division. If there is no explicit sheet division rule, geographic zones or administrative boundaries may be used instead, for example continents, countries or cities. Map sheet sampling applies to spatial data products and takes the map sheet as the sampling unit. The sampling scheme determines the sample size with a sampling model, the sample arrangement scheme places the samples according to the space complexity and distribution of the data, and together they produce the final map sheet sample.
As shown in Fig. 1, a map sheet sampling method based on space complexity comprises the following steps:
1) Acquire the overall prior information for sampling the map sheet data.
The prior information comprises the coverage, geographic location, sheet division rule, sampling unit, production time, production method, storage format and quantity of the map sheet data.
2) Obtain the sample size of the map sheet sampling scheme from probability and statistics theory and the prior information. Specifically: according to the prior information, and using probability and statistics theory to control the sampling error, the total sample size of the map sheet sampling scheme is calculated quantitatively by the following formula:
$$n = \frac{\dfrac{\mu_{1-\alpha/2}^{2}\,(1-p)}{r^{2}\,p}}{1 + \dfrac{1}{N}\left(\dfrac{\mu_{1-\alpha/2}^{2}\,(1-p)}{r^{2}\,p} - 1\right)} \qquad (1)$$
where n is the sample size to be calculated; N is the population size (the total number of map sheets); α is taken as 5%; $\mu_{1-\alpha/2}$ is the critical value of the standard normal distribution at level 1-α/2; p is the estimated error rate, for which the acceptance quality limit (AQL) may be used; and r is the relative error. The values of α, p and r come from the prior information.
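The sample size calculation of formula (1) can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `sample_size`, the choice to round up to a whole map sheet, and the input values used in the usage note are all assumptions made here.

```python
import math
from statistics import NormalDist


def sample_size(N, p, r, alpha=0.05):
    """Sample size n from formula (1).

    N     -- population size (total number of map sheets)
    p     -- estimated proportion used in the formula (e.g. an AQL value)
    r     -- relative error
    alpha -- significance level; the patent takes 5%
    """
    if N < 1 or not (0.0 < p < 1.0) or r <= 0.0:
        raise ValueError("need N >= 1, 0 < p < 1 and r > 0")
    # Standard normal critical value at level 1 - alpha/2 (about 1.96 for 5%).
    u = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Numerator of formula (1): sample size for an unbounded population.
    n0 = u ** 2 * (1.0 - p) / (r ** 2 * p)
    # Denominator of formula (1): finite-population correction.
    n = n0 / (1.0 + (n0 - 1.0) / N)
    # Assumption: round up so the requested precision is not undershot.
    return math.ceil(n)
```

For example, `sample_size(847, 0.5, 0.2)` evaluates formula (1) for a population of 847 map sheets with illustrative values of p and r; these inputs are placeholders, because the patent does not list the p and r used in its experiment.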
3) Obtain the global kernel density estimation parameter of each map sheet. Specifically:
A map sheet is the smallest storage unit of the product data and represents a certain spatial extent. Space complexity at the map sheet scale concerns the complexity of the data at the global scale: the more data types a sheet contains, the richer the spatial information it expresses, the greater its space complexity, and the greater the probability that the spatial data of that region is sampled. Therefore a nonparametric method, kernel density estimation, is used to express quantitatively the complexity and distribution of the spatial data.
Kernel density estimation follows probability theory: let $X_i$ (i = 1, ..., M) be the sampling features of a map sheet (for example attribute features or spatial features), regarded as independent, identically distributed samples drawn from a population with density function f, where M is the total number of sampling features in a single map sheet. The estimate $f_M(x)$ of f at a point x of a single map sheet is usually given by the Rosenblatt-Parzen kernel estimator:
$$f_M(x) = \frac{1}{Mh} \sum_{i=1}^{M} k\!\left(\frac{x - X_i}{h}\right) \qquad (2)$$
where k(·) is the kernel function, h > 0 is the bandwidth, and $(x - X_i)$ is the distance from the estimation point x to $X_i$. The global kernel density estimation parameter of a single map sheet is derived from $f_M(x)$, for example by taking its mean. The kernel function is the quartic polynomial kernel or the normal (Gaussian) kernel.
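The estimator in formula (2) can be sketched as follows. The sketch keeps the one-dimensional form exactly as the formula is written (a factor of 1/(Mh)); a two-dimensional spatial implementation would normally use a different normalization, which the patent does not spell out. The function names, the kernel constants and the idea of averaging over a list of evaluation points are assumptions for illustration.

```python
import math


def quartic_kernel(u):
    """Quartic (biweight) kernel, non-zero only for |u| <= 1."""
    return 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0


def normal_kernel(u):
    """Standard normal (Gaussian) kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, features, h, kernel=quartic_kernel):
    """Formula (2): f_M(x) = 1/(M*h) * sum_i k((x - X_i) / h)."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    if not features:
        raise ValueError("at least one sampling feature is required")
    m = len(features)
    return sum(kernel((x - xi) / h) for xi in features) / (m * h)


def sheet_parameter(features, h, eval_points, kernel=quartic_kernel):
    """Mean of f_M over the evaluation points of one map sheet.

    The embodiment mentions the mean as one way to reduce f_M(x) to a
    single per-sheet value; the evaluation points are an assumption.
    """
    values = [kde(x, features, h, kernel) for x in eval_points]
    return sum(values) / len(values)
```

A call such as `sheet_parameter([0.2, 0.4, 0.45], 0.1, [i / 10 for i in range(11)])` yields one number per map sheet, which can then be normalized across all sheets in step 501.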
4) useful area obtaining each map sheet compares parameter.The useful area of each map sheet is the area ratio that in single map sheet, Sampling characters accounts for this map sheet than parameter.
5) Perform stratified map sheet sampling based on space complexity, according to the global kernel density estimation parameters, the effective area ratio parameters and the sample size of the map sheet sampling scheme. Specifically:
501: normalize the global kernel density estimation parameter from step 3) and the effective area ratio parameter from step 4) separately; with 0.5 as the cut point, label parameter values in the interval [0, 0.5) as low and values in the interval [0.5, 1] as high;
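Step 501 can be sketched as below. The patent says only that the parameters are normalized; min-max scaling to [0, 1] is assumed here because it matches the stated intervals, and the function names are hypothetical.

```python
def min_max_normalize(values):
    """Scale values linearly to [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case, handled by assumption: all sheets get 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def high_low(values, cut=0.5):
    """Label each normalized value: [0, cut) is low, [cut, 1] is high."""
    return ["high" if v >= cut else "low" for v in min_max_normalize(values)]
```

The same function is applied once to the kernel density parameters of all map sheets and once to their effective area ratios, so that each sheet ends up with a pair of labels.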
502: divide the map sheets into four layers according to the two parameters, and set a sampling rate for each layer, where the sampling rate of a layer is the proportion of the map sheet samples drawn from that layer; the settings are given in Table 1:
Table 1. Stratified map sheet sampling based on space complexity
Layer   Stratification                                                        Sampling rate
1       Low kernel density estimate & low map sheet effective area ratio     10%
2       Low kernel density estimate & high map sheet effective area ratio    20%
3       High kernel density estimate & low map sheet effective area ratio    30%
4       High kernel density estimate & high map sheet effective area ratio   40%
503: multiply the sample size of the map sheet sampling scheme from step 2) by each layer's sampling rate to obtain the number of map sheet samples to be placed in that layer, and place the samples at random layer by layer.
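Steps 502 and 503 amount to a lookup of the layer's rate followed by a random draw within each layer. The sketch below is one way to do this under stated assumptions: the dictionary layout, the function name, rounding each quota to the nearest integer, and capping a quota at the number of sheets available in the layer are all choices made here, not taken from the patent.

```python
import random

# Sampling rates from Table 1, keyed by (kernel density level, area ratio level).
RATES = {
    ("low", "low"): 0.10,
    ("low", "high"): 0.20,
    ("high", "low"): 0.30,
    ("high", "high"): 0.40,
}


def stratified_draw(sheets, n, seed=None):
    """Draw map sheets layer by layer.

    sheets -- dict mapping sheet id to its (density level, area level) pair
    n      -- total sample size from step 2)
    seed   -- optional seed so that a draw can be reproduced
    """
    rng = random.Random(seed)
    layers = {key: [] for key in RATES}
    for sheet_id, key in sheets.items():
        layers[key].append(sheet_id)
    chosen = {}
    for key, rate in RATES.items():
        members = sorted(layers[key])
        # Assumption: round the quota; quotas may then not sum exactly to n.
        quota = min(round(n * rate), len(members))
        chosen[key] = rng.sample(members, quota)
    return chosen
```

With n = 80 the quotas are 8, 16, 24 and 32 sheets for layers 1 to 4, provided each layer contains at least that many map sheets.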
6) The sample arrangement may be further adjusted according to regional heterogeneity and/or expert knowledge, and the map sheet sampling result is output.
In this embodiment, the experimental data are the 2010 data of GlobeLand30, China's first 30 m global land cover remote sensing mapping product, comprising more than 800 map sheets worldwide. The sheet division rule varies with latitude: between 60°S and 60°N the sheet size is 5° × 6°, and between 60° and 90° in both hemispheres it is 5° × 12°, as shown in Fig. 2.
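The latitude-dependent sheet division rule can be written as a small lookup. The sketch assumes that the first number of each size is the latitude extent and the second the longitude extent, and that a sheet boundary lying exactly on 60° belongs to the high-latitude band; the patent text does not state either point. The count returned by `nominal_sheet_count` is that of a full global grid including ocean, so it is larger than the number of sheets actually present in the data set.

```python
def sheet_size(lat):
    """Return (latitude extent, longitude extent) in degrees at latitude lat."""
    if not -90 <= lat <= 90:
        raise ValueError("latitude must be within [-90, 90]")
    # Assumption: exactly 60 degrees is treated as high latitude.
    return (5, 6) if abs(lat) < 60 else (5, 12)


def nominal_sheet_count():
    """Number of cells in the full global grid implied by the rule."""
    low_band = (120 // 5) * (360 // 6)        # 24 rows x 60 columns
    high_bands = 2 * (30 // 5) * (360 // 12)  # 2 caps, 6 rows x 30 columns
    return low_band + high_bands
```

The nominal grid has 1800 cells, which is consistent with the product containing only somewhat more than 800 sheets if sheets without data are not produced; that interpretation is an inference, not a statement from the patent.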
First, the data information is obtained and the sample size is calculated with the sampling model (formula 1), as shown in Table 2 below:
Table 2. First-level sampling scheme and sample allocation for the global land cover remote sensing mapping product
According to the quantitative estimate of the sample size, 80 samples are to be drawn from the 847 map sheets worldwide, and the samples are further allocated according to the area weight of each continent.
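Allocation by area weight can be sketched as a proportional split with integer results. The patent does not say how fractional shares are resolved, so the largest-remainder rule used below is an assumption, and the region names and weights in the usage note are hypothetical placeholders rather than the continental figures used in the experiment.

```python
def allocate(n, weights):
    """Split n samples across regions in proportion to their weights.

    Uses the largest-remainder rule (an assumption) so that the integer
    allocations always sum to n.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    exact = {k: n * w / total for k, w in weights.items()}
    alloc = {k: int(v) for k, v in exact.items()}  # floor of each share
    leftover = n - sum(alloc.values())
    # Hand the remaining samples to the regions with the largest fractions.
    by_fraction = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_fraction[:leftover]:
        alloc[k] += 1
    return alloc
```

For example, `allocate(80, {"A": 3, "B": 2, "C": 1})` distributes the 80 samples over three hypothetical regions whose areas are in the ratio 3:2:1.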
The space complexity of each map sheet is calculated according to formula 2, as shown in Fig. 3.
Fig. 3 is the space complexity intensity map produced by kernel density estimation of the global land cover data. The sampling feature is land surface area. The bright white regions are where land cover is concentrated; they contain rich spatial information and have high space complexity. The dark regions are sparse and have low space complexity.
Next, the area effectiveness (effective area ratio) of each map sheet is calculated, and stratified sampling is carried out on the two stratification indicators, space complexity and area effectiveness, while also taking regional heterogeneity and expert opinion into account. The map sheet samples finally drawn are shown in Fig. 4.
The method of the invention divides the sampling process into a sampling scheme and a sample arrangement scheme. The sampling scheme estimates the sample size quantitatively through the sampling model, and the sample arrangement scheme further optimizes the placement of samples by considering their space complexity. The final map sheet sampling result shows that the invention is applicable to spatial data and ensures that the spatial information of the samples is properly expressed and used.

Claims (10)

1. A map sheet sampling method based on space complexity, characterized in that it comprises the following steps:
1) acquire the overall prior information for sampling the map sheet data;
2) obtain the sample size of the map sheet sampling scheme from probability and statistics theory and the prior information;
3) obtain the global kernel density estimation parameter of each map sheet;
4) obtain the effective area ratio parameter of each map sheet;
5) perform stratified map sheet sampling based on space complexity, according to the global kernel density estimation parameters, the effective area ratio parameters and the sample size of the map sheet sampling scheme;
6) further adjust the sample arrangement and output the result.
2. The map sheet sampling method based on space complexity according to claim 1, characterized in that the prior information comprises the coverage, geographic location, sheet division rule, sampling unit, production time, production method, storage format and quantity of the map sheet data.
3. The map sheet sampling method based on space complexity according to claim 1, characterized in that step 2) is specifically: according to the prior information, and using probability and statistics theory to control the sampling error, the total sample size of the map sheet sampling scheme is calculated quantitatively by the following formula:
$$n = \frac{\dfrac{\mu_{1-\alpha/2}^{2}\,(1-p)}{r^{2}\,p}}{1 + \dfrac{1}{N}\left(\dfrac{\mu_{1-\alpha/2}^{2}\,(1-p)}{r^{2}\,p} - 1\right)} \qquad (1)$$
where n is the sample size to be calculated, α is taken as 5%, $\mu_{1-\alpha/2}$ is the critical value of the standard normal distribution at level 1-α/2, p is the estimated error rate, and r is the relative error.
4. The map sheet sampling method based on space complexity according to claim 3, characterized in that p is taken as the acceptance quality limit.
5. The map sheet sampling method based on space complexity according to claim 1, characterized in that step 3) is specifically: following probability theory, let $X_i$ (i = 1, ..., M) be the sampling features of a map sheet, regarded as independent, identically distributed samples drawn from a population with density function f, where M is the total number of sampling features in a single map sheet; the estimate $f_M(x)$ of f at a point x of the map sheet is given by:
$$f_M(x) = \frac{1}{Mh} \sum_{i=1}^{M} k\!\left(\frac{x - X_i}{h}\right) \qquad (2)$$
where k(·) is the kernel function, h > 0 is the bandwidth, and $(x - X_i)$ is the distance from the estimation point x to $X_i$; $f_M(x)$ is used to represent the global kernel density estimation parameter of a single map sheet.
6. The map sheet sampling method based on space complexity according to claim 5, characterized in that the kernel function is the quartic polynomial kernel or the normal kernel.
7. The map sheet sampling method based on space complexity according to claim 1, characterized in that the effective area ratio parameter of each map sheet is the proportion of the sheet's area occupied by the sampling features in that sheet.
8. The map sheet sampling method based on space complexity according to claim 1, characterized in that step 5) is specifically:
501: normalize the global kernel density estimation parameter from step 3) and the effective area ratio parameter from step 4) separately; with 0.5 as the cut point, label parameter values in the interval [0, 0.5) as low and values in the interval [0.5, 1] as high;
502: divide the map sheets into four layers according to the two parameters, and set a sampling rate for each layer;
503: multiply the sample size of the map sheet sampling scheme from step 2) by each layer's sampling rate to obtain the number of map sheet samples to be placed in that layer, and place the samples at random layer by layer.
9. The map sheet sampling method based on space complexity according to claim 8, characterized in that in step 502:
when the global kernel density estimation parameter is low and the effective area ratio parameter is low, the sampling rate of the layer is 10%;
when the global kernel density estimation parameter is low and the effective area ratio parameter is high, the sampling rate of the layer is 20%;
when the global kernel density estimation parameter is high and the effective area ratio parameter is low, the sampling rate of the layer is 30%;
when the global kernel density estimation parameter is high and the effective area ratio parameter is high, the sampling rate of the layer is 40%.
10. The map sheet sampling method based on space complexity according to claim 1, characterized in that step 6) is specifically: the sample arrangement may be further adjusted according to regional heterogeneity and/or expert knowledge, and the map sheet sampling result is output.
CN201510181877.XA 2015-04-16 2015-04-16 A kind of map sheet sampling approach based on space complexity Active CN104820774B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510181877.XA CN104820774B (en) 2015-04-16 2015-04-16 A kind of map sheet sampling approach based on space complexity

Publications (2)

Publication Number Publication Date
CN104820774A true CN104820774A (en) 2015-08-05
CN104820774B CN104820774B (en) 2016-08-03

Family

ID=53731069

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510181877.XA Active CN104820774B (en) 2015-04-16 2015-04-16 A kind of map sheet sampling approach based on space complexity

Country Status (1)

Country Link
CN (1) CN104820774B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030033574A1 (en) * 2001-05-03 2003-02-13 Vasic Bane V. Interative decoding based on dominant error events
CN1564600A (en) * 2004-04-22 2005-01-12 上海交通大学 Detection method of moving object under dynamic scene
CN101957997A (en) * 2009-12-22 2011-01-26 北京航空航天大学 Regional average value kernel density estimation-based moving target detecting method in dynamic scene

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
姜成晟: "地理空间抽样理论研究综述", 《地理学报》 *
金勇进等: "抽样技术的变革:空间抽样及有关问题", 《统计与咨询》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107203790A (en) * 2017-06-23 2017-09-26 上海海洋大学 Utilize the Chinese land noctilucence Classification in Remote Sensing Image Accuracy Assessment of two stage sampling model
CN112215299A (en) * 2020-10-26 2021-01-12 中山大学 Block bootstrap method for mean value estimation of hydrological meteorological space data
CN112215299B (en) * 2020-10-26 2023-08-15 中山大学 Block bootstrap method for hydrological space data mean value estimation

Also Published As

Publication number Publication date
CN104820774B (en) 2016-08-03

Similar Documents

Publication Publication Date Title
Sun et al. Estimating the spatial distribution of solar photovoltaic power generation potential on different types of rural rooftops using a deep learning network applied to satellite images
Ran et al. Delineation of reservoirs using remote sensing and their storage estimate: an example of the Yellow River basin, China
CN107784165B (en) Surface temperature field multi-scale data assimilation method based on photovoltaic power station
CN104462660A (en) Drawing method for winter icing thickness distribution of field electric transmission line
CN106650618A (en) Random forest model-based population data spatialization method
CN102004856A (en) Rapid collective Kalman filtering assimilating method for real-time data of high-frequency observation data
CN108268969A (en) Regional Economic Development form analysis and Forecasting Methodology and device based on remotely-sensed data
CN105426881B (en) Mountain background thermal field model constrained underground heat source daytime remote sensing detection locating method
CN104050323A (en) Fuzzy multi-criterion wind power plant site selection method for high-altitude mountain area
CN103699809B (en) Water and soil loss space monitoring method based on Kriging interpolation equations
US20240241287A1 (en) Method and System for Calculating Carbon and Water Flux of Ecosystem Based on Weather Station
CN112926468A (en) Tidal flat elevation automatic extraction method
Sinha et al. Groundwater dynamics in North Bihar plains
CN104820774A (en) Space complexity based mapsheet sampling method
CN110457819B (en) Method for identifying urban natural air ducts according to natural environment
CN104714001B (en) The method of a kind of soil erosion survey unitary space layout
CN104392113A (en) Method for estimating wind speed of cold air wind on offshore sea surface
CN104794335A (en) General multistage space sampling method
CN101793977A (en) Estimation method of hydrogeological parameters
Xu et al. Evaluation method and empirical application of human activity suitability of land resources in Qinghai-Tibet Plateau
CN105022856B (en) Predict the reservoir modeling methodologies of high camber meandering channel reservoir internal structure
CN114186413A (en) Landslide susceptibility evaluation method based on surface deformation and pregnant disaster environment conditions
CN111199092A (en) Solar radiation remote sensing estimation method and system and data processing device
CN110135103A (en) A kind of method and system using water flow simulation Urban Natural ventilation potentiality
CN107688712B (en) A kind of temperature NO emissions reduction method based on DEM and NDVI

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant