CN112016696A - PM1 concentration inversion method and system integrating satellite observation and ground observation - Google Patents
- Publication number
- CN112016696A (application CN202010817931.6A)
- Authority
- CN
- China
- Prior art keywords
- model
- observation
- concentration
- satellite
- geo
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N15/00—Investigating characteristics of particles; Investigating permeability, pore-volume or surface-area of porous materials
- G01N15/06—Investigating concentration of particle suspensions
Abstract
The invention provides a PM1 concentration inversion method and system integrating satellite observation and ground observation. The method comprises: data acquisition and matching, including acquiring ground-based PM1 data, satellite AOD data and related meteorological and geographic parameters, and resampling the meteorological and geographic parameters with the spatial resolution of the satellite AOD data as reference; then, taking each PM1 observation station as the center, applying a space-time window with a preset spatial radius and a preset temporal radius, calculating the mean of each input feature within the window, and matching the means with the measured PM1 concentration of the corresponding station to form a training sample set; constructing an initial RF model, with the number of decision trees and the number of variables used in constructing each binary tree optimized according to the change in the model's prediction residual; constructing a geo-RF model, including defining the spatial proximity observation S-PM1, the forward temporal proximity observation T-PM1 and the proximity spatial distance constraint, and inputting these spatio-temporal proximity observations as additional explanatory variables into the constructed initial RF model to obtain the geo-RF model; and training the geo-RF model and estimating the PM1 concentration.
Description
Technical Field
The invention belongs to the technical field of space observation and analysis, and particularly relates to a PM1 (particulate matter) concentration inversion method and system integrating satellite observation and ground observation.
Background
In recent decades, China's economy has developed rapidly; at the same time, the emission of large quantities of pollutants, especially particulate matter, has caused serious air pollution. Research has shown that aerosol particles have significant environmental, climatic and health effects. First, aerosols directly scatter or absorb sunlight, thereby reducing atmospheric visibility. Because the particle size of PM1 is closer to the dominant wavelengths of sunlight, its ability to attenuate sunlight is stronger. Second, when water vapor saturation reaches a certain level, aerosols can be activated into cloud condensation nuclei, changing the radiative properties of clouds and indirectly affecting the global radiation balance. In addition, prolonged exposure to fine particulate matter in the atmosphere may cause asthma, hypertension and even lung cancer. Compared with larger particles, PM1 stays in the air longer and, because of its small size, penetrates deep into the human body, carrying harmful substances to various parts of the body through the blood circulation and therefore doing greater harm. In view of this, PM1 has attracted increasing attention.
High-accuracy PM1 observations with high spatio-temporal resolution are crucial for in-depth study of the spatio-temporal distribution of aerosols and their health effects. At present, urban air quality monitoring in China relies mainly on ground-based observation, but the number of stations is limited and their distribution is severely uneven. Taking the China Meteorological Administration PM1 monitoring network as an example, there are only 73 PM1 monitoring stations nationwide, distributed mainly in eastern China. It is therefore difficult to characterize the temporal variation of particulate matter over a wide area by relying on ground-based observations alone. Satellite remote sensing offers wide coverage and fast imaging, and can effectively compensate for the shortcomings of ground-based observation. Developing PM1 concentration inversion methods based on satellites, particularly geostationary satellites, is therefore especially necessary.
The theoretical basis for satellite-based inversion of particulate matter concentration is the significant correlation between particulate matter concentration and the aerosol optical thickness (AOD) observed by the satellite. The AOD data currently used for inversion mostly come from polar-orbiting satellites, whose temporal resolution is relatively low (one day or coarser), so the evolution of particulate matter cannot be tracked effectively. Inversion models can be roughly classified into three categories: physical models, statistical models, and machine learning models. Because of limitations in the data and in the algorithms themselves, their results are not ideal. Specifically:
The basic principle of a physical model is to parameterize known or assumed physical mechanisms and establish a relationship between particulate matter concentration and explanatory variables. The theoretical foundation of such a model is relatively solid, but its data requirements are high and the data are difficult to acquire, and because the assumed parameters are derived from small samples, the generality of the model is limited.
A statistical model describes the linear relationship between the dependent variable and the independent variables. The process is simple and flexible, but the model cannot capture complex nonlinear relationships between variables, which limits its accuracy.
A machine learning model has self-organizing and self-learning capabilities and has clear advantages in handling nonlinear problems, so such algorithms are gradually being applied to particle concentration inversion. However, model accuracy depends strongly on the samples, and the accuracy of large-scale inversion still needs to be improved.
More importantly, compared with PM2.5 and PM10, there are currently far fewer studies of PM1, and most of them are regional.
Interpretation of terms:
Aerosol: a colloidal dispersion system formed by small solid or liquid particles dispersed and suspended in a gaseous medium, also called a gas dispersion system, in which the particle size is 1-100 nanometers.
PM1: particles in the atmosphere with an aerodynamic diameter of less than or equal to 1 micron.
Aerosol optical thickness (AOD): the integral of the extinction coefficient of the aerosol in the vertical direction can be used to evaluate the attenuation effect of the aerosol on light.
Random Forest (RF): a machine learning model that utilizes multiple decision trees to train and predict samples.
Disclosure of Invention
Based on the above analysis, the present invention aims to solve problems such as the low temporal resolution and low accuracy of large-scale PM1 concentration inversion, and to establish a more accurate and more widely applicable PM1 inversion scheme.
The technical scheme of the invention provides a PM1 concentration inversion method fusing satellite and ground observation, which comprises the following steps,
step 1, data acquisition and matching, including acquiring ground-based PM1 data, satellite AOD data and related meteorological and geographic parameters, and resampling the meteorological and geographic parameters with the spatial resolution of the satellite AOD data as reference; then, taking each PM1 observation station as the center, applying a space-time window with a preset spatial radius and temporal radius, calculating the mean of each input feature within the window, and matching the series of means with the measured PM1 concentration of the corresponding station to form a training sample set;
step 2, constructing an initial RF model, wherein the input feature parameters of the model comprise the related meteorological and geographic parameters, AOD, Hour, Month, Lon and Lat, where AOD is the satellite AOD data, Hour is the hour of observation, Month is the month, Lon is the longitude and Lat is the latitude; the output parameter y of the model is the PM1 concentration; the number of decision trees ntree and the number of variables mtree used in constructing each binary tree are optimized according to the change in the model's prediction residual;
step 3, constructing an initial geo-RF model, comprising defining the spatial proximity observation S-PM1, the forward temporal proximity observation T-PM1 and the proximity spatial distance constraint DIS, as given by formulas (1)-(3).
In the formulas,
PM1,i is the PM1 concentration measured at the i-th nearest station to the target point, ws_i is the spatial weight of the PM1 measured at the i-th nearest station, and ds_i is the spatial distance between the target point and the i-th station;
PM1,j is the PM1 concentration measured at the j-th nearest preceding time for the target point, wt_j is the temporal weight of the PM1 measured at the j-th preceding time at the nearest station, and dt_j is the temporal distance between the station's current time and the j-th measurement time;
S-PM1, T-PM1 and DIS, i.e. the spatio-temporal proximity observations, are input as explanatory variables into the constructed initial RF model to obtain the geo-RF model;
step 4, geo-RF model training and PM1 concentration estimation, carried out as follows,
calculating S-PM1, T-PM1 and DIS according to formulas (1)-(3) to obtain the optimal proximity observation input features of geo-RF, and combining them with the satellite AOD data and the related meteorological and geographic parameters to construct a complete training sample set; then inputting all training samples into the geo-RF model for training, obtaining the trained geo-RF model according to the minimum-residual principle, and finally carrying out wide-area PM1 concentration inversion with the trained model.
Furthermore, the satellite AOD data come from the Himawari-8 L3 hourly dataset, using only observation samples whose confidence level is marked "very good", with a spatial resolution of 0.05°.
Furthermore, ntree and mtree are optimized by first setting an initial value for each and then adjusting them continuously; the model structure is fixed once the change in the model's prediction residual becomes relatively flat.
Furthermore, in step 3, the optimal proximity observations are determined, namely the number n of spatially nearest stations and the number m of forward proximity observations, in order to constrain the geo-RF model input features and improve the computational efficiency of the model.
Furthermore, model performance is evaluated by calculating the correlation between the model predictions and the observations over all samples, and the optimal proximity observations are determined accordingly.
Furthermore, the related meteorological and geographic parameters comprise the near-surface temperature TEMP, the near-surface pressure SP, the relative humidity RH, the horizontal wind speed, the boundary layer height BLH, the normalized difference vegetation index NDVI and the surface elevation DEM; the horizontal wind speed comprises the zonal wind uw and the meridional wind vw;
when the initial RF model is constructed, the input characteristic parameters of the model comprise AOD, TEMP, SP, RH, uw, vw, BLH, NDVI, DEM, Hour, Month, Lon and Lat,
spatio-temporal proximity observations are also input as explanatory variables into the constructed initial RF model, and the resulting geo-RF model is expressed as follows,
PM1=f(AOD,TEMP,SP,RH,uw,vw,BLH,NDVI,DEM,Hour,Month,Lon,Lat,S-PM1,T-PM1,DIS) (4)
where f () is the corresponding function expression that expresses the geo-RF model.
The invention also provides a PM1 concentration inversion system fusing satellite and ground observation, which is used to implement the PM1 concentration inversion method fusing satellite and ground observation described above.
Furthermore, the system comprises the following modules,
a first module for data acquisition and matching, including acquiring ground-based PM1 data, satellite AOD data and related meteorological and geographic parameters, and resampling the meteorological and geographic parameters with the spatial resolution of the satellite AOD data as reference; then, taking each PM1 observation station as the center, applying a space-time window with a preset spatial radius and temporal radius, calculating the mean of each input feature within the window, and matching the series of means with the measured PM1 concentration of the corresponding station to form a training sample set;
a second module for constructing an initial RF model, wherein the input feature parameters of the model comprise the related meteorological and geographic parameters, AOD, Hour, Month, Lon and Lat, where AOD is the satellite AOD data, Hour is the hour of observation, Month is the month, Lon is the longitude and Lat is the latitude; the output parameter y of the model is the PM1 concentration; the number of decision trees ntree and the number of variables mtree used in constructing each binary tree are optimized according to the change in the model's prediction residual;
a third module for initial geo-RF model construction, including defining the spatial proximity observation S-PM1, the forward temporal proximity observation T-PM1 and the proximity spatial distance constraint DIS, as given by formulas (1)-(3).
In the formulas,
PM1,i is the PM1 concentration measured at the i-th nearest station to the target point, ws_i is the spatial weight of the PM1 measured at the i-th nearest station, and ds_i is the spatial distance between the target point and the i-th station;
PM1,j is the PM1 concentration measured at the j-th nearest preceding time for the target point, wt_j is the temporal weight of the PM1 measured at the j-th preceding time at the nearest station, and dt_j is the temporal distance between the station's current time and the j-th measurement time;
S-PM1, T-PM1 and DIS, i.e. the spatio-temporal proximity observations, are input as explanatory variables into the constructed initial RF model to obtain the geo-RF model;
a fourth module for geo-RF model training and PM1 concentration estimation, carried out as follows,
calculating S-PM1, T-PM1 and DIS according to formulas (1)-(3) to obtain the optimal proximity observation input features of geo-RF, and combining them with the satellite AOD data and the related meteorological and geographic parameters to construct a complete training sample set; then inputting all training samples into the geo-RF model for training, obtaining the trained geo-RF model according to the minimum-residual principle, and finally carrying out wide-area PM1 concentration inversion with the trained model.
Alternatively, the system comprises a processor and a memory, wherein the memory stores program instructions and the processor calls the instructions stored in the memory to execute the PM1 concentration inversion method fusing satellite and ground observation described above.
Alternatively, the system comprises a readable storage medium on which a computer program is stored, and the computer program, when executed, implements the PM1 concentration inversion method fusing satellite and ground observation described above.
Based on the random forest (RF) machine learning model, the invention combines high-temporal-resolution Himawari-8 AOD products with meteorological and geographic auxiliary data while taking the spatio-temporal autocorrelation of PM1 concentration into account, and fuses satellite and ground observations to construct a geo-RF model that achieves high-accuracy inversion of large-scale, hourly PM1 concentration. The scheme is stable across different times of day, seasons and regions.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention.
FIG. 2 is a schematic diagram of the correlation between model predicted values and measured values according to the embodiment of the present invention, wherein FIG. 2(a) shows the correlation R2 between model predicted values and measured values obtained at different spatial distances, and FIG. 2(b) shows the correlation R2 between model predicted values and measured values obtained at different temporal distances.
Detailed Description
In order to more clearly understand the present invention, the technical solutions of the present invention are specifically described below with reference to the accompanying drawings and examples.
The present invention notes the following. The Himawari-8 satellite is a second-generation geostationary meteorological satellite launched by the Japan Meteorological Agency (JMA); it was launched in October 2014 and formally entered service in July 2015. The main sensor carried by the satellite is the Advanced Himawari Imager (AHI), a multispectral imager consisting of 3 visible channels, 3 near-infrared channels and 10 infrared channels. The spatial resolution of the visible channels is 500 m, and that of the other channels is 1000 m to 2000 m. Himawari-8 is positioned at 140.7°E; its observation range covers East Asia and the western Pacific (80°E-200°E, 60°S-60°N), and a full-disk observation is completed every ten minutes. To date, Himawari-8 has released two AOD datasets (L2 and L3) with a spatial resolution of 0.05°. The temporal resolution of L2 is 10 minutes, and L3 is provided at temporal resolutions of 1 hour, 1 day and 1 month. The confidence of the data is divided into four levels: "very good", "good", "general" and "untrustworthy". Compared with polar-orbiting satellites, Himawari-8 AOD has a high temporal resolution and can be used for hourly PM1 concentration estimation, enabling wide-area dynamic monitoring of fine particle pollution and providing data support for urban air pollution monitoring and management and for the formulation of air pollution response policies.
The invention is based on the random forest (RF) machine learning method and, by fusing satellite and ground observations, constructs a geo-RF model for high-accuracy inversion of hourly PM1 concentration.
In the embodiment, based on a machine learning model, namely the random forest model, high-temporal-resolution Himawari-8 AOD products are combined with meteorological, geographic and other auxiliary data, and the spatio-temporal autocorrelation of PM1 concentration is taken into account; satellite and ground observations are fused into the model training data, which are input into the random forest model. A random forest consists of multiple decision trees. Datasets are constructed by random sampling with replacement and input into different decision trees for training; each decision tree produces its own result, and the mean of all results is the prediction of the model. The processes of data matching, spatio-temporal autocorrelation information calculation and random forest training are integrated to construct the geo-RF model and achieve hourly PM1 concentration inversion over central and eastern China.
Referring to FIG. 1, the embodiment provides a PM1 concentration inversion method integrating satellite observation and ground observation, comprising the following steps:
(1) data acquisition and matching
The ground-based PM1 data used in this embodiment come from the hourly mass concentration observations provided by the aerosol observation network of the China Meteorological Administration. The satellite AOD data come from the Himawari-8 L3 hourly dataset, from which only observation samples whose confidence level is marked "very good" are taken, with a spatial resolution of 0.05°. Since the diffusion and accumulation of pollutants are mainly controlled by meteorological conditions, and the underlying surface and topography also influence the process, the embodiment takes into account, in addition to AOD, a series of related meteorological and geographic parameters when estimating PM1, including near-surface temperature (TEMP, unit: K), near-surface pressure (SP, unit: Pa), relative humidity (RH, unit: %), horizontal wind speed (uw: zonal wind, vw: meridional wind, unit: m/s), boundary layer height (BLH, unit: m), normalized difference vegetation index (NDVI, dimensionless) and surface elevation (DEM, digital elevation model, unit: m).
The meteorological parameters come from the European Centre for Medium-Range Weather Forecasts (ECMWF), with a spatial resolution of 0.125°; the normalized difference vegetation index (NDVI) comes from MODIS satellite observations, with a spatial resolution of 1 km; the surface elevation (DEM) data were obtained from the United States Geological Survey at a resolution of 90 m.
Because the spatial resolutions of the variables differ, the meteorological and geographic data are first resampled with the spatial resolution of the Himawari-8 AOD as reference. Then, taking each PM1 observation station as the center, a space-time window with a spatial radius of 0.05° and a temporal radius of 30 min is applied, the mean of each input feature within this fixed window is calculated, and the series of means is matched with the measured PM1 concentration of the corresponding station to form the training sample set.
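The matching step above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the record layout (`lon`, `lat`, `minutes`, `value`), the function name `window_mean`, the sample values, and the choice of a square window measured in degrees are all assumptions made for the example.

```python
from statistics import mean

def window_mean(records, lon0, lat0, t0, r_space=0.05, r_time=30.0):
    """Average `value` over the records inside a space-time window.

    records : iterable of dicts with keys lon, lat (degrees),
              minutes (time in minutes) and value (hypothetical layout)
    Window  : |lon - lon0| <= r_space, |lat - lat0| <= r_space and
              |minutes - t0| <= r_time (square window, an assumption)
    Returns None when no record falls inside the window.
    """
    inside = [
        r["value"] for r in records
        if abs(r["lon"] - lon0) <= r_space
        and abs(r["lat"] - lat0) <= r_space
        and abs(r["minutes"] - t0) <= r_time
    ]
    return mean(inside) if inside else None

# Hypothetical AOD pixels around a station at (114.30E, 30.50N), 10:00 = 600 min
aod = [
    {"lon": 114.30, "lat": 30.50, "minutes": 600, "value": 0.40},
    {"lon": 114.33, "lat": 30.52, "minutes": 620, "value": 0.60},
    {"lon": 114.50, "lat": 30.50, "minutes": 600, "value": 0.90},  # outside in space
    {"lon": 114.30, "lat": 30.50, "minutes": 700, "value": 0.10},  # outside in time
]
aod_mean = window_mean(aod, 114.30, 30.50, 600)
```

The same function would be applied to each resampled input feature in turn, and the resulting means paired with the station's measured PM1 concentration to form one training sample.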
(2) Initial RF model construction
A random forest (RF) model is formed by combining multiple decision trees (e.g., decision trees 1 to L in FIG. 1), and the mean of the results of all decision trees is used as the final output of the model. The two key points in constructing a random forest are determining the number of decision trees ntree and the structure of each decision tree. The former can be determined empirically, and the latter is related to the number of variables mtree used in constructing each binary tree. Both can be optimized by parameter tuning.
Specifically, assume a dataset D = {x_i1, x_i2, ..., x_iP, y_i} (i ∈ [1, Q]) with Q samples and P features, where x_i1, x_i2, ..., x_iP are the P independent variables, i.e. the P feature parameters input to the model, and y_i is the model output. First, the feature number mtree is specified; it determines how the decision at each node of a decision tree is made, and mtree should be much smaller than P. Then, Q samples are drawn at random with replacement from the Q training samples to form the training set of a tree, and the samples that were not drawn are used for prediction to estimate the error. At each node, mtree features are selected at random, and the best split is computed from these mtree features; each decision tree is constructed in this way.
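The two random steps in this procedure, drawing Q samples with replacement and choosing mtree candidate features at a node, can be sketched with the Python standard library. The function names `bootstrap_split` and `candidate_features` are hypothetical, and the sketch deliberately omits the tree-growing and split-search logic.

```python
import random

def bootstrap_split(q, seed=None):
    """Draw q indices with replacement from range(q).

    Returns (in_bag, out_of_bag): in_bag has length q and may repeat
    indices; out_of_bag lists the indices never drawn, which can be
    used to estimate the prediction error of the tree.
    """
    rng = random.Random(seed)
    in_bag = [rng.randrange(q) for _ in range(q)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(q) if i not in drawn]
    return in_bag, out_of_bag

def candidate_features(p, mtree, seed=None):
    """Pick mtree distinct feature indices out of p for one node split."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(p), mtree))

in_bag, oob = bootstrap_split(q=1000, seed=0)
features = candidate_features(p=13, mtree=4, seed=0)
```

For large Q, the probability that a given sample is never drawn is (1 - 1/Q)^Q, which approaches 1/e, so roughly 37% of the samples end up out of bag for each tree.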
Corresponding to the above description, the initial RF model in this embodiment is constructed as follows. The model has 13 input feature parameters, i.e. P = 13: AOD, TEMP, SP, RH, uw, vw, BLH, NDVI, DEM, Hour, Month, Lon (longitude) and Lat (latitude). The output parameter of the model is the PM1 concentration, i.e. y is the PM1 concentration. Based on experience, the initial values of ntree and mtree are set to 50 and 2 respectively, and the two parameters are then adjusted continuously; the model structure is fixed once the change in the model's prediction residual becomes relatively flat. This embodiment finally sets ntree to 100 and mtree to 4.
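The rule "adjust until the prediction residual changes only gently" can be made concrete in several ways. The sketch below uses one simple, assumed criterion: choose the first candidate after which the relative improvement in error falls below a tolerance `tol`. The function name `first_plateau`, the tolerance, and the error values are illustrative only and do not come from the patent.

```python
def first_plateau(candidates, errors, tol=0.01):
    """Return the first candidate whose successor improves the error
    by less than `tol` (relative). Falls back to the last candidate.

    candidates : parameter values in increasing order (e.g. ntree)
    errors     : positive prediction error measured for each candidate
    """
    if len(candidates) != len(errors) or not candidates:
        raise ValueError("candidates and errors must be non-empty and aligned")
    for k in range(len(candidates) - 1):
        gain = (errors[k] - errors[k + 1]) / errors[k]
        if gain < tol:
            return candidates[k]
    return candidates[-1]

# Made-up error curve for illustration only (not results from the patent)
ntree_grid = [50, 100, 150, 200]
rmse = [12.0, 10.0, 9.95, 9.94]
chosen = first_plateau(ntree_grid, rmse, tol=0.01)
```

The same helper could be run a second time over an mtree grid with ntree held fixed; tuning the two parameters jointly on a grid is an equally valid alternative.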
(3) geo-RF model construction
PM1 exhibits autocorrelation, i.e. observations that are adjacent in space and time are significantly correlated. In theory, therefore, integrating spatio-temporally adjacent ground observation information into the model helps improve the accuracy of PM1 concentration estimation at the target point. To express the spatio-temporal proximity observations, this patent defines the spatial proximity observation (S-PM1, unit: μg/m3), the forward temporal proximity observation (T-PM1, unit: μg/m3) and the proximity spatial distance constraint (DIS), calculated by formulas (1)-(3).
In the formulas,
PM1,i is the PM1 concentration measured at the i-th nearest station to the target point, ws_i is the spatial weight of the PM1 measured at the i-th nearest station, and ds_i is the spatial distance between the target point and the i-th station, described here by the Euclidean distance;
PM1,j is the PM1 concentration measured at the j-th nearest preceding time for the target point, wt_j is the temporal weight of the PM1 measured at the j-th preceding time at the nearest station, and dt_j is the temporal distance between the station's current time and the j-th measurement time, where i = 1, 2, ..., n, with n the number of nearest-neighbor stations, and j = 1, 2, ..., m, with m the number of forward proximity observations.
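Formulas (1)-(3) are not reproduced in this text, so the following sketch should be read as one plausible instance, not as the patent's definition. It assumes inverse-distance-squared weights (ws_i = 1/ds_i^2, wt_j = 1/dt_j^2), normalized weighted means for S-PM1 and T-PM1, and takes DIS to be the mean distance to the n stations used. The function names and the input layout are likewise hypothetical.

```python
import math

def weighted_mean(values, distances):
    """Inverse-distance-squared weighted mean (assumed weighting).

    A zero distance means an observation at the target itself; its
    value is returned directly to avoid dividing by zero.
    """
    for v, d in zip(values, distances):
        if d == 0:
            return v
    weights = [1.0 / d ** 2 for d in distances]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def neighbor_features(target, stations, history, n=8, m=2):
    """Build (S-PM1, T-PM1, DIS) for one target point.

    target   : (x, y) coordinates of the target point
    stations : list of (x, y, pm1) for stations with a current measurement
    history  : list of (minutes_ago, pm1) earlier measurements at the
               nearest station, with minutes_ago > 0
    """
    tx, ty = target
    # Euclidean distance to every station, keeping the n nearest
    near = sorted(
        (math.hypot(x - tx, y - ty), pm) for x, y, pm in stations
    )[:n]
    ds = [d for d, _ in near]
    s_pm1 = weighted_mean([pm for _, pm in near], ds)

    # The m most recent earlier measurements
    past = sorted(history)[:m]
    t_pm1 = weighted_mean([pm for _, pm in past], [dt for dt, _ in past])

    dis = sum(ds) / len(ds)  # assumed form of DIS: mean neighbor distance
    return s_pm1, t_pm1, dis

stations = [(1.0, 0.0, 10.0), (0.0, 2.0, 30.0), (5.0, 5.0, 99.0)]
history = [(60, 20.0), (120, 40.0), (180, 80.0)]
s_pm1, t_pm1, dis = neighbor_features((0.0, 0.0), stations, history, n=2, m=2)
```

With these inputs the two nearest stations lie at distances 1 and 2, giving weights 1 and 0.25, so the nearer station dominates the spatial estimate; the temporal estimate behaves the same way for the 60-minute and 120-minute observations.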
The patent inputs spatio-temporal proximity observations as explanatory variables into the initial RF model constructed, and defines the model obtained at this time as a geo-RF model. The geo-RF model can be expressed as follows:
PM1=f(AOD,TEMP,SP,RH,uw,vw,BLH,NDVI,DEM,Hour,Month,Lon,Lat,S-PM1,T-PM1,DIS) (4)
where f () is the corresponding function expression that expresses the geo-RF model.
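Formula (4) fixes the 16 inputs of the geo-RF model and their order. A small helper such as the one below, with hypothetical names `FEATURES` and `to_row` and made-up sample values, can enforce that order when matched samples are turned into model input rows.

```python
# Input order of formula (4): 13 initial RF features plus 3 proximity features
FEATURES = [
    "AOD", "TEMP", "SP", "RH", "uw", "vw", "BLH", "NDVI", "DEM",
    "Hour", "Month", "Lon", "Lat", "S-PM1", "T-PM1", "DIS",
]

def to_row(sample):
    """Order one matched sample as the input vector of formula (4).

    Raises KeyError naming any missing feature, so incomplete samples
    are rejected instead of being silently filled.
    """
    missing = [name for name in FEATURES if name not in sample]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [float(sample[name]) for name in FEATURES]

# Hypothetical sample values, for illustration only
sample = {
    "AOD": 0.45, "TEMP": 288.2, "SP": 101300.0, "RH": 62.0,
    "uw": 1.8, "vw": -0.6, "BLH": 850.0, "NDVI": 0.31, "DEM": 42.0,
    "Hour": 10, "Month": 3, "Lon": 114.30, "Lat": 30.50,
    "S-PM1": 14.0, "T-PM1": 24.0, "DIS": 1.5,
}
row = to_row(sample)
```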
According to the first law of geography, close objects are more closely related, when the distance reaches a certain degree, the introduction of the adjacent observation can not have an improvement effect on the estimation of the target point any more, and even can influence the estimation precision, so that the optimal adjacent observation (namely the number n of the spatial nearest stations and the number m of the forward adjacent observation) needs to be determined for constraining the input features of the geo-RF model and improving the calculation efficiency of the model.
To determine the optimal neighboring observations (n and m), this embodiment compares the accuracy of 10-fold cross-validation of the model for different values of n and m. First, values of n and m are fixed, for example n = 1 and m = 1. The sample set is then randomly divided into ten parts; in turn, nine parts are used for training and the remaining part for validation, and this is repeated 10 times so that every sample is predicted exactly once. Finally, the correlation between the model predictions and the observations over all samples, namely R², is calculated to evaluate model performance, as shown in Fig. 2. Fig. 2(a) shows the correlation between predicted and measured values obtained at different spatial distances: at n = 8 the correlation between the model predictions and the station observations is relatively high (R² = 0.617), so, also taking computational complexity into account, n is set to 8 in this patent. Fig. 2(b) shows the correlation between predicted and measured values obtained at different time intervals: as m increases, R² varies slightly but generally declines, so m is set to 2 in this patent, at which the model performs relatively well (R² = 0.611).
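The cross-validation procedure described above can be sketched in plain Python. The sketch assumes that R² is computed as the coefficient of determination (the text only calls it the correlation between predictions and observations), and the `fit` and `predict` callables stand in for the model; all function names are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Randomly split sample indices into k folds; each index appears once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def r_squared(observed, predicted):
    """Coefficient of determination (ASSUMED definition of R² here)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def cross_validated_r2(X, y, fit, predict, k=10, seed=0):
    """Train on k-1 folds, predict the held-out fold, repeat k times.

    Every sample is predicted exactly once; R² is then computed over
    all out-of-fold predictions together.
    """
    predicted = [None] * len(y)
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            predicted[i] = predict(model, X[i])
    return r_squared(y, predicted)
```

Selecting n and m then amounts to rebuilding the proximity features for each candidate pair, calling `cross_validated_r2` on the resulting samples, and comparing the scores alongside the computational cost.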
(4) geo-RF model training and PM1 concentration estimation
S-PM1, T-PM1 and DIS are calculated according to formulas (1)-(3) with n = 8 and m = 2 to obtain the optimal neighboring-observation input features of geo-RF, which are combined with the other parameters (AOD, meteorological, geographic and so on) to build the complete training sample set. All training samples are then input into the model for training, and the trained model is obtained according to the minimum-residual principle; this process can be run automatically using computer software. Finally, large-scale PM1 concentration inversion can be carried out with the trained model: the AOD, meteorological, geographic and other parameters are matched according to step (1) and step (3) respectively, S-PM1, T-PM1 and DIS are calculated and matched to build the inversion data set, and this data set is input into the trained geo-RF model to obtain the large-scale PM1 concentration inversion.
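When the inversion data set is built, each grid cell needs its n nearest stations before the proximity features can be computed. The sketch below shows one way to do that lookup with the Euclidean distance mentioned in the description. Measuring the distance directly in longitude/latitude degrees, the tuple layout of a station and the name `nearest_stations` are assumptions made for illustration.

```python
import math


def nearest_stations(target, stations, n=8):
    """Return the n stations nearest to a grid cell as (distance, pm1) pairs.

    target   -- (lon, lat) of the grid cell
    stations -- iterable of (lon, lat, pm1) tuples
    ASSUMPTION: Euclidean distance in degrees; a projected coordinate
    system could be substituted without changing the structure.
    """
    ranked = sorted(
        (math.hypot(lon - target[0], lat - target[1]), pm1)
        for lon, lat, pm1 in stations
    )
    return ranked[:n]


# Example with made-up coordinates and concentrations.
stations = [(0.0, 0.0, 10.0), (3.0, 4.0, 20.0), (1.0, 0.0, 30.0)]
nearest = nearest_stations((0.0, 0.0), stations, n=2)
```

A station that coincides with the grid cell yields a distance of zero, so any distance-based weighting applied afterwards has to handle that case explicitly.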
The results of an example study over eastern China (100°E-130°E, 20°N-44°N) show that: (1) compared with other inversion models, the method proposed in this patent has relatively high accuracy (geo-RF vs. LME-BT vs. GTWR vs. GAM: R² = 0.83 vs. 0.80 vs. 0.74 vs. 0.59); (2) during the daytime (hours 9-16), the model R² ranges from 0.68 to 0.87, with higher accuracy around noon; (3) over the year, the model R² ranges from 0.64 to 0.86, with better performance in winter than in summer, and the difference between the monthly estimates and the measured values is about 1 μg/m³; (4) of the 66 stations in the study area, about 80% have an R² greater than 0.6.
In specific implementations, a person skilled in the art can automate the above process using computer software. System devices that implement the method, such as a computer-readable storage medium storing a computer program corresponding to the technical solution of the present invention, or computer equipment that includes and runs such a program, should also fall within the scope of protection of the present invention.
In some possible embodiments, a PM1 concentration inversion system fusing satellite and ground observations is provided, comprising the following modules:
a first module for data acquisition and matching, which acquires ground-based PM1 data, satellite AOD data and the related meteorological and geographic parameters, and resamples them with the spatial resolution of the satellite AOD data as the reference; then, centered on each PM1 observation station, applies a space-time window with a preset spatial radius and temporal radius, calculates the mean of each input feature within the window, and matches the resulting series of means with the PM1 concentration measured at the corresponding observation station to form the training sample set;
a second module for constructing the initial RF model, in which the input feature parameters of the model comprise the related meteorological and geographic parameters, AOD, Hour, Month, Lon and Lat, where AOD denotes the satellite AOD data, Hour the hour, Month the month, Lon the longitude and Lat the latitude; the output parameter y of the model is the PM1 concentration; and the number of decision trees ntree and the number of variables mtree used in constructing each binary tree are set optimally according to the change in the model's prediction residual;
a third module for constructing the initial geo-RF model, which defines the spatial proximity observation S-PM1, the forward temporal proximity observation T-PM1 and the neighboring spatial distance constraint DIS according to formulas (1)-(3), where:
PM1,i is the PM1 concentration measured at the i-th nearest station to the target point, ws_i is the spatial weight of the PM1 at the i-th nearest station, and ds_i is the spatial distance between the target point and the i-th station;
PM1,j is the PM1 concentration measured at the j-th nearest preceding (forward) time for the target point, wt_j is the temporal weight of the PM1 measured at that j-th nearest time at the station, and dt_j is the temporal distance between the station's current time and the j-th measurement time;
i = 1, 2, ..., n, where n is the number of nearest neighboring stations, and j = 1, 2, ..., m, where m is the number of forward neighboring observation hours;
based on S-PM1, T-PM1 and DIS, the spatiotemporal proximity observations are input as explanatory variables into the constructed initial RF model to obtain the geo-RF model;
a fourth module for geo-RF model training and PM1 concentration estimation, implemented as follows:
S-PM1, T-PM1 and DIS are calculated according to formulas (1)-(3) to obtain the optimal neighboring-observation input features of geo-RF, which are combined with the satellite AOD data and the related meteorological and geographic parameters to build the complete training sample set; all training samples are then input into the geo-RF model for training, the trained geo-RF model is obtained according to the minimum-residual principle, and large-scale PM1 concentration inversion is finally carried out with the trained model.
The implementation of each module may refer to the corresponding steps in the method embodiments, which are not repeated herein.
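The space-time window matching performed by the first module can be illustrated with a short sketch. It averages one input feature over all observations that fall within a preset spatial radius of a station and a preset temporal radius of the measurement hour. The tuple layout of an observation, the use of degrees and hours as units, the skipping of NaN values and the name `window_mean` are assumptions, since the source does not specify these details.

```python
import math


def window_mean(observations, station, t0, space_radius, time_radius):
    """Mean of one feature inside a space-time window centered on a station.

    observations -- iterable of (lon, lat, hour, value) tuples
    station      -- (lon, lat) of the PM1 observation station
    t0           -- hour of the ground measurement being matched
    Returns None when no valid observation falls inside the window.
    ASSUMPTION: distances in degrees, time in hours, NaN values skipped.
    """
    selected = [
        value
        for lon, lat, hour, value in observations
        if math.hypot(lon - station[0], lat - station[1]) <= space_radius
        and abs(hour - t0) <= time_radius
        and not math.isnan(value)
    ]
    return sum(selected) / len(selected) if selected else None


# Example with made-up AOD pixels around a station at (0, 0), hour 10.
aod_pixels = [
    (0.00, 0.0, 10, 0.4),   # inside the window
    (0.05, 0.0, 10, 0.6),   # inside the window
    (1.00, 1.0, 10, 9.0),   # too far away in space
    (0.00, 0.0, 15, 9.0),   # too far away in time
]
aod_mean = window_mean(aod_pixels, (0.0, 0.0), 10, space_radius=0.1, time_radius=1)
```

Repeating this for every input feature and pairing the means with the station's measured PM1 concentration yields one training sample.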
In some possible embodiments, a PM1 concentration inversion system fusing satellite and ground observations is provided, comprising a processor and a memory, the memory being used to store program instructions and the processor being used to call the instructions stored in the memory to execute the PM1 concentration inversion method fusing satellite and ground observations described above.
In some possible embodiments, a PM1 concentration inversion system fusing satellite and ground observations is provided, comprising a readable storage medium on which a computer program is stored; when the computer program is executed, the PM1 concentration inversion method fusing satellite and ground observations described above is implemented.
It should be understood that the above description of the embodiments is relatively detailed and is not intended to limit the scope of the invention; those skilled in the art may make alterations and modifications without departing from the scope of the invention as defined by the appended claims.
Claims (10)
1. A PM1 concentration inversion method fusing satellite and ground observations, characterized by comprising the following steps:
step 1, data acquisition and matching, comprising acquiring ground-based PM1 data, satellite AOD data and the related meteorological and geographic parameters, and resampling them with the spatial resolution of the satellite AOD data as the reference; then, centered on each PM1 observation station, applying a space-time window with a preset spatial radius and temporal radius, calculating the mean of each input feature within the window, and matching the resulting series of means with the PM1 concentration measured at the corresponding observation station to form the training sample set;
step 2, constructing the initial RF model, in which the input feature parameters of the model comprise the related meteorological and geographic parameters, AOD, Hour, Month, Lon and Lat, where AOD denotes the satellite AOD data, Hour the hour, Month the month, Lon the longitude and Lat the latitude; the output parameter y of the model is the PM1 concentration; and the number of decision trees ntree and the number of variables mtree used in constructing each binary tree are set optimally according to the change in the model's prediction residual;
step 3, constructing the initial geo-RF model, comprising defining the spatial proximity observation S-PM1, the forward temporal proximity observation T-PM1 and the neighboring spatial distance constraint DIS according to formulas (1)-(3), where:
PM1,i is the PM1 concentration measured at the i-th nearest station to the target point, ws_i is the spatial weight of the PM1 at the i-th nearest station, and ds_i is the spatial distance between the target point and the i-th station;
PM1,j is the PM1 concentration measured at the j-th nearest preceding (forward) time for the target point, wt_j is the temporal weight of the PM1 measured at that j-th nearest time at the station, and dt_j is the temporal distance between the station's current time and the j-th measurement time;
i = 1, 2, ..., n, where n is the number of nearest neighboring stations, and j = 1, 2, ..., m, where m is the number of forward neighboring observation hours;
based on S-PM1, T-PM1 and DIS, the spatiotemporal proximity observations are input as explanatory variables into the constructed initial RF model to obtain the geo-RF model;
step 4, geo-RF model training and PM1 concentration estimation, implemented as follows:
S-PM1, T-PM1 and DIS are calculated according to formulas (1)-(3) to obtain the optimal neighboring-observation input features of geo-RF, which are combined with the satellite AOD data and the related meteorological and geographic parameters to build the complete training sample set; all training samples are then input into the geo-RF model for training, the trained geo-RF model is obtained according to the minimum-residual principle, and large-scale PM1 concentration inversion is finally carried out with the trained model.
2. The PM1 concentration inversion method fusing satellite and ground observations according to claim 1, characterized in that: the satellite AOD data come from the Himawari-8 L3 hourly data set, using the observation samples whose confidence flag is "very good", with a spatial resolution of 0.05°.
3. The PM1 concentration inversion method fusing satellite and ground observations according to claim 1, characterized in that: ntree and mtree are set optimally by first assigning an initial value to each and then fixing the model structure at the point where the model's prediction residual changes relatively gently.
4. The PM1 concentration inversion method fusing satellite and ground observations according to claim 1, characterized in that: in step 3, the optimal neighboring observations, comprising the number n of nearest spatial stations and the number m of forward neighboring observations, are determined in order to constrain the input features of the geo-RF model and improve the computational efficiency of the model.
5. The PM1 concentration inversion method fusing satellite and ground observations according to claim 4, characterized in that: model performance is evaluated by calculating the correlation between the model predictions and the observations over all samples, and the optimal neighboring observations are determined accordingly.
6. The PM1 concentration inversion method fusing satellite and ground observations according to claim 1, 2, 3, 4 or 5, characterized in that: the related meteorological and geographic parameters comprise the near-surface temperature TEMP, the near-surface pressure SP, the relative humidity RH, the horizontal wind speed, the boundary layer height BLH, the normalized difference vegetation index NDVI and the surface elevation DEM; the horizontal wind speed comprises the zonal wind uw and the meridional wind vw;
when the initial RF model is constructed, the input feature parameters of the model comprise AOD, TEMP, SP, RH, uw, vw, BLH, NDVI, DEM, Hour, Month, Lon and Lat;
the spatiotemporal proximity observations are additionally input as explanatory variables into the constructed initial RF model, and the resulting geo-RF model is expressed as follows:
PM1 = f(AOD, TEMP, SP, RH, uw, vw, BLH, NDVI, DEM, Hour, Month, Lon, Lat, S-PM1, T-PM1, DIS) (4)
where f() denotes the function represented by the geo-RF model.
7. A PM1 concentration inversion system fusing satellite and ground observations, characterized in that: it is used to implement the PM1 concentration inversion method fusing satellite and ground observations according to any one of claims 1-6.
8. The PM1 concentration inversion system fusing satellite and ground observations according to claim 7, characterized by comprising the following modules:
a first module for data acquisition and matching, which acquires ground-based PM1 data, satellite AOD data and the related meteorological and geographic parameters, and resamples them with the spatial resolution of the satellite AOD data as the reference; then, centered on each PM1 observation station, applies a space-time window with a preset spatial radius and temporal radius, calculates the mean of each input feature within the window, and matches the resulting series of means with the PM1 concentration measured at the corresponding observation station to form the training sample set;
a second module for constructing the initial RF model, in which the input feature parameters of the model comprise the related meteorological and geographic parameters, AOD, Hour, Month, Lon and Lat, where AOD denotes the satellite AOD data, Hour the hour, Month the month, Lon the longitude and Lat the latitude; the output parameter y of the model is the PM1 concentration; and the number of decision trees ntree and the number of variables mtree used in constructing each binary tree are set optimally according to the change in the model's prediction residual;
a third module for constructing the initial geo-RF model, which defines the spatial proximity observation S-PM1, the forward temporal proximity observation T-PM1 and the neighboring spatial distance constraint DIS according to formulas (1)-(3), where:
PM1,i is the PM1 concentration measured at the i-th nearest station to the target point, ws_i is the spatial weight of the PM1 at the i-th nearest station, and ds_i is the spatial distance between the target point and the i-th station;
PM1,j is the PM1 concentration measured at the j-th nearest preceding (forward) time for the target point, wt_j is the temporal weight of the PM1 measured at that j-th nearest time at the station, and dt_j is the temporal distance between the station's current time and the j-th measurement time;
i = 1, 2, ..., n, where n is the number of nearest neighboring stations, and j = 1, 2, ..., m, where m is the number of forward neighboring observation hours;
based on S-PM1, T-PM1 and DIS, the spatiotemporal proximity observations are input as explanatory variables into the constructed initial RF model to obtain the geo-RF model;
a fourth module for geo-RF model training and PM1 concentration estimation, implemented as follows:
S-PM1, T-PM1 and DIS are calculated according to formulas (1)-(3) to obtain the optimal neighboring-observation input features of geo-RF, which are combined with the satellite AOD data and the related meteorological and geographic parameters to build the complete training sample set; all training samples are then input into the geo-RF model for training, the trained geo-RF model is obtained according to the minimum-residual principle, and large-scale PM1 concentration inversion is finally carried out with the trained model.
9. The PM1 concentration inversion system fusing satellite and ground observations according to claim 7, characterized in that: it comprises a processor and a memory, the memory being used to store program instructions and the processor being used to call the instructions stored in the memory to execute the PM1 concentration inversion method fusing satellite and ground observations according to any one of claims 1-6.
10. The PM1 concentration inversion system fusing satellite and ground observations according to claim 7, characterized in that: it comprises a readable storage medium on which a computer program is stored; when the computer program is executed, the PM1 concentration inversion method fusing satellite and ground observations according to any one of claims 1-6 is implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010817931.6A CN112016696B (en) | 2020-08-14 | 2020-08-14 | PM1 concentration inversion method and system integrating satellite observation and ground observation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112016696A (en) | 2020-12-01 |
CN112016696B (en) | 2022-10-04 |
Family
ID=73504482
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010817931.6A Active CN112016696B (en) | PM1 concentration inversion method and system integrating satellite observation and ground observation | 2020-08-14 | 2020-08-14 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112016696B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114018773A (en) * | 2021-11-03 | 2022-02-08 | 中科三清科技有限公司 | Method, device and equipment for acquiring PM2.5 concentration spatial distribution data, and storage medium |
CN114330146A (en) * | 2022-03-02 | 2022-04-12 | 北京英视睿达科技股份有限公司 | Satellite gas data completion method and system |
CN114898823A (en) * | 2022-07-01 | 2022-08-12 | 北京英视睿达科技股份有限公司 | High-spatial-temporal-resolution remote sensing near-surface NO2 concentration estimation method and system |
CN115356249A (en) * | 2022-10-19 | 2022-11-18 | 北华航天工业学院 | Satellite polarization PM2.5 estimation method and system based on machine learning fusion model |
CN117828992A (en) * | 2024-01-04 | 2024-04-05 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | Accurate prediction method and system for CCN number concentration with high space-time resolution |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110313958A1 (en) * | 2009-02-16 | 2011-12-22 | Institutt For Energiteknikk | System and method for empirical ensemble-based virtual sensing of particulates |
CN104268423A (en) * | 2014-10-11 | 2015-01-07 | 武汉大学 | Large-scale dynamic evolution dust type aerosol retrieval method |
US20180149577A1 (en) * | 2016-11-28 | 2018-05-31 | International Business Machines Corporation | Particulate matter monitoring |
CN109213964A (en) * | 2018-07-13 | 2019-01-15 | 中南大学 | A kind of satellite AOD product bearing calibration for merging multi-source feature geographic factor |
CN109583516A (en) * | 2018-12-24 | 2019-04-05 | 天津珞雍空间信息研究院有限公司 | A kind of space and time continuous PM2.5 inversion method based on ground and moonscope |
CN110428113A (en) * | 2019-08-09 | 2019-11-08 | 云南电网有限责任公司电力科学研究院 | A kind of predicting model for dissolved gas in transformer oil method based on random forest |
CN111426633A (en) * | 2020-06-15 | 2020-07-17 | 航天宏图信息技术股份有限公司 | Nighttime PM2.5 mass concentration estimation method and device |
Non-Patent Citations (2)
Title |
---|
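RUI LI ET AL.: "Estimating high-resolution PM1 concentration from Himawari-8 combining extreme gradient boosting-geographically and temporally weighted regression (XGBoost-GTWR)", 《ATMOSPHERIC ENVIRONMENT》 * |
ZHU SONGYAN ET AL.: "Satellite remote sensing retrieval methods for atmospheric formaldehyde in the ultraviolet band and their research status", 《CHINA ENVIRONMENTAL SCIENCE》 * |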
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114018773A (en) * | 2021-11-03 | 2022-02-08 | 中科三清科技有限公司 | Method, device and equipment for acquiring PM2.5 concentration spatial distribution data, and storage medium |
CN114330146A (en) * | 2022-03-02 | 2022-04-12 | 北京英视睿达科技股份有限公司 | Satellite gas data completion method and system |
CN114330146B (en) * | 2022-03-02 | 2022-06-28 | 北京英视睿达科技股份有限公司 | Satellite gas data completion method and system |
CN114898823A (en) * | 2022-07-01 | 2022-08-12 | 北京英视睿达科技股份有限公司 | High-spatial-temporal-resolution remote sensing near-surface NO2 concentration estimation method and system |
CN115356249A (en) * | 2022-10-19 | 2022-11-18 | 北华航天工业学院 | Satellite polarization PM2.5 estimation method and system based on machine learning fusion model |
CN117828992A (en) * | 2024-01-04 | 2024-04-05 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | Accurate prediction method and system for CCN number concentration with high space-time resolution |
Also Published As
Publication number | Publication date |
---|---|
CN112016696B (en) | 2022-10-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||