CN109542993B - Remote sensing data processing method based on rough set

Info

Publication number: CN109542993B
Application number: CN201811369570.2A
Authority: CN
Inventors: 顾沈明; 管林挺; 谭安辉
Original assignee: Zhejiang Ocean University ZJOU
Current assignee: Zhejiang Ocean University ZJOU
Priority date: 2018-11-16
Filing date: 2018-11-16
Publication date: 2022-07-22
Anticipated expiration: 2038-11-16
Also published as: CN109542993A

Abstract

To address the problem that the prior art cannot meet the demand for fast attribute reduction of remote sensing data, a remote sensing data mining method based on a rough set is provided. The method performs attribute reduction in parallel and reduces the output of intermediate results. Useful information is mined from the remote sensing data by means of fast rough-set attribute reduction, which effectively improves the attribute reduction speed for remote sensing data.

Description

Remote sensing data processing method based on rough set

Technical Field

The invention belongs to the field of remote sensing data processing, and particularly relates to a remote sensing data processing method based on a rough set.

Background

In recent decades, high-spatial-resolution remote sensing images have been widely used in agriculture, forestry, oceanography, environmental monitoring and other fields, and have great economic value and social benefit. However, because such images are large in volume, varied in data type, rich in information, and complex to interpret and analyze, accurate and efficient automatic ground feature classification of high-spatial-resolution remote sensing images has remained difficult. How to classify ground features in high-spatial-resolution remote sensing big data has become one of the technical difficulties and bottlenecks limiting its large-scale application. Compared with medium- and low-resolution remote sensing images, high-spatial-resolution images have richer textures, more distinct shapes and more complex spatial relationships. The prior art often uses spectral, shape and texture features to describe the different characteristics of ground classes in high-spatial-resolution remote sensing images. However, these are low-level features, which makes it difficult to fully describe the geometric and structural information of ground features in such images. In recent years, the bag-of-words (BoW) model and topic models from text analysis and scene understanding have been introduced into the field of remote sensing. These methods extract statistical or semantic information about local features through a bag-of-words model and analyze the topics in the high-spatial-resolution remote sensing image accordingly, thereby achieving classification.
Most existing feature extraction methods produce statistical features, so it is difficult to describe the essential information of ground features accurately and to realize automatic interpretation of high-spatial-resolution remote sensing images. Given the diversity of sensors, the variability of imaging conditions and the complexity of ground targets, the key to classifying ground features in high-spatial-resolution remote sensing big data is to extract their deep structural information and describe them as completely as possible. Developing a good feature takes a great deal of effort, and often requires a deep understanding of the problem and repeated exploration. There is therefore a need to generate suitable features automatically. The architecture of existing remote sensing satellite data processing systems generally comprises a lower data application layer, a data storage layer, a data processing and analysis layer and an upper data application layer, and the whole system performs data processing and analysis mainly through a single computing cluster. However, as remote sensing satellites are launched more and more frequently, the diversity of their payload data and applications becomes more and more pronounced, and the scale of stored data grows rapidly. For near-real-time remote sensing applications in particular, the amount of data to be processed increases many times over, and users' demands for timely data processing and application grow ever stronger.

Disclosure of Invention

The invention provides a remote sensing data processing method based on a rough set, designed to address the problems that, especially for near-real-time remote sensing applications, the amount of data to be processed is multiplying and users' demands for timely data processing and application are growing ever stronger.

A remote sensing data processing method based on a rough set comprises the following steps:

M1, setting GPS positioning information at the known resource points on the earth surface where the remote sensing information needs to be processed;

M2, according to the type of the remote sensing data, selecting multi-band data and full-wave data as first-class screening data, selecting spectral band values as second-class screening data, and selecting geographic information as feature screening data;

M3, computing the statistical characteristics of the selected data blocks according to the feature screening data;

M4, classifying the statistical characteristics of the selected data blocks according to the feature screening data and, by comparing the statistical characteristics of the data blocks, removing data in the screening data whose similarity is lower than 64%;

M5, performing characterization processing on the second-class screening data;

M6, repeating steps M4 and M5, and removing redundant features of the selected data blocks represented by all types of feature screening data;

M7, collecting all the screening data features from M6 into a screening set;

M8, screening the data blocks for earth surface features or ocean resource features according to the screening set of the second-class screening data;

M9, screening the data from step M8 against the screening set of the same class of screening data, and confirming the data blocks;

M10, according to the result of step M9, performing rule extraction on the blocks for which the screening result of the screening set of the second-class screening data is 1 and the screening result of the screening set of the first-class screening data is 0;

M11, reversely extracting the rules of step M10 and rejecting them from the screening set of step M7, which constitutes one iteration;

M12, replacing the screening set used in steps M8 and M9 with both the screening set and the one-time iteration screening set, so that subsequent screening preferentially uses the double screening set;

M13, repeating steps M8 to M11 until all data blocks are screened;

M14, outputting the screening result, or cycling back to continue screening from the first data block.
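By way of illustration only, steps M3 and M4 can be sketched as follows. The source does not specify which statistics are computed or how similarity is measured; the statistics tuple, the cosine similarity measure, the block names and the sample values below are all assumptions made for this sketch. Only the 64% threshold comes from the text.

```python
import math
from statistics import mean, pstdev

SIMILARITY_THRESHOLD = 0.64  # the 64% cut-off named in step M4


def block_statistics(block):
    """Step M3 (sketch): summarise one data block by simple statistics (assumed choice)."""
    return (mean(block), pstdev(block), min(block), max(block))


def cosine_similarity(u, v):
    """Assumed similarity measure; the source does not name one."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def screen_blocks(blocks, reference_stats):
    """Step M4 (sketch): keep blocks whose statistics are at least 64% similar."""
    kept = {}
    for name, block in blocks.items():
        similarity = cosine_similarity(block_statistics(block), reference_stats)
        if similarity >= SIMILARITY_THRESHOLD:
            kept[name] = block
    return kept


# Hypothetical band values for two data blocks and one reference block.
blocks = {
    "block_a": [0.42, 0.45, 0.47, 0.44],
    "block_b": [-0.90, 0.80, -0.85, 0.95],
}
reference = block_statistics([0.40, 0.46, 0.48, 0.43])
kept = screen_blocks(blocks, reference)
print(sorted(kept))
```

In this sketch the reference statistics stand in for the feature screening data; a block whose statistics differ strongly from the reference falls below the threshold and is removed.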

Preferably, the step M2 includes the following steps:

A1, removing geographic information as feature screening data;

A2, establishing an expert knowledge base, and taking the set of wave band data vectors of the expert knowledge base as feature screening data;

A3, normalizing the screening set of step A2 and setting it as an open set;

A4, copying the screening set resulting from step A3 to step M7 as a supplementary screening set.

Preferably, the screening in step M8 is based on a greedy search algorithm: the screening set conditions are matched, and the data blocks are sorted by match and screened.

Preferably, the step M9 includes the following steps:

B1, establishing a maximum descriptive boundary domain according to the number of screening items in the screening set;

B2, sorting the major classes in the boundary domain by a critical value;

B3, matching probabilities with the descriptive features so that the aggregate matching probability is greater than the set confidence level;

B4, performing boundary screening using all descriptive features that meet the confidence level;

B5, screening out the data in all the data blocks that conform to the maximum boundary domain, and setting them as the primary credible maximum-edge screening result;

B6, changing the rule category of the screening set according to the adaptive boundary domain;

B7, judging the adaptive boundary domain of step B6 directly against the confidence level of step B3;

and B8, carrying out secondary screening on the primary credible maximum-edge screening result of step B5 according to the judgment result of step B7 to obtain the output result.

Preferably, the expert knowledge base comprises the variation values of the vegetation growth index over a plurality of wave bands and, for the ocean domain, the variation values over time of ocean resources over a plurality of wave bands at the telemetry data terminal.

Preferably, the data block splitting in step M3 includes the following steps:

C1, constructing a granulation decomposition model based on the SI-DWT transform;

C2, performing aligned splitting of the data blocks, and stopping the SI-DWT transform at the second-level decomposition;

and C3, performing feature representation of the second-level decomposition granule data.

The substantial effect of the method is that the remote sensing data are mined using the fast attribute reduction of the rough set, which effectively improves the attribute reduction speed for remote sensing data; the matching values of the rough set can be updated simply, cyclically and continuously to accelerate subsequent mining, and the original data can be mined continuously.
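For readers unfamiliar with rough-set attribute reduction, the following is a minimal sketch of a common textbook approach: a greedy, dependency-degree-based reduct in the style of QuickReduct. It is not taken from the source, which does not disclose its reduction algorithm or its parallelization. The decision table, attribute names and class labels are hypothetical.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes under the given attributes."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].append(i)
    return list(classes.values())


def dependency(rows, attrs, decision):
    """Fraction of rows in the positive region (classes with one decision value)."""
    if not rows:
        return 0.0
    positive = 0
    for cls in partition(rows, attrs):
        if len({rows[i][decision] for i in cls}) == 1:
            positive += len(cls)
    return positive / len(rows)


def greedy_reduct(rows, condition_attrs, decision):
    """Add the attribute that raises dependency most until it matches the full set."""
    target = dependency(rows, condition_attrs, decision)
    reduct = []
    while dependency(rows, reduct, decision) < target:
        best = max(
            (a for a in condition_attrs if a not in reduct),
            key=lambda a: dependency(rows, reduct + [a], decision),
        )
        reduct.append(best)
    return reduct


# Hypothetical discretised decision table: three condition attributes, one decision.
rows = [
    {"band_red": "low", "band_nir": "high", "texture": "rough", "cover": "vegetation"},
    {"band_red": "low", "band_nir": "high", "texture": "smooth", "cover": "vegetation"},
    {"band_red": "high", "band_nir": "low", "texture": "rough", "cover": "bare_soil"},
    {"band_red": "low", "band_nir": "low", "texture": "smooth", "cover": "water"},
    {"band_red": "high", "band_nir": "low", "texture": "smooth", "cover": "bare_soil"},
]
reduct = greedy_reduct(rows, ["band_red", "band_nir", "texture"], "cover")
print(reduct)
```

In this table the texture attribute is redundant: the two band attributes alone determine the cover class, so the reduct omits it. A greedy reduct of this kind is not guaranteed to be minimal in general.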

Detailed Description

The technical solution of the present invention will be further specifically described below by way of specific examples.

Example 1

The remote sensing data processing method based on the rough set comprises the following steps:

M1, setting GPS positioning information at the known resource points on the earth surface where the remote sensing information needs to be processed;

M6, repeating steps M4 and M5, and removing redundant features of the selected data blocks represented by all types of feature screening data;

M7, collecting all the screening data features from M6 into a screening set;

M9, screening the data from step M8 against the screening set of the screening data, and confirming the data blocks;

M10, according to the result of step M9, performing rule extraction on the blocks for which the screening result of the screening set of the second-class screening data is 1 and the screening result of the screening set of the first-class screening data is 0;

M12, replacing the screening set used in steps M8 and M9 with both the screening set and the one-time iteration screening set, so that subsequent screening preferentially uses the double screening set;

M13, repeating steps M8 to M11 until all data blocks are screened;

The step M2 includes the following steps:

A1, removing geographic information as feature screening data;

The screening in step M8 is based on a greedy search algorithm: the screening set conditions are matched, and the data blocks are sorted by match and screened.

A greedy search algorithm does not yield a globally optimal solution for every problem. The key is the choice of greedy strategy, which must have no aftereffect: the history before a given state must not influence later states, which depend only on the current state. In general, one first proves that some globally optimal solution of the problem begins with a greedy choice; once that choice is made, the original problem reduces to a similar subproblem of smaller scale. Mathematical induction then shows that making a greedy choice at each step finally yields a globally optimal solution of the problem.
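One possible reading of the greedy matching and sorting in step M8 is sketched below. The source gives no concrete procedure, so this is an assumption: at each step the block matching the most still-unmatched screening conditions is taken, and the problem shrinks to the remaining conditions and blocks. The block names and condition labels are hypothetical.

```python
def greedy_match_order(blocks, screening_set):
    """Rank blocks by repeatedly taking the one that matches the most unmatched conditions."""
    remaining = set(screening_set)
    candidates = dict(blocks)
    order = []
    while remaining and candidates:
        # Greedy choice: depends only on the current state (remaining, candidates).
        name = max(sorted(candidates), key=lambda n: len(candidates[n] & remaining))
        gained = candidates.pop(name) & remaining
        if not gained:
            break  # no remaining block matches any unmatched condition
        order.append((name, sorted(gained)))
        remaining -= gained  # the problem reduces to a smaller subproblem
    return order, remaining


# Hypothetical screening conditions and the conditions each data block satisfies.
screening_set = {"ndvi_high", "nir_high", "red_low", "near_coast"}
blocks = {
    "block_1": {"ndvi_high", "nir_high"},
    "block_2": {"red_low"},
    "block_3": {"nir_high", "red_low", "near_coast"},
    "block_4": {"cloud"},
}
order, unmatched = greedy_match_order(blocks, screening_set)
for name, matched in order:
    print(name, matched)
```

Each choice depends only on the current set of unmatched conditions, which is the no-aftereffect property described above. As that paragraph notes, a greedy ordering of this kind is not guaranteed to be globally optimal for every input.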

The step M9 includes the following steps:

B2, sorting the major classes in the boundary domain by a critical value;

B4, performing boundary screening using all descriptive features that meet the confidence level;

B5, screening out the data in all the data blocks that conform to the maximum boundary domain, and setting them as the primary credible maximum-edge screening result;

B7, judging the adaptive boundary domain of step B6 directly against the confidence level of step B3;

Confidence, in the subjective sense, refers to the degree to which a particular individual believes a particular proposition to be true; that is, probability is a measure of the rationality of an individual's belief. Under this interpretation, an event itself has no probability, and assigning a probability to an event merely expresses the belief, based on evidence, of the person assigning it. In statistics, the confidence level is the probability that the population parameter falls within a certain region around the sample statistic, and the confidence interval is the error range between the sample statistic and the population parameter at a given confidence level. For a given sample, a higher confidence level corresponds to a wider confidence interval.
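The relationship between confidence level and interval width can be illustrated with a short sketch. It uses a normal-approximation interval for a sample mean; the sample values are hypothetical, and for a sample this small a Student's t interval would normally be preferred.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence):
    """Normal-approximation confidence interval for the population mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    half_width = z * stdev(sample) / math.sqrt(len(sample))
    centre = mean(sample)
    return centre - half_width, centre + half_width


# Hypothetical band reflectance samples from one data block.
sample = [0.61, 0.58, 0.65, 0.63, 0.60, 0.62, 0.59, 0.64]
low90, high90 = mean_confidence_interval(sample, 0.90)
low99, high99 = mean_confidence_interval(sample, 0.99)
print(f"90% interval width: {high90 - low90:.4f}")
print(f"99% interval width: {high99 - low99:.4f}")
```

For the same sample, the 99% interval is wider than the 90% interval.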

The maximum boundary domain and the adaptive boundary domain are, respectively, the maximum and minimum values in the data for which the specific features in the set can produce a result. Because the data are completely random, the confidence level is introduced to balance the matching levels of the maximum boundary domain and the adaptive boundary domain during matching.

The expert knowledge base comprises the variation values of the vegetation growth index over a plurality of wave bands and, for the ocean domain, the variation values over time of ocean resources over a plurality of wave bands at the telemetry data terminal.

The data block splitting in the step M3 includes the following steps:

C2, performing aligned splitting of the data block, and stopping the SI-DWT transform at the second-level decomposition;

Granular computing refers to computation and operations performed on information granules. In practice, granular computing is used to split the data blocks coarsely, which facilitates the matching work of subsequent data screening and ensures that non-contiguous cuts are output during screening, so that screening failures caused by edge effects can be avoided. Alternatively, edge effects in data screening can be avoided by repeated granulation: the base data are granulated a second time and the result is superimposed on the primary screening result.
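As a rough illustration of steps C1 to C3, the sketch below assumes that SI-DWT denotes a shift-invariant (undecimated) discrete wavelet transform; the source does not expand the abbreviation. It uses Haar filters with averaging normalization and circular boundary handling, and stops at the second decomposition level. The feature summary in `granule_features` is a hypothetical choice, as is the sample signal.

```python
def undecimated_haar(signal, levels=2):
    """Shift-invariant (a trous) Haar decomposition with circular boundaries."""
    approx = list(signal)
    n = len(approx)
    details = []
    for level in range(levels):
        step = 2 ** level  # filter taps are spread further apart at each level
        shifted = [approx[(i + step) % n] for i in range(n)]
        details.append([(a - b) / 2 for a, b in zip(approx, shifted)])
        approx = [(a + b) / 2 for a, b in zip(approx, shifted)]
    return approx, details


def granule_features(approx, details):
    """Step C3 (sketch): describe the granules by mean level and detail energy."""
    return {
        "approx_mean": sum(approx) / len(approx),
        "detail_energy": [sum(d * d for d in level) for level in details],
    }


# Hypothetical one-dimensional data block (for example, one image row).
signal = [4, 8, 12, 16, 8, 4, 0, 4]
approx, details = undecimated_haar(signal, levels=2)
features = granule_features(approx, details)
print(features)
```

Because no downsampling takes place, a circular shift of the input produces the same shift in every output band. This is the property that makes such a transform attractive where block edges would otherwise affect screening.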

Claims

1. A remote sensing data processing method based on a rough set, characterized by comprising the following steps:

M2, according to the type of the remote sensing data, selecting multi-band data and full-wave data as first-class screening data, selecting spectral band values as second-class screening data, and selecting geographic information as feature screening data;

M7, collecting all the screening data features from M6 into a screening set;

M13, repeating steps M8 to M11 until all the data blocks are screened;

M14, outputting the screening result, or cycling back to continue screening from the first data block.

2. The remote sensing data processing method based on the rough set as claimed in claim 1, wherein said step M2 includes the following steps:

A1, removing geographic information as feature screening data;

A2, establishing an expert knowledge base, and taking the set of wave band data vectors of the expert knowledge base as feature screening data;

3. The remote sensing data processing method based on the rough set as claimed in claim 1, wherein the screening in step M8 is based on a greedy search algorithm, which matches the screening set conditions and sorts and screens the data blocks by match.

4. The remote sensing data processing method based on the rough set as claimed in claim 1, wherein said step M9 includes the following steps:

B2, sorting the major classes in the boundary domain by a critical value;

5. The remote sensing data processing method based on the rough set as claimed in claim 2, wherein the expert knowledge base comprises the variation values of the vegetation growth index over a plurality of wave bands and, for the ocean domain, the variation values over time of ocean resources over a plurality of wave bands at the telemetry data terminal.

6. The remote sensing data processing method based on the rough set as claimed in claim 2, wherein the data block splitting in the step M3 includes the following steps:

C1, constructing a granulation decomposition model based on the SI-DWT transform;