CN112308151A - Weight-based rotation forest classification method for hyperspectral images - Google Patents

Weight-based rotation forest classification method for hyperspectral images

Info

Publication number
CN112308151A
CN112308151A (application CN202011207564.4A)
Authority
CN
China
Prior art keywords
training
decision tree
sample
diversity
samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011207564.4A
Other languages
Chinese (zh)
Inventor
冯伟
董淑仙
全英汇
钟娴
童莹萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN202011207564.4A
Publication of CN112308151A
Legal status: Pending


Classifications

    • G06F18/24 — Physics; Computing; Electric digital data processing; Pattern recognition; Analysing; Classification techniques
    • G06F18/214 — Physics; Computing; Electric digital data processing; Pattern recognition; Analysing; Design or setup of recognition systems or techniques; Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a weight-based rotation forest method for classifying hyperspectral images, which addresses the low classification accuracy of hyperspectral images and the poor ensemble performance of classification models. The scheme is as follows: divide the hyperspectral image samples into a training set and a test set; initialize the sample weights of the training set and multiply them by the corresponding training samples to obtain a weighted training set; train a decision-tree base classifier and obtain its classification results on the weighted training set; establish the weight-based rotation forest model; and feed the test set into the weight-based rotation forest model to obtain the final classification result of the hyperspectral image samples. The method mines samples that carry important information by designing a dynamic weighting function, and feeds the classification results of the already-generated decision-tree base classifiers on the weighted training set into the training of the current decision-tree base classifier.

Description

Weight-based rotation forest classification method for hyperspectral images
The invention belongs to the technical field of image processing, relates mainly to remote sensing image processing, and specifically to a weight-based rotation forest method for classifying hyperspectral images. In particular, it is a remote sensing classification method that mines important samples and can be used for land-cover classification of hyperspectral images.
Background
Classification is one of the main tasks of remote sensing information processing. Hyperspectral data are generally harder to classify than other remote sensing images because of the high ratio of features to samples and the redundant information present in the feature set. Although most learning systems face the troublesome problem known as the "curse of dimensionality", studies have demonstrated the successful application of classifier ensemble techniques to hyperspectral classification. Ensemble learning is an effective way to build an accurate classification system: it can improve the performance of weak classifiers and make accurate predictions. Boosting and bootstrap aggregating (bagging) are the main ensemble learning methods. Diversity is considered a very important property of a classifier combination; it can be used to reduce the variance error of an ensemble method without increasing its bias error. To encourage diversity in bagging, Tin Kam Ho of Bell Laboratories proposed the Random Forest (RFs) algorithm in 1995. In 2005, Jisoo Ham applied RFs to remote sensing image classification for the first time and achieved satisfactory results. RFs are a combination of tree predictors in which each decision tree is built from training samples drawn with replacement; they randomly sample the attributes and choose the best split among those variables rather than among all attributes. RFs have important advantages: efficient operation on large databases, the ability to handle thousands of input variables without variable deletion, low time cost, and so on.
In image processing, Juan J. Rodríguez drew on the idea of RFs and proposed the Rotation Forest (RoF) method, which aims to build more accurate and more diverse base classifiers. The method randomly partitions the feature space into several subspaces, applies a feature transformation to each subspace separately, and repeats the process, generating different training data sets and base classifiers for different feature subspaces. The thesis "Research on classifier ensemble algorithms based on rotation forest" showed that the rotation forest algorithm outperforms bagging, Adaptive Boosting (AdaBoost), and RFs.
In summary, the rotation forest uses a feature extraction algorithm to generate a sparse rotation matrix and projects the original image into different coordinate systems, so the constructed base classifiers differ strongly from one another. RoF therefore performs better in image classification than bagging, AdaBoost, RFs, and other algorithms. However, because the RoF algorithm gives all training samples the same weight, the potential of samples that provide important information is ignored. Furthermore, these algorithms generate their base classifiers independently of one another; some of those base classifiers not only increase the computational complexity of the algorithm but also reduce its ensemble performance.
Disclosure of Invention
The purpose of the invention is to provide, against the defects of the prior art, a weight-based rotation forest method for classifying hyperspectral images with better ensemble performance. The method mines and weights important samples and adaptively guides the growth of the trees in the weighted rotation forest, thereby improving the classification accuracy of hyperspectral images.
The weight-based rotation forest method of the invention for classifying hyperspectral images comprises the following steps:
(1) Obtain samples and divide them into a training set and a test set: acquire hyperspectral image samples of size M×F through field collection or from a remote sensing database, where M denotes the number of samples, F the number of features per sample, and C the number of sample classes; then randomly draw N samples from the M samples as the training set S and use the remaining samples as the test set E; S = {(x_1,y_1),(x_2,y_2),…,(x_N,y_N)}, where x_i denotes the i-th sample of the training set S and is a 1×F vector, and y_i denotes the label of sample x_i, y_i ∈ {1,2,…,C};
(2) Initialize the sample weights in the training set S: let W(x_i) denote the initial weight of sample x_i; initialize the weight of each sample in S as W(x_i) = 1/N, i = 1,2,…,N;
(3) Generate the weighted training set S': multiply the N initialized training sample weights W(x_i) by the corresponding samples x_i of the training set S to obtain the weighted training set S' = {(W(x_1)·x_1, y_1), (W(x_2)·x_2, y_2), …, (W(x_N)·x_N, y_N)};
(4) Establish the weight-based rotation forest model: assume the weight-based rotation forest model consists of T decision-tree base classifiers; let t denote the index of a base classifier, t = 1,2,…,T; the T base classifiers are arranged in order and trained in that order. Sample the weighted training set S' N times with replacement to obtain a diversity training sample set S_t, each sample of which is a 1×F vector; randomly partition the F features of S_t into K subsets, forming the feature subsets F_t,k, k = 1,2,…,K; from S_t, select the columns corresponding to the features contained in each F_t,k, forming K diversity training sample subsets S_t,k; apply the principal component analysis (PCA) algorithm to each S_t,k to extract features and obtain the rotation matrix R_t^a; multiply S_t by R_t^a to obtain the rotated diversity training sample set S'_t = S_t · R_t^a; train a decision-tree base classifier with S'_t, the t-th trained base classifier being denoted ξ_t, t = 1,2,…,T; the T trained decision-tree base classifiers together form the weight-based rotation forest model for the hyperspectral image;
(5) Generate the classification result: feed each sample of the test set E into the T trained decision-tree base classifiers of the weight-based rotation forest model to obtain T classification results; the class receiving the most of the T results is the classification result of the weight-based rotation forest model for the hyperspectral image.
The method solves the problems that the potential of samples providing important information is ignored and that generating each base classifier independently increases algorithmic complexity and reduces ensemble performance; it improves the classification accuracy of hyperspectral image samples.
Compared with the prior art, the invention has the following advantages:
Improved classification accuracy: against the problem that the RoF algorithm treats all training samples as equal and ignores the potential of samples containing important information, the invention mines that potential and weights the samples by designing a dynamic weighting function. The larger a sample's weight, the more important the sample and the more attention the decision-tree base classifier gives it, which improves the classification accuracy of hyperspectral image samples.
Improved ensemble performance: against the problem that the base classifiers of the RoF algorithm are mutually independent, so that some of them increase the computational complexity of the algorithm and reduce its ensemble performance, the invention weights the samples before training each decision-tree base classifier, with weights computed from the classification results of the already-generated base classifiers on the training samples; the base classifiers are thereby linked to one another, which improves the ensemble performance of the model.
Drawings
FIG. 1 is a block flow diagram of the present invention;
FIG. 2 is a block diagram of a process for building a weight-based rotating forest model according to the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and examples.
Example 1
The hyperspectral image contains rich spectral information and can effectively reflect the properties of the imaged target, so it is widely used in fields such as precision agriculture, environmental monitoring, and military reconnaissance. In these applications, hyperspectral image classification is one of the important links, and the final goal of classification is to give each pixel in an image a unique and accurate class identifier. The RoF algorithm is one of many classification algorithms: it uses a feature extraction algorithm to generate a sparse rotation matrix and projects the original image into different coordinate systems, so the constructed base classifiers differ strongly from one another. Compared with algorithms such as bagging, AdaBoost, and RFs, the RoF algorithm can improve the classification accuracy of hyperspectral image samples. But RoF gives all training samples the same weight, ignoring the potential of samples that provide important information. Furthermore, these algorithms generate their base classifiers independently of one another; some of those base classifiers not only increase the computational complexity of the algorithm but also reduce its ensemble performance. Against this state of the art, the invention develops improvements and proposes a weight-based rotation forest method for classifying hyperspectral images.
The weight-based rotation forest method of the invention for classifying hyperspectral images comprises the following steps:
(1) Obtain samples and divide them into a training set and a test set: hyperspectral image samples of size M×F can be obtained, according to the images to be classified, through field collection or from a remote sensing database, where M denotes the number of samples, F the number of features per sample, and C the number of classes in the image samples. Then randomly draw N samples from the M samples as the training set S and use the remaining samples as the test set E; S = {(x_1,y_1),(x_2,y_2),…,(x_N,y_N)}, where x_i denotes the i-th sample of the training set S and is a 1×F vector, and y_i denotes the label of sample x_i, i = 1,2,…,N, y_i ∈ {1,2,…,C}.
If the dataset used is the Pavia University scene in northern Italy, the hyperspectral image samples to be classified may be collected in the field or obtained from a remote sensing database.
(2) Initialize the sample weights in the training set S: let W(x_i) denote the initial weight of sample x_i; initialize the weight of each sample in S as W(x_i) = 1/N, i = 1,2,…,N; that is, the initial weight of each sample in the training set S of the invention is 1/N.
(3) Generate the weighted training set S': multiply the N initialized training sample weights W(x_i) by the corresponding samples x_i of the training set S, i = 1,2,…,N, to obtain the weighted training set

S' = {(W(x_1)·x_1, y_1), (W(x_2)·x_2, y_2), …, (W(x_N)·x_N, y_N)},

where (W(x_1)·x_1, y_1) is the first weighted sample, and so on.
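A minimal sketch of steps (2)-(3) in NumPy, assuming the training samples are stored row-wise in an N×F array; the variable names are illustrative and do not come from the patent:

```python
import numpy as np

def init_weights(n_samples):
    # Step (2): every training sample starts with weight 1/N.
    return np.full(n_samples, 1.0 / n_samples)

def weight_training_set(X, w):
    # Step (3): scale each row x_i by its weight W(x_i); labels are unchanged.
    return X * w[:, None]

# Usage with the dimensions of the Pavia example: N = 90, F = 103.
rng = np.random.default_rng(0)
X = rng.random((90, 103))
w = init_weights(len(X))
X_weighted = weight_training_set(X, w)
```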
(4) Establish the weight-based rotation forest model: assume the weight-based rotation forest model consists of T decision-tree base classifiers; let t denote the index of a base classifier, t = 1,2,…,T, and train the classifiers in that order. Sample the weighted training set S' N times with replacement to obtain a diversity training sample set S_t, each sample of which is a 1×F vector. Randomly partition the F features of S_t into K subsets, forming the feature subsets F_t,k, k = 1,2,…,K. From S_t, select the columns corresponding to the features contained in each F_t,k, forming K diversity training sample subsets S_t,k. Apply the principal component analysis (PCA) algorithm to each S_t,k to extract features and obtain the rotation matrix R_t^a. Multiply S_t by R_t^a to obtain the rotated diversity training sample set S'_t. Train a decision-tree base classifier on S'_t; the t-th trained base classifier is denoted ξ_t, t = 1,2,…,T. The T trained decision-tree base classifiers together form the weight-based rotation forest model for the hyperspectral image. Unlike the existing RoF algorithm, in which the base classifiers are generated independently of one another, the present method computes the sample weights from the classification results of the already-generated base classifiers on the training samples, weights the samples, and trains the current base classifier on the weighted training samples, so that all trained base classifiers are linked to one another; the weight-based rotation forest model therefore has better ensemble performance than the existing RoF model.
(5) Generate the classification result: feed each sample of the test set E into the T trained decision-tree base classifiers of the weight-based rotation forest model to obtain T classification results. The class receiving the most of the T results is the classification result of the weight-based rotation forest model, i.e., the classification result of the hyperspectral image samples to be classified.
The RoF algorithm is a typical hyperspectral image classification algorithm at present; it uses a feature extraction algorithm to increase both the difference between and the accuracy of the base classifiers, thereby improving the classification accuracy of hyperspectral image samples. However, RoF ignores the potential of samples providing important information and does not make reasonable use of its multiple base classifiers, so the ensemble performance of the rotation forest model is poor. Against this problem, the invention, through research, proposes an overall technical scheme for weight-based rotation forest classification of hyperspectral images. The method designs a dynamic weighting function, mines the potential of important samples, and assigns weights to the samples; the larger a sample's weight, the more attention the classifier gives it, which improves the classification accuracy of hyperspectral image samples. In addition, because the samples are weighted before each decision-tree base classifier is trained, with weights computed from the classification results of the already-generated base classifiers on the training samples, the base classifiers are interconnected: the classification results of the previously generated base classifiers on the training samples are fed into the base classifier currently being trained, which improves the ensemble performance of the model.
Example 2
The weight-based rotation forest method for classifying hyperspectral images is the same as in Embodiment 1; the weight-based rotation forest model of step (4) is established, referring to FIG. 2, through the following steps:
(4a) Initialize the decision-tree base classifier: introduce the rotation forest model structure, in which the weight-based rotation forest model is assumed to consist of T decision-tree base classifiers; let t denote the index of a base classifier, t = 1,2,…,T; the T base classifiers are arranged in order and trained in that order. Initialize the base-classifier index t = 1 and start the training iterations. After the index is initialized, each iteration generates, in turn, a diversity training sample set, feature subsets, diversity training sample subsets, and a rotated diversity training sample set, and trains a decision-tree base classifier on the rotated set to complete that classifier's training.
(4b) Generate the diversity training sample set S_t: sample the weighted training set S' N times with replacement; the samples drawn form the diversity training sample set S_t = [X_t, Y_t], where X_t denotes the set of samples obtained by the N draws (each sample still a 1×F vector) and Y_t denotes the labels corresponding to all samples in X_t; X_t and Y_t together form S_t.
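A minimal sketch of step (4b), assuming the weighted samples sit in an N×F NumPy array (illustrative names only):

```python
import numpy as np

def bootstrap_sample(Xw, y, rng):
    """Step (4b): draw N samples with replacement from the weighted
    training set S', giving the diversity training sample set (X_t, Y_t)."""
    n = len(Xw)
    idx = rng.integers(0, n, size=n)   # sampling with replacement
    return Xw[idx], y[idx]

rng = np.random.default_rng(0)
Xw = rng.random((90, 103))             # weighted training samples
y = rng.integers(1, 10, size=90)       # labels in {1, ..., 9}
X_t, Y_t = bootstrap_sample(Xw, y, rng)
```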
(4c) Generate the feature subsets F_t,k: randomly partition, without replacement, the F features of the diversity training sample set S_t into K subsets, 1 < K < F, forming the feature subsets F_t,k corresponding to S_t, k = 1,2,…,K, where K denotes the number of feature subsets; assuming each feature subset F_t,k contains P features, each F_t,k is a 1×P vector, with P = F/K.
(4d) Generate the diversity training sample subsets S_t,k: from the diversity training sample set S_t, select the columns corresponding to the features contained in each feature subset F_t,k (i.e., the part of S_t corresponding to F_t,k), forming K diversity training sample subsets S_t,k, each of dimension N×P.
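Steps (4c)-(4d) can be sketched as follows; np.array_split reproduces the uneven split used later in the Pavia example (103 features into 30 subsets). The names are illustrative:

```python
import numpy as np

def split_features(n_features, K, rng):
    # Step (4c): randomly partition the feature indices into K disjoint subsets.
    return np.array_split(rng.permutation(n_features), K)

def sample_subsets(X_t, subsets):
    # Step (4d): take the columns of S_t named by each feature subset F_t,k.
    return [X_t[:, cols] for cols in subsets]

rng = np.random.default_rng(0)
X_t = rng.random((90, 103))
subsets = split_features(X_t.shape[1], 30, rng)
S_tk = sample_subsets(X_t, subsets)
print([s.shape for s in S_tk][:3])     # the first subsets have 4 features each
```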
(4e) Compute the rotation matrix R_t^a: apply the PCA algorithm to each of the K diversity training sample subsets S_t,k to compute a coefficient matrix c_t,k, k = 1,2,…,K, and assemble the coefficient matrices c_t,k into a block-diagonal matrix R_t; finally, rearrange the rows of R_t according to the original order of the F features to obtain the rotation matrix R_t^a of the diversity training sample set S_t.
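A sketch of step (4e) using scikit-learn's PCA: the loading matrices c_t,k are placed on the diagonal of R_t and the rows are then rearranged into the original feature order, which is the standard rotation forest construction. Variable names are illustrative:

```python
import numpy as np
from sklearn.decomposition import PCA

def rotation_matrix(X_t, subsets):
    """Step (4e): fit PCA per column subset of S_t, assemble the block-diagonal
    matrix R_t from the coefficient matrices c_t,k, then reorder its rows into
    the original feature order, giving the F x F rotation matrix R_t^a."""
    F = X_t.shape[1]
    R = np.zeros((F, F))
    pos, order = 0, []
    for cols in subsets:
        p = len(cols)
        c = PCA(n_components=p).fit(X_t[:, cols]).components_.T  # p x p loadings
        R[pos:pos + p, pos:pos + p] = c
        order.extend(cols)          # original feature index of each block row
        pos += p
    R_a = np.empty_like(R)
    R_a[np.array(order)] = R        # rearrange rows into original feature order
    return R_a

rng = np.random.default_rng(0)
X_t = rng.random((90, 103))
subsets = np.array_split(rng.permutation(103), 30)
print(rotation_matrix(X_t, subsets).shape)   # (103, 103)
```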
(4f) Generate the rotated diversity training sample set S'_t: multiply the diversity training sample set S_t by the rotation matrix R_t^a to obtain the rotated diversity training sample set S'_t = S_t · R_t^a.
(4g) Train a decision-tree base classifier with the rotated diversity training sample set S'_t: let ξ_t denote the trained classifier; train the decision-tree base classifier introduced in step (4a) with the rotated set S'_t obtained in step (4f), yielding the trained base classifier ξ_t, which includes the classification results for all samples x_i.
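Steps (4f)-(4g) reduce to a matrix product followed by an ordinary decision-tree fit; the sketch below uses a random orthogonal matrix (via QR) as a stand-in for the PCA-built R_t^a of the previous sketch:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X_t = rng.random((90, 103))                    # diversity training sample set S_t
Y_t = rng.integers(1, 10, size=90)             # corresponding labels
R_a = np.linalg.qr(rng.random((103, 103)))[0]  # stand-in rotation matrix R_t^a

X_rot = X_t @ R_a                              # (4f): S'_t = S_t . R_t^a
xi_t = DecisionTreeClassifier(random_state=0).fit(X_rot, Y_t)   # (4g): train xi_t
print(xi_t.predict(X_rot[:5]))                 # classification results of xi_t
```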
(4h) Update the sample weights W(x_i): let ξ_q denote the base classifiers obtained so far in step (4g), q = 1,2,…,t; compute the weight W(x_i) of each sample x_i from the classification results of the obtained base classifiers ξ_q for all samples x_i.
(4i) Update the weighted training set S': re-weight the training set S with the sample weights W(x_i) computed in step (4h); specifically, multiply each weight W(x_i) by the corresponding sample x_i of S as a new training sample, keeping the label y_i of x_i unchanged, to obtain the updated weighted training set S' = {(W(x_1)·x_1, y_1), (W(x_2)·x_2, y_2), …, (W(x_N)·x_N, y_N)}.
(4j) Update the base-classifier index t: set t = t + 1, return to step (4b), and enter the next iteration of training a decision-tree base classifier.
(4k) Generate the weight-based rotation forest model: repeat steps (4b)-(4j) T times, traversing all decision-tree base classifiers, to obtain T trained base classifiers ξ_t, t = 1,2,…,T; together, the T trained decision-tree base classifiers form the weight-based rotation forest model.
Building on the RoF algorithm, the added dynamic weighting function weights the samples: the larger a sample's weight, the more important the sample and the more attention the next decision-tree base classifier gives it, which effectively improves the classification accuracy of hyperspectral image samples. When the current decision-tree base classifier is trained, its training samples are obtained by multiplying each sample by a weight computed from the classification results of the already-generated base classifiers on the training samples; this establishes links among the decision-tree base classifiers of the weight-based rotation forest model. The finally generated model, whose base classifiers are interconnected, improves both the ensemble performance of the model and the classification accuracy of the hyperspectral image.
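Putting steps (4a)-(4k) together, a compact end-to-end training sketch might look as follows. The weight update uses the assumed exponential-of-error form discussed under Example 3 below, not the patent's exact equations; all names are illustrative:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier

def train_weighted_rof(X, y, T=10, K=10, seed=0):
    """Train T decision trees, each on a bootstrapped, PCA-rotated copy of the
    weighted training set, re-weighting the samples after every round."""
    rng = np.random.default_rng(seed)
    n, F = X.shape
    w = np.full(n, 1.0 / n)                        # (2): initial weights
    ensemble, preds = [], []
    for t in range(T):
        Xw = X * w[:, None]                        # (3)/(4i): weighted samples
        idx = rng.integers(0, n, n)                # (4b): bootstrap
        subsets = np.array_split(rng.permutation(F), K)   # (4c)
        R = np.zeros((F, F)); pos, order = 0, []
        for cols in subsets:                       # (4e): block-diagonal PCA
            p = len(cols)
            R[pos:pos+p, pos:pos+p] = PCA(p).fit(Xw[idx][:, cols]).components_.T
            order.extend(cols); pos += p
        Ra = np.empty_like(R); Ra[np.array(order)] = R
        tree = DecisionTreeClassifier(random_state=t).fit(Xw[idx] @ Ra, y[idx])
        ensemble.append((tree, Ra))                # (4f)-(4g)
        preds.append(tree.predict(Xw @ Ra))        # results on all samples
        err = (np.array(preds) != y[None, :]).mean(axis=0)
        w = np.exp(err); w /= w.sum()              # (4h): ASSUMED weighting form
    return ensemble

rng = np.random.default_rng(1)
model = train_weighted_rof(rng.random((90, 40)), rng.integers(1, 10, 90))
```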
Example 3
The weight-based rotation forest method for classifying hyperspectral images is the same as in Embodiments 1-2; the sample weight W(x_i) of step (4h) is updated as follows:
[Dynamic weighting function — the two defining equations appear only as images in the source: the first aggregates the classification results ξ_q(x_i), q = 1,2,…,t, of the trained base classifiers against the labels Y_t(x_i); the second derives the updated weight W(x_i) from that aggregate.]
where t denotes the index of the currently trained base classifier, q indexes the already-trained base classifiers, q = 1,2,…,t; ξ_q(x_i) denotes the classification result of the q-th trained base classifier ξ_q for sample x_i, and Y_t(x_i) denotes the label corresponding to sample x_i in the diversity training sample set S_t.
The RoF algorithm treats all training samples as equal and ignores samples containing important information. In the present invention, the larger a sample's weight, the more important the sample; the decision-tree base classifier then gives it more attention, which improves the classification accuracy of hyperspectral image samples.
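The exact equations of the weighting function appear only as images in the source. As a hedged illustration consistent with the surrounding description (samples misclassified by the ensemble so far receive larger weights), one possible AdaBoost-style form is sketched below; this specific formula is an assumption, not the patent's equation:

```python
import numpy as np

def update_weights(preds, y):
    """Hypothetical dynamic weighting. preds is a t x N array holding
    xi_q(x_i) for the t base classifiers trained so far; y holds Y_t(x_i).
    ASSUMED form: the weight grows with the ensemble's per-sample error
    rate and is then normalized to sum to 1."""
    err = (preds != y[None, :]).mean(axis=0)   # per-sample error over q = 1..t
    w = np.exp(err)                            # more errors -> larger weight
    return w / w.sum()

preds = np.array([[1, 2, 2, 3],                # classifier xi_1
                  [1, 2, 3, 3]])               # classifier xi_2
y = np.array([1, 2, 3, 3])
print(update_weights(preds, y))                # the misclassified sample weighs most
```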
A concrete example, without experimental result data, is given below to further illustrate the invention.
Example 4
The weight-based rotation forest method for classifying hyperspectral images is the same as in Embodiments 1-3. Referring to FIG. 1, the invention is implemented through the following steps:
step 1: acquiring a hyperspectral image sample and a sample to be detected: and acquiring a hyperspectral image as a hyperspectral image sample to be classified through field acquisition or a remote sensing database.
The hyperspectral image sample of the embodiment is from, but not limited to, the Pavia University hyperspectral data collected by the imaging spectrometer of the reflective optical system. The size of the Pavia University dataset is M × F42776 × 103, that is, the number of samples M42776, the number of features F per sample 103, and the number of classes C of samples 9.
Step 2: divide the hyperspectral image samples into a training set and a test set: randomly draw 10 samples from each class of the Pavia University dataset, giving a total of N = 10 × C = 90 training samples. The 90 training samples and their labels form the training set S, and the remaining 42686 samples and their labels form the test set E. S = {(x_1,y_1),(x_2,y_2),…,(x_N,y_N)}, where x_i denotes the i-th sample of the training set S and is a 1×103 vector, and y_i denotes the label of sample x_i, y_i ∈ {1,2,…,C}.
Step 3: initialize the sample weights in the training set S: let W(x_i) denote the initial weight of sample x_i; initialize the weight of each sample in S as W(x_i) = 1/N, i = 1,2,…,N, where N denotes the number of samples of the training set S.
In this embodiment, since 90 samples were randomly drawn as training samples in step 2, the number of samples in the training set S is N = 90, and the weight of each sample in S is initialized as W(x_i) = 1/90, i = 1,2,…,90.
Step 4: generate the weighted training set S': multiply the N initialized training sample weights W(x_i) obtained in step 3 by the corresponding samples x_i of the training set S to obtain the weighted training set S' = {(W(x_1)·x_1, y_1), (W(x_2)·x_2, y_2), …, (W(x_N)·x_N, y_N)}.

In this embodiment, the N initialized weights W(x_i) = 1/90 obtained in step 3 are multiplied by the corresponding samples x_i of the training set S, giving S' = {(W(x_1)·x_1, y_1), …, (W(x_90)·x_90, y_90)} = {(x_1/90, y_1), (x_2/90, y_2), …, (x_90/90, y_90)}; the number of samples in the weighted training set S' is still 90.
Step 5: establish the weight-based rotation forest model: referring to FIG. 2, assume the weight-based rotation forest model consists of T decision-tree base classifiers; let t denote the index of a base classifier, t = 1,2,…,T; the T decision-tree base classifiers are linked to one another.
In this embodiment, T is set to 50; that is, the weight-based rotation forest model consists of 50 decision-tree base classifiers.
5.1) Initialize the base-classifier index t = 1.
5.2) Generate the diversity training sample set S_t: sample the weighted training set S' N times with replacement; the samples drawn form the diversity training sample set S_t = [X_t, Y_t], where X_t denotes the samples obtained by the N draws (each sample x_i ∈ X_t still a 1×F vector) and Y_t denotes the labels corresponding to all samples in X_t; X_t and Y_t together form S_t.
In this embodiment, since the weighted training set S' contains N = 90 samples, S' is sampled N = 90 times with replacement, and the drawn samples together with their labels form the diversity training sample set S_t.
5.3) Generate the feature subsets F_t,k: randomly partition, without replacement, the F features of the diversity training sample set S_t into K subsets, 1 < K < F, forming the feature subsets F_t,k corresponding to S_t, k = 1,2,…,K, where K denotes the number of feature subsets; assuming each feature subset F_t,k contains P features, each F_t,k is a 1×P vector, with P = F/K.
In this embodiment, with K = 30, the 103 features of the diversity training sample set S_t are randomly partitioned, without replacement, into 30 subsets, forming the feature subsets F_t,k, k = 1,2,…,30. Since 103 is not evenly divisible by 30, the 1st through 13th feature subsets contain 4 features each and the 14th through 30th contain 3 each (13 × 4 + 17 × 3 = 103).
5.4) Generate the diversity training sample subsets S_t,k: from the diversity training sample set S_t, select the columns corresponding to the features contained in each feature subset F_t,k, forming K diversity training sample subsets S_t,k; assuming each F_t,k contains P features, each subset S_t,k has dimension N×P.

In this embodiment, there are K = 30 feature subsets F_t,k, so the columns corresponding to the features of each F_t,k are selected from S_t, forming 30 diversity training sample subsets S_t,k. The 1st through 13th subsets S_t,k each contain N = 90 samples of dimension 1×4, and the 14th through 30th subsets each contain 90 samples of dimension 1×3.
5.5) Compute the rotation matrix R_t^a: apply the PCA algorithm to each of the K diversity training sample subsets S_t,k to compute a coefficient matrix c_t,k, k = 1,2,…,K, and assemble the K coefficient matrices c_t,k into a block-diagonal matrix R_t; finally, rearrange the rows of R_t according to the original order of the F features to obtain the rotation matrix R_t^a of the diversity training sample set S_t.

In this embodiment, the PCA algorithm is applied to each of the K = 30 subsets S_t,k to compute c_t,k, k = 1,2,…,30; the coefficient matrices form the block-diagonal matrix R_t, whose rows are then rearranged according to the original order of the F = 103 features to obtain the rotation matrix R_t^a.
5.6) Generate the rotated diversity training sample set S'_t: multiply the diversity training sample set S_t by the rotation matrix R_t^a to obtain the rotated diversity training sample set S'_t = S_t · R_t^a.
5.7) Train the decision-tree base classifier ξ_t: train the decision-tree base classifier with the rotated diversity training sample set S'_t to obtain the trained base classifier ξ_t.
5.8) Update the sample weights W(x_i): let ξ_q denote the base classifiers obtained so far, q = 1,2,…,t; compute the weight W(x_i) of each sample x_i from the classification results of the obtained base classifiers ξ_q for sample x_i:

[Dynamic weighting function — the two defining equations appear only as images in the source; see Example 3.]

where t denotes the index of the currently trained base classifier, q indexes the already-trained base classifiers, q = 1,2,…,t; ξ_q(x_i) denotes the classification result of the q-th trained base classifier ξ_q for sample x_i, and Y_t(x_i) denotes the label corresponding to sample x_i in the diversity training sample set S_t.
5.9) Update the weighted training set S': re-weight the training set S with the sample weights W(x_i) computed in step 5.8) to obtain the updated weighted training set S' = {(W(x_1)·x_1, y_1), (W(x_2)·x_2, y_2), …, (W(x_N)·x_N, y_N)}.
5.10) Update the base-classifier index t: set t = t + 1, return to step 5.2), generate anew the diversity training sample set, feature subsets, diversity training sample subsets, and rotated diversity training sample set, train the decision-tree base classifier with the rotated set, and enter the next update iteration.
5.11) Generate the weight-based rotation forest model: repeat steps 5.2)-5.10) T times to obtain T trained decision-tree base classifiers ξ_t, t = 1,2,…,T; together they form the weight-based rotation forest model.

In this embodiment, steps 5.2)-5.10) are repeated T = 50 times, yielding 50 trained decision-tree base classifiers ξ_t, t = 1,2,…,50; the 50 trained base classifiers together form the weight-based rotation forest model.
Step 6: generate the classification result: feed each sample of the test set E into the T trained decision-tree base classifiers of the weight-based rotation forest model to obtain T classification results; the class receiving the most of the T results is the classification result of the weight-based rotation forest model, i.e., the classification result of the hyperspectral image samples to be classified.
In this embodiment, the 42686 samples of the test set E are fed into the 50 trained decision-tree base classifiers of the weight-based rotation forest model, yielding 50 classification results per sample; the class receiving the most of the 50 results is the classification result of the weight-based rotation forest model, i.e., the classification result of the hyperspectral image samples to be classified.
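A minimal sketch of the majority vote of step 6, assuming the trained model is a list of (tree, rotation matrix) pairs as in the earlier sketches (illustrative names only):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def predict_majority(ensemble, X):
    """Step 6: every base classifier votes on each test sample after the sample
    is projected with that classifier's rotation matrix; the class with the
    most votes among the T results is the final prediction."""
    votes = np.stack([tree.predict(X @ R) for tree, R in ensemble])  # T x n
    return np.array([np.bincount(col).argmax() for col in votes.T.astype(int)])

# Toy usage: two base classifiers on random data standing in for Pavia.
rng = np.random.default_rng(0)
X, y = rng.random((90, 103)), rng.integers(1, 10, size=90)
ensemble = []
for t in range(2):
    R = np.linalg.qr(rng.random((103, 103)))[0]   # stand-in rotation matrix
    ensemble.append((DecisionTreeClassifier(random_state=t).fit(X @ R, y), R))
print(predict_majority(ensemble, X)[:10])
```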
The invention mainly solves the problems that the prior art ignores samples providing important information and that the ensemble performance of the classification model is low. The implementation scheme is: acquire hyperspectral image samples and divide them into a training set and a test set; initialize the weight of each training sample; multiply the initialized weights by the corresponding training samples to obtain a weighted training set; train the decision-tree base classifier with the weighted training set; compute new sample weights from the trained base classifiers' classification results on the training samples via the designed dynamic weighting function, re-weight the training set, and train the next base classifier; the T trained decision trees together form the weight-based rotation forest model; feed each sample of the test set into the model to obtain the final classification result of the hyperspectral image samples. Through the weight-based rotation forest model, the method mines the potential of samples with important information, improves the ensemble performance of the trained decision-tree base classifiers, and can be used for land-cover classification of hyperspectral images.
The effects of the present invention can be further illustrated by the following tests:
example 5
The weight-based rotation forest method for classifying hyperspectral images is the same as in Embodiments 1-4.
Test conditions and contents:
In this example, 4 tests were performed in total: 10, 20, 30, and 40 samples were randomly drawn from each class of the Pavia University dataset; since the dataset has C = 9 classes, the training set sizes of the 4 experiments are 90, 180, 270, and 360. The original random forest algorithm, the rotation forest algorithm, and the algorithm of the invention were each used to classify the Pavia University data, and the average accuracy was recorded; the results are shown in Table 1.
Test results and analysis:
TABLE 1 Average classification accuracy (%) of the original random forest algorithm, the rotation forest algorithm, and the algorithm of the invention

Training set S size   Original random forest   Rotation forest   The invention
90                    71.71                    75.83             77.49
180                   75.07                    78.76             81.35
270                   80.04                    83.63             86.76
360                   80.54                    85.11             88.14
With training set S sizes of 90, 180, 270, and 360, the average classification accuracies of the original random forest algorithm, the rotation forest algorithm, and the method of the invention were compared, giving the results in Table 1. As Table 1 shows, for every training set size the average accuracy of the method of the invention exceeds that of the original random forest and rotation forest algorithms, and it is highest when the training set size is 360. The experiments confirm that, for training set sizes of 90, 180, 270, and 360, the method achieves higher average classification accuracy than the existing random forest and rotation forest algorithms and has good practical effect.
In short, the weight-based rotation forest method of the invention for classifying hyperspectral images solves the low classification accuracy of hyperspectral images and the low ensemble performance of classification models in the prior art. The implementation scheme is: acquire hyperspectral image samples and divide them into a training set and a test set; initialize the training-sample weights and multiply them by the corresponding training samples to obtain a weighted training set; train a decision-tree base classifier and obtain its classification results on the weighted training set; design a dynamic weighting function that turns those classification results into cyclically updated sample weights; repeat the weighting and training process T times to obtain T trained decision-tree base classifiers that together form the weight-based rotation forest model; and feed the test set into the model to obtain the final classification result of the hyperspectral image samples. The method mines samples containing important information by designing a dynamic weighting function and feeds the classification results of the already-generated base classifiers on the weighted training set into the training of the current base classifier.
The foregoing description is only an example of the invention and is not intended to limit it; it will be apparent to those skilled in the art that various changes and modifications in form and detail may be made without departing from the spirit and scope of the invention.

Claims (3)

1. A weight-based rotation forest method for classifying hyperspectral images, comprising the following steps:
(1) Obtain samples and divide them into a training set and a test set: acquire hyperspectral image samples of size M×F through field collection or from a remote sensing database, where M denotes the number of samples, F the number of features per sample, and C the number of sample classes; then randomly draw N samples from the M samples as the training set S and use the remaining samples as the test set E; S = {(x_1,y_1),(x_2,y_2),…,(x_N,y_N)}, where x_i denotes the i-th sample of the training set S and is a 1×F vector, and y_i denotes the label of sample x_i, y_i ∈ {1,2,…,C};
(2) Initialize the weights of the samples in the training set S: let W(x_i) denote the initial weight of sample x_i; initialize the weight of each sample in S as W(x_i) = 1/N, i = 1,2,…,N;
(3) Generate the weighted training set S': multiply the N initialized training sample weights W(x_i) by the corresponding samples x_i of the training set S to obtain the weighted training set S' = {(W(x_1)·x_1, y_1), (W(x_2)·x_2, y_2), …, (W(x_N)·x_N, y_N)};
(4) Establish the weight-based rotation forest model: assume the weight-based rotation forest model consists of T decision-tree base classifiers; let t denote the index of a base classifier, t = 1,2,…,T; the T base classifiers are arranged in order and trained in that order; sample the weighted training set S' N times with replacement to obtain a diversity training sample set S_t, each sample of which is a 1×F vector; randomly partition the F features of S_t into K subsets, forming the feature subsets F_t,k, k = 1,2,…,K; from S_t, select the columns corresponding to the features contained in each F_t,k, forming K diversity training sample subsets S_t,k; apply the principal component analysis (PCA) algorithm to each S_t,k to extract features and obtain the rotation matrix R_t^a; multiply S_t by R_t^a to obtain the rotated diversity training sample set S'_t = S_t · R_t^a; train a decision-tree base classifier with S'_t, the t-th trained base classifier being denoted ξ_t, t = 1,2,…,T; the T trained decision-tree base classifiers together form the weight-based rotation forest model for the hyperspectral image;
(5) Generate the classification result: feed each sample of the test set E into the T trained decision-tree base classifiers of the weight-based rotation forest model to obtain T classification results; the class receiving the most of the T results is the classification result of the weight-based rotation forest model for the hyperspectral image.
2. The weight-based rotation forest method for classifying hyperspectral images according to claim 1, wherein the weight-based rotation forest model of step (4) is established through the following steps:
(4a) Initialize the decision-tree base classifier: introduce the rotation forest model structure, in which the weight-based rotation forest model is assumed to consist of T decision-tree base classifiers; let t denote the index of a base classifier, t = 1,2,…,T; the T base classifiers are arranged in order and trained in that order; initialize the base-classifier index t = 1 and start the training iterations;
(4b) Generate the diversity training sample set S_t: sample the weighted training set S' N times with replacement; the samples drawn form the diversity training sample set S_t = [X_t, Y_t], where X_t denotes the set of samples obtained by the N draws (each sample still a 1×F vector) and Y_t denotes the labels corresponding to all samples in X_t; X_t and Y_t together form S_t;
(4c) Generate the feature subsets F_t,k: randomly partition, without replacement, the F features of the diversity training sample set S_t into K subsets, 1 < K < F, forming the feature subsets F_t,k corresponding to S_t, k = 1,2,…,K, where K denotes the number of feature subsets; assuming each feature subset F_t,k contains P features, each F_t,k is a 1×P vector, with P = F/K;
(4d) Generate the diversity training sample subsets S_t,k: from the diversity training sample set S_t, select the columns corresponding to the features contained in each feature subset F_t,k, forming K diversity training sample subsets S_t,k, each of dimension N×P;
(4e) Compute the rotation matrix R_t^a: apply the PCA algorithm to each of the K diversity training sample subsets S_t,k to compute a coefficient matrix c_t,k, k = 1,2,…,K, and assemble the coefficient matrices c_t,k into a block-diagonal matrix R_t; finally, rearrange the rows of R_t according to the original order of the F features to obtain the rotation matrix R_t^a of the diversity training sample set S_t;
(4f) Generate the rotated diversity training sample set S'_t: multiply the diversity training sample set S_t by the rotation matrix R_t^a to obtain the rotated diversity training sample set S'_t = S_t · R_t^a;
(4g) Train a decision-tree base classifier with the rotated diversity training sample set S'_t: let ξ_t denote the trained classifier; train the decision-tree base classifier with S'_t to obtain the trained base classifier ξ_t, which includes the classification results for all samples x_i;
(4h) Update the sample weights W(x_i): let ξ_q denote the base classifiers obtained so far in step (4g), q = 1,2,…,t; substitute the classification results of the obtained base classifiers ξ_q for all samples x_i into the designed dynamic weighting function to compute the weight W(x_i) of each sample x_i;
(4i) Update the weighted training set S': re-weight the training set S with the sample weights W(x_i) to obtain the updated weighted training set S' = {(W(x_1)·x_1, y_1), (W(x_2)·x_2, y_2), …, (W(x_N)·x_N, y_N)};
(4j) Update the base-classifier index t: set t = t + 1, return to step (4b), and enter the next iteration of training a decision-tree base classifier;
(4k) Generate the weight-based rotation forest model: repeat steps (4b)-(4j) T times, traversing all decision-tree base classifiers, to obtain T trained base classifiers ξ_t, t = 1,2,…,T; together, the T trained decision-tree base classifiers form the weight-based rotation forest model.
3. The weight-based rotation forest method for classifying hyperspectral images according to claim 2, wherein the sample weight W(x_i) of step (4h) is updated according to the dynamic weighting function:
[Dynamic weighting function — the two defining equations appear only as images in the source: the first aggregates the classification results ξ_q(x_i), q = 1,2,…,t, against the labels Y_t(x_i); the second derives the weight W(x_i) from that aggregate.]
wherein t denotes the index of the currently trained base classifier, q indexes the already-trained base classifiers, q = 1,2,…,t; ξ_q(x_i) denotes the classification result of the q-th trained base classifier ξ_q for sample x_i, and Y_t(x_i) denotes the label corresponding to sample x_i in the diversity training sample set S_t.
CN202011207564.4A 2020-11-03 2020-11-03 Weighting-based classification method for hyperspectral images of rotating forest Pending CN112308151A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011207564.4A CN112308151A (en) 2020-11-03 2020-11-03 Weighting-based classification method for hyperspectral images of rotating forest

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011207564.4A CN112308151A (en) 2020-11-03 2020-11-03 Weighting-based classification method for hyperspectral images of rotating forest

Publications (1)

Publication Number Publication Date
CN112308151A true CN112308151A (en) 2021-02-02

Family

ID=74334055

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011207564.4A Pending CN112308151A (en) 2020-11-03 2020-11-03 Weighting-based classification method for hyperspectral images of rotating forest

Country Status (1)

Country Link
CN (1) CN112308151A (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102073880A (en) * 2011-01-13 2011-05-25 西安电子科技大学 Integration method for face recognition by using sparse representation
CN105844300A (en) * 2016-03-24 2016-08-10 河南师范大学 Optimized classification method and optimized classification device based on random forest algorithm
CN107358142A (en) * 2017-05-15 2017-11-17 西安电子科技大学 Polarimetric SAR Image semisupervised classification method based on random forest composition
CN107766883A (en) * 2017-10-13 2018-03-06 华中师范大学 A kind of optimization random forest classification method and system based on weighted decision tree
CN107943830A (en) * 2017-10-20 2018-04-20 西安电子科技大学 A kind of data classification method suitable for higher-dimension large data sets
CN108038448A (en) * 2017-12-13 2018-05-15 河南理工大学 Semi-supervised random forest Hyperspectral Remote Sensing Imagery Classification method based on weighted entropy
CN111414863A (en) * 2020-03-23 2020-07-14 国家海洋信息中心 Enhanced integrated remote sensing image classification method
CN111680615A (en) * 2020-06-04 2020-09-18 西安电子科技大学 Multi-class unbalanced remote sensing land cover image classification method based on integration interval

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WEI FENG et al.: "Weight-Based Rotation Forest for Hyperspectral Image Classification", IEEE Geoscience and Remote Sensing Letters *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112884067A (en) * 2021-03-15 2021-06-01 中山大学 Hop count matrix recovery method based on decision tree classifier
CN112884067B (en) * 2021-03-15 2023-08-01 中山大学 Hop count matrix recovery method based on decision tree classifier


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210202