CN112308151A - Weight-based rotation forest classification method for hyperspectral images - Google Patents

Weight-based rotation forest classification method for hyperspectral images

Info

Publication number
CN112308151A
CN112308151A (application CN202011207564.4A)
Authority
CN
China
Prior art keywords
training
decision tree
sample
diversity
samples
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011207564.4A
Other languages
Chinese (zh)
Inventor
冯伟
董淑仙
全英汇
钟娴
童莹萍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN202011207564.4A
Publication of CN112308151A
Legal status: Pending


Classifications

    • G06F18/24 — Physics; Computing; Electric digital data processing; Pattern recognition; Analysing; Classification techniques
    • G06F18/214 — Physics; Computing; Electric digital data processing; Pattern recognition; Analysing; Design or setup of recognition systems or techniques; Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a weight-based rotation forest method for classifying hyperspectral images, which addresses the low classification accuracy of hyperspectral images and the poor ensemble performance of classification models. The scheme is as follows: divide the hyperspectral image samples into a training set and a test set; initialize the sample weights of the training set and multiply them by the corresponding training samples to obtain a weighted training set; train a decision-tree base classifier and obtain its classification results on the weighted training set; establish the weight-based rotation forest model; and feed the test set into the weight-based rotation forest model to obtain the final classification result of the hyperspectral image samples. The method mines samples that carry important information by designing a dynamic weighting function, and feeds the classification results of the already-generated decision-tree base classifiers on the weighted training set into the training of the current decision-tree base classifier.

Description

Weight-based rotation forest classification method for hyperspectral images
The invention belongs to the technical field of image processing, relates mainly to remote sensing image processing, and specifically to a weight-based rotation forest method for classifying hyperspectral images. In particular, it is a remote sensing classification method that mines important samples and can be used for land-cover classification of hyperspectral images.
Background
Classification is one of the main tasks of remote sensing information processing. Hyperspectral data are generally harder to classify than other remote sensing images because of the high ratio of features to samples and the redundant information present in the feature set. Although most learning systems face the troublesome problem known as the "curse of dimensionality", studies have demonstrated the successful application of classifier ensemble techniques to hyperspectral classification. Ensemble learning is an effective way to build an accurate classification system: it can improve the performance of weak classifiers and make accurate predictions. Boosting and bootstrap aggregating (bagging) are the main ensemble learning methods. Diversity is considered a very important property of a classifier combination; it can be used to reduce the variance error of an ensemble method without increasing its bias error. To encourage diversity in bagging, Tin Kam Ho of Bell Laboratories proposed the Random Forest (RFs) algorithm in 1995. In 2005, Jisoo Ham applied RFs to remote sensing image classification for the first time and achieved satisfactory results. RFs are a combination of tree predictors in which each decision tree is built from training samples drawn with replacement; they randomly sample the attributes and choose the best split among those variables rather than among all attributes. RFs have important advantages: efficient operation on large databases, the ability to handle thousands of input variables without variable deletion, low time cost, and so on.
In image processing, Juan J. Rodríguez drew on the idea of RFs and proposed the Rotation Forest (RoF) method, which aims to build more accurate and more diverse base classifiers. The method randomly partitions the feature space into several subspaces, applies a feature transformation to each subspace separately, and repeats the process, generating different training data sets and base classifiers for different feature subspaces. The thesis "Research on classifier ensemble algorithms based on rotation forest" showed that the rotation forest algorithm outperforms bagging, Adaptive Boosting (AdaBoost), and RFs.
In summary, the rotation forest uses a feature extraction algorithm to generate a sparse rotation matrix and projects the original image into different coordinate systems, so the constructed base classifiers differ strongly from one another. RoF therefore performs better in image classification than bagging, AdaBoost, RFs, and other algorithms. However, because the RoF algorithm gives all training samples the same weight, the potential of samples that provide important information is ignored. Furthermore, these algorithms generate their base classifiers independently of one another; some of those base classifiers not only increase the computational complexity of the algorithm but also reduce its ensemble performance.
Disclosure of Invention
The purpose of the invention is to provide, against the defects of the prior art, a weight-based rotation forest method for classifying hyperspectral images with better ensemble performance. The method mines and weights important samples and adaptively guides the growth of the trees in the weighted rotation forest, thereby improving the classification accuracy of hyperspectral images.
The weight-based rotation forest method of the invention for classifying hyperspectral images comprises the following steps:
(1) Obtain samples and divide them into a training set and a test set: acquire hyperspectral image samples of size M×F through field collection or from a remote sensing database, where M denotes the number of samples, F the number of features per sample, and C the number of sample classes; then randomly draw N samples from the M samples as the training set S and use the remaining samples as the test set E; S = {(x_1,y_1),(x_2,y_2),…,(x_N,y_N)}, where x_i denotes the i-th sample of the training set S and is a 1×F vector, and y_i denotes the label of sample x_i, y_i ∈ {1,2,…,C};
(2) Initialize the sample weights in the training set S: let W(x_i) denote the initial weight of sample x_i; initialize the weight of each sample in S as W(x_i) = 1/N, i = 1,2,…,N;
(3) Generate the weighted training set S': multiply the N initialized training sample weights W(x_i) by the corresponding samples x_i of the training set S to obtain the weighted training set S' = {(W(x_1)·x_1, y_1), (W(x_2)·x_2, y_2), …, (W(x_N)·x_N, y_N)};
(4) Establish the weight-based rotation forest model: assume the weight-based rotation forest model consists of T decision-tree base classifiers; let t denote the index of a base classifier, t = 1,2,…,T; the T base classifiers are arranged in order and trained in that order. Sample the weighted training set S' N times with replacement to obtain a diversity training sample set S_t, each sample of which is a 1×F vector; randomly partition the F features of S_t into K subsets, forming the feature subsets F_t,k, k = 1,2,…,K; from S_t, select the columns corresponding to the features contained in each F_t,k, forming K diversity training sample subsets S_t,k; apply the principal component analysis (PCA) algorithm to each S_t,k to extract features and obtain the rotation matrix R_t^a; multiply S_t by R_t^a to obtain the rotated diversity training sample set S'_t = S_t · R_t^a; train a decision-tree base classifier with S'_t, the t-th trained base classifier being denoted ξ_t, t = 1,2,…,T; the T trained decision-tree base classifiers together form the weight-based rotation forest model for the hyperspectral image;
(5) Generate the classification result: feed each sample of the test set E into the T trained decision-tree base classifiers of the weight-based rotation forest model to obtain T classification results; the class receiving the most of the T results is the classification result of the weight-based rotation forest model for the hyperspectral image.
The method solves the problems that the potential of samples providing important information is ignored and that generating each base classifier independently increases algorithmic complexity and reduces ensemble performance; it improves the classification accuracy of hyperspectral image samples.
Compared with the prior art, the invention has the following advantages:
Improved classification accuracy: against the problem that the RoF algorithm treats all training samples as equal and ignores the potential of samples containing important information, the invention mines that potential and weights the samples by designing a dynamic weighting function. The larger a sample's weight, the more important the sample and the more attention the decision-tree base classifier gives it, which improves the classification accuracy of hyperspectral image samples.
Improved ensemble performance: against the problem that the base classifiers of the RoF algorithm are mutually independent, so that some of them increase the computational complexity of the algorithm and reduce its ensemble performance, the invention weights the samples before training each decision-tree base classifier, with weights computed from the classification results of the already-generated base classifiers on the training samples; the base classifiers are thereby linked to one another, which improves the ensemble performance of the model.
Drawings
FIG. 1 is a block flow diagram of the present invention;
FIG. 2 is a block diagram of a process for building a weight-based rotating forest model according to the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and examples.
Example 1
The hyperspectral image contains rich spectral information and can effectively reflect the properties of the imaged target, so it is widely used in fields such as precision agriculture, environmental monitoring, and military reconnaissance. In these applications, hyperspectral image classification is one of the important links, and the final goal of classification is to give each pixel in an image a unique and accurate class identifier. The RoF algorithm is one of many classification algorithms: it uses a feature extraction algorithm to generate a sparse rotation matrix and projects the original image into different coordinate systems, so the constructed base classifiers differ strongly from one another. Compared with algorithms such as bagging, AdaBoost, and RFs, the RoF algorithm can improve the classification accuracy of hyperspectral image samples. But RoF gives all training samples the same weight, ignoring the potential of samples that provide important information. Furthermore, these algorithms generate their base classifiers independently of one another; some of those base classifiers not only increase the computational complexity of the algorithm but also reduce its ensemble performance. Against this state of the art, the invention develops improvements and proposes a weight-based rotation forest method for classifying hyperspectral images.
The weight-based rotation forest method of the invention for classifying hyperspectral images comprises the following steps:
(1) Obtain samples and divide them into a training set and a test set: hyperspectral image samples of size M×F can be obtained, according to the images to be classified, through field collection or from a remote sensing database, where M denotes the number of samples, F the number of features per sample, and C the number of classes in the image samples. Then randomly draw N samples from the M samples as the training set S and use the remaining samples as the test set E; S = {(x_1,y_1),(x_2,y_2),…,(x_N,y_N)}, where x_i denotes the i-th sample of the training set S and is a 1×F vector, and y_i denotes the label of sample x_i, i = 1,2,…,N, y_i ∈ {1,2,…,C}.
If the dataset used is the Pavia University scene in northern Italy, the hyperspectral image samples to be classified may be collected in the field or obtained from a remote sensing database.
(2) Initialize the sample weights in the training set S: let W(x_i) denote the initial weight of sample x_i; initialize the weight of each sample in S as W(x_i) = 1/N, i = 1,2,…,N; that is, the initial weight of each sample in the training set S of the invention is 1/N.
(3) Generate the weighted training set S': multiply the N initialized training sample weights W(x_i) by the corresponding samples x_i of the training set S, i = 1,2,…,N, to obtain the weighted training set

S' = {(W(x_1)·x_1, y_1), (W(x_2)·x_2, y_2), …, (W(x_N)·x_N, y_N)},

where (W(x_1)·x_1, y_1) is the first weighted sample, and so on.
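A minimal sketch of steps (2)-(3) in NumPy, assuming the training samples are stored row-wise in an N×F array; the variable names are illustrative and do not come from the patent:

```python
import numpy as np

def init_weights(n_samples):
    # Step (2): every training sample starts with weight 1/N.
    return np.full(n_samples, 1.0 / n_samples)

def weight_training_set(X, w):
    # Step (3): scale each row x_i by its weight W(x_i); labels are unchanged.
    return X * w[:, None]

# Usage with the dimensions of the Pavia example: N = 90, F = 103.
rng = np.random.default_rng(0)
X = rng.random((90, 103))
w = init_weights(len(X))
X_weighted = weight_training_set(X, w)
```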
(4) Establish the weight-based rotation forest model: assume the weight-based rotation forest model consists of T decision-tree base classifiers; let t denote the index of a base classifier, t = 1,2,…,T, and train the classifiers in that order. Sample the weighted training set S' N times with replacement to obtain a diversity training sample set S_t, each sample of which is a 1×F vector. Randomly partition the F features of S_t into K subsets, forming the feature subsets F_t,k, k = 1,2,…,K. From S_t, select the columns corresponding to the features contained in each F_t,k, forming K diversity training sample subsets S_t,k. Apply the principal component analysis (PCA) algorithm to each S_t,k to extract features and obtain the rotation matrix R_t^a. Multiply S_t by R_t^a to obtain the rotated diversity training sample set S'_t. Train a decision-tree base classifier on S'_t; the t-th trained base classifier is denoted ξ_t, t = 1,2,…,T. The T trained decision-tree base classifiers together form the weight-based rotation forest model for the hyperspectral image. Unlike the existing RoF algorithm, in which the base classifiers are generated independently of one another, the present method computes the sample weights from the classification results of the already-generated base classifiers on the training samples, weights the samples, and trains the current base classifier on the weighted training samples, so that all trained base classifiers are linked to one another; the weight-based rotation forest model therefore has better ensemble performance than the existing RoF model.
(5) Generate the classification result: feed each sample of the test set E into the T trained decision-tree base classifiers of the weight-based rotation forest model to obtain T classification results. The class receiving the most of the T results is the classification result of the weight-based rotation forest model, i.e., the classification result of the hyperspectral image samples to be classified.
The RoF algorithm is a typical hyperspectral image classification algorithm at present; it uses a feature extraction algorithm to increase both the difference between and the accuracy of the base classifiers, thereby improving the classification accuracy of hyperspectral image samples. However, RoF ignores the potential of samples providing important information and does not make reasonable use of its multiple base classifiers, so the ensemble performance of the rotation forest model is poor. Against this problem, the invention, through research, proposes an overall technical scheme for weight-based rotation forest classification of hyperspectral images. The method designs a dynamic weighting function, mines the potential of important samples, and assigns weights to the samples; the larger a sample's weight, the more attention the classifier gives it, which improves the classification accuracy of hyperspectral image samples. In addition, because the samples are weighted before each decision-tree base classifier is trained, with weights computed from the classification results of the already-generated base classifiers on the training samples, the base classifiers are interconnected: the classification results of the previously generated base classifiers on the training samples are fed into the base classifier currently being trained, which improves the ensemble performance of the model.
Example 2
The weight-based rotation forest method for classifying hyperspectral images is the same as in Embodiment 1; the weight-based rotation forest model of step (4) is established, referring to FIG. 2, through the following steps:
(4a) Initialize the decision-tree base classifier: introduce the rotation forest model structure, in which the weight-based rotation forest model is assumed to consist of T decision-tree base classifiers; let t denote the index of a base classifier, t = 1,2,…,T; the T base classifiers are arranged in order and trained in that order. Initialize the base-classifier index t = 1 and start the training iterations. After the index is initialized, each iteration generates, in turn, a diversity training sample set, feature subsets, diversity training sample subsets, and a rotated diversity training sample set, and trains a decision-tree base classifier on the rotated set to complete that classifier's training.
(4b) Generate the diversity training sample set S_t: sample the weighted training set S' N times with replacement; the samples drawn form the diversity training sample set S_t = [X_t, Y_t], where X_t denotes the set of samples obtained by the N draws (each sample still a 1×F vector) and Y_t denotes the labels corresponding to all samples in X_t; X_t and Y_t together form S_t.
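A minimal sketch of step (4b), assuming the weighted samples sit in an N×F NumPy array (illustrative names only):

```python
import numpy as np

def bootstrap_sample(Xw, y, rng):
    """Step (4b): draw N samples with replacement from the weighted
    training set S', giving the diversity training sample set (X_t, Y_t)."""
    n = len(Xw)
    idx = rng.integers(0, n, size=n)   # sampling with replacement
    return Xw[idx], y[idx]

rng = np.random.default_rng(0)
Xw = rng.random((90, 103))             # weighted training samples
y = rng.integers(1, 10, size=90)       # labels in {1, ..., 9}
X_t, Y_t = bootstrap_sample(Xw, y, rng)
```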
(4c) Generate the feature subsets F_t,k: randomly partition, without replacement, the F features of the diversity training sample set S_t into K subsets, 1 < K < F, forming the feature subsets F_t,k corresponding to S_t, k = 1,2,…,K, where K denotes the number of feature subsets; assuming each feature subset F_t,k contains P features, each F_t,k is a 1×P vector, with P = F/K.
(4d) Generate the diversity training sample subsets S_t,k: from the diversity training sample set S_t, select the columns corresponding to the features contained in each feature subset F_t,k (i.e., the part of S_t corresponding to F_t,k), forming K diversity training sample subsets S_t,k, each of dimension N×P.
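Steps (4c)-(4d) can be sketched as follows; np.array_split reproduces the uneven split used later in the Pavia example (103 features into 30 subsets). The names are illustrative:

```python
import numpy as np

def split_features(n_features, K, rng):
    # Step (4c): randomly partition the feature indices into K disjoint subsets.
    return np.array_split(rng.permutation(n_features), K)

def sample_subsets(X_t, subsets):
    # Step (4d): take the columns of S_t named by each feature subset F_t,k.
    return [X_t[:, cols] for cols in subsets]

rng = np.random.default_rng(0)
X_t = rng.random((90, 103))
subsets = split_features(X_t.shape[1], 30, rng)
S_tk = sample_subsets(X_t, subsets)
print([s.shape for s in S_tk][:3])     # the first subsets have 4 features each
```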
(4e) Compute the rotation matrix R_t^a: apply the PCA algorithm to each of the K diversity training sample subsets S_t,k to compute a coefficient matrix c_t,k, k = 1,2,…,K, and assemble the coefficient matrices c_t,k into a block-diagonal matrix R_t; finally, rearrange the rows of R_t according to the original order of the F features to obtain the rotation matrix R_t^a of the diversity training sample set S_t.
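A sketch of step (4e) using scikit-learn's PCA: the loading matrices c_t,k are placed on the diagonal of R_t and the rows are then rearranged into the original feature order, which is the standard rotation forest construction. Variable names are illustrative:

```python
import numpy as np
from sklearn.decomposition import PCA

def rotation_matrix(X_t, subsets):
    """Step (4e): fit PCA per column subset of S_t, assemble the block-diagonal
    matrix R_t from the coefficient matrices c_t,k, then reorder its rows into
    the original feature order, giving the F x F rotation matrix R_t^a."""
    F = X_t.shape[1]
    R = np.zeros((F, F))
    pos, order = 0, []
    for cols in subsets:
        p = len(cols)
        c = PCA(n_components=p).fit(X_t[:, cols]).components_.T  # p x p loadings
        R[pos:pos + p, pos:pos + p] = c
        order.extend(cols)          # original feature index of each block row
        pos += p
    R_a = np.empty_like(R)
    R_a[np.array(order)] = R        # rearrange rows into original feature order
    return R_a

rng = np.random.default_rng(0)
X_t = rng.random((90, 103))
subsets = np.array_split(rng.permutation(103), 30)
print(rotation_matrix(X_t, subsets).shape)   # (103, 103)
```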
(4f) Generate the rotated diversity training sample set S'_t: multiply the diversity training sample set S_t by the rotation matrix R_t^a to obtain the rotated diversity training sample set S'_t = S_t · R_t^a.
(4g) Train a decision-tree base classifier with the rotated diversity training sample set S'_t: let ξ_t denote the trained classifier; train the decision-tree base classifier introduced in step (4a) with the rotated set S'_t obtained in step (4f), yielding the trained base classifier ξ_t, which includes the classification results for all samples x_i.
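Steps (4f)-(4g) reduce to a matrix product followed by an ordinary decision-tree fit; the sketch below uses a random orthogonal matrix (via QR) as a stand-in for the PCA-built R_t^a of the previous sketch:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X_t = rng.random((90, 103))                    # diversity training sample set S_t
Y_t = rng.integers(1, 10, size=90)             # corresponding labels
R_a = np.linalg.qr(rng.random((103, 103)))[0]  # stand-in rotation matrix R_t^a

X_rot = X_t @ R_a                              # (4f): S'_t = S_t . R_t^a
xi_t = DecisionTreeClassifier(random_state=0).fit(X_rot, Y_t)   # (4g): train xi_t
print(xi_t.predict(X_rot[:5]))                 # classification results of xi_t
```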
(4h) Update the sample weights W(x_i): let ξ_q denote the base classifiers obtained so far in step (4g), q = 1,2,…,t; compute the weight W(x_i) of each sample x_i from the classification results of the obtained base classifiers ξ_q for all samples x_i.
(4i) Update the weighted training set S': re-weight the training set S with the sample weights W(x_i) computed in step (4h); specifically, multiply each weight W(x_i) by the corresponding sample x_i of S as a new training sample, keeping the label y_i of x_i unchanged, to obtain the updated weighted training set S' = {(W(x_1)·x_1, y_1), (W(x_2)·x_2, y_2), …, (W(x_N)·x_N, y_N)}.
(4j) Update the base-classifier index t: set t = t + 1, return to step (4b), and enter the next iteration of training a decision-tree base classifier.
(4k) Generate the weight-based rotation forest model: repeat steps (4b)-(4j) T times, traversing all decision-tree base classifiers, to obtain T trained base classifiers ξ_t, t = 1,2,…,T; together, the T trained decision-tree base classifiers form the weight-based rotation forest model.
Building on the RoF algorithm, the added dynamic weighting function weights the samples: the larger a sample's weight, the more important the sample and the more attention the next decision-tree base classifier gives it, which effectively improves the classification accuracy of hyperspectral image samples. When the current decision-tree base classifier is trained, its training samples are obtained by multiplying each sample by a weight computed from the classification results of the already-generated base classifiers on the training samples; this establishes links among the decision-tree base classifiers of the weight-based rotation forest model. The finally generated model, whose base classifiers are interconnected, improves both the ensemble performance of the model and the classification accuracy of the hyperspectral image.
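Putting steps (4a)-(4k) together, a compact end-to-end training sketch might look as follows. The weight update uses the assumed exponential-of-error form discussed under Example 3 below, not the patent's exact equations; all names are illustrative:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier

def train_weighted_rof(X, y, T=10, K=10, seed=0):
    """Train T decision trees, each on a bootstrapped, PCA-rotated copy of the
    weighted training set, re-weighting the samples after every round."""
    rng = np.random.default_rng(seed)
    n, F = X.shape
    w = np.full(n, 1.0 / n)                        # (2): initial weights
    ensemble, preds = [], []
    for t in range(T):
        Xw = X * w[:, None]                        # (3)/(4i): weighted samples
        idx = rng.integers(0, n, n)                # (4b): bootstrap
        subsets = np.array_split(rng.permutation(F), K)   # (4c)
        R = np.zeros((F, F)); pos, order = 0, []
        for cols in subsets:                       # (4e): block-diagonal PCA
            p = len(cols)
            R[pos:pos+p, pos:pos+p] = PCA(p).fit(Xw[idx][:, cols]).components_.T
            order.extend(cols); pos += p
        Ra = np.empty_like(R); Ra[np.array(order)] = R
        tree = DecisionTreeClassifier(random_state=t).fit(Xw[idx] @ Ra, y[idx])
        ensemble.append((tree, Ra))                # (4f)-(4g)
        preds.append(tree.predict(Xw @ Ra))        # results on all samples
        err = (np.array(preds) != y[None, :]).mean(axis=0)
        w = np.exp(err); w /= w.sum()              # (4h): ASSUMED weighting form
    return ensemble

rng = np.random.default_rng(1)
model = train_weighted_rof(rng.random((90, 40)), rng.integers(1, 10, 90))
```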
Example 3
The weight-based rotation forest method for classifying hyperspectral images is the same as in Embodiments 1-2; the sample weight W(x_i) of step (4h) is updated as follows:
[Dynamic weighting function — the two defining equations appear only as images in the source: the first aggregates the classification results ξ_q(x_i), q = 1,2,…,t, of the trained base classifiers against the labels Y_t(x_i); the second derives the updated weight W(x_i) from that aggregate.]
where t denotes the index of the currently trained base classifier, q indexes the already-trained base classifiers, q = 1,2,…,t; ξ_q(x_i) denotes the classification result of the q-th trained base classifier ξ_q for sample x_i, and Y_t(x_i) denotes the label corresponding to sample x_i in the diversity training sample set S_t.
The RoF algorithm treats all training samples as equal and ignores samples containing important information. In the present invention, the larger a sample's weight, the more important the sample; the decision-tree base classifier then gives it more attention, which improves the classification accuracy of hyperspectral image samples.
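The exact equations of the weighting function appear only as images in the source. As a hedged illustration consistent with the surrounding description (samples misclassified by the ensemble so far receive larger weights), one possible AdaBoost-style form is sketched below; this specific formula is an assumption, not the patent's equation:

```python
import numpy as np

def update_weights(preds, y):
    """Hypothetical dynamic weighting. preds is a t x N array holding
    xi_q(x_i) for the t base classifiers trained so far; y holds Y_t(x_i).
    ASSUMED form: the weight grows with the ensemble's per-sample error
    rate and is then normalized to sum to 1."""
    err = (preds != y[None, :]).mean(axis=0)   # per-sample error over q = 1..t
    w = np.exp(err)                            # more errors -> larger weight
    return w / w.sum()

preds = np.array([[1, 2, 2, 3],                # classifier xi_1
                  [1, 2, 3, 3]])               # classifier xi_2
y = np.array([1, 2, 3, 3])
print(update_weights(preds, y))                # the misclassified sample weighs most
```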
A concrete example, without experimental result data, is given below to further illustrate the invention.
Example 4
The weight-based rotation forest method for classifying hyperspectral images is the same as in Embodiments 1-3. Referring to FIG. 1, the invention is implemented through the following steps:
step 1: acquiring a hyperspectral image sample and a sample to be detected: and acquiring a hyperspectral image as a hyperspectral image sample to be classified through field acquisition or a remote sensing database.
The hyperspectral image sample of the embodiment is from, but not limited to, the Pavia University hyperspectral data collected by the imaging spectrometer of the reflective optical system. The size of the Pavia University dataset is M × F42776 × 103, that is, the number of samples M42776, the number of features F per sample 103, and the number of classes C of samples 9.
Step 2: divide the hyperspectral image samples into a training set and a test set: randomly draw 10 samples from each class of the Pavia University dataset, giving a total of N = 10 × C = 90 training samples. The 90 training samples and their labels form the training set S, and the remaining 42686 samples and their labels form the test set E. S = {(x_1,y_1),(x_2,y_2),…,(x_N,y_N)}, where x_i denotes the i-th sample of the training set S and is a 1×103 vector, and y_i denotes the label of sample x_i, y_i ∈ {1,2,…,C}.
Step 3: initialize the sample weights in the training set S: let W(x_i) denote the initial weight of sample x_i; initialize the weight of each sample in S as W(x_i) = 1/N, i = 1,2,…,N, where N denotes the number of samples of the training set S.
In this embodiment, since 90 samples were randomly drawn as training samples in step 2, the number of samples in the training set S is N = 90, and the weight of each sample in S is initialized as W(x_i) = 1/90, i = 1,2,…,90.
Step 4: generate the weighted training set S': multiply the N initialized training sample weights W(x_i) obtained in step 3 by the corresponding samples x_i of the training set S to obtain the weighted training set S' = {(W(x_1)·x_1, y_1), (W(x_2)·x_2, y_2), …, (W(x_N)·x_N, y_N)}.

In this embodiment, the N initialized weights W(x_i) = 1/90 obtained in step 3 are multiplied by the corresponding samples x_i of the training set S, giving S' = {(W(x_1)·x_1, y_1), …, (W(x_90)·x_90, y_90)} = {(x_1/90, y_1), (x_2/90, y_2), …, (x_90/90, y_90)}; the number of samples in the weighted training set S' is still 90.
Step 5: establish the weight-based rotation forest model: referring to FIG. 2, assume the weight-based rotation forest model consists of T decision-tree base classifiers; let t denote the index of a base classifier, t = 1,2,…,T; the T decision-tree base classifiers are linked to one another.
In this embodiment, T is set to 50; that is, the weight-based rotation forest model consists of 50 decision-tree base classifiers.
5.1) Initialize the base-classifier index t = 1.
5.2) Generate the diversity training sample set S_t: sample the weighted training set S' N times with replacement; the samples drawn form the diversity training sample set S_t = [X_t, Y_t], where X_t denotes the samples obtained by the N draws (each sample x_i ∈ X_t still a 1×F vector) and Y_t denotes the labels corresponding to all samples in X_t; X_t and Y_t together form S_t.
In this embodiment, since the weighted training set S' contains N = 90 samples, S' is sampled N = 90 times with replacement, and the drawn samples together with their labels form the diversity training sample set S_t.
5.3) Generate the feature subsets F_t,k: randomly partition, without replacement, the F features of the diversity training sample set S_t into K subsets, 1 < K < F, forming the feature subsets F_t,k corresponding to S_t, k = 1,2,…,K, where K denotes the number of feature subsets; assuming each feature subset F_t,k contains P features, each F_t,k is a 1×P vector, with P = F/K.
In this embodiment, with K = 30, the 103 features of the diversity training sample set S_t are randomly partitioned, without replacement, into 30 subsets, forming the feature subsets F_t,k, k = 1,2,…,30. Since 103 is not evenly divisible by 30, the 1st through 13th feature subsets contain 4 features each and the 14th through 30th contain 3 each (13 × 4 + 17 × 3 = 103).
5.4) Generate the diversity training sample subsets S_t,k: from the diversity training sample set S_t, select the columns corresponding to the features contained in each feature subset F_t,k, forming K diversity training sample subsets S_t,k; assuming each F_t,k contains P features, each subset S_t,k has dimension N×P.

In this embodiment, there are K = 30 feature subsets F_t,k, so the columns corresponding to the features of each F_t,k are selected from S_t, forming 30 diversity training sample subsets S_t,k. The 1st through 13th subsets S_t,k each contain N = 90 samples of dimension 1×4, and the 14th through 30th subsets each contain 90 samples of dimension 1×3.
5.5) Compute the rotation matrix R_t^a: apply the PCA algorithm to each of the K diversity training sample subsets S_t,k to compute a coefficient matrix c_t,k, k = 1,2,…,K, and assemble the K coefficient matrices c_t,k into a block-diagonal matrix R_t; finally, rearrange the rows of R_t according to the original order of the F features to obtain the rotation matrix R_t^a of the diversity training sample set S_t.

In this embodiment, the PCA algorithm is applied to each of the K = 30 subsets S_t,k to compute c_t,k, k = 1,2,…,30; the coefficient matrices form the block-diagonal matrix R_t, whose rows are then rearranged according to the original order of the F = 103 features to obtain the rotation matrix R_t^a.
5.6) Generate the rotated diversity training sample set S'_t: multiply the diversity training sample set S_t by the rotation matrix R_t^a to obtain the rotated diversity training sample set S'_t = S_t · R_t^a.
5.7) Train the decision-tree base classifier ξ_t: train the decision-tree base classifier with the rotated diversity training sample set S'_t to obtain the trained base classifier ξ_t.
5.8) Update the sample weights W(x_i): let ξ_q denote the base classifiers obtained so far, q = 1,2,…,t; compute the weight W(x_i) of each sample x_i from the classification results of the obtained base classifiers ξ_q for sample x_i:

[Dynamic weighting function — the two defining equations appear only as images in the source; see Example 3.]

where t denotes the index of the currently trained base classifier, q indexes the already-trained base classifiers, q = 1,2,…,t; ξ_q(x_i) denotes the classification result of the q-th trained base classifier ξ_q for sample x_i, and Y_t(x_i) denotes the label corresponding to sample x_i in the diversity training sample set S_t.
5.9) Update the weighted training set S': re-weight the training set S with the sample weights W(x_i) computed in step 5.8) to obtain the updated weighted training set S' = {(W(x_1)·x_1, y_1), (W(x_2)·x_2, y_2), …, (W(x_N)·x_N, y_N)}.
5.10) Update the base-classifier index t: set t = t + 1, return to step 5.2), generate anew the diversity training sample set, feature subsets, diversity training sample subsets, and rotated diversity training sample set, train the decision-tree base classifier with the rotated set, and enter the next update iteration.
5.11) Generate the weight-based rotation forest model: repeat steps 5.2)-5.10) T times to obtain T trained decision-tree base classifiers ξ_t, t = 1,2,…,T; together they form the weight-based rotation forest model.

In this embodiment, steps 5.2)-5.10) are repeated T = 50 times, yielding 50 trained decision-tree base classifiers ξ_t, t = 1,2,…,50; the 50 trained base classifiers together form the weight-based rotation forest model.
Step 6: generate the classification result: feed each sample of the test set E into the T trained decision-tree base classifiers of the weight-based rotation forest model to obtain T classification results; the class receiving the most of the T results is the classification result of the weight-based rotation forest model, i.e., the classification result of the hyperspectral image samples to be classified.
In this embodiment, the 42686 samples of the test set E are fed into the 50 trained decision-tree base classifiers of the weight-based rotation forest model, yielding 50 classification results per sample; the class receiving the most of the 50 results is the classification result of the weight-based rotation forest model, i.e., the classification result of the hyperspectral image samples to be classified.
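A minimal sketch of the majority vote of step 6, assuming the trained model is a list of (tree, rotation matrix) pairs as in the earlier sketches (illustrative names only):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def predict_majority(ensemble, X):
    """Step 6: every base classifier votes on each test sample after the sample
    is projected with that classifier's rotation matrix; the class with the
    most votes among the T results is the final prediction."""
    votes = np.stack([tree.predict(X @ R) for tree, R in ensemble])  # T x n
    return np.array([np.bincount(col).argmax() for col in votes.T.astype(int)])

# Toy usage: two base classifiers on random data standing in for Pavia.
rng = np.random.default_rng(0)
X, y = rng.random((90, 103)), rng.integers(1, 10, size=90)
ensemble = []
for t in range(2):
    R = np.linalg.qr(rng.random((103, 103)))[0]   # stand-in rotation matrix
    ensemble.append((DecisionTreeClassifier(random_state=t).fit(X @ R, y), R))
print(predict_majority(ensemble, X)[:10])
```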
The invention mainly solves the problems that the prior art ignores samples providing important information and that the ensemble performance of the classification model is low. The implementation scheme is: acquire hyperspectral image samples and divide them into a training set and a test set; initialize the weight of each training sample; multiply the initialized weights by the corresponding training samples to obtain a weighted training set; train the decision-tree base classifier with the weighted training set; compute new sample weights from the trained base classifiers' classification results on the training samples via the designed dynamic weighting function, re-weight the training set, and train the next base classifier; the T trained decision trees together form the weight-based rotation forest model; feed each sample of the test set into the model to obtain the final classification result of the hyperspectral image samples. Through the weight-based rotation forest model, the method mines the potential of samples with important information, improves the ensemble performance of the trained decision-tree base classifiers, and can be used for land-cover classification of hyperspectral images.
The effects of the present invention can be further illustrated by the following tests:
example 5
The weight-based rotation forest method for classifying hyperspectral images is the same as in Embodiments 1-4.
Test conditions and contents:
In this example, 4 tests were performed in total: 10, 20, 30, and 40 samples were randomly drawn from each class of the Pavia University dataset; since the dataset has C = 9 classes, the training set sizes of the 4 experiments are 90, 180, 270, and 360. The original random forest algorithm, the rotation forest algorithm, and the algorithm of the invention were each used to classify the Pavia University data, and the average accuracy was recorded; the results are shown in Table 1.
Test results and analysis:
TABLE 1 Average classification accuracy (%) of the original random forest algorithm, the rotation forest algorithm, and the algorithm of the invention

Training set S size   Original random forest   Rotation forest   The invention
90                    71.71                    75.83             77.49
180                   75.07                    78.76             81.35
270                   80.04                    83.63             86.76
360                   80.54                    85.11             88.14
With training set S sizes of 90, 180, 270, and 360, the average classification accuracies of the original random forest algorithm, the rotation forest algorithm, and the method of the invention were compared, giving the results in Table 1. As Table 1 shows, for every training set size the average accuracy of the method of the invention exceeds that of the original random forest and rotation forest algorithms, and it is highest when the training set size is 360. The experiments confirm that, for training set sizes of 90, 180, 270, and 360, the method achieves higher average classification accuracy than the existing random forest and rotation forest algorithms and has good practical effect.
In short, the weight-based rotation forest method of the invention for classifying hyperspectral images solves the low classification accuracy of hyperspectral images and the low ensemble performance of classification models in the prior art. The implementation scheme is: acquire hyperspectral image samples and divide them into a training set and a test set; initialize the training-sample weights and multiply them by the corresponding training samples to obtain a weighted training set; train a decision-tree base classifier and obtain its classification results on the weighted training set; design a dynamic weighting function that turns those classification results into cyclically updated sample weights; repeat the weighting and training process T times to obtain T trained decision-tree base classifiers that together form the weight-based rotation forest model; and feed the test set into the model to obtain the final classification result of the hyperspectral image samples. The method mines samples containing important information by designing a dynamic weighting function and feeds the classification results of the already-generated base classifiers on the weighted training set into the training of the current base classifier.
The foregoing description is only an example of the invention and is not intended to limit it; it will be apparent to those skilled in the art that various changes and modifications in form and detail may be made without departing from the spirit and scope of the invention.

Claims (3)

1. A weight-based rotation forest method for classifying hyperspectral images, comprising the following steps:
(1) Obtain samples and divide them into a training set and a test set: acquire hyperspectral image samples of size M×F through field collection or from a remote sensing database, where M denotes the number of samples, F the number of features per sample, and C the number of sample classes; then randomly draw N samples from the M samples as the training set S and use the remaining samples as the test set E; S = {(x_1,y_1),(x_2,y_2),…,(x_N,y_N)}, where x_i denotes the i-th sample of the training set S and is a 1×F vector, and y_i denotes the label of sample x_i, y_i ∈ {1,2,…,C};
(2) Initialize the weights of the samples in the training set S: let W(x_i) denote the initial weight of sample x_i; initialize the weight of each sample in S as W(x_i) = 1/N, i = 1,2,…,N;
(3) Generate the weighted training set S': multiply the N initialized training sample weights W(x_i) by the corresponding samples x_i of the training set S to obtain the weighted training set S' = {(W(x_1)·x_1, y_1), (W(x_2)·x_2, y_2), …, (W(x_N)·x_N, y_N)};
(4) Establish the weight-based rotation forest model: assume the weight-based rotation forest model consists of T decision-tree base classifiers; let t denote the index of a base classifier, t = 1,2,…,T; the T base classifiers are arranged in order and trained in that order; sample the weighted training set S' N times with replacement to obtain a diversity training sample set S_t, each sample of which is a 1×F vector; randomly partition the F features of S_t into K subsets, forming the feature subsets F_t,k, k = 1,2,…,K; from S_t, select the columns corresponding to the features contained in each F_t,k, forming K diversity training sample subsets S_t,k; apply the principal component analysis (PCA) algorithm to each S_t,k to extract features and obtain the rotation matrix R_t^a; multiply S_t by R_t^a to obtain the rotated diversity training sample set S'_t = S_t · R_t^a; train a decision-tree base classifier with S'_t, the t-th trained base classifier being denoted ξ_t, t = 1,2,…,T; the T trained decision-tree base classifiers together form the weight-based rotation forest model for the hyperspectral image;
(5) Generate the classification result: feed each sample of the test set E into the T trained decision-tree base classifiers of the weight-based rotation forest model to obtain T classification results; the class receiving the most of the T results is the classification result of the weight-based rotation forest model for the hyperspectral image.
2. The weight-based rotation forest method for classifying hyperspectral images according to claim 1, wherein the weight-based rotation forest model of step (4) is established through the following steps:
(4a) Initialize the decision-tree base classifier: introduce the rotation forest model structure, in which the weight-based rotation forest model is assumed to consist of T decision-tree base classifiers; let t denote the index of a base classifier, t = 1,2,…,T; the T base classifiers are arranged in order and trained in that order; initialize the base-classifier index t = 1 and start the training iterations;
(4b) Generate the diversity training sample set S_t: sample the weighted training set S' N times with replacement; the samples drawn form the diversity training sample set S_t = [X_t, Y_t], where X_t denotes the set of samples obtained by the N draws (each sample still a 1×F vector) and Y_t denotes the labels corresponding to all samples in X_t; X_t and Y_t together form S_t;
(4c) Generate the feature subsets F_t,k: randomly partition, without replacement, the F features of the diversity training sample set S_t into K subsets, 1 < K < F, forming the feature subsets F_t,k corresponding to S_t, k = 1,2,…,K, where K denotes the number of feature subsets; assuming each feature subset F_t,k contains P features, each F_t,k is a 1×P vector, with P = F/K;
(4d) Generate the diversity training sample subsets S_t,k: from the diversity training sample set S_t, select the columns corresponding to the features contained in each feature subset F_t,k, forming K diversity training sample subsets S_t,k, each of dimension N×P;
(4e) Compute the rotation matrix R_t^a: apply the PCA algorithm to each of the K diversity training sample subsets S_t,k to compute a coefficient matrix c_t,k, k = 1,2,…,K, and assemble the coefficient matrices c_t,k into a block-diagonal matrix R_t; finally, rearrange the rows of R_t according to the original order of the F features to obtain the rotation matrix R_t^a of the diversity training sample set S_t;
(4f) Generate the rotated diversity training sample set S'_t: multiply the diversity training sample set S_t by the rotation matrix R_t^a to obtain the rotated diversity training sample set S'_t = S_t · R_t^a;
(4g) Train a decision-tree base classifier with the rotated diversity training sample set S'_t: let ξ_t denote the trained classifier; train the decision-tree base classifier with S'_t to obtain the trained base classifier ξ_t, which includes the classification results for all samples x_i;
(4h) Update the sample weights W(x_i): let ξ_q denote the base classifiers obtained so far in step (4g), q = 1,2,…,t; substitute the classification results of the obtained base classifiers ξ_q for all samples x_i into the designed dynamic weighting function to compute the weight W(x_i) of each sample x_i;
(4i) Update the weighted training set S': re-weight the training set S with the sample weights W(x_i) to obtain the updated weighted training set S' = {(W(x_1)·x_1, y_1), (W(x_2)·x_2, y_2), …, (W(x_N)·x_N, y_N)};
(4j) Update the base-classifier index t: set t = t + 1, return to step (4b), and enter the next iteration of training a decision-tree base classifier;
(4k) Generate the weight-based rotation forest model: repeat steps (4b)-(4j) T times, traversing all decision-tree base classifiers, to obtain T trained base classifiers ξ_t, t = 1,2,…,T; together, the T trained decision-tree base classifiers form the weight-based rotation forest model.
3. The weight-based rotation forest method for classifying hyperspectral images according to claim 2, wherein the sample weight W(x_i) of step (4h) is updated according to the dynamic weighting function:
[Dynamic weighting function — the two defining equations appear only as images in the source: the first aggregates the classification results ξ_q(x_i), q = 1,2,…,t, against the labels Y_t(x_i); the second derives the weight W(x_i) from that aggregate.]
wherein t denotes the index of the currently trained base classifier, q indexes the already-trained base classifiers, q = 1,2,…,t; ξ_q(x_i) denotes the classification result of the q-th trained base classifier ξ_q for sample x_i, and Y_t(x_i) denotes the label corresponding to sample x_i in the diversity training sample set S_t.
CN202011207564.4A 2020-11-03 2020-11-03 Weighting-based classification method for hyperspectral images of rotating forest Pending CN112308151A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011207564.4A CN112308151A (en) 2020-11-03 2020-11-03 Weighting-based classification method for hyperspectral images of rotating forest

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011207564.4A CN112308151A (en) 2020-11-03 2020-11-03 Weighting-based classification method for hyperspectral images of rotating forest

Publications (1)

Publication Number Publication Date
CN112308151A true CN112308151A (en) 2021-02-02

Family

ID=74334055

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011207564.4A Pending CN112308151A (en) 2020-11-03 2020-11-03 Weighting-based classification method for hyperspectral images of rotating forest

Country Status (1)

Country Link
CN (1) CN112308151A (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102073880A (en) * 2011-01-13 2011-05-25 西安电子科技大学 Integration method for face recognition by using sparse representation
CN105844300A (en) * 2016-03-24 2016-08-10 河南师范大学 Optimized classification method and optimized classification device based on random forest algorithm
CN107358142A (en) * 2017-05-15 2017-11-17 西安电子科技大学 Polarimetric SAR Image semisupervised classification method based on random forest composition
CN107766883A (en) * 2017-10-13 2018-03-06 华中师范大学 A kind of optimization random forest classification method and system based on weighted decision tree
CN107943830A (en) * 2017-10-20 2018-04-20 西安电子科技大学 A kind of data classification method suitable for higher-dimension large data sets
CN108038448A (en) * 2017-12-13 2018-05-15 河南理工大学 Semi-supervised random forest Hyperspectral Remote Sensing Imagery Classification method based on weighted entropy
CN111414863A (en) * 2020-03-23 2020-07-14 国家海洋信息中心 Enhanced integrated remote sensing image classification method
CN111680615A (en) * 2020-06-04 2020-09-18 西安电子科技大学 Multi-class unbalanced remote sensing land cover image classification method based on integration interval

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WEI FENG et al.: "Weight-Based Rotation Forest for Hyperspectral Image Classification", IEEE Geoscience and Remote Sensing Letters *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112884067A (en) * 2021-03-15 2021-06-01 中山大学 Hop count matrix recovery method based on decision tree classifier
CN112884067B (en) * 2021-03-15 2023-08-01 中山大学 Hop count matrix recovery method based on decision tree classifier


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210202