CN106203523A

CN106203523A - Hyperspectral image classification based on semi-supervised algorithm fusion with gradient boosting decision trees

Info

Publication number: CN106203523A
Application number: CN201610561589.1A
Authority: CN
Inventors: 张向荣; 焦李成; 张鑫; 冯婕; 白静; 马文萍; 侯彪; 马晶晶
Original assignee: Xidian University
Current assignee: Xidian University
Priority date: 2016-07-17
Filing date: 2016-07-17
Publication date: 2016-12-07
Anticipated expiration: 2036-07-17
Also published as: CN106203523B

Abstract

The present invention proposes a hyperspectral image classification method based on semi-supervised algorithm fusion with gradient boosting decision trees. It addresses the relatively low classification accuracy of existing hyperspectral image classification methods that combine active learning with semi-supervised learning. The steps are: (1) input the hyperspectral image data; (2) extract the sample point features; (3) train the gradient boosting decision tree classifier parameters; (4) classify the sample points in the learning set; (5) assess the confidence of the sample points; (6) screen the sample points by sparse representation; (7) update the labeled training set; (8) output the classification result. The invention uses the classifier predictions and sparse representation to assess the confidence of the unlabeled sample points, divides them into two sets according to that confidence, and processes the two sets differently. This improves classification accuracy while reducing the burden of manual labeling, and the method can be used in fields such as geological survey and atmospheric pollution.

Description

Hyperspectral image classification based on semi-supervised algorithm fusion with gradient boosting decision trees

Technical field

The invention belongs to the technical field of image processing and relates to a classification method for hyperspectral images, specifically a hyperspectral image classification method based on semi-supervised algorithm fusion with gradient boosting decision trees. It can be used in fields such as geological survey, atmospheric pollution, and military target strikes.

Background art

With the development of optical remote sensing technology, remote sensing imaging has progressed from panchromatic (black-and-white) imaging, color photography, and multispectral scanning imaging to today's hyperspectral remote sensing imaging and ultraspectral imaging. Hyperspectral remote sensing uses contiguous spectral channels with a spectral resolution on the order of 10^-2 λ to image ground objects continuously, acquiring large volumes of ground-object image data with complete spectral information. It acquires the spatial, radiometric, and spectral information of ground objects simultaneously, has the property of "unifying image and spectrum", and thereby facilitates ground-object recognition.

Commonly used hyperspectral image data include the Indian Pine data set and the Kennedy Space Center (KSC) data set, both acquired by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) of the NASA Jet Propulsion Laboratory, and the Botswana data set acquired by the Hyperion spectrometer on NASA's EO-1 satellite.

Hyperspectral ground-object classification relies mainly on the spectral characteristics of ground objects: the spectral shape of each pixel in the hyperspectral image is analyzed, and the class of the pixel is decided from those characteristics. Traditional hyperspectral image classification methods are mainly supervised methods, represented by support vector machines (SVM) and neural networks, and unsupervised methods, represented by fuzzy clustering. Supervised methods need a large number of labeled samples to train a classifier that performs well. In hyperspectral remote sensing image classification, the training data set consists of the sample points on the image that have been given class labels, and those labels are usually assigned manually. However, having human experts label hyperspectral images by hand is time-consuming, laborious, and costly. Unsupervised methods lack prior knowledge and divide the samples into several categories using only the distribution of the spectral features of the ground objects. The result merely separates the categories; it cannot determine what each category is, and it cannot guarantee a correct correspondence between the clusters and the actual ground-object classes.

Against this background, hyperspectral image classification methods based on semi-supervised learning and active learning have attracted wide attention from researchers in China and abroad. Semi-supervised learning trains an initial classifier on a small amount of labeled data and then uses a large amount of unlabeled data to improve that classifier further, which to some extent makes up for the shortcomings of supervised and unsupervised learning. Common semi-supervised classification methods include self-training, co-training, generative probabilistic models, semi-supervised support vector machines (SVM), and graph-based methods. These methods assign class labels to unlabeled data, retrain the classifier on the newly labeled data, and obtain the final classification result. The weakness of semi-supervised learning is that when few samples are available and the model is insufficiently trained, the labels predicted for unlabeled data are inaccurate, and adding mislabeled samples to the training set degrades the classifier.

Active learning aims to select, through a query strategy, the samples that are most valuable to the classification model and to filter out redundant ones, so that a domain expert can label the informative samples using their knowledge and experience. The main task of active learning is to find an efficient query strategy, so that the samples chosen for labeling are few but of high quality; this preserves classification performance while reducing the labeling workload. The query strategies commonly used in active learning at present are: 1) uncertainty-based sampling; 2) query-by-committee sampling, in which several classifiers form a committee that votes on whether to select a sample. In active learning, having an expert label the unlabeled samples ensures that the labels are accurate, but manual labeling is inevitably time-consuming and laborious.

Active learning introduces manually labeled samples by consulting human experts, which guarantees label accuracy. Because manual labeling is time-consuming and laborious, the number of samples that can be labeled by hand is limited. Semi-supervised learning relies on the classifier to predict the labels of unlabeled samples; it adds many samples but cannot guarantee their quality. Given these characteristics, researchers in China and abroad have considered combining the two approaches and have proposed hyperspectral image classification methods that combine active learning with semi-supervised learning, reducing the burden of manual labeling while keeping up the number of newly added labeled samples. For example, Inmaculada Dópido, Jun Li, et al., in the paper "A New Semi-supervised Approach for Hyper-spectral Image Classification With Different Active Learning" (WHISPERS, 2012), disclose a semi-supervised active learning method for hyperspectral image classification. It uses active learning query strategies to screen the unlabeled samples selected during semi-supervised learning and keeps the most informative ones. The method proceeds as follows: with a sparse multinomial logistic regression classifier, compute the maximum a posteriori probability of the unlabeled samples in the neighborhood of the labeled samples; assign class labels to the samples with larger posterior probability and add them to a specific set; apply several query strategies commonly used in active learning to this set and select the samples that contribute most to improving classifier performance; add the selected samples to the labeled sample set and retrain the classifier. This method saves time and labor, but because it has no manual labeling step and relies only on the classifier itself for label prediction, its classification accuracy leaves room for improvement.

Summary of the invention

The object of the present invention is to overcome the above defects of the prior art by proposing a hyperspectral image classification method based on semi-supervised algorithm fusion with gradient boosting decision trees. A gradient boosting decision tree (GBDT) classifier is trained on a small number of labeled sample points, and the unlabeled sample points are screened: those with higher confidence go to semi-supervised learning, and those with greater uncertainty go to active learning. Under the joint action of the expert and the classifier, the hyperspectral image is classified effectively, which solves the technical problem of relatively low classification accuracy in existing hyperspectral image classification methods that combine active learning with semi-supervised learning.

To achieve the above object, the technical solution adopted by the present invention comprises the following steps:

(1) Input a hyperspectral image containing C classes and N sample points. For each sample point, take its neighborhood window and use the maximum of each feature dimension over all sample points in the window as the spatial feature of the central sample point. Concatenate the spectral feature and the spatial feature of the sample point to obtain its spatial-spectral feature vector;

(2) Select a labeled training set, a learning set, and a test set from the input hyperspectral image, as follows:

(2a) From each class of sample points in the input hyperspectral image, randomly select r sample points to obtain the labeled training set X = {x_i}_{i=1}^{n} ∈ R^D, with the corresponding class label set {l_i}_{i=1}^{n}, where n is the total number of labeled training sample points, n = C × r, x_i is the i-th labeled sample point of the labeled training set, l_i is the class label of the i-th labeled training sample point, l_i ∈ {1, 2, ..., C}, R is the real number field, and D is the feature dimension of a sample point;

(2b) From the sample points other than the n selected labeled sample points, randomly select a proportion per1 of the sample points to obtain the learning set Z = {z_q}_{q=1}^{s}, where s is the total number of learning set sample points, s = (N-n) × per1, and z_q is the q-th sample point in the learning set;

(2c) Use the remaining sample points to form the test set Y = {y_j}_{j=1}^{m}, where m is the total number of test set sample points, m = N-n-s, and y_j is the j-th test sample point of the test set;

(3) Using the feature vectors of the sample points in the labeled training set X = {x_i}_{i=1}^{n} and the corresponding class label matrix, train the parameters of the gradient boosting decision tree (GBDT) classifier. The labeled sample points of every two classes are used to train one binary classifier model, so the labeled sample points of C classes yield C × (C-1)/2 binary classifier models;

(4) Input the sample points of the learning set Z = {z_q}_{q=1}^{s} into the binary classifier models obtained, giving the predicted class label k of each sample point in the learning set Z;

(5) According to the predicted class label k of each sample point z_q in the learning set Z = {z_q}_{q=1}^{s}, determine whether the number of wins P of class label k, counted over the binary classifier models when z_q is assigned to class k, equals C-1. If so, add the sample point to the initially empty set S_semi; otherwise, add it to the initially empty set S_act. Judging all sample points in the learning set Z one by one gives the sets S_semi = {z_q1}_{q1=1}^{s'} and S_act = {z_q2}_{q2=1}^{s''}, where z_q1 is a sample point in S_semi, z_q2 is a sample point in S_act, s' is the total number of sample points in S_semi, s'' is the total number of sample points in S_act, and s' + s'' = s;

(6) Use sparse representation to screen the sample points in the sets S_semi and S_act, as follows:

(6a) Construct a dictionary A = [x₁, x₂, …, x_n] from all sample points in the labeled training set X, and use it to sparsely represent the sample points z_q1 in S_semi and z_q2 in S_act: z_q1 = Aα₁, z_q2 = Aα₂, where α₁ and α₂ are sparse representation coefficient vectors;

(6b) Use the orthogonal matching pursuit (OMP) algorithm to obtain the sparse representation coefficient vectors of z_q1 and z_q2: α₁ = argmin ||z_q1 - Aα₁||₂ + λ||α₁||₁ and α₂ = argmin ||z_q2 - Aα₂||₂ + λ||α₂||₁, where ||·||₂ is the l₂ norm, which measures the data reconstruction error; ||·||₁ is the l₁ norm, which enforces the sparsity of α₁ and α₂; and λ is the balance factor between the reconstruction error term and the sparsity term;

(6c) Take the class labels l_i ∈ {1, 2, ..., C} of the labeled sample points corresponding to the nonzero entries of α₁ and α₂. From S_semi, screen out the sample points z_q1 whose predicted class label k is the same as l_i, and assign the class label l_i to all of them. From S_act, screen out the sample points z_q2 whose predicted class label k differs from l_i, and hand all of them to an expert for manual labeling;

(7) Add the sample points z_q1 of S_semi that were assigned class label l_i and the manually labeled sample points z_q2 of S_act to the labeled training set X, retrain the classifier parameters, and obtain a new classifier model;

(8) Iterate steps (3) to (7) until the set number of iterations is reached, then use the final classifier model to classify the sample points in the test set Y = {y_j}_{j=1}^{m}, obtaining the classification result {l'_j}_{j=1}^{m} of the test set, where l'_j is the class label assigned to the j-th test sample point.

Compared with the prior art, the present invention has the following advantages:

1. The present invention uses the classifier predictions and sparse representation to assess the confidence of the unlabeled sample points, divides them into two sets according to that confidence, and processes each set according to its characteristics. Compared with existing hyperspectral image classification methods that combine active learning with semi-supervised learning, this effectively improves the accuracy of image classification.

2. The present invention updates the labeled training set with both manually labeled sample points and unlabeled sample points whose labels are predicted by the classifier, so that labeled and unlabeled sample points are both used to train the classifier. This effectively reduces the number of labeled sample points required and reduces the burden of manual labeling while maintaining classification accuracy.

Brief description of the drawings

Fig. 1 is a flow chart of the implementation of the present invention;

Fig. 2 is a simulation comparison of the classification accuracy of the present invention and the prior art for different numbers of labeled training sample points.

Detailed description of the invention

The invention is further described below with reference to the drawings and embodiments.

Referring to Fig. 1, the specific implementation steps of the present invention are as follows:

Step 1, input the hyperspectral image data:

Input a hyperspectral image and remove the background sample points; N sample points remain, covering C classes.

Step 2, extract the spatial-spectral features of the sample points, as follows:

Step 2a, use the spectral characteristic parameters of each sample point in every band as the spectral feature vector of that sample point; the original feature dimension of a sample point is d.

Step 2b, for each sample point, take its neighborhood window of size c × c and use the maximum of each feature dimension over all sample points in the window as the spatial feature of the central sample point; its feature dimension is d.

Step 2c, concatenate the spectral feature and the spatial feature of the sample point to obtain its final feature vector of dimension D, D = 2 × d.
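As an illustration of steps 2a to 2c, the following is a minimal NumPy sketch rather than the implementation used in the patent. The function name `spatial_spectral_features`, the (H, W, d) array layout, the requirement that c be odd, and the edge padding at the image borders are all assumptions; the text does not say how border pixels are handled.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def spatial_spectral_features(cube, c):
    """cube: (H, W, d) hyperspectral image; c: odd window size.

    Returns an (H, W, 2*d) array: spectral feature followed by spatial feature.
    """
    half = c // 2
    # Assumption: replicate edge pixels so border points also get a c x c window.
    padded = np.pad(cube, ((half, half), (half, half), (0, 0)), mode="edge")
    # windows has shape (H, W, d, c, c): one c x c neighborhood per pixel and band.
    windows = sliding_window_view(padded, (c, c), axis=(0, 1))
    # Step 2b: per-band maximum over the neighborhood window.
    spatial = windows.max(axis=(-2, -1))
    # Step 2c: concatenate spectral and spatial features, D = 2 * d.
    return np.concatenate([cube, spatial], axis=-1)
```

Reshaping the result to (H × W, 2 × d) gives one D-dimensional feature vector per sample point.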

Step 3, select the labeled training set X, the test set Y, and the learning set Z from the input hyperspectral image, as follows:

Step 3a, from each class of sample points in the input hyperspectral image, randomly select r sample points to form the labeled training set X = {x_i}_{i=1}^{n} ∈ R^D, with the corresponding class label set {l_i}_{i=1}^{n}, where n is the total number of labeled training sample points, n = C × r, x_i is the i-th labeled sample point of the labeled training set, l_i is the class label of the i-th labeled training sample point, l_i ∈ {1, 2, ..., C}, and R is the real number field;

Step 3b, from the sample points other than the n selected labeled sample points, randomly select a proportion per1 of the sample points to form the learning set Z = {z_q}_{q=1}^{s}, where s is the total number of learning set sample points, s = (N-n) × per1, and z_q is the q-th sample point in the learning set;

Step 3c, use the remaining sample points to form the test set Y = {y_j}_{j=1}^{m}, where m is the total number of test set sample points, m = N-n-s, and y_j is the j-th test sample point of the test set;
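The split in steps 3a to 3c can be sketched with the Python standard library as follows. The function name `split_dataset`, the fixed random seed, and rounding s down to an integer are assumptions made for illustration; the text only gives s = (N-n) × per1.

```python
import random


def split_dataset(labels, r, per1, seed=0):
    """labels: list of class labels, one per sample point.

    Returns three sorted index lists: (training, learning, test).
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    # Step 3a: r randomly chosen labeled points per class, n = C * r.
    training = []
    for lab in sorted(by_class):
        training.extend(rng.sample(by_class[lab], r))
    chosen = set(training)
    rest = [i for i in range(len(labels)) if i not in chosen]
    rng.shuffle(rest)
    # Step 3b: s = (N - n) * per1, rounded down (assumption).
    s = int(len(rest) * per1)
    # Step 3c: everything left over is the test set, m = N - n - s.
    return sorted(training), sorted(rest[:s]), sorted(rest[s:])
```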

Step 4, train the gradient boosting decision tree (GBDT) classifier parameters and predict the class labels of the sample points in the learning set, as follows:

Step 4a, input the feature vectors of the sample points in the labeled training set X = {x_i}_{i=1}^{n} and the corresponding class label matrix into the GBDT classifier to train the classifier parameters;

Step 4b, input the feature vectors of the sample points in the learning set Z = {z_q}_{q=1}^{s} into the classifier model obtained, giving the class label k of each sample point z_q;
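The text states that every pair of classes gets its own binary model but does not spell out how the pairwise outputs are combined into the single predicted label k. The sketch below enumerates the class pairs and, as an assumption, combines the pairwise winners by majority vote, a common choice for one-vs-one schemes. The names `one_vs_one_pairs` and `vote` and the tie-breaking rule are hypothetical.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_pairs(num_classes):
    """All class pairs (a, b) with a < b; there are C * (C - 1) / 2 of them."""
    return list(combinations(range(1, num_classes + 1), 2))


def vote(pair_winners):
    """pair_winners maps each class pair (a, b) to the class its binary model chose.

    Returns the class with the most pairwise wins.
    Assumption: ties are broken in favor of the smaller class label.
    """
    counts = Counter(pair_winners.values())
    return min(counts, key=lambda cls: (-counts[cls], cls))
```

For the 9 classes used in the simulation, `one_vs_one_pairs(9)` yields 36 pairs, matching C × (C-1)/2.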

Step 5, divide the sample points into two sets according to the confidence of the sample points in the learning set, as follows:

Step 5a, classify the sample z_q with a binary classifier obtained above to get the prediction values score(k) and score(t), where the binary classifier is trained on the labeled sample points of class k and class t, k ∈ {1, 2, ..., C}, t ∈ {1, 2, ..., C}, k ≠ t, and score(k) and score(t) are the classifier's prediction values for z_q belonging to class k and class t, respectively;

Step 5b, the number of wins P of class k for sample z_q over these binary classifiers is

P = Σ_{t=1, t≠k}^{C} I(score(k) > score(t))

where the indicator function I(F) = 1 if F is true and I(F) = 0 otherwise, with F = score(k) > score(t).

Step 5c, if P = C-1, the confidence that the true class label of sample z_q is k is high. The main purpose of semi-supervised learning is to find unlabeled sample points that are easy to label and have high confidence, predict their class labels with the classifier model, and add them to the labeled training set. Therefore z_q is put into the initially empty set S_semi, giving S_semi = {z_q1}_{q1=1}^{s'}, where z_q1 is a sample point in S_semi and s' is the total number of sample points in S_semi;

Step 5d, if P ≠ C-1, the confidence that the true class label of sample z_q is k is low. In active learning, the samples that are hard to classify and rich in information are screened out for manual labeling. Therefore z_q is put into the set S_act, giving S_act = {z_q2}_{q2=1}^{s''}, where z_q2 is a sample point in S_act and s'' is the total number of sample points in S_act;
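A minimal sketch of the win count P and the resulting partition into S_semi and S_act is given below. The data layout is an assumption: `pair_scores[(a, b)]` holds the two prediction values produced by the binary model trained on classes a < b, and each learning sample is a tuple of an identifier, its predicted class k, and its pairwise scores. The function names are hypothetical.

```python
def win_count(pair_scores, k, num_classes):
    """P = number of classes t != k with score(k) > score(t).

    pair_scores[(a, b)] = (score_a, score_b) from the binary model for classes a < b.
    """
    wins = 0
    for t in range(1, num_classes + 1):
        if t == k:
            continue
        a, b = min(k, t), max(k, t)
        score_a, score_b = pair_scores[(a, b)]
        score_k, score_t = (score_a, score_b) if k == a else (score_b, score_a)
        wins += int(score_k > score_t)
    return wins


def partition(samples, num_classes):
    """samples: list of (sample_id, predicted_class_k, pair_scores).

    Returns (s_semi, s_act) as lists of sample ids.
    """
    s_semi, s_act = [], []
    for sample_id, k, pair_scores in samples:
        if win_count(pair_scores, k, num_classes) == num_classes - 1:
            s_semi.append(sample_id)   # step 5c: class k won every pairwise contest
        else:
            s_act.append(sample_id)    # step 5d: at least one contest was lost
    return s_semi, s_act
```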

Step 6, sparsely represent the sample points in the sets S_semi and S_act, as follows:

Step 6a, construct the dictionary A = [x₁, x₂, …, x_n], where x₁, x₂, …, x_n are the sample points in the labeled training set and n is the total number of labeled training sample points; the feature dimension of a sample point is D, so the dictionary has size D × n;

Step 6b, sparsely represent the sample points z_q1 in S_semi and z_q2 in S_act, giving the sparse representation formulas z_q1 = Aα₁ and z_q2 = Aα₂;

Step 6c, use the orthogonal matching pursuit (OMP) algorithm to obtain the sparse representation coefficient vectors of z_q1 and z_q2: α₁ = argmin ||z_q1 - Aα₁||₂ + λ||α₁||₁ and α₂ = argmin ||z_q2 - Aα₂||₂ + λ||α₂||₁, where ||·||₂ is the l₂ norm, which measures the data reconstruction error; ||·||₁ is the l₁ norm, which enforces the sparsity of α₁ and α₂; and λ is the balance factor between the reconstruction error term and the sparsity term. This is implemented as follows:

Step 6c1, initialize the residual r⁽⁰⁾ = z_q, the index set Λ⁽⁰⁾ as a K-dimensional zero vector, and the iteration counter J = 1;

Step 6c2, find the index λ of the column x_j of the dictionary A whose inner product with the residual r^(J-1) is largest, λ = argmax_{j=1,…,n} |⟨r^(J-1), x_j⟩|;

Step 6c3, update the index set Λ^(J) with Λ^(J)(J) = λ and, according to the index set, select the corresponding atom columns from the dictionary A to form A^(J) = A(:, Λ^(J)(1:J));

Step 6c4, use the least squares method to obtain the J-th order approximation α^(J) = argmin ||z_q - A^(J)α||₂;

Step 6c5, update the residual r^(J) = z_q - A^(J)α^(J) and set J = J+1;

Step 6c6, check whether J is greater than K; if so, the iteration ends; otherwise, return to step 6c2.

Here z_q denotes a sample point in S_semi or S_act, and α is its sparse representation coefficient vector;
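Steps 6c1 to 6c6 can be sketched in NumPy as follows. This is an illustrative reading, not the patent's MATLAB code: the function name `omp` is hypothetical, K is taken to be the number of atoms to select, the absolute value of the inner product is used for atom selection as in standard OMP, and an early exit is added as a guard when the same atom would be picked twice. Standard OMP also assumes dictionary columns of comparable norm, which the text does not discuss.

```python
import numpy as np


def omp(A, z, K):
    """A: (D, n) dictionary whose columns are labeled sample points.
    z: (D,) sample point to represent. K: number of atoms to select.

    Returns the length-n sparse coefficient vector alpha.
    """
    z = np.asarray(z, dtype=float)
    residual = z.copy()                    # step 6c1: r^(0) = z_q
    support = []                           # index set, filled one entry per iteration
    coef = np.zeros(0)
    for _ in range(K):
        # Step 6c2: atom with the largest absolute inner product with the residual.
        lam = int(np.argmax(np.abs(A.T @ residual)))
        if lam in support:
            break                          # guard (assumption): no new atom to add
        support.append(lam)                # step 6c3
        sub = A[:, support]
        # Step 6c4: least squares fit on the selected atoms.
        coef, *_ = np.linalg.lstsq(sub, z, rcond=None)
        residual = z - sub @ coef          # step 6c5
    alpha = np.zeros(A.shape[1])
    if support:
        alpha[support] = coef
    return alpha
```

The nonzero positions of the returned vector are the labeled sample points whose class labels are compared with the predicted label k in step 7.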

Step 7, screen the sample points z_q1 in S_semi and z_q2 in S_act according to the class labels l_i ∈ {1, 2, ..., C} of the labeled sample points corresponding to the positions of the nonzero entries of the sparse representation coefficient vectors α₁ and α₂.

Step 7a, when the j-th dictionary atom x_j in the dictionary A and the q-th sample point z_q in the learning set Z = {z_q}_{q=1}^{s} belong to the same class, the corresponding entry α_ji of α takes the value 1, and it takes the value 0 when they belong to different classes. If the predicted class label k of a sample point z_q1 in S_semi is the same as the class label l_i of the labeled sample point corresponding to the nonzero positions of its sparse coefficient vector α₁, then z_q1 and that labeled sample point belong to the same class, and the class label l_i is assigned to z_q1.

Step 7b, if the predicted class label k of a sample point z_q2 in S_act differs from the class label l_i of the labeled sample point corresponding to the nonzero positions of its sparse coefficient vector α₂, then the class label predicted by the classifier is inconsistent with the class label obtained from the sparse representation. The sample point z_q2 is therefore hard to classify, so it is screened out and handed to an expert for manual labeling.
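The screening rule of steps 7a and 7b is sketched below. The text does not say how to pick a single class when the nonzero coefficients fall on labeled points from several classes; as an assumption, this sketch takes the class with the largest total absolute coefficient. The function names and the tuple layout are hypothetical, and points that pass neither test are simply left out here because the text does not say what happens to them.

```python
def sparse_class(alpha, train_labels):
    """Class suggested by the sparse code alpha (one coefficient per labeled point).

    Assumption: choose the class with the largest total |coefficient|;
    ties go to the smaller class label. Returns None if alpha is all zero.
    """
    weight = {}
    for coef, lab in zip(alpha, train_labels):
        if coef != 0:
            weight[lab] = weight.get(lab, 0.0) + abs(coef)
    return max(sorted(weight), key=weight.get) if weight else None


def screen(s_semi, s_act, train_labels):
    """s_semi, s_act: lists of (sample_id, predicted_class_k, alpha).

    Returns (pseudo_labeled, to_expert):
    pseudo_labeled: (sample_id, label) pairs from S_semi where both labels agree;
    to_expert: sample ids from S_act where the two labels disagree.
    """
    pseudo_labeled = [(sid, k) for sid, k, alpha in s_semi
                      if sparse_class(alpha, train_labels) == k]
    to_expert = [sid for sid, k, alpha in s_act
                 if sparse_class(alpha, train_labels) != k]
    return pseudo_labeled, to_expert
```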

Step 8, add the sample points z_q1 of S_semi that were assigned class labels and the manually labeled sample points z_q2 of S_act to the labeled training set X, then input the feature vectors of the sample points in the new labeled training set X = {x_i}_{i=1}^{n} and the corresponding class label matrix to retrain the classifier parameters, obtaining a new classifier model;

Step 9, output the classification result:

Using the gradient boosting decision tree classifier, first input the feature vectors of the sample points in the new labeled training set X = {x_i}_{i=1}^{n} and the class label set {l_i}_{i=1}^{n} for training; then input the feature vectors of the test samples in the test set Y = {y_j}_{j=1}^{m} and obtain, through the gradient boosting decision tree classifier, the class label matrix {l'_j}_{j=1}^{m} of the test set, where l'_j is the class label assigned to the j-th test sample.

Step 10, compute the classification accuracy:

Compare the predicted class labels with the true class label matrix to obtain the classification accuracy.
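The text does not define the accuracy measure. Assuming it is the overall accuracy (the fraction of test sample points whose predicted label matches the true label), and given that the simulation reports the mean over repeated experiments, the computation can be sketched as follows. Both function names are hypothetical.

```python
def overall_accuracy(true_labels, predicted_labels):
    """Fraction of test sample points whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def mean_accuracy(runs):
    """runs: list of (true_labels, predicted_labels), one pair per repeated experiment."""
    return sum(overall_accuracy(t, p) for t, p in runs) / len(runs)
```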

The technical effect of the present invention is further described below with a simulation experiment.

1. Simulation conditions:

The simulation was run in MATLAB 2014a on a Windows 7 system with an Intel Core(TM) i3-3110M CPU at 2.40 GHz and 4 GB of memory.

2. Simulation content and analysis:

The simulation uses the Indian Pine image acquired over northwestern Indiana in June 1992 by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) of the NASA Jet Propulsion Laboratory. The image is 145 × 145 pixels with 220 bands; after removing the bands affected by noise and by atmospheric and water absorption, 200 bands remain. The image contains 16 classes of ground objects. Because some classes have very few samples, the simulation considers only the 9 classes shown in Table 1, and the whole image is divided into 9 classes.

Table 1. The 9 classes of data in the Indian Pine image

Class	Class name	Number of samples
1	Corn-no till	1434
2	Corn-min	834
3	Grass/Pasture	497
4	Grass/Trees	747
5	Hay-windrowed	489
6	Soybeans-no till	968
7	Soybeans-min	2468
8	Soybean-clean	614
9	Woods	1294

The present invention and the prior art are both used to classify the hyperspectral image Indian Pine. The prior art used for comparison is the semi-supervised active learning method proposed in the paper "A New Semi-supervised Approach for Hyper-spectral Image Classification With Different Active Learning" (WHISPERS, 2012). The present invention, a hyperspectral image classification method that combines active learning with semi-supervised learning and uses the gradient boosting decision tree (GBDT) as the classifier, is abbreviated as SSAc+GBDT.

In the present invention, the number of decision trees of the GBDT classifier is set to 100 and the subsampling ratio to 50%; the window size c × c is set to 15 × 15, and the selection ratio per1 of the learning set is set to 30%.
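As a worked example of the set sizes these settings imply, the sketch below applies n = C × r, s = (N-n) × per1, and m = N-n-s to the class counts in Table 1, whose sum is N = 9345. The function name `set_sizes` is hypothetical, and rounding s down to an integer is an assumption, since the text does not state a rounding rule.

```python
CLASS_COUNTS = [1434, 834, 497, 747, 489, 968, 2468, 614, 1294]  # from Table 1


def set_sizes(class_counts, r, per1_percent):
    """Return (n, s, m) for r labeled points per class and learning ratio per1 (in percent)."""
    total = sum(class_counts)                      # N
    n = len(class_counts) * r                      # n = C * r
    s = (total - n) * per1_percent // 100          # s = (N - n) * per1, rounded down (assumption)
    m = total - n - s                              # m = N - n - s
    return n, s, m
```

With per1 = 30%, r = 5 gives (n, s, m) = (45, 2790, 6510), r = 10 gives (90, 2776, 6479), and r = 15 gives (135, 2763, 6447).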

From the 9 classes of data shown in Table 1, a fixed number of sample points per class is selected as the labeled training set, a certain proportion of the sample points is selected as the learning set, and the remaining sample points form the test set; the sample points in the learning set and the test set are unlabeled. The present invention and the prior art each ran 10 classification experiments on the 9 classes of data, and the mean of the classification results is taken as the final classification accuracy. Fig. 2 compares the simulated classification accuracy of the two methods when the number r of labeled training sample points per class is 5, 10, and 15; the abscissa is the number of labeled training sample points per class and the ordinate is the classification accuracy. Fig. 2 shows that, for each number of labeled sample points per class, the classification accuracy of the present invention is clearly higher than that of the prior art.

In summary, the present invention classifies hyperspectral images by semi-supervised algorithm fusion on the basis of gradient boosting decision trees, makes full use of the structural information of the unlabeled sample points, can reduce the amount of computation, and achieves higher classification accuracy, giving it certain advantages over existing methods.

Claims

1. A hyperspectral image classification method based on semi-supervised algorithm fusion with gradient boosting decision trees, comprising the following steps:

(1) Input a hyperspectral image containing C classes and N sample points. For each sample point, take its neighborhood window and use the maximum of each feature dimension over all sample points in the window as the spatial feature of the central sample point. Concatenate the spectral feature and the spatial feature of the sample point to obtain its spatial-spectral feature vector;

(2a) From each class of sample points in the input hyperspectral image, randomly select r sample points to obtain the labeled training set X = {x_i}_{i=1}^{n} ∈ R^D, with the corresponding class label set {l_i}_{i=1}^{n}, where n is the total number of labeled training sample points, n = C × r, x_i is the i-th labeled sample point of the labeled training set, l_i is the class label of the i-th labeled training sample point, l_i ∈ {1, 2, ..., C}, R is the real number field, and D is the feature dimension of a sample point;

(2b) From the sample points other than the n selected labeled sample points, randomly select a proportion per1 of the sample points to obtain the learning set Z = {z_q}_{q=1}^{s}, where s is the total number of learning set sample points, s = (N-n) × per1, and z_q is the q-th sample point in the learning set;

(2c) Use the remaining sample points to form the test set Y = {y_j}_{j=1}^{m}, where m is the total number of test set sample points, m = N-n-s, and y_j is the j-th test sample point of the test set;

(3) Using the feature vectors of the sample points in the labeled training set X = {x_i}_{i=1}^{n} and the corresponding class label matrix, train the parameters of the gradient boosting decision tree (GBDT) classifier. The labeled sample points of every two classes are used to train one binary classifier model, so the labeled sample points of C classes yield C × (C-1)/2 binary classifier models;

(4) Input the sample points of the learning set Z = {z_q}_{q=1}^{s} into the binary classifier models obtained, giving the predicted class label k of each sample point in the learning set Z;

(5) According to the predicted class label k of each sample point z_q in the learning set Z = {z_q}_{q=1}^{s}, determine whether the number of wins P of class label k, counted over the binary classifier models when z_q is assigned to class k, equals C-1. If so, add the sample point to the initially empty set S_semi; otherwise, add it to the initially empty set S_act. Judging all sample points in the learning set Z one by one gives the sets S_semi = {z_q1}_{q1=1}^{s'} and S_act = {z_q2}_{q2=1}^{s''}, where z_q1 is a sample point in S_semi, z_q2 is a sample point in S_act, s' is the total number of sample points in S_semi, s'' is the total number of sample points in S_act, and s' + s'' = s;

(6) Use sparse representation to screen the sample points in the sets S_semi and S_act, as follows:

(6a) Construct a dictionary A = [x₁, x₂, …, x_n] from all sample points in the labeled training set X, and use it to sparsely represent the sample points z_q1 in S_semi and z_q2 in S_act: z_q1 = Aα₁, z_q2 = Aα₂, where α₁ and α₂ are sparse representation coefficient vectors;

(6c) Take the class labels l_i ∈ {1, 2, ..., C} of the labeled sample points corresponding to the nonzero entries of α₁ and α₂. From S_semi, screen out the sample points z_q1 whose predicted class label k is the same as l_i, and assign the class label l_i to all of them. From S_act, screen out the sample points z_q2 whose predicted class label k differs from l_i, and hand all of them to an expert for manual labeling;

(7) Add the sample points z_q1 of S_semi that were assigned class label l_i and the manually labeled sample points z_q2 of S_act to the labeled training set X, retrain the classifier parameters, and obtain a new classifier model;

(8) Iterate steps (3) to (7) until the set number of iterations is reached, then use the final classifier model to classify the sample points in the test set Y = {y_j}_{j=1}^{m}, obtaining the classification result {l'_j}_{j=1}^{m} of the test set.

2. The hyperspectral image classification method based on semi-supervised algorithm fusion with gradient boosting decision trees according to claim 1, characterized in that the number of wins P of class label k described in step (5) is obtained as follows:

(5a) Classify the sample z_q with the binary classifier model trained on the labeled sample points of class k and class t to obtain the prediction values score(k) and score(t), where k ∈ {1, 2, ..., C}, t ∈ {1, 2, ..., C}, and k ≠ t;

(5b) Use the prediction values score(k) and score(t) to compute, for each sample point z_q, the number of wins P of class k:

P = Σ_{t=1, t≠k}^{C} I(score(k) > score(t))

where the indicator function I(F) = 1 if F is true and I(F) = 0 otherwise, with F = score(k) > score(t).

3. The hyperspectral image classification method based on semi-supervised algorithm fusion with gradient boosting decision trees according to claim 1, characterized in that using the orthogonal matching pursuit (OMP) algorithm described in step (6) to obtain the sparse representation coefficient vectors of the sample points z_q1 and z_q2 is implemented as follows:

(6a) Initialize the residual r⁽⁰⁾ = z_q, the index set Λ⁽⁰⁾ as a K-dimensional zero vector, and the iteration counter J = 1;

(6b) Find the index λ of the column x_j of the dictionary A whose inner product with the residual r^(J-1) is largest, λ = argmax_{j=1,…,n} |⟨r^(J-1), x_j⟩|;

(6c) Update the index set Λ^(J) with Λ^(J)(J) = λ and, according to the index set, select the corresponding atom columns from the dictionary A to form A^(J) = A(:, Λ^(J)(1:J));

(6d) Use the least squares method to obtain the J-th order approximation α^(J) = argmin ||z_q - A^(J)α||₂;

(6e) Update the residual r^(J) = z_q - A^(J)α^(J) and set J = J+1;

(6f) Check whether J is greater than K; if so, the iteration ends; otherwise, return to step (6b).