CN110175195A

CN110175195A - Mixed gas detection model construction method based on extreme random tree

Info

Publication number: CN110175195A
Application number: CN201910329097.3A
Authority: CN
Inventors: 许永辉; 孙超; 赵玺; 杨子萱
Original assignee: Harbin Institute of Technology
Current assignee: Harbin Institute of Technology
Priority date: 2019-04-23
Filing date: 2019-04-23
Publication date: 2019-08-27
Anticipated expiration: 2039-04-23
Also published as: CN110175195B

Abstract

The invention discloses a method for constructing a mixed gas detection model based on extremely randomized trees (the "extreme random tree" of the title). The method comprises: acquiring data from the mixed gas to obtain a data set that includes at least three gas signal time series, computing the optimal warping path of the gas signal time series, and screening the time series using the optimal warping path; extracting gas features from the screened time series using principal component analysis; and building a model with the extremely randomized trees algorithm to classify the target mixed gas. The proposed method substantially improves classification accuracy and time efficiency.

Description

Mixed gas detection model construction method based on extreme random tree

Technical field

The present invention relates to the technical field of machine olfaction, and in particular to a method for constructing a mixed gas detection model based on extremely randomized trees.

Background technique

In the field of mixed gas detection, many researchers have achieved good classification results using algorithms such as the support vector machine (SVM), artificial neural network (ANN), and k-nearest neighbors (KNN). To improve classification accuracy, some researchers proposed an optimized Adaboost.M2 model that fuses multiple classifiers; in drug classification experiments with different fusion rules, the highest recognition accuracy reached 91.75%. Others used a posterior probability estimation algorithm derived from SVM together with machine olfaction to detect 10 bacterial components in human blood; the recognition accuracy was high, but so was the time cost. Other work handled the uncertainty in gas source localization with a probabilistic Bayesian algorithm and used a path planning algorithm based on the Markov decision process to improve the efficiency of gas localization in practice. Applying PCA together with an ANN can improve the discrimination of soil moisture content, but the ANN lacks interpretability, converges slowly, and is inefficient. No algorithm in the prior art brings detection accuracy to 99% or above. Moreover, researchers have not considered the accuracy of the data produced by the gas sensors themselves. PCA, the traditional feature extraction method, is suited to high-dimensional data, so when the dimensionality is low, features must first be constructed. Among classification algorithms, existing approaches do not offer strong resistance to overfitting, fast training, and high classification accuracy at the same time. Finally, no existing patent applies a model based on the extremely randomized trees algorithm, an improvement on random forest, to the problems of mixed gas detection.

Therefore, how to provide a method for constructing a mixed gas detection model based on extremely randomized trees that achieves high detection accuracy is a technical problem that those skilled in the art urgently need to solve.

Summary of the invention

The present invention addresses the low classification accuracy for two kinds of mixed gases, where traditional models such as the support vector machine (SVM) fall short in both classification accuracy and time efficiency. It therefore proposes a method for constructing a mixed gas detection model based on extremely randomized trees that substantially improves classification accuracy and time efficiency. The specific scheme is as follows:

S1, acquire data from the mixed gas to obtain a data set that includes at least three gas signal time series, compute the optimal warping path of the gas signal time series, and screen the gas signal time series using the optimal warping path;

S2, extract gas features from the screened gas signal time series using principal component analysis;

S3, build a model using the extremely randomized trees algorithm and classify the target mixed gas.

Preferably, the optimal warping path of the gas signal time series in S1 is computed as follows:

S11, construct the distance matrix of two gas signal time series. Let the two time series be X = (x₁, x₂, ..., x_m) and Y = (y₁, y₂, ..., y_n), with lengths m and n respectively. D_m×n is the m × n distance matrix constructed from the two time series,

where the element d_ij of D_m×n is the distance between x_i and y_j, computed as:

d_ij = ||x_i − y_j||_w

When w = 2 this is the Euclidean distance (2-norm), with 1 ≤ i ≤ m and 1 ≤ j ≤ n;

S12, find in D_m×n the warping path p_min with the smallest distance, i.e. the optimal warping path:

p_min = {p₁, p₂, ..., p_d, ..., p_k}

max(m, n) ≤ k ≤ m+n+1

where p_d is the accumulated distance of the warping path when the search reaches point d_ij; p_(d+1) is then computed as:

p_(d+1) = p_d + min[d_(i+1)j, d_(i+1)(j+1), d_i(j+1)];

S13, discard the two gas signal time series with the largest p_min; the remaining gas signal time series serve as the input data of step S2.

Preferably, S2 specifically includes:

S21, construct the original features of the gas signal: multidimensional original features of the gas signal are built using the interaction feature method;

S22, reduce the dimensionality of the multidimensional original features using principal component analysis to obtain the original data samples.

Preferably, S3 specifically includes:

S31, in the extremely randomized trees classification model, train each base classifier on all of the original data samples, where the original data set is D, the number of samples is N, and the number of features is M;

S32, generate a decision tree according to the CART algorithm. At each node split, randomly select m features from the M features, randomly place several categories into one branch and the remaining categories into the other branch, compute the best split value for each node, and split on the best attribute; no pruning is performed during splitting. The resulting subsets are split iteratively until a preset value is reached, yielding one decision tree;

S33, repeat steps S31 and S32 K times to generate an extremely randomized trees model consisting of K decision trees;

S34, test the trained extremely randomized trees model; the final classification result is produced by voting.

Compared with the prior art, the present invention has the following advantages:

The invention applies the dynamic time warping (DTW) algorithm, which improves classification accuracy by 26.87%; original feature construction combined with principal component analysis improves classification accuracy by 25.8%; and the extremely randomized trees algorithm addresses the time efficiency problem of the random forest algorithm. The final classification accuracy reaches 99.17%, and the running time is only 103.2568 seconds, a 66.85% improvement in time efficiency over the random forest algorithm. For the mixed gas classification problem, the proposed method substantially improves on the random forest algorithm, raises the classification accuracy of machine olfaction systems, and provides a theoretical basis for algorithms that simulate the olfactory nervous system. With the extremely randomized trees algorithm, predictions are produced by voting, so generalization is stronger; base classifiers are trained on all of the original data samples, so the training results are more precise; and because node splits are chosen at random, randomness is substantially increased.

Detailed description of the invention

To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flow chart of the mixed gas detection model construction method based on extremely randomized trees according to the present invention;

Fig. 2 is a response diagram of the gas data acquired by the sensors;

Fig. 3 shows the dynamic response curves of sensor TGS2602 in the Et_L_Me_H case;

Fig. 4 is the three-dimensional feature plot abstracted by feature engineering;

Fig. 5 is a schematic diagram of the extremely randomized trees algorithm;

Fig. 6 is a schematic diagram of the 10-fold cross-validation accuracy after DTW;

Fig. 7 is a schematic diagram of the cross-validation accuracy after feature construction;

Fig. 8 is a comparison chart of the running times of the algorithm models.

Specific embodiment

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

This embodiment provides a method for constructing a mixed gas detection model based on extremely randomized trees.

S1 dynamic time warping algorithm (DTW)

This embodiment detects mixed gases obtained by mixing ethylene with CH4 and ethylene with CO. Six experiments under each label form the different data sets, where each label denotes one gas mixture class. Each data sampling run lasts 300 seconds. No gas is introduced during the first 60 seconds. At 60 seconds, the mixed gas at the set concentration ratio is introduced into the gas chamber for 180 seconds. No mixed gas is introduced during the final 60 seconds. The sensor array consists of 8 sensors sampled at 50 Hz, and the mixed gas data set is acquired by these 8 sensors. The data sets are stored in time order, and each contains 11 columns: time (s), temperature, humidity (%), and the data acquired by the TGS2600, TGS2612, TGS2611, TGS2610, TGS2602, TGS2602, TGS2620, and TGS2620 sensors. The sensor data, denoted A, represent resistance readings and are converted to a unified value by Rs (KOhm) = 10*(3110−A)/A. Fig. 2 shows the sensor responses for one experiment, taking the Et_H_Me_n case as an example: Et denotes ethylene, H denotes high concentration, Me denotes methane, and n denotes zero concentration. The abscissa is time and the ordinate is the converted sensor reading.
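As a minimal sketch of the conversion above, the following Python applies Rs = 10*(3110−A)/A to a raw reading and derives the number of samples per run from the stated 300-second duration and 50 Hz rate. The function names `raw_to_resistance_kohm` and `samples_per_run` are illustrative assumptions, not part of the original method.

```python
def raw_to_resistance_kohm(a: float) -> float:
    """Convert a raw sensor reading A to Rs in kOhm using Rs = 10 * (3110 - A) / A."""
    if a <= 0:
        raise ValueError("raw reading A must be positive")
    return 10.0 * (3110.0 - a) / a


def samples_per_run(duration_s: float = 300.0, rate_hz: float = 50.0) -> int:
    """Number of samples per sensor for one run at the given duration and sampling rate."""
    return int(duration_s * rate_hz)


if __name__ == "__main__":
    print(raw_to_resistance_kohm(1555.0))  # reading at half of 3110
    print(samples_per_run())               # samples in one 300 s run at 50 Hz
```

Under these assumptions each run contributes one row per sample and one converted column per sensor.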

To examine the data acquired by the sensors, the response curves of TGS2602 under the same label (the Et_M_Me_M label) were analyzed; Fig. 3 (1)-(6) shows the dynamic response curves of TGS2602 in the Et_L_Me_H case. The figure shows that the responses of the same sensor under the same conditions vary to different degrees. In particular, the sensor response curves in the last two experiments differ clearly from the earlier ones. It can therefore be inferred that factors such as the configuration of the experimental conditions cause varying degrees of data inconsistency between experiments.

Based on the preceding analysis, the data require effective preprocessing. Because the mixed gas data are time-series gas signal response curves, dynamic time warping is applied to the data sets. Dynamic time warping (DTW) is an algorithm based on dynamic programming (DP) that corrects the misalignment of feature parameters; its basic principle is to find the optimal warping path between time series. For each data point in one sequence, it uses the coordinate value to find the point with the most similar features in the other sequence, computes the distance between the matched points, and sums these distances over the two time series; the path that minimizes this sum is the optimal warping path.

Assume the two time series are X = (x₁, x₂, ..., x_m) and Y = (y₁, y₂, ..., y_n), with lengths m and n respectively. D_m×n is the m × n distance matrix constructed from the two time series.

The element d_ij of D_m×n is the distance between x_i and y_j, computed as:

d_ij = ||x_i − y_j||_w

When w = 2 this is the Euclidean distance (2-norm). The warping path p_min with the smallest distance is then found in D_m×n; it gives the DTW distance between the two time series.

p_min = {p₁, p₂, ..., p_d, ..., p_k}

max(m, n) ≤ k ≤ m+n+1

where p_d is the accumulated distance of the warping path when the search reaches point d_ij.

The search for p_min must satisfy three conditions: 1) Fixed endpoints: the path starts at d₁₁ and ends at d_mn. 2) Monotonicity: if the current search point is d_ij with accumulated distance p_d, and p_(d+1) = p_d + d_i′j′, then i′ ≥ i and j′ ≥ j. 3) Continuity: if the current search point is d_ij with accumulated distance p_d, and p_(d+1) = p_d + d_i′j′, then i′ ≤ i+1 and j′ ≤ j+1. Under these three conditions, the first condition fixes the starting position of the search path, and the second and third restrict the next point on the path to the right of, above, or to the upper right of the current point. If the current accumulated distance is p_d and the current search point is d_ij, then p_(d+1) is computed as:

p_(d+1) = p_d + min[d_(i+1)j, d_(i+1)(j+1), d_i(j+1)]

This finally yields p_min. Because sequences of different lengths would otherwise produce accumulated distances that are not comparable, the accumulated distance is averaged over the path length:

D = p_min/k

D is the averaged accumulated distance between the two sequences.
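The computation of p_min, k, and D can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: it uses the standard dynamic-programming form of DTW (accumulating over the three admissible predecessors of each cell) rather than the step-wise formula as literally written, treats each sample as a scalar so that the 2-norm reduces to an absolute difference, and recovers the path length k by backtracking. The name `dtw_distance` is hypothetical.

```python
def dtw_distance(x, y):
    """Return (p_min, k, D): accumulated distance, path length, and D = p_min / k."""
    m, n = len(x), len(y)
    if m == 0 or n == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # acc[i][j]: smallest accumulated distance of a path from d_11 to d_ij (1-based).
    acc = [[inf] * (n + 1) for _ in range(m + 1)]
    acc[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d_ij = abs(x[i - 1] - y[j - 1])  # scalar case of ||x_i - y_j||_2
            acc[i][j] = d_ij + min(acc[i - 1][j], acc[i - 1][j - 1], acc[i][j - 1])
    # Backtrack from d_mn to d_11 to count the k points on the optimal path.
    i, j, k = m, n, 1
    while (i, j) != (1, 1):
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
        k += 1
    p_min = acc[m][n]
    return p_min, k, p_min / k


if __name__ == "__main__":
    print(dtw_distance([0.0, 0.0, 1.0], [0.0, 1.0]))
```

The endpoint, monotonicity, and continuity conditions are enforced by the three-predecessor recurrence and by fixing the start and end cells.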

Because of the three constraints, the DTW algorithm visits every observation point, and every point of the original sequences finds a corresponding point. With the dynamic time warping (DTW) algorithm configured in this way, the samples in the original data undergo a preliminary screening, which further improves the classification results.

Each label of the original data set contains 6 repeated experiments; that is, each sensor performs 6 acquisitions for one mixed gas class, yielding 6 gas signal time series. p_min is computed with the DTW algorithm, the two experiments with the largest p_min are discarded, and the remaining data serve as the input data of S2.
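The screening step can be sketched as below. The text does not state which reference each run is compared against, so this sketch assumes that every run is scored by the sum of its DTW accumulated distances to the other runs of the same label, and that the two highest-scoring runs are dropped. The helper names `dtw_cost` and `screen_runs` are hypothetical.

```python
def dtw_cost(x, y):
    """Accumulated DTW distance between two scalar sequences (rolling-row DP)."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(y)
    for xi in x:
        cur = [inf]
        for j, yj in enumerate(y, start=1):
            cur.append(abs(xi - yj) + min(prev[j], prev[j - 1], cur[j - 1]))
        prev = cur
    return prev[-1]


def screen_runs(runs, n_discard=2):
    """Return (kept_indices, scores); the n_discard runs with the largest score are dropped."""
    scores = [
        sum(dtw_cost(run, other) for j, other in enumerate(runs) if j != i)
        for i, run in enumerate(runs)
    ]
    ranked = sorted(range(len(runs)), key=lambda i: scores[i])  # stable, ascending
    kept = sorted(ranked[: len(runs) - n_discard])
    return kept, scores


if __name__ == "__main__":
    typical = [0.0, 1.0, 2.0, 1.0, 0.0]
    runs = [typical, typical, typical, typical,
            [5.0, 6.0, 7.0, 6.0, 5.0], [0.0, 3.0, 6.0, 3.0, 0.0]]
    kept, _ = screen_runs(runs)
    print(kept)  # indices of the runs passed on to S2
```

With 6 runs per label and `n_discard=2`, four runs remain as input to S2, matching the description.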

S2 Data selection and feature extraction:

Original feature construction

In the comparison experiments, the designed classifier and the comparison classifiers were trained both on the original features and on the constructed features. The original data set has 8 feature dimensions. To improve classification accuracy, features were constructed from the data and different feature sets were compared to find the one that classifies best. Feature construction is needed because once the training data are fixed, the highest achievable accuracy is fixed with them; constructing features compensates for the limited learning capacity of the recognition algorithm. New features are therefore built on top of the original features to improve the accuracy of the classification algorithm.

A common feature construction method is interaction features: for features A and B, for example, the features A*B, A-B, A/B, and A+B are created, which makes the feature space grow explosively. In this embodiment, gas signal data are acquired with 8 sensors, so the original features are 8-dimensional. The created features are A-B and A/B; after creating these interaction features, the multidimensional original features of the gas signal are obtained, and the number of features becomes 56.
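One way to arrive at 56 features is sketched below: with 8 sensors there are 28 unordered sensor pairs, so A-B and A/B each contribute 28 features. The pairing scheme (unordered pairs, first sensor minus or divided by the second) is an assumption chosen because it reproduces the stated count; the function name `interaction_features` is hypothetical.

```python
from itertools import combinations


def interaction_features(sample):
    """Build A-B and A/B features for every unordered pair of sensor values."""
    pairs = list(combinations(range(len(sample)), 2))
    diffs = [sample[i] - sample[j] for i, j in pairs]
    # Assumes non-zero readings; converted resistances are expected to be positive.
    ratios = [sample[i] / sample[j] for i, j in pairs]
    return diffs + ratios


if __name__ == "__main__":
    reading = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # one sample from 8 sensors
    print(len(interaction_features(reading)))
```

Using ordered pairs for a single operation would also give 56 features, so the exact construction in the original work may differ from this sketch.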

Implementation steps of principal component analysis (PCA)

Principal component analysis is implemented as follows:

(1) Standardize the original data

PCA operates on the covariance matrix of the data, and the features differ in scale. To make the scales consistent, the original feature data are first standardized: the mean of each dimension is subtracted from the data, and the result is divided by the standard deviation of that dimension:

X_i* = (X_i − E(X_i)) / sqrt(D(X_i))

where E(X_i) denotes the mean of the data and D(X_i) denotes the variance of the data.

(2) Compute the covariance matrix of the data

After standardization, the covariance matrix of the data is the correlation coefficient matrix of the original features.

The correlation coefficient matrix R can be expressed as

R = (r_ij)_p×p, r_ij = Cov(X_i, X_j) / sqrt(D(X_i) D(X_j))

(3) Compute the eigenvalues and eigenvectors of the correlation coefficient matrix R

The eigenvalues λ_i (i = 1, 2, 3, ..., p) of the correlation coefficient matrix are solved from the characteristic equation |R − λE| = 0 and sorted in descending order, λ₁ ≥ λ₂ ≥ ... ≥ λ_p ≥ 0. Substituting each λ_i into (R − λ_iE)x = 0 gives the eigenvector a_i, which is normalized to the unit vector e_i.

(4) Find the principal components from the cumulative contribution rate

The cumulative contribution rate of the sorted eigenvalues is computed. In general, when the cumulative contribution rate of the first t eigenvalues reaches 85%-95%, these t components are taken as the principal components. In this embodiment t = 3, at which the cumulative contribution rate of the eigenvalues reaches 90%.

(5) Compute the principal component loadings

Using the unit eigenvectors obtained above, the 8-dimensional data are converted into linear combinations of the 8 variables, giving the principal components Y = (y₁, y₂, ..., y_m)^T.
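Steps (1), (2), and (4) can be sketched in plain Python as follows. The sketch assumes features are stored column-wise, uses the sample (n−1) variance, and takes the eigenvalues as given, since in practice the eigen-decomposition of step (3) would be delegated to a linear algebra library. The function names are illustrative assumptions.

```python
import math


def standardize(columns):
    """Z-score each feature column: subtract the mean, divide by the standard deviation."""
    result = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        result.append([(v - mean) / std for v in col])  # assumes a non-constant column
    return result


def correlation_matrix(z_columns):
    """Covariance matrix of standardized columns, i.e. the correlation matrix R."""
    n = len(z_columns[0])
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in z_columns]
        for ci in z_columns
    ]


def components_needed(eigenvalues, threshold=0.90):
    """Smallest t whose cumulative contribution rate reaches the threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for t, lam in enumerate(ordered, start=1):
        running += lam
        if running / total >= threshold:
            return t
    return len(ordered)


if __name__ == "__main__":
    z = standardize([[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]])
    print(correlation_matrix(z))
    print(components_needed([6.0, 3.0, 1.0], threshold=0.85))
```

The 0.90 default mirrors the cumulative contribution rate reported for t = 3 in this embodiment; any value in the 85%-95% range described above could be substituted.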

To illustrate how the data features are dispersed, every class of the data is abstracted into three-dimensional features, as shown in Fig. 4, which presents the original 8-dimensional feature data reduced to 3 dimensions in a three-dimensional plot; X, Y, and Z denote the three coordinate axes. The features are clearly scattered, so classification cannot be accomplished by a single traditional algorithm.

S3 Extremely randomized trees algorithm:

Extremely randomized trees

Extremely randomized trees (ET for short, also called extreme random forest) resemble the random forest algorithm: both are ensembles of multiple decision trees and therefore share many advantages, such as excellent classification performance and high accuracy, the ability to handle high-dimensional feature data without feature selection, and efficient parallel execution. In mixed gas detection and classification, ensemble learning algorithms achieve high classification accuracy. In the extremely randomized trees algorithm, however, every decision tree uses all of the original data, whereas the random forest algorithm generates training samples by bootstrap sampling. In addition, extremely randomized trees choose the split at each node at random rather than selecting the best split threshold or feature. Fig. 5 is a schematic diagram of the extremely randomized trees algorithm.

Differences between extremely randomized trees and the random forest algorithm:

First, the training samples of the random forest algorithm are generated by bootstrap sampling, whereas every decision tree in extremely randomized trees uses all of the original training samples, which helps reduce model bias.

Second, at a node split the random forest classification algorithm first selects a subset of features from all features and then chooses the best split among them according to a criterion (such as the GINI index) to grow the decision tree. The extremely randomized trees algorithm instead chooses the split at random. Specifically, for a categorical feature, some categories are chosen at random and placed in one branch, and the remaining categories are placed in the other branch; for a numerical feature, a threshold is chosen at random between the maximum and minimum values, and samples with values greater than the threshold go to one branch while those with values less than the threshold go to the other, so that the sample data are divided between the two branches. For the classification problem here, the split value is then computed with the GINI index. All features at the node are traversed to obtain the split value of each, and the feature with the largest split value is used for the split (for regression problems, the split value is computed with the mean squared error).
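The random split choices and their scoring can be sketched as follows. The sketch interprets the "split value" as the decrease in Gini impurity produced by a split, which is an assumption; the function names and the use of a seeded `random.Random` instance are likewise illustrative.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def random_threshold(values, rng):
    """Numerical feature: draw a threshold uniformly between the min and max values."""
    return rng.uniform(min(values), max(values))


def random_category_subset(categories, rng):
    """Categorical feature: pick a random non-empty proper subset for one branch.

    Requires at least two distinct categories.
    """
    distinct = sorted(set(categories))
    size = rng.randint(1, len(distinct) - 1)
    return set(rng.sample(distinct, size))


def gini_gain(labels, goes_left):
    """Impurity decrease of a split described by a boolean mask over the samples."""
    left = [y for y, flag in zip(labels, goes_left) if flag]
    right = [y for y, flag in zip(labels, goes_left) if not flag]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


if __name__ == "__main__":
    rng = random.Random(0)
    values = [0.2, 0.4, 1.5, 1.9]
    labels = ["Et_Me", "Et_Me", "Et_CO", "Et_CO"]
    thr = random_threshold(values, rng)
    print(thr, gini_gain(labels, [v < thr for v in values]))
```

A node would evaluate one such random split per candidate feature and keep the feature whose split scores best.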

In the extremely randomized trees algorithm, all training data samples are treated as OOB (out-of-bag) samples, so the prediction error of the extremely randomized trees is computed on these OOB samples. This work found that extremely randomized trees outperform the random forest algorithm in aspects such as training time efficiency, classification accuracy, and use of the training data.

Implementation steps of the extremely randomized trees algorithm

The extremely randomized trees algorithm is denoted {E(K, X, D)}, where E denotes the classifier model, D denotes the original data samples, and K denotes the number of decision trees. Each decision tree produces a prediction from the sample input X = {x₁, x₂, ..., x_m}, and the classification decision is obtained by a voting rule. The steps of the algorithm are as follows:

(1): In the extremely randomized trees classification model, each base classifier is trained on all training samples (OOB samples). Let the original data set be D, the number of samples N, and the number of features M.

(2): A decision tree is generated according to the CART algorithm. At each node split, m features are selected at random from the M features, some categories are randomly placed in one branch and the remaining categories in the other, the best split value is computed for each node, and the node is split on the best attribute; no pruning is performed during splitting. The resulting subsets are split iteratively until a preset value is reached, yielding one decision tree.

(3): Steps (1) and (2) are repeated K times to generate an extremely randomized trees model consisting of K decision trees.

(4): The trained extremely randomized trees model is evaluated on the test data, and the final classification result is produced by voting.
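Steps (1)-(4) can be sketched as a small pure-Python classifier. This is a simplified illustration, not the patented implementation: it handles numerical features only, draws one random threshold per candidate feature, keeps the candidate with the lowest weighted Gini impurity, and grows each tree until its nodes are pure instead of stopping at a preset value. All function names and the dictionary-based tree layout are assumptions.

```python
import random
from collections import Counter


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0


def build_tree(X, y, m, rng):
    """Grow one unpruned tree on the samples it is given (no bootstrap sampling)."""
    majority = Counter(y).most_common(1)[0][0]
    if len(set(y)) == 1:
        return {"label": majority}
    best = None
    for f in rng.sample(range(len(X[0])), m):  # m random features out of M
        col = [row[f] for row in X]
        lo, hi = min(col), max(col)
        if lo == hi:
            continue
        thr = rng.uniform(lo, hi)  # random threshold between min and max
        left = [i for i, v in enumerate(col) if v < thr]
        right = [i for i, v in enumerate(col) if v >= thr]
        if not left or not right:
            continue
        impurity = (len(left) * gini([y[i] for i in left])
                    + len(right) * gini([y[i] for i in right])) / len(y)
        if best is None or impurity < best[0]:
            best = (impurity, f, thr, left, right)
    if best is None:
        return {"label": majority}
    _, f, thr, left, right = best
    return {
        "feature": f,
        "threshold": thr,
        "left": build_tree([X[i] for i in left], [y[i] for i in left], m, rng),
        "right": build_tree([X[i] for i in right], [y[i] for i in right], m, rng),
    }


def fit_extra_trees(X, y, K=10, m=None, seed=0):
    """Train K trees, each on the full data set D."""
    rng = random.Random(seed)
    m = m or max(1, int(len(X[0]) ** 0.5))
    return [build_tree(X, y, m, rng) for _ in range(K)]


def predict_tree(node, row):
    while "label" not in node:
        branch = "left" if row[node["feature"]] < node["threshold"] else "right"
        node = node[branch]
    return node["label"]


def predict(trees, row):
    """Final class by majority vote over the K trees."""
    return Counter(predict_tree(t, row) for t in trees).most_common(1)[0][0]
```

A call such as `fit_extra_trees(X, y, K=15, m=2)` followed by `predict(trees, row)` covers training and voting; an established library implementation would normally be preferred over this sketch for real data.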

To verify the proposed classifier, the original ethylene-methane and ethylene-carbon monoxide mixed gas samples were analyzed and validated with 10-fold cross-validation. The classification results and analysis are as follows.
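A 10-fold cross-validation loop can be sketched as below. The shuffling, the round-robin fold assignment, and the `fit`/`predict` callables are assumptions made for illustration; the original text does not describe how the folds were drawn. The sketch expects at least as many samples as folds.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for f, fold in enumerate(folds) if f != i for idx in fold]
        yield train, folds[i]


def cross_val_accuracy(fit, predict, X, y, k=10, seed=0):
    """Return (mean accuracy, per-fold accuracies) for a fit/predict pair."""
    scores = []
    for train, test in k_fold_indices(len(X), k, seed):
        model = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in test)
        scores.append(hits / len(test))
    return sum(scores) / len(scores), scores


if __name__ == "__main__":
    # Toy baseline: always predict the most frequent training label.
    def fit_majority(X_train, y_train):
        return max(set(y_train), key=y_train.count)

    X = list(range(40))
    y = [0] * 30 + [1] * 10
    print(cross_val_accuracy(fit_majority, lambda model, row: model, X, y))
```

Any classifier exposing a compatible `fit` and `predict` pair can be evaluated this way, and the mean of the per-fold accuracies is the figure reported as the cross-validation accuracy.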

For the dynamic time warping (DTW) algorithm, the basic DTW parameter num was set to 1, 2, and 3, and a run without DTW was included for comparison; the results are shown in Fig. 6.

Table 1: 10-fold cross-validation accuracy

Fig. 6 and Table 1 show that with num=3, the mean five-fold cross-validation accuracy is 26.87% higher than with num=0. In terms of time efficiency, the model running time efficiency with num=3 is 56.04% better than with num=0. DTW therefore clearly improves the model while raising classification accuracy. Fig. 7 shows the DTW model running time: after repeated experiments, the model running time is shortest, 103.2568 seconds, when the DTW parameter is set to 3.

Feature construction was used to increase the dimensionality of the data, and the best features were selected for training; the results are shown in Fig. 7. With the feature dimensionality unchanged, the recognition accuracy is only 73.37%; increasing the dimensionality with A-B features gives 28-dimensional data features and raises the recognition accuracy to 87.50%. This shows that, for specific features, purposefully raising the dimensionality helps separate the features. After the features are raised to 56 dimensions, the recognition rate is 18.97% higher than with the dimensionality unchanged, and after dimensionality reduction with the PCA algorithm the final recognition rate is 99.17%.

The extremely randomized trees algorithm was then compared with other algorithms. Compared with the random forest and XGBoost algorithms, it is better in both accuracy and time efficiency.

Among ensemble learning classification algorithms, the random forest algorithm is currently the most widely used and also performs best. The extremely randomized trees algorithm, an improvement on random forest, was therefore compared with it experimentally, and the XGBoost and GBDT algorithms were also compared in accuracy and time efficiency. Table 2 shows that the accuracy of the extremely randomized trees algorithm is 4.42% higher than random forest, 5.00% higher than XGBoost, and 7.99% higher than GBDT.

Table 2: Classification accuracy comparison of the algorithms

Fig. 8 compares the running times of the classification models. In the running-time experiments, the extremely randomized trees algorithm has the shortest running time, only 103.2568 seconds, a 66.85% improvement in time efficiency over the random forest algorithm, while XGBoost, the most complex model, has the longest running time. The proposed extremely randomized trees algorithm therefore improves clearly in both accuracy and time efficiency.

The mixed gas detection model construction method based on extremely randomized trees provided by the present invention has been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the embodiments is intended only to help understand the method and its core idea. For those of ordinary skill in the art, the specific implementation and scope of application may vary according to the idea of the invention. In summary, the content of this specification should not be construed as limiting the invention.

Herein, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.

Claims

1. A method for constructing a mixed gas detection model based on extremely randomized trees, characterized by comprising the following steps:

S1, acquire data from the mixed gas to obtain a data set that includes at least three gas signal time series, compute the optimal warping path of the gas signal time series, and screen the gas signal time series using the optimal warping path;

2. The method for constructing a mixed gas detection model based on extremely randomized trees according to claim 1, characterized in that the optimal warping path of the gas signal time series in S1 is computed as follows:

S11, construct the distance matrix of two gas signal time series. Let the two time series be X = (x₁, x₂, ..., x_m) and Y = (y₁, y₂, ..., y_n), with lengths m and n respectively. D_m×n is the m × n distance matrix constructed from the two time series

d_ij = ||x_i − y_j||_w

When w = 2 this is the Euclidean distance (2-norm), with 1 ≤ i ≤ m and 1 ≤ j ≤ n;

p_min = {p₁, p₂, ..., p_d, ..., p_k}

max(m, n) ≤ k ≤ m+n+1

p_(d+1) = p_d + min[d_(i+1)j, d_(i+1)(j+1), d_i(j+1)];

S13, discard the two gas signal time series with the largest p_min; the remaining gas signal time series serve as the input data of step S2.

3. The method for constructing a mixed gas detection model based on extremely randomized trees according to claim 1, characterized in that S2 specifically includes:

S21, construct the original features of the gas signal: multidimensional original features of the gas signal are built using the interaction feature method;

S22, reduce the dimensionality of the multidimensional original features using principal component analysis to obtain the original data samples.

4. The method for constructing a mixed gas detection model based on extremely randomized trees according to claim 1, characterized in that S3 specifically includes:

S31, in the extremely randomized trees classification model, train each base classifier on all of the original data samples, where the original data set is D, the number of samples is N, and the number of features is M;

S32, generate a decision tree according to the CART algorithm. At each node split, randomly select m features from the M features, randomly place several categories into one branch and the remaining categories into the other branch, compute the best split value for each node, and split on the best attribute; no pruning is performed during splitting. The resulting subsets are split iteratively until a preset value is reached, yielding one decision tree;

S33, repeat steps S31 and S32 K times to generate an extremely randomized trees model consisting of K decision trees;