CN106934414A - Progressive ensemble classification method based on noisy-label data - Google Patents
Progressive ensemble classification method based on noisy-label data
- Publication number
- CN106934414A CN106934414A CN201710081412.6A CN201710081412A CN106934414A CN 106934414 A CN106934414 A CN 106934414A CN 201710081412 A CN201710081412 A CN 201710081412A CN 106934414 A CN106934414 A CN 106934414A
- Authority
- CN
- China
- Prior art keywords
- classifier
- sample
- branch
- sigma
- bootstrap
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a progressive ensemble classification method based on data with noisy labels, comprising the following steps: input training samples and test samples; perform sample-dimension sampling with the bootstrap method to obtain B bootstrap branches; train a classifier on each of the B bootstrap branches with LDA; create a new, empty ensemble classifier set Γ(P) and add to it a first classifier selected from the generated classifiers; progressively select, from the remaining classifiers, the classifiers that satisfy the selection condition and add them to Γ(P), stopping once the number of selected classifiers reaches the preset number G; output the ensemble classifier set together with the weight of each classifier branch; classify the test samples to obtain the final prediction. By studying the sample dimension and the attribute dimension simultaneously, the invention obtains good classification performance on data sets with noisy labels.
Description
Technical field
The invention belongs to the field of computer machine learning, and more particularly relates to a progressive ensemble classification method based on data with noisy labels.
Background art
Ensemble learning, as an important branch of machine learning, is applied in fields such as data mining, intelligent transportation systems, bioinformatics and pattern recognition, and receives the attention of more and more researchers. Compared with a single classifier, an ensemble learning method can integrate multiple classifiers trained under different conditions into one unified classifier. Such an ensemble classifier is characterized by stability, robustness and high accuracy. In short, owing to its outstanding performance, the ensemble classifier has been used successfully in numerous fields.
However, traditional ensemble learning methods mainly study the sample dimension and the attribute dimension separately rather than investigating them as a whole. For example, the Bagging algorithm studies only the sample dimension, while the random subspace algorithm studies only the attribute dimension. A method that considers only the sample dimension or only the attribute dimension is not sufficient to build a powerful ensemble classifier or to handle samples with noise. For example, in some data sets a characteristic pattern resides in certain attribute dimensions, whereas in other data sets the same pattern does not play the same role.
Content of the invention
The object of the invention is to overcome the shortcomings and deficiencies of the prior art by providing a progressive ensemble classification method based on data with noisy labels which, by studying the sample dimension and the attribute dimension simultaneously, obtains good classification performance on data sets with noisy labels.
A progressive ensemble classification method based on data with noisy labels comprises the following steps:
S1, input training samples and test samples;
S2, perform sample-dimension sampling with the bootstrap method to obtain B bootstrap branches;
S3, train a classifier on each of the B bootstrap branches with linear discriminant analysis (LDA), generating the respective classifiers;
S4, create a new ensemble classifier set Γ(P), initialized as empty, and add to Γ(P) a first classifier selected from the classifiers generated in step S3;
S5, progressive classifier selection: progressively select, from the remaining classifiers, further outstanding classifiers and add them as branches to Γ(P); stop selecting once the number of selected branches reaches the preset branch number G of the ensemble classifier set; output the ensemble classifier set together with the weight of each classifier branch;
S6, classify the test samples with the ensemble classifier set and the weight of each classifier branch, obtaining the final prediction.
Preferably, step S1 specifically comprises: input a data set with noisy labels to be classified and run the experiments with 5-fold cross-validation, specifically:
First experiment: part 1 serves as the test data set P_e and the remaining 4 parts as the training data set P_r; the training data set is P_r = {(p₁,y₁),(p₂,y₂),…,(p_l,y_l)}, where l is the number of training samples, p_i (i ∈ {1,…,l}) is a training sample and y_i is its sample label, and each p_i has d attribute dimensions;
Second experiment: part 2 serves as the test data set P_e and the remaining 4 parts as the training data set P_r;
and so on, for 5 experiments in total.
Preferably, in step S2, sample-dimension sampling is performed on the training data set P_r with the bootstrap method:
Sampling is done with replacement, at a sampling rate determined by τ₁ ∈ [0,1], a uniformly distributed random variable; samples are drawn randomly one by one according to the subscripts of the training samples p_i, the subscript m of each drawn sample being generated from a second uniform random variable τ₂ ∈ [0,1]. In each experiment, under one sampling rate, the drawing is repeated B times; each of the B draws selects a set of training samples, yielding B training sample sets, i.e. generating the B bootstrap branches O₁,…,O_B.
Preferably, the training of the classifiers in step S3 specifically comprises: each bootstrap branch serves separately as a training set, and the LDA algorithm generates the respective classifiers χ₁,…,χ_B. The objective function of LDA is

Ξ_b = Σ_{k=1}^{K} Λ(k|p_b)·Υ(y_b|k)

where Ξ_b denotes the objective function; K is the total number of labels; Λ(k|p_b) is the prior probability function of label k for sample p_b in bootstrap branch O_b; Υ(y_b|k) is the loss function of the classification result, k being the true label and y_b the predicted label, with Υ(y_b|k) = 0 when the sample is correctly classified and Υ(y_b|k) = 1 otherwise.
Λ(k|p_b) is computed as

Λ(k|p_b) = Λ(k)·|Σ_k|^(−1/2)·exp(−(p_b − μ̂_k)ᵀ Σ_k⁻¹ (p_b − μ̂_k)/2) / Λ(p_b)

where μ̂_k and Σ_k are the mean and covariance matrix of label k in bootstrap branch O_b; |Σ_k| and Σ_k⁻¹ are the determinant and inverse of Σ_k; Λ(p_b) is a normalizing constant; and Λ(k) is the ratio of the number of class-k training samples to the total number of samples in branch O_b.
Preferably, step S4 specifically comprises:
S4-1, create a new ensemble classifier set Γ(P), initialized as empty;
S4-2, initialize the weights of all samples to w_i = 1/l, i ∈ {1,…,l};
S4-3, compute the accuracy ξ_j of each bootstrap branch classifier, j ∈ {1,…,B}, and choose the classifier with the highest accuracy as the first selected classifier: χ₁ = argmax_j ξ_j;
S4-4, compute the weighted sum error of the samples misclassified by classifier χ₁:

ε₁ = Σ_{i=1}^{l} w_i·Error(χ₁(p_i), y_i)

where the error function Error(χ(p_i), y_i) equals 0 if χ(p_i) = y_i and 1 otherwise, i ∈ {1,…,l}, and χ(p_i) denotes the classification result of classifier χ for sample p_i;
S4-5, compute from ε₁ the weight θ₁ of classifier χ₁;
S4-6, add classifier χ₁ to the ensemble classifier set Γ(P):
Γ₁(P) = θ₁χ₁;
S4-7, update the weights of all training samples; the weights are normalized, so that Σ_{i=1}^{l} w_i = 1.
Preferably, step S5 specifically comprises:
S5-1, compute the first integrated loss function Π₁(χ_j) of each remaining classifier χ_j, with g ∈ {1,…,G} the current iteration index:

Π₁(χ_j) = β₁·ξ_j + β₂·φ(O_j, O_h)

where ξ_j is the accuracy of classifier χ_j after the training-sample weight adjustment; the classifier distance function φ(O_j,O_h) expresses the similarity of the bootstraps O_j and O_h, O_j being the bootstrap branch of classifier χ_j and O_h ranging over the bootstrap branches of all classifiers in the already obtained classifier set; β₁ and β₂ set the proportion of the two terms, with β₁ + β₂ = 1.
Compute the first integrated loss function Π₁(χ_j) for each remaining classifier and sort the values; compute the second integrated loss function Π₂(Γ), where c is a sample label and χ_h is the h-th classifier in the already obtained ensemble classifier set Γ_{g−1}(P).
Starting from the classifier with the largest first integrated loss Π₁, compare: as long as the comparison condition on Π₂ holds, consider the next classifier; once the condition fails, take the current classifier as the next one added to the ensemble classifier set Γ(P).
S5-2, after the new classifier branch is added, compute the weighted sum error ε_g of the samples misclassified by the new branch of the ensemble, where g ∈ {1,…,G} is the current iteration index and |Γ(P)| denotes the branch number of the target set Γ(P); then update the weight θ_g of the newly added classifier accordingly.
S5-3, add the newest classifier to the already selected set, generating the newest ensemble classifier set Γ_g(P) = Γ_{g−1}(P) + θ_gχ_g, and update the weights of all training samples on the basis of the new ensemble classifier; the updated weights are normalized, so that Σ_{i=1}^{l} w_i = 1.
S5-4, repeat steps S5-1 to S5-3 until the number of selected branches reaches the preset branch number G, then stop iterating; output the selected ensemble classifier set Γ_G together with the corresponding weights.
Further, the classifier distance function φ(O_j,O_h) in step S5-1 is computed as follows: the bootstraps O_j and O_h can be regarded as two Gaussian mixture distributions, denoted Ω_j and Ω_h respectively. For the two Gaussian mixture models, each component of Ω_j and of Ω_h carries a corresponding weight, K₁ and K₂ being respectively the numbers of components of Ω_j and Ω_h; φ(O_j,O_h) is then computed from the Bhattacharyya distances between pairs of Gaussian components, each Bhattacharyya distance being determined by the mean vectors and covariance matrices of the two Gaussian distributions concerned.
Preferably, the specific method of step S6 is: after the computation of each classifier branch, the prediction label of each branch for the sample is obtained; to turn these prediction labels into the final prediction y*, a weighted vote is required. Denote by χ_g the g-th classifier of the ensemble classifier set and its prediction labels for all samples; the prediction label of the i-th sample is obtained accordingly, c ∈ {0,1,…,k−1} being a specific sample label and k the total number of classes.
Compared with the prior art, the present invention has the following advantages and beneficial effects:
1. The invention studies the sample dimension and the attribute dimension simultaneously, addresses the classification of real-life data sets with noisy labels, and can solve this common classification problem well.
2. The invention proposes a progressive ensemble framework that obtains a good ensemble result with fewer ensemble branches, improving the effectiveness of the ensemble.
3. The invention proposes a classifier selection algorithm based on different similarity computations for selecting better classifiers, thereby constituting an effective ensemble classification algorithm.
Brief description of the drawings
Fig. 1 is the flow chart of the method of the embodiment;
Fig. 2 shows the experimental results of different classifiers.
Specific embodiment
The present invention is described in further detail below with reference to the embodiment and the accompanying drawings, but the implementations of the present invention are not limited thereto.
A progressive ensemble classification method based on data with noisy labels comprises the following steps:
S1, input training samples and test samples;
S2, perform sample-dimension sampling with the bootstrap method to obtain B bootstrap branches;
S3, train a classifier on each of the B bootstrap branches with linear discriminant analysis (LDA), generating the respective classifiers;
S4, create a new ensemble classifier set Γ(P), initialized as empty, and add to Γ(P) a first classifier selected from the classifiers generated in step S3;
S5, progressive classifier selection: progressively select, from the remaining classifiers, further outstanding classifiers and add them as branches to Γ(P); stop selecting once the number of selected branches reaches the preset branch number G of the ensemble classifier set; output the ensemble classifier set together with the weight of each classifier branch;
S6, classify the test samples with the ensemble classifier set and the weight of each classifier branch, obtaining the final prediction.
The method of this embodiment is described in further detail below with reference to Fig. 1.
Step 1, input training samples and test samples:
Input a data set with noisy labels to be classified. Each data set has attribute dimensions and sample dimensions: each row is one sample, each column is one attribute dimension, and each sample has its sample label. Divide the data set evenly into 5 parts and run experiments with 5-fold cross-validation (a sketch of this split follows). Specifically:
First experiment: part 1 serves as the test data set P_e and the remaining 4 parts as the training data set P_r. The training data set is P_r = {(p₁,y₁),(p₂,y₂),…,(p_l,y_l)}, where l is the number of training samples, p_i (i ∈ {1,…,l}) is a training sample and y_i ∈ {−1,1} is the sample label (one label represents one class; the method extends to multi-class problems). Each p_i has d attribute dimensions.
Second experiment: part 2 serves as the test data set P_e and the remaining 4 parts as the training data set P_r. And so on, for 5 experiments in total.
Step 2, perform sample-dimension sampling on the training data set P_r with the bootstrap method:
Sampling is done with replacement, at a sampling rate determined by τ₁ ∈ [0,1], a uniformly distributed random variable. The method draws samples randomly one by one according to the subscripts of the training samples p_i; the subscript m of each drawn sample is generated from a second uniform random variable τ₂ ∈ [0,1]. There are 5 experiments in total, each with a single sampling rate; in each experiment, under one sampling rate, the drawing is repeated B times, and each of the B draws selects a set of training samples, yielding B training sample sets.
From the training samples selected in step 2, the B bootstrap branches O₁,…,O_B are generated (see the sketch below).
Because this sampling method is used, only a small fraction of the drawn samples are training samples carrying noise, which improves the effectiveness of the method on noisy data.
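A minimal sketch of this with-replacement bootstrap sampling follows; the function name and the `rate` parameter are illustrative stand-ins, since the patent's exact sampling-rate formula (derived from τ₁) appears only in an omitted equation.

```python
import numpy as np

def bootstrap_branches(X, y, B, rate, seed=0):
    """Draw B bootstrap branches O_1..O_B by sampling indices with
    replacement. `rate` stands in for the sampling rate derived from
    tau_1 in the patent's omitted equation; here it is a plain parameter."""
    rng = np.random.default_rng(seed)
    n = max(1, int(rate * len(X)))
    branches = []
    for _ in range(B):
        # each subscript m is generated from a uniform draw (the role
        # played by tau_2 in the text)
        idx = rng.integers(0, len(X), size=n)
        branches.append((X[idx], y[idx]))
    return branches
```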
Step 3, train the classifiers with the linear discriminant analysis (LDA) algorithm:
Each of the above bootstrap branches serves separately as a training set, and the LDA algorithm generates the respective classifiers χ₁,…,χ_B. LDA is used because it is a dimension-reduction algorithm: it can simultaneously reduce noise and remove redundant attribute dimensions, achieving the purpose of attribute-dimension integration and improving the classification performance. The objective function of LDA is

Ξ_b = Σ_{k=1}^{K} Λ(k|p_b)·Υ(y_b|k)

where Ξ_b denotes the objective function; K is the total number of labels; Λ(k|p_b) is the prior probability function of label k for sample p_b in bootstrap branch O_b; Υ(y_b|k) is the loss function of the classification result, k being the true label and y_b the predicted label, with Υ(y_b|k) = 0 when the sample is correctly classified and Υ(y_b|k) = 1 otherwise.
Λ(k|p_b) is computed as

Λ(k|p_b) = Λ(k)·|Σ_k|^(−1/2)·exp(−(p_b − μ̂_k)ᵀ Σ_k⁻¹ (p_b − μ̂_k)/2) / Λ(p_b)

where μ̂_k and Σ_k are the mean and covariance matrix of label k in bootstrap branch O_b; |Σ_k| and Σ_k⁻¹ are the determinant and inverse of Σ_k; Λ(p_b) is a normalizing constant; and Λ(k) is the ratio of the number of class-k training samples to the total number of samples in branch O_b (a training sketch follows).
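A minimal sketch of step 3, assuming scikit-learn's LinearDiscriminantAnalysis as the base learner; the patent's own LDA (with per-class covariance matrices Σ_k) may differ in detail from scikit-learn's shared-covariance implementation.

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def train_branch_classifiers(branches):
    """Fit one LDA classifier per bootstrap branch O_b."""
    classifiers = []
    for Xb, yb in branches:           # each branch serves as its own training set
        clf = LinearDiscriminantAnalysis()
        clf.fit(Xb, yb)               # fits class means and covariance structure
        classifiers.append(clf)
    return classifiers
```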
Step 4, select the first classifier:
4.1 Create a new ensemble classifier set Γ(P), initialized as empty.
4.2 Initialize the weights of all samples to w_i = 1/l, i ∈ {1,…,l}.
4.3 Compute the accuracy ξ_j of each bootstrap branch classifier, j ∈ {1,…,B}, and choose the classifier with the highest accuracy as the first selected classifier: χ₁ = argmax_j ξ_j.
4.4 Compute the weighted sum error of the samples misclassified by the first selected classifier:

ε₁ = Σ_{i=1}^{l} w_i·Error(χ₁(p_i), y_i)

where the error function Error(χ(p_i), y_i) equals 0 if χ(p_i) = y_i and 1 otherwise, i ∈ {1,…,l}, and χ(p_i) denotes the classification result (1 or −1) of classifier χ for sample p_i.
4.5 Compute from ε₁ the weight θ₁ of the first selected classifier χ₁.
4.6 Add the first selected classifier to the ensemble classifier set Γ(P):
Γ₁(P) = θ₁χ₁
4.7 Update the weights of all training samples; the weights are normalized, so that Σ_{i=1}^{l} w_i = 1 (a sketch of steps 4.2 to 4.5 follows).
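A sketch of steps 4.2 to 4.5 under an AdaBoost-style reading of the omitted formulas: uniform initial weights, a weighted 0/1 error, and a classifier weight θ₁. The exact θ₁ formula is not shown in the source, so the standard ½·ln((1−ε)/ε) used here is an assumption.

```python
import numpy as np

def select_first(classifiers, X, y):
    """Pick the highest-accuracy branch classifier and compute its weight."""
    w = np.full(len(X), 1.0 / len(X))             # initial sample weights w_i = 1/l
    accs = [np.mean(clf.predict(X) == y) for clf in classifiers]
    j = int(np.argmax(accs))                       # highest-accuracy branch chi_1
    chi1 = classifiers[j]
    err = (chi1.predict(X) != y).astype(float)     # Error(chi_1(p_i), y_i)
    eps = float(np.dot(w, err))                    # weighted sum error eps_1
    theta = 0.5 * np.log((1.0 - eps) / max(eps, 1e-12))  # assumed theta_1 formula
    return j, theta, w
```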
Step 5, progressive classifier selection:
5.1 This step builds on step 4 and progressively selects further outstanding classifier branches for integration. The method of progressive classifier selection is as follows: for each remaining branch χ_j (the branches not yet selected into Γ(P)), compute the integrated loss function Π₁(χ_j), defined as

Π₁(χ_j) = β₁·ξ_j + β₂·φ(O_j, O_h)

where ξ_j is the classifier accuracy of branch χ_j after the sample-weight adjustment; the classifier distance function φ(O_j,O_h) expresses the similarity of the bootstraps O_j and O_h and is mainly used to compute the correlation between the branch about to be added and the set of branches already elected; O_j is the bootstrap branch of classifier χ_j, and O_h ranges over the bootstrap branches of the classifier set obtained in the previous iterations; β₁ and β₂ set the proportion of the two terms, with β₁ + β₂ = 1.
Specifically, the bootstraps O_j and O_h can be regarded as two Gaussian mixture models (Gaussian mixture models, GMMs), denoted Ω_j and Ω_h respectively. For the two GMMs, each component of Ω_j and of Ω_h carries a corresponding weight, K₁ and K₂ being respectively the numbers of components of Ω_j and Ω_h; the similarity is computed from the Bhattacharyya distances between pairs of Gaussian components, each determined by the mean vectors and covariance matrices of the two Gaussian distributions concerned.
In general, the definition of the classifier loss function Π₁(χ) must consider two aspects: a) the weighted sample distribution; b) the diversity of the different bootstraps under different similarity computations.
First compute the value of the classifier loss function Π₁(χ_j) of each remaining branch not yet added to the ensemble and sort the values; starting from the branch with the largest integrated loss Π₁, compare: as long as the comparison condition holds, consider the next branch; once the condition fails, take the current classifier as the next one added to the ensemble classifier set Γ(P). Here c ∈ {−1,1} is the set of sample labels (true labels) and χ_h is the h-th linear discriminant analysis classifier in the already obtained ensemble classifier set Γ_{g−1}(P).
The second integrated loss function Π₂(Γ) determines which classifier is added to the final set; the meaning of Π₂(Γ) is the reduction in classification accuracy after the added branch is removed.
The classifier to be added next to the ensemble classifier set Γ(P) is thereby determined (a sketch of this selection rule follows).
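A sketch of the selection rule of step 5.1, with two assumptions made explicit in the comments: Π₁ is taken as the linear combination β₁ξ_j + β₂·(mean φ-distance to the selected branches), and the stopping test is taken to compare Π₂ with and without the candidate; the patent's omitted inequalities may differ.

```python
import numpy as np

def pi1(xi, dists, beta1=0.5, beta2=0.5):
    # Assumed linear form of Pi_1: weighted accuracy plus mean distance
    # (diversity) to the branches already selected; beta1 + beta2 = 1.
    div = float(np.mean(dists)) if len(dists) else 0.0
    return beta1 * xi + beta2 * div

def pick_next(candidates, xi, dist_to_selected, pi2_with, pi2_now):
    """candidates: indices of remaining branches; xi[j]: weighted accuracy;
    dist_to_selected[j]: phi-distances from branch j to selected branches;
    pi2_with[j]: Pi_2 if branch j were added; pi2_now: current Pi_2."""
    order = sorted(candidates,
                   key=lambda j: pi1(xi[j], dist_to_selected[j]),
                   reverse=True)
    for j in order:                      # walk down from the largest Pi_1
        if pi2_with[j] <= pi2_now:       # adding j does not worsen Pi_2
            return j
    return order[0]                      # fallback: best Pi_1 candidate
```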
5.2 After the new classifier branch is added, compute the weighted sum error ε_g of the samples misclassified by the new branch of the ensemble, where g ∈ {1,…,G} is the current iteration index and |Γ(P)| denotes the branch number of the target set Γ(P); then the weight θ_g of the currently added classifier must be updated.
5.3 Add the newest classifier to the set selected in the previous step, generating the newest ensemble classifier set Γ_g(P) = Γ_{g−1}(P) + θ_gχ_g, and update the weights of the samples on the basis of the new ensemble classifier; the updated weights are normalized, so that Σ_{i=1}^{l} w_i = 1 (a sketch of this update follows).
5.4 Repeat steps 5.1 to 5.3 until the number of selected branches reaches the preset branch number G, then stop iterating; output the selected ensemble classifier set Γ_G together with the corresponding weights.
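A sketch of the sample-weight update of steps 5.2 and 5.3 under the same AdaBoost-style assumption as above (the patent's literal update equation is omitted in the source): misclassified samples have their weights increased, and all weights are renormalized to sum to 1.

```python
import numpy as np

def update_weights(w, clf, theta, X, y):
    """Re-weight the training samples after adding a branch with weight theta."""
    preds = clf.predict(X)
    w = w * np.exp(theta * (preds != y))   # boost the weights of misclassified samples
    return w / w.sum()                     # normalization: sum of w_i equals 1
```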
In step 5.1, the classifier distance function φ(O_j,O_h) can be defined in different ways. The bootstraps O_j and O_h can be regarded as two Gaussian mixture models (Gaussian mixture models, GMMs), denoted Ω_j and Ω_h respectively. The parameters of the GMMs can be initialized with the K-means algorithm, after which the Expectation-Maximization (EM) algorithm obtains the optimal parameter values.
For the two GMMs, each component of Ω_j and of Ω_h carries a corresponding weight, K₁ and K₂ being respectively the numbers of components of Ω_j and Ω_h. The classifier distance function φ(O_j,O_h) can be computed by the following methods, each built on the Bhattacharyya distance between two Gaussian distributions, which is determined by their mean vectors and covariance matrices:
1. φ₁(O_j,O_h) is defined as the shortest distance between two Gaussian components, i.e. the minimum Bhattacharyya distance over all component pairs.
2. φ₂(O_j,O_h) is defined as the farthest distance between two Gaussian components, i.e. the corresponding maximum.
3. φ₃(O_j,O_h) is defined as the pairwise average similarity.
4. φ₄(O_j,O_h) is defined as the weighted average similarity.
The main advantage of the φ₄(O_j,O_h) definition is that the component weights are taken into account, so that the similarity of different branches can be computed; the experiments also show that the fourth method is optimal. The classification method of this embodiment therefore adopts this definition for its computations (a sketch follows).
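A sketch of the φ₄ computation: the Bhattacharyya distance between Gaussian components below is the standard closed form, while the doubly weighted average over component pairs is an assumed reading of the omitted φ₄ equation.

```python
import numpy as np

def bhattacharyya(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two Gaussian distributions."""
    cov = 0.5 * (cov1 + cov2)
    diff = mu1 - mu2
    term1 = 0.125 * diff @ np.linalg.solve(cov, diff)
    term2 = 0.5 * np.log(np.linalg.det(cov) /
                         np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2)))
    return term1 + term2

def phi4(weights_j, comps_j, weights_h, comps_h):
    """Weighted average similarity between two GMMs (the phi_4 variant),
    assuming a doubly weighted mean over all component pairs; comps_*
    are lists of (mean vector, covariance matrix) tuples."""
    total = 0.0
    for wj, (mj, cj) in zip(weights_j, comps_j):
        for wh, (mh, ch) in zip(weights_h, comps_h):
            total += wj * wh * bhattacharyya(mj, cj, mh, ch)
    return total
```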
By inputting the training samples, the Bagging sampling method and the LDA dimension-reduction algorithm remove the sample dimensions and attribute dimensions that carry noise; a series of bootstrap sub-branches and the sample members corresponding to each sub-branch are generated; the classifiers are selected with the progressive selection algorithm based on the classifier-specific cost function and the integrated cost function, and the iteration yields the weight of each branch; the branch results are collected with the weighted voting method, finally producing the classification results of the ensemble classifier. The accuracy of the method of this embodiment is analyzed further in the following steps.
The 1 part split off in step 1 serves as the test data set P_e, the input of the classifiers (each data set has attribute dimensions and sample dimensions: each row is one sample, each column is one attribute dimension). After the computation of each classifier branch, the prediction label of each branch for the sample is obtained.
To turn the prediction labels of the above steps into the final prediction, a weighted vote is required. Denote by χ_g the g-th classifier of the ensemble classifier set, which produces prediction labels for all samples; the i-th sample thus receives a prediction label from each branch, where c ∈ {0,1,…,k−1} denotes a specific class label and k the total number of classes. The weighted vote over these labels is carried out to obtain the final prediction y* (a sketch follows).
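A minimal sketch of the weighted vote of step 6, assuming class labels encoded as 0,…,k−1: each selected branch votes for its predicted label with its weight θ_g, and the label with the largest accumulated weight becomes y*.

```python
import numpy as np

def weighted_vote(classifiers, thetas, X, n_classes):
    """Combine the branch predictions with a theta-weighted vote."""
    scores = np.zeros((len(X), n_classes))
    for clf, theta in zip(classifiers, thetas):
        preds = clf.predict(X)                     # prediction labels of one branch
        for c in range(n_classes):
            scores[:, c] += theta * (preds == c)   # accumulate weighted votes
    return scores.argmax(axis=1)                   # final prediction y* per sample
```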
In the experiments, the labels produced by the method are compared with the labels of the original samples and the corresponding classification accuracy (classification Accuracy, AC) is computed, where P_S denotes the test set and |P_S| the number of test samples in P_S: for each sample p_i, the prediction label produced by the progressive ensemble classification method based on noisy-label data is compared with the true label of the sample, and AC is the fraction of test samples for which the two agree (a sketch follows). In the specific experiments, each result is computed 10 times and the average is used as the final classification accuracy; the 5-fold cross-validation serves mainly to reduce the influence of randomness.
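A minimal sketch of the accuracy AC over the test set P_S: the fraction of test samples whose predicted label equals the true label.

```python
import numpy as np

def classification_accuracy(y_pred, y_true):
    """AC = |correctly classified test samples| / |P_S|."""
    y_pred, y_true = np.asarray(y_pred), np.asarray(y_true)
    return float(np.mean(y_pred == y_true))
```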
Fig. 2 shows the experimental results of the different classifiers; the figures in bold italics correspond to the method with the highest accuracy on each data set. The results show that the method proposed in this embodiment obtains good classification results on different data sets; from the final results it can be concluded that the method effectively solves the classification problem for data with noisy labels.
The above embodiment is a preferred implementation of the present invention, but the implementations of the present invention are not limited to the above embodiment; any other change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention shall be an equivalent replacement and is included within the scope of protection of the present invention.
Claims (9)
1. A progressive ensemble classification method based on data with noisy labels, characterized by comprising the following steps:
S1, input training samples and test samples;
S2, perform sample-dimension sampling with the bootstrap method to obtain B bootstrap branches;
S3, train a classifier on each of the B bootstrap branches with linear discriminant analysis (LDA), generating the respective classifiers;
S4, create a new ensemble classifier set Γ(P), initialized as empty, and add to Γ(P) a first classifier selected from the classifiers generated in step S3;
S5, progressive classifier selection: progressively select, from the remaining classifiers, further classifiers that satisfy the selection condition and add them as branches to Γ(P); stop selecting once the number of selected branches reaches the preset branch number G of the ensemble classifier set; output the ensemble classifier set together with the weight of each classifier branch;
S6, classify the test samples with the ensemble classifier set and the weight of each classifier branch, obtaining the final prediction.
2. The progressive ensemble classification method according to claim 1, characterized in that step S1 specifically comprises: input a data set with noisy labels to be classified and select the training data set P_r = {(p₁,y₁),(p₂,y₂),…,(p_l,y_l)}, where l is the number of training samples, p_i (i ∈ {1,…,l}) is a training sample and y_i is its sample label, and each p_i has d attribute dimensions.
3. The progressive ensemble classification method according to claim 1, characterized in that the experiments are run with 5-fold cross-validation, specifically:
First experiment: part 1 serves as the test data set P_e and the remaining 4 parts as the training data set P_r; the training data set is P_r = {(p₁,y₁),(p₂,y₂),…,(p_l,y_l)}, where l is the number of training samples, p_i (i ∈ {1,…,l}) is a training sample and y_i is its sample label, and each p_i has d attribute dimensions;
Second experiment: part 2 serves as the test data set P_e and the remaining 4 parts as the training data set P_r;
and so on, for 5 experiments in total.
4. The progressive ensemble classification method according to claim 2, characterized in that in step S2 sample-dimension sampling is performed on the training data set P_r with the bootstrap method:
sampling is done with replacement, at a sampling rate determined by τ₁ ∈ [0,1], a uniformly distributed random variable; samples are drawn randomly one by one according to the subscripts of the training samples p_i, the subscript m of each drawn sample being generated from a second uniform random variable τ₂ ∈ [0,1]; in each experiment, under one sampling rate, the drawing is repeated B times, each of the B draws selecting a set of training samples, yielding B training sample sets, i.e. generating the B bootstrap branches O₁,…,O_B.
5. The progressive ensemble classification method according to claim 4, characterized in that the training of the classifiers in step S3 specifically comprises: each bootstrap branch serves separately as a training set, and the LDA algorithm generates the respective classifiers χ₁,…,χ_B; the objective function of LDA is

Ξ_b = Σ_{k=1}^{K} Λ(k|p_b)·Υ(y_b|k)

where Ξ_b denotes the objective function; K is the total number of labels; Λ(k|p_b) is the prior probability function of label k for sample p_b in bootstrap branch O_b; Υ(y_b|k) is the loss function of the classification result, k being the true label and y_b the predicted label, with Υ(y_b|k) = 0 when the sample is correctly classified and Υ(y_b|k) = 1 otherwise;
Λ(k|p_b) is computed as

Λ(k|p_b) = Λ(k)·|Σ_k|^(−1/2)·exp(−(p_b − μ̂_k)ᵀ Σ_k⁻¹ (p_b − μ̂_k)/2) / Λ(p_b)

where μ̂_k and Σ_k are the mean and covariance matrix of label k in bootstrap branch O_b; |Σ_k| and Σ_k⁻¹ are the determinant and inverse of Σ_k; Λ(p_b) is a normalizing constant; and Λ(k) is the ratio of the number of class-k training samples to the total number of samples in branch O_b.
6. The progressive ensemble classification method according to claim 2, characterized in that step S4 specifically comprises:
S4-1, create a new ensemble classifier set Γ(P), initialized as empty;
S4-2, initialize the weights of all samples to w_i = 1/l, i ∈ {1,…,l};
S4-3, compute the accuracy ξ_j of each bootstrap branch classifier, j ∈ {1,…,B}, and choose the classifier with the highest accuracy as the first selected classifier: χ₁ = argmax_j ξ_j;
S4-4, compute the weighted sum error of the samples misclassified by classifier χ₁:

ε₁ = Σ_{i=1}^{l} w_i·Error(χ₁(p_i), y_i)

where the error function Error(χ(p_i), y_i) equals 0 if χ(p_i) = y_i and 1 otherwise, i ∈ {1,…,l}, and χ(p_i) denotes the classification result of classifier χ for sample p_i;
S4-5, compute from ε₁ the weight θ₁ of classifier χ₁;
S4-6, add classifier χ₁ to the ensemble classifier set Γ(P):
Γ₁(P) = θ₁χ₁;
S4-7, update the weights of all training samples; the weights are normalized, so that Σ_{i=1}^{l} w_i = 1.
7. The progressive ensemble classification method according to claim 6, characterized in that step S5 specifically comprises:
S5-1, compute the first integrated loss function Π₁(χ_j) of each remaining classifier χ_j, with g ∈ {1,…,G} the current iteration index:

Π₁(χ_j) = β₁·ξ_j + β₂·φ(O_j, O_h)

where ξ_j is the accuracy of classifier χ_j after the training-sample weight adjustment; the classifier distance function φ(O_j,O_h) expresses the similarity of the bootstraps O_j and O_h, O_j being the bootstrap branch of classifier χ_j and O_h ranging over the bootstrap branches of all classifiers in the already obtained classifier set; β₁ and β₂ set the proportion of the two terms, with β₁ + β₂ = 1;
compute the first integrated loss function Π₁(χ_j) for each remaining classifier and sort the values; compute the second integrated loss function Π₂(Γ), where c is a sample label and χ_h is the h-th classifier in the already obtained ensemble classifier set Γ_{g−1}(P);
starting from the classifier with the largest first integrated loss Π₁, compare: as long as the comparison condition on Π₂ holds, consider the next classifier; once the condition fails, take the current classifier as the next one added to the ensemble classifier set Γ(P);
S5-2, after the new classifier branch is added, compute the weighted sum error ε_g of the samples misclassified by the new branch of the ensemble, where g ∈ {1,…,G} is the current iteration index and |Γ(P)| denotes the branch number of the target set Γ(P); then update the weight θ_g of the newly added classifier;
S5-3, add the newest classifier to the already selected set, generating the newest ensemble classifier set Γ_g(P) = Γ_{g−1}(P) + θ_gχ_g, and update the weights of all training samples on the basis of the new ensemble classifier; the updated weights are normalized, so that Σ_{i=1}^{l} w_i = 1;
S5-4, repeat steps S5-1 to S5-3 until the number of selected branches reaches the preset branch number G, then stop iterating; output the selected ensemble classifier set Γ_G together with the corresponding weights.
8. The progressive ensemble classification method according to claim 7, characterized in that the classifier distance function φ(O_j,O_h) in step S5-1 is computed as follows: the bootstraps O_j and O_h can be regarded as two Gaussian mixture distributions, denoted Ω_j and Ω_h respectively; for the two Gaussian mixture models, each component of Ω_j and of Ω_h carries a corresponding weight, K₁ and K₂ being respectively the numbers of components of Ω_j and Ω_h; φ(O_j,O_h) is computed from the Bhattacharyya distances between pairs of Gaussian components, each Bhattacharyya distance being determined by the mean vectors and covariance matrices of the two Gaussian distributions concerned.
9. The progressive ensemble classification method according to claim 7, characterized in that the specific method of step S6 is: after the computation of each classifier branch, the prediction label of each branch for the sample is obtained; to turn these prediction labels into the final prediction y*, a weighted vote is required; denote by χ_g the g-th classifier of the ensemble classifier set and its prediction labels for all samples; the prediction label of the i-th sample is obtained accordingly, c ∈ {0,1,…,k−1} being a specific sample label and k the total number of classes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710081412.6A CN106934414A (en) | 2017-02-15 | 2017-02-15 | Progressive ensemble classification method based on noisy-label data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710081412.6A CN106934414A (en) | 2017-02-15 | 2017-02-15 | Progressive ensemble classification method based on noisy-label data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106934414A true CN106934414A (en) | 2017-07-07 |
Family
ID=59423237
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710081412.6A Pending CN106934414A (en) | 2017-02-15 | 2017-02-15 | It is a kind of based on the gradual Ensemble classifier method with noise label data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106934414A (en) |
- 2017-02-15 CN CN201710081412.6A patent/CN106934414A/en active Pending
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107451101A (en) * | 2017-07-21 | 2017-12-08 | 江南大学 | It is a kind of to be layered integrated Gaussian process recurrence soft-measuring modeling method |
CN107451101B (en) * | 2017-07-21 | 2020-06-09 | 江南大学 | Method for predicting concentration of butane at bottom of debutanizer by hierarchical integrated Gaussian process regression soft measurement modeling |
CN108021941A (en) * | 2017-11-30 | 2018-05-11 | 四川大学 | Use in medicament-induced hepatotoxicity Forecasting Methodology and device |
CN108021941B (en) * | 2017-11-30 | 2020-08-28 | 四川大学 | Method and device for predicting drug hepatotoxicity |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Karthika et al. | A Naïve Bayesian classifier for educational qualification | |
Buscema et al. | Training with input selection and testing (TWIST) algorithm: a significant advance in pattern recognition performance of machine learning | |
CN100585617C | Face recognition system and method based on classifier ensemble | |
CN108090510A | Ensemble learning method and device based on interval optimization | |
CN103473556B | Hierarchical SVM classification method based on rejection subspace | |
CN106228183A | Semi-supervised learning classification method and device | |
CN103927550B | Handwritten numeral recognition method and system | |
CN106126972A | Hierarchical multi-label classification method for protein function prediction | |
Dehuri et al. | A hybrid genetic based functional link artificial neural network with a statistical comparison of classifiers over multiple datasets | |
CN105760888A | Neighborhood rough set ensemble learning method based on attribute clustering | |
CN109165672A | Ensemble classification method based on incremental learning | |
Sushil et al. | Rule induction for global explanation of trained models | |
CN106326843A | Face recognition method | |
CN104966106A | Step-by-step biological age prediction method based on support vector machine | |
Patacsil | Survival analysis approach for early prediction of student dropout using enrollment student data and ensemble models | |
CN106934414A | Progressive ensemble classification method based on noisy-label data | |
CN114049527A | Self-knowledge distillation method and system based on online cooperation and fusion | |
Kumar et al. | Analysis of feature selection and data mining techniques to predict student academic performance | |
CN116306785A | Student performance prediction method based on attention-mechanism convolutional long short-term network | |
Hastarimasuci et al. | Variable Selection to Determine Majors of Student using K-Nearest Neighbor and Naïve Bayes Classifier Algorithm | |
Ntoutsi et al. | A general framework for estimating similarity of datasets and decision trees: exploring semantic similarity of decision trees | |
CN109858543 | Image memorability prediction method based on low-rank sparse representation and relational inference | |
CN114997175 | Sentiment analysis method based on domain adversarial training | |
US20220188647A1 | Model learning apparatus, data analysis apparatus, model learning method and program | |
Gaber et al. | Optimisation of ensemble classifiers using genetic algorithm | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170707 |