CN106599906A - Multiple kernel learning classification method based on noise probability function - Google Patents

Multiple kernel learning classification method based on noise probability function

Info

Publication number
CN106599906A
CN106599906A (application CN201611052894.4A)
Authority
CN
China
Prior art keywords
noise probability
noise
sigma
probability function
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611052894.4A
Other languages
Chinese (zh)
Inventor
武德安
冯杰
吴磊
陈鹏
冯江远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Gkhb Information Technology Co ltd
University of Electronic Science and Technology of China
Original Assignee
Chengdu Gkhb Information Technology Co ltd
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Gkhb Information Technology Co ltd, University of Electronic Science and Technology of China filed Critical Chengdu Gkhb Information Technology Co ltd
Priority to CN201611052894.4A priority Critical patent/CN106599906A/en
Publication of CN106599906A publication Critical patent/CN106599906A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147Distances to closest patterns, e.g. nearest neighbour classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a multiple kernel learning classification method based on a noise probability function. The method comprises the following steps: calculation of the noise probability function; selection of a base classifier f_t*(x) and calculation of the corresponding coefficient a_t* in each round of iteration; and update of the weights. The method is suitable for classifying data sets polluted by noise. Its advantages are that it does not require solving a complex optimization problem, its amount of computation is small compared with conventional multiple kernel learning methods, it effectively solves the noise-sensitivity problem of conventional multiple kernel boosting learning, and it has better robustness.

Description

Multiple kernel learning classification method based on a noise probability function
Technical field
The present invention relates to a multiple kernel learning classification method based on a noise probability function, and belongs to the field of data mining technology.
Background technology
The linear support vector machine (SVM) was proposed by Cortes and Vapnik. As research on SVMs deepened, they penetrated into numerous areas of machine learning, such as pattern classification, regression estimation, and probability density estimation. The SVM has achieved great success, but it belongs to single kernel learning and therefore has certain limitations.
In the field of machine learning, multiple kernel learning has received increasing attention because, compared with single kernel learning, it can better cope with huge sample feature sets, heterogeneous information, irregular multidimensional data, and uneven distribution of data in high-dimensional feature spaces.
In recent years a variety of effective multiple kernel learning theories and methods have appeared. In 2004, Lanckriet, Bartlett et al. proposed a learning method based on semidefinite programming, and in the same year Bach, Jordan et al. proposed an optimization method based on quadratically constrained quadratic programming. In 2006, Sonnenburg, Rätsch et al. proposed a learning method based on semi-infinite linear programming, and in the same year Smola, Rätsch et al. proposed a learning method based on hyperkernels. In 2007, Rakotomamonjy et al. proposed the simple multiple kernel learning method (SimpleMKL), and in 2011 Tao Jianwen and Wang Shitong proposed a multiple kernel local learning-based domain adaptation method.
The above methods have achieved some success in different application fields, but these traditional multiple kernel learning methods require solving a complex optimization problem, involve a large amount of computation, and converge with difficulty. In 2012, Hao Xia and Steven Hoi proposed MKBoost, an algorithmic framework for ensemble multiple kernel learning; their experimental results show that the algorithm greatly reduces the amount of computation while retaining high accuracy, but the introduction of boosting ideas also brings sensitivity to noise.
Summary of the invention
The purpose of the present invention is to solve the above problems by providing a multiple kernel learning classification method based on a noise probability function. The method does not require solving a complex optimization problem, involves little computation, and effectively solves the problem of sensitivity to noise.
The present invention achieves the above purpose through the following technical solution:
A multiple kernel learning classification method based on a noise probability function, comprising the following steps:
(1) calculation of the noise probability function;
(2) selection of the base classifier f_t*(x) and calculation of the corresponding coefficient a_t* in each round of iteration;
(3) update of the weights.
Preferably, in step (1), the noise probability function is calculated as follows, wherein

$$\bar{u} = \sum_{i=1}^{N} u_{KNN}(x_i, y_i) / N,$$

Z_i is the set of the K nearest neighbors of the sample (x_i, y_i), f(x) is the base classifier, y_j is the true class, u_KNN(x_i, y_i) is the noise detection result, $\bar{u}$ is the mean value of u_KNN(x_i, y_i) under the base classifier f(x), and λ is a manually set parameter. The more misclassified samples the set Z_i contains, the larger the probability that the sample (x_i, y_i) is noise.
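The k-nearest-neighbor noise detection described above can be sketched in code. Since the source does not reproduce the image of the noise probability formula itself, the sketch below computes u_KNN(x_i, y_i) as the fraction of the K nearest neighbors whose labels disagree with y_i, computes the mean ū, and squashes the centered value through a logistic function with scale λ; the function name and the logistic form are illustrative assumptions, not the patent's exact formula.

```python
import numpy as np

def noise_probability(X, y, K=7, lam=8.6):
    """Per-sample noise probability from K-nearest-neighbor label
    disagreement (a sketch; the exact functional form in the patent is
    not reproduced in the source, so a logistic squashing with scale
    lam is assumed here)."""
    N = len(y)
    # Pairwise Euclidean distances (the metric used in the experiments)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # a sample is not its own neighbor
    u = np.empty(N)
    for i in range(N):
        Z_i = np.argsort(d[i])[:K]       # indices of the K nearest neighbors
        u[i] = np.mean(y[Z_i] != y[i])   # fraction of disagreeing neighbors
    u_bar = u.mean()                     # \bar{u} = sum_i u_KNN(x_i, y_i) / N
    # Assumed squashing: more disagreement than average -> higher noise prob.
    return 1.0 / (1.0 + np.exp(-lam * (u - u_bar)))
```

A mislabeled point sitting inside a cluster of the opposite class receives a disagreement fraction near 1 and hence a noise probability close to 1, matching the statement that more misclassified neighbors imply a larger noise probability.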
In step (2), a loss function L(y, f(x)) is defined based on the noise probability function. The loss function is minimized; the base classifier f_t*(x) in the t-th round of iteration is then selected, and its corresponding coefficient a_t* calculated, as

$$f_t^*(x) = \arg\min_{f(x)} \sum_{f(x_i) \neq y_i} \left( w_{1i}^t - w_{2i}^t \right),$$

$$a_t^* = \frac{1}{2}\log\left(\frac{\sum_{f(x_i)\neq y_i} w_{2i}^t + \sum_{f(x_i)=y_i} w_{1i}^t}{\sum_{f(x_i)\neq y_i} w_{1i}^t + \sum_{f(x_i)=y_i} w_{2i}^t}\right),$$

where w_{1i}^t and w_{2i}^t are per-sample weights and F_{t-1}(x_i) denotes the combined classifier obtained after the (t-1)-th round of iteration.
In step (3), the sample noise probabilities under the M kernel functions are used to initialize the coefficients w_{1i}^1 and w_{2i}^1 related to the selection of the base classifier, and the sample weights

$$D_i^1 = \frac{w_{1i}^1 + w_{2i}^1}{\sum_{i=1}^{N}\left(w_{1i}^1 + w_{2i}^1\right)}.$$

Given the data of the t-th round of iteration, the weights are updated as follows:

$$D_i^{t+1} = \left(w_{1i}^t + w_{2i}^t\right) \Big/ \sum_{i=1}^{N}\left(w_{1i}^t + w_{2i}^t\right),$$

$$w_{2i}^{t+1} = w_{2i}^t \exp\!\left(y_i a_t^* f_t^*(x_i)\right),$$

$$w_{1i}^{t+1} = w_{1i}^t \exp\!\left(-y_i a_t^* f_t^*(x_i)\right).$$
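One round of the boosting procedure in steps (2) and (3) can be sketched as follows, using the selection rule, the coefficient a_t*, and the weight updates stated in claim 2. Candidate base classifiers are represented simply as prediction vectors, and the weights w1 and w2 (whose initialization from the noise probabilities is not fully reproduced in the source) are taken as given inputs; all names are illustrative.

```python
import numpy as np

def boosting_round(preds_list, y, w1, w2):
    """One round of the noise-probability boosting step (a sketch).

    preds_list : candidate base-classifier prediction vectors in {-1, +1}
    y          : true labels in {-1, +1}
    w1, w2     : per-sample weight vectors carried across rounds
    Returns the index of the selected base classifier, its coefficient
    a_t*, and the updated (w1, w2, D).
    """
    # f_t* = argmin over candidates of sum_{f(x_i) != y_i} (w1_i - w2_i)
    scores = [np.sum((w1 - w2)[p != y]) for p in preds_list]
    best = int(np.argmin(scores))
    p = preds_list[best]
    err = p != y
    # a_t* = 1/2 log( (sum_err w2 + sum_ok w1) / (sum_err w1 + sum_ok w2) )
    num = w2[err].sum() + w1[~err].sum()
    den = w1[err].sum() + w2[~err].sum()
    a = 0.5 * np.log(num / den)
    # Weight updates as given in claim 2
    w1_new = w1 * np.exp(-y * a * p)
    w2_new = w2 * np.exp(y * a * p)
    D = (w1_new + w2_new) / np.sum(w1_new + w2_new)
    return best, a, w1_new, w2_new, D
```

Note that an error-free candidate achieves the minimum possible selection score of zero and receives a positive coefficient, as in standard boosting.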
The beneficial effects of the present invention are as follows: the multiple kernel learning classification method based on a noise probability function is suitable for classifying data sets polluted by noise. Its advantages are that it does not require solving a complex optimization problem, its amount of computation is smaller than that of traditional multiple kernel learning methods, it effectively solves the noise-sensitivity problem of traditional multiple kernel boosting learning, and its robustness is better.
Specific embodiment
The invention will be further described below with reference to an embodiment:
The multiple kernel learning classification method based on a noise probability function of the present invention comprises the following steps:
(1) calculation of the noise probability function;
(2) selection of the base classifier f_t*(x) and calculation of the corresponding coefficient a_t* in each round of iteration;
(3) update of the weights.
In step (1), the noise probability function is calculated as follows, wherein

$$\bar{u} = \sum_{i=1}^{N} u_{KNN}(x_i, y_i) / N,$$

Z_i is the set of the K nearest neighbors of the sample (x_i, y_i), f(x) is the base classifier, y_j is the true class, u_KNN(x_i, y_i) is the noise detection result, $\bar{u}$ is the mean value of u_KNN(x_i, y_i) under the base classifier f(x), and λ is a manually set parameter. The more misclassified samples the set Z_i contains, the larger the probability that the sample (x_i, y_i) is noise.
In step (2), a loss function L(y, f(x)) is defined based on the noise probability function and minimized; the base classifier f_t*(x) in the t-th round of iteration is then selected, and its corresponding coefficient a_t* calculated, as

$$f_t^*(x) = \arg\min_{f(x)} \sum_{f(x_i) \neq y_i} \left( w_{1i}^t - w_{2i}^t \right),$$

$$a_t^* = \frac{1}{2}\log\left(\frac{\sum_{f(x_i)\neq y_i} w_{2i}^t + \sum_{f(x_i)=y_i} w_{1i}^t}{\sum_{f(x_i)\neq y_i} w_{1i}^t + \sum_{f(x_i)=y_i} w_{2i}^t}\right),$$

where w_{1i}^t and w_{2i}^t are per-sample weights and F_{t-1}(x_i) denotes the combined classifier obtained after the (t-1)-th round of iteration.
In step (3), the sample noise probabilities under the M kernel functions are used to initialize the coefficients w_{1i}^1 and w_{2i}^1 related to the selection of the base classifier, and the sample weights

$$D_i^1 = \frac{w_{1i}^1 + w_{2i}^1}{\sum_{i=1}^{N}\left(w_{1i}^1 + w_{2i}^1\right)}.$$

Given the data of the t-th round of iteration, the weights are updated as follows:

$$D_i^{t+1} = \left(w_{1i}^t + w_{2i}^t\right) \Big/ \sum_{i=1}^{N}\left(w_{1i}^t + w_{2i}^t\right),$$

$$w_{2i}^{t+1} = w_{2i}^t \exp\!\left(y_i a_t^* f_t^*(x_i)\right),$$

$$w_{1i}^{t+1} = w_{1i}^t \exp\!\left(-y_i a_t^* f_t^*(x_i)\right).$$
Embodiment:
In order to verify the correctness and effectiveness of this method, we conducted experiments on 6 UCI data sets. For each data set, 8 kernel functions (5 Gaussian kernel functions and 3 polynomial kernel functions) were used. The data sets are listed in Table 1 below:
Table 1: Information of the UCI data sets
Datasets Samples Features Classes
Balance-scale 567 4 2
Breast-cancer 569 32 2
Ionosphere 351 34 2
Blood-transfusion 748 5 2
Diabetic-retinopathy 1151 20 2
Pima-indians 768 8 2
In the experiments, the class labels of 10%, 20%, and 30% of the training samples were randomly changed to obtain training sets with different noise levels. In the sample noise probability calculation, K = 7 was used in the K-nearest-neighbor (KNN) method, Euclidean distance was used as the distance metric, and λ = 8.6 in the noise probability function. Under each noise level, the experiment was repeated 30 times on each data set; the experimental results are the averages of the 30 runs, as shown in Table 2:
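The construction of the noisy training sets described above, i.e. randomly flipping a given fraction of the binary class labels, can be sketched as follows; the helper name and the use of a seeded NumPy generator are illustrative assumptions.

```python
import numpy as np

def add_label_noise(y, noise_level, seed=0):
    """Return a copy of the binary labels y (in {-1, +1}) with a given
    fraction of randomly chosen entries flipped, as in the experiments
    (10%, 20%, 30% noise levels)."""
    rng = np.random.default_rng(seed)
    y_noisy = np.asarray(y).copy()
    n_flip = int(round(noise_level * len(y_noisy)))
    idx = rng.choice(len(y_noisy), size=n_flip, replace=False)
    y_noisy[idx] = -y_noisy[idx]   # flip the selected labels
    return y_noisy
```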
As shown in Table 2, in the noise-free case the test errors of the three algorithms are comparable. At a noise level of 10%, the MKB_NP algorithm performs better than the other two algorithms on the three data sets Balance-scale, Ionosphere, and Pima-indians; at a noise level of 20%, the new algorithm achieves the lowest test error on the three data sets Balance-scale, Blood-transfusion, and Diabetic-retinopathy; at a noise level of 30%, the MKB_NP algorithm performs best on the four data sets Balance-scale, Breast-cancer, Blood-transfusion, and Pima-indians.
In summary, MKB_NP, the method of the present invention, outperforms the MKB_D1 and MKB_D2 algorithms on the 6 data sets; when classifying data with higher noise levels, it is less sensitive to noisy data, has smaller training error, and shows better robustness.
The above embodiment is a preferred embodiment of the present invention and is not a limitation of the technical solution of the present invention. Any technical solution that can be realized on the basis of the above embodiment without creative work shall be regarded as falling within the scope of protection of the present patent.

Claims (2)

1. A multiple kernel learning classification method based on a noise probability function, characterized in that it comprises the following steps:
(1) calculation of the noise probability function;
(2) selection of the base classifier f_t*(x) and calculation of the corresponding coefficient a_t* in each round of iteration;
(3) update of the weights.
2. The multiple kernel learning classification method based on a noise probability function according to claim 1, characterized in that: in step (1), the noise probability function is calculated as follows, wherein

$$\bar{u} = \sum_{i=1}^{N} u_{KNN}(x_i, y_i) / N,$$

Z_i is the set of the K nearest neighbors of the sample (x_i, y_i), f(x) is the base classifier, y_j is the true class, u_KNN(x_i, y_i) is the noise detection result, $\bar{u}$ is the mean value of u_KNN(x_i, y_i) under the base classifier f(x), and λ is a manually set parameter. The more misclassified samples the set Z_i contains, the larger the probability that the sample (x_i, y_i) is noise.
In step (2), a loss function L(y, f(x)) is defined based on the noise probability function and minimized; the base classifier f_t*(x) in the t-th round of iteration is then selected, and its corresponding coefficient a_t* calculated, as

$$f_t^*(x) = \arg\min_{f(x)} \sum_{f(x_i) \neq y_i} \left( w_{1i}^t - w_{2i}^t \right),$$

$$a_t^* = \frac{1}{2}\log\left(\frac{\sum_{f(x_i)\neq y_i} w_{2i}^t + \sum_{f(x_i)=y_i} w_{1i}^t}{\sum_{f(x_i)\neq y_i} w_{1i}^t + \sum_{f(x_i)=y_i} w_{2i}^t}\right),$$

where F_{t-1}(x_i) denotes the combined classifier obtained after the (t-1)-th round of iteration.
In step (3), the sample noise probabilities under the M kernel functions are used to initialize the coefficients w_{1i}^1 and w_{2i}^1 related to the selection of the base classifier, and the sample weights

$$D_i^1 = \frac{w_{1i}^1 + w_{2i}^1}{\sum_{i=1}^{N}\left(w_{1i}^1 + w_{2i}^1\right)}.$$

Given the data of the t-th round of iteration, the weights are updated as follows:

$$D_i^{t+1} = \left(w_{1i}^t + w_{2i}^t\right) \Big/ \sum_{i=1}^{N}\left(w_{1i}^t + w_{2i}^t\right),$$

$$w_{2i}^{t+1} = w_{2i}^t \exp\!\left(y_i a_t^* f_t^*(x_i)\right),$$

$$w_{1i}^{t+1} = w_{1i}^t \exp\!\left(-y_i a_t^* f_t^*(x_i)\right).$$
CN201611052894.4A 2016-11-25 2016-11-25 Multiple kernel learning classification method based on noise probability function Pending CN106599906A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611052894.4A CN106599906A (en) 2016-11-25 2016-11-25 Multiple kernel learning classification method based on noise probability function

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611052894.4A CN106599906A (en) 2016-11-25 2016-11-25 Multiple kernel learning classification method based on noise probability function

Publications (1)

Publication Number Publication Date
CN106599906A true CN106599906A (en) 2017-04-26

Family

ID=58593335

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611052894.4A Pending CN106599906A (en) 2016-11-25 2016-11-25 Multiple kernel learning classification method based on noise probability function

Country Status (1)

Country Link
CN (1) CN106599906A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107480715A (en) * 2017-08-10 2017-12-15 合肥工业大学 The method for building up and system of the transmission device failure predication model of hydroforming equipment
CN109359677A (en) * 2018-10-09 2019-02-19 中国石油大学(华东) A kind of resistance to online kernel-based learning method of classifying of making an uproar more
CN109359677B (en) * 2018-10-09 2021-11-23 中国石油大学(华东) Noise-resistant online multi-classification kernel learning algorithm


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20170426