CN106022356B - A kind of multiple view GEPSVM Web page classification method based on gradient descent method - Google Patents
A multi-view GEPSVM web page classification method based on gradient descent
- Publication number
- CN106022356B · CN201610307835.0A · CN201610307835A
- Authority
- CN
- China
- Prior art keywords
- view
- webpage
- sample data
- hyperplane
- mvgdsvm
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/958—Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
Abstract
The invention proposes a multi-view GEPSVM web page classification algorithm based on gradient descent, comprising an MvGDSVM web page classification model parameter training step and a web data classification step. The MvGDSVM web page classification model parameter training step includes: Step A: input webpage training sample data; Step B: preprocess the webpage training sample data; Step C: train the MvGDSVM web page classification model parameters. The web data classification step includes: Step a: input the webpage sample data to be tested; Step b: standardize the webpage sample data to be tested; Step c: classify the webpage sample data to be tested with the MvGDSVM web page classification model. The proposed algorithm maximizes the consistency of the classifications on different views by introducing a multi-view co-regularization term, thereby effectively combining two single-view improved generalized eigenvalue proximal support vector machines; the resulting optimization problem is solved with the conjugate gradient descent method.
Description
Technical field
The present invention relates to the field of web page classification technology, and in particular to a multi-view GEPSVM web page classification algorithm based on gradient descent (abbreviated MvGDSVM web page classification algorithm).
Background technique
In recent years, with the popularity of the Internet, the amount of network information has grown exponentially, and the Internet has become an important means for people to obtain information. Faced with massive and heterogeneous network information, users often cannot locate the information they want; through web page classification, the information a user is interested in can be obtained quickly and accurately from the massive network information.
At present, both the generalized eigenvalue proximal support vector machine (GEPSVM) and the improved generalized eigenvalue proximal support vector machine (IGEPSVM) are simple and effective classification methods.
1. generalized eigenvalue proximal support vector machine
A) linear GEPSVM
The generalized eigenvalue proximal support vector machine is a simple and effective binary classification method in supervised learning. It classifies data points using two hyperplanes, each of which is as close as possible to one of the two classes and as far as possible from the other. GEPSVM obtains these two non-parallel hyperplanes by solving a pair of generalized eigenvalue problems.
Assume that the real space R^d contains n sample points with labels y_i ∈ {+1, −1} (i = 1, 2, …, n). Matrix A ∈ R^{n1×d} collects the samples belonging to class +1 and matrix B ∈ R^{n2×d} collects the samples belonging to class −1 (n1 + n2 = n).
Define the following two hyperplanes in R^d:
x^T w1 + γ1 = 0,   x^T w2 + γ2 = 0.   (1)
GEPSVM abandons the parallelism requirement that the standard SVM imposes on the two closest hyperplanes, and instead requires: the first hyperplane is as close as possible to the sample points of class +1 and as far as possible from those of class −1; the second hyperplane is as close as possible to the sample points of class −1 and as far as possible from those of class +1. This decision objective of GEPSVM yields the following pair of optimization problems:
min_{(w1,γ1)≠0} (‖A w1 + e γ1‖² / ‖[w1; γ1]‖²) / (‖B w1 + e γ1‖² / ‖[w1; γ1]‖²)   (2)
and
min_{(w2,γ2)≠0} (‖B w2 + e γ2‖² / ‖[w2; γ2]‖²) / (‖A w2 + e γ2‖² / ‖[w2; γ2]‖²),   (3)
where e is a column vector of ones of appropriate dimension.
Here ‖·‖ denotes the 2-norm. The two optimization problems above can be simplified to:
min_{(w1,γ1)≠0} ‖A w1 + e γ1‖² / ‖B w1 + e γ1‖²   (4)
and
min_{(w2,γ2)≠0} ‖B w2 + e γ2‖² / ‖A w2 + e γ2‖².   (5)
Introducing a Tikhonov regularization term, (4) and (5) can be regularized into:
min_{(w1,γ1)≠0} (‖A w1 + e γ1‖² + ε ‖[w1; γ1]‖²) / ‖B w1 + e γ1‖²   (6)
and
min_{(w2,γ2)≠0} (‖B w2 + e γ2‖² + ε ‖[w2; γ2]‖²) / ‖A w2 + e γ2‖².   (7)
Here ε is a non-negative weight coefficient.
Make the following definitions:
G = [A e]^T [A e],   H = [B e]^T [B e],   z1 = [w1; γ1],   z2 = [w2; γ2],   (8)
where G and H are symmetric matrices in R^{(d+1)×(d+1)}, and z1 and z2 are hyperplane parameter vectors in R^{d+1}. Then optimization problems (6) and (7) can be abbreviated as:
min_{z1≠0} z1^T (G + εI) z1 / (z1^T H z1)   (9)
and
min_{z2≠0} z2^T (H + εI) z2 / (z2^T G z2).   (10)
Optimization problems (9) and (10) above are exact Rayleigh quotients, so their globally optimal solutions can be obtained by solving the following related generalized eigenvalue problems:
(G + εI) z1 = λ H z1,   z1 ≠ 0,   (11)
and
(H + εI) z2 = λ G z2,   z2 ≠ 0.   (12)
The first and second optimal proximal hyperplanes are given by the eigenvectors of the generalized eigenvalue problems (11) and (12) corresponding to their smallest eigenvalues.
Clearly, for a test sample x, the prediction function of linear GEPSVM is
class(x) = arg min_{i=1,2} |x^T w_i + γ_i| / ‖w_i‖,   (13)
where |·| is the absolute value function and |x^T w_i + γ_i| / ‖w_i‖ is the perpendicular distance of x to the i-th classifying hyperplane. This prediction function assigns x to class +1 if x is closer to the first classifying hyperplane, and to class −1 otherwise.
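As an illustration, the training and prediction steps above can be sketched numerically. This is a minimal sketch under our own naming (not the patent's code): it solves the generalized eigenvalue problems (11)–(12) and applies the prediction rule (13).

```python
import numpy as np
from scipy.linalg import eig

def gepsvm_fit(A, B, eps=1e-4):
    """Linear GEPSVM: solve (G+eps*I)z1 = lam*H*z1 and (H+eps*I)z2 = lam*G*z2,
    keeping the eigenvector of the smallest eigenvalue (z = [w; gamma])."""
    Ae = np.hstack([A, np.ones((A.shape[0], 1))])
    Be = np.hstack([B, np.ones((B.shape[0], 1))])
    G, H = Ae.T @ Ae, Be.T @ Be
    I = np.eye(G.shape[0])

    def smallest(M, N):
        vals, vecs = eig(M, N)
        return np.real(vecs[:, np.argmin(np.real(vals))])

    z1 = smallest(G + eps * I, H)   # hyperplane close to class +1
    z2 = smallest(H + eps * I, G)   # hyperplane close to class -1
    return z1, z2

def gepsvm_predict(x, z1, z2):
    """Prediction function (13): assign the class of the nearer hyperplane."""
    xe = np.append(x, 1.0)
    d1 = abs(xe @ z1) / np.linalg.norm(z1[:-1])
    d2 = abs(xe @ z2) / np.linalg.norm(z2[:-1])
    return 1 if d1 <= d2 else -1
```

For instance, with class +1 lying near the line x2 = x1 and class −1 near x2 = −x1, a point near the first line is assigned to class +1.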
B) Kernel GEPSVM
Linear GEPSVM can be generalized to the nonlinear case through the kernel method. Consider two kernel-generated hyperplanes in place of the planes (1):
K(x^T, C^T) w1 + γ1 = 0,   K(x^T, C^T) w2 + γ2 = 0,   (14)
where C = [A; B] and K is a kernel function. Here, the common Gaussian kernel is mainly considered; its ij-th element is given as follows:
K(x_i, x_j) = exp(−μ ‖x_i − x_j‖²),   (15)
where μ is a Gaussian kernel parameter. Note that the planes (1) are in fact a special case of (14): with a linear kernel K(x^T, C^T) = x^T C^T, (14) reduces to a linear hyperplane with weight vector C^T w_i.
The subsequent training process for obtaining the two hyperplanes has the same form as the training process of linear GEPSVM.
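For concreteness, the Gaussian kernel matrix of (15) can be computed as below; the function name and its vectorized form are ours, not the patent's.

```python
import numpy as np

def gaussian_kernel(X, C, mu=0.5):
    """Return K with K[i, j] = exp(-mu * ||X[i] - C[j]||^2)."""
    sq = (np.sum(X**2, axis=1)[:, None] + np.sum(C**2, axis=1)[None, :]
          - 2.0 * X @ C.T)
    return np.exp(-mu * np.maximum(sq, 0.0))  # clip tiny negatives from rounding
```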
For a test sample x, the prediction function of kernel GEPSVM is
class(x) = arg min_{i=1,2} dist_K(x, i),   (16)
where dist_K(x, 1) denotes the kernel-based distance of x to the first classifying hyperplane and dist_K(x, 2) the kernel-based distance of x to the second. This prediction function assigns x to class +1 if x is closer to the first classifying hyperplane, and to class −1 otherwise.
2. Improved generalized eigenvalue proximal support vector machine
A) Linear IGEPSVM
The decision objective of GEPSVM yields the following pair of optimization problems:
min_{(w1,γ1)≠0} ‖A w1 + e γ1‖² / ‖B w1 + e γ1‖²   (17)
and
min_{(w2,γ2)≠0} ‖B w2 + e γ2‖² / ‖A w2 + e γ2‖²,   (18)
where ‖·‖ denotes the 2-norm.
GEPSVM can run into singularity problems in the generalized eigenvalue decomposition. To overcome this defect, IGEPSVM measures the difference between the distances of the two classes of samples to a classifying hyperplane by subtraction instead of division. Optimization problems (17) and (18) are then converted to:
min_{(w1,γ1)} ‖A w1 + e γ1‖² − ν ‖B w1 + e γ1‖²   (19)
and
min_{(w2,γ2)} ‖B w2 + e γ2‖² − ν ‖A w2 + e γ2‖².   (20)
Here ν is a weight coefficient. To eliminate the influence of the norm of the hyperplane variables (w_i, γ_i) (i = 1, 2), a Tikhonov regularization term is introduced. Then, using the definitions (8), optimization problems (19) and (20) can be regularized into:
min_{z1^T z1 = 1} z1^T (G + εI − νH) z1   (21)
and
min_{z2^T z2 = 1} z2^T (H + εI − νG) z2.   (22)
Both optimization problems above are exact Rayleigh-quotient-type problems. The Lagrangian of formula (21) can be written as:
L(z1, λ1) = z1^T (G + εI − νH) z1 − λ1 (z1^T z1 − 1),   (23)
where λ1 is a Lagrange multiplier (λ2 plays the same role for (22)). Setting the partial derivatives of (23) with respect to (z1, λ1) to zero yields the equation
2 (G + εI − νH) z1 − 2 λ1 z1 = 0.
Therefore, the globally optimal solution of optimization problem (21) can be obtained by solving the following eigenvalue problem:
(G + εI − νH) z1 = λ1 z1.   (24)
Similarly, the globally optimal solution of optimization problem (22) can be obtained from the eigenvalue problem
(H + εI − νG) z2 = λ2 z2.   (25)
The first and second optimal proximal hyperplanes are given by the eigenvectors of the eigenvalue problems (24) and (25) corresponding to their smallest eigenvalues.
Clearly, for a test sample x, the prediction function of linear IGEPSVM is
class(x) = arg min_{i=1,2} |x^T w_i + γ_i| / ‖w_i‖.   (26)
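The eigenvalue problems (24)–(25) are ordinary symmetric eigenproblems, so IGEPSVM training reduces to two calls to a symmetric eigensolver. A minimal sketch with our own naming:

```python
import numpy as np

def igepsvm_fit(A, B, nu=1.0, eps=1e-4):
    """Linear IGEPSVM: smallest-eigenvalue eigenvectors of the symmetric
    matrices G + eps*I - nu*H and H + eps*I - nu*G (z = [w; gamma])."""
    Ae = np.hstack([A, np.ones((A.shape[0], 1))])
    Be = np.hstack([B, np.ones((B.shape[0], 1))])
    G, H = Ae.T @ Ae, Be.T @ Be
    I = np.eye(G.shape[0])
    # np.linalg.eigh returns eigenvalues in ascending order: column 0 is smallest
    z1 = np.linalg.eigh(G + eps * I - nu * H)[1][:, 0]  # close to class +1
    z2 = np.linalg.eigh(H + eps * I - nu * G)[1][:, 0]  # close to class -1
    return z1, z2
```

Unlike GEPSVM, no matrix appears on the right-hand side, so no generalized (and possibly singular) eigenproblem has to be solved.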
B) Kernel IGEPSVM
Linear IGEPSVM can be generalized to the nonlinear case through the kernel method. Consider two kernel-generated hyperplanes of the form (14) in place of the planes (1), where K is the kernel function mentioned above.
The subsequent training process has the same form as the training process of linear IGEPSVM, and the prediction rule of kernel IGEPSVM is identical to that of kernel GEPSVM.
However, at present neither the generalized eigenvalue proximal support vector machine nor the improved generalized eigenvalue proximal support vector machine has been applied widely to web page classification, because both are single-view classification algorithms and therefore have certain limitations: they cannot make full use of the feature information on the multiple views of a web page, so there is still room to improve web page classification accuracy.
Summary of the invention
The present invention proposes a multi-view GEPSVM algorithm for web page classification, which can make full use of the information on multiple views of a web page to improve classification performance.
The multi-view GEPSVM web page classification algorithm based on gradient descent proposed by the present invention comprises an MvGDSVM web page classification model parameter training step and a web data classification step;
The MvGDSVM Web page classifying model parameter training step includes:
Step A: input webpage training sample data;
Step B: the webpage training sample data are pre-processed;
Step C: training MvGDSVM Web page classifying model parameter;
The web data classifying step includes:
Step a: webpage sample data to be measured is inputted;
Step b: pretreatment is standardized to the webpage sample data to be measured;
Step c: classified by MvGDSVM Web page classifying model to the webpage sample data to be measured.
In the proposed multi-view GEPSVM web page classification algorithm based on gradient descent, the preprocessing in step B includes:
Step B1: determine the feature vector on each view of the webpage training sample data;
Step B2: standardize the feature vectors on each view of all webpage training sample data.
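Steps B1–B2 amount to a per-view standardization. A minimal sketch, assuming column-wise z-scoring (the patent does not spell out the exact scheme) and our own function name:

```python
import numpy as np

def standardize_views(views):
    """Z-score each view's feature matrix column-wise (fit on training data).
    Returns the standardized matrices and the (mean, std) pairs so the same
    transform can later be applied to the samples to be tested."""
    out, stats = [], []
    for X in views:
        mu, sd = X.mean(axis=0), X.std(axis=0)
        sd = np.where(sd == 0, 1.0, sd)   # guard constant features
        out.append((X - mu) / sd)
        stats.append((mu, sd))
    return out, stats
```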
In the proposed algorithm, in step C, the consistency of the classifications on different views is maximized through a multi-view co-regularization term.
In the proposed algorithm, step C includes:
Step C1: maximize, on each view, the difference between the distances of the two classes of samples to the hyperplanes, while minimizing the disagreement between the results of the two hypothesis functions applied to the different views of the same webpage training sample;
Step C2: optimize the objective function with the conjugate gradient descent method, and provide the gradient of the objective function.
In the proposed algorithm, step C further includes:
Step C3: obtain the classifying hyperplane parameters with MvGDSVM;
Step C4: compute, on each view, the perpendicular distances of the webpage training sample to the two hyperplanes, and obtain the prediction result of the decision function.
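Step C2 can be sketched as conjugate-gradient minimization with an analytic gradient. The quadratic-over-norm terms and the bilinear coupling below are stand-ins for the patent's actual objective (28) and co-regularization term, chosen only to show the mechanics; symmetric matrices M1, M2 and all names are our assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def rayleigh(M, z):
    """Rayleigh quotient z'Mz / z'z for symmetric M."""
    return (z @ M @ z) / (z @ z)

def mv_objective(z, M1, M2, S, d1, delta):
    z1, z2 = z[:d1], z[d1:]
    u, v = z1 / np.linalg.norm(z1), z2 / np.linalg.norm(z2)
    return rayleigh(M1, z1) + rayleigh(M2, z2) + delta * (u @ S @ v)

def mv_gradient(z, M1, M2, S, d1, delta):
    """Analytic gradient of mv_objective (valid for symmetric M1, M2)."""
    z1, z2 = z[:d1], z[d1:]
    n1, n2 = np.linalg.norm(z1), np.linalg.norm(z2)
    u, v = z1 / n1, z2 / n2
    c = u @ S @ v
    g1 = 2 * (M1 @ z1 - rayleigh(M1, z1) * z1) / (z1 @ z1) + delta * (S @ v - c * u) / n1
    g2 = 2 * (M2 @ z2 - rayleigh(M2, z2) * z2) / (z2 @ z2) + delta * (S.T @ u - c * v) / n2
    return np.concatenate([g1, g2])

def fit_pair(M1, M2, S, z0, delta=1.0):
    """Conjugate-gradient descent from starting point z0 = [z1; z2]."""
    res = minimize(mv_objective, z0, args=(M1, M2, S, M1.shape[0], delta),
                   jac=mv_gradient, method='CG')
    return res.x, res.fun
```

Since the objective is non-convex, the result is only a local optimum, which is exactly why the patent runs the descent from several initial values.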
In the proposed algorithm, the standardization preprocessing in step b includes:
Step b1: determine the feature vector on each view of the webpage sample data to be tested;
Step b2: standardize the feature vectors on each view of all webpage sample data to be tested.
In the proposed algorithm, the classification of the web data to be tested in step c includes:
Step c1: using the optimal parameters of the MvGDSVM classification model obtained from the training sample data, compute the perpendicular distances of the sample to the two hyperplanes on each view;
Step c2: classify the webpage sample data to be tested with the optimal prediction function obtained during training.
In the proposed algorithm, in linear MvGDSVM, the perpendicular distances of a webpage sample to the two hyperplanes on each view are given by:
view 1: dist11 = |x1^T w1 + γ1| / ‖w1‖,   dist12 = |x1^T u1 + ζ1| / ‖u1‖
view 2: dist21 = |x2^T w2 + γ2| / ‖w2‖,   dist22 = |x2^T u2 + ζ2| / ‖u2‖
Here, view 1 and view 2 denote the first and second view, respectively; dist11 denotes the perpendicular distance of the webpage sample data to the first hyperplane on the first view, and dist12 the perpendicular distance to the second hyperplane on the first view; dist21 denotes the perpendicular distance of the webpage sample data to the first hyperplane on the second view, and dist22 the perpendicular distance to the second hyperplane on the second view; x1 denotes the feature vector of the webpage sample data on the first view and x2 its feature vector on the second view; the first hyperplane of the first view has parameters (w1, γ1) and the second hyperplane parameters (u1, ζ1); the first hyperplane of the second view has parameters (w2, γ2) and the second hyperplane parameters (u2, ζ2).
In the proposed algorithm, in kernel MvGDSVM, the distances of a webpage sample to the two hyperplanes on each view have the same form, with x1^T and x2^T replaced by their kernelized representations K(x1^T, C1^T) and K(x2^T, C2^T), where C1 = [A1; B1] and C2 = [A2; B2]. Matrix A1 denotes the features of the first-class webpage sample data on the first view; matrix A2 denotes the features of the first-class webpage sample data on the second view; matrix B1 denotes the features of the second-class webpage sample data on the first view; matrix B2 denotes the features of the second-class webpage sample data on the second view; K is the kernel function.
In the proposed algorithm, the prediction results of the decision functions are given by formula (33): one decision function gives the prediction result on the first view, one gives the prediction result on the second view, and one gives the prediction result of the decision function combining the two views.
The proposed multi-view GEPSVM web page classification algorithm based on gradient descent maximizes the consistency of the classifications on different views by introducing a multi-view co-regularization term, thereby effectively combining two single-view improved generalized eigenvalue proximal support vector machines (IGEPSVM). The resulting optimization problem is solved with the conjugate gradient descent method.
Detailed description of the invention
Fig. 1 is the flow diagram of the multi-view GEPSVM web page classification algorithm based on gradient descent according to the present invention.
Specific embodiment
The invention is described in further detail below with reference to the following specific embodiments and the accompanying drawing. Except for what is specifically mentioned below, the processes, conditions, experimental methods, etc. used to implement the invention follow the general principles and common knowledge in the art, and the invention places no special restrictions on them.
The invention proposes a multi-view GEPSVM web page classification algorithm based on gradient descent, comprising an MvGDSVM web page classification model parameter training step and a web data classification step.
In the present invention, MvGDSVM Web page classifying model parameter training step includes:
Step A: input webpage training sample data;
Step B: the webpage training sample data are pre-processed;
Step C: training MvGDSVM Web page classifying model parameter;
In the present invention, the web data classifying step includes:
Step a: webpage sample data to be measured is inputted;
Step b: pretreatment is standardized to the webpage sample data to be measured;
Step c: classified by MvGDSVM Web page classifying model to the webpage sample data to be measured.
The proposed MvGDSVM web page classification algorithm covers two cases, linear and nonlinear.
1. Linear MvGDSVM
Consider a binary classification problem on web data with n webpage sample points labeled y_i ∈ {+1, −1} (i = 1, 2, …, n). After preprocessing the webpage training sample data, the feature vectors on each view of all webpage training samples are obtained. Matrix A1 denotes the features on the first view of the sample points belonging to class +1, and matrix A2 their features on the second view; matrix B1 denotes the features on the first view of the sample points belonging to class −1, and matrix B2 their features on the second view. Obviously, n1 + n2 = n.
For each view, the following two hyperplanes are defined:
view 1: x1^T w1 + γ1 = 0,   x1^T u1 + ζ1 = 0;   view 2: x2^T w2 + γ2 = 0,   x2^T u2 + ζ2 = 0,
where x1 (x2) denotes the features of x on the first (second) view.
The present invention makes the following definitions, where G1 and H1 are symmetric matrices in R^{(d1+1)×(d1+1)} and G2 and H2 are symmetric matrices in R^{(d2+1)×(d2+1)} (d1 and d2 being the dimensions of the two views); z1 and p1 are hyperplane parameter vectors in R^{d1+1}, and z2 and p2 are hyperplane parameter vectors in R^{d2+1}.
To combine the features of the two views, the following multi-view co-regularization term (27) is introduced.
By combining the two single-view IGEPSVMs, the present invention gives the first optimization problem (28) of MvGDSVM,
where ν and δ are non-negative weight parameters.
The objective function of the optimization problem above can be understood as follows: on each view, maximize the difference between the distances of the two classes of samples to the hyperplanes, while at the same time minimizing the disagreement between the results of the two hypothesis functions applied to the different views of the same training sample.
Optimization problem (28) above can be simplified to (29).
In the present invention, the conjugate gradient descent method is used to optimize the objective function F1(z1, z2); the gradients of the objective function with respect to z1 and z2 are then given.
For a non-convex function, gradient descent finds a local optimum, so it cannot guarantee the globally optimal solution of optimization problem (29). To obtain good classifying hyperplanes, three different groups of initial values of z1 and z2 are used for the gradient descent on (29):
1. z1 and z2 obtained from the single-view IGEPSVMs, respectively.
2. z1 and z2 set to the unit column vectors of the corresponding dimensions.
3. z1 and z2 set to column vectors of the corresponding dimensions whose elements are random values in [−1, 1].
The first group of initial values generally serves as the reference: it guarantees that the local optimum found after gradient descent on objective (29) is at least as good as simply taking z1 and z2 to be the optimal classifying hyperplane parameters of the two single-view IGEPSVMs. This shows that, in theory, the proposed multi-view method is more effective than the corresponding single-view method.
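The three initialization groups can be generated as, for example, below; the function name and the all-ones reading of "unit column vector" are our assumptions.

```python
import numpy as np

def initializations(d1, d2, z1_sv=None, z2_sv=None, seed=0):
    """Starting points (z1, z2) for the gradient descent on (29).
    z1_sv, z2_sv: precomputed single-view IGEPSVM solutions, if available."""
    rng = np.random.default_rng(seed)
    inits = []
    if z1_sv is not None and z2_sv is not None:
        inits.append((z1_sv, z2_sv))                      # 1. single-view IGEPSVM
    inits.append((np.ones(d1 + 1), np.ones(d2 + 1)))      # 2. "unit" column vectors
    inits.append((rng.uniform(-1, 1, d1 + 1),             # 3. random in [-1, 1]
                  rng.uniform(-1, 1, d2 + 1)))
    return inits
```

Running the descent from each starting point and keeping the best local optimum implements the strategy described above.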
In the same way, by introducing another multi-view co-regularization term (30), the present invention gives the second optimization problem of MvGDSVM, which can be simplified to (31).
The present invention uses the conjugate gradient descent method with the initialization strategies above to optimize the objective function F2(p1, p2), and then gives the gradients of objective function (31) with respect to p1 and p2.
The required classifying hyperplane parameters are now obtained with linear MvGDSVM: the first hyperplane parameters (w1, γ1) and second hyperplane parameters (u1, ζ1) of the first view, and the first hyperplane parameters (w2, γ2) and second hyperplane parameters (u2, ζ2) of the second view. For a webpage sample x to be tested, the feature vector on each of its views is first standardized, and then the perpendicular distances of x to the two hyperplanes on each view are computed according to (32).
The present invention then gives three different decision functions (33), where the first two are the prediction results of the decision functions on the first view and on the second view respectively, and the third is the prediction result of the decision function combining the two views.
2. Kernel MvGDSVM
For the nonlinear case, kernel-generated hyperplanes are introduced. For the two hyperplanes on each view, the following definitions are made, where K is the kernel function mentioned above: G1, H1, G2 and H2 are symmetric matrices in R^{(n+1)×(n+1)}, and z1, p1, z2 and p2 are hyperplane parameter vectors in R^{n+1}.
In kernel MvGDSVM, to combine the features of the two views, the following multi-view co-regularization term is introduced. Then, by combining the two single-view IGEPSVMs, the present invention gives the first optimization problem (34) of kernel MvGDSVM, where ν is a non-negative weight parameter. Optimization problem (34) above can be simplified to (35).
Similarly, the conjugate gradient descent method with the initialization strategies above is used to optimize the objective function F1(z1, z2), and then the gradients of objective function (35) with respect to z1 and z2 are given.
In the same way, by introducing another multi-view co-regularization term, the present invention gives the second optimization problem (36) of kernel MvGDSVM, which can be simplified to (37).
The objective function F2(p1, p2) is optimized with the conjugate gradient descent method and the initialization strategies above, and then the gradients of objective function (37) with respect to p1 and p2 are given.
For kernel MvGDSVM, the decision functions are identical to formula (33) in linear MvGDSVM, but the definition of the distance from a point to a hyperplane differs from formula (32): for a webpage sample x to be tested, the feature vector on each of its views is first standardized, and then the kernel-based distances of x to the two hyperplanes on each view are computed.
The specific implementation and the effect of the proposed algorithm are illustrated on a real web data collection. The web page classification data set was collected from the computer science department websites of four American universities: Cornell University, University of Washington, University of Wisconsin, and University of Texas. Each web page is composed of two views: one view is the word features of the web page itself, the other the word features of the hyperlinks pointing to the web page. The dimensions of the two views are 500 and 87, respectively. The data set contains 1051 samples in total, of which 230 web pages are course-related and 821 are not. 500 samples were randomly selected to verify the classification performance of the proposed MvGDSVM algorithm.
These 500 webpage samples were first divided into training sample data and test sample data. The optimal model parameters were selected by the average validation accuracy of 5-fold cross-validation with grid search on the training sample data. For the MvGDSVM method, besides the combined decision function, the decision functions from each individual view were also considered, and the one with the highest validation accuracy was adopted, see formula (33). After the optimal model parameters and decision function were chosen, the performance of all methods on the test set was assessed. The above procedure was repeated randomly 5 times, and the classification performance of each method is reported as the average accuracy with the corresponding standard deviation. The table below shows the average classification accuracy and standard deviation of MvGDSVM and the compared algorithms. IGEPSVM1, IGEPSVM2 and IGEPSVM3 are single-view IGEPSVM algorithms: the first two use the feature vectors of the first view and the second view respectively, while IGEPSVM3 concatenates the feature vectors of the two views into one view. As the table shows, the proposed MvGDSVM algorithm achieves a clear performance improvement over the corresponding single-view methods. Compared with SVM-2K, another classical multi-view learning method, the proposed algorithm performs better in both accuracy and stability. This demonstrates that the proposed multi-view learning algorithm MvGDSVM is fully effective for web page classification.
Algorithm | IGEPSVM1 | IGEPSVM2 | IGEPSVM3 | MvGDSVM | SVM-2K |
Accuracy rate (%) | 78.4(5.81) | 88.6(3.13) | 78.6(5.59) | 89.80(3.49) | 89.20(4.66) |
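The model-selection procedure described above (grid search with k-fold cross-validation over random splits) can be sketched generically; everything here (names, the callables) is illustrative, not the patent's code.

```python
import numpy as np
from itertools import product

def grid_search_cv(X, y, fit, predict, grid, k=5, seed=0):
    """Pick the parameter combination with the best mean k-fold CV accuracy.
    fit(Xtr, ytr, **params) -> model;  predict(model, Xte) -> labels."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    best_score, best_params = -1.0, None
    for combo in product(*grid.values()):
        params = dict(zip(grid.keys(), combo))
        accs = []
        for i in range(k):
            te = folds[i]
            tr = np.concatenate(folds[:i] + folds[i + 1:])
            model = fit(X[tr], y[tr], **params)
            accs.append(np.mean(predict(model, X[te]) == y[te]))
        score = float(np.mean(accs))
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params
```

Here `fit` and `predict` are any train/predict callables, e.g. wrapping MvGDSVM training with candidate weight-parameter values.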
The protection scope of the present invention is not limited to the above embodiments. Without departing from the spirit and scope of the invention, variations and advantages conceivable to those skilled in the art are all included in the present invention, and the scope of protection is defined by the appended claims.
Claims (5)
1. A multi-view GEPSVM web page classification method based on gradient descent, characterized in that it comprises an MvGDSVM web page classification model parameter training step and a web data classification step; wherein MvGDSVM denotes the multi-view GEPSVM based on gradient descent;
The MvGDSVM Web page classifying model parameter training step includes:
Step A: input webpage training sample data;
Step B: the webpage training sample data are pre-processed;
Step C: training MvGDSVM Web page classifying model parameter;
The step C includes:
Step C1: maximize, on each view, the difference between the distances of the two classes of samples to the hyperplanes, while minimizing the disagreement between the results of the two hypothesis functions applied to the different views of the same webpage training sample;
Step C2: optimize the objective function with the conjugate gradient descent method and the initialization strategies, and provide the gradient of the objective function;
Step C3: obtain the classifying hyperplane parameters with MvGDSVM;
Step C4: compute, on each view, the perpendicular distances of the webpage training sample to the two hyperplanes, and obtain the prediction result of the decision function;
in linear MvGDSVM, the perpendicular distances of the webpage training sample to the two hyperplanes on each view are given by:
view 1: dist11 = |x1^T w1 + γ1| / ‖w1‖,   dist12 = |x1^T u1 + ζ1| / ‖u1‖
view 2: dist21 = |x2^T w2 + γ2| / ‖w2‖,   dist22 = |x2^T u2 + ζ2| / ‖u2‖
wherein view 1 and view 2 denote the first and second view, respectively; dist11 denotes the perpendicular distance of the webpage sample data to the first hyperplane on the first view, and dist12 the perpendicular distance to the second hyperplane on the first view; dist21 denotes the perpendicular distance of the webpage sample data to the first hyperplane on the second view, and dist22 the perpendicular distance to the second hyperplane on the second view; x1 denotes the feature vector of the webpage sample data on the first view and x2 its feature vector on the second view; the first hyperplane of the first view has parameters (w1, γ1) and the second hyperplane parameters (u1, ζ1); the first hyperplane of the second view has parameters (w2, γ2) and the second hyperplane parameters (u2, ζ2);
In the kernel MvGDSVM, the perpendicular distance from a webpage training sample to each of the two hyperplanes on each view is given by the following formula:
Here, view 1 and view 2 denote the first and second views, respectively. dist11 denotes the perpendicular distance from a webpage sample to the first hyperplane on the first view, and dist12 the distance to the second hyperplane on the first view; dist21 denotes the distance to the first hyperplane on the second view, and dist22 the distance to the second hyperplane on the second view. x1 and x2 denote the feature vectors of the webpage sample on the first and second views. On the first view, the first hyperplane is parameterized by (w1, γ1) and the second by (u1, ζ1); on the second view, the first hyperplane is parameterized by (w2, γ2) and the second by (u2, ζ2). One data matrix holds the features of the first-class webpage samples on the first view and another the features of the first-class webpage samples on the second view; likewise, two matrices hold the features of the second-class webpage samples on the first and second views. K is the kernel function.
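The kernel formula was likewise lost in extraction. A plausible reconstruction, following the standard kernel GEPSVM form and writing $A_v$ and $B_v$ (hypothetical symbols; the source's matrix names were images) for the class-1 and class-2 feature matrices on view $v$, with $C_v = \begin{bmatrix} A_v^{\top} & B_v^{\top} \end{bmatrix}^{\top}$ their stack, is:

```latex
\mathrm{dist}_{v1} = \frac{\bigl\lvert K(x_v^{\top}, C_v^{\top})\, w_v + \gamma_v \bigr\rvert}{\lVert w_v \rVert}, \qquad
\mathrm{dist}_{v2} = \frac{\bigl\lvert K(x_v^{\top}, C_v^{\top})\, u_v + \zeta_v \bigr\rvert}{\lVert u_v \rVert}, \qquad v \in \{1, 2\}.
```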
The decision function predicts as in the following formula:
Here, dist11 denotes the perpendicular distance from a webpage sample to the first hyperplane on the first view, and dist12 the distance to the second hyperplane on the first view; dist21 denotes the distance to the first hyperplane on the second view, and dist22 the distance to the second hyperplane on the second view. One quantity is the prediction of the decision function on the first view, another is the prediction on the second view, and a third is the prediction of the decision function combining the two views.
The webpage-data classification step comprises:
Step a: inputting the webpage sample data to be classified;
Step b: applying standardization preprocessing to the webpage sample data to be classified;
Step c: classifying the webpage sample data to be classified with the MvGDSVM webpage-classification model.
2. The multi-view GEPSVM webpage classification method based on gradient descent of claim 1, wherein the preprocessing in step B comprises:
Step B1: determining the feature vector of the webpage training sample data on each view;
Step B2: applying standardization preprocessing separately to the feature vectors of all webpage training sample data on each view.
3. The multi-view GEPSVM webpage classification method based on gradient descent of claim 1, wherein in step C a multi-view co-regularization term maximizes the consistency of classification between the different views.
4. The multi-view GEPSVM webpage classification method based on gradient descent of claim 1, wherein the standardization preprocessing of step b comprises:
Step b1: determining the feature vector of the webpage sample data to be classified on each view;
Step b2: applying standardization preprocessing separately to the feature vectors of all webpage sample data to be classified on each view.
5. The multi-view GEPSVM webpage classification method based on gradient descent of claim 1, wherein classifying the webpage data to be classified in step c comprises:
Step c1: using the optimal parameters of the MvGDSVM classification model obtained from the training sample data, computing separately on each view the perpendicular distances from the webpage sample to the two hyperplanes;
Step c2: classifying the webpage sample data to be classified with the optimal prediction function obtained during training.
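As an illustration of steps c1 and c2, the following is a minimal sketch in Python (NumPy) of the linear per-view distance computation and the combined nearer-plane decision. All function and variable names are hypothetical, and the sketch assumes the standard GEPSVM distance formula and the sign-combination rule; it is not the patented implementation.

```python
import numpy as np

def plane_distance(x, w, gamma):
    """Perpendicular distance from sample x to the hyperplane w^T x + gamma = 0."""
    return abs(np.dot(w, x) + gamma) / np.linalg.norm(w)

def predict(x1, x2, planes1, planes2):
    """Combined two-view decision (step c2).

    planes1 = ((w1, gamma1), (u1, zeta1)) parameterizes the two hyperplanes on
    view 1; planes2 likewise for view 2. Each view contributes the difference
    (distance to the class-2 plane) - (distance to the class-1 plane), so a
    positive total means the sample lies nearer the class-1 planes overall.
    """
    score = 0.0
    for x, ((w, g), (u, z)) in ((x1, planes1), (x2, planes2)):
        score += plane_distance(x, u, z) - plane_distance(x, w, g)
    return 1 if score >= 0 else -1

# Toy example with two 2-D views: the class-1 plane is x[0] = 0 on both views,
# the class-2 plane is x[0] = 4 on both views.
planes = ((np.array([1.0, 0.0]), 0.0), (np.array([1.0, 0.0]), -4.0))
label = predict(np.array([0.5, 1.0]), np.array([0.2, -1.0]), planes, planes)
```

Here `label` is 1, since on both views the sample sits much closer to the plane x[0] = 0 than to x[0] = 4; a sample near x[0] = 4 on both views would be assigned -1.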
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610307835.0A CN106022356B (en) | 2016-05-11 | 2016-05-11 | A kind of multiple view GEPSVM Web page classification method based on gradient descent method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106022356A CN106022356A (en) | 2016-10-12 |
CN106022356B true CN106022356B (en) | 2019-07-26 |
Family
ID=57100334
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101789000A (en) * | 2009-12-28 | 2010-07-28 | 青岛朗讯科技通讯设备有限公司 | Method for classifying modes in search engine |
CN101872343A (en) * | 2009-04-24 | 2010-10-27 | 罗彤 | Semi-supervised mass data hierarchy classification method |
CN103605794A (en) * | 2013-12-05 | 2014-02-26 | 国家计算机网络与信息安全管理中心 | Website classifying method |
CN105447520A (en) * | 2015-11-23 | 2016-03-30 | 盐城工学院 | Sample classification method based on weighted PTSVM (projection twin support vector machine) |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104915684B (en) * | 2015-06-30 | 2018-03-27 | 苏州大学 | A kind of image-recognizing method and device based on the more plane SVMs of robust |
Non-Patent Citations (2)
Title |
---|
Vikas Sindhwani et al., "A co-regularized approach to semi-supervised learning with multiple views," Proceedings of the ICML Workshop on Learning with Multiple Views, 2005, pp. 74-79 |
Olvi L. Mangasarian et al., "Multisurface proximal support vector machine classification via generalized eigenvalues," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 1, Jan. 2006, pp. 69-74 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| C06 / PB01 | Publication | |
| C10 / SE01 | Entry into substantive examination / Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| CP02 | Change in the address of a patent holder | Address after: 200241 No. 500, Dongchuan Road, Minhang District, Shanghai. Patentee after: EAST CHINA NORMAL University. Address before: 200062 No. 3663, Zhongshan North Road, Putuo District, Shanghai. Patentee before: EAST CHINA NORMAL University |
| CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20190726 |