CN106022356B - A multi-view GEPSVM web page classification method based on gradient descent - Google Patents

A multi-view GEPSVM web page classification method based on gradient descent

Info

Publication number
CN106022356B
CN106022356B CN201610307835.0A CN201610307835A
Authority
CN
China
Prior art keywords
view
webpage
sample data
hyperplane
mvgdsvm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201610307835.0A
Other languages
Chinese (zh)
Other versions
CN106022356A (en)
Inventor
孙仕亮
董超
谢锡炯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Normal University filed Critical East China Normal University
Priority to CN201610307835.0A priority Critical patent/CN106022356B/en
Publication of CN106022356A publication Critical patent/CN106022356A/en
Application granted granted Critical
Publication of CN106022356B publication Critical patent/CN106022356B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention proposes a multi-view GEPSVM web page classification algorithm based on gradient descent, comprising an MvGDSVM web page classification model parameter training step and a web page data classification step. The MvGDSVM web page classification model parameter training step includes: step A: input the web page training sample data; step B: preprocess the web page training sample data; step C: train the MvGDSVM web page classification model parameters. The web page data classification step includes: step a: input the web page sample data to be classified; step b: apply standardization preprocessing to the web page sample data to be classified; step c: classify the web page sample data to be classified with the MvGDSVM web page classification model. The proposed multi-view GEPSVM web page classification algorithm based on gradient descent maximizes the consistency of classification between different views by introducing a multi-view co-regularization term, thereby effectively combining two single-view improved generalized eigenvalue proximal support vector machines, and finally solves the resulting optimization problems with the conjugate gradient descent method.

Description

A multi-view GEPSVM web page classification method based on gradient descent
Technical field
The present invention relates to the technical field of web page classification, and more particularly to a multi-view GEPSVM web page classification method based on gradient descent (abbreviated as the MvGDSVM web page classification algorithm).
Background art
In recent years, with the popularization of the Internet, the amount of online information has grown exponentially, and the Internet has become an important means for people to obtain information. Faced with a massive and heterogeneous body of online information, users often cannot accurately locate the information they want; by classifying web pages, the information a user is interested in can be obtained quickly and accurately from this mass of online information.
At present, both the generalized eigenvalue proximal support vector machine (GEPSVM) and the improved generalized eigenvalue proximal support vector machine (IGEPSVM) are simple and effective classification methods.
1. Generalized eigenvalue proximal support vector machine
A) Linear GEPSVM
The generalized eigenvalue proximal support vector machine is a simple and effective binary classification method in supervised learning that classifies data points with two hyperplanes. Each hyperplane is as close as possible to one of the two classes and as far as possible from the other. GEPSVM obtains the two nonparallel hyperplanes by solving a pair of generalized eigenvalue problems.
Assume that in the real space R^d there are n sample points with labels yi ∈ {+1, -1} (i = 1, 2, ..., n). One matrix collects the features of the n1 samples belonging to class +1 on the first view, and another matrix the features of the n2 samples belonging to class -1 on the second view (n1 + n2 = n).
In real space RdTwo hyperplane of middle definition:
x^T w1 + γ1 = 0,  x^T w2 + γ2 = 0  (1)
GEPSVM drops the parallelism condition that the standard proximal SVM imposes on the two hyperplanes and instead requires: the first hyperplane is as close as possible to the sample points of class +1 and as far as possible from the sample points of class -1; the second hyperplane is as close as possible to the sample points of class -1 and as far as possible from the sample points of class +1. The decision objective of GEPSVM yields the following pair of optimization problems:
and
where ||·|| denotes the 2-norm. The two optimization problems above can be simplified to:
and
Introducing a Tikhonov regularization term, (4) and (5) can be regularized as:
and
where ε is a non-negative weight coefficient.
The following definitions are made:
where G and H are two symmetric (d+1)×(d+1) matrices, and z1 and z2 are two (d+1)-dimensional hyperplane parameter vectors. Optimization problems (6) and (7) can then be written compactly as:
and
Optimization problems (9) and (10) above are exact Rayleigh quotients, so their global optima can be obtained by solving the following related generalized eigenvalue problems:
(G + εI)z1 = λHz1,  z1 ≠ 0  (11)
and
(H + εI)z2 = λGz2,  z2 ≠ 0.  (12)
The first and second optimal proximal hyperplanes are the eigenvectors of generalized eigenvalue problems (11) and (12) corresponding to their smallest eigenvalues, respectively.
For a test sample x, the prediction function of linear GEPSVM is
where |·| denotes the absolute value and the two terms are the perpendicular distances of x to the first and second separating hyperplanes, respectively. This prediction function assigns sample x to class +1 if it is closer to the first separating hyperplane, and to class -1 otherwise.
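For illustration, the linear GEPSVM training and prediction procedure can be sketched as follows. This is a minimal sketch, assuming class matrices A (rows are class +1 samples) and B (rows are class -1 samples) and helper names chosen here for the example; they are not symbols taken from the patent text.

```python
import numpy as np
from scipy.linalg import eig

def train_linear_gepsvm(A, B, eps=1e-3):
    """Minimal linear GEPSVM sketch: A holds class +1 samples (rows),
    B holds class -1 samples. Returns z1, z2 = (w, gamma) for the two planes
    x^T w + gamma = 0, via the generalized eigenvalue problems (11) and (12)."""
    e1 = np.ones((A.shape[0], 1))
    e2 = np.ones((B.shape[0], 1))
    Ga = np.hstack([A, e1])          # augmented class +1 matrix [A  e]
    Hb = np.hstack([B, e2])          # augmented class -1 matrix [B  e]
    G = Ga.T @ Ga                    # "closeness to class +1" matrix
    H = Hb.T @ Hb                    # "closeness to class -1" matrix
    I = np.eye(G.shape[0])

    def smallest_eigvec(M, N):
        # Generalized eigenproblem M z = lambda N z; keep the eigenvector
        # corresponding to the smallest (real part of the) eigenvalue.
        vals, vecs = eig(M, N)
        return vecs[:, np.argmin(vals.real)].real

    z1 = smallest_eigvec(G + eps * I, H)   # close to class +1, far from class -1
    z2 = smallest_eigvec(H + eps * I, G)   # close to class -1, far from class +1
    return z1, z2

def predict_linear_gepsvm(x, z1, z2):
    """Assign x to the class of the nearer proximal hyperplane."""
    xa = np.append(x, 1.0)
    d1 = abs(xa @ z1) / np.linalg.norm(z1[:-1])
    d2 = abs(xa @ z2) / np.linalg.norm(z2[:-1])
    return +1 if d1 <= d2 else -1
```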
B) Kernel GEPSVM
Linear GEPSVM can be generalized to the nonlinear case via the kernel method. Consider two kernel-generated hyperplanes in place of the planes in (1):
where K is a kernel function. Here the commonly used Gaussian kernel is considered; its (i, j)-th element (i = 1, 2, ..., n1, j = 1, 2, ..., n) is given as follows:
where μ is the Gaussian kernel parameter. Note that the planes in (1) are in fact a special case of (14): with a linear kernel, (14) reduces to (1).
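Since the kernel formula itself is not reproduced in this text, the sketch below assumes the usual parameterization K(xi, xj) = exp(-μ‖xi - xj‖²); the function and argument names are illustrative only.

```python
import numpy as np

def gaussian_kernel(X, C, mu):
    """Gaussian kernel matrix K[i, j] = exp(-mu * ||X[i] - C[j]||^2)
    for row-sample matrices X and C (assumed parameterization)."""
    sq = (np.sum(X ** 2, axis=1)[:, None]
          + np.sum(C ** 2, axis=1)[None, :]
          - 2.0 * X @ C.T)
    return np.exp(-mu * np.maximum(sq, 0.0))   # clamp tiny negatives from round-off
```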
The subsequent training procedure for obtaining the two hyperplanes has the same form as the training procedure of linear GEPSVM.
For a test sample x, the prediction function of kernel GEPSVM is
where the two terms are the kernel-based distances of x to the first and second separating hyperplanes, respectively. This prediction function assigns sample x to class +1 if it is closer to the first separating hyperplane, and to class -1 otherwise.
2. Improved generalized eigenvalue proximal support vector machine
A) Linear IGEPSVM
The decision objective of GEPSVM yields the following pair of optimization problems:
and
where ||·|| denotes the 2-norm.
GEPSVM may run into singularity problems in the generalized eigenvalue decomposition. To overcome this defect, IGEPSVM uses subtraction instead of division to measure the gap between the distances of the two classes of samples to the separating hyperplane. Optimization problems (17) and (18) can therefore be converted to:
and
where ν is a weight coefficient. To eliminate the effect of the norms of the hyperplane variables (wi, γi) (i = 1, 2), a Tikhonov regularization term is introduced. With the following definitions:
optimization problems (19) and (20) can be regularized as:
and
The two optimization problems above are exact Rayleigh quotients. The Lagrangian of (21) can be written as:
where λ1 and λ2 are Lagrange multipliers. Setting the partial derivative of (23) with respect to (z1, λ1) to zero gives the following equation:
2(G + εI - νH)z1 - 2λ1z1 = 0.
The global optimum of optimization problem (21) can therefore be obtained by solving the following eigenvalue problem:
(G + εI - νH)z1 = λ1z1.  (24)
Similarly, the global optimum of optimization problem (22) can be obtained from the following eigenvalue problem:
(H + εI - νG)z2 = λ2z2.  (25)
The first and second optimal proximal hyperplanes are the eigenvectors of eigenvalue problems (24) and (25) corresponding to their smallest eigenvalues, respectively.
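Because (24) and (25) are standard symmetric eigenvalue problems, this training step can be sketched directly with a dense eigensolver. The matrices G and H are assumed to be built from the augmented class matrices as in the GEPSVM sketch above; the function name is illustrative.

```python
import numpy as np

def train_linear_igepsvm(G, H, eps=1e-3, nu=1.0):
    """IGEPSVM sketch: solve (G + eps*I - nu*H) z1 = lam*z1 and
    (H + eps*I - nu*G) z2 = lam*z2, keeping the smallest-eigenvalue eigenvectors."""
    I = np.eye(G.shape[0])

    def smallest_eigvec(M):
        M = 0.5 * (M + M.T)              # enforce symmetry before eigh
        vals, vecs = np.linalg.eigh(M)   # eigenvalues returned in ascending order
        return vecs[:, 0]

    z1 = smallest_eigvec(G + eps * I - nu * H)
    z2 = smallest_eigvec(H + eps * I - nu * G)
    return z1, z2
```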
For a test sample x, the prediction function of linear IGEPSVM is
B) Kernel IGEPSVM
Linear IGEPSVM can likewise be generalized to the nonlinear case via the kernel method. Consider two kernel-generated hyperplanes in place of the planes in (1):
where K is the kernel function mentioned above.
The subsequent training procedure has the same form as that of linear IGEPSVM, and the prediction rule of kernel IGEPSVM is the same as that of kernel GEPSVM.
At present, however, neither the generalized eigenvalue proximal support vector machine nor the improved generalized eigenvalue proximal support vector machine has been widely applied to web page classification. Both are single-view classification algorithms and therefore have a clear limitation: they cannot fully exploit the feature information on the multiple views of a web page, so there is still room to improve classification accuracy on web pages.
Summary of the invention
The present invention proposes a multi-view GEPSVM algorithm for web page classification that can fully exploit the information in the multiple views of a web page to improve classification performance.
The multi-view GEPSVM web page classification algorithm based on gradient descent proposed by the present invention comprises an MvGDSVM web page classification model parameter training step and a web page data classification step.
The MvGDSVM web page classification model parameter training step includes:
Step A: input the web page training sample data;
Step B: preprocess the web page training sample data;
Step C: train the MvGDSVM web page classification model parameters.
The web page data classification step includes:
Step a: input the web page sample data to be classified;
Step b: apply standardization preprocessing to the web page sample data to be classified;
Step c: classify the web page sample data to be classified with the MvGDSVM web page classification model.
In the proposed multi-view GEPSVM web page classification algorithm based on gradient descent, the preprocessing in step B includes:
Step B1: determine the feature vector on each view of the web page training sample data;
Step B2: apply standardization preprocessing to the feature vectors on each view of all the web page training sample data, respectively.
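Step B2 can be realized, for example, with a per-view z-score standardization. The patent text does not fix the exact standardization formula, so the sketch below is one plausible choice, applied independently to the feature matrix of each view; the function name is illustrative.

```python
import numpy as np

def standardize_view(X_train, X_test=None, eps=1e-12):
    """Z-score each feature column of one view, using training statistics only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0) + eps          # eps guards against constant features
    X_train_s = (X_train - mean) / std
    if X_test is None:
        return X_train_s
    return X_train_s, (X_test - mean) / std  # apply the same transform to test data
```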
In the proposed multi-view GEPSVM web page classification algorithm based on gradient descent, step C maximizes the consistency of classification between different views through a multi-view co-regularization term.
In the proposed multi-view GEPSVM web page classification algorithm based on gradient descent, step C includes:
Step C1: on each view, maximize the difference between the distances of the two classes of samples to the hyperplane, while simultaneously minimizing the disagreement between the results of the two hypothesis functions applied to the same web page training sample on the different views;
Step C2: optimize the objective function with the conjugate gradient descent method and give the gradient of the objective function.
In the proposed multi-view GEPSVM web page classification algorithm based on gradient descent, step C further includes:
Step C3: obtain the separating hyperplane parameters with MvGDSVM;
Step C4: compute, on each view, the perpendicular distances of the web page training sample to the two hyperplanes and obtain the prediction result of the decision function.
In the proposed multi-view GEPSVM web page classification algorithm based on gradient descent, the standardization preprocessing in step b includes:
Step b1: determine the feature vector on each view of the web page sample data to be classified;
Step b2: apply standardization preprocessing to the feature vectors on each view of all the web page sample data to be classified, respectively.
In the proposed multi-view GEPSVM web page classification algorithm based on gradient descent, classifying the web page data to be classified in step c includes:
Step c1: using the optimal parameters of the MvGDSVM classification model obtained from the training sample data, compute the perpendicular distances of the sample to the two hyperplanes on each view;
Step c2: classify the web page sample data to be classified with the optimal prediction function obtained during training.
In the proposed multi-view GEPSVM web page classification algorithm based on gradient descent, in linear MvGDSVM the perpendicular distances of a web page sample to the two hyperplanes on each view are given by the following formulas:
view 1:
view 2:
where view 1 and view 2 denote the first and second views, respectively; dist11 denotes the perpendicular distance of the web page sample data to the first hyperplane on the first view, and dist12 the perpendicular distance to the second hyperplane on the first view; dist21 denotes the perpendicular distance of the web page sample data to the first hyperplane on the second view, and dist22 the perpendicular distance to the second hyperplane on the second view; x1 denotes the feature vector of the web page sample data on the first view and x2 the feature vector on the second view; the first hyperplane of the first view is parameterized by (w1, γ1) and the second by (u1, η1); the first hyperplane of the second view is parameterized by (w2, γ2) and the second by (u2, η2).
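In code, the four distances could be computed as below. The exact formulas (32) are not reproduced in this text, so the standard point-to-plane distance is assumed; η stands in for the second intercept symbol, which is not legible in the source, and the argument names are illustrative.

```python
import numpy as np

def mvgdsvm_linear_distances(x1, x2, w1, g1, u1, e1, w2, g2, u2, e2):
    """Perpendicular distances of one sample to the two hyperplanes of each view.
    x1, x2: feature vectors of the sample on view 1 and view 2;
    (w1, g1)/(u1, e1): first/second hyperplane of view 1, analogously for view 2."""
    dist11 = abs(x1 @ w1 + g1) / np.linalg.norm(w1)   # view 1, first hyperplane
    dist12 = abs(x1 @ u1 + e1) / np.linalg.norm(u1)   # view 1, second hyperplane
    dist21 = abs(x2 @ w2 + g2) / np.linalg.norm(w2)   # view 2, first hyperplane
    dist22 = abs(x2 @ u2 + e2) / np.linalg.norm(u2)   # view 2, second hyperplane
    return dist11, dist12, dist21, dist22
```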
In the proposed multi-view GEPSVM web page classification algorithm based on gradient descent, in kernel MvGDSVM the distances of a web page sample to the two hyperplanes on each view are given by the following formulas:
view 1:
view 2:
where one matrix contains the features of the first-class web page sample data on the first view, one the features of the first-class web page sample data on the second view, one the features of the second-class web page sample data on the first view, and one the features of the second-class web page sample data on the second view; K is the kernel function.
In the proposed multi-view GEPSVM web page classification algorithm based on gradient descent, the prediction results of the decision functions are given by the following formulas:
where the first is the prediction result of the decision function on the first view, the second the prediction result of the decision function on the second view, and the third the prediction result of the decision function combining the two views.
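A sketch of the three decision functions follows. The exact combination rule of formula (33) is not reproduced in this text, so the combined prediction below simply compares distances summed over the two views; this rule is an assumption for the sketch, not the patented formula.

```python
def mvgdsvm_predict(dist11, dist12, dist21, dist22):
    """Per-view and combined class predictions from the four distances.
    The combined rule (summing distances over views) is an assumption."""
    f1 = +1 if dist11 <= dist12 else -1                       # decision on view 1
    f2 = +1 if dist21 <= dist22 else -1                       # decision on view 2
    f = +1 if (dist11 + dist21) <= (dist12 + dist22) else -1  # combined decision
    return f1, f2, f
```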
The proposed multi-view GEPSVM web page classification algorithm based on gradient descent introduces a multi-view co-regularization term to maximize the consistency of classification between different views, thereby effectively combining two single-view improved generalized eigenvalue proximal support vector machines (IGEPSVM). The resulting optimization problems are finally solved with the conjugate gradient descent method.
Description of the drawings
Fig. 1 is the process flow diagram of the multi-view GEPSVM web page classification algorithm based on gradient descent of the present invention.
Specific embodiment
The invention is described in further detail below in conjunction with specific embodiments and the accompanying drawing. Except where specifically noted below, the implementation procedures, conditions, experimental methods, and so on of the invention follow the general principles and common knowledge in the art; the invention places no special restrictions on them.
The invention proposes a multi-view GEPSVM web page classification algorithm based on gradient descent, comprising an MvGDSVM web page classification model parameter training step and a web page data classification step.
In the present invention, the MvGDSVM web page classification model parameter training step includes:
Step A: input the web page training sample data;
Step B: preprocess the web page training sample data;
Step C: train the MvGDSVM web page classification model parameters.
In the present invention, the web page data classification step includes:
Step a: input the web page sample data to be classified;
Step b: apply standardization preprocessing to the web page sample data to be classified;
Step c: classify the web page sample data to be classified with the MvGDSVM web page classification model.
The proposed MvGDSVM web page classification algorithm covers two cases, linear and nonlinear:
1. Linear MvGDSVM
Consider a binary classification problem on web page data with n web page sample points labeled yi ∈ {+1, -1} (i = 1, 2, ..., n). After preprocessing the web page training sample data, the feature vectors on each view of all web page training samples are obtained. One matrix contains the features of the sample points of class +1 on the first view and another matrix their features on the second view; similarly, one matrix contains the features of the sample points of class -1 on the first view and another matrix their features on the second view. Clearly, n1 + n2 = n.
For each view, the following two hyperplanes are defined, respectively:
where x1 (x2) denotes the features of x on the first (second) view.
The following definitions are given:
where G1 and H1 are symmetric matrices on the first view and G2 and H2 symmetric matrices on the second view; z1 and p1 are two hyperplane parameter vectors for the first view, and z2 and p2 two hyperplane parameter vectors for the second view.
To combine the features of the two views, the following multi-view co-regularization term is introduced:
By combining the two single-view IGEPSVMs, the first optimization problem of MvGDSVM is given:
where ν and δ are non-negative weight parameters.
The objective function of the above optimization problem can be understood as follows: on each view, maximize the difference between the distances of the two classes of samples to the hyperplane while, at the same time, minimizing the disagreement between the results of the two hypothesis functions applied to the same training sample on the different views.
Optimization problem (28) above can be simplified to:
In the present invention, the objective function F1(z1, z2) is optimized with the conjugate gradient descent method; the gradients of the objective function with respect to z1 and z2 are then given respectively:
For a non-convex function, gradient descent finds only a local optimum, so it cannot guarantee the optimal solution of problem (29). To obtain better separating hyperplanes, three different groups of initial values of z1 and z2 are used for the gradient descent on optimization problem (29):
1. z1 and z2 are obtained from the single-view IGEPSVMs, respectively.
2. z1 and z2 are unit column vectors of the corresponding dimensions.
3. z1 and z2 are column vectors of the corresponding dimensions whose elements are random values in [-1, 1].
The first group of initial values generally serves as the reference initialization strategy: it guarantees that the local optimum found after gradient descent on objective (29) is at least as good as simply taking z1 and z2 to be the optimal separating hyperplane parameters of the two single-view IGEPSVMs. This shows that, in theory, the proposed multi-view method is more effective than the corresponding single-view method.
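This step can be sketched with SciPy's nonlinear conjugate gradient optimizer. Because the closed forms of F1 and its gradients (formula (30)) are not reproduced in this text, the objective and gradient are passed in as callables; only the three initialization strategies and the keep-the-best-run logic described above are illustrated, and all names are chosen for the sketch.

```python
import numpy as np
from scipy.optimize import minimize

def optimize_mvgdsvm(objective, grad, z1_igepsvm, z2_igepsvm, rng=None):
    """Run conjugate gradient descent from the three initializations described
    above and return the best local optimum. `objective` and `grad` act on the
    stacked vector z = [z1; z2]; their closed form is given by (29)/(30)."""
    rng = np.random.default_rng(rng)
    d1, d2 = z1_igepsvm.size, z2_igepsvm.size
    inits = [
        np.concatenate([z1_igepsvm, z2_igepsvm]),   # 1. single-view IGEPSVM solutions
        np.ones(d1 + d2),                           # 2. "unit" vectors (read here as all-ones)
        rng.uniform(-1.0, 1.0, size=d1 + d2),       # 3. random entries in [-1, 1]
    ]
    best = None
    for z0 in inits:
        res = minimize(objective, z0, jac=grad, method='CG')
        if best is None or res.fun < best.fun:
            best = res                              # keep the best local optimum found
    return best.x[:d1], best.x[d1:]
```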
In the same way, by introducing another multi-view co-regularization term,
the second optimization problem of MvGDSVM is given:
It can be simplified to:
The objective function F2(p1, p2) is likewise optimized with the conjugate gradient descent method using the initialization strategies above; the gradients of objective (31) with respect to p1 and p2 are then given respectively:
The required separating hyperplane parameters are now obtained with linear MvGDSVM: the first hyperplane parameters (w1, γ1) and second hyperplane parameters (u1, η1) of the first view, and the first hyperplane parameters (w2, γ2) and second hyperplane parameters (u2, η2) of the second view. For a web page sample x to be classified, the feature vector on each of its views is first standardized, and then the perpendicular distances of x to the two hyperplanes on each view are computed:
Three different decision functions are then given:
where the first two are the prediction results of the decision functions on the first and second views, respectively, and the third is the prediction result of the decision function combining the two views.
2. Kernel MvGDSVM
For the nonlinear case, kernel-generated hyperplanes are introduced. For the two hyperplanes on each view, the following definitions are made:
where K is the kernel function mentioned above.
First, the following definitions are given:
where G1, H1, G2 and H2 are symmetric (n+1)×(n+1) matrices, and z1, p1, z2, p2 are (n+1)-dimensional hyperplane parameter vectors.
In kernel MvGDSVM, to combine the features of the two views, the following multi-view co-regularization term is introduced:
By combining the two single-view IGEPSVMs, the first optimization problem of kernel MvGDSVM is then given:
where ν is a non-negative weight parameter. Optimization problem (34) above can be simplified to:
Similarly, the objective function F1(z1, z2) is optimized with the conjugate gradient descent method using the initialization strategies above; the gradients of objective (35) with respect to z1 and z2 are then given respectively:
In the same way, by introducing another multi-view co-regularization term:
the second optimization problem of kernel MvGDSVM is given:
It can be simplified to:
The objective function F2(p1, p2) is optimized with the conjugate gradient descent method using the initialization strategies above, and the gradients of objective (37) with respect to p1 and p2 are then given respectively:
For kernel MvGDSVM, the decision functions are the same as formula (33) in linear MvGDSVM, but the definition of the point-to-hyperplane distances differs from formula (32): for a web page sample x to be classified, the feature vector on each of its views is first standardized, and then the distances of x to the two hyperplanes on each view are computed:
An example on a real web data set is used to illustrate the specific implementation of the invention and to verify the effectiveness of the proposed algorithm. The web page classification data set consists of web pages collected from the computer science department websites of four American universities: Cornell University, University of Washington, University of Wisconsin, and University of Texas. Each web page has two views: one view contains the word features of the web page itself, and the other the word features of the hyperlinks pointing to the web page. The dimensions of the two views are 500 and 87, respectively. The data set contains 1051 samples in total, of which 230 are course-related web pages and 821 are non-course web pages. 500 samples are randomly selected to verify the classification performance of the proposed MvGDSVM algorithm.
The 500 web page samples are first divided into training sample data and test sample data. The optimal model parameters are selected by grid search according to the average validation accuracy of 5-fold cross-validation on the training sample data. For the MvGDSVM method, in addition to the combined decision function, the decision functions from each individual view are also considered, and the decision function with the highest validation accuracy is adopted; see formula (33). After the optimal model parameters and decision function have been chosen, the performance of all methods is evaluated on the test set. The above procedure is repeated 5 times with random splits, and the average accuracy and the corresponding standard deviation are reported as the classification performance of each method. The following table gives the average classification accuracy and standard deviation of MvGDSVM and the compared algorithms, where IGEPSVM1 and IGEPSVM2 are single-view IGEPSVM algorithms using the feature vectors of the first and second views, respectively, and IGEPSVM3 concatenates the feature vectors of the two views into a single view. As can be seen from the table, the proposed MvGDSVM algorithm improves considerably on the corresponding single-view methods. Compared with SVM-2K, another classical multi-view learning method, the proposed algorithm performs better in both accuracy and stability. This shows that the proposed multi-view learning algorithm MvGDSVM is fully effective for web page classification.
Algorithm        IGEPSVM1     IGEPSVM2     IGEPSVM3     MvGDSVM       SVM-2K
Accuracy (%)     78.4 (5.81)  88.6 (3.13)  78.6 (5.59)  89.80 (3.49)  89.20 (4.66)
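The evaluation protocol described above (random split, grid search with 5-fold cross-validation on the training part, five random repetitions, mean and standard deviation of test accuracy) can be sketched as follows. `fit_model` and `predict_model` are hypothetical placeholders for any of the compared classifiers, the parameter grid is illustrative, and the 50/50 split ratio is an assumption since the text does not state it.

```python
import numpy as np

def evaluate_protocol(X, y, fit_model, predict_model, param_grid,
                      n_repeats=5, n_folds=5, seed=0):
    """Repeat: split, pick parameters by 5-fold CV accuracy on the training part,
    then score on the held-out test part. Returns mean and std of test accuracy (%)."""
    rng = np.random.default_rng(seed)
    accs = []
    for _ in range(n_repeats):
        perm = rng.permutation(len(y))
        split = len(y) // 2                         # assumed split ratio
        tr, te = perm[:split], perm[split:]
        folds = np.array_split(tr, n_folds)
        best_params, best_cv = None, -1.0
        for params in param_grid:                   # grid search over candidate settings
            scores = []
            for k in range(n_folds):
                val = folds[k]
                fit_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
                model = fit_model(X[fit_idx], y[fit_idx], **params)
                scores.append(np.mean(predict_model(model, X[val]) == y[val]))
            if np.mean(scores) > best_cv:
                best_cv, best_params = np.mean(scores), params
        model = fit_model(X[tr], y[tr], **best_params)
        accs.append(100.0 * np.mean(predict_model(model, X[te]) == y[te]))
    return float(np.mean(accs)), float(np.std(accs))
```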
The protected content of the invention is not limited to the above embodiments. Without departing from the spirit and scope of the invention, variations and advantages conceivable to those skilled in the art are all included in the present invention, with the appended claims defining the scope of protection.

Claims (5)

1. A multi-view GEPSVM web page classification method based on gradient descent, characterized in that it comprises an MvGDSVM web page classification model parameter training step and a web page data classification step, wherein MvGDSVM denotes the multi-view GEPSVM based on gradient descent;
the MvGDSVM web page classification model parameter training step includes:
Step A: input the web page training sample data;
Step B: preprocess the web page training sample data;
Step C: train the MvGDSVM web page classification model parameters;
step C includes:
Step C1: on each view, maximize the difference between the distances of the two classes of samples to the hyperplane, while simultaneously minimizing the disagreement between the results of the two hypothesis functions applied to the same web page training sample on the different views;
Step C2: optimize the objective function with the conjugate gradient descent method using an initialization strategy, and give the gradient of the objective function;
Step C3: obtain the separating hyperplane parameters with MvGDSVM;
Step C4: compute, on each view, the perpendicular distances of the web page training sample to the two hyperplanes and obtain the prediction result of the decision function;
in linear MvGDSVM, the perpendicular distances of the web page training sample to the two hyperplanes on each view are given by the following formulas:
where view 1 and view 2 denote the first and second views, respectively; dist11 denotes the perpendicular distance of the web page sample data to the first hyperplane on the first view, and dist12 the perpendicular distance to the second hyperplane on the first view; dist21 denotes the perpendicular distance of the web page sample data to the first hyperplane on the second view, and dist22 the perpendicular distance to the second hyperplane on the second view; x1 denotes the feature vector of the web page sample data on the first view and x2 the feature vector on the second view; the first hyperplane of the first view is parameterized by (w1, γ1) and the second by (u1, η1); the first hyperplane of the second view is parameterized by (w2, γ2) and the second by (u2, η2);
in kernel MvGDSVM, the perpendicular distances of the web page training sample to the two hyperplanes on each view are given by the following formulas:
where view 1 and view 2 denote the first and second views, respectively; dist11 denotes the perpendicular distance of the web page sample data to the first hyperplane on the first view, and dist12 the perpendicular distance to the second hyperplane on the first view; dist21 denotes the perpendicular distance of the web page sample data to the first hyperplane on the second view, and dist22 the perpendicular distance to the second hyperplane on the second view; x1 denotes the feature vector of the web page sample data on the first view and x2 the feature vector on the second view; the first hyperplane of the first view is parameterized by (w1, γ1) and the second by (u1, η1); the first hyperplane of the second view is parameterized by (w2, γ2) and the second by (u2, η2); one matrix contains the features of the first-class web page sample data on the first view, one the features of the first-class web page sample data on the second view, one the features of the second-class web page sample data on the first view, and one the features of the second-class web page sample data on the second view; K is the kernel function;
the prediction results of the decision function are given by the following formulas:
where dist11 denotes the perpendicular distance of the web page sample data to the first hyperplane on the first view, dist12 the perpendicular distance to the second hyperplane on the first view, dist21 the perpendicular distance to the first hyperplane on the second view, and dist22 the perpendicular distance to the second hyperplane on the second view; the first is the prediction result of the decision function on the first view, the second the prediction result of the decision function on the second view, and the third the prediction result of the decision function combining the two views;
the web page data classification step includes:
Step a: input the web page sample data to be classified;
Step b: apply standardization preprocessing to the web page sample data to be classified;
Step c: classify the web page sample data to be classified with the MvGDSVM web page classification model.
2. The multi-view GEPSVM web page classification method based on gradient descent according to claim 1, characterized in that the preprocessing in step B includes:
Step B1: determine the feature vector on each view of the web page training sample data;
Step B2: apply standardization preprocessing to the feature vectors on each view of all the web page training sample data, respectively.
3. The multi-view GEPSVM web page classification method based on gradient descent according to claim 1, characterized in that in step C, the consistency of classification between different views is maximized through a multi-view co-regularization term.
4. The multi-view GEPSVM web page classification method based on gradient descent according to claim 1, characterized in that the standardization preprocessing in step b includes:
Step b1: determine the feature vector on each view of the web page sample data to be classified;
Step b2: apply standardization preprocessing to the feature vectors on each view of all the web page sample data to be classified, respectively.
5. The multi-view GEPSVM web page classification method based on gradient descent according to claim 1, characterized in that classifying the web page data to be classified in step c includes:
Step c1: using the optimal parameters of the MvGDSVM classification model obtained from the training sample data, compute the perpendicular distances of the web page training sample to the two hyperplanes on each view;
Step c2: classify the web page sample data to be classified with the optimal prediction function obtained during training.
CN201610307835.0A 2016-05-11 2016-05-11 A multi-view GEPSVM web page classification method based on gradient descent Expired - Fee Related CN106022356B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610307835.0A CN106022356B (en) 2016-05-11 2016-05-11 A multi-view GEPSVM web page classification method based on gradient descent

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610307835.0A CN106022356B (en) 2016-05-11 2016-05-11 A multi-view GEPSVM web page classification method based on gradient descent

Publications (2)

Publication Number Publication Date
CN106022356A CN106022356A (en) 2016-10-12
CN106022356B true CN106022356B (en) 2019-07-26

Family

ID=57100334

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610307835.0A Expired - Fee Related CN106022356B (en) 2016-05-11 2016-05-11 A multi-view GEPSVM web page classification method based on gradient descent

Country Status (1)

Country Link
CN (1) CN106022356B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101789000A (en) * 2009-12-28 2010-07-28 青岛朗讯科技通讯设备有限公司 Method for classifying modes in search engine
CN101872343A (en) * 2009-04-24 2010-10-27 罗彤 Semi-supervised mass data hierarchy classification method
CN103605794A (en) * 2013-12-05 2014-02-26 国家计算机网络与信息安全管理中心 Website classifying method
CN105447520A (en) * 2015-11-23 2016-03-30 盐城工学院 Sample classification method based on weighted PTSVM (projection twin support vector machine)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104915684B (en) * 2015-06-30 2018-03-27 苏州大学 A kind of image-recognizing method and device based on the more plane SVMs of robust

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101872343A (en) * 2009-04-24 2010-10-27 罗彤 Semi-supervised mass data hierarchy classification method
CN101789000A (en) * 2009-12-28 2010-07-28 青岛朗讯科技通讯设备有限公司 Method for classifying modes in search engine
CN103605794A (en) * 2013-12-05 2014-02-26 国家计算机网络与信息安全管理中心 Website classifying method
CN105447520A (en) * 2015-11-23 2016-03-30 盐城工学院 Sample classification method based on weighted PTSVM (projection twin support vector machine)

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A co-regularized approach to semi-supervised learning with multiple views; Vikas Sindhwani et al.; Proceedings of ICML Workshop on Learning with Multiple Views; 2005-12-31; pp. 74-79
Multisurface proximal support vector machine classification via generalized eigenvalues; Olvi L. Mangasarian et al.; IEEE Transactions on Pattern Analysis and Machine Intelligence; 2006-01-31; vol. 29, no. 1; pp. 69-74

Also Published As

Publication number Publication date
CN106022356A (en) 2016-10-12

Similar Documents

Publication Publication Date Title
Yue et al. A deep learning framework for hyperspectral image classification using spatial pyramid pooling
Zhang Community structure detection in complex networks with partial background information
CN105243398B (en) The method of improvement convolutional neural networks performance based on linear discriminant analysis criterion
Ding et al. Unsupervised self-correlated learning smoothy enhanced locality preserving graph convolution embedding clustering for hyperspectral images
CN104657718B (en) A kind of face identification method based on facial image feature extreme learning machine
Luo et al. A SVDD approach of fuzzy classification for analog circuit fault diagnosis with FWT as preprocessor
Stabinger et al. 25 years of cnns: Can we compare to human abstraction capabilities?
CN109389207A (en) A kind of adaptive neural network learning method and nerve network system
CN106485259B (en) A kind of image classification method based on high constraint high dispersive principal component analysis network
CN109508644A (en) Facial paralysis grade assessment system based on the analysis of deep video data
CN104298999B (en) EO-1 hyperion feature learning method based on recurrence autocoding
Sun Learning algorithm and hidden node selection scheme for local coupled feedforward neural network classifier
Gonzalez et al. Diversity during training enhances detection of novel stimuli
CN106650818A (en) Resting state function magnetic resonance image data classification method based on high-order super network
Wang et al. Are face and object recognition independent? A neurocomputational modeling exploration
CN109816030A (en) A kind of image classification method and device based on limited Boltzmann machine
CN109740734A (en) A kind of method of neuron spatial arrangement in optimization convolutional neural networks
Zhou et al. Fabric wrinkle rating model based on ResNet18 and optimized random vector functional-link network
CN107563305A (en) Expand the face identification method of collaboration presentation class based on multisample
CN107194383A (en) Based on improving Hu not bending moment and ELM traffic mark board recognition methods and device
CN111144453A (en) Method and equipment for constructing multi-model fusion calculation model and method and equipment for identifying website data
Wang et al. A vortex identification method based on extreme learning machine
CN106022356B (en) A multi-view GEPSVM web page classification method based on gradient descent
Mao et al. Node based row-filter convolutional neural network for brain network classification
Chandra et al. Detection of defects in fabrics using subimage-based singular value decomposition

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP02 Change in the address of a patent holder
CP02 Change in the address of a patent holder

Address after: 200241 No. 500, Dongchuan Road, Shanghai, Minhang District

Patentee after: EAST CHINA NORMAL University

Address before: 200062 No. 3663, Putuo District, Shanghai, Zhongshan North Road

Patentee before: EAST CHINA NORMAL University

CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190726