CN105046269A - Multi-instance multi-label scene classification method based on multi-kernel fusion - Google Patents

Multi-instance multi-label scene classification method based on multi-kernel fusion

Info

Publication number
CN105046269A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510344990.5A
Other languages
Chinese (zh)
Other versions
CN105046269B (en)
Inventor
邹海林
陈彤彤
丁昕苗
柳婵娟
刘影
申倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ludong University
Original Assignee
Ludong University
Application filed by Ludong University
Priority to CN201510344990.5A
Publication of CN105046269A
Application granted
Publication of CN105046269B
Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a multi-instance multi-label scene classification method based on multi-kernel fusion. The method comprises the following steps: inputting a multi-instance multi-label data set and splitting it into a multi-instance data set and a multi-label data set; using different thresholds to build a correlation matrix for each bag in the multi-instance data set; computing, from the correlation matrices, the base kernel function between every two multi-instance bags under the same threshold, the base kernel function values forming a base kernel matrix; forming a convex combination of the elements at the same position in the base kernel matrices under the different thresholds to obtain a multi-kernel matrix; and training with the multi-label data set to obtain a plurality of multi-kernel SVM classifiers. The classifiers are used to predict the label set of an unknown multi-instance bag, thereby realizing scene classification. The method improves scene classification accuracy. The invention also relates to a multi-instance multi-label scene classification system based on multi-kernel fusion.

Description

A multi-instance multi-label scene classification method based on multi-kernel fusion
Technical field
The invention relates to the technical field of machine learning, and in particular to a multi-instance multi-label scene classification method based on multi-kernel fusion.
Background art
Multi-instance learning is a learning method that evolved from supervised learning. It was first proposed in the 1990s in research on drug activity: each drug molecule is regarded as a bag, and each isomer of the molecule is regarded as an instance in the bag. If a molecule has an isomer suitable for making drugs, the bag corresponding to that molecule is labeled positive; otherwise it is labeled negative. A learning system built in this way learns from molecules known to be suitable or unsuitable for making drugs and then predicts whether new molecules are suitable. Since then, multi-instance learning has remained a research focus and has been widely applied to the classification and retrieval of text, images, and video. Multi-instance learning was later introduced into multi-label classification, giving rise to the multi-instance multi-label learning framework.
At present, multi-instance multi-label learning problems are solved by converting them into multi-instance single-label problems or single-instance multi-label problems, which are in turn converted into traditional supervised learning problems. The representative algorithms are MIML_BOOST and MIML_SVM. MIML_BOOST first converts the multi-instance multi-label problem into a multi-instance single-label problem and then solves the multi-instance problem with the MIBOOSTING algorithm. However, the instances in a positive bag are not necessarily all positive, so assigning the bag label to every instance in the bag, as this approach does, can cause large errors. MIML_SVM first converts the multi-instance multi-label problem into a single-instance multi-label problem and then solves the multi-label problem with the MLSVM algorithm. However, MIML_SVM represents the distance between two bags by the minimum Hausdorff distance between the instances of the two bags; when a negative instance in a positive bag closely resembles an instance in a negative bag, this distance makes positive and negative bags harder to tell apart and degrades classification. In addition, the KISAR algorithm solves multi-instance multi-label classification by finding, in each bag, the instances most relevant to a given class label. The MIMLwel algorithm (Multi-Instance Multi-Label Learning with Weak Label) has been proposed for learning with weak labels. To predict multiple labels efficiently on large data sets, the MIMLfast algorithm has also been proposed: it first builds a low-dimensional subspace shared by all labels and then trains label-specific linear models with stochastic gradient descent, thereby optimizing the ranking loss.
Although the above algorithms achieve good results on multi-instance multi-label problems, none of them considers the correlation among the instances within a bag. In many practical applications, scene classification in particular, the assumption that instances are independent rarely holds, which leads to unsatisfactory classification.
Summary of the invention
The technical problem to be solved by the invention is to provide a multi-instance multi-label scene classification method based on multi-kernel fusion that improves scene classification accuracy.
The technical solution by which the invention solves this problem is as follows: a multi-instance multi-label scene classification method based on multi-kernel fusion, comprising the following steps:
Step 1: input a multi-instance multi-label data set, denoted {(X_i, Y_i) | i = 1, 2, ..., m}, and split it into a multi-instance data set X = {X_i | i = 1, 2, ..., m} and a multi-label data set Y = {Y_i | i = 1, 2, ..., m};
where i is the index of a multi-instance bag in the multi-instance multi-label data set, and m, a positive integer, is the total number of bags; X_i is the multi-instance bag with index i in the multi-instance data set X, denoted X_i = {x_{i1}, x_{i2}, ..., x_{i n_i}}, where x_{i1} is the instance with index 1 in bag X_i, x_{i2} is the instance with index 2, and x_{i n_i} is the instance with index n_i; n_i, a positive integer, is the number of instances in the bag with index i; Y_i = {y_{i1}, y_{i2}, ..., y_{i l_i}} is the label set of bag X_i, where y_{i1} is the label with index 1 in Y_i, y_{i2} is the label with index 2, and y_{i l_i} is the label with index l_i; l_i, a positive integer, is the number of labels in Y_i;
Step 2: use each of a plurality of thresholds to build a correlation matrix for each multi-instance bag X_i, so that under a given threshold each bag has one correlation matrix W_i^s; the threshold t_s ∈ {t_1, t_2, ..., t_S}, where S is the total number of thresholds and s is the index of a threshold;
Step 3: from the correlation matrices obtained in step 2, compute the base kernel function between every two multi-instance bags under the same threshold; the base kernel function values form a base kernel matrix, each element of which is the base kernel function value between two bags under that threshold, the row and column indices of the element being the indices of the two bags; different thresholds yield different base kernel matrices K_{gs}, where g marks a base kernel matrix and s is the index of the base kernel matrix, in one-to-one correspondence with the index of the threshold;
Step 4: combine the elements at the same position in the base kernel matrices K_{gs} obtained in step 3 under the different thresholds to obtain a multi-kernel function K(X_i, X_j); the multi-kernel function values form the multi-kernel matrix K, i.e. each element of K is the multi-kernel function value between two multi-instance bags;
Step 5: learn from the label sets Y_i and the multi-kernel function obtained in step 4 to obtain a plurality of multi-kernel SVM classifiers, the number of classifiers being equal to the number of label classes in the multi-label data set; the classifiers are used to predict the label set of an unknown multi-instance bag, thereby realizing scene classification.
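Step 5 states only that there is one classifier per label class. A minimal sketch of how the label sets could be turned into one training target per classifier is given below, assuming a one-vs-rest decomposition with targets +1 and -1; the decomposition, the function name `binary_targets`, and the example labels are illustrative assumptions, not taken from the patent text.

```python
def binary_targets(label_sets, classes):
    """Map each label class to a +1/-1 target list, one entry per bag.

    Assumption: one-vs-rest decomposition; the text only says that the
    number of classifiers equals the number of label classes.
    """
    return {
        c: [1 if c in labels else -1 for labels in label_sets]
        for c in classes
    }


# Hypothetical label sets Y_1..Y_3 for three image bags.
label_sets = [{"sky", "sea"}, {"mountain"}, {"sky", "mountain"}]
classes = sorted(set().union(*label_sets))
targets = binary_targets(label_sets, classes)
```

Each target list would then be paired with the same multi-kernel matrix to train the classifier for that label class.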
The beneficial effects of the invention are as follows: building a correlation matrix for each multi-instance bag under different thresholds expresses the correlation among the instances in the bag; processing these correlation matrices yields the base kernel matrix under each threshold; combining the elements at the same position in the base kernel matrices yields the multi-kernel matrix. The multi-kernel function obtained by multi-kernel fusion can be used to learn the multiple labels in the data set and thus to obtain a plurality of multi-kernel SVM classifiers, which suits multi-label classification and also handles complex cases such as heterogeneous sample data.
On the basis of the above technical solution, the invention can be further improved as follows.
Further, in step 2, building the correlation matrix of one bag X_i under one threshold proceeds as follows:
Step 2.1: define an n_i × n_i matrix W whose row and column indices correspond to the indices of two instances in the multi-instance bag X_i;
Step 2.2: determine whether the Gaussian distance between instance x_{ia} and instance x_{iu} is less than the threshold t_s; if it is, set the element in row a, column u of W to 1; otherwise set it to 0; a and u are instance indices, both integers in [1, n_i]; once every element of W has been assigned, the result is the correlation matrix W_i^s of bag X_i, where the superscript s is the index of the threshold and the subscript i is the index of the multi-instance bag.
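Steps 2.1 and 2.2 can be sketched as follows. The text names a "Gaussian distance" without defining it, so this sketch takes the distance as a parameter and uses plain Euclidean distance as a stand-in; that choice, the function names, and the example bag are assumptions made for illustration only.

```python
import math


def euclidean(x, y):
    # Stand-in distance (assumption): the patent's "Gaussian distance"
    # is not defined in the text, so any metric can be passed instead.
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


def correlation_matrix(bag, threshold, dist=euclidean):
    """Return the n_i x n_i 0/1 matrix of step 2.2.

    Entry (a, u) is 1 when dist(x_ia, x_iu) < threshold, else 0.
    """
    return [
        [1 if dist(xa, xu) < threshold else 0 for xu in bag]
        for xa in bag
    ]


# Hypothetical bag with three 2-D instances and two thresholds.
bag = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0)]
thresholds = [1.0, 4.0]
matrices = [correlation_matrix(bag, t) for t in thresholds]
```

With the smaller threshold only the two nearby instances are linked; with the larger one every pair is linked, which is why several thresholds give several views of the same bag.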
The benefit of this further scheme is that the correlation among the instances in a multi-instance bag is expressed by a correlation matrix, so that a bag represented by multiple instances is instead represented by one correlation matrix; and because predicting different label classes calls for correlation matrices under different thresholds, building correlation matrices under several thresholds handles multi-label classification better.
Further, the threshold t_s in step 2.2 takes a value in [0, 4], and the number of thresholds lies in [10, 15].
The benefit of this further scheme is that bounding the number of thresholds avoids the added complexity of using too many thresholds.
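The text gives the value range and the allowed number of thresholds but not how the individual values are picked. The sketch below assumes an evenly spaced grid; the spacing and the name `threshold_grid` are assumptions. It also starts the grid above 0, because with the strict "less than" test of step 2.2 a threshold of exactly 0 would leave every correlation matrix entry at 0.

```python
def threshold_grid(count, upper=4.0):
    """Return `count` evenly spaced thresholds in (0, upper].

    Assumption: even spacing; the text only fixes the value range [0, 4]
    and the number of thresholds, [10, 15].
    """
    if not 10 <= count <= 15:
        raise ValueError("the text limits the number of thresholds to [10, 15]")
    step = upper / count
    return [step * s for s in range(1, count + 1)]


thresholds = threshold_grid(10)
```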
Further, step 3 computes the base kernel function between every two multi-instance bags under the same threshold by the following formula:
$$K_{gs}(X_i, X_j) = \frac{\sum_{a=1}^{n_i} \sum_{b=1}^{n_j} W_{ia}^{s} W_{jb}^{s} \, k_s(x_{ia}, x_{jb})}{\sum_{a=1}^{n_i} W_{ia}^{s} \sum_{b=1}^{n_j} W_{jb}^{s}}$$
where X_i and X_j are the multi-instance bags with indices i and j respectively; K_{gs}(X_i, X_j) is the base kernel function between bags X_i and X_j; g marks a base kernel matrix; s is the index of the base kernel matrix, in one-to-one correspondence with the index of the threshold;
$$W_{ia}^{s} = \frac{1}{\sum_{u=1}^{n_i} [W_i^s]_{au}}$$
is the reciprocal of the sum of all elements in row a of the correlation matrix of bag X_i under threshold t_s, where i is the index of the multi-instance bag, s is the index of the base kernel matrix (in one-to-one correspondence with the index of the threshold), and a is the row index in that correlation matrix; [W_i^s]_{au} is the element in row a, column u of the correlation matrix W_i^s; n_i is the total number of rows, equivalently columns, of that correlation matrix and equals the number of instances in bag X_i;
$$W_{jb}^{s} = \frac{1}{\sum_{v=1}^{n_j} [W_j^s]_{bv}}$$
is the reciprocal of the sum of all elements in row b of the correlation matrix of bag X_j under threshold t_s, where j is the index of the multi-instance bag; [W_j^s]_{bv} is the element in row b, column v of that correlation matrix; n_j is the total number of rows, equivalently columns, of that correlation matrix and equals the number of instances in bag X_j;
k_s(x_{ia}, x_{jb}) is a general kernel function, computed with the radial basis function kernel: k_s(x_{ia}, x_{jb}) = exp(-γ ||x_{ia} - x_{jb}||^2), where s is the index of the threshold; x_{ia} is the instance with index a in bag X_i; x_{jb} is the instance with index b in bag X_j; exp(-γ ||x_{ia} - x_{jb}||^2) is the exponential function with base e ≈ 2.71828 and exponent -γ ||x_{ia} - x_{jb}||^2; ||x_{ia} - x_{jb}|| is the norm of x_{ia} - x_{jb}; γ is the kernel coefficient, which may take any value, and different base kernel matrices use different values of γ.
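The base kernel between two bags can be sketched directly from the formula above: each instance gets a weight equal to the reciprocal of its row sum in the correlation matrix, and the bag-level value is the weighted average of the instance-level RBF kernel values. The function names and the example data are assumptions for illustration; the sketch also assumes every row of a correlation matrix contains at least one 1 (true whenever the threshold is positive, since each instance is at distance 0 from itself).

```python
import math


def rbf(x, y, gamma):
    # Instance-level kernel k_s(x, y) = exp(-gamma * ||x - y||^2).
    sq = sum((p - q) ** 2 for p, q in zip(x, y))
    return math.exp(-gamma * sq)


def instance_weights(corr):
    # W_ia^s: reciprocal of the sum of row a of the correlation matrix.
    # Assumes no row sums to zero (see the lead-in).
    return [1.0 / sum(row) for row in corr]


def base_kernel(bag_i, corr_i, bag_j, corr_j, gamma):
    """Base kernel K_gs(X_i, X_j) for one threshold."""
    wi = instance_weights(corr_i)
    wj = instance_weights(corr_j)
    numerator = sum(
        wa * wb * rbf(xa, xb, gamma)
        for wa, xa in zip(wi, bag_i)
        for wb, xb in zip(wj, bag_j)
    )
    return numerator / (sum(wi) * sum(wj))


# Hypothetical bags with their 0/1 correlation matrices.
bag_a = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0)]
corr_a = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
bag_b = [(1.0, 1.0)]
corr_b = [[1]]

k_ab = base_kernel(bag_a, corr_a, bag_b, corr_b, gamma=0.5)
k_ba = base_kernel(bag_b, corr_b, bag_a, corr_a, gamma=0.5)
k_bb = base_kernel(bag_b, corr_b, bag_b, corr_b, gamma=0.5)
```

Because the weights shrink for instances that have many close neighbours, a tight cluster of near-duplicate instances contributes roughly as much as a single isolated instance.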
The benefit of this further scheme is that expressing the base kernel function between two multi-instance bags through a general kernel function and the correlation matrices takes full account of the correlation among the instances in each bag, while mapping the features from a low-dimensional space to a high-dimensional space to realize classification.
Further, step 4 uses a convex combination to combine the elements at the same position in the base kernel matrices K_{gs} under the different thresholds, and the multi-kernel function obtained by the convex combination is:
$$K(X_i, X_j) = \sum_{s=1}^{S} d_s K_{gs}(X_i, X_j), \quad d_s \ge 0, \quad \sum_{s=1}^{S} d_s = 1$$
In this formula, K(X_i, X_j) is the multi-kernel function between bags X_i and X_j and is also the element in row i, column j of the multi-kernel matrix; d_s is a weight coefficient; s, a positive integer, is the index of the threshold; S is the total number of base kernel matrices; K_{gs}(X_i, X_j) is the base kernel function between bags X_i and X_j and is also the element in row i, column j of the base kernel matrix K_{gs}.
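The convex combination can be sketched as an element-wise weighted sum of the base kernel matrices. How the weights d_s are chosen is not stated in this passage, so the equal weights in the example, like the function name `combine_kernels` and the matrix values, are assumptions for illustration.

```python
def combine_kernels(base_matrices, weights):
    """Element-wise convex combination of S base kernel matrices."""
    if len(base_matrices) != len(weights):
        raise ValueError("one weight d_s is needed per base kernel matrix")
    if any(d < 0 for d in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    m = len(base_matrices[0])
    return [
        [
            sum(d * K[i][j] for d, K in zip(weights, base_matrices))
            for j in range(m)
        ]
        for i in range(m)
    ]


# Two hypothetical 2 x 2 base kernel matrices (S = 2) with equal weights.
K1 = [[1.0, 0.2], [0.2, 1.0]]
K2 = [[1.0, 0.6], [0.6, 1.0]]
K = combine_kernels([K1, K2], [0.5, 0.5])
```

Since each base kernel matrix is symmetric and the weights are non-negative, the combined matrix stays symmetric, which is what an SVM trained on a precomputed kernel expects.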
The benefit of this further scheme is that the convex combination merges multiple base kernel functions and thereby combines multiple feature spaces: the original data set is mapped into several different feature spaces, and the convex combination makes the method more flexible and more accurate, so that it suits label classification problems and also handles complex cases such as heterogeneous sample data well.
The invention also provides a multi-instance multi-label scene classification system based on multi-kernel fusion, comprising:
an input module, for inputting a multi-instance multi-label data set, denoted {(X_i, Y_i) | i = 1, 2, ..., m}, and splitting it into a multi-instance data set X = {X_i | i = 1, 2, ..., m} and a multi-label data set Y = {Y_i | i = 1, 2, ..., m};
where i is the index of a multi-instance bag in the multi-instance multi-label data set, and m, a positive integer, is the total number of bags; X_i is the multi-instance bag with index i in the multi-instance data set X, denoted X_i = {x_{i1}, x_{i2}, ..., x_{i n_i}}, where x_{i1} is the instance with index 1 in bag X_i, x_{i2} is the instance with index 2, and x_{i n_i} is the instance with index n_i; n_i, a positive integer, is the number of instances in the bag with index i; Y_i = {y_{i1}, y_{i2}, ..., y_{i l_i}} is the label set of bag X_i, where y_{i1} is the label with index 1 in Y_i, y_{i2} is the label with index 2, and y_{i l_i} is the label with index l_i; l_i, a positive integer, is the number of labels in Y_i;
a correlation matrix building module, for using each of a plurality of thresholds to build a correlation matrix for each multi-instance bag X_i, so that under a given threshold each bag has one correlation matrix W_i^s; the threshold t_s ∈ {t_1, t_2, ..., t_S}, where S is the total number of thresholds and s is the index of a threshold;
a base kernel matrix module, for computing, from the correlation matrices obtained in the correlation matrix building module, the base kernel function between every two multi-instance bags under the same threshold; the base kernel function values form a base kernel matrix, each element of which is the base kernel function value between two bags under that threshold, the row and column indices of the element being the indices of the two bags; different thresholds yield different base kernel matrices K_{gs}, where g marks a base kernel matrix and s is the index of the base kernel matrix, in one-to-one correspondence with the index of the threshold;
a combination module, for combining the elements at the same position in the base kernel matrices K_{gs} obtained in the base kernel matrix module under the different thresholds to obtain a multi-kernel function K(X_i, X_j); the multi-kernel function values form the multi-kernel matrix K, i.e. each element of K is the multi-kernel function value between two multi-instance bags;
a learning module, for learning from the label sets Y_i and the multi-kernel function obtained in the combination module to obtain a plurality of multi-kernel SVM classifiers, the number of classifiers being equal to the number of label classes in the multi-label data set; the classifiers are used to predict the label set of an unknown multi-instance bag, thereby realizing scene classification.
The benefits of this technical solution are as follows: building a correlation matrix for each multi-instance bag under different thresholds expresses the correlation among the instances in the bag; processing these correlation matrices yields the base kernel matrix under each threshold; combining the elements at the same position in the base kernel matrices yields the multi-kernel matrix. The multi-kernel function obtained by multi-kernel fusion can be used to learn the multiple labels in the data set and thus to obtain a plurality of multi-kernel SVM classifiers, which suits multi-label classification and also handles complex cases such as heterogeneous sample data.
Further, the technical solution adopted is:
in the correlation matrix building module, building the correlation matrix of one bag X_i under one threshold proceeds as follows:
Step 2.1: define an n_i × n_i matrix W whose row and column indices correspond to the indices of two instances in the multi-instance bag X_i;
Step 2.2: determine whether the Gaussian distance between instance x_{ia} and instance x_{iu} is less than the threshold t_s; if it is, set the element in row a, column u of W to 1; otherwise set it to 0; a and u are instance indices, both integers in [1, n_i]; once every element of W has been assigned, the result is the correlation matrix W_i^s of bag X_i, where the superscript s is the index of the threshold and the subscript i is the index of the multi-instance bag.
The benefit of this further technical solution is that the correlation among the instances in a multi-instance bag is expressed by a correlation matrix, so that a bag represented by multiple instances is instead represented by one correlation matrix; and because predicting different label classes calls for correlation matrices under different thresholds, building correlation matrices under several thresholds handles multi-label classification better.
Further, the technical solution adopted is: the threshold t_s in step 2.2 takes a value in [0, 4], and the number of thresholds lies in [10, 15].
The benefit of this further scheme is that bounding the number of thresholds avoids the added complexity of using too many thresholds.
Further, the technical solution adopted is: the base kernel matrix module computes the base kernel function between every two multi-instance bags by the following formula:
$$K_{gs}(X_i, X_j) = \frac{\sum_{a=1}^{n_i} \sum_{b=1}^{n_j} W_{ia}^{s} W_{jb}^{s} \, k_s(x_{ia}, x_{jb})}{\sum_{a=1}^{n_i} W_{ia}^{s} \sum_{b=1}^{n_j} W_{jb}^{s}}$$
where X_i and X_j are the multi-instance bags with indices i and j respectively; K_{gs}(X_i, X_j) is the base kernel function between bags X_i and X_j; g marks a base kernel matrix; s is the index of the base kernel matrix, in one-to-one correspondence with the index of the threshold;
$$W_{ia}^{s} = \frac{1}{\sum_{u=1}^{n_i} [W_i^s]_{au}}$$
is the reciprocal of the sum of all elements in row a of the correlation matrix of bag X_i under threshold t_s, where i is the index of the multi-instance bag, s is the index of the base kernel matrix (in one-to-one correspondence with the index of the threshold), and a is the row index in that correlation matrix; [W_i^s]_{au} is the element in row a, column u of the correlation matrix W_i^s; n_i is the total number of rows, equivalently columns, of that correlation matrix and equals the number of instances in bag X_i;
$$W_{jb}^{s} = \frac{1}{\sum_{v=1}^{n_j} [W_j^s]_{bv}}$$
is the reciprocal of the sum of all elements in row b of the correlation matrix of bag X_j under threshold t_s, where j is the index of the multi-instance bag; [W_j^s]_{bv} is the element in row b, column v of that correlation matrix; n_j is the total number of rows, equivalently columns, of that correlation matrix and equals the number of instances in bag X_j;
k_s(x_{ia}, x_{jb}) is a general kernel function, computed with the radial basis function kernel: k_s(x_{ia}, x_{jb}) = exp(-γ ||x_{ia} - x_{jb}||^2), where s is the index of the threshold; x_{ia} is the instance with index a in bag X_i; x_{jb} is the instance with index b in bag X_j; exp(-γ ||x_{ia} - x_{jb}||^2) is the exponential function with base e ≈ 2.71828 and exponent -γ ||x_{ia} - x_{jb}||^2; ||x_{ia} - x_{jb}|| is the norm of x_{ia} - x_{jb}; γ is the kernel coefficient, which may take any value, and different base kernel matrices use different values of γ.
The benefit of this technical solution is that expressing the base kernel function between two multi-instance bags through a general kernel function and the correlation matrices takes full account of the correlation among the instances in each bag, while mapping the features from a low-dimensional space to a high-dimensional space to realize classification.
Further, the technical solution adopted is: the combination module uses a convex combination to combine the elements at the same position in the base kernel matrices K_{gs} under the different thresholds, and the multi-kernel function obtained by the convex combination is:
$$K(X_i, X_j) = \sum_{s=1}^{S} d_s K_{gs}(X_i, X_j), \quad d_s \ge 0, \quad \sum_{s=1}^{S} d_s = 1$$
In this formula, K(X_i, X_j) is the multi-kernel function between bags X_i and X_j and is also the element in row i, column j of the multi-kernel matrix; d_s is a weight coefficient; s, a positive integer, is the index of the threshold; S is the total number of base kernel matrices; K_{gs}(X_i, X_j) is the base kernel function between bags X_i and X_j and is also the element in row i, column j of the base kernel matrix K_{gs}.
The benefit of this further scheme is that the convex combination merges multiple base kernel functions and thereby combines multiple feature spaces: the original data set is mapped into several different feature spaces, and the convex combination makes the method more flexible and more accurate, so that it suits label classification problems and also handles complex cases such as heterogeneous sample data well.
Brief description of the drawings
Fig. 1 is a flow chart of the multi-instance multi-label scene classification method based on multi-kernel fusion of the invention.
Detailed description of the embodiments
The principles and features of the invention are described below with reference to the drawing; the examples given serve only to explain the invention and are not intended to limit its scope.
As shown in Fig. 1, the multi-instance multi-label scene classification method based on multi-kernel fusion of the invention comprises the following steps:
Step 1: input a multi-instance multi-label data set, denoted {(X_i, Y_i) | i = 1, 2, ..., m}, and split it into a multi-instance data set X = {X_i | i = 1, 2, ..., m} and a multi-label data set Y = {Y_i | i = 1, 2, ..., m};
where i is the index of a multi-instance bag in the multi-instance multi-label data set, and m, a positive integer, is the total number of bags; X_i is the multi-instance bag with index i in the multi-instance data set X, denoted X_i = {x_{i1}, x_{i2}, ..., x_{i n_i}}, where x_{i1} is the instance with index 1 in bag X_i, x_{i2} is the instance with index 2, and x_{i n_i} is the instance with index n_i; n_i, a positive integer, is the number of instances in the bag with index i; Y_i = {y_{i1}, y_{i2}, ..., y_{i l_i}} is the label set of bag X_i, where y_{i1} is the label with index 1 in Y_i, y_{i2} is the label with index 2, and y_{i l_i} is the label with index l_i; l_i, a positive integer, is the number of labels in Y_i;
Step 2: use each of a plurality of thresholds to build a correlation matrix for each multi-instance bag X_i, so that under a given threshold each bag has one correlation matrix W_i^s; the threshold t_s ∈ {t_1, t_2, ..., t_S}, where S is the total number of thresholds and s is the index of a threshold;
In step 2, building the correlation matrix of one bag X_i under one threshold proceeds as follows:
Step 2.1: define an n_i × n_i matrix W whose row and column indices correspond to the indices of two instances in the multi-instance bag X_i;
Step 2.2: determine whether the Gaussian distance between instance x_{ia} and instance x_{iu} is less than the threshold t_s; if it is, set the element in row a, column u of W to 1; otherwise set it to 0; a and u are instance indices, both integers in [1, n_i]; once every element of W has been assigned, the result is the correlation matrix W_i^s of bag X_i, where the superscript s is the index of the threshold and the subscript i is the index of the multi-instance bag; the threshold t_s takes a value in [0, 4], and the number of thresholds lies in [10, 15].
For example, suppose there are 3 images, i.e. 3 bags, indexed 1, 2 and 3 and containing 2, 6 and 7 instances respectively, and that 3 thresholds $t_1$, $t_2$, $t_3$ are chosen.
First, threshold $t_1$ is used to build a correlation matrix for each of the 3 bags:
For the first bag, define a $2 \times 2$ matrix. Determine whether the Gaussian distance between instance 1 and instance 1 of bag 1 is less than $t_1$; if so, set the element in row 1, column 1 to 1, otherwise to 0. Then test, in turn, instance 1 against instance 2, instance 2 against instance 1, and instance 2 against instance 2, and set the elements in row 1 column 2, row 2 column 1, and row 2 column 2 to 1 or 0 in the same way. This gives the correlation matrix $W_1^1$ for threshold $t_1$;
For the second bag, define a $6 \times 6$ matrix. Determine whether the Gaussian distance between every two instances of bag 2 is less than $t_1$; if so, set the element at the corresponding position to 1, otherwise to 0. This gives the correlation matrix $W_2^1$ for threshold $t_1$;
For the third bag, the correlation matrix $W_3^1$ for threshold $t_1$ is obtained in the same way;
Likewise, threshold $t_2$ is used to build the correlation matrices of the 3 bags, namely $W_1^2$, $W_2^2$ and $W_3^2$;
Likewise, threshold $t_3$ is used to build the correlation matrices of the 3 bags, namely $W_1^3$, $W_2^3$ and $W_3^3$.
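The construction in step 2 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the patented implementation: the text does not define the "Gaussian distance", so plain Euclidean distance is used as a stand-in, and the function names, the toy bags and the threshold values are all hypothetical.

```python
import math

def euclidean(x, y):
    """Plain Euclidean distance; an assumed stand-in for the 'Gaussian distance'."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))

def correlation_matrix(bag, threshold, dist=euclidean):
    """Return the n_i x n_i 0/1 matrix with entry [a][u] = 1 iff dist(x_a, x_u) < threshold."""
    n = len(bag)
    return [[1 if dist(bag[a], bag[u]) < threshold else 0 for u in range(n)]
            for a in range(n)]

def all_correlation_matrices(bags, thresholds, dist=euclidean):
    """Result[s][i] is the correlation matrix of bag i under threshold s (0-based)."""
    return [[correlation_matrix(bag, t, dist) for bag in bags] for t in thresholds]

# Hypothetical toy data: 3 bags with 2, 6 and 7 two-dimensional instances.
bags = [
    [[0.0, 0.0], [1.0, 0.0]],
    [[0.1 * k, 0.0] for k in range(6)],
    [[0.0, 0.5 * k] for k in range(7)],
]
thresholds = [0.2, 0.6, 1.2]  # illustrative values only
W = all_correlation_matrices(bags, thresholds)
```

Any other distance can be passed through the `dist` argument, so the sketch does not depend on the Euclidean assumption beyond its default.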
Step 3: using the correlation matrices obtained in step 2, compute the base kernel function between every two multi-instance data bags under the same threshold. The base kernel function values form a base kernel matrix: each element is the base kernel function value between two bags under that threshold, and its row and column indices are the indices of the two bags. Different thresholds yield different base kernel matrices $K_g^s$, where $g$ marks a base kernel matrix and $s$ is the index of the base kernel matrix, in one-to-one correspondence with the threshold index;
In step 3, the base kernel function between two multi-instance data bags under the same threshold is computed by the following formula:
$$K_g^s(X_i, X_j) = \frac{\sum_{a=1}^{n_i} \sum_{b=1}^{n_j} W_{ia}^s W_{jb}^s \, k_s(x_{ia}, x_{jb})}{\sum_{a=1}^{n_i} W_{ia}^s \sum_{b=1}^{n_j} W_{jb}^s}$$
where $X_i$ and $X_j$ are the multi-instance data bags with indices $i$ and $j$; $K_g^s(X_i, X_j)$ is the base kernel function between bag $X_i$ and bag $X_j$; $g$ marks a base kernel matrix; and $s$ is the index of the base kernel matrix, in one-to-one correspondence with the threshold index;
$W_{ia}^s = 1 / \sum_{u=1}^{n_i} [W_i^s]_{au}$ is the reciprocal of the sum of all elements in row $a$ of the correlation matrix of bag $X_i$ under threshold $t_s$, where $i$ is the bag index, $s$ is the index of the base kernel matrix (in one-to-one correspondence with the threshold index), and $a$ is a row index of that correlation matrix; $[W_i^s]_{au}$ is the element in row $a$, column $u$ of the correlation matrix $W_i^s$; and $n_i$ is the total number of rows (or columns) of that correlation matrix, equal to the number of instances in bag $X_i$;
$W_{jb}^s = 1 / \sum_{v=1}^{n_j} [W_j^s]_{bv}$ is the reciprocal of the sum of all elements in row $b$ of the correlation matrix of bag $X_j$ under threshold $t_s$, where $j$ is the bag index; $[W_j^s]_{bv}$ is the element in row $b$, column $v$ of that correlation matrix; and $n_j$ is the total number of rows (or columns) of the correlation matrix of bag $X_j$, equal to the number of instances in bag $X_j$;
$k_s(x_{ia}, x_{jb})$ is a general kernel function, computed with the radial basis function (RBF) kernel: $k_s(x_{ia}, x_{jb}) = \exp(-\gamma \|x_{ia} - x_{jb}\|^2)$. Here $s$ is the threshold index; $x_{ia}$ is instance $a$ of bag $X_i$; $x_{jb}$ is instance $b$ of bag $X_j$; $\exp(\cdot)$ is the exponential function with base $e \approx 2.71828$ and exponent $-\gamma \|x_{ia} - x_{jb}\|^2$; $\|x_{ia} - x_{jb}\|$ is the norm of $x_{ia} - x_{jb}$; and $\gamma$ is the kernel coefficient, which may take any value and takes a different value for each base kernel matrix.
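As a rough sketch of how the base kernel formula of step 3 could be evaluated for one pair of bags under one threshold, the following Python code weights each instance by the reciprocal of its row sum in the correlation matrix and normalizes by the product of the weight sums. The function names and the toy inputs are hypothetical, and the sketch assumes that every row of a correlation matrix contains at least one 1 (for instance because each instance counts as related to itself), since a row sum of zero would make the weight undefined.

```python
import math

def rbf(x, y, gamma):
    """k_s(x, y) = exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(x, y)))

def instance_weights(W):
    """Weight of instance a = 1 / (sum of row a of the 0/1 correlation matrix)."""
    return [1.0 / sum(row) for row in W]  # assumes no all-zero row

def base_kernel(bag_i, W_i, bag_j, W_j, gamma):
    """Base kernel between two bags under one threshold, following the formula above."""
    wi, wj = instance_weights(W_i), instance_weights(W_j)
    numerator = sum(wi[a] * wj[b] * rbf(xa, xb, gamma)
                    for a, xa in enumerate(bag_i)
                    for b, xb in enumerate(bag_j))
    return numerator / (sum(wi) * sum(wj))

# Hypothetical toy bags: the instances of bag_i are unrelated to each other here.
bag_i, W_i = [[0.0, 0.0], [1.0, 0.0]], [[1, 0], [0, 1]]
bag_j, W_j = [[0.0, 0.0]], [[1]]
k_ij = base_kernel(bag_i, W_i, bag_j, W_j, gamma=1.0)
```

Filling an $m \times m$ matrix with `base_kernel` for every pair of bags, once per threshold, would give the base kernel matrices that step 3 describes.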
Step 4: combine the elements at the same position in the base kernel matrices $K_g^s$ obtained under the different thresholds in step 3 to obtain a multi-kernel function $K(X_i, X_j)$. The multi-kernel function values form the multi-kernel matrix $K$; that is, each element of $K$ is the multi-kernel function value between two multi-instance data bags;
In step 4, the elements at the same position in the base kernel matrices $K_g^s$ under the different thresholds are combined by convex combination, which gives the multi-kernel function:
$$K(X_i, X_j) = \sum_{s=1}^{S} d_s K_g^s(X_i, X_j), \quad d_s \ge 0, \quad \sum_{s=1}^{S} d_s = 1$$
In the formula, $K(X_i, X_j)$ is the multi-kernel function between bag $X_i$ and bag $X_j$, and also the element in row $i$, column $j$ of the multi-kernel matrix; $d_s$ is a weight coefficient; $s$ is the threshold index, a positive integer; $S$ is the total number of base kernel matrices; and $K_g^s(X_i, X_j)$ is the base kernel function between bag $X_i$ and bag $X_j$, and also the element in row $i$, column $j$ of the base kernel matrix $K_g^s$.
For example, the elements at the same position in the base kernel matrices obtained under thresholds $t_1$, $t_2$ and $t_3$ are combined by convex combination to give the element at that position in the multi-kernel matrix; repeating this for every position yields the whole multi-kernel matrix.
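The convex combination of step 4 can be illustrated with a short Python sketch. The function name, the three toy base kernel matrices and the weights are hypothetical; in the method itself the weights $d_s$ are learned rather than fixed by hand.

```python
def combine_kernels(base_matrices, weights, tol=1e-9):
    """Element-wise convex combination K = sum_s d_s * K_s of equally sized matrices."""
    if len(base_matrices) != len(weights):
        raise ValueError("need one weight per base kernel matrix")
    if any(d < 0 for d in weights) or abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must be non-negative and sum to 1")
    m = len(base_matrices[0])
    return [[sum(d * K_s[i][j] for d, K_s in zip(weights, base_matrices))
             for j in range(m)]
            for i in range(m)]

# Hypothetical base kernel matrices for 2 bags under thresholds t_1, t_2, t_3.
K1 = [[1.0, 0.2], [0.2, 1.0]]
K2 = [[1.0, 0.6], [0.6, 1.0]]
K3 = [[1.0, 0.4], [0.4, 1.0]]
K = combine_kernels([K1, K2, K3], [0.5, 0.25, 0.25])
```

Because the weights are non-negative and sum to 1, every element of `K` lies between the smallest and largest of the corresponding base kernel values.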
Step 5: learn from the multi-label data set $Y_i$ together with the multi-kernel function obtained in step 4 to obtain multiple classifiers, as many as there are labels in the multi-label data set. The classifiers are used to predict the label set of an unknown multi-instance data bag, thereby achieving scene classification. Specifically, the multi-kernel function is learned with the SimpleMKL method.
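For orientation, a multi-kernel SVM classifier of the kind SimpleMKL trains usually has the decision function below. This is the standard textbook form, written per label as an assumption about how step 5 could be realized; the symbols $c$ (label index), $\alpha_i^c$, $b^c$ and $y_i^c \in \{+1, -1\}$ (whether label $c$ belongs to $Y_i$) do not appear in the source, and letting the weights $d_s^c$ differ from label to label is likewise assumed.

```latex
f_c(X) = \sum_{i=1}^{m} \alpha_i^{c} \, y_i^{c} \sum_{s=1}^{S} d_s^{c} \, K_g^s(X_i, X) + b^{c},
\qquad d_s^{c} \ge 0, \quad \sum_{s=1}^{S} d_s^{c} = 1
```

Under this reading, the predicted label set of an unknown bag $X$ would be the set of labels $c$ for which $f_c(X) > 0$.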
The multi-instance multi-label scene classification system based on multi-kernel fusion according to the present invention comprises:
An input module, configured to input a multi-instance multi-label data set, denoted $\{(X_i, Y_i) \mid i = 1, 2, \dots, m\}$, and to split it into a multi-instance data set $X = \{X_i \mid i = 1, 2, \dots, m\}$ and a multi-label data set $Y = \{Y_i \mid i = 1, 2, \dots, m\}$;
where $i$ is the index of a multi-instance data bag in the multi-instance multi-label data set and $m$, a positive integer, is the total number of bags. $X_i$ is the bag with index $i$ in the multi-instance data set $X$, denoted $X_i = \{x_{i1}, x_{i2}, \dots, x_{in_i}\}$, where $x_{i1}$ is instance 1 of bag $X_i$, $x_{i2}$ is instance 2 of bag $X_i$, $x_{in_i}$ is instance $n_i$ of bag $X_i$, and $n_i$, a positive integer, is the number of instances contained in bag $i$. Likewise $Y_i = \{y_{i1}, y_{i2}, \dots, y_{il_i}\}$, where $y_{i1}$ is label 1 of label set $Y_i$, $y_{i2}$ is label 2 of label set $Y_i$, $y_{il_i}$ is label $l_i$ of label set $Y_i$, and $l_i$, a positive integer, is the number of labels contained in $Y_i$;
A correlation matrix building module, configured to build, for each of a plurality of thresholds, a correlation matrix for every multi-instance data bag $X_i$, so that under a given threshold each bag has one correlation matrix $W_i^s$. The thresholds are $t_s \in \{t_1, t_2, \dots, t_S\}$, where $S$ is the total number of thresholds and $s$ is the threshold index. The module builds the correlation matrix of a bag $X_i$ under one threshold as follows:
Step 2.1: define an $n_i \times n_i$ matrix $W$ whose row and column indices correspond to the indices of two instances in bag $X_i$;
Step 2.2: determine whether the Gaussian distance between instance $x_{ia}$ and instance $x_{iu}$ is less than the threshold $t_s$. If it is, set the element in row $a$, column $u$ of $W$ to 1; otherwise set it to 0. Here $a$ and $u$ are instance indices, both integers in $[1, n_i]$. Once every element of $W$ has been assigned, the result is the correlation matrix $W_i^s$ of bag $X_i$, where the superscript $s$ is the threshold index and the subscript $i$ is the bag index. The threshold in step 2.2 takes a value in $[0, 4]$, and the number of thresholds lies in $[10, 15]$.
A base kernel matrix module, configured to compute, from the correlation matrices obtained in the correlation matrix building module, the base kernel function between every two multi-instance data bags under the same threshold. The base kernel function values form a base kernel matrix: each element is the base kernel function value between two bags under that threshold, and its row and column indices are the indices of the two bags. Different thresholds yield different base kernel matrices $K_g^s$, where $g$ marks a base kernel matrix and $s$ is the index of the base kernel matrix, in one-to-one correspondence with the threshold index. The module computes the base kernel function between two bags by the following formula:
$$K_g^s(X_i, X_j) = \frac{\sum_{a=1}^{n_i} \sum_{b=1}^{n_j} W_{ia}^s W_{jb}^s \, k_s(x_{ia}, x_{jb})}{\sum_{a=1}^{n_i} W_{ia}^s \sum_{b=1}^{n_j} W_{jb}^s};$$
where $X_i$ and $X_j$ are the multi-instance data bags with indices $i$ and $j$; $K_g^s(X_i, X_j)$ is the base kernel function between bag $X_i$ and bag $X_j$; $g$ marks a base kernel matrix; and $s$ is the index of the base kernel matrix, in one-to-one correspondence with the threshold index;
$W_{ia}^s = 1 / \sum_{u=1}^{n_i} [W_i^s]_{au}$ is the reciprocal of the sum of all elements in row $a$ of the correlation matrix of bag $X_i$ under threshold $t_s$, where $i$ is the bag index, $s$ is the index of the base kernel matrix (in one-to-one correspondence with the threshold index), and $a$ is a row index of that correlation matrix; $[W_i^s]_{au}$ is the element in row $a$, column $u$ of the correlation matrix $W_i^s$; and $n_i$ is the total number of rows (or columns) of that correlation matrix, equal to the number of instances in bag $X_i$;
$W_{jb}^s = 1 / \sum_{v=1}^{n_j} [W_j^s]_{bv}$ is the reciprocal of the sum of all elements in row $b$ of the correlation matrix of bag $X_j$ under threshold $t_s$, where $j$ is the bag index; $[W_j^s]_{bv}$ is the element in row $b$, column $v$ of that correlation matrix; and $n_j$ is the total number of rows (or columns) of the correlation matrix of bag $X_j$, equal to the number of instances in bag $X_j$;
$k_s(x_{ia}, x_{jb})$ is a general kernel function, computed with the radial basis function (RBF) kernel: $k_s(x_{ia}, x_{jb}) = \exp(-\gamma \|x_{ia} - x_{jb}\|^2)$. Here $s$ is the threshold index; $x_{ia}$ is instance $a$ of bag $X_i$; $x_{jb}$ is instance $b$ of bag $X_j$; $\exp(\cdot)$ is the exponential function with base $e \approx 2.71828$ and exponent $-\gamma \|x_{ia} - x_{jb}\|^2$; $\|x_{ia} - x_{jb}\|$ is the norm of $x_{ia} - x_{jb}$; and $\gamma$ is the kernel coefficient, which may take any value and takes a different value for each base kernel matrix.
A combination module, configured to combine the elements at the same position in the base kernel matrices $K_g^s$ obtained under the different thresholds in the base kernel matrix module to obtain a multi-kernel function $K(X_i, X_j)$. The multi-kernel function values form the multi-kernel matrix $K$; that is, each element of $K$ is the multi-kernel function value between two multi-instance data bags. The combination module combines the elements at the same position in the base kernel matrices $K_g^s$ under the different thresholds by convex combination, which gives the multi-kernel function:
$$K(X_i, X_j) = \sum_{s=1}^{S} d_s K_g^s(X_i, X_j), \quad d_s \ge 0, \quad \sum_{s=1}^{S} d_s = 1;$$
In the formula, $K(X_i, X_j)$ is the multi-kernel function between bag $X_i$ and bag $X_j$, and also the element in row $i$, column $j$ of the multi-kernel matrix; $d_s$ is a weight coefficient; $s$ is the threshold index, a positive integer; $S$ is the total number of base kernel matrices; and $K_g^s(X_i, X_j)$ is the base kernel function between bag $X_i$ and bag $X_j$, and also the element in row $i$, column $j$ of the base kernel matrix $K_g^s$.
A learning module, configured to learn from the multi-label data set $Y_i$ and the multi-kernel function obtained in the combination module to obtain multiple multi-kernel SVM classifiers, as many as there are label classes in the multi-label data set. The classifiers are used to predict the label set of an unknown multi-instance data bag, thereby achieving scene classification.
The effect of the present invention is demonstrated below by experiment:
To verify the performance of the method of the invention, experiments were run on the MSRCv2 and Scene image data sets. The MSRCv2 data set contains 591 images in 23 classes, and many of the images belong to several classes at once. After segmentation, each image is represented by a set of feature vectors, one for each of its regions. The Scene image data set consists of 2000 natural scene images belonging to 5 classes such as desert and mountains, and more than 20% of the images belong to several classes at once. Each image is divided into 9 regions with the SBN algorithm, each region is represented by a 15-dimensional feature vector, and the 9 regions of an image serve as the instances of the bag for that image.
In traditional supervised learning each object has only one label, so accuracy alone usually suffices to assess performance. In multi-label learning, however, several class labels must be predicted for each object and accuracy alone is not convincing, so multi-label learning performance is usually evaluated with 5 measures: hamming loss, one-error, coverage, ranking loss and average precision. For the first 4, smaller values mean better performance; for average precision, larger values mean better performance. There are also two newer multi-label evaluation measures, average recall and average F1. Average recall computes the average fraction of proper labels that are predicted, and average F1 describes the trade-off between average precision and average recall; for both, larger values indicate better performance. The experiments of the present invention use all 7 measures (hamming loss, one-error, coverage, ranking loss, average precision, average recall, average F1) to evaluate the method, and the effect of the invention is verified by comparing these 7 measures.
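To make two of these measures concrete, the sketch below computes hamming loss and one-error under their usual definitions in the multi-label literature. The source does not spell the definitions out, so treating them this way is an assumption, and the function names and toy data are hypothetical.

```python
def hamming_loss(true_sets, pred_sets, num_labels):
    """Average fraction of labels on which the predicted and true label sets disagree."""
    mismatches = sum(len(set(t) ^ set(p)) for t, p in zip(true_sets, pred_sets))
    return mismatches / (len(true_sets) * num_labels)

def one_error(true_sets, scores):
    """Fraction of samples whose highest-scored label is not one of the true labels."""
    misses = 0
    for t, s in zip(true_sets, scores):
        top = max(range(len(s)), key=lambda c: s[c])
        if top not in t:
            misses += 1
    return misses / len(true_sets)

# Hypothetical toy example with 2 samples and 3 labels (indices 0, 1, 2).
true_sets = [{0, 2}, {1}]
pred_sets = [{0}, {1, 2}]
scores = [[0.9, 0.1, 0.4], [0.2, 0.3, 0.8]]
hl = hamming_loss(true_sets, pred_sets, num_labels=3)
oe = one_error(true_sets, scores)
```

Both values lie in $[0, 1]$, and for both a smaller value indicates better performance, in line with the paragraph above.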
On the MSRCv2 image data set, 2/3 of the samples are randomly selected as the training set and the remaining samples are used for testing. On the Scene image data set, 1500 samples are randomly selected as the training set and the remaining 500 form the test set. Each experiment is repeated 30 times, and the mean and standard deviation of the classification performance measures are then computed.
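The repeated random-split protocol can be sketched as follows. The `evaluate` callback stands for whatever trains a model on the training indices and returns one performance value on the test indices; it is hypothetical, as are the function name and the seed. The sketch also assumes the sample standard deviation, since the source does not say which variant was used.

```python
import random
import statistics

def repeated_holdout(num_samples, train_size, evaluate, repeats=30, seed=0):
    """Run `repeats` random train/test splits and return (mean, standard deviation)."""
    rng = random.Random(seed)
    results = []
    for _ in range(repeats):
        indices = list(range(num_samples))
        rng.shuffle(indices)
        train_idx, test_idx = indices[:train_size], indices[train_size:]
        results.append(evaluate(train_idx, test_idx))
    # Sample standard deviation (assumption); statistics.pstdev gives the population one.
    return statistics.mean(results), statistics.stdev(results)

# Scene-style split: 2000 samples, 1500 for training. The dummy evaluate below only
# reports the test fraction, so the mean is 0.25 and the deviation is 0.
mean, std = repeated_holdout(2000, 1500, lambda train, test: len(test) / 2000)
```

For MSRCv2, the same function would be called with 591 samples and a training size of `591 * 2 // 3`, i.e. 394.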
For MSRCv2, the graph-construction thresholds are threshold=[0.2,0.4,0.6,0.8,1,1.2,1.4,1.6,1.8,2], and the parameters of the RBF kernel functions corresponding to the different scales are gam=[0.2,0.4,0.6,0.8,1,1.2,1.4,1.6,1.8,2];
For the Scene image data set, gam=[0.2,0.8,1.6,200,3.2,5.6,500,7,9,100] and threshold=[0.2,0.4,0.6,0.8,1,1.2,1.4,1.6,1.8,2].
The experiments compare the classification performance measures of the method of the invention with those of several existing multi-instance multi-label algorithms: MIMLBOOST, MIMLSVM, MIMLSVMmi, MIMLNN, MIMLfast and KISAR. MIMLSVMmi improves on MIMLBOOST by replacing the MIBOOSTING method with the MI-SVM method, and MIMLNN improves on MIMLSVM by replacing the MLSVM method with a two-layer neural network. The method is also compared with the multi-label algorithm ML-kNN. The step size of MIMLfast is $\gamma_t = \gamma_0 / (1 + \eta \gamma_0 t)$ with $\gamma_0 = 0.005$ and $\eta = 10^{-5}$, and the upper bound of the norm is set to 1. In MIMLBOOST the number of boosting rounds is set to 50. The parameter k of MIMLSVM is set to 20% of the training set size. In ML-kNN the number of nearest neighbours is set to 10.
Table 1 and Table 2 give the comparison results on the two experimental data sets, with the best results shown in bold. Table 1 shows that the method of the invention outperforms the other 7 algorithms on hamming loss, coverage, ranking loss, average precision, average recall and average F1. The performance of MIMLNN is close to that of MK_MIML. Comparing the standard deviations of overall performance, MIMLNN has the smallest and MK_MIML the second smallest, so the stability of the method of the invention is higher than that of several existing methods. The method of the invention therefore improves the accuracy of image scene classification.
The algorithm achieves the best classification results for two reasons: first, the correlation features between instances characterize a bag more fully, which improves classification accuracy; second, multi-kernel fusion makes the algorithm more flexible and better suited to multi-label prediction. The algorithm also has shortcomings. On the one hand, it represents the correlation between instances by constructing a graph, one for each bag, which raises its complexity. On the other hand, multi-kernel fusion requires constructing multiple base kernel functions and learning a multi-kernel classifier for every class label, which raises the complexity further. How to reduce the complexity of the algorithm therefore remains to be solved.
Table 1: Experimental results on the MSRCv2 data set
Table 2: Experimental results on the Scene image data set
The above is only a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. A multi-instance multi-label scene classification method based on multi-kernel fusion, characterized in that it comprises the following steps:
Step 1: input a multi-instance multi-label data set, denoted $\{(X_i, Y_i) \mid i = 1, 2, \dots, m\}$, and split it into a multi-instance data set $X = \{X_i \mid i = 1, 2, \dots, m\}$ and a multi-label data set $Y = \{Y_i \mid i = 1, 2, \dots, m\}$;
where $i$ is the index of a multi-instance data bag in the multi-instance multi-label data set and $m$, a positive integer, is the total number of bags; $X_i$ is the bag with index $i$ in the multi-instance data set $X$, denoted $X_i = \{x_{i1}, x_{i2}, \dots, x_{in_i}\}$, where $x_{i1}$ is instance 1 of bag $X_i$, $x_{i2}$ is instance 2 of bag $X_i$, $x_{in_i}$ is instance $n_i$ of bag $X_i$, and $n_i$, a positive integer, is the number of instances contained in bag $i$; likewise $Y_i = \{y_{i1}, y_{i2}, \dots, y_{il_i}\}$, where $y_{i1}$ is label 1 of label set $Y_i$, $y_{i2}$ is label 2 of label set $Y_i$, $y_{il_i}$ is label $l_i$ of label set $Y_i$, and $l_i$, a positive integer, is the number of labels contained in $Y_i$;
Step 2: for each of a plurality of thresholds, build a correlation matrix for every multi-instance data bag $X_i$, so that under a given threshold each bag has one correlation matrix $W_i^s$; the thresholds are $t_s \in \{t_1, t_2, \dots, t_S\}$, where $S$ is the total number of thresholds and $s$ is the threshold index;
Step 3: using the correlation matrices obtained in step 2, compute the base kernel function between every two multi-instance data bags under the same threshold; the base kernel function values form a base kernel matrix, each element of which is the base kernel function value between two bags under that threshold, with row and column indices equal to the indices of the two bags; different thresholds yield different base kernel matrices $K_g^s$, where $g$ marks a base kernel matrix and $s$ is the index of the base kernel matrix, in one-to-one correspondence with the threshold index;
Step 4: combine the elements at the same position in the base kernel matrices $K_g^s$ obtained under the different thresholds in step 3 to obtain a multi-kernel function $K(X_i, X_j)$; the multi-kernel function values form the multi-kernel matrix $K$, that is, each element of $K$ is the multi-kernel function value between two multi-instance data bags;
Step 5: learn from the multi-label data set $Y_i$ and the multi-kernel function obtained in step 4 to obtain multiple multi-kernel SVM classifiers, as many as there are label classes in the multi-label data set; the classifiers are used to predict the label set of an unknown multi-instance data bag, thereby achieving scene classification.
2. The multi-instance multi-label scene classification method based on multi-kernel fusion according to claim 1, characterized in that in step 2 the correlation matrix of a bag $X_i$ under one threshold is built as follows:
Step 2.1: define an $n_i \times n_i$ matrix $W$ whose row and column indices correspond to the indices of two instances in bag $X_i$;
Step 2.2: determine whether the Gaussian distance between instance $x_{ia}$ and instance $x_{iu}$ is less than the threshold $t_s$; if it is, set the element in row $a$, column $u$ of $W$ to 1, otherwise set it to 0; $a$ and $u$ are instance indices, both integers in $[1, n_i]$; once every element of $W$ has been assigned, the result is the correlation matrix $W_i^s$ of bag $X_i$, where the superscript $s$ is the threshold index and the subscript $i$ is the bag index.
3. The multi-instance multi-label scene classification method based on multi-kernel fusion according to claim 2, characterized in that the threshold $t_s$ in step 2.2 takes a value in $[0, 4]$ and the number of thresholds lies in $[10, 15]$.
4. The multi-instance multi-label scene classification method based on multi-kernel fusion according to claim 1, characterized in that in step 3 the base kernel function between two multi-instance data bags under the same threshold is computed by the following formula:
$$K_g^s(X_i, X_j) = \frac{\sum_{a=1}^{n_i} \sum_{b=1}^{n_j} W_{ia}^s W_{jb}^s \, k_s(x_{ia}, x_{jb})}{\sum_{a=1}^{n_i} W_{ia}^s \sum_{b=1}^{n_j} W_{jb}^s}$$
where $X_i$ and $X_j$ are the multi-instance data bags with indices $i$ and $j$; $K_g^s(X_i, X_j)$ is the base kernel function between bag $X_i$ and bag $X_j$; $g$ marks a base kernel matrix; and $s$ is the index of the base kernel matrix, in one-to-one correspondence with the threshold index;
$W_{ia}^s = 1 / \sum_{u=1}^{n_i} [W_i^s]_{au}$ is the reciprocal of the sum of all elements in row $a$ of the correlation matrix of bag $X_i$ under threshold $t_s$, where $i$ is the bag index, $s$ is the index of the base kernel matrix (in one-to-one correspondence with the threshold index), and $a$ is a row index of that correlation matrix; $[W_i^s]_{au}$ is the element in row $a$, column $u$ of that correlation matrix; and $n_i$ is the total number of rows (or columns) of that correlation matrix, equal to the number of instances in bag $X_i$;
$W_{jb}^s = 1 / \sum_{v=1}^{n_j} [W_j^s]_{bv}$ is the reciprocal of the sum of all elements in row $b$ of the correlation matrix of bag $X_j$ under threshold $t_s$, where $j$ is the bag index; $[W_j^s]_{bv}$ is the element in row $b$, column $v$ of that correlation matrix; and $n_j$ is the total number of rows (or columns) of the correlation matrix of bag $X_j$, equal to the number of instances in bag $X_j$;
$k_s(x_{ia}, x_{jb})$ is a general kernel function, computed with the radial basis function (RBF) kernel: $k_s(x_{ia}, x_{jb}) = \exp(-\gamma \|x_{ia} - x_{jb}\|^2)$; where $s$ is the threshold index; $x_{ia}$ is instance $a$ of bag $X_i$; $x_{jb}$ is instance $b$ of bag $X_j$; $\exp(\cdot)$ is the exponential function with base $e \approx 2.71828$ and exponent $-\gamma \|x_{ia} - x_{jb}\|^2$; $\|x_{ia} - x_{jb}\|$ is the norm of $x_{ia} - x_{jb}$; and $\gamma$ is the kernel coefficient, which may take any value and takes a different value for each base kernel matrix.
5. The multi-instance multi-label scene classification method based on multi-kernel fusion according to any one of claims 1 to 4, characterized in that in step 4 the elements at the same position in the base kernel matrices $K_g^s$ under the different thresholds are combined by convex combination, which gives the multi-kernel function:
$$K(X_i, X_j) = \sum_{s=1}^{S} d_s K_g^s(X_i, X_j), \quad d_s \ge 0, \quad \sum_{s=1}^{S} d_s = 1;$$
In the formula, $K(X_i, X_j)$ is the multi-kernel function between bag $X_i$ and bag $X_j$, and also the element in row $i$, column $j$ of the multi-kernel matrix; $d_s$ is a weight coefficient; $s$ is the threshold index, a positive integer; $S$ is the total number of base kernel matrices; and $K_g^s(X_i, X_j)$ is the base kernel function between bag $X_i$ and bag $X_j$, and also the element in row $i$, column $j$ of the base kernel matrix $K_g^s$.
6. A multi-instance multi-label scene classification system based on multi-kernel fusion, characterized in that it comprises:
an input module, configured to input a multi-instance multi-label data set, denoted $\{(X_i, Y_i) \mid i = 1, 2, \dots, m\}$, and to split it into a multi-instance data set $X = \{X_i \mid i = 1, 2, \dots, m\}$ and a multi-label data set $Y = \{Y_i \mid i = 1, 2, \dots, m\}$;
where $i$ is the index of a multi-instance data bag in the multi-instance multi-label data set and $m$, a positive integer, is the total number of bags; $X_i$ is the bag with index $i$ in the multi-instance data set $X$, denoted $X_i = \{x_{i1}, x_{i2}, \dots, x_{in_i}\}$, where $x_{i1}$ is instance 1 of bag $X_i$, $x_{i2}$ is instance 2 of bag $X_i$, $x_{in_i}$ is instance $n_i$ of bag $X_i$, and $n_i$, a positive integer, is the number of instances contained in bag $i$; likewise $Y_i = \{y_{i1}, y_{i2}, \dots, y_{il_i}\}$, where $y_{i1}$ is label 1 of label set $Y_i$, $y_{i2}$ is label 2 of label set $Y_i$, $y_{il_i}$ is label $l_i$ of label set $Y_i$, and $l_i$, a positive integer, is the number of labels contained in $Y_i$;
a correlation matrix building module, configured to build, for each of a plurality of thresholds, a correlation matrix for every multi-instance data bag $X_i$, so that under a given threshold each bag has one correlation matrix $W_i^s$; the thresholds are $t_s \in \{t_1, t_2, \dots, t_S\}$, where $S$ is the total number of thresholds and $s$ is the threshold index;
a base kernel matrix module, configured to compute, from the correlation matrices obtained in the correlation matrix building module, the base kernel function between every two multi-instance data bags under the same threshold; the base kernel function values form a base kernel matrix, each element of which is the base kernel function value between two bags under that threshold, with row and column indices equal to the indices of the two bags; different thresholds yield different base kernel matrices $K_g^s$, where $g$ marks a base kernel matrix and $s$ is the index of the base kernel matrix, in one-to-one correspondence with the threshold index;
Composite module, for the basic nuclear matrix K under the different threshold values that will obtain in basic nuclear matrix module gsthe element of middle same position combines, and obtains a multi-kernel function K (X i, X j), described multiple multi-kernel function value composition multinuclear matrix, the corresponding every multi-kernel function value more than two between sample data bag of the element value namely in described multinuclear matrix K;
Study module, for utilizing many label datas collection Y ilearn with the multi-kernel function obtained in composite module, obtain multiple multinuclear SVM classifier, the quantity of described sorter is identical with the labels class quantity that described many label datas are concentrated, and described sorter is used for predicting the tally set of the unknown many sample datas bag thus realizing scene classification.
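The "one classifier per label class" arrangement of the learning module can be illustrated with a minimal sketch. The function names `binary_targets` and `predict_label_set`, the scene labels, and the decision scores below are all hypothetical and are not taken from the patent; the sketch only shows how a multi-label training set could be split into one binary problem per label class, and how per-class decision values could be turned back into a label set.

```python
def binary_targets(label_sets, classes):
    """For each label class, build a +1/-1 target vector over all bags.

    label_sets[i] is the label set Y_i of bag X_i (hypothetical data).
    One vector per class means one binary classifier per class.
    """
    return {c: [1 if c in labels else -1 for labels in label_sets]
            for c in classes}


def predict_label_set(scores):
    """Keep every class whose (hypothetical) decision value is positive."""
    return {c for c, score in scores.items() if score > 0}


# Three bags with illustrative scene labels.
label_sets = [{"sky", "sea"}, {"mountain"}, {"sky"}]
classes = sorted(set().union(*label_sets))
targets = binary_targets(label_sets, classes)
```

Each target vector, together with the multi-kernel matrix, could then be passed to any SVM implementation that accepts a precomputed kernel; that training step is left out here because this section does not specify the solver.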
7. The multi-instance multi-label scene classification system based on multi-kernel fusion according to claim 6, characterized in that the correlation matrix building module builds the correlation matrix of one bag $X_i$ under one threshold as follows:
Step 2.1: define an $n_i \times n_i$ matrix $W$ whose row index and column index correspond to the indices of two instances in the multi-instance bag $X_i$;
Step 2.2: determine whether the Gaussian distance between instance $x_{ia}$ and instance $x_{iu}$ is less than the threshold $t_s$; if so, set the element in row $a$, column $u$ of $W$ to 1, otherwise set it to 0; $a$ and $u$ are instance indices, each an integer in $[1, n_i]$; once every element of $W$ has been assigned, the result is the correlation matrix $W_i^s$ of the multi-instance bag $X_i$, where the superscript $s$ is the index of the threshold and the subscript $i$ is the index of the bag.
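Steps 2.1 and 2.2 can be sketched as follows. This is a minimal illustration under stated assumptions: this section does not define the "Gaussian distance", so a plain Euclidean distance is used as a stand-in, and the names `euclidean` and `correlation_matrix` are hypothetical.

```python
import math


def euclidean(x, y):
    """Stand-in distance (assumption): the claim's Gaussian distance
    is not defined in this section."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


def correlation_matrix(bag, t, dist=euclidean):
    """Return the n_i x n_i 0/1 matrix W for one bag and one threshold.

    W[a][u] is 1 when the distance between instances a and u is
    strictly less than t, and 0 otherwise (step 2.2).
    """
    n = len(bag)
    return [[1 if dist(bag[a], bag[u]) < t else 0 for u in range(n)]
            for a in range(n)]


# A bag of three 2-D instances; the first two are close together.
bag = [(0.0, 0.0), (0.5, 0.0), (3.0, 4.0)]
W = correlation_matrix(bag, t=1.0)
```

With a strict "less than" comparison, a threshold of exactly 0 would leave every element at 0, including the diagonal, so any later step that divides by a row sum would need to handle that case.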
8. The multi-instance multi-label scene classification system based on multi-kernel fusion according to claim 7, characterized in that the threshold $t_s$ in step 2.2 takes values in $[0, 4]$, and the number of thresholds is in $[10, 15]$.
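One way to pick thresholds that satisfy these ranges is an evenly spaced grid. The even spacing and the name `threshold_grid` are assumptions made for illustration; the claim only constrains the value range and the count.

```python
def threshold_grid(S=10, low=0.0, high=4.0):
    """Return S evenly spaced thresholds in [low, high].

    Even spacing is an assumption; the claim only requires values in
    [0, 4] and a count between 10 and 15.
    """
    if not 10 <= S <= 15:
        raise ValueError("the claim limits the number of thresholds to [10, 15]")
    step = (high - low) / (S - 1)
    return [low + s * step for s in range(S)]


thresholds = threshold_grid(S=10)
```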
9. The multi-instance multi-label scene classification system based on multi-kernel fusion according to claim 8, characterized in that the basic kernel matrix module computes the basic kernel function between every two multi-instance bags according to the following formula:
$$K_g^s(X_i, X_j) = \frac{\sum_{a=1}^{n_i} \sum_{b=1}^{n_j} W_{ia}^s \, W_{jb}^s \, k_s(x_{ia}, x_{jb})}{\sum_{a=1}^{n_i} W_{ia}^s \, \sum_{b=1}^{n_j} W_{jb}^s};$$
Wherein $X_i$ and $X_j$ denote the multi-instance bags numbered $i$ and $j$ respectively; $K_g^s(X_i, X_j)$ is the basic kernel function between bag $X_i$ and bag $X_j$; $g$ marks a basic kernel matrix, and $s$ is the index of the basic kernel matrix, in one-to-one correspondence with the index of the threshold;
$W_{ia}^s = 1 / \sum_{u=1}^{n_i} [W_i^s]_{au}$ is the reciprocal of the sum of all elements in row $a$ of the correlation matrix of bag $X_i$ under threshold $t_s$, where $i$ is the index of the bag, $s$ is the index of the basic kernel matrix, in one-to-one correspondence with the index of the threshold, and $a$ is a row index of that correlation matrix; $[W_i^s]_{au}$ denotes the element in row $a$, column $u$ of the correlation matrix of bag $X_i$ under threshold $t_s$; $n_i$ is the total number of rows (equivalently, columns) of that correlation matrix and equals the number of instances in bag $X_i$;
$W_{jb}^s = 1 / \sum_{v=1}^{n_j} [W_j^s]_{bv}$ is the reciprocal of the sum of all elements in row $b$ of the correlation matrix of bag $X_j$ under threshold $t_s$, where $j$ is the index of the bag; $[W_j^s]_{bv}$ denotes the element in row $b$, column $v$ of that correlation matrix; $n_j$ is the total number of rows (equivalently, columns) of the correlation matrix of bag $X_j$ and equals the number of instances in bag $X_j$;
$k_s(x_{ia}, x_{jb})$ is an ordinary kernel function, computed as a radial basis function kernel: $k_s(x_{ia}, x_{jb}) = \exp(-\gamma \|x_{ia} - x_{jb}\|^2)$; wherein $s$ is the index of the threshold; $x_{ia}$ is the instance numbered $a$ in bag $X_i$, $i$ being the index of the bag and $a$ the index of the instance; $x_{jb}$ is the instance numbered $b$ in bag $X_j$, $j$ being the index of the bag and $b$ the index of the instance; $\exp(-\gamma \|x_{ia} - x_{jb}\|^2)$ is the exponential function with base $e \approx 2.71828$ and exponent $-\gamma \|x_{ia} - x_{jb}\|^2$; $\|x_{ia} - x_{jb}\|$ is the norm of $x_{ia} - x_{jb}$; $\gamma$ is the kernel coefficient and may take any value, and different basic kernel matrices use different values of $\gamma$.
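The basic kernel function of claim 9 can be sketched for a single pair of bags as follows. The function names `rbf`, `row_weights` and `basic_kernel` and the example data are hypothetical; the sketch assumes each correlation matrix is a list of 0/1 rows and that every row sum is non-zero.

```python
import math


def rbf(x, y, gamma):
    """Radial basis function kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(x, y)))


def row_weights(W):
    """Weight of instance a: reciprocal of the sum of row a of W.

    Assumes every row sum is non-zero.
    """
    return [1.0 / sum(row) for row in W]


def basic_kernel(bag_i, W_i, bag_j, W_j, gamma):
    """Weighted average of instance-level RBF kernel values between
    two bags, using the row-sum reciprocals as instance weights."""
    wi, wj = row_weights(W_i), row_weights(W_j)
    numerator = sum(wi[a] * wj[b] * rbf(xa, xb, gamma)
                    for a, xa in enumerate(bag_i)
                    for b, xb in enumerate(bag_j))
    return numerator / (sum(wi) * sum(wj))


# Two instances that are mutually correlated, against a single instance.
bag_i, W_i = [(0.0, 0.0), (1.0, 0.0)], [[1, 1], [1, 1]]
bag_j, W_j = [(0.0, 0.0)], [[1]]
k_ij = basic_kernel(bag_i, W_i, bag_j, W_j, gamma=1.0)
```

Filling an m x m matrix with `basic_kernel` for every pair of bags under one threshold would give one basic kernel matrix; repeating this for each threshold gives the family that the combination module fuses.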
10. The multi-instance multi-label scene classification system based on multi-kernel fusion according to any one of claims 6 to 9, characterized in that the combination module uses a convex combination to combine the elements at the same position in the basic kernel matrices $K_g^s$ obtained under the different thresholds, and the multi-kernel function obtained by the convex combination is:
$$K(X_i, X_j) = \sum_{s=1}^{S} d_s \, K_g^s(X_i, X_j), \qquad d_s \ge 0, \qquad \sum_{s=1}^{S} d_s = 1;$$
In the formula, $K(X_i, X_j)$ is the multi-kernel function between bag $X_i$ and bag $X_j$, and is also the element in row $i$, column $j$ of the multi-kernel matrix; $d_s$ is a weight coefficient; $s$ is the index of the threshold, a positive integer; $S$ is the total number of basic kernel matrices; $K_g^s(X_i, X_j)$ is the basic kernel function between bag $X_i$ and bag $X_j$, and is also the element in row $i$, column $j$ of the basic kernel matrix $K_g^s$.
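The convex combination of claim 10 can be sketched element by element. The name `combine_kernels`, the two small matrices, and the weights are hypothetical; this section does not say how the weights $d_s$ are chosen, so they are simply passed in.

```python
def combine_kernels(basic_matrices, d):
    """Convex combination of S basic kernel matrices of equal size.

    basic_matrices[s] is the basic kernel matrix for threshold s and
    d[s] its weight; weights must be non-negative and sum to 1.
    """
    if len(d) != len(basic_matrices):
        raise ValueError("need exactly one weight per basic kernel matrix")
    if any(w < 0 for w in d) or abs(sum(d) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    m = len(basic_matrices[0])
    return [[sum(w * K[i][j] for w, K in zip(d, basic_matrices))
             for j in range(m)]
            for i in range(m)]


# Two illustrative 2 x 2 basic kernel matrices (S = 2).
K1 = [[1.0, 0.2], [0.2, 1.0]]
K2 = [[1.0, 0.6], [0.6, 1.0]]
K = combine_kernels([K1, K2], d=[0.25, 0.75])
```

Because each weight is non-negative, the combined matrix stays positive semi-definite whenever every basic kernel matrix is, which is what allows it to be used as an SVM kernel.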
CN201510344990.5A 2015-06-19 2015-06-19 Multi-instance multi-label scene classification method based on multi-kernel fusion Expired - Fee Related CN105046269B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510344990.5A CN105046269B (en) 2015-06-19 2015-06-19 Multi-instance multi-label scene classification method based on multi-kernel fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510344990.5A CN105046269B (en) 2015-06-19 2015-06-19 Multi-instance multi-label scene classification method based on multi-kernel fusion

Publications (2)

Publication Number Publication Date
CN105046269A (en) 2015-11-11
CN105046269B (en) 2019-02-22

Family

ID=54452798

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510344990.5A Expired - Fee Related CN105046269B (en) 2015-06-19 2015-06-19 Multi-instance multi-label scene classification method based on multi-kernel fusion

Country Status (1)

Country Link
CN (1) CN105046269B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101127029A (en) * 2007-08-24 2008-02-20 复旦大学 Method for training SVM classifier in large scale data classification
CN104091038A (en) * 2013-04-01 2014-10-08 太原理工大学 Method for weighting multiple example studying features based on master space classifying criterion
CN103839084A (en) * 2014-03-12 2014-06-04 湖州师范学院 Multi-kernel support vector machine multi-instance learning algorithm applied to pedestrian re-identification

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
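Pan Qiang et al.: "Multi-instance kernel learning method combining concept weights in the instance space", Science Technology and Engineering *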

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105656692B (en) * 2016-03-14 2019-05-24 南京邮电大学 Area monitoring method based on more example Multi-label learnings in wireless sensor network
CN105656692A (en) * 2016-03-14 2016-06-08 南京邮电大学 Multi-instance multi-label learning based area monitoring method used in wireless sensor network
CN106127247A (en) * 2016-06-21 2016-11-16 广东工业大学 Image classification method based on multitask many examples support vector machine
CN106127247B (en) * 2016-06-21 2019-07-09 广东工业大学 Image classification method based on the more example support vector machines of multitask
CN106295548A (en) * 2016-08-05 2017-01-04 西北工业大学 A kind of method for tracking target and device
CN107330463A (en) * 2017-06-29 2017-11-07 南京信息工程大学 Model recognizing method based on CNN multiple features combinings and many nuclear sparse expressions
CN108764192A (en) * 2018-06-04 2018-11-06 华中师范大学 A kind of more example multi-tag learning methods towards safe city video monitoring application
CN108764192B (en) * 2018-06-04 2021-05-18 华中师范大学 Multi-example multi-label learning method for safe city video monitoring application
CN111488400A (en) * 2019-04-28 2020-08-04 北京京东尚科信息技术有限公司 Data classification method, device and computer readable storage medium
CN110175657A (en) * 2019-06-05 2019-08-27 广东工业大学 A kind of image multi-tag labeling method, device, equipment and readable storage medium storing program for executing
CN112100390A (en) * 2020-11-18 2020-12-18 智者四海(北京)技术有限公司 Scene-based text classification model, text classification method and device
CN112100390B (en) * 2020-11-18 2021-05-07 智者四海(北京)技术有限公司 Scene-based text classification model, text classification method and device
CN113141349A (en) * 2021-03-23 2021-07-20 浙江工业大学 HTTPS encrypted flow classification method with self-adaptive fusion of multiple classifiers
CN113141349B (en) * 2021-03-23 2022-07-15 浙江工业大学 HTTPS encrypted flow classification method with self-adaptive fusion of multiple classifiers

Also Published As

Publication number Publication date
CN105046269B (en) 2019-02-22

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190222

Termination date: 20210619