CN108416364A - Sub-package fusion ensemble learning data classification method - Google Patents

Sub-package fusion ensemble learning data classification method

Info

Publication number
CN108416364A
CN108416364A CN201810097334.3A CN201810097334A CN108416364A CN 108416364 A CN108416364 A CN 108416364A CN 201810097334 A CN201810097334 A CN 201810097334A CN 108416364 A CN108416364 A CN 108416364A
Authority
CN
China
Prior art keywords
sample
subset
weight
classifier model
ensemble learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810097334.3A
Other languages
Chinese (zh)
Inventor
李勇明
张�成
王品
李淋玉
谭晓衡
颜芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University filed Critical Chongqing University
Priority to CN201810097334.3A priority Critical patent/CN108416364A/en
Publication of CN108416364A publication Critical patent/CN108416364A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention discloses a sub-package fusion ensemble learning data classification method comprising the following steps. S1: obtain data and form a training set and a test set; S2: divide the training set into K subsets using a subspace partition module; S3: train one classifier model per subset; S4: compute the weight factor corresponding to each classifier model; S5: input the test data into each classifier model, multiply the sample labels output by the classifier models by the corresponding weight factors, and sum the weighted outputs to obtain the final classification result. Its effect: by sub-packaging and learning the samples within each subspace, the influence of the overlapping region of the sample space on the classifier models is weakened; the misclassified samples of each subset are then enhanced and transferred into the next subset for re-learning, which increases sample utilization. A multi-space weighted ensemble learning module fuses the predictions of all subsets by weighting, further reducing the influence of overlapping-region samples on the classifier models and improving classification accuracy.

Description

Sub-package fusion ensemble learning data classification method
Technical field
The invention belongs to the data classification and recognition technologies of the big data field, and in particular relates to a sub-package fusion ensemble learning data classification method.
Background technology
In the big data field, data classification is widely applied, for example in medical diagnosis, emotion judgment, semantic recognition, and image recognition. Commonly used classifiers include the random forest (RF) algorithm, the K-nearest-neighbor (KNN) algorithm, the support vector machine (SVM) model, and the extreme learning machine (ELM) model. Although existing research has made great progress in feature extraction, feature learning, and classifier design, sample learning is often neglected.
Taking the diagnosis of Parkinson's disease from voice data as an example: during speech sampling and preprocessing, factors such as the acquisition equipment and noise may introduce large errors between the final numerical samples and the actual samples, producing abnormal samples. Abnormal samples typically cause samples of different classes to be aliased in the sample space and form overlapping regions, and samples in an overlapping region may mislead the classifier model. No study so far has proven whether this portion of the samples is beneficial or harmful to the classifier model being built. Existing methods either delete these samples or regard them as just as important as the other samples, without considering an algorithmic way to weaken their influence on the classifier.
Invention content
In view of the above drawbacks, the present invention provides a sub-package fusion ensemble learning data classification method that learns the sample space so as to reduce the influence of overlapping-region samples on the classification model. First, a class-centroid distance metric ratio is computed for each sample in the training set and used as the sample weight; the training samples are sorted in descending order of weight, and the sorted training set is then divided sequentially into several subsets. Second, leave-one-out (LOO) cross-validation is used to determine the misclassified samples and the error rate of each subset, and a sub-classifier model is trained on each subset. A penalty factor is computed from the sample weights within each subset, and the weight factor of a subset is computed from the subset's error rate after LOO. During the learning over all subsets, the misclassified samples of the previous subset are enhanced and then transferred into the next subset, where they are learned again. Third, the weight of each subset is computed from the subset's weight factor and penalty factor, and the test results of the sub-classifiers are weighted by the subset weights. By learning the samples in each subspace, enhancing the misclassified samples of each subset, and transferring them into the next subset for re-learning, the existing samples are fully utilized and sample utilization is increased. A multi-space weighted ensemble learning module fuses the predictions of all subsets by weighting, further reducing the influence of overlapping-region samples on the classifier models and improving classification accuracy.
To achieve the above object, the specific technical solution of the present invention is as follows:
A sub-package fusion ensemble learning data classification method, characterized by comprising the following steps:
S1: obtain data and form a training set and a test set;
S2: divide the training set into K subsets using a subspace partition module, K being an integer greater than or equal to 2;
S3: train one classifier model per subset;
S4: compute the weight factor corresponding to each classifier model;
S5: input the test data into each classifier model; the sample labels output by the classifier models are multiplied by the corresponding weight factors and summed to obtain the final classification result.
Further, the subspace partition module in step S2 uses the class-centroid distance metric ratio as the sample weight: the class-centroid distance metric ratio of each sample in the training set is computed, the samples are arranged in descending order, and the sorted training set is finally divided sequentially into K subsets.
Further, step S3 trains the classifier models using a subspace sample-transfer training method, specifically:
S31: let the true label set of subset Tk be Yk = [y1, y2, …, yj, …, ys]; verify with a leave-one-out cross-validation method to obtain the predicted label set Lk;
S32: count the misclassified samples in subset Tk and the subset's classification error rate error_rate;
S33: compute the weight factor of each of the K trained classifier models as αk = (1/2)·ln((1 − error_rate)/error_rate).
Further, the classification error rate in step S32 is error_rate = Σj weight(j)·I(Yk(j) ≠ Lk(j)), summed over j = 1, …, s, where:
wj denotes the class-centroid distance metric ratio of the j-th sample, Σj wj denotes the sum of the class-centroid distance metric ratios of the s samples of subset Tk, weight(j) = wj / Σj wj represents the initialized weight of the j-th sample, and I(Yk(j) ≠ Lk(j)) indicates that the j-th sample is misclassified.
Further, the samples of subset Tk that are misclassified under leave-one-out cross-validation are enhanced and then transferred into the next subset Tk+1, where they are learned again.
Further, the enhancement method for misclassified samples is w_new = w_old·exp(−αk), where w_old is the original weight of the misclassified sample and w_new is its weight after enhancement.
Further, a multi-space weighted ensemble learning module performs the weighting, specifically:
S41: compute the penalty factors βk of the K subsets separately from the sample weights within each subset;
S42: compute the weight of each subset classifier according to weightk = βk·αk;
S43: compute the weighting of the sample prediction labels output by each subset classifier.
The remarkable effects of the present invention are as follows:
The subspace partition module proposed by this method is based on the bag concept of the bagging algorithm, but divides the training set directly into several subsets according to a fixed criterion rather than by the repeated random sampling of the bagging algorithm; the algorithm thus saves the repeated sampling process and reduces time complexity, and dividing subsets according to the sample distribution characteristics of the sample space weakens the influence of overlapping-region samples on the other samples when the classifier models are trained. The subspace sample-transfer training module draws on the sample-enhancement concept and the classifier-weight calculation of the Adaboost algorithm: the samples in each subspace are learned, and the misclassified samples of each subset are enhanced and transferred into the next subset for re-learning, so that the existing samples are fully utilized and sample utilization is increased. Finally, the multi-space weighted ensemble learning module fuses the predictions of all subsets by weighting, which further reduces the influence of overlapping-region samples on the classifier models and improves classification accuracy.
Description of the drawings
Fig. 1 is the control flow chart of the present invention;
Fig. 2 is the data sub-packaging flow chart of the subspace partition module;
Fig. 3 is a schematic diagram of the class-centroid distance calculation;
Fig. 4 is the flow chart of subspace sample-transfer training;
Fig. 5 is the flow chart of multi-space weighted ensemble learning;
Fig. 6 shows the average classification accuracy of randomly drawn samples for different numbers of subsets;
Fig. 7 shows the subset weights and test-set prediction results under different conditions;
Fig. 8 compares the performance of different algorithms in the specific embodiment.
Specific implementation mode
To make the technical problem to be solved by the present invention, the technical solution, and the advantages clearer, a detailed description is given below with reference to the accompanying drawings and specific embodiments.
As shown in Figure 1, the present embodiment provides a sub-package fusion ensemble learning data classification method comprising the following steps:
S1: obtain data and form a training set and a test set;
S2: divide the training set into K subsets using a subspace partition module, K being an integer greater than or equal to 2;
S3: train one classifier model per subset;
S4: compute the weight factor corresponding to each classifier model;
S5: input the test data into each classifier model; the sample labels output by the classifier models are multiplied by the corresponding weight factors and summed to obtain the final classification result.
The present embodiment applies the method to the diagnosis of Parkinson's disease: voice data are classified to realize early diagnosis and prediction of Parkinson's disease. The data set used is "Training set", provided by Sakar et al. and downloaded from the machine learning repository website of the University of California, Irvine (UCI). The data set is divided into two parts: subjects with Parkinson's disease and healthy subjects. Among the subjects with Parkinson's disease there are 14 males and 6 females; among the healthy subjects there are 10 males and 10 females, so the data set contains 40 subjects in total. The whole data set includes 1040 samples, each sample having 26 features. Notably, each subject has 26 samples, representing 26 different semantic tasks.
In a specific implementation, the above method can be divided into three parts: the subspace partition module (SP), the subspace sample-transfer training module (TST), and the multi-space weighted ensemble learning module (MWEL). The SP module divides the training set into subsets. The TST module trains a sub-classifier model on each subset and computes the subset's related parameters. The MWEL module fuses the predicted labels of all subsets by weighting to obtain the final classification result.
As shown in Fig. 2, the bagging algorithm generates multiple new training sets from the original training set by random sampling with replacement; each new training set has the same number of samples as the original training set. Each new training set then trains a classifier model that is verified on the test set. Finally, the prediction labels of the classifier models of the new training sets are combined by voting to obtain the final result. Clearly, the training sets in the bagging algorithm are obtained by random sampling, which leads to uncertainty in the result. When classification experiments are run with the bagging algorithm, the experiment is usually repeated many times and the average of the repeated results is taken as the final result; such an experimental procedure undoubtedly increases the time complexity of the model. The subspace partition module (SP) proposed by the present invention is based on the bag concept of the bagging algorithm, but divides the training set directly into several subsets according to a fixed criterion rather than by repeated random sampling as in the bagging algorithm. In this process, the training-set samples are weighted by the sample class-centroid distance ratio.
Assume the training set T containing K classes of samples is expressed as T = [S1; S2; …; St; …; SK], t = 1, 2, …, K, where the sample set of class t is St = [s1; s2; …; si; …; sm], i = 1, 2, …, m, and m is the number of samples of the class. The i-th sample of St is expressed as si = [f1; …; fj; …; fn], j = 1, 2, …, n, where fj denotes the j-th feature of the sample. As shown in Fig. 3, point B is the sample center of class St, with coordinates B = [b1, …, bj, …, bn], where bj = (1/m)·Σi fj(i) and fj(i) is the j-th feature of the i-th sample. Point C is the coordinate of the i-th sample of class St. Point A is the center of all samples of the other (foreign) classes, expressed analogously.
Let D be the midpoint of segment AB, so that AD = DB, and let α = ∠CDB and β = ∠CDA. Then the distance of sample si to its own class center is d0 = |BC|, and its distance to the center of the foreign-class samples is d1 = |AC|.
The class-centroid distance metric ratio of sample i is therefore wi = d0 / d1.  (3)
As can be seen from the figure, the class-centroid distance metric ratio wi of a sample falls into three cases:
From a geometric point of view, since AD = DB, CD is a shared side, and α + β = 180°, the lengths of segments BC and AC (i.e., d0 and d1) are closely tied to the sizes of the angles α and β. If α < β, then d0 < d1, meaning wi < 1; if α = β, the triangles ΔADC and ΔDBC are congruent, so d0 = d1 and wi = 1; if α > β, then d0 > d1, meaning wi > 1.
Analysis shows that the larger the w value, the greater the aliasing between the sample and samples of other classes; for the same d0, the farther the sample point is from the other classes, the smaller the w value. The class-centroid distance metric ratio can therefore be used to express the degree of aliasing between a sample and the samples of the other classes. Ideally, the smaller w is, the more beneficial the sample is for building the classifier model; the larger w is, the more the sample may mislead the classifier model within the whole sample space. The present invention therefore uses the class-centroid distance metric ratio as the sample weight. Formula (3) yields the class-centroid distance metric ratio w of each sample in the training set; the training samples are sorted by w from large to small, and the sorted training set is finally divided evenly and sequentially into K subsets. We term this process sub-packaging.
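As an illustration of formula (3) and the sub-packaging step, the class-centroid distance metric ratio and the descending-order split can be sketched as follows (a minimal NumPy sketch; the function names and the use of Euclidean distance are assumptions, not taken from the patent):

```python
import numpy as np

def centroid_distance_ratio(X, y):
    """Class-centroid distance metric ratio w_i = d0 / d1 per sample:
    d0 = distance to the sample's own class center (point B),
    d1 = distance to the center of all other-class samples (point A)."""
    w = np.empty(len(X))
    for i, (x, label) in enumerate(zip(X, y)):
        own_center = X[y == label].mean(axis=0)    # point B
        other_center = X[y != label].mean(axis=0)  # point A
        d0 = np.linalg.norm(x - own_center)
        d1 = np.linalg.norm(x - other_center)
        w[i] = d0 / d1
    return w

def subpackage(X, y, w, K):
    """Sort samples by w in descending order and split them sequentially
    into K subsets (the 'sub-packaging' step)."""
    order = np.argsort(-w)  # largest w (most aliased samples) first
    return [(X[idx], y[idx], w[idx]) for idx in np.array_split(order, K)]
```

Samples with w < 1 sit closer to their own class center than to the foreign-class center; values above 1 indicate samples deep in the overlapping region.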
Assume the original training set, after the subspace partition module, is divided into K subsets, expressed as:
T = [T1, T2, …, Tk, …, TK], k = 1, 2, …, K.
The sample weights of the training set are expressed as W = [W1, W2, …, Wk, …, WK], where:
Wk = [w1, w2, …, wj, …, ws], j = 1, 2, …, s, represents the weight set of the k-th subset Tk, and s denotes the number of samples of subset Tk.
It was found through study that after the training set is divided into K subsets, the separability of the samples of the subsets, as the subset index increases, behaves approximately as a concave function. The greater a subset's separability, the better the performance of the trained sub-classifier, and the larger the weight of that sub-classifier should be in the whole model.
Step S3 trains the classifier models using the subspace sample-transfer training method, specifically:
S31: let the true label set of subset Tk be Yk = [y1, y2, …, yj, …, ys]; verify with a leave-one-out cross-validation method to obtain the predicted label set Lk;
S32: count the misclassified samples in subset Tk and the subset's classification error rate error_rate;
S33: compute the weight factor of each of the K trained classifier models according to formula (5):
αk = (1/2)·ln((1 − error_rate)/error_rate).  (5)
With reference to the sample-enhancement idea and the classifier-weight calculation of the Adaboost algorithm, the misclassified samples and the classification error rate of subset Tk are counted, and the weight factor of the trained classifier model is computed.
The classification error rate in step S32 is:
error_rate = Σj weight(j)·I(Yk(j) ≠ Lk(j)), summed over j = 1, …, s,
where wj denotes the class-centroid distance metric ratio of the j-th sample, Σj wj denotes the sum of the class-centroid distance metric ratios of the s samples of subset Tk, weight(j) = wj / Σj wj represents the initialized weight of the j-th sample, and I(Yk(j) ≠ Lk(j)) indicates that the j-th sample is misclassified.
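The error-rate and weight-factor computation of steps S32–S33 can be sketched as below; the Adaboost-style form αk = (1/2)·ln((1 − error_rate)/error_rate) is an assumption based on the text's reference to the Adaboost classifier-weight calculation, and the small epsilon guard is added for numerical safety:

```python
import numpy as np

def subset_error_and_alpha(w, y_true, y_pred, eps=1e-12):
    """Weighted classification error rate of a subset under LOO predictions,
    and the subset's weight factor (assumed Adaboost-style form).

    w      : class-centroid distance metric ratios of the subset's samples
    y_true : true labels Y_k;  y_pred : leave-one-out predictions L_k
    """
    weight = w / w.sum()                          # initialized weights weight(j)
    error_rate = weight[y_true != y_pred].sum()   # sum over misclassified samples
    alpha = 0.5 * np.log((1.0 - error_rate) / max(error_rate, eps))
    return error_rate, alpha
```

With equal sample weights this reduces to the plain misclassification rate, and alpha is non-negative whenever the error rate is at most 0.5.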
Assume the set of samples of subset Tk misclassified after leave-one-out cross-validation is given; in the TST module, the samples of this misclassified set are enhanced and then passed into the next subset Tk+1, where they are learned again. The transfer thus proceeds subset by subset, and a misclassified sample can always be retrained in the next subset, which increases sample utilization. The flow chart of the TST module is shown in Fig. 4.
As shown in Fig. 4, before the misclassified samples of each subset are transferred into the next subset, they must be enhanced. Because the subspace partition module divides the subsets in descending order of sample weight, the sample weights decrease from subset to subset, and enhancing a sample means reducing the class-centroid distance metric ratio of the misclassified sample. It is, however, worth considering that the misclassified samples of the previous subset may be misclassified again while the next subset is learned, so the influence of the previous subset's misclassified samples on the classifier weight of the next trained subset must be suppressed. At the same time, the larger αk is, the greater the sample separability of subset Tk, and the misclassified samples placed into the next subset are the more likely to affect the next subset's weight; these misclassified samples should therefore interfere with the model as little as possible. The sample-enhancement mode of the present invention is expressed as:
w_new = w_old·exp(−αk),  (8)
where w_old is the original weight of the misclassified sample and w_new is its weight after enhancement (the sample class-centroid distance metric ratio is also referred to as the sample weight). Since αk always satisfies αk ≥ 0, exp(αk) ≥ 1 always holds, so formula (8) reliably reduces the class-centroid distance metric ratio of a misclassified sample, realizing the sample enhancement. Moreover, exp(αk) is a monotonically increasing function: the larger αk, the smaller the new weight that formula (8) produces for a misclassified sample, and the smaller its influence on the parameter αk+1 of the next subset. In this way, samples from a subset with larger separability are effectively prevented from inflating the weight of a subset with smaller separability.
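The sample enhancement of formula (8) can be sketched as follows; the form w_new = w_old·exp(−αk) is reconstructed from the surrounding analysis (exp(αk) ≥ 1, and a larger αk gives a smaller new weight) and should be read as an assumption:

```python
import numpy as np

def enhance_misclassified(w_old, alpha_k):
    """Reduce the class-centroid distance metric ratio of misclassified
    samples before they are transferred into the next subset.
    Assumed form of formula (8): w_new = w_old / exp(alpha_k)."""
    return w_old * np.exp(-alpha_k)  # <= w_old whenever alpha_k >= 0
```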
Next, as shown in Fig. 5, the predicted labels of all subsets are fused by weighting with the MWEL module, which further weakens the influence of overlapping-region samples on the classifier models; the final classification accuracy is obtained through ensemble learning. The weight of the classifier model trained on each subset within the whole model can be calculated using equation (5). However, the distribution of the test samples in the sample space is completely unknown, so formula (5) alone cannot fully express the weight of each trained classifier in the final model. To improve the robustness of the model on the test set, a penalty factor is needed to constrain the subset weights. Therefore, this embodiment uses the multi-space weighted ensemble learning module to perform the weighting, specifically:
S41: compute the penalty factors βk of the K subsets separately from the sample weights within each subset;
S42: compute the weight of each subset classifier according to weightk = βk·αk;
S43: compute the weighting of the sample prediction labels output by each subset classifier.
With the above design, αk is constrained by βk, which improves the generalization ability of the model. Assume the weight set of the K subsets is expressed as Weight = [weight1, weight2, …, weightK]; the weight of a subset in the whole model is computed from Weight. If λk represents the weight of the k-th subset in the whole model, then in order to ensure that the λk sum to 1, λk is calculated as:
λk = weightk / Σk weightk.
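The MWEL fusion of steps S41–S43 can be sketched as below. The patent's penalty-factor formula is not reproduced in the text, so βk is taken as a given input here, and the ±1 label encoding is an assumption; weightk = βk·αk and the normalization λk = weightk / Σ weightk follow the description:

```python
import numpy as np

def fuse_predictions(alphas, betas, label_matrix):
    """Multi-space weighted fusion of the sub-classifier outputs.

    alphas, betas : per-subset weight factors and penalty factors, shape (K,)
                    (the penalty-factor formula itself is not shown in the text)
    label_matrix  : per-subset predicted labels in {-1, +1}, shape (K, n_test)
    Returns fused labels in {-1, +1}.
    """
    weight = np.asarray(betas) * np.asarray(alphas)  # weight_k = beta_k * alpha_k
    lam = weight / weight.sum()                      # lambda_k, sums to 1
    score = lam @ np.asarray(label_matrix, dtype=float)
    return np.where(score >= 0, 1, -1)
```

A sub-classifier with a larger product βk·αk thus pulls the fused label further toward its own prediction.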
Further, in order to verify the feasibility of the above method, the following experiments were designed.
(1) Because the sample distribution of the sample space differs between data sets, the optimal number of subsets must be determined for the data set. Moreover, the number of sub-packages can be neither too large nor too small: if it is too large, each subset contains too few samples and training is insufficient; if it is too small, the aliasing of different classes within a subset is too deep, which is unfavorable for classification. Therefore, after training the model with the proposed sub-package fusion ensemble learning classification method, 26 samples were randomly drawn from the training set for verification and the prediction accuracy was counted. For the data set used in this embodiment, the sub-package number was chosen between 5 and 9; the average prediction accuracy over 20 experiments for the different package numbers is shown in Fig. 6.
It can be seen from Fig. 6 that the best sub-package number for the data set used in this embodiment is 7. To verify the sample separability of each subset, the classification accuracy of the samples of the 7 subsets under leave-one-out cross-validation was counted during one experiment; the classification accuracies of the seven subsets essentially present a concave function, confirming the analysis of per-subset performance in the method section. It can be seen that a subset with high classification accuracy has large sample separability, so its weight in the whole model should be larger; a subset with low classification accuracy has small sample separability, so its weight in the whole model should be smaller.
The transfer of misclassified samples between subsets increases sample utilization. In one experiment, the number of misclassified samples transferred from each subset to the next was denoted Ni, and the number of those transferred samples misclassified again was denoted Mi+1; the statistics are shown in Table 1:
Table 1: Number of misclassified samples transferred to the next subset versus the number misclassified again after transfer

Subset number   1    2    3    4    5    6    7
Ni              2    8   57   64   26   18    1
Mi+1            0    4   13    4    1    1    —
The results in Table 1 show that, of the misclassified samples transferred each time, most can be correctly classified in the next subset, which proves that the transfer of misclassified samples increases the utilization ratio of the samples and realizes full use of the available samples.
Because the transfer of misclassified samples affects the weight parameters of the next subset, the method was evaluated in three configurations: without transfer of misclassified samples; with transfer but without sample enhancement; and with both transfer and enhancement of misclassified samples.
Fig. 7 shows the subset weights and the test-set prediction results in the three cases, so as to verify the influence of misclassified-sample transfer and of sample enhancement on the experimental results.
As can be seen from Fig. 7, comparing the subset weights without transfer of misclassified samples against those with transfer but without enhancement, the weight curve of the subsets after sample transfer is closer to a concave function: the weights of subsets with large separability are raised overall, and the weights of subsets with small separability are reduced. Comparing sample enhancement with no enhancement, the enhancement has little overall influence on the subset weights.
To compare the performance of the present invention, four methods were tested: method 1 is the conventional method, in which the data set is classified directly; method 2 is the bagging algorithm; method 3 uses the SP and TST modules of this method and combines the subset prediction results by voting; method 4 classifies fully according to the method proposed by the present invention. The accuracy comparison curves of the four methods are shown in Fig. 8.
As can be seen from Fig. 8, the method proposed by the present invention significantly improves the classification performance of classifiers such as RF, SVM (linear), and SVM (RBF).
Finally, it should be noted that this embodiment describes merely a preferred embodiment of the present invention. Inspired by the present invention and without departing from its purpose and the scope of the claims, a person of ordinary skill in the art may make many kinds of variations, and all such variations fall within the protection scope of the present invention.

Claims (7)

1. A sub-package fusion ensemble learning data classification method, characterized by comprising the following steps:
S1: obtaining data and forming a training set and a test set;
S2: dividing the training set into K subsets using a subspace partition module, K being an integer greater than or equal to 2;
S3: training one classifier model per subset;
S4: computing the weight factor corresponding to each classifier model;
S5: inputting the test data into each classifier model, multiplying the sample labels output by the classifier models by the corresponding weight factors, and summing the weighted outputs to obtain the final classification result.
2. The sub-package fusion ensemble learning data classification method according to claim 1, characterized in that: the subspace partition module in step S2 uses the class-centroid distance metric ratio as the sample weight; the class-centroid distance metric ratio of each sample in the training set is computed, the samples are arranged in descending order, and the training set is finally divided into K subsets.
3. The sub-package fusion ensemble learning data classification method according to claim 1 or 2, characterized in that: step S3 trains the classifier models using a subspace sample-transfer training method, specifically:
S31: let the true label set of subset Tk be Yk = [y1, y2, …, yj, …, ys]; verify with a leave-one-out cross-validation method to obtain the predicted label set Lk;
S32: count the misclassified samples in subset Tk and the subset's classification error rate error_rate;
S33: compute the weight factor of each of the K trained classifier models as αk = (1/2)·ln((1 − error_rate)/error_rate).
4. The sub-package fusion ensemble learning data classification method according to claim 3, characterized in that: the classification error rate in step S32 is error_rate = Σj weight(j)·I(Yk(j) ≠ Lk(j)), summed over j = 1, …, s, where:
wj denotes the class-centroid distance metric ratio of the j-th sample, Σj wj denotes the sum of the class-centroid distance metric ratios of the s samples of subset Tk, weight(j) = wj / Σj wj represents the initialized weight of the j-th sample, and I(Yk(j) ≠ Lk(j)) indicates that the j-th sample is misclassified.
5. The sub-package fusion ensemble learning data classification method according to claim 3, characterized in that: the samples of subset Tk that are misclassified under leave-one-out cross-validation are enhanced and then transferred into the next subset Tk+1, where they are learned again.
6. The sub-package fusion ensemble learning data classification method according to claim 5, characterized in that: the enhancement method for misclassified samples is w_new = w_old·exp(−αk), where w_old is the original weight of the misclassified sample and w_new is its weight after enhancement.
7. The sub-package fusion ensemble learning data classification method according to claim 3, characterized in that: a multi-space weighted ensemble learning module performs the weighting, specifically:
S41: compute the penalty factors βk of the K subsets separately from the sample weights within each subset;
S42: compute the weight of each subset classifier according to weightk = βk·αk;
S43: compute the weighting of the sample prediction labels output by each subset classifier.
CN201810097334.3A 2018-01-31 2018-01-31 Sub-package fusion ensemble learning data classification method Pending CN108416364A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810097334.3A CN108416364A (en) 2018-01-31 2018-01-31 Sub-package fusion ensemble learning data classification method


Publications (1)

Publication Number Publication Date
CN108416364A true CN108416364A (en) 2018-08-17

Family

ID=63127486

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810097334.3A Pending CN108416364A (en) 2018-01-31 2018-01-31 Integrated study data classification method is merged in subpackage

Country Status (1)

Country Link
CN (1) CN108416364A (en)


Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111382758B (en) * 2018-12-28 2023-12-26 杭州海康威视数字技术股份有限公司 Training image classification model, image classification method, device, equipment and medium
CN111382758A (en) * 2018-12-28 2020-07-07 杭州海康威视数字技术股份有限公司 Training image classification model, image classification method, device, equipment and medium
CN109765341A (en) * 2019-01-25 2019-05-17 重庆水利电力职业技术学院 A kind of structure monitoring system for civil engineering
CN110222762A (en) * 2019-06-04 2019-09-10 恒安嘉新(北京)科技股份公司 Object prediction method, apparatus, equipment and medium
US11507882B2 (en) 2019-09-12 2022-11-22 Beijing Xiaomi Intelligent Technology Co., Ltd. Method and device for optimizing training set for text classification and storage medium
CN111709488A (en) * 2020-06-22 2020-09-25 电子科技大学 Dynamic label deep learning algorithm
CN111783093A (en) * 2020-06-28 2020-10-16 南京航空航天大学 Malicious software classification and detection method based on soft dependence
CN111882003A (en) * 2020-08-06 2020-11-03 北京邮电大学 Data classification method, device and equipment
CN111882003B (en) * 2020-08-06 2024-01-23 北京邮电大学 Data classification method, device and equipment
CN112183582A (en) * 2020-09-07 2021-01-05 中国海洋大学 Multi-feature fusion underwater target identification method
CN113393932B (en) * 2021-07-06 2022-11-25 重庆大学 Parkinson's disease voice sample segment multi-type reconstruction transformation method
CN113393932A (en) * 2021-07-06 2021-09-14 重庆大学 Parkinson's disease voice sample segment multi-type reconstruction transformation method
CN116843998A (en) * 2023-08-29 2023-10-03 四川省分析测试服务中心 Spectrum sample weighting method and system
CN116843998B (en) * 2023-08-29 2023-11-14 四川省分析测试服务中心 Spectrum sample weighting method and system
CN118098623A (en) * 2024-04-26 2024-05-28 菏泽医学专科学校 Medical information data intelligent management method and system based on big data

Similar Documents

Publication Publication Date Title
CN108416364A (en) Subpackage fusion ensemble learning data classification method
CN105589806B (en) A kind of software defect tendency Forecasting Methodology based on SMOTE+Boosting algorithms
CN106383891A (en) Deep hash-based medical image distributed retrieval method
Gustafsson et al. Comparison and validation of community structures in complex networks
CN111832608A (en) Multi-abrasive-particle identification method for ferrographic image based on single-stage detection model yolov3
CN111090764B (en) Image classification method and device based on multitask learning and graph convolution neural network
CN112001110B (en) Structural damage identification monitoring method based on vibration signal space real-time recurrent graph convolutional neural network
CN103605711B (en) Construction method and device, classification method and device of support vector machine
CN110175697A (en) A kind of adverse events Risk Forecast System and method
EP3968337A1 (en) Target object attribute prediction method based on machine learning and related device
CN104966106B (en) A kind of biological age substep Forecasting Methodology based on support vector machines
CN115908255A (en) Improved light-weight YOLOX-nano model for target detection and detection method
Nugraha et al. Particle swarm optimization–Support vector machine (PSO-SVM) algorithm for journal rank classification
Rahman et al. Automatic identification of abnormal blood smear images using color and morphology variation of RBCS and central pallor
CN112690774B (en) Magnetic resonance image-based stroke recurrence prediction method and system
CN113486202A (en) Method for classifying small sample images
US20220319002A1 (en) Tumor cell isolines
Mahendra et al. Optimizing convolutional neural network by using genetic algorithm for COVID-19 detection in chest X-ray image
CN109800854A (en) A kind of Hydrophobicity of Composite Insulator grade determination method based on probabilistic neural network
TWI599896B (en) Multiple decision attribute selection and data discretization classification method
Sridhar et al. Multi-lane capsule network architecture for detection of COVID-19
CN112382382B (en) Cost-sensitive integrated learning classification method and system
Chang et al. An Efficient Hybrid Classifier for Cancer Detection.
CN113361653A (en) Deep learning model depolarization method and device based on data sample enhancement
Qin A cancer cell image classification program: based on CNN model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180817
