CN110110845A - Learning method based on parallel multi-level width neural network

Info

Publication number
CN110110845A
CN110110845A (application CN201910331708.8A; granted as CN110110845B)
Authority
CN
China
Prior art keywords
neural network
level width
test
width neural
sample
Prior art date
Legal status
Granted
Application number
CN201910331708.8A
Other languages
Chinese (zh)
Other versions
CN110110845B (en)
Inventor
席江波
房建武
吴田军
康梦华
Current Assignee
Changan University
Original Assignee
Changan University
Priority date
Filing date
Publication date
Application filed by Changan University
Priority to CN201910331708.8A
Publication of CN110110845A
Application granted
Publication of CN110110845B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks


Abstract

The invention discloses a learning method based on a parallel multi-level width neural network, comprising the following steps: obtaining validation sets and constructing base classifiers; training and validating each level of the parallel M-level width neural network to obtain the trained network and the validation output corresponding to each level; obtaining the decision threshold of each level by statistical calculation; and testing the threshold-determined parallel multi-level width neural network on a test set. The neural network has a multi-level structure, each level learns a different part of the data, and both training and testing can be parallelized. Each level uses a width neural network for feature learning in the width direction; by re-connecting multiple width neural networks as base classifiers in the width direction, a classifier ensemble in two width directions is realized; incremental learning of the network is realized by adding a new level of width neural network; and testing can be parallelized.

Description

A learning method based on a parallel multi-level width neural network
Technical field
The invention belongs to the fields of artificial intelligence and machine learning, and in particular relates to a learning method based on a parallel multi-level width neural network.
Background art
As learning models based on deep networks have achieved immense success in fields such as large-scale image processing and machine vision, the complexity of these models has also grown rapidly: they need large amounts of high-dimensional data for training, which increases the required computing resources and computation time. Moreover, real data are frequently not homogeneous: some samples are very easy to classify, while many others are comparatively difficult. Most classification errors occur on inputs that are hard to classify, for example unevenly distributed samples, abnormally acquired samples, and samples close to class boundaries or linearly inseparable.
Existing deep learning models handle simple samples and complex samples in the same way, which lowers the utilization of computing resources. Meanwhile, existing deep networks such as convolutional neural networks often have many layers, and every sample must pass through all of them, so generalization and testing can be very time-consuming. Early parallel self-organizing networks let each level receive only the nonlinearly transformed samples rejected by the previous level, transforming them into spaces where they are easier to classify before classifying them again. However, the problem of how to allocate and adjust computing resources for high-dimensional data samples of different difficulty, so as to improve the speed and efficiency of learning and classification, has not yet been well solved.
Summary of the invention
In view of the above defects, the present invention provides a learning method based on a parallel multi-level width neural network. The network of the invention has a multi-level structure; each level learns a different part of the data, and both training and testing can be parallelized. Each level uses a width neural network for feature learning in the width direction; by re-connecting multiple width neural networks as base classifiers in the width direction, a classifier ensemble in two width directions is realized; incremental learning of the network is realized by adding a new level of width neural network; and parallelized testing substantially shortens the learning and classification time for complex samples and improves the operating efficiency of the network.
To achieve the above objects, the present invention adopts the following technical scheme.
A learning method based on a parallel multi-level width neural network, where the parallel multi-level width neural network comprises multiple levels of width neural networks, each level comprising a sequentially connected input layer, hidden layer, decision layer and output layer, the decision layer determining whether each test sample is output by the current level. The learning method comprises the following steps:
Step 1: obtain an original training sample set and construct a parallel M-level width neural network Net_1, ..., Net_m, ..., Net_M (m = 1, 2, ..., M), each level of which serves as the base classifier of its level; apply M data transformations to the original training sample set to obtain M corresponding validation sets x_v_1, ..., x_v_m, ..., x_v_M.
Here N_tr is the total number of samples in the original training sample set.
Step 2: use the original training sample set and the M validation sets x_v_1, ..., x_v_m, ..., x_v_M to train and validate each level of the parallel M-level width neural network, obtaining the trained parallel M-level width neural network and the validation output y_v_m (m = 1, 2, ..., M) of each level; obtain the label y_v_ind_m corresponding to each validation output y_v_m by the minimum-error method, and thereby the correctly classified sample set y_vc_m and the misclassified sample set y_vw_m of each level's validation set.
Step 3: perform statistical calculations separately on each trained level's correctly classified sample set y_vc_m and misclassified sample set y_vw_m to obtain the decision threshold T_m of each level; with each level's threshold T_m as that level's decision criterion, obtain the threshold-determined parallel M-level width neural network.
Step 4: obtain a test set and feed it in parallel, as input data, to every level of the threshold-determined parallel M-level width neural network for testing, obtaining each threshold-determined level's output; obtain each level's error vector and judge each threshold-determined level's output, thereby obtaining the label y_test_ind_m corresponding to each level's test output.
Features and further refinements of the technical scheme of the invention are as follows:
(1) In step 1, the data transformation compresses or deforms the samples of the original sample set by elastic transformation (Elastic), or rotates, flips, zooms in or zooms out the samples by affine transformation (Affine).
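Both transform families can be prototyped in a few lines. Below is a minimal NumPy/SciPy sketch for 28 × 28 grayscale arrays; the parameter values (alpha, sigma, angle, scale) are illustrative choices, not values given in the patent.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates, affine_transform

def elastic(img, alpha=34.0, sigma=4.0, rng=None):
    """Elastic transform: displace every pixel along a smoothed random field."""
    rng = rng or np.random.default_rng()
    dx = gaussian_filter(rng.uniform(-1, 1, img.shape), sigma) * alpha
    dy = gaussian_filter(rng.uniform(-1, 1, img.shape), sigma) * alpha
    yy, xx = np.meshgrid(np.arange(img.shape[0]), np.arange(img.shape[1]),
                         indexing="ij")
    return map_coordinates(img, [yy + dy, xx + dx], order=1, mode="reflect")

def affine(img, angle_deg=10.0, scale=1.1):
    """Affine transform: rotation plus zoom about the image centre
    (flips and shifts fit the same matrix formalism)."""
    c = np.array(img.shape) / 2.0
    t = np.deg2rad(angle_deg)
    m = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]]) / scale
    return affine_transform(img, m, offset=c - m @ c, order=1, mode="nearest")
```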
(2) In step 2, training and validating each level of the parallel M-level width neural network with the original training sample set and the M validation sets x_v_1, ..., x_v_m, ..., x_v_M comprises the following sub-steps:
Sub-step 2.1: take the original training sample set as the input sample of the 1st-level width neural network Net_1 and train Net_1, obtaining the trained 1st-level width neural network.
Sub-step 2.2: validate the trained 1st-level width neural network with the first validation set x_v_1, obtaining the misclassified sample set y_vw_1 of the 1st level's validation set.
Sub-step 2.3: take the misclassified sample set y_vw_1 of the 1st level as the input sample set A_v_1 of the 2nd-level width neural network; randomly draw a further training sample set A_v_2 from the original training sample set so that the total input sample set {A_v_1 + A_v_2} contains as many samples as the original training sample set, and use {A_v_1 + A_v_2} as the input sample of the 2nd-level width neural network.
Sub-step 2.4: train the 2nd-level width neural network with {A_v_1 + A_v_2}, obtaining the trained 2nd-level width neural network; validate it with the second validation set x_v_2, obtaining the misclassified sample set y_vw_2 of the 2nd level's validation set.
Proceeding in the same way (sketched in the code below), the 3rd to M-th levels are trained, yielding the trained parallel M-level width neural network and the validation output y_v_m (m = 1, 2, ..., M) of each level.
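The sub-steps above amount to a simple loop. The following sketch assumes a generic base-classifier object with fit/predict methods; make_net, val_x and val_t are illustrative names, not from the patent.

```python
import numpy as np

def train_cascade(x_tr, t_tr, val_x, val_t, make_net, M, rng=None):
    """Level-by-level training of step 2: level 1 trains on the original
    set; each later level trains on the previous level's misclassified
    validation samples A_v_1 plus a random refill A_v_2 drawn from the
    original training set, so that it sees about N_tr samples."""
    rng = rng or np.random.default_rng()
    nets, wx, wt = [], x_tr[:0], t_tr[:0]
    for m in range(M):
        if m == 0:
            x_in, t_in = x_tr, t_tr
        else:
            k = max(len(x_tr) - len(wx), 0)           # size of refill A_v_2
            idx = rng.choice(len(x_tr), size=k, replace=False)
            x_in = np.concatenate([wx, x_tr[idx]])    # {A_v_1 + A_v_2}
            t_in = np.concatenate([wt, t_tr[idx]])
        net = make_net()
        net.fit(x_in, t_in)                           # train level m+1
        miss = net.predict(val_x[m]) != val_t[m]      # validate level m+1
        wx, wt = val_x[m][miss], val_t[m][miss]       # y_vw_m for next level
        nets.append(net)
    return nets
```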
(3) In step 2, the minimum-error method is as follows:
First, let C be the total number of classes of the original training sample set and construct reference matrices R_j (1 ≤ j ≤ C).
Here all elements of the j-th row of R_j are 1 and all other elements are 0; each R_j has dimension C × N_tr.
Second, from the validation output y_v_m of each trained level, obtain the error vector between y_v_m and each reference matrix R_j of the corresponding level:
J_v_mj = ||softmax(y_v_m) - R_j||_2, 1 ≤ j ≤ C;
where J_v_mj has dimension 1 × N_tr and y_v_m has dimension C × N_tr.
Finally, minimize the error vector J_v_mj between y_v_m and the reference matrices R_j of the corresponding level to obtain the class label of each trained level:
y_v_ind_m = argmin_j J_v_mj, where y_v_ind_m has dimension 1 × N_tr.
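In code, the minimum-error labelling reduces to a column-wise distance to each reference matrix followed by an argmin; a minimal sketch follows (class indices here run from 0, whereas the patent numbers them 1 ≤ j ≤ C).

```python
import numpy as np

def softmax(y):
    """Column-wise softmax of a C x N output matrix."""
    e = np.exp(y - y.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def min_error_labels(y, C):
    """Compare softmax(y) against every reference matrix R_j (j-th row all
    ones, rest zeros) and label each sample with the j of smallest error."""
    s = softmax(y)                                  # C x N
    N = y.shape[1]
    J = np.empty((C, N))
    for j in range(C):
        R = np.zeros((C, N))
        R[j, :] = 1.0                               # reference matrix R_j
        J[j] = np.linalg.norm(s - R, axis=0)        # error vector J_v_mj, 1 x N
    return J.argmin(axis=0), J.min(axis=0)          # labels y_v_ind_m, min errors
```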
(4) In step 3, the statistical calculation comprises the following sub-steps:
Sub-step 3.1: let the correctly classified and misclassified sample sets of the m-th level of the trained parallel M-level width neural network be y_vc_m and y_vw_m, with sample counts N_vc_m and N_vw_m, where N_vc_m + N_vw_m = N_tr. The errors of the two sets are:
e_vc_m = ||softmax(y_vc_m) - t_vc_m||_2
e_vw_m = ||softmax(y_vw_m) - t_vw_m||_2
where t_vc_m is the true label of the correctly classified samples y_vc_m and t_vw_m the true label of the misclassified samples y_vw_m of the m-th level.
Sub-step 3.2: from the correctly classified set y_vc_m and the misclassified set y_vw_m, compute the mean and standard deviation μ_c and σ_c of the correctly classified errors and μ_w and σ_w of the misclassified errors. The Gaussian distributions corresponding to y_vc_m and y_vw_m are then N(μ_c, σ_c²) and N(μ_w, σ_w²), with Gaussian probability density functions:
f_c(e) = (1/√(2πσ_c²)) exp(-(e - μ_c)²/(2σ_c²)) and f_w(e) = (1/√(2πσ_w²)) exp(-(e - μ_w)²/(2σ_w²)).
Sub-step 3.3: from the error e_vw_m of the misclassified set y_vw_m and the standard deviation σ_w, obtain the decision threshold of the m-th level: T_m = min(e_vw_m) - ασ_w.
Here α is a constant providing a margin so that all misclassified samples y_vw_m are rejected by the current level.
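A sketch of the threshold computation, taking as input the per-sample errors e_vc_m and e_vw_m defined in sub-step 3.1 (the default alpha is illustrative):

```python
import numpy as np

def decision_threshold(e_correct, e_wrong, alpha=1.0):
    """Fit Gaussian statistics to the two validation error sets and place
    the threshold a margin of alpha standard deviations below the smallest
    misclassified-sample error, so every wrong sample is rejected."""
    mu_c, sigma_c = e_correct.mean(), e_correct.std()   # N(mu_c, sigma_c^2)
    mu_w, sigma_w = e_wrong.mean(), e_wrong.std()       # N(mu_w, sigma_w^2)
    T_m = e_wrong.min() - alpha * sigma_w
    return T_m, (mu_c, sigma_c), (mu_w, sigma_w)
```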
(5) In step 4, obtaining the test set means: obtain the original test sample set x_test; by M rounds of data augmentation, obtain the M corresponding groups of test sample sets x_test_1, ..., x_test_m, ..., x_test_M as the test set.
Further, the data augmentation applies the data transformation N_testD times to each sample of the original test sample set x_test, yielding N_testD corresponding test sample sets that form the test set x_test_m of the m-th level of the threshold-determined parallel M-level width neural network.
Here the total number of test samples in the original test sample set x_test is N_test_samples.
(6) In step 4, obtaining the error vector of each level comprises the following sub-steps:
Sub-step 4.1: feed the M groups of test sample sets x_test_1, x_test_2, ..., x_test_M in parallel to the threshold-determined parallel M-level width neural network, obtaining for each threshold-determined level the N_testD corresponding outputs y_test_m_d (d = 1, 2, ..., N_testD).
Sub-step 4.2: average the N_testD outputs y_test_m_d (d = 1, 2, ..., N_testD) of each threshold-determined level, ȳ_test_m = (1/N_testD) Σ_d y_test_m_d, obtaining the test output of each threshold-determined level.
Sub-step 4.3: let C be the total number of classes of the test set and construct reference matrices R_j (1 ≤ j ≤ C); obtain the error vector between the test output ȳ_test_m and each reference matrix R_j of the corresponding level:
J_test_mj = ||softmax(ȳ_test_m) - R_j||_2, 1 ≤ j ≤ C;
where all elements of the j-th row of R_j are 1 and all other elements are 0; each R_j has dimension C × N_test_samples, J_test_mj has dimension 1 × N_test_samples, and ȳ_test_m has dimension C × N_test_samples.
(7) The output of each threshold-determined level is judged as follows:
If the minimal error of the current level is at most the current level's decision threshold, i.e.
min(J_test_mj) ≤ T_m,
the current level is judged to be the correct classification output level for that output. If the minimal error of the current level exceeds the current level's decision threshold, i.e.
min(J_test_mj) > T_m,
the current level is judged unable to classify that output correctly; the output is passed to the next level for testing, and so on, until its correct classification output level is found.
(8) In step 4, the label corresponding to each threshold-determined level's test output is y_test_ind_m = argmin_j J_test_mj, where y_test_ind_m has dimension 1 × N_test_samples.
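Since every level sees every test sample, the sequential "pass to the next level" rule can be evaluated in one parallel sweep once the per-level error matrices are known. A minimal sketch (J_levels and T are illustrative names):

```python
import numpy as np

def route_test_outputs(J_levels, T):
    """J_levels[m] is the C x N_test error matrix J_test_mj of level m's
    averaged test output and T[m] its decision threshold; each sample is
    emitted by the first level whose minimal error is at or below the
    threshold, and the last level outputs whatever remains."""
    M = len(J_levels)
    N = J_levels[0].shape[1]
    labels = np.full(N, -1)
    pending = np.ones(N, dtype=bool)
    for m in range(M):
        J = J_levels[m]
        accept = pending & ((J.min(axis=0) <= T[m]) | (m == M - 1))
        labels[accept] = J.argmin(axis=0)[accept]   # y_test_ind_m
        pending &= ~accept
    return labels
```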
Compared with the prior art, the invention has the following beneficial effects:
(1) The neural network of the invention has multiple levels of base classifiers, each learning a different part of the data set, so the structure of the network can be determined adaptively according to the problem and the complexity of the data set, optimizing the use of computing resources.
(2) The neural network of the invention supports incremental learning: when new training data become available, the current network judges whether it can classify them correctly; if not, the new samples are learned by adding a new width radial basis function network as a new level of the network, without retraining the whole network.
(3) The neural network of the invention can be tested in parallel: test data are sent to all levels simultaneously, and the decision threshold of each level, obtained during training, determines by which level each test sample is finally output; this parallel testing greatly reduces waiting time in actual use of the network.
(4) The neural network of the invention can serve as a general learning framework with strong flexibility: each level may use a BP neural network, a convolutional neural network or another type of classifier as required.
Brief description of the drawings
The present invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 shows the principle of the parallel multi-level neural network of the invention and its training and testing processes: Fig. 1(a) is the principle diagram of the parallel multi-level width neural network; Fig. 1(b) shows its training and validation process; Fig. 1(c) shows its test process.
Fig. 2 is the structure diagram of the parallel multi-level width neural network of the invention.
Fig. 3(a) shows the error distribution of the validation set at one level of the parallel multi-level width neural network of the invention; Fig. 3(b) shows the Gaussian probability density functions fitted from the statistics in Fig. 3(a).
Fig. 4 compares the test results of the parallel 26-level width neural network on the MNIST data set in the embodiment of the invention with the classification results of existing learning models.
Specific embodiment
Embodiments of the present invention are described in detail below, but those skilled in the art will understand that the following embodiments merely illustrate the invention and should not be construed as limiting its scope.
The MNIST handwritten digit data set is used: 8-bit grayscale images of the handwritten digits 0-9, of size 28 × 28, in 10 classes, with 60000 original training samples and 10000 test images; it is one of the important general-purpose image data sets for training and testing new learning models. For this data set, referring to Fig. 1 and Fig. 2, this embodiment uses width radial basis function networks as the base classifiers, i.e. every level of the parallel multi-level width neural network is a width radial basis function network, and the number of levels of the parallel width neural network is chosen as 26.
(1) Obtain the validation sets and construct the base classifiers.
First, apply 26 elastic transformations to the image samples of the N_tr = 60000-sample original training set, obtaining M = 26 validation sets x_v_1, x_v_2, ..., x_v_26. To guarantee enough misclassified validation samples in this embodiment, each validation set contains N_val = 10 data sets obtained by transforming the original training set; that is, each validation set has N_val = 10 times as many samples as the original training set.
Second, design the parallel multi-level width neural network with width radial basis function networks as base classifiers: M = 26 width radial basis function networks are linked together to form the parallel multi-level width neural network Net_1, Net_2, ..., Net_M, each base classifier serving as one level that focuses on a different part of the data set.
Finally, construct each width radial basis function network. The detailed process is as follows: build a radial basis function network containing N_0k = 1000 Gaussian basis functions φ_i(x) = exp(-||x - c_i||²/(2σ²)), whose centres c_i are a random subset of the original training set and whose standard deviation σ is constant. A sliding window extracts multiple groups of local feature images from each image sample of the original training set, giving multiple groups of local feature matrices; each group of local feature matrices serves as the input data of the Gaussian basis functions, and the resulting multiple radial basis function networks constitute the width radial basis function network.
(2) Train and validate each level of the parallel M-level width neural network, obtaining the trained parallel M-level width neural network and the validation output y_v_m (m = 1, 2, ..., M) of each level.
The 1st-level width radial basis function network is trained on the original training sample set; after training, the misclassified training samples are sent to the 2nd-level width radial basis function network as part of the second training set, with which the 2nd level is trained. The validation sets obtained in step (1) validate the currently trained level and at the same time provide further error samples as part of the next level's training set. As shown in Fig. 1(a) and (b), this specifically comprises the following sub-steps:
Sub-step 2.1: take the original training sample set as the input sample of the 1st-level width neural network Net_1 and train Net_1, obtaining the trained 1st-level width neural network.
Sub-step 2.2: validate the trained 1st-level width neural network with the first validation set x_v_1, obtaining the misclassified sample set y_vw_1 of the 1st level.
Sub-step 2.3: take the misclassified sample set y_vw_1 of the 1st level as the input sample set A_v_1 of the 2nd-level width neural network; randomly draw a further training sample set A_v_2 from the original training sample set so that the total input sample set {A_v_1 + A_v_2} contains as many samples as the original training sample set, and use {A_v_1 + A_v_2} as the input sample of the 2nd-level width neural network.
Sub-step 2.4: train the 2nd-level width neural network with {A_v_1 + A_v_2}, obtaining the trained 2nd-level width neural network; validate it with the second validation set x_v_2, obtaining the misclassified sample set y_vw_2 of the 2nd level.
Repeat sub-steps 2.3 and 2.4 to train the 3rd to the M-th levels, obtaining the trained parallel M-level width neural network and the validation output y_v_m (m = 1, 2, ..., M) of each level.
The specific training and validation of the width radial basis function network proceed as follows:
The image samples of the original training set serve as input data, with image size M_1 × M_2 = 28 × 28. The sliding window has size r = 13 × 13, its initial position is the top-left corner of each image sample, and it slides left to right and top to bottom with a stride of 1 pixel. At each window position, the 3-D image block of the 60000 image samples inside the window is stretched into a matrix x_k ∈ R^{r×N}: the pixels of each local feature image are arranged into one column vector by appending the 2nd to the last columns of the corresponding original matrix, in order, after the 1st; the N column vectors, arranged in order, form the local feature matrix x_k (1 ≤ k ≤ K) of one group of training image samples, each column representing one sample. Each x_k is then input to the radial basis function network containing N_0k = 1000 Gaussian basis functions, and the output of each basis function is a column vector containing N = 60000 elements.
Each slide of the window corresponds to one radial basis function network, so after the window finishes sliding there are K = (M_1 - m + 1)(M_2 - m + 1) = (28 - 13 + 1) × (28 - 13 + 1) = 256 radial basis function networks, where m = 13 is the window side length.
For each radial basis function network, sorting and downsampling are applied to its nonlinearly transformed Gaussian basis output Φ_k: summing each column of Φ_k gives a row vector whose elements are the pixel sums of a specific local position of each image to be processed; arranging these sums in descending order gives the descending vector a_k, and an index s_k marks the original position of each local position within a_k, so that the sorted output is Φ'_k = sort(Φ_k, s_k).
The sorted output is downsampled with interval N_kS = 20, giving the sampled output Φ_kS = subsample(Φ'_k, N_kS); the total number of outputs of the width radial basis function network is then K × N_0k/N_kS = 256 × 1000/20 = 12800, and the overall output of the Gaussian basis functions is Φ = [Φ_1S, Φ_2S, ..., Φ_KS].
Let the desired output be D = [D_1, D_2, ..., D_C] and connect the Gaussian basis outputs of the width radial basis function network to a linear layer with weights W = [W_1, W_2, ..., W_C];
where C = 10 is the total number of classes of the original samples.
The classification output of the width radial basis function network is Y = [Y_1, Y_2, ..., Y_C] = ΦW. Specifically, the least-mean-square estimate Ŵ of the linear-layer weights is obtained by minimizing the squared error, computed from the pseudoinverse of the Gaussian basis output Φ:
Ŵ = Φ⁺D,
where Φ⁺ is the pseudoinverse of the Gaussian basis output Φ of the width radial basis function network.
Finally, the classification output of the width radial basis function network is computed as Y = ΦŴ.
This yields the trained width radial basis function network; the trained network of each level is validated with its corresponding validation set, giving the validation output y_v_m (m = 1, 2, ..., M) of each trained level.
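The construction of one width radial basis function network (sliding window, Gaussian layer, sort-and-downsample, pseudoinverse output weights) can be sketched as follows. This is one plausible reading of the sort-and-downsample step, ordering the rows of each window's basis output by descending total activation; the centres, sigma and the function names are assumptions for illustration.

```python
import numpy as np

def gaussian_layer(Xk, centers, sigma):
    """Phi_k[i, n] = exp(-||x_n - c_i||^2 / (2 sigma^2)); Xk is r x N,
    centers is N0k x r (a random subset of training windows)."""
    d2 = ((centers ** 2).sum(1)[:, None] + (Xk ** 2).sum(0)[None, :]
          - 2.0 * centers @ Xk)
    return np.exp(-d2 / (2.0 * sigma ** 2))                   # N0k x N

def width_rbf_features(imgs, centers, sigma, win=13, n_sub=20):
    """Slide a win x win window (stride 1) over each image, map each window
    through the Gaussian layer, sort each window's basis outputs by
    descending activation sum, and keep every n_sub-th row."""
    N, H, W = imgs.shape
    blocks = []
    for i in range(H - win + 1):
        for j in range(W - win + 1):                          # K windows
            Xk = imgs[:, i:i + win, j:j + win].reshape(N, -1).T   # r x N
            Phi_k = gaussian_layer(Xk, centers, sigma)            # N0k x N
            order = np.argsort(-Phi_k.sum(axis=1))            # descending sort
            blocks.append(Phi_k[order][::n_sub])              # downsampling
    return np.vstack(blocks).T                # N x (K * N0k / n_sub) features

def fit_output_weights(Phi, D):
    """Least-mean-square linear-layer weights via the pseudoinverse:
    W_hat = Phi^+ D, so the class scores are Y = Phi @ W_hat."""
    return np.linalg.pinv(Phi) @ D
```

With 28 × 28 inputs, win = 13 and n_sub = 20 this gives K = 256 windows and 256 × 1000/20 = 12800 features per sample, matching the dimensions in the text.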
From the obtained validation outputs y_v_m (m = 1, 2, ..., M), the class label y_v_ind_m corresponding to each validation output y_v_m is further obtained as follows:
First, let C be the total number of classes of the original training sample set and construct reference matrices R_j (1 ≤ j ≤ C).
Here all elements of the j-th row of R_j are 1 and all other elements are 0; each R_j has dimension C × N_tr.
Second, from the validation output y_v_m of each trained level, obtain the error vector between y_v_m and each reference matrix R_j of the corresponding level:
J_v_mj = ||softmax(y_v_m) - R_j||_2, 1 ≤ j ≤ C;
where J_v_mj has dimension 1 × N_tr and y_v_m has dimension C × N_tr.
Finally, minimize the error vector J_v_mj between y_v_m and the reference matrices R_j of the corresponding level to obtain the class label of each trained level:
y_v_ind_m = argmin_j J_v_mj, where y_v_ind_m has dimension 1 × N_tr.
Comparing the class label y_v_ind_m of each trained level with the true labels of its validation samples yields the correctly classified sample set y_vc_m and the misclassified sample set y_vw_m of each level of the width neural network.
(3) Compute the decision threshold T_m of each level of the width neural network.
The relatively difficult part of this network is determining each level's decision threshold, which is used at test time to decide by which level of the network each sample should be output. After training and validation, statistics are computed separately over the correctly classified and the misclassified sample sets. Suppose that at the m-th level the correctly classified and misclassified sample sets are y_vc_m and y_vw_m, with sample counts N_vc_m and N_vw_m, where N_vc_m + N_vw_m = N_tr.
In the above validation process, to guarantee that there are enough misclassified samples in the end, each validation set may contain N_val copies of the original training set transformed by the data transformation, i.e. each validation set may comprise N_val groups of validation samples, so that each validation set has N_val times as many samples as the original training set.
The errors of the two sample sets are computed by the following formulas:
e_vc_m = ||softmax(y_vc_m) - t_vc_m||_2
e_vw_m = ||softmax(y_vw_m) - t_vw_m||_2
where t_vc_m and t_vw_m are the true labels of the correctly classified samples y_vc_m and the misclassified samples y_vw_m of the m-th level. Let the means and standard deviations of the error statistics of the correctly classified and misclassified sets be μ_c, σ_c and μ_w, σ_w; the two corresponding Gaussian distributions are N(μ_c, σ_c²) and N(μ_w, σ_w²),
with Gaussian probability density functions f_c(e) = (1/√(2πσ_c²)) exp(-(e - μ_c)²/(2σ_c²)) and f_w(e) = (1/√(2πσ_w²)) exp(-(e - μ_w)²/(2σ_w²)).
Fig. 3(a) and (b) show the error distribution of the validation set and its probability density functions at one level of the parallel multi-level width neural network. The decision threshold of the m-th level is then:
T_m = min(e_vw_m) - ασ_w
where α is a constant providing a margin so that all misclassified samples y_vw_m are rejected by the current level.
(4) Test the threshold-determined parallel multi-level width neural network on the test set.
As shown in Fig. 1(c), the specific test process is as follows:
First, obtain the test set. The detailed process is: obtain the original test sample set x_test; by M rounds of data augmentation, obtain the M corresponding groups of test sample sets x_test_1, ..., x_test_m, ..., x_test_M as the test set, where the total number of test samples in x_test is N_test_samples.
The above data augmentation applies the data transformation N_testD times to each sample of the original test sample set x_test, yielding N_testD corresponding test sample sets that form the test set x_test_m of the m-th level of the threshold-determined parallel M-level width neural network.
This way of constructing the test set improves the stability of testing in the subsequent test process.
Second, feed the M groups of test sample sets x_test_1, ..., x_test_m, ..., x_test_M in parallel to the threshold-determined parallel M-level width neural network, i.e. each group of test sets is input to its corresponding threshold-determined level for testing, giving each threshold-determined level the N_testD corresponding test-set outputs; averaging the N_testD outputs yields the test output ȳ_test_m of each threshold-determined level.
Third, let C be the total number of classes of the test set and construct reference matrices R_j (1 ≤ j ≤ C); obtain the error vector between the test output ȳ_test_m and each reference matrix R_j of the corresponding level:
J_test_mj = ||softmax(ȳ_test_m) - R_j||_2, 1 ≤ j ≤ C;
where all elements of the j-th row of R_j are 1 and all other elements are 0; each R_j has dimension C × N_test_samples, J_test_mj has dimension 1 × N_test_samples, and ȳ_test_m has dimension C × N_test_samples.
Finally, judge the output of each threshold-determined level. Specifically: if the minimal error of the current level is at most the current level's decision threshold, i.e. min(J_test_mj) ≤ T_m, the current level is judged to be the correct classification output level for that output.
If the minimal error of the current level exceeds the current level's decision threshold, i.e. min(J_test_mj) > T_m, the current level is judged unable to classify that output correctly; the output is passed to the next level for testing, and so on, until its correct classification output level is found. The label corresponding to each threshold-determined level's test output is then y_test_ind_m = argmin_j J_test_mj, where y_test_ind_m has dimension 1 × N_test_samples.
If a test sample is not output by the first 25 levels, it is output directly at the last, 26th level.
This finally yields the output L_test of the whole network on the test set; counting the correctly and incorrectly classified samples then gives the classification accuracy of the parallel multi-level width neural network of the invention.
Comparative example
Using the same original training set, validation sets and test set as in the embodiment above, random forest (RF), multilayer perceptron (MP), conventional radial basis function network (RBF), support vector machine (SVM), broad learning system (BLS), conditional deep learning model (CDL), deep belief network (DBN), the convolutional neural network LeNet-5, deep Boltzmann machine (DBM) and deep random forest (gcForest) are each used as learning models for classification; Fig. 4 shows the classification accuracy finally obtained by the various learning methods.
As can be seen from Fig. 4, compared with these current mainstream learning models, the classification accuracy of the parallel multi-level width neural network (PMWNN) of the invention is highly competitive: the final classification accuracy of the method is 99.10% (WRBF denotes the width radial basis function network). Compared with the deep random forest learning model in particular, the network of the method has multiple levels of base networks, each learning a different part of the data set, so the network structure can be determined adaptively according to the problem and the complexity of the data set, optimizing computing resources. Meanwhile, the network can be tested in parallel: test data are sent to all levels simultaneously, and the per-level decision thresholds obtained in training determine by which level each test sample is finally output, greatly reducing waiting time in actual use of the network.
In addition, the parallel multi-level width neural network of the invention supports incremental learning: when new data arrive, a new width radial basis function network can be added to learn the new characteristics without retraining the entire parallel multi-level width neural network, which means the proposed network can learn new knowledge without forgetting old knowledge. The new training data are input to the current M-level network; if some samples are misclassified, they and the original training set processed by data augmentation together form a new training data set, on which a new width radial basis function network is trained, validated with a new validation set, and given a decision threshold, establishing level M+1. The new parallel multi-level width neural network is then composed of M+1 width radial basis function networks. Meanwhile, the parallel multi-level width neural network designed by the invention can be tested in parallel: all test samples are given to all levels of width radial basis function networks, and the decision thresholds determine which width radial basis function network is assigned to each test sample; no level needs to wait for another level's output, so the test process is parallelized and accelerated.
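A sketch of that incremental step, under the same assumed fit/predict interface as before; per_sample_errors and predict_scores are assumed helpers (the patent specifies no API), and the judgement of the current network is simplified here to its last level:

```python
import numpy as np

def per_sample_errors(net, x, t):
    """Assumed helper: ||softmax(y) - t_onehot||_2 for the validation
    samples the net misclassifies, as in step 3 (predict_scores is an
    assumed C x N score API)."""
    y = net.predict_scores(x)
    s = np.exp(y - y.max(axis=0, keepdims=True))
    s /= s.sum(axis=0, keepdims=True)
    onehot = np.eye(y.shape[0])[:, t]                 # C x N true labels
    e = np.linalg.norm(s - onehot, axis=0)
    return e[s.argmax(axis=0) != t]                   # errors of missed samples

def add_level(nets, T, x_new, t_new, x_aug, t_aug, x_val, t_val,
              make_net, alpha=1.0):
    """If the current network misclassifies some new data, train one new
    level (M+1) on those samples plus the augmented original set, give it
    a threshold, and append it; the first M levels are left untouched."""
    miss = nets[-1].predict(x_new) != t_new           # simplified judgement
    if not miss.any():
        return nets, T                                # nothing new to learn
    net = make_net()
    net.fit(np.concatenate([x_new[miss], x_aug]),
            np.concatenate([t_new[miss], t_aug]))
    e_w = per_sample_errors(net, x_val, t_val)        # validate the new level
    T.append(e_w.min() - alpha * e_w.std())           # threshold T_{M+1}
    nets.append(net)
    return nets, T
```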
Each level of the parallel multi-level width neural network of the invention may be a width radial basis function network, a BP neural network, a convolutional neural network or another classifier, and the base classifier types of the different levels of the multi-level width neural network may differ.
Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the invention and their technical equivalents, the invention is intended to include them as well.

Claims (10)

1. A learning method based on a parallel multi-level width neural network, the parallel multi-level width neural network comprising multiple levels of width neural networks, wherein each level comprises a sequentially connected input layer, hidden layer and output layer, characterized in that the learning method comprises the following steps:
Step 1: obtain an original training sample set and construct a parallel M-level width neural network Net_1, ..., Net_m, ..., Net_M (m = 1, 2, ..., M), each level of which serves as the base classifier of its level; apply M data transformations to the original training sample set to obtain M corresponding validation sets x_v_1, ..., x_v_m, ..., x_v_M;
where N_tr is the total number of samples in the original training sample set;
Step 2: use the original training sample set and the M validation sets x_v_1, ..., x_v_m, ..., x_v_M to train and validate each level of the parallel M-level width neural network, obtaining the trained parallel M-level width neural network and the validation output y_v_m (m = 1, 2, ..., M) of each level; obtain the label y_v_ind_m corresponding to each validation output y_v_m by the minimum-error method, and thereby the correctly classified sample set y_vc_m and the misclassified sample set y_vw_m of each level's validation set;
Step 3: perform statistical calculations separately on each trained level's correctly classified sample set y_vc_m and misclassified sample set y_vw_m to obtain the decision threshold T_m of each level; with each level's threshold T_m as that level's decision criterion, obtain the threshold-determined parallel M-level width neural network;
Step 4: obtain a test set and feed it in parallel, as input data, to every level of the threshold-determined parallel M-level width neural network for testing, obtaining each threshold-determined level's output; obtain each level's error vector and judge each threshold-determined level's output, thereby obtaining the label y_test_ind_m corresponding to each level's test output.
2. The learning method based on a parallel multi-level width neural network according to claim 1, characterized in that in step 1, the data transformation compresses or deforms the samples of the original sample set by elastic transformation, or rotates, flips, zooms in or zooms out the samples by affine transformation.
3. The learning method based on a parallel multi-level width neural network according to claim 1, characterized in that in step 2, training and validating each level of the parallel M-level width neural network with the original training sample set and the M validation sets x_v_1, ..., x_v_m, ..., x_v_M comprises the following sub-steps:
sub-step 2.1: take the original training sample set as the input sample of the 1st-level width neural network Net_1 and train Net_1, obtaining the trained 1st-level width neural network;
sub-step 2.2: validate the trained 1st-level width neural network with the first validation set x_v_1, obtaining the misclassified sample set y_vw_1 of the 1st level's validation set;
sub-step 2.3: take the misclassified sample set y_vw_1 of the 1st level as the input sample set A_v_1 of the 2nd-level width neural network; randomly draw a further training sample set A_v_2 from the original training sample set so that the total input sample set {A_v_1 + A_v_2} contains as many samples as the original training sample set, and use {A_v_1 + A_v_2} as the input sample of the 2nd-level width neural network;
sub-step 2.4: train the 2nd-level width neural network with {A_v_1 + A_v_2}, obtaining the trained 2nd-level width neural network; validate it with the second validation set x_v_2, obtaining the misclassified sample set y_vw_2 of the 2nd level's validation set;
and so on: the 3rd to M-th levels are trained in the same way, yielding the trained parallel M-level width neural network and the validation output y_v_m (m = 1, 2, ..., M) of each level.
4. The learning method based on a parallel multi-level width neural network according to claim 1, characterized in that in step 2, the minimum-error method is:
first, let C be the total number of classes of the original training sample set and construct reference matrices R_j (1 ≤ j ≤ C),
where all elements of the j-th row of R_j are 1 and all other elements are 0, and each R_j has dimension C × N_tr;
second, from the validation output y_v_m of each trained level, obtain the error vector between y_v_m and each reference matrix R_j of the corresponding level:
J_v_mj = ||softmax(y_v_m) - R_j||_2, 1 ≤ j ≤ C;
where ||·||_2 denotes the 2-norm of a matrix and softmax(·) is the normalized exponential function; J_v_mj has dimension 1 × N_tr and y_v_m has dimension C × N_tr;
finally, minimize the error vector J_v_mj between y_v_m and the reference matrices R_j of the corresponding level to obtain the class label of each trained level:
y_v_ind_m = argmin_j J_v_mj, where y_v_ind_m has dimension 1 × N_tr.
5. The learning method based on a parallel multi-level width neural network according to claim 1, characterized in that in step 3, the statistical calculation comprises the following sub-steps:
sub-step 3.1: let the correctly classified and misclassified sample sets of the m-th level of the trained parallel M-level width neural network be y_vc_m and y_vw_m, with sample counts N_vc_m and N_vw_m, where N_vc_m + N_vw_m = N_tr; the errors of the two sets are then:
e_vc_m = ||softmax(y_vc_m) - t_vc_m||_2
e_vw_m = ||softmax(y_vw_m) - t_vw_m||_2
where t_vc_m is the true label of the correctly classified samples y_vc_m and t_vw_m the true label of the misclassified samples y_vw_m of the m-th level;
sub-step 3.2: from the correctly classified set y_vc_m and the misclassified set y_vw_m, compute the mean and standard deviation μ_c and σ_c of the correctly classified errors and μ_w and σ_w of the misclassified errors; the Gaussian distributions corresponding to y_vc_m and y_vw_m are then N(μ_c, σ_c²) and N(μ_w, σ_w²),
with Gaussian probability density functions f_c(e) = (1/√(2πσ_c²)) exp(-(e - μ_c)²/(2σ_c²)) and f_w(e) = (1/√(2πσ_w²)) exp(-(e - μ_w)²/(2σ_w²));
sub-step 3.3: from the error e_vw_m of the misclassified set y_vw_m and the standard deviation σ_w, obtain the decision threshold of the m-th level: T_m = min(e_vw_m) - ασ_w;
where α is a constant providing a margin so that all misclassified samples y_vw_m are rejected by the current level.
6. The learning method based on a parallel multi-level width neural network according to claim 2, characterized in that in step 4, obtaining the test set means: obtain the original test sample set x_test; by M rounds of data augmentation, obtain the M corresponding groups of test sample sets x_test_1, ..., x_test_m, ..., x_test_M as the test set.
7. The learning method based on a parallel multi-level width neural network according to claim 6, characterized in that the data augmentation applies the data transformation N_testD times to each sample of the original test sample set x_test, yielding N_testD corresponding test sample sets that form the test set x_test_m of the m-th level of the threshold-determined parallel M-level width neural network;
where the total number of test samples in the original test sample set x_test is N_test_samples.
8. The learning method based on a parallel multi-level width neural network according to claim 1, characterized in that in step 4, obtaining each level's error vector comprises the following sub-steps:
sub-step 4.1: feed the M groups of test sample sets x_test_1, x_test_2, ..., x_test_M in parallel to the threshold-determined parallel M-level width neural network, obtaining for each threshold-determined level the N_testD corresponding outputs y_test_m_d (d = 1, 2, ..., N_testD);
sub-step 4.2: average the N_testD outputs y_test_m_d (d = 1, 2, ..., N_testD) of each threshold-determined level, ȳ_test_m = (1/N_testD) Σ_d y_test_m_d, obtaining the test output of each threshold-determined level;
sub-step 4.3: let C be the total number of classes of the test set and construct reference matrices R_j (1 ≤ j ≤ C); obtain the error vector between the test output ȳ_test_m and each reference matrix R_j of the corresponding level:
J_test_mj = ||softmax(ȳ_test_m) - R_j||_2, 1 ≤ j ≤ C;
where all elements of the j-th row of R_j are 1 and all other elements are 0; each R_j has dimension C × N_test_samples, J_test_mj has dimension 1 × N_test_samples, and ȳ_test_m has dimension C × N_test_samples.
9. The learning method based on a parallel multi-level width neural network according to claim 8, characterized in that judging each threshold-determined level's output means:
if the minimal error of the current level is at most the current level's decision threshold, the current level is judged to be the correct classification output level for that output:
min(J_test_mj) ≤ T_m;
if the minimal error of the current level exceeds the current level's decision threshold, the current level is judged unable to classify that output correctly; the output is passed to the next level for testing, and so on, until its correct classification output level is found:
min(J_test_mj) > T_m.
10. The learning method based on a parallel multi-level width neural network according to claim 9, characterized in that in step 4, the label corresponding to each threshold-determined level's test output is y_test_ind_m = argmin_j J_test_mj;
where y_test_ind_m has dimension 1 × N_test_samples.
CN201910331708.8A 2019-04-24 2019-04-24 Learning method based on parallel multi-level width neural network Active CN110110845B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910331708.8A CN110110845B (en) 2019-04-24 2019-04-24 Learning method based on parallel multi-level width neural network


Publications (2)

Publication Number Publication Date
CN110110845A true CN110110845A (en) 2019-08-09
CN110110845B (en) 2020-09-22

Family

ID=67486407

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910331708.8A Active CN110110845B (en) 2019-04-24 2019-04-24 Learning method based on parallel multi-level width neural network

Country Status (1)

Country Link
CN (1) CN110110845B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170132514A1 (en) * 2012-12-24 2017-05-11 Google Inc. System and method for parallelizing convolutional neural networks
US20160019454A1 (en) * 2014-07-18 2016-01-21 James LaRue J Patrick's Ladder A Machine Learning Enhancement Tool
CN108351985A (en) * 2015-06-30 2018-07-31 亚利桑那州立大学董事会 Method and apparatus for large-scale machines study
CN107784312A (en) * 2016-08-24 2018-03-09 腾讯征信有限公司 Machine learning model training method and device

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111008647A (en) * 2019-11-06 2020-04-14 长安大学 Sample extraction and image classification method based on void convolution and residual linkage
CN111008647B (en) * 2019-11-06 2022-02-08 长安大学 Sample extraction and image classification method based on void convolution and residual linkage
CN111340184A (en) * 2020-02-12 2020-06-26 北京理工大学 Deformable reflector surface shape control method and device based on radial basis function
CN111340184B (en) * 2020-02-12 2023-06-02 北京理工大学 Deformable reflector surface shape control method and device based on radial basis function
CN113449569A (en) * 2020-03-27 2021-09-28 威海北洋电气集团股份有限公司 Mechanical signal health state classification method and system based on distributed deep learning
CN113449569B (en) * 2020-03-27 2023-04-25 威海北洋电气集团股份有限公司 Mechanical signal health state classification method and system based on distributed deep learning
CN112966761A (en) * 2021-03-16 2021-06-15 长安大学 Extensible adaptive width neural network learning method
CN112966761B (en) * 2021-03-16 2024-03-19 长安大学 Extensible self-adaptive width neural network learning method

Also Published As

Publication number Publication date
CN110110845B (en) 2020-09-22

Similar Documents

Publication Publication Date Title
CN110110845A (en) Learning method based on parallel multi-level width neural network
CN108564129B (en) Trajectory data classification method based on generation countermeasure network
CN107871100A (en) The training method and device of faceform, face authentication method and device
CN106529499A (en) Fourier descriptor and gait energy image fusion feature-based gait identification method
CN106295694A (en) Face recognition method for iterative re-constrained group sparse representation classification
CN112039687A (en) Small sample feature-oriented fault diagnosis method based on improved generation countermeasure network
CN111311702B (en) Image generation and identification module and method based on BlockGAN
CN110533116A (en) Based on the adaptive set of Euclidean distance at unbalanced data classification method
CN112949738B (en) Multi-class unbalanced hyperspectral image classification method based on EECNN algorithm
CN108108760A (en) A kind of fast human face recognition
CN106991355A (en) The face identification method of the analytical type dictionary learning model kept based on topology
CN106650667A (en) Pedestrian detection method and system based on support vector machine
CN104820825A (en) Adaboost algorithm-based face recognition optimization method
CN109344845A (en) A kind of feature matching method based on Triplet deep neural network structure
CN112819063B (en) Image identification method based on improved Focal loss function
CN110458178A (en) The multi-modal RGB-D conspicuousness object detection method spliced more
CN113139536A (en) Text verification code identification method and equipment based on cross-domain meta learning and storage medium
CN107886066A (en) A kind of pedestrian detection method based on improvement HOG SSLBP
CN109816030A (en) A kind of image classification method and device based on limited Boltzmann machine
CN109993042A (en) A kind of face identification method and its device
CN106778714A (en) LDA face identification methods based on nonlinear characteristic and model combination
CN109509188A (en) A kind of transmission line of electricity typical defect recognition methods based on HOG feature
CN104598898B (en) A kind of Aerial Images system for rapidly identifying and its method for quickly identifying based on multitask topology learning
CN110059705A (en) A kind of OCR recognition result decision method and equipment based on modeling
CN109948589A (en) Facial expression recognizing method based on quantum deepness belief network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant