CN110110845A - Learning method based on parallel multi-level width neural network

Info

Publication number
CN110110845A
CN110110845A (application CN201910331708.8A; granted as CN110110845B)
Authority
CN
China
Prior art keywords
neural network
level width
test
width neural
sample
Prior art date
Legal status
Granted
Application number
CN201910331708.8A
Other languages
Chinese (zh)
Other versions
CN110110845B (en)
Inventor
席江波
房建武
吴田军
康梦华
Current Assignee
Changan University
Original Assignee
Changan University
Priority date
Filing date
Publication date
Application filed by Changan University
Priority to CN201910331708.8A
Publication of CN110110845A
Application granted
Publication of CN110110845B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks


Abstract

The invention discloses a learning method based on a parallel multi-level width neural network, comprising the following steps: obtaining validation sets and constructing base classifiers; training and validating each level of the parallel M-level width neural network to obtain the trained network and the validation output corresponding to each level; obtaining the decision threshold of each level by statistical calculation; and testing the threshold-determined parallel multi-level width neural network on a test set. The neural network has a multi-level structure, each level learns a different part of the data, and both training and testing can be parallelized. Each level uses a width neural network for feature learning in the width direction; by re-connecting multiple width neural networks as base classifiers in the width direction, a classifier ensemble in two width directions is realized; incremental learning of the network is realized by adding a new level of width neural network; and testing can be parallelized.

Description

A learning method based on a parallel multi-level width neural network
Technical field
The invention belongs to the fields of artificial intelligence and machine learning, and in particular relates to a learning method based on a parallel multi-level width neural network.
Background art
As learning models based on deep networks have achieved immense success in fields such as large-scale image processing and machine vision, the complexity of these models has also grown rapidly: they need large amounts of high-dimensional data for training, which increases the required computing resources and computation time. Moreover, real data are frequently not homogeneous: some samples are very easy to classify, while many others are comparatively difficult. Most classification errors occur on inputs that are hard to classify, for example unevenly distributed samples, abnormally acquired samples, and samples close to class boundaries or linearly inseparable.
Existing deep learning models handle simple samples and complex samples in the same way, which lowers the utilization of computing resources. Meanwhile, existing deep networks such as convolutional neural networks often have many layers, and every sample must pass through all of them, so generalization and testing can be very time-consuming. Early parallel self-organizing networks let each level receive only the nonlinearly transformed samples rejected by the previous level, transforming them into spaces where they are easier to classify before classifying them again. However, the problem of how to allocate and adjust computing resources for high-dimensional data samples of different difficulty, so as to improve the speed and efficiency of learning and classification, has not yet been well solved.
Summary of the invention
In view of the above defects, the present invention provides a learning method based on a parallel multi-level width neural network. The network of the invention has a multi-level structure; each level learns a different part of the data, and both training and testing can be parallelized. Each level uses a width neural network for feature learning in the width direction; by re-connecting multiple width neural networks as base classifiers in the width direction, a classifier ensemble in two width directions is realized; incremental learning of the network is realized by adding a new level of width neural network; and parallelized testing substantially shortens the learning and classification time for complex samples and improves the operating efficiency of the network.
To achieve the above objects, the present invention adopts the following technical scheme.
A learning method based on a parallel multi-level width neural network, where the parallel multi-level width neural network comprises multiple levels of width neural networks, each level comprising a sequentially connected input layer, hidden layer, decision layer and output layer, the decision layer determining whether each test sample is output by the current level. The learning method comprises the following steps:
Step 1: obtain an original training sample set and construct a parallel M-level width neural network Net_1, ..., Net_m, ..., Net_M (m = 1, 2, ..., M), each level of which serves as the base classifier of its level; apply M data transformations to the original training sample set to obtain M corresponding validation sets x_v_1, ..., x_v_m, ..., x_v_M.
Here N_tr is the total number of samples in the original training sample set.
Step 2: use the original training sample set and the M validation sets x_v_1, ..., x_v_m, ..., x_v_M to train and validate each level of the parallel M-level width neural network, obtaining the trained parallel M-level width neural network and the validation output y_v_m (m = 1, 2, ..., M) of each level; obtain the label y_v_ind_m corresponding to each validation output y_v_m by the minimum-error method, and thereby the correctly classified sample set y_vc_m and the misclassified sample set y_vw_m of each level's validation set.
Step 3: perform statistical calculations separately on each trained level's correctly classified sample set y_vc_m and misclassified sample set y_vw_m to obtain the decision threshold T_m of each level; with each level's threshold T_m as that level's decision criterion, obtain the threshold-determined parallel M-level width neural network.
Step 4: obtain a test set and feed it in parallel, as input data, to every level of the threshold-determined parallel M-level width neural network for testing, obtaining each threshold-determined level's output; obtain each level's error vector and judge each threshold-determined level's output, thereby obtaining the label y_test_ind_m corresponding to each level's test output.
Features and further refinements of the technical scheme of the invention are as follows:
(1) In step 1, the data transformation compresses or deforms the samples of the original sample set by elastic transformation (Elastic), or rotates, flips, zooms in or zooms out the samples by affine transformation (Affine).
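Both transform families can be prototyped in a few lines. Below is a minimal NumPy/SciPy sketch for 28 × 28 grayscale arrays; the parameter values (alpha, sigma, angle, scale) are illustrative choices, not values given in the patent.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates, affine_transform

def elastic(img, alpha=34.0, sigma=4.0, rng=None):
    """Elastic transform: displace every pixel along a smoothed random field."""
    rng = rng or np.random.default_rng()
    dx = gaussian_filter(rng.uniform(-1, 1, img.shape), sigma) * alpha
    dy = gaussian_filter(rng.uniform(-1, 1, img.shape), sigma) * alpha
    yy, xx = np.meshgrid(np.arange(img.shape[0]), np.arange(img.shape[1]),
                         indexing="ij")
    return map_coordinates(img, [yy + dy, xx + dx], order=1, mode="reflect")

def affine(img, angle_deg=10.0, scale=1.1):
    """Affine transform: rotation plus zoom about the image centre
    (flips and shifts fit the same matrix formalism)."""
    c = np.array(img.shape) / 2.0
    t = np.deg2rad(angle_deg)
    m = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]]) / scale
    return affine_transform(img, m, offset=c - m @ c, order=1, mode="nearest")
```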
(2) In step 2, training and validating each level of the parallel M-level width neural network with the original training sample set and the M validation sets x_v_1, ..., x_v_m, ..., x_v_M comprises the following sub-steps:
Sub-step 2.1: take the original training sample set as the input sample of the 1st-level width neural network Net_1 and train Net_1, obtaining the trained 1st-level width neural network.
Sub-step 2.2: validate the trained 1st-level width neural network with the first validation set x_v_1, obtaining the misclassified sample set y_vw_1 of the 1st level's validation set.
Sub-step 2.3: take the misclassified sample set y_vw_1 of the 1st level as the input sample set A_v_1 of the 2nd-level width neural network; randomly draw a further training sample set A_v_2 from the original training sample set so that the total input sample set {A_v_1 + A_v_2} contains as many samples as the original training sample set, and use {A_v_1 + A_v_2} as the input sample of the 2nd-level width neural network.
Sub-step 2.4: train the 2nd-level width neural network with {A_v_1 + A_v_2}, obtaining the trained 2nd-level width neural network; validate it with the second validation set x_v_2, obtaining the misclassified sample set y_vw_2 of the 2nd level's validation set.
Proceeding in the same way (sketched in the code below), the 3rd to M-th levels are trained, yielding the trained parallel M-level width neural network and the validation output y_v_m (m = 1, 2, ..., M) of each level.
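The sub-steps above amount to a simple loop. The following sketch assumes a generic base-classifier object with fit/predict methods; make_net, val_x and val_t are illustrative names, not from the patent.

```python
import numpy as np

def train_cascade(x_tr, t_tr, val_x, val_t, make_net, M, rng=None):
    """Level-by-level training of step 2: level 1 trains on the original
    set; each later level trains on the previous level's misclassified
    validation samples A_v_1 plus a random refill A_v_2 drawn from the
    original training set, so that it sees about N_tr samples."""
    rng = rng or np.random.default_rng()
    nets, wx, wt = [], x_tr[:0], t_tr[:0]
    for m in range(M):
        if m == 0:
            x_in, t_in = x_tr, t_tr
        else:
            k = max(len(x_tr) - len(wx), 0)           # size of refill A_v_2
            idx = rng.choice(len(x_tr), size=k, replace=False)
            x_in = np.concatenate([wx, x_tr[idx]])    # {A_v_1 + A_v_2}
            t_in = np.concatenate([wt, t_tr[idx]])
        net = make_net()
        net.fit(x_in, t_in)                           # train level m+1
        miss = net.predict(val_x[m]) != val_t[m]      # validate level m+1
        wx, wt = val_x[m][miss], val_t[m][miss]       # y_vw_m for next level
        nets.append(net)
    return nets
```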
(3) In step 2, the minimum-error method is as follows:
First, let C be the total number of classes of the original training sample set and construct reference matrices R_j (1 ≤ j ≤ C).
Here all elements of the j-th row of R_j are 1 and all other elements are 0; each R_j has dimension C × N_tr.
Second, from the validation output y_v_m of each trained level, obtain the error vector between y_v_m and each reference matrix R_j of the corresponding level:
J_v_mj = ||softmax(y_v_m) - R_j||_2, 1 ≤ j ≤ C;
where J_v_mj has dimension 1 × N_tr and y_v_m has dimension C × N_tr.
Finally, minimize the error vector J_v_mj between y_v_m and the reference matrices R_j of the corresponding level to obtain the class label of each trained level:
y_v_ind_m = argmin_j J_v_mj, where y_v_ind_m has dimension 1 × N_tr.
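In code, the minimum-error labelling reduces to a column-wise distance to each reference matrix followed by an argmin; a minimal sketch follows (class indices here run from 0, whereas the patent numbers them 1 ≤ j ≤ C).

```python
import numpy as np

def softmax(y):
    """Column-wise softmax of a C x N output matrix."""
    e = np.exp(y - y.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def min_error_labels(y, C):
    """Compare softmax(y) against every reference matrix R_j (j-th row all
    ones, rest zeros) and label each sample with the j of smallest error."""
    s = softmax(y)                                  # C x N
    N = y.shape[1]
    J = np.empty((C, N))
    for j in range(C):
        R = np.zeros((C, N))
        R[j, :] = 1.0                               # reference matrix R_j
        J[j] = np.linalg.norm(s - R, axis=0)        # error vector J_v_mj, 1 x N
    return J.argmin(axis=0), J.min(axis=0)          # labels y_v_ind_m, min errors
```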
(4) In step 3, the statistical calculation comprises the following sub-steps:
Sub-step 3.1: let the correctly classified and misclassified sample sets of the m-th level of the trained parallel M-level width neural network be y_vc_m and y_vw_m, with sample counts N_vc_m and N_vw_m, where N_vc_m + N_vw_m = N_tr. The errors of the two sets are:
e_vc_m = ||softmax(y_vc_m) - t_vc_m||_2
e_vw_m = ||softmax(y_vw_m) - t_vw_m||_2
where t_vc_m is the true label of the correctly classified samples y_vc_m and t_vw_m the true label of the misclassified samples y_vw_m of the m-th level.
Sub-step 3.2: from the correctly classified set y_vc_m and the misclassified set y_vw_m, compute the mean and standard deviation μ_c and σ_c of the correctly classified errors and μ_w and σ_w of the misclassified errors. The Gaussian distributions corresponding to y_vc_m and y_vw_m are then N(μ_c, σ_c²) and N(μ_w, σ_w²), with Gaussian probability density functions:
f_c(e) = (1/√(2πσ_c²)) exp(-(e - μ_c)²/(2σ_c²)) and f_w(e) = (1/√(2πσ_w²)) exp(-(e - μ_w)²/(2σ_w²)).
Sub-step 3.3: from the error e_vw_m of the misclassified set y_vw_m and the standard deviation σ_w, obtain the decision threshold of the m-th level: T_m = min(e_vw_m) - ασ_w.
Here α is a constant providing a margin so that all misclassified samples y_vw_m are rejected by the current level.
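A sketch of the threshold computation, taking as input the per-sample errors e_vc_m and e_vw_m defined in sub-step 3.1 (the default alpha is illustrative):

```python
import numpy as np

def decision_threshold(e_correct, e_wrong, alpha=1.0):
    """Fit Gaussian statistics to the two validation error sets and place
    the threshold a margin of alpha standard deviations below the smallest
    misclassified-sample error, so every wrong sample is rejected."""
    mu_c, sigma_c = e_correct.mean(), e_correct.std()   # N(mu_c, sigma_c^2)
    mu_w, sigma_w = e_wrong.mean(), e_wrong.std()       # N(mu_w, sigma_w^2)
    T_m = e_wrong.min() - alpha * sigma_w
    return T_m, (mu_c, sigma_c), (mu_w, sigma_w)
```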
(5) In step 4, obtaining the test set means: obtain the original test sample set x_test; by M rounds of data augmentation, obtain the M corresponding groups of test sample sets x_test_1, ..., x_test_m, ..., x_test_M as the test set.
Further, the data augmentation applies the data transformation N_testD times to each sample of the original test sample set x_test, yielding N_testD corresponding test sample sets that form the test set x_test_m of the m-th level of the threshold-determined parallel M-level width neural network.
Here the total number of test samples in the original test sample set x_test is N_test_samples.
(6) In step 4, obtaining the error vector of each level comprises the following sub-steps:
Sub-step 4.1: feed the M groups of test sample sets x_test_1, x_test_2, ..., x_test_M in parallel to the threshold-determined parallel M-level width neural network, obtaining for each threshold-determined level the N_testD corresponding outputs y_test_m_d (d = 1, 2, ..., N_testD).
Sub-step 4.2: average the N_testD outputs y_test_m_d (d = 1, 2, ..., N_testD) of each threshold-determined level, ȳ_test_m = (1/N_testD) Σ_d y_test_m_d, obtaining the test output of each threshold-determined level.
Sub-step 4.3: let C be the total number of classes of the test set and construct reference matrices R_j (1 ≤ j ≤ C); obtain the error vector between the test output ȳ_test_m and each reference matrix R_j of the corresponding level:
J_test_mj = ||softmax(ȳ_test_m) - R_j||_2, 1 ≤ j ≤ C;
where all elements of the j-th row of R_j are 1 and all other elements are 0; each R_j has dimension C × N_test_samples, J_test_mj has dimension 1 × N_test_samples, and ȳ_test_m has dimension C × N_test_samples.
(7) The output of each threshold-determined level is judged as follows:
If the minimal error of the current level is at most the current level's decision threshold, i.e.
min(J_test_mj) ≤ T_m,
the current level is judged to be the correct classification output level for that output. If the minimal error of the current level exceeds the current level's decision threshold, i.e.
min(J_test_mj) > T_m,
the current level is judged unable to classify that output correctly; the output is passed to the next level for testing, and so on, until its correct classification output level is found.
(8) In step 4, the label corresponding to each threshold-determined level's test output is y_test_ind_m = argmin_j J_test_mj, where y_test_ind_m has dimension 1 × N_test_samples.
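Since every level sees every test sample, the sequential "pass to the next level" rule can be evaluated in one parallel sweep once the per-level error matrices are known. A minimal sketch (J_levels and T are illustrative names):

```python
import numpy as np

def route_test_outputs(J_levels, T):
    """J_levels[m] is the C x N_test error matrix J_test_mj of level m's
    averaged test output and T[m] its decision threshold; each sample is
    emitted by the first level whose minimal error is at or below the
    threshold, and the last level outputs whatever remains."""
    M = len(J_levels)
    N = J_levels[0].shape[1]
    labels = np.full(N, -1)
    pending = np.ones(N, dtype=bool)
    for m in range(M):
        J = J_levels[m]
        accept = pending & ((J.min(axis=0) <= T[m]) | (m == M - 1))
        labels[accept] = J.argmin(axis=0)[accept]   # y_test_ind_m
        pending &= ~accept
    return labels
```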
Compared with the prior art, the invention has the following beneficial effects:
(1) The neural network of the invention has multiple levels of base classifiers, each learning a different part of the data set, so the structure of the network can be determined adaptively according to the problem and the complexity of the data set, optimizing the use of computing resources.
(2) The neural network of the invention supports incremental learning: when new training data become available, the current network judges whether it can classify them correctly; if not, the new samples are learned by adding a new width radial basis function network as a new level of the network, without retraining the whole network.
(3) The neural network of the invention can be tested in parallel: test data are sent to all levels simultaneously, and the decision threshold of each level, obtained during training, determines by which level each test sample is finally output; this parallel testing greatly reduces waiting time in actual use of the network.
(4) The neural network of the invention can serve as a general learning framework with strong flexibility: each level may use a BP neural network, a convolutional neural network or another type of classifier as required.
Brief description of the drawings
The present invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 shows the principle of the parallel multi-level neural network of the invention and its training and testing processes: Fig. 1(a) is the principle diagram of the parallel multi-level width neural network; Fig. 1(b) shows its training and validation process; Fig. 1(c) shows its test process.
Fig. 2 is the structure diagram of the parallel multi-level width neural network of the invention.
Fig. 3(a) shows the error distribution of the validation set at one level of the parallel multi-level width neural network of the invention; Fig. 3(b) shows the Gaussian probability density functions fitted from the statistics in Fig. 3(a).
Fig. 4 compares the test results of the parallel 26-level width neural network on the MNIST data set in the embodiment of the invention with the classification results of existing learning models.
Specific embodiment
Embodiments of the present invention are described in detail below, but those skilled in the art will understand that the following embodiments merely illustrate the invention and should not be construed as limiting its scope.
The MNIST handwritten digit data set is used: 8-bit grayscale images of the handwritten digits 0-9, of size 28 × 28, in 10 classes, with 60000 original training samples and 10000 test images; it is one of the important general-purpose image data sets for training and testing new learning models. For this data set, referring to Fig. 1 and Fig. 2, this embodiment uses width radial basis function networks as the base classifiers, i.e. every level of the parallel multi-level width neural network is a width radial basis function network, and the number of levels of the parallel width neural network is chosen as 26.
(1) Obtain the validation sets and construct the base classifiers.
First, apply 26 elastic transformations to the image samples of the N_tr = 60000-sample original training set, obtaining M = 26 validation sets x_v_1, x_v_2, ..., x_v_26. To guarantee enough misclassified validation samples in this embodiment, each validation set contains N_val = 10 data sets obtained by transforming the original training set; that is, each validation set has N_val = 10 times as many samples as the original training set.
Second, design the parallel multi-level width neural network with width radial basis function networks as base classifiers: M = 26 width radial basis function networks are linked together to form the parallel multi-level width neural network Net_1, Net_2, ..., Net_M, each base classifier serving as one level that focuses on a different part of the data set.
Finally, construct each width radial basis function network. The detailed process is as follows: build a radial basis function network containing N_0k = 1000 Gaussian basis functions φ_i(x) = exp(-||x - c_i||²/(2σ²)), whose centres c_i are a random subset of the original training set and whose standard deviation σ is constant. A sliding window extracts multiple groups of local feature images from each image sample of the original training set, giving multiple groups of local feature matrices; each group of local feature matrices serves as the input data of the Gaussian basis functions, and the resulting multiple radial basis function networks constitute the width radial basis function network.
(2) Train and validate each level of the parallel M-level width neural network, obtaining the trained parallel M-level width neural network and the validation output y_v_m (m = 1, 2, ..., M) of each level.
The 1st-level width radial basis function network is trained on the original training sample set; after training, the misclassified training samples are sent to the 2nd-level width radial basis function network as part of the second training set, with which the 2nd level is trained. The validation sets obtained in step (1) validate the currently trained level and at the same time provide further error samples as part of the next level's training set. As shown in Fig. 1(a) and (b), this specifically comprises the following sub-steps:
Sub-step 2.1: take the original training sample set as the input sample of the 1st-level width neural network Net_1 and train Net_1, obtaining the trained 1st-level width neural network.
Sub-step 2.2: validate the trained 1st-level width neural network with the first validation set x_v_1, obtaining the misclassified sample set y_vw_1 of the 1st level.
Sub-step 2.3: take the misclassified sample set y_vw_1 of the 1st level as the input sample set A_v_1 of the 2nd-level width neural network; randomly draw a further training sample set A_v_2 from the original training sample set so that the total input sample set {A_v_1 + A_v_2} contains as many samples as the original training sample set, and use {A_v_1 + A_v_2} as the input sample of the 2nd-level width neural network.
Sub-step 2.4: train the 2nd-level width neural network with {A_v_1 + A_v_2}, obtaining the trained 2nd-level width neural network; validate it with the second validation set x_v_2, obtaining the misclassified sample set y_vw_2 of the 2nd level.
Repeat sub-steps 2.3 and 2.4 to train the 3rd to the M-th levels, obtaining the trained parallel M-level width neural network and the validation output y_v_m (m = 1, 2, ..., M) of each level.
The specific training and validation of the width radial basis function network proceed as follows:
The image samples of the original training set serve as input data, with image size M_1 × M_2 = 28 × 28. The sliding window has size r = 13 × 13, its initial position is the top-left corner of each image sample, and it slides left to right and top to bottom with a stride of 1 pixel. At each window position, the 3-D image block of the 60000 image samples inside the window is stretched into a matrix x_k ∈ R^{r×N}: the pixels of each local feature image are arranged into one column vector by appending the 2nd to the last columns of the corresponding original matrix, in order, after the 1st; the N column vectors, arranged in order, form the local feature matrix x_k (1 ≤ k ≤ K) of one group of training image samples, each column representing one sample. Each x_k is then input to the radial basis function network containing N_0k = 1000 Gaussian basis functions, and the output of each basis function is a column vector containing N = 60000 elements.
Each slide of the window corresponds to one radial basis function network, so after the window finishes sliding there are K = (M_1 - m + 1)(M_2 - m + 1) = (28 - 13 + 1) × (28 - 13 + 1) = 256 radial basis function networks, where m = 13 is the window side length.
For each radial basis function network, sorting and downsampling are applied to its nonlinearly transformed Gaussian basis output Φ_k: summing each column of Φ_k gives a row vector whose elements are the pixel sums of a specific local position of each image to be processed; arranging these sums in descending order gives the descending vector a_k, and an index s_k marks the original position of each local position within a_k, so that the sorted output is Φ'_k = sort(Φ_k, s_k).
The sorted output is downsampled with interval N_kS = 20, giving the sampled output Φ_kS = subsample(Φ'_k, N_kS); the total number of outputs of the width radial basis function network is then K × N_0k/N_kS = 256 × 1000/20 = 12800, and the overall output of the Gaussian basis functions is Φ = [Φ_1S, Φ_2S, ..., Φ_KS].
Let the desired output be D = [D_1, D_2, ..., D_C] and connect the Gaussian basis outputs of the width radial basis function network to a linear layer with weights W = [W_1, W_2, ..., W_C];
where C = 10 is the total number of classes of the original samples.
The classification output of the width radial basis function network is Y = [Y_1, Y_2, ..., Y_C] = ΦW. Specifically, the least-mean-square estimate Ŵ of the linear-layer weights is obtained by minimizing the squared error, computed from the pseudoinverse of the Gaussian basis output Φ:
Ŵ = Φ⁺D,
where Φ⁺ is the pseudoinverse of the Gaussian basis output Φ of the width radial basis function network.
Finally, the classification output of the width radial basis function network is computed as Y = ΦŴ.
This yields the trained width radial basis function network; the trained network of each level is validated with its corresponding validation set, giving the validation output y_v_m (m = 1, 2, ..., M) of each trained level.
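The construction of one width radial basis function network (sliding window, Gaussian layer, sort-and-downsample, pseudoinverse output weights) can be sketched as follows. This is one plausible reading of the sort-and-downsample step, ordering the rows of each window's basis output by descending total activation; the centres, sigma and the function names are assumptions for illustration.

```python
import numpy as np

def gaussian_layer(Xk, centers, sigma):
    """Phi_k[i, n] = exp(-||x_n - c_i||^2 / (2 sigma^2)); Xk is r x N,
    centers is N0k x r (a random subset of training windows)."""
    d2 = ((centers ** 2).sum(1)[:, None] + (Xk ** 2).sum(0)[None, :]
          - 2.0 * centers @ Xk)
    return np.exp(-d2 / (2.0 * sigma ** 2))                   # N0k x N

def width_rbf_features(imgs, centers, sigma, win=13, n_sub=20):
    """Slide a win x win window (stride 1) over each image, map each window
    through the Gaussian layer, sort each window's basis outputs by
    descending activation sum, and keep every n_sub-th row."""
    N, H, W = imgs.shape
    blocks = []
    for i in range(H - win + 1):
        for j in range(W - win + 1):                          # K windows
            Xk = imgs[:, i:i + win, j:j + win].reshape(N, -1).T   # r x N
            Phi_k = gaussian_layer(Xk, centers, sigma)            # N0k x N
            order = np.argsort(-Phi_k.sum(axis=1))            # descending sort
            blocks.append(Phi_k[order][::n_sub])              # downsampling
    return np.vstack(blocks).T                # N x (K * N0k / n_sub) features

def fit_output_weights(Phi, D):
    """Least-mean-square linear-layer weights via the pseudoinverse:
    W_hat = Phi^+ D, so the class scores are Y = Phi @ W_hat."""
    return np.linalg.pinv(Phi) @ D
```

With 28 × 28 inputs, win = 13 and n_sub = 20 this gives K = 256 windows and 256 × 1000/20 = 12800 features per sample, matching the dimensions in the text.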
From the obtained validation outputs y_v_m (m = 1, 2, ..., M), the class label y_v_ind_m corresponding to each validation output y_v_m is further obtained as follows:
First, let C be the total number of classes of the original training sample set and construct reference matrices R_j (1 ≤ j ≤ C).
Here all elements of the j-th row of R_j are 1 and all other elements are 0; each R_j has dimension C × N_tr.
Second, from the validation output y_v_m of each trained level, obtain the error vector between y_v_m and each reference matrix R_j of the corresponding level:
J_v_mj = ||softmax(y_v_m) - R_j||_2, 1 ≤ j ≤ C;
where J_v_mj has dimension 1 × N_tr and y_v_m has dimension C × N_tr.
Finally, minimize the error vector J_v_mj between y_v_m and the reference matrices R_j of the corresponding level to obtain the class label of each trained level:
y_v_ind_m = argmin_j J_v_mj, where y_v_ind_m has dimension 1 × N_tr.
Comparing the class label y_v_ind_m of each trained level with the true labels of its validation samples yields the correctly classified sample set y_vc_m and the misclassified sample set y_vw_m of each level of the width neural network.
(3) Compute the decision threshold T_m of each level of the width neural network.
The relatively difficult part of this network is determining each level's decision threshold, which is used at test time to decide by which level of the network each sample should be output. After training and validation, statistics are computed separately over the correctly classified and the misclassified sample sets. Suppose that at the m-th level the correctly classified and misclassified sample sets are y_vc_m and y_vw_m, with sample counts N_vc_m and N_vw_m, where N_vc_m + N_vw_m = N_tr.
In the above validation process, to guarantee that there are enough misclassified samples in the end, each validation set may contain N_val copies of the original training set transformed by the data transformation, i.e. each validation set may comprise N_val groups of validation samples, so that each validation set has N_val times as many samples as the original training set.
The errors of the two sample sets are computed by the following formulas:
e_vc_m = ||softmax(y_vc_m) - t_vc_m||_2
e_vw_m = ||softmax(y_vw_m) - t_vw_m||_2
where t_vc_m and t_vw_m are the true labels of the correctly classified samples y_vc_m and the misclassified samples y_vw_m of the m-th level. Let the means and standard deviations of the error statistics of the correctly classified and misclassified sets be μ_c, σ_c and μ_w, σ_w; the two corresponding Gaussian distributions are N(μ_c, σ_c²) and N(μ_w, σ_w²),
with Gaussian probability density functions f_c(e) = (1/√(2πσ_c²)) exp(-(e - μ_c)²/(2σ_c²)) and f_w(e) = (1/√(2πσ_w²)) exp(-(e - μ_w)²/(2σ_w²)).
Fig. 3(a) and (b) show the error distribution of the validation set and its probability density functions at one level of the parallel multi-level width neural network. The decision threshold of the m-th level is then:
T_m = min(e_vw_m) - ασ_w
where α is a constant providing a margin so that all misclassified samples y_vw_m are rejected by the current level.
(4) Test the threshold-determined parallel multi-level width neural network on the test set.
As shown in Fig. 1(c), the specific test process is as follows:
First, obtain the test set. The detailed process is: obtain the original test sample set x_test; by M rounds of data augmentation, obtain the M corresponding groups of test sample sets x_test_1, ..., x_test_m, ..., x_test_M as the test set, where the total number of test samples in x_test is N_test_samples.
The above data augmentation applies the data transformation N_testD times to each sample of the original test sample set x_test, yielding N_testD corresponding test sample sets that form the test set x_test_m of the m-th level of the threshold-determined parallel M-level width neural network.
This way of constructing the test set improves the stability of testing in the subsequent test process.
Second, feed the M groups of test sample sets x_test_1, ..., x_test_m, ..., x_test_M in parallel to the threshold-determined parallel M-level width neural network, i.e. each group of test sets is input to its corresponding threshold-determined level for testing, giving each threshold-determined level the N_testD corresponding test-set outputs; averaging the N_testD outputs yields the test output ȳ_test_m of each threshold-determined level.
Third, let C be the total number of classes of the test set and construct reference matrices R_j (1 ≤ j ≤ C); obtain the error vector between the test output ȳ_test_m and each reference matrix R_j of the corresponding level:
J_test_mj = ||softmax(ȳ_test_m) - R_j||_2, 1 ≤ j ≤ C;
where all elements of the j-th row of R_j are 1 and all other elements are 0; each R_j has dimension C × N_test_samples, J_test_mj has dimension 1 × N_test_samples, and ȳ_test_m has dimension C × N_test_samples.
Finally, judge the output of each threshold-determined level. Specifically: if the minimal error of the current level is at most the current level's decision threshold, i.e. min(J_test_mj) ≤ T_m, the current level is judged to be the correct classification output level for that output.
If the minimal error of the current level exceeds the current level's decision threshold, i.e. min(J_test_mj) > T_m, the current level is judged unable to classify that output correctly; the output is passed to the next level for testing, and so on, until its correct classification output level is found. The label corresponding to each threshold-determined level's test output is then y_test_ind_m = argmin_j J_test_mj, where y_test_ind_m has dimension 1 × N_test_samples.
If a test sample is not output by the first 25 levels, it is output directly at the last, 26th level.
This finally yields the output L_test of the whole network on the test set; counting the correctly and incorrectly classified samples then gives the classification accuracy of the parallel multi-level width neural network of the invention.
Comparative example
Using the same original training set, validation sets and test set as in the embodiment above, random forest (RF), multilayer perceptron (MP), conventional radial basis function network (RBF), support vector machine (SVM), broad learning system (BLS), conditional deep learning model (CDL), deep belief network (DBN), the convolutional neural network LeNet-5, deep Boltzmann machine (DBM) and deep random forest (gcForest) are each used as learning models for classification; Fig. 4 shows the classification accuracy finally obtained by the various learning methods.
As can be seen from Fig. 4, compared with these current mainstream learning models, the classification accuracy of the parallel multi-level width neural network (PMWNN) of the invention is highly competitive: the final classification accuracy of the method is 99.10% (WRBF denotes the width radial basis function network). Compared with the deep random forest learning model in particular, the network of the method has multiple levels of base networks, each learning a different part of the data set, so the network structure can be determined adaptively according to the problem and the complexity of the data set, optimizing computing resources. Meanwhile, the network can be tested in parallel: test data are sent to all levels simultaneously, and the per-level decision thresholds obtained in training determine by which level each test sample is finally output, greatly reducing waiting time in actual use of the network.
In addition, the parallel multi-level width neural network of the invention supports incremental learning: when new data arrive, a new width radial basis function network can be added to learn the new characteristics without retraining the entire parallel multi-level width neural network, which means the proposed network can learn new knowledge without forgetting old knowledge. The new training data are input to the current M-level network; if some samples are misclassified, they and the original training set processed by data augmentation together form a new training data set, on which a new width radial basis function network is trained, validated with a new validation set, and given a decision threshold, establishing level M+1. The new parallel multi-level width neural network is then composed of M+1 width radial basis function networks. Meanwhile, the parallel multi-level width neural network designed by the invention can be tested in parallel: all test samples are given to all levels of width radial basis function networks, and the decision thresholds determine which width radial basis function network is assigned to each test sample; no level needs to wait for another level's output, so the test process is parallelized and accelerated.
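A sketch of that incremental step, under the same assumed fit/predict interface as before; per_sample_errors and predict_scores are assumed helpers (the patent specifies no API), and the judgement of the current network is simplified here to its last level:

```python
import numpy as np

def per_sample_errors(net, x, t):
    """Assumed helper: ||softmax(y) - t_onehot||_2 for the validation
    samples the net misclassifies, as in step 3 (predict_scores is an
    assumed C x N score API)."""
    y = net.predict_scores(x)
    s = np.exp(y - y.max(axis=0, keepdims=True))
    s /= s.sum(axis=0, keepdims=True)
    onehot = np.eye(y.shape[0])[:, t]                 # C x N true labels
    e = np.linalg.norm(s - onehot, axis=0)
    return e[s.argmax(axis=0) != t]                   # errors of missed samples

def add_level(nets, T, x_new, t_new, x_aug, t_aug, x_val, t_val,
              make_net, alpha=1.0):
    """If the current network misclassifies some new data, train one new
    level (M+1) on those samples plus the augmented original set, give it
    a threshold, and append it; the first M levels are left untouched."""
    miss = nets[-1].predict(x_new) != t_new           # simplified judgement
    if not miss.any():
        return nets, T                                # nothing new to learn
    net = make_net()
    net.fit(np.concatenate([x_new[miss], x_aug]),
            np.concatenate([t_new[miss], t_aug]))
    e_w = per_sample_errors(net, x_val, t_val)        # validate the new level
    T.append(e_w.min() - alpha * e_w.std())           # threshold T_{M+1}
    nets.append(net)
    return nets, T
```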
Each level of the parallel multi-level width neural network of the invention may be a width radial basis function network, a BP neural network, a convolutional neural network or another classifier, and the base classifier types of the different levels of the multi-level width neural network may differ.
Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the invention and their technical equivalents, the invention is intended to include them as well.

Claims (10)

1. A learning method based on a parallel multi-level width neural network, the parallel multi-level width neural network comprising multiple levels of width neural networks, wherein each level comprises a sequentially connected input layer, hidden layer and output layer, characterized in that the learning method comprises the following steps:
Step 1: obtain an original training sample set and construct a parallel M-level width neural network Net_1, ..., Net_m, ..., Net_M (m = 1, 2, ..., M), each level of which serves as the base classifier of its level; apply M data transformations to the original training sample set to obtain M corresponding validation sets x_v_1, ..., x_v_m, ..., x_v_M;
where N_tr is the total number of samples in the original training sample set;
Step 2: use the original training sample set and the M validation sets x_v_1, ..., x_v_m, ..., x_v_M to train and validate each level of the parallel M-level width neural network, obtaining the trained parallel M-level width neural network and the validation output y_v_m (m = 1, 2, ..., M) of each level; obtain the label y_v_ind_m corresponding to each validation output y_v_m by the minimum-error method, and thereby the correctly classified sample set y_vc_m and the misclassified sample set y_vw_m of each level's validation set;
Step 3: perform statistical calculations separately on each trained level's correctly classified sample set y_vc_m and misclassified sample set y_vw_m to obtain the decision threshold T_m of each level; with each level's threshold T_m as that level's decision criterion, obtain the threshold-determined parallel M-level width neural network;
Step 4: obtain a test set and feed it in parallel, as input data, to every level of the threshold-determined parallel M-level width neural network for testing, obtaining each threshold-determined level's output; obtain each level's error vector and judge each threshold-determined level's output, thereby obtaining the label y_test_ind_m corresponding to each level's test output.
2. The learning method based on a parallel multi-level width neural network according to claim 1, characterized in that in step 1, the data transformation compresses or deforms the samples of the original sample set by elastic transformation, or rotates, flips, zooms in or zooms out the samples by affine transformation.
3. The learning method based on a parallel multi-level width neural network according to claim 1, characterized in that in step 2, training and validating each level of the parallel M-level width neural network with the original training sample set and the M validation sets x_v_1, ..., x_v_m, ..., x_v_M comprises the following sub-steps:
sub-step 2.1: take the original training sample set as the input sample of the 1st-level width neural network Net_1 and train Net_1, obtaining the trained 1st-level width neural network;
sub-step 2.2: validate the trained 1st-level width neural network with the first validation set x_v_1, obtaining the misclassified sample set y_vw_1 of the 1st level's validation set;
sub-step 2.3: take the misclassified sample set y_vw_1 of the 1st level as the input sample set A_v_1 of the 2nd-level width neural network; randomly draw a further training sample set A_v_2 from the original training sample set so that the total input sample set {A_v_1 + A_v_2} contains as many samples as the original training sample set, and use {A_v_1 + A_v_2} as the input sample of the 2nd-level width neural network;
sub-step 2.4: train the 2nd-level width neural network with {A_v_1 + A_v_2}, obtaining the trained 2nd-level width neural network; validate it with the second validation set x_v_2, obtaining the misclassified sample set y_vw_2 of the 2nd level's validation set;
and so on: the 3rd to M-th levels are trained in the same way, yielding the trained parallel M-level width neural network and the validation output y_v_m (m = 1, 2, ..., M) of each level.
4. The learning method based on a parallel multi-level width neural network according to claim 1, characterized in that in step 2, the minimum-error method is:
first, let C be the total number of classes of the original training sample set and construct reference matrices R_j (1 ≤ j ≤ C),
where all elements of the j-th row of R_j are 1 and all other elements are 0, and each R_j has dimension C × N_tr;
second, from the validation output y_v_m of each trained level, obtain the error vector between y_v_m and each reference matrix R_j of the corresponding level:
J_v_mj = ||softmax(y_v_m) - R_j||_2, 1 ≤ j ≤ C;
where ||·||_2 denotes the 2-norm of a matrix and softmax(·) is the normalized exponential function; J_v_mj has dimension 1 × N_tr and y_v_m has dimension C × N_tr;
finally, minimize the error vector J_v_mj between y_v_m and the reference matrices R_j of the corresponding level to obtain the class label of each trained level:
y_v_ind_m = argmin_j J_v_mj, where y_v_ind_m has dimension 1 × N_tr.
5. The learning method based on a parallel multi-level width neural network according to claim 1, characterized in that in step 3, the statistical calculation comprises the following sub-steps:
sub-step 3.1: let the correctly classified and misclassified sample sets of the m-th level of the trained parallel M-level width neural network be y_vc_m and y_vw_m, with sample counts N_vc_m and N_vw_m, where N_vc_m + N_vw_m = N_tr; the errors of the two sets are then:
e_vc_m = ||softmax(y_vc_m) - t_vc_m||_2
e_vw_m = ||softmax(y_vw_m) - t_vw_m||_2
where t_vc_m is the true label of the correctly classified samples y_vc_m and t_vw_m the true label of the misclassified samples y_vw_m of the m-th level;
sub-step 3.2: from the correctly classified set y_vc_m and the misclassified set y_vw_m, compute the mean and standard deviation μ_c and σ_c of the correctly classified errors and μ_w and σ_w of the misclassified errors; the Gaussian distributions corresponding to y_vc_m and y_vw_m are then N(μ_c, σ_c²) and N(μ_w, σ_w²),
with Gaussian probability density functions f_c(e) = (1/√(2πσ_c²)) exp(-(e - μ_c)²/(2σ_c²)) and f_w(e) = (1/√(2πσ_w²)) exp(-(e - μ_w)²/(2σ_w²));
sub-step 3.3: from the error e_vw_m of the misclassified set y_vw_m and the standard deviation σ_w, obtain the decision threshold of the m-th level: T_m = min(e_vw_m) - ασ_w;
where α is a constant providing a margin so that all misclassified samples y_vw_m are rejected by the current level.
6. The learning method based on a parallel multi-level width neural network according to claim 2, characterized in that in step 4, obtaining the test set means: obtain the original test sample set x_test; by M rounds of data augmentation, obtain the M corresponding groups of test sample sets x_test_1, ..., x_test_m, ..., x_test_M as the test set.
7. The learning method based on a parallel multi-level width neural network according to claim 6, characterized in that the data augmentation applies the data transformation N_testD times to each sample of the original test sample set x_test, yielding N_testD corresponding test sample sets that form the test set x_test_m of the m-th level of the threshold-determined parallel M-level width neural network;
where the total number of test samples in the original test sample set x_test is N_test_samples.
8. The learning method based on a parallel multi-level width neural network according to claim 1, characterized in that in step 4, obtaining each level's error vector comprises the following sub-steps:
sub-step 4.1: feed the M groups of test sample sets x_test_1, x_test_2, ..., x_test_M in parallel to the threshold-determined parallel M-level width neural network, obtaining for each threshold-determined level the N_testD corresponding outputs y_test_m_d (d = 1, 2, ..., N_testD);
sub-step 4.2: average the N_testD outputs y_test_m_d (d = 1, 2, ..., N_testD) of each threshold-determined level, ȳ_test_m = (1/N_testD) Σ_d y_test_m_d, obtaining the test output of each threshold-determined level;
sub-step 4.3: let C be the total number of classes of the test set and construct reference matrices R_j (1 ≤ j ≤ C); obtain the error vector between the test output ȳ_test_m and each reference matrix R_j of the corresponding level:
J_test_mj = ||softmax(ȳ_test_m) - R_j||_2, 1 ≤ j ≤ C;
where all elements of the j-th row of R_j are 1 and all other elements are 0; each R_j has dimension C × N_test_samples, J_test_mj has dimension 1 × N_test_samples, and ȳ_test_m has dimension C × N_test_samples.
9. The learning method based on a parallel multi-level width neural network according to claim 8, characterized in that judging each threshold-determined level's output means:
if the minimal error of the current level is at most the current level's decision threshold, the current level is judged to be the correct classification output level for that output:
min(J_test_mj) ≤ T_m;
if the minimal error of the current level exceeds the current level's decision threshold, the current level is judged unable to classify that output correctly; the output is passed to the next level for testing, and so on, until its correct classification output level is found:
min(J_test_mj) > T_m.
10. The learning method based on a parallel multi-level width neural network according to claim 9, characterized in that in step 4, the label corresponding to each threshold-determined level's test output is y_test_ind_m = argmin_j J_test_mj;
where y_test_ind_m has dimension 1 × N_test_samples.
CN201910331708.8A 2019-04-24 2019-04-24 Learning method based on parallel multi-level width neural network Active CN110110845B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910331708.8A CN110110845B (en) 2019-04-24 2019-04-24 Learning method based on parallel multi-level width neural network


Publications (2)

Publication Number Publication Date
CN110110845A true CN110110845A (en) 2019-08-09
CN110110845B (en) 2020-09-22

Family

ID=67486407

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910331708.8A Active CN110110845B (en) 2019-04-24 2019-04-24 Learning method based on parallel multi-level width neural network

Country Status (1)

Country Link
CN (1) CN110110845B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170132514A1 (en) * 2012-12-24 2017-05-11 Google Inc. System and method for parallelizing convolutional neural networks
US20160019454A1 (en) * 2014-07-18 2016-01-21 James LaRue J Patrick's Ladder A Machine Learning Enhancement Tool
CN108351985A (en) * 2015-06-30 2018-07-31 亚利桑那州立大学董事会 Method and apparatus for large-scale machines study
CN107784312A (en) * 2016-08-24 2018-03-09 腾讯征信有限公司 Machine learning model training method and device

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111008647A (en) * 2019-11-06 2020-04-14 长安大学 Sample extraction and image classification method based on void convolution and residual linkage
CN111008647B (en) * 2019-11-06 2022-02-08 长安大学 Sample extraction and image classification method based on void convolution and residual linkage
CN111340184A (en) * 2020-02-12 2020-06-26 北京理工大学 Deformable reflector surface shape control method and device based on radial basis function
CN111340184B (en) * 2020-02-12 2023-06-02 北京理工大学 Deformable reflector surface shape control method and device based on radial basis function
CN113449569A (en) * 2020-03-27 2021-09-28 威海北洋电气集团股份有限公司 Mechanical signal health state classification method and system based on distributed deep learning
CN113449569B (en) * 2020-03-27 2023-04-25 威海北洋电气集团股份有限公司 Mechanical signal health state classification method and system based on distributed deep learning
CN112966761A (en) * 2021-03-16 2021-06-15 长安大学 Extensible adaptive width neural network learning method
CN112966761B (en) * 2021-03-16 2024-03-19 长安大学 Extensible self-adaptive width neural network learning method

Also Published As

Publication number Publication date
CN110110845B (en) 2020-09-22

Similar Documents

Publication Publication Date Title
CN110110845A (en) Learning method based on parallel multi-level width neural network
CN108564129B (en) Trajectory data classification method based on generation countermeasure network
CN107871100A (en) The training method and device of faceform, face authentication method and device
CN106529499A (en) Fourier descriptor and gait energy image fusion feature-based gait identification method
CN106295694A (en) Face recognition method for iterative re-constrained group sparse representation classification
CN112039687A (en) Small sample feature-oriented fault diagnosis method based on improved generation countermeasure network
CN111311702B (en) Image generation and identification module and method based on BlockGAN
CN110533116A (en) Based on the adaptive set of Euclidean distance at unbalanced data classification method
CN112949738B (en) Multi-class unbalanced hyperspectral image classification method based on EECNN algorithm
CN108108760A (en) A kind of fast human face recognition
CN106991355A (en) The face identification method of the analytical type dictionary learning model kept based on topology
CN106650667A (en) Pedestrian detection method and system based on support vector machine
CN104820825A (en) Adaboost algorithm-based face recognition optimization method
CN109344845A (en) A kind of feature matching method based on Triplet deep neural network structure
CN112819063B (en) Image identification method based on improved Focal loss function
CN110458178A (en) The multi-modal RGB-D conspicuousness object detection method spliced more
CN113139536A (en) Text verification code identification method and equipment based on cross-domain meta learning and storage medium
CN107886066A (en) A kind of pedestrian detection method based on improvement HOG SSLBP
CN109816030A (en) A kind of image classification method and device based on limited Boltzmann machine
CN109993042A (en) A kind of face identification method and its device
CN106778714A (en) LDA face identification methods based on nonlinear characteristic and model combination
CN109509188A (en) A kind of transmission line of electricity typical defect recognition methods based on HOG feature
CN104598898B (en) A kind of Aerial Images system for rapidly identifying and its method for quickly identifying based on multitask topology learning
CN110059705A (en) A kind of OCR recognition result decision method and equipment based on modeling
CN109948589A (en) Facial expression recognizing method based on quantum deepness belief network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant