CN112232494A - Method for constructing pulse neural network for feature extraction based on frequency induction - Google Patents
Method for constructing pulse neural network for feature extraction based on frequency induction
- Publication number
- CN112232494A (application number CN202011246943.4A)
- Authority
- CN
- China
- Prior art keywords
- neuron
- pulse
- feature extraction
- learning
- neurons
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS › G06—COMPUTING; CALCULATING OR COUNTING › G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS › G06N3/00—Computing arrangements based on biological models › G06N3/02—Neural networks › G06N3/04—Architecture, e.g. interconnection topology › G06N3/049—Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
- G06N3/04—Architecture, e.g. interconnection topology › G06N3/045—Combinations of networks
- G06N3/02—Neural networks › G06N3/08—Learning methods
Abstract
A method for constructing a spiking neural network for feature extraction based on frequency induction comprises: establishing an input layer that performs preliminary processing on input data and converts it into a pulse sequence; and establishing a feature extraction layer in which a frequency-induction heuristic mechanism is introduced on the spiking neurons, structure learning is performed with a cumulative growth algorithm based on the HEBB rule, and weight learning is performed with the STDP rule. After learning is finished, the data requiring feature extraction is input, and the output of the last layer of the feature extraction layer is the feature; for each piece of training data, the simulation is started several times for learning. The method effectively generates memory, extracts features efficiently, and achieves good robustness through small-sample learning. The introduction of the frequency-induction heuristic mechanism makes the network more capable, the learning more effective, and the operation more efficient; it keeps the synapse growth rate within a reasonable range, holds the network scale in dynamic balance, and prevents overgrowth.
Description
Technical Field
The invention belongs to the technical field of artificial intelligence and neural networks, and particularly relates to a method for constructing a pulse neural network for feature extraction based on frequency induction.
Background
The spiking neural network is known as the third-generation artificial neural network; as a model closer to real neurons, it is considered to have greater potential for intelligence. However, sample processing when training a spiking neural network is still based on discrete numerical values, and converting numbers into pulses loses information: some features in the samples are lost, which effectively adds a great deal of noise, degrades the learning effect, and leaves the learning insufficient. There is currently no satisfactory method for converting features into pulse trains. Moreover, when a neural network is applied to a concrete problem, an inefficient feature extraction stage severely limits system function and overall processing capacity. Frequency-induced synaptic plasticity, such as the LTP and LTD mechanisms, is common in the brain, but it has not been widely explored in spiking neural network learning algorithms.
Disclosure of Invention
To overcome the shortcomings of the prior art, the present invention aims to provide a method for constructing a spiking neural network for feature extraction based on frequency induction. Based on frequency-induced synaptic plasticity, the network learns the features in the data for a specific task and finally outputs a pulse sequence as the extracted features. The method effectively generates memory, extracts features efficiently, and achieves good robustness through small-sample learning; the introduction of the frequency-induction heuristic mechanism makes the network more capable, the learning more effective, and the operation more efficient, keeps the synapse growth rate within a reasonable range, holds the network scale in dynamic balance, and prevents overgrowth.
In order to achieve the purpose, the invention adopts the technical scheme that:
a method for constructing a pulse neural network for feature extraction based on frequency induction comprises the following steps:
step 1: establishing an input layer, performing primary processing on input data, and converting the input data into a pulse sequence;
step 2: establishing a feature extraction layer; introducing a frequency-induction heuristic mechanism on the spiking neurons; performing structure learning with a cumulative growth algorithm based on the HEBB rule and weight learning with the STDP rule; after learning is finished, inputting the data requiring feature extraction, the output of the last layer of the feature extraction layer being the feature; for each input data sample, the simulation is started several times for learning.
The step 1 specifically comprises the following steps:
step 1.1: constructing an input layer using p pulse generators;
step 1.2: performing preliminary processing on the input data samples, weakening the noise in the samples by a non-neural-network method, and sorting the samples by category;
step 1.3: converting the data processed in step 1.2 into pulse firing times, with larger values firing later;
step 1.4: a simulation is started for each sample, and the pulse generator sends a pulse at a corresponding time to convert the data into a pulse sequence.
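The encoding in steps 1.3 and 1.4 can be sketched as follows. This is a minimal illustration, assuming inputs normalized to [0, 1], a linear value-to-time mapping, and an illustrative simulation window t_max; none of these choices are fixed by the text.

```python
import numpy as np

def value_to_spike_time(values, t_max=100.0):
    """Latency coding for step 1.3: map each (normalized) input value
    to a firing time inside the simulation window [0, t_max], with
    larger values firing later, as described in the text.
    t_max and the linear mapping are illustrative assumptions."""
    v = np.clip(np.asarray(values, dtype=float), 0.0, 1.0)
    return v * t_max  # step 1.4: a generator emits one pulse at this time

times = value_to_spike_time([0.1, 0.5, 0.9])  # monotonically later spikes
```

Each returned time is then the moment at which the corresponding pulse generator emits its single pulse during the simulation of that sample.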
The step 2 specifically comprises the following steps:
step 2.1: constructing a feature extraction layer by using three layers of pulse neural networks, wherein each layer of network is provided with q normal pulse neurons, and a plurality of synapses are randomly generated from a previous layer to a next layer during initialization;
step 2.2: setting a time window and an excitation counter for each pulse neuron, judging whether the pulse neuron is under the influence of frequency-induced inhibition or enhancement effect for each simulation, and if so, applying the influence;
step 2.3: for each simulation, the spiking neurons between two adjacent layers perform structure learning through a cumulative growth algorithm based on the HEBB rule; synapse growth respects limits on material and energy (growing a synapse is a long-term process, and a neuron cannot grow synapses in large numbers) and is competitive: as learning progresses, some grown synapses become unimportant and degenerate;
step 2.4: for each simulation, the synapses of the spiking neurons undergo weight learning through the STDP rule: if the presynaptic neuron fires before the postsynaptic neuron, the synaptic weight is increased, and the closer the two firing times, the larger the increase; if the presynaptic neuron fires after the postsynaptic neuron, the weight is decreased, and the closer the two firing times, the larger the decrease.
In step 2.1, the weight is initialized to one third of the upper limit of the synaptic weight.
The step 2.2 specifically comprises the following steps:
step 2.2.1: setting a firing counter cc and a time window with the size of L for each pulse neuron;
step 2.2.2: counting the value of cc within the time window; if formula 1 is satisfied, the spiking neuron is in the enhanced state, and if formula 2 is satisfied, it is in the inhibited state, where θ_P and θ_D in formulas 1 and 2 are the judgment thresholds for enhancement and inhibition, respectively;
cc ∈ [θ_P, L]  (formula 1)
cc ∈ (0, θ_D]  (formula 2)
Step 2.2.3: if the neuron is in the enhanced state, establishing a weight between the neuron and all the post-synaptic neurons thereof, and fixing the weight to wPSynapse of (a), wherein wPShould be large enough to ensure excitation of the postsynaptic nerve group; if the neuron is in the inhibited state, the synapses between the neuron and all its postsynaptic neurons are disconnected and then fixed to w with a weightDSynapse replacement of wherein wDThe postsynaptic neurons are difficult to excite, meanwhile, disconnected synaptic information is stored, and a user can recover subsequently;
step 2.2.4: at the end of each simulation run, all neurons in the enhancement or inhibition state are traversed, and the enhancement or inhibition state is exited if it has been maintained for a time window.
For a neuron being enhanced, exiting the enhanced state disconnects the temporarily established synapses; for an inhibited neuron, a long-term attenuation effect is additionally applied, with the attenuation probability P computed according to formula 3:
For each synapse, the synapse is disconnected with probability P; if it is not disconnected, its weight is multiplied by the attenuation factor α with probability P; if it is neither disconnected nor attenuated, its original value from before entering the inhibited state is restored. In formula 3, ρ is a probability constant with ρ < 1, and Dc is a counter of consecutive entries of the neuron into the inhibited state.
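Steps 2.2.1 through 2.2.4 amount to a small per-neuron state machine. The sketch below illustrates it in Python; all parameter values (L, θ_P, θ_D, w_P, w_D, α) are illustrative assumptions, and since the text does not reproduce formula 3, the attenuation probability P is taken as an input supplied by the caller.

```python
import random

class FreqInducedNeuron:
    """Sketch of the frequency-induction heuristic of step 2.2.
    All parameter values (L, theta_P, theta_D, w_P, w_D, alpha) are
    illustrative assumptions, not values taken from the patent."""

    def __init__(self, L=50, theta_P=30, theta_D=5,
                 w_P=1.0, w_D=0.01, alpha=0.5):
        self.L, self.theta_P, self.theta_D = L, theta_P, theta_D
        self.w_P, self.w_D, self.alpha = w_P, w_D, alpha
        self.state = "normal"      # "normal" | "enhanced" | "inhibited"
        self.Dc = 0                # consecutive entries into inhibition
        self.saved_weights = {}    # disconnected synapses kept for recovery

    def update_state(self, cc, out_synapses):
        """cc: firing count within the last time window of size L
        (steps 2.2.1-2.2.2). out_synapses: {post_id: weight}, mutated."""
        if self.theta_P <= cc <= self.L:        # formula 1 -> enhanced
            self.state = "enhanced"
            for j in out_synapses:
                out_synapses[j] = self.w_P      # step 2.2.3: fix to w_P
        elif 0 < cc <= self.theta_D:            # formula 2 -> inhibited
            self.state = "inhibited"
            self.Dc += 1
            self.saved_weights = dict(out_synapses)
            for j in out_synapses:
                out_synapses[j] = self.w_D      # hard to excite
        return self.state

    def long_term_attenuation(self, out_synapses, P):
        """Long-term attenuation on leaving the inhibited state. The text
        does not reproduce formula 3, so P is supplied by the caller:
        disconnect with probability P; otherwise attenuate by alpha with
        probability P; otherwise restore the saved weight."""
        for j, w0 in self.saved_weights.items():
            if random.random() < P:
                out_synapses.pop(j, None)       # disconnect
            elif random.random() < P:
                out_synapses[j] = w0 * self.alpha
            else:
                out_synapses[j] = w0            # restore original value
```

With the defaults above, a neuron firing 35 times in a window of 50 enters the enhanced state, while one firing 3 times enters the inhibited state.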
The step 2.3 specifically comprises the following steps:
step 2.3.1: for every two adjacent layers, establishing a coefficient matrix A, where A_ij indicates the importance of growing a synapse from neuron i in the previous layer to neuron j in the next layer; if a synapse already exists between i and j, A_ij is held at 0;
step 2.3.2: after each simulation ends, traversing the neuron pairs between two adjacent layers and updating the coefficient matrix according to formula 4:
where λ_HEBB is the decay coefficient, ΔA_max is the maximum amount by which the coefficient can be increased each time, t_j^m and t_i^n are respectively the m-th firing time of neuron j and the n-th firing time of neuron i in the simulation, th is a statistical time threshold, the summation counts the firings of spiking neuron j within time th of a firing of spiking neuron i, A_ij^(k+1) is the updated coefficient after the k-th simulation ends, and A_ij^(k) is the coefficient before the update;
step 2.3.3: after updating the coefficient matrix, setting a growth threshold θ_HEBB; if A_ij is greater than θ_HEBB, neuron i will grow a synapse to neuron j;
step 2.3.4: traversing the presynaptic neurons that will grow synapses and, for each traversed neuron i, selecting the K postsynaptic neurons j with the largest A_ij to grow synapses to.
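The cumulative growth algorithm of steps 2.3.1 to 2.3.4 can be sketched as follows. Since formula 4 is not reproduced in the text, the update assumes one plausible reading (exponential decay by λ_HEBB plus an increment, capped at ΔA_max, proportional to the co-firing count within th); all parameter values are illustrative.

```python
import numpy as np

def update_growth_coeffs(A, spikes_pre, spikes_post, existing,
                         lam=0.9, dA_max=0.1, th=10.0):
    """One pass of step 2.3.2. The text does not reproduce formula 4,
    so this assumes a plausible reading: each coefficient decays by lam
    and then grows, capped at dA_max, with the number of firings of
    post-neuron j within th after a firing of pre-neuron i.
    All parameter values are illustrative assumptions."""
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            if existing[i, j]:
                A[i, j] = 0.0   # step 2.3.1: existing synapse stays at 0
                continue
            count = sum(1 for tm in spikes_post[j]
                        for tn in spikes_pre[i] if 0.0 <= tm - tn <= th)
            A[i, j] = lam * A[i, j] + min(dA_max, dA_max * count)
    return A

def grow_synapses(A, theta_hebb=0.05, K=2):
    """Steps 2.3.3-2.3.4: each pre-neuron i grows synapses to the K
    post-neurons j with the largest A[i, j] exceeding theta_hebb."""
    grown = []
    for i in range(A.shape[0]):
        best = np.argsort(-A[i])[:K]
        grown.extend((i, int(j)) for j in best if A[i, j] > theta_hebb)
    return grown
```

Because coefficients decay every simulation and only K candidates per pre-neuron can grow, synapse growth stays slow and competitive, matching the material/energy constraints described above.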
The step 2.4 specifically comprises the following steps:
2.4.1: after the first simulation ends, scanning all synapses and, for each firing of the pre- and postsynaptic neurons, computing Δt according to formula 5:
Δt = t_pre − t_post + delay  (formula 5)
where t_pre and t_post are the firing times of the presynaptic and postsynaptic neuron, respectively, and delay is the synaptic delay;
2.4.2: modifying the synaptic weight according to formula 6:
where w_max is the upper limit of the weight, λ_STDP is the learning rate, μ_+ and μ_− are the weight determination coefficients during weight increase and decay respectively, α_STDP is the asymmetry factor, K_+ and K_− are the time convergence coefficients of weight decay and increase respectively, e is the natural constant, τ_− and τ_+ are the time scaling factors for weight increase and decay respectively, and w′ and w are the weights after and before the update, respectively.
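The text lists the symbols of formula 6 but not the formula itself, so the sketch below uses a standard power-law STDP form with the same parameter names; it is an assumed stand-in, not the patent's exact rule.

```python
import math

def stdp_update(w, dt, w_max=1.0, lam=0.01, mu_plus=1.0, mu_minus=1.0,
                alpha_stdp=1.1, K_plus=1.0, K_minus=1.0,
                tau_plus=20.0, tau_minus=20.0):
    """Sketch of the step-2.4 weight update (an assumption; the patent's
    formula 6 is not reproduced in the text). dt = t_pre - t_post + delay
    (formula 5): dt < 0 means the presynaptic spike arrived first, so the
    weight is increased, the more so the closer the two spikes are."""
    if dt < 0.0:   # pre before post: potentiation
        dw = lam * K_plus * (1.0 - w / w_max) ** mu_plus \
             * math.exp(dt / tau_plus)
    else:          # pre after (or with) post: depression
        dw = -lam * alpha_stdp * K_minus * (w / w_max) ** mu_minus \
             * math.exp(-dt / tau_minus)
    return min(max(w + w_max * dw, 0.0), w_max)
```

The exponential factors implement the "closer interval, larger change" behavior described in step 2.4, and the power-law terms in w/w_max keep the weight inside [0, w_max].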
The features are pulse-sequence features; they can be used directly as input for training other spiking neural networks, or converted into numerical features by a mathematical method and used for training other, non-spiking neural network models.
Compared with the prior art, the invention has the beneficial effects that:
1) The invention can efficiently extract features from data, and can extract them directly as pulse-sequence features for training other spiking neural networks, filling a gap in this area of existing spiking neural network research.
2) The invention introduces a heuristic mechanism based on frequency induction and improves the synapse growth algorithm based on the HEBB rule and the weight change algorithm based on the STDP rule; the network is more capable, the learning more effective, and the operation more efficient, the network scale remains in dynamic balance, overgrowth is prevented, and good robustness is achieved through small-sample learning.
3) The method can be used to extract features from data such as pictures and speech. For pictures, the method converts a picture into a pulse sequence that represents the features contained in it; this sequence can be input into a spiking neural network for picture classification, improving the network's learning on pictures. For speech data, the method converts the audio into pulse information, which is input into a speech recognition spiking neural network for training, improving the recognition rate and the robustness to abnormal pronunciation.
Drawings
Fig. 1 is an overall structural view of the present invention.
Detailed Description
The embodiments of the present invention will be described in detail below with reference to the drawings and examples.
As shown in FIG. 1, a method for constructing a spiking neural network for feature extraction based on frequency induction constructs an input layer and a feature extraction layer. A sample is encoded and fed into the input layer, whose neurons fire at the corresponding times; the result is then passed to the feature extraction layer for learning. A heuristic mechanism based on frequency induction is introduced on the spiking neurons of the feature extraction layer. Under this mechanism, if a spiking neuron fires at high frequency over a period of time, it is considered highly valuable for the current data and its performance is enhanced; if it fires at low frequency, it is considered noise and is inhibited. The enhancing or inhibiting effect lasts for a relatively long time. Meanwhile, synapse structure learning of the spiking neurons in the feature extraction layer uses a cumulative growth algorithm based on the HEBB rule; synapse growth in this algorithm draws on the situation in a real brain, considering the limits of energy, substance, and speed: growing a synapse is a long-term process, and a synapse may become less important as learning progresses while synapses are needed elsewhere, in which case it stops growing. Weight learning of the synapses is performed with the STDP rule. The output of the last layer after learning is the feature converted into a pulse sequence.
This algorithm lets the network effectively generate memory, extract features efficiently, and achieve good robustness through small-sample learning. At the same time, the frequency-induced heuristic mechanism has a positive effect on the learning of the spiking neural network: the network is more capable, the learning more effective, and the operation more efficient; the synapse growth rate stays within a reasonable range, the network scale remains in dynamic balance, and overgrowth is prevented.
Referring to fig. 1, taking feature extraction for picture recognition as an example, the method includes the following steps:
step 1: and establishing an input layer, performing primary processing on input data, and converting the input data into a pulse sequence.
Step 1.1: the samples are subjected to preliminary processing, noise in the samples is weakened through a non-neural network method such as denoising and enhancing (in the embodiment, one layer of convolution is selected to be performed on the pictures), and the samples are sorted according to categories.
Step 1.2: and (3) constructing an input layer by using p pulse generators, converting the data processed in the step 1.1 into pulse excitation time, and exciting the data after the convolution is larger.
Step 1.3: a simulation is started for each sample, and the pulse generator sends a pulse at a corresponding time to convert the data into a pulse sequence.
Step 2: establishing a feature extraction layer, introducing a frequency induction heuristic mechanism on a pulse neuron, performing structure learning by using an accumulative growth algorithm based on an HEBB rule, performing weight learning by using an STDP rule, inputting data needing feature extraction after learning is finished, wherein the output of the last layer of the feature extraction layer is the feature, and starting simulation for each input data sample for a plurality of times for learning.
Step 2.1: and constructing a feature extraction network by using three layers of pulse neural networks, wherein each layer of network has q normal pulse neurons, a previous layer generates a plurality of synapses to a next layer randomly during initialization, and weight initialization is preferably one third of the upper limit of the synapse weight according to experience.
Step 2.2: setting a time window and an excitation counter for each pulse neuron, judging whether the pulse neuron is under the influence of frequency-induced inhibition or enhancement effect for each simulation, and if so, applying the influence.
Step 2.2.1: setting a firing counter cc and a time window with the size of L for each pulse neuron;
step 2.2.2: counting the value of cc within the time window; if formula 1 is satisfied, the neuron is in the enhanced state, and if formula 2 is satisfied, it is in the inhibited state, where θ_P and θ_D in formulas 1 and 2 are the judgment thresholds for the enhanced and inhibited states, respectively;
cc ∈ [θ_P, L]  (formula 1)
cc ∈ (0, θ_D]  (formula 2)
Step 2.2.3: if the neuron is in the enhanced state, establish, between the neuron and all of its postsynaptic neurons, synapses whose weight is fixed to w_P, where w_P should be large enough to ensure that the postsynaptic neurons are excited.
Step 2.2.4: if the neuron is in the inhibited state, disconnect the synapses between the neuron and all of its postsynaptic neurons and replace them with synapses whose weight is fixed to w_D, where w_D makes the postsynaptic neurons difficult to excite; the disconnected synapse information is stored at the same time so that it can be recovered later;
step 2.2.5: at the end of each simulation, traverse all neurons in the enhanced or inhibited state and exit that state if it has been maintained for one time window. For a neuron being enhanced, exiting the enhanced state disconnects the temporarily established synapses. For an inhibited neuron, a long-term attenuation effect is additionally applied, with the attenuation probability P computed according to formula 3:
For each synapse, the synapse is disconnected with probability P; if it is not disconnected, its weight is multiplied by the attenuation factor α with probability P; if it is neither disconnected nor attenuated, its original value from before entering the inhibited state is restored. In formula 3, ρ is a probability constant with ρ < 1, and Dc is a counter of consecutive entries of the neuron into the inhibited state.
Step 2.3: for each simulation, the pulse neurons between two adjacent layers are subjected to structure learning through an accumulative growth algorithm based on the HEBB rule, the limitation of material and energy in synapse growth is guaranteed, namely the synapse growth is a long-term process and a neuron cannot grow in large quantity, and a competitive relationship exists, namely as learning progresses, some growing synapses become unimportant and are degraded.
Step 2.3.1: for every two adjacentLayer, establishing a coefficient matrix A, AijIndicates the importance of growing synapses between neurons i in the previous layer and neurons j in the subsequent layer, if synapses already exist between ij, AijIs always 0;
step 2.3.2: after each simulation ends, traverse the neuron pairs between two adjacent layers and update the coefficient matrix according to formula 4:
where λ_HEBB is the decay coefficient, ΔA_max is the maximum amount by which the coefficient can be increased each time, t_j^m and t_i^n are respectively the m-th firing time of neuron j and the n-th firing time of neuron i in the simulation, th is a statistical time threshold, the summation counts the firings of spiking neuron j within time th of a firing of spiking neuron i, A_ij^(k+1) is the updated coefficient after the k-th simulation ends, and A_ij^(k) is the coefficient before the update.
Step 2.3.3: after updating the coefficient matrix, set a growth threshold θ_HEBB; if A_ij is greater than θ_HEBB, neuron i will grow a synapse to neuron j;
step 2.3.4: traverse the presynaptic neurons that will grow synapses and, for each traversed neuron i, select the K postsynaptic neurons j with the largest A_ij to grow synapses to.
Step 2.4: for each simulation, the synapses of the spiking neurons undergo weight learning through the STDP rule: if the presynaptic neuron fires before the postsynaptic neuron, the synaptic weight is increased, and the closer the two firing times, the larger the increase; if the presynaptic neuron fires after the postsynaptic neuron, the weight is decreased, and the closer the two firing times, the larger the decrease.
2.4.1: after the first simulation ends, scan all synapses and, for each firing of the pre- and postsynaptic neurons, compute Δt according to formula 5:
Δt = t_pre − t_post + delay  (formula 5)
where t_pre and t_post are the firing times of the presynaptic and postsynaptic neuron, respectively, and delay is the synaptic delay.
2.4.2: modify the synaptic weight according to formula 6:
where w_max is the upper limit of the weight, λ_STDP is the learning rate, μ_+ and μ_− are the weight determination coefficients during weight increase and decay respectively, α_STDP is the asymmetry factor, K_+ and K_− are the time convergence coefficients of weight decay and increase respectively, e is the natural constant, τ_− and τ_+ are the time scaling factors for weight increase and decay respectively, and w′ and w are the weights after and before the update, respectively.
Step 2.5: after learning is finished, input the data requiring feature extraction; the output of the last layer of the feature extraction layer is the feature. This is a pulse-sequence feature that can be used directly as input for training other spiking neural networks, or converted into a numerical feature by a mathematical method for training other, non-spiking neural network models.
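One simple "mathematical method" for converting the output pulse sequences of step 2.5 into a numerical feature is binned spike counts. This is an illustrative choice on our part; the patent does not specify a particular conversion.

```python
import numpy as np

def spike_counts(feature_spikes, q, window=500.0, bins=5):
    """Convert output pulse sequences into a numerical vector of binned
    spike counts: one row per output neuron, one column per time bin.
    An illustrative conversion; the patent does not specify one.
    feature_spikes: {neuron_id: [spike times in ms]}."""
    edges = np.linspace(0.0, window, bins + 1)
    feats = np.zeros((q, bins))
    for neuron_id, times in feature_spikes.items():
        feats[neuron_id], _ = np.histogram(times, bins=edges)
    return feats.ravel()   # flat vector usable by a non-spiking model

vec = spike_counts({0: [10.0, 120.0], 1: [450.0]}, q=2)
```

The resulting fixed-length vector can be fed to any conventional classifier, while the raw spike times remain available for spiking-network consumers.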
The following is a specific application embodiment of the present invention in MNIST data set identification.
The steps and effects of the invention are described concretely by taking MNIST data set recognition as an example of building a spiking neural network for feature extraction based on frequency induction.
In this example, 100 pictures are randomly selected for each digit as the training set for the method of the invention. A convolution operation is first performed on the pictures in the selected data set; 4 convolution kernels are used, and the convolved pictures are 12 × 12. The input layer of the spiking neural network therefore needs 12 × 12 × 4 pulse generators. All pictures in the training set are then sorted by category: the pictures of 0 are put together, the pictures of 1 are put together, and likewise for 2 through 9. Next, a feature extraction layer of size 1000 × 3 is constructed using step 2.1 of the invention. The pictures are input in order: all pictures of 0 are input in sequence, and this sequence is repeated three times; then all pictures of 1 are input in the same way, repeated three times; and the pictures of 2 through 9 follow in the same manner. Each time a picture is input, a 500 ms simulation is started. The network is then trained according to steps 2.2, 2.3, and 2.4 as described in the invention. After training is finished, the weight-change capability of the feature extraction layer is switched off to fix the network, completing the construction of the spiking neural network for feature extraction based on frequency induction. An output layer with 10 spiking neurons can then be placed after the feature extraction layer, fully connected to it, and trained with a supervised HEBB algorithm to recognize the pictures.
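The presentation schedule described above (class-sorted input, each class's sequence repeated three times) can be sketched as follows; the helper name is ours, not the patent's.

```python
def presentation_order(pictures_by_class, repeats=3):
    """Sketch of the MNIST presentation schedule: all pictures of digit
    0 are input in sequence, the sequence is repeated three times, then
    digit 1 follows in the same way, and so on. Each returned picture is
    presented in its own 500 ms simulation (driven elsewhere)."""
    order = []
    for digit in sorted(pictures_by_class):
        for _ in range(repeats):
            order.extend(pictures_by_class[digit])
    return order
```

Presenting each class as a repeated block gives the frequency-induction mechanism a sustained window in which neurons valuable for that class fire at high rates.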
Experiments show that, once constructed, the spiking neural network for feature extraction based on frequency induction can effectively extract the features of pictures after training on small data, producing different responses to different categories of pictures. Feeding the extracted features into a spiking neural network for MNIST handwritten digit recognition improves the network's accuracy, including on barely legible handwritten data. The method can also be used for image entity recognition: features are first extracted from the image according to the method and then input into a spiking neural network for learning, which outputs the entities possibly present in the image.
Meanwhile, the invention can also be used for extracting the characteristics of other data, such as voice data. In the task of converting voice into characters, the method can convert the audio frequency in the voice into pulse information, and the pulse information is input into a pulse neural network for voice recognition to train, so that the recognition rate of the voice and the robustness to nonstandard pronunciation can be improved.
Claims (9)
1. A method for constructing a pulse neural network for feature extraction based on frequency induction is characterized by comprising the following steps:
step 1: establishing an input layer, performing primary processing on input data, and converting the input data into a pulse sequence;
step 2: establishing a feature extraction layer; introducing a frequency-induction heuristic mechanism on the spiking neurons; performing structure learning with a cumulative growth algorithm based on the HEBB rule and weight learning with the STDP rule; after learning is finished, inputting the data requiring feature extraction, the output of the last layer of the feature extraction layer being the feature; for each input data sample, the simulation is started several times for learning.
2. The method for constructing a spiking neural network for feature extraction based on frequency induction according to claim 1, wherein the step 1 specifically comprises the following steps:
step 1.1: constructing an input layer using p pulse generators;
step 1.2: pre-processing the input data samples, attenuating noise in the samples by a non-neural-network method, and sorting the samples by category;
step 1.3: converting the data processed in step 1.2 into pulse firing times, with larger values firing earlier;
step 1.4: starting a simulation for each sample, in which the pulse generators emit pulses at the corresponding times, converting the data into pulse sequences.
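The latency coding of steps 1.3–1.4 can be sketched as follows; the normalization and the window length `t_max` are illustrative assumptions, since the claim fixes only the ordering (larger value → earlier pulse):

```python
import numpy as np

def encode_latency(sample, t_max=100.0):
    """Latency coding per steps 1.3-1.4: each input value is mapped to a
    pulse time, with larger values firing earlier. The normalization and
    the window length t_max are assumed, not specified by the claim."""
    sample = np.asarray(sample, dtype=float)
    # Normalize values into [0, 1] (epsilon guards against constant input).
    v = (sample - sample.min()) / (sample.max() - sample.min() + 1e-12)
    # Larger value -> earlier spike time within the simulation window.
    return (1.0 - v) * t_max

spike_times = encode_latency([0.1, 0.9, 0.5], t_max=100.0)
```

Each pulse generator of step 1.1 would then emit one pulse at its assigned time when the simulation starts.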
3. The method for constructing a spiking neural network for feature extraction based on frequency induction according to claim 1, wherein the step 2 specifically comprises the following steps:
step 2.1: constructing the feature extraction layer from a three-layer spiking neural network, wherein each layer has q ordinary spiking neurons and a number of synapses from each layer to the next are generated at random during initialization;
step 2.2: setting a time window and a firing counter for each spiking neuron, and for each simulation determining whether the neuron is subject to the frequency-induced inhibition or enhancement effect and, if so, applying it;
step 2.3: for each simulation, the spiking neurons of adjacent layers perform structure learning through the cumulative-growth algorithm based on the Hebb rule; synapse growth respects material and energy constraints, i.e. growth is a long-term process and a neuron cannot grow synapses in large numbers, and growth is competitive, i.e. as learning progresses some growing synapses become unimportant and degenerate;
step 2.4: for each simulation, the synapses of the spiking neurons undergo weight learning by the STDP rule: if the pre-synaptic neuron fires before the post-synaptic neuron, the synaptic weight is increased, and the shorter the interval the larger the increase; if the pre-synaptic neuron fires after the post-synaptic neuron, the weight is decreased, and the shorter the interval the larger the decrease.
4. The method for constructing a spiking neural network for feature extraction based on frequency induction as claimed in claim 3, wherein in the step 2.1, the weight is initialized to one third of the synapse weight upper limit.
5. The method for constructing a spiking neural network for feature extraction based on frequency induction according to claim 3, wherein the step 2.2 comprises the following steps:
step 2.2.1: setting a firing counter cc and a time window with the size of L for each pulse neuron;
step 2.2.2: counting the value of cc within the time window; if it satisfies formula 1, the spiking neuron is in the enhanced state, and if it satisfies formula 2, it is in the inhibited state, where θ_P and θ_D in formulas 1 and 2 are the judgment thresholds for enhancement and inhibition, respectively;
cc ∈ [θ_P, L]  (formula 1)
cc ∈ (0, θ_D]  (formula 2)
step 2.2.3: if the neuron is in the enhanced state, establishing synapses between the neuron and all of its post-synaptic neurons with the weight fixed at w_P, where w_P should be large enough to ensure that the post-synaptic neurons fire; if the neuron is in the inhibited state, disconnecting the synapses between the neuron and all of its post-synaptic neurons and replacing them with synapses whose weight is fixed at w_D, where w_D makes the post-synaptic neurons difficult to fire; the disconnected synapse information is stored so that it can be restored later;
step 2.2.4: at the end of each simulation, traversing all neurons in the enhanced or inhibited state, and exiting a state if it has been maintained for a full time window.
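The state check of step 2.2.2 can be sketched directly from formulas 1 and 2 (θ_P and θ_D are the enhancement and inhibition thresholds; the concrete values below are illustrative):

```python
def induction_state(cc, L, theta_P, theta_D):
    """Classify a neuron by its firing count cc within a time window of
    length L, per formulas 1 and 2. Assumes 0 < theta_D < theta_P <= L."""
    if theta_P <= cc <= L:
        return "enhanced"    # formula 1: cc in [theta_P, L]
    if 0 < cc <= theta_D:
        return "inhibited"   # formula 2: cc in (0, theta_D]
    return "normal"          # neither formula applies

state = induction_state(9, L=10, theta_P=8, theta_D=2)
```

A neuron classified as "enhanced" or "inhibited" would then receive the fixed-weight synapse treatment of step 2.2.3.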
6. The method for constructing a spiking neural network for feature extraction based on frequency induction according to claim 5, wherein for an enhanced neuron, exiting the enhanced state disconnects the temporarily established synapses; for an inhibited neuron, a long-term attenuation effect is additionally applied, with the attenuation probability P calculated according to formula 3:
for each synapse, the synapse is disconnected with probability P; if it is not disconnected, its weight is multiplied by an attenuation factor α with probability P; if it is neither disconnected nor attenuated, it is restored to its original value from before the neuron entered the inhibited state; in formula 3, ρ is a probability constant with ρ < 1, and Dc is a counter of how many consecutive times the neuron has entered the inhibited state.
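The per-synapse logic of claim 6 can be sketched as below. Formula 3 is not reproduced in this text, so P is passed in rather than computed; the function name and the dict-based synapse layout are illustrative assumptions:

```python
import random

def long_term_decay(weights, originals, P, alpha, rng=random.Random(0)):
    """Long-term attenuation for an inhibited neuron's synapses (claim 6).
    For each synapse: disconnect with probability P; otherwise attenuate
    (multiply by alpha) with probability P; otherwise restore the stored
    pre-inhibition weight. `weights` maps synapse ids to current weights,
    `originals` holds the values saved before the inhibited state."""
    surviving = {}
    for sid, w in weights.items():
        if rng.random() < P:
            continue                         # disconnected: synapse dropped
        if rng.random() < P:
            surviving[sid] = w * alpha       # attenuated
        else:
            surviving[sid] = originals[sid]  # restored to pre-inhibition value
    return surviving
```

With P = 0 every synapse is restored, and with P = 1 every synapse is disconnected, matching the limiting cases of the claim.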
7. The method for constructing a spiking neural network for feature extraction based on frequency induction according to claim 3, wherein the step 2.3 comprises the following steps:
step 2.3.1: for every two adjacent layers, establishing a coefficient matrix A, where A_ij indicates the importance of growing a synapse from neuron i in the previous layer to neuron j in the next layer; if a synapse already exists between i and j, A_ij remains 0;
step 2.3.2: after each simulation is finished, traversing the neuron pairs between two adjacent layers, and updating the coefficient matrix according to a formula 4:
wherein λ_HEBB is the attenuation coefficient, ΔA_max is the maximum amount by which a coefficient can increase in one update, t_j^m and t_i^n are respectively the m-th firing time of neuron j and the n-th firing time of neuron i in the simulation, th is a statistical time threshold, the summation term counts the firing events in which spiking neuron j fires within th before spiking neuron i, A_ij^(k) is the updated coefficient after the k-th simulation, and A_ij^(k-1) is the coefficient before the update;
step 2.3.3: after updating the coefficient matrix, setting a growth threshold θ_HEBB; if A_ij^(k) is greater than θ_HEBB, neuron i grows a synapse to neuron j.
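The bookkeeping of steps 2.3.1–2.3.3 can be sketched as follows. Formula 4 is not reproduced in this text, so the update below (exponential decay of A_ij plus a capped increment proportional to the number of near-coincident firing pairs within th) is an assumed instantiation, and the Hebbian pre-before-post convention plus all constants are assumptions:

```python
def update_growth_coeff(A, spikes_pre, spikes_post, has_synapse,
                        lam=0.9, dA_max=0.2, th=5.0, theta_hebb=1.0):
    """One post-simulation pass of the cumulative-growth algorithm.
    A[i][j] is the growth coefficient; spikes_pre[i] / spikes_post[j] are
    firing-time lists; has_synapse[i][j] marks existing synapses, whose
    coefficient stays 0. Returns the (i, j) pairs that cross theta_hebb."""
    grown = []
    for i, t_i in enumerate(spikes_pre):
        for j, t_j in enumerate(spikes_post):
            if has_synapse[i][j]:
                A[i][j] = 0.0
                continue
            # Count firing pairs where j fires within th after i
            # (assumed Hebbian convention; formula 4 is not shown).
            count = sum(1 for tn in t_i for tm in t_j if 0.0 <= tm - tn < th)
            # Decay old coefficient, add an increment capped at dA_max
            # (0.05 per pair is an assumed scale).
            A[i][j] = lam * A[i][j] + min(dA_max, 0.05 * count)
            if A[i][j] > theta_hebb:
                grown.append((i, j))  # neuron i grows a synapse to neuron j
    return grown
```

Because the increment is capped and the old value decays by λ, a synapse only grows after the same firing pattern repeats across several simulations, matching the "long-term process" constraint of step 2.3.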
8. The method for constructing a spiking neural network for feature extraction based on frequency induction according to claim 3, wherein the step 2.4 comprises the following steps:
2.4.1: after the first simulation is finished, scanning all synapses, and for each firing of the pre- and post-synaptic neurons calculating Δt according to formula 5:
Δt = t_pre − t_post + delay  (formula 5)
wherein t_pre and t_post respectively represent the firing times of the pre-synaptic and post-synaptic neurons, and delay is the synaptic delay;
2.4.2: the synaptic weight is modified according to equation 6:
wherein w_max is the upper limit of the weight, λ_STDP is the learning rate, μ_+ and μ_− are respectively the weight-dependence coefficients during weight increase and decay, α_STDP is the asymmetry factor, K_+ and K_− are respectively the time convergence coefficients of weight decay and increase, e is the natural constant, τ_− and τ_+ are respectively the time scaling factors for weight increase and decay, and w′ and w are respectively the weights after and before the update.
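Since formula 6 itself is not reproduced in this text, the sketch below uses the standard pair-based STDP rule with weight-dependent (soft-bound) scaling as an assumed instantiation, reusing the claim's parameter names where possible (`lam` for λ_STDP, `alpha` for α_STDP); the default values are illustrative:

```python
import math

def stdp_update(w, dt, w_max=1.0, lam=0.01, mu_plus=1.0, mu_minus=1.0,
                alpha=1.0, tau_plus=20.0, tau_minus=20.0):
    """One STDP weight update per step 2.4.2, with dt from formula 5:
    dt = t_pre - t_post + delay.
    dt < 0: pre fired before post (causal)      -> potentiation;
    dt > 0: pre fired after post (anti-causal)  -> depression.
    The shorter the interval, the larger the change (exponential decay)."""
    if dt < 0:
        dw = lam * (1.0 - w / w_max) ** mu_plus * math.exp(dt / tau_plus)
        w_new = w + w_max * dw
    else:
        dw = lam * alpha * (w / w_max) ** mu_minus * math.exp(-dt / tau_minus)
        w_new = w - w_max * dw
    return min(max(w_new, 0.0), w_max)  # clamp to [0, w_max]
```

The soft-bound factors (1 − w/w_max)^μ+ and (w/w_max)^μ− keep the weight inside [0, w_max] while preserving the sign and interval-dependence behavior the claim describes.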
9. The method for constructing a spiking neural network for feature extraction based on frequency induction according to claim 1 or 3, wherein the features are pulse-sequence features, which are either used directly as input for training another spiking neural network or converted into numerical features by mathematical methods for learning by other, non-spiking network models.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011246943.4A CN112232494A (en) | 2020-11-10 | 2020-11-10 | Method for constructing pulse neural network for feature extraction based on frequency induction |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112232494A true CN112232494A (en) | 2021-01-15 |
Family
ID=74122098
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011246943.4A Pending CN112232494A (en) | 2020-11-10 | 2020-11-10 | Method for constructing pulse neural network for feature extraction based on frequency induction |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112232494A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114220089A (en) * | 2021-11-29 | 2022-03-22 | 北京理工大学 | Method for carrying out pattern recognition based on segmented progressive pulse neural network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111858989B (en) | Pulse convolution neural network image classification method based on attention mechanism | |
Liang et al. | Stacked denoising autoencoder and dropout together to prevent overfitting in deep neural network | |
Hinton et al. | Improving neural networks by preventing co-adaptation of feature detectors | |
CN112633497A (en) | Convolutional pulse neural network training method based on reweighted membrane voltage | |
CN108304912B (en) | System and method for realizing pulse neural network supervised learning by using inhibition signal | |
CN113657561B (en) | Semi-supervised night image classification method based on multi-task decoupling learning | |
CN111639754A (en) | Neural network construction, training and recognition method and system, and storage medium | |
CN113094357A (en) | Traffic missing data completion method based on space-time attention mechanism | |
CN112906828A (en) | Image classification method based on time domain coding and impulse neural network | |
CN109635938B (en) | Weight quantization method for autonomous learning impulse neural network | |
CN114266351A (en) | Pulse neural network training method and system based on unsupervised learning time coding | |
Chandra et al. | Encoding subcomponents in cooperative co-evolutionary recurrent neural networks | |
CN116796207A (en) | Self-organizing mapping clustering method based on impulse neural network | |
CN112232494A (en) | Method for constructing pulse neural network for feature extraction based on frequency induction | |
CN113553918B (en) | Machine ticket issuing character recognition method based on pulse active learning | |
Dong et al. | Training generative adversarial networks with binary neurons by end-to-end backpropagation | |
CN109101984B (en) | Image identification method and device based on convolutional neural network | |
CN115984942A (en) | Facial expression recognition method based on impulse neural network, storage medium and equipment | |
CN113628615B (en) | Voice recognition method and device, electronic equipment and storage medium | |
CN112288078B (en) | Self-learning, small sample learning and migration learning method and system based on impulse neural network | |
CN115456173A (en) | Generalized artificial neural network unsupervised local learning method, system and application | |
CN113948067B (en) | Voice countercheck sample repairing method with hearing high fidelity characteristic | |
CN115546556A (en) | Training method of pulse neural network for image classification | |
CN114118378A (en) | Hardware-friendly STDP learning method and system based on threshold self-adaptive neurons | |
CN111340329B (en) | Actor evaluation method and device and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210115 ||