CN118228789A - Efficiently quantizable long short-term memory network FPGA hardware acceleration and abnormal electroencephalogram signal detection method and system - Google Patents


Info

Publication number: CN118228789A
Application number: CN202410548521.4A
Authority: CN (China)
Prior art keywords: long, short, quantized, memory network, time memory
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status, dates, or assignees listed)
Inventors: 周卫东, 刘国洋, 边栋, 于治楼
Current and original assignee: Shandong University
Application filed by Shandong University

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention relates to an efficiently quantizable long short-term memory (LSTM) network FPGA hardware acceleration and abnormal electroencephalogram (EEG) signal detection method and system, comprising the following steps: the efficiently quantizable LSTM network is quantized; the quantized network is deployed in the programmable logic PL of the FPGA hardware accelerator by compiling it into Verilog code and generating an acceleration IP core; the input signal, quantization biases and weights are transmitted over an AXI bus; after the quantized LSTM network completes its computation, the output data is transmitted back to the ARM processor system PS of the FPGA hardware accelerator over the same AXI bus. The invention markedly reduces the memory footprint of the LSTM network and its operating power consumption, facilitates its deployment and efficient operation on low-power edge hardware devices, and promotes real-time processing and response.

Description

Efficiently quantizable long short-term memory network FPGA hardware acceleration and abnormal electroencephalogram signal detection method and system
Technical Field
The invention relates to an FPGA (field programmable gate array) hardware acceleration method for an efficiently quantizable long short-term memory network, and to an abnormal electroencephalogram signal detection method and system based on it, belonging to the technical fields of neural networks, artificial intelligence and FPGAs.
Background
Recurrent neural networks (Recurrent Neural Network, RNN), and in particular long short-term memory networks built from long short-term memory units (Long Short-Term Memory, LSTM), are neural networks designed specifically for processing time-series data and trainable end to end. They perform well and are widely applied in time-series analysis, natural language processing and related fields, and have also become a research hotspot in electroencephalogram (EEG) signal analysis.
Compared with traditional time-series modeling methods, an LSTM network can better integrate and learn discriminative EEG features directly from the raw EEG signal. However, an LSTM network has a large number of parameters, usually stored as 32-bit or 16-bit floating-point numbers, so its memory footprint is too large for deployment on mobile phones or other low-power edge computing hardware. The main current approach to this problem is to quantize the model parameters into low-bit-width integers, which facilitates FPGA hardware deployment. However, existing low-bit-width quantization methods generally require training-time calibration during quantization, or data calibration after quantization, to preserve the model's original accuracy; the workflow is cumbersome and limits the deployment and application of LSTM networks on FPGAs. To address these problems, a novel efficiently quantizable LSTM network and an FPGA hardware acceleration method for it are proposed and applied to abnormal EEG signal detection.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides an FPGA hardware acceleration method for an efficiently quantizable long short-term memory network. The invention provides a long short-term memory unit with a normalized cell state (cell state) output, which simplifies quantization of the LSTM network and avoids the data calibration step of conventional post-training quantization. It also provides a quantization scheme for the unit's nonlinear activation function based on a lookup table of a piecewise Sigmoid approximation: approximating the nonlinear activation function with a piecewise function allows quantization without loss of accuracy even at low quantization bit widths. An efficiently quantizable LSTM network FPGA hardware accelerator is also provided.
The invention further provides an abnormal electroencephalogram signal detection method and system based on this efficiently quantizable LSTM network FPGA hardware accelerator.
Term interpretation:
1. Quantization: the process of mapping a set of parameters to another value range through a mathematical transformation, usually a linear mapping. The original parameter values are typically floating-point numbers, while the quantized values are typically integers.
3. Training: a group of data is input into the neural network, the resulting output is compared with the labels of that data to compute an error, the gradient of each parameter is obtained through the back-propagation algorithm, and the parameters are updated accordingly.
4. Long short-term memory unit: a special form of recurrent neural network cell. Each unit contains three gates: a forget gate (deciding which information to discard), an input gate (deciding which new information to write into the cell state) and an output gate (deciding the next hidden state). This gating structure lets state be maintained and passed between units over long distances, so long-range dependencies can be learned and memorized effectively.
5. Long short-term memory network: a neural network formed by connecting several long short-term memory units in sequence, used for processing, predicting and classifying time-series data. Through its gating mechanism the network maintains and updates information in each unit, allowing information to be preserved and transmitted over long spans of the time series.
6. EEG feature extraction based on a convolutional neural network: extracting features of the EEG signal with a convolutional neural network (CNN). The CNN contains one or more convolutional layers; it can take the raw EEG signal as input and automatically extract EEG features through its convolutional, activation and pooling layers.
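The linear quantization mapping described in term 1 can be illustrated with a minimal Python sketch (the 8-bit width and the scale value below are illustrative choices, not the patent's exact parameters):

```python
def quantize(x, scale, bit_width=8):
    """Linearly map a floating-point value to a signed integer of the given bit width."""
    m = 2 ** (bit_width - 1) - 1          # largest representable magnitude, e.g. 127
    q = round(x / scale)
    return max(-m, min(m, q))             # clip to the representable range

def dequantize(q, scale):
    """Map the quantized integer back to an approximation of the original value."""
    return q * scale

scale = 0.02                               # illustrative quantization scale
q = quantize(0.5, scale)                   # -> 25
x = dequantize(q, scale)                   # close to the original 0.5
```

Values outside the representable range saturate at the clipping bound, which is why the quantization scale is chosen from the maximum parameter magnitude later in the text.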
The technical scheme of the invention is as follows:
An FPGA hardware acceleration method for an efficiently quantizable long short-term memory network runs on an FPGA hardware accelerator comprising an ARM processor system PS and a programmable logic unit PL, and comprises the following steps:
the efficiently quantizable LSTM network consists of several sequentially connected, efficiently quantizable long short-term memory units;
the trained, efficiently quantizable LSTM network with floating-point parameters is quantized to obtain the quantized LSTM network;
the quantized LSTM network is compiled into Verilog code, an acceleration IP core is generated, and the IP core is deployed in the programmable logic unit PL;
the ARM processor system PS is responsible for data preparation and preprocessing, and also for the softmax mapping operation;
the configuration parameters of each unit of the quantized LSTM network, including the number of input/output channels, the feature-vector dimension, the number of neurons and the quantization coefficients, are first transmitted from the PS side to the programmable logic unit PL over an AXI-Lite (Advanced eXtensible Interface Lite) bus;
the input signals, quantization biases and weights are transmitted from the PS side to the PL over an AXI (Advanced eXtensible Interface) bus; after the quantized long short-term memory unit finishes its computation on the PL side, the output data is transmitted back to the PS side over the same AXI bus;
the next quantized long short-term memory unit is then configured and a new round of feature-vector computation begins; this process repeats until the computation of the whole quantized LSTM network is complete;
the computation result is output.
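The PS-side control loop described above can be sketched in Python as a pure-software mock of the handshake (the class, method names and the placeholder arithmetic inside `compute` are hypothetical illustrations, not the actual IP-core interface):

```python
class MockPL:
    """Stands in for the quantized LSTM IP core in the programmable logic (PL)."""

    def configure(self, n_inputs, n_neurons, q_coeffs):
        # In hardware: unit configuration sent over the AXI-Lite bus.
        self.n_neurons = n_neurons

    def compute(self, x_q, weights_q, bias_q):
        # In hardware: inputs/weights sent over the AXI bus, results read back
        # over the same bus. Placeholder arithmetic stands in for the real
        # quantized cell update.
        return [(sum(x_q) + b) % 128 for b in bias_q[: self.n_neurons]]


def run_network(unit_configs, features):
    """Configure and run each quantized LSTM unit in sequence (the PS-side loop)."""
    pl = MockPL()
    out = features
    for cfg in unit_configs:
        pl.configure(cfg["n_inputs"], cfg["n_neurons"], cfg["q_coeffs"])
        out = pl.compute(out, cfg["weights"], cfg["biases"])
    return out  # final result, read back and output by the PS


units = [
    {"n_inputs": 4, "n_neurons": 3, "q_coeffs": [1], "weights": [], "biases": [1, 2, 3]},
    {"n_inputs": 3, "n_neurons": 2, "q_coeffs": [1], "weights": [], "biases": [0, 5]},
]
result = run_network(units, [10, 20, 30, 40])
```

The point of the sketch is the repeated configure/compute/read-back cycle: one pass of the loop corresponds to one quantized unit being parameterized over AXI-Lite, fed over AXI, and drained over the same AXI bus.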
According to the invention, the piecewise activation function of the efficiently quantizable long short-term memory unit is
σ_a(x) = 0, x ≤ -T_a; σ_a(x) = x/(2T_a) + 1/2, -T_a < x < T_a; σ_a(x) = 1, x ≥ T_a (1)
wherein T_a is an adjustable Sigmoid-function approximation parameter;
Further preferably, T_a = 2.5;
According to the invention, the cell state parameter in the efficiently quantizable long short-term memory unit is calculated as
c_t = σ_a(f_t ⊙ c_{t-1} + i_t ⊙ g_t) (2)
wherein ⊙ is the Hadamard product, c_t is the cell state at time step t of the efficiently quantizable long short-term memory unit, and i_t, f_t, g_t are its input-gate value, forget-gate value and cell candidate value at time step t.
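A minimal floating-point sketch of this cell-state update, assuming the piecewise function of formula (1) (a hard sigmoid saturating at ±T_a) is the activation; the vector values are illustrative:

```python
def hard_sigmoid(x, t_a=2.5):
    """Piecewise approximation of the Sigmoid: 0 below -t_a, 1 above t_a,
    linear (slope 1/(2*t_a)) in between."""
    if x <= -t_a:
        return 0.0
    if x >= t_a:
        return 1.0
    return x / (2 * t_a) + 0.5

def cell_update(f_t, i_t, g_t, c_prev, t_a=2.5):
    """Normalized cell state: the gated sum is passed through the piecewise
    activation, so every component of c_t stays in [0, 1]."""
    return [hard_sigmoid(f * c + i * g, t_a)
            for f, i, g, c in zip(f_t, i_t, g_t, c_prev)]

c_t = cell_update(f_t=[0.9], i_t=[0.5], g_t=[1.0], c_prev=[0.8])  # ~ [0.744]
```

Because the output range is fixed to [0, 1] regardless of the data, the quantization scale of the cell state is known in advance, which is what removes the need for post-quantization data calibration.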
According to the invention, the parameter quantization process for the trained LSTM network with floating-point parameters comprises the following steps:
(1) Initializing the quantization parameters, including:
letting the overall weight matrix W = [W_f; W_i; W_o; W_g];
letting the overall recurrent weight matrix R = [R_f; R_i; R_o; R_g];
initializing the weight quantization bit width B_WR = 8;
initializing the piecewise-function lookup-table quantization bit width B_LUT = 4;
initializing the fixed-point data width B_Fix = 24;
initializing the weight quantization factor M_WR;
initializing the weight-matrix quantization scale Q_W = max(|W|)/M_WR;
initializing the recurrent-weight-matrix quantization scale Q_R = max(|R|)/M_WR, wherein max(·) is the maximum-value function and |·| the absolute value;
initializing the input quantization scale Q_X = 1/M_WR;
initializing the hidden-unit quantization scale, the input-weight quantization scale, the input-recurrent-weight quantization scale and the candidate-unit quantization scale.
(2) Generating the quantization lookup table, comprising:
First, with -T_a as the start value and T_a as the end value, the increment (i.e., step size) is set to 2T_a/32, generating an arithmetic sequence I_d; then the quantization lookup table L_d of the piecewise function σ_a is calculated over I_d.
(3) Quantizing the weight matrices, comprising:
the forget-gate weight matrix is quantized as shown in formula (3):
the input-gate weight matrix is quantized as shown in formula (4):
the output-gate weight matrix is quantized as shown in formula (5):
the cell candidate weight matrix is quantized as shown in formula (6):
the forget-gate recurrent weight matrix is quantized as shown in formula (7):
the input-gate recurrent weight matrix is quantized as shown in formula (8):
the output-gate recurrent weight matrix is quantized as shown in formula (9):
the cell candidate recurrent weight matrix is quantized as shown in formula (10):
the forget-gate bias is quantized as shown in formula (11):
the input-gate bias is quantized as shown in formula (12):
the output-gate bias is quantized as shown in formula (13):
the cell candidate bias is quantized as shown in formula (14):
(4) Calculating the quantized gate and state parameter values, comprising:
calculating the quantized forget-gate value:
calculating the quantized input-gate value:
calculating the quantized output-gate value:
calculating the quantized cell candidate value:
calculating the quantized cell state parameter value:
calculating the quantized hidden state parameter value:
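Step (2) of the procedure above (lookup-table generation) can be reconstructed in Python. The scaling of the table entries to the integer range 0–127 is an assumption, but it exactly reproduces the table L_d listed in Embodiment 2:

```python
def make_lut(t_a=2.5, n_segments=32, m=127):
    """Quantization lookup table for the piecewise Sigmoid approximation.

    I_d is an arithmetic sequence of n_segments + 1 points from -t_a to t_a;
    each entry is sigma_a(x) scaled to the signed-integer maximum m.
    (The 0..127 scaling is a reconstruction, not stated explicitly in the text.)
    """
    step = 2 * t_a / n_segments
    lut = []
    for k in range(n_segments + 1):
        x = -t_a + k * step                               # I_d entry
        y = min(1.0, max(0.0, x / (2 * t_a) + 0.5))       # piecewise sigma_a(x)
        lut.append(round(m * y))
    return lut

L_d = make_lut()  # 33 integer entries from 0 to 127
```

Because σ_a is linear between the saturation points, the table is nearly an arithmetic sequence itself (steps of 4, with occasional steps of 3 from rounding), matching the values given in Embodiment 2.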
An efficiently quantizable LSTM network FPGA hardware accelerator comprises an ARM processor system PS and a programmable logic unit PL connected over AXI-Lite and AXI (Advanced eXtensible Interface) buses, and implements the FPGA hardware acceleration method for the efficiently quantizable LSTM network described above.
An abnormal electroencephalogram signal detection method based on the efficiently quantizable LSTM network FPGA hardware accelerator comprises the following steps:
a data acquisition module consisting of an EEG amplifier and an A/D converter acquires the EEG signal to be examined and stores it in a computer;
the efficiently quantizable LSTM network is trained;
the trained network is deployed on the efficiently quantizable LSTM network FPGA hardware accelerator;
features are extracted from the EEG signal to be examined;
the features are input into the efficiently quantizable LSTM network FPGA hardware accelerator to obtain output values;
the accelerator's output values are inverse-quantized to obtain dequantized floating-point features, and the examination result (abnormal EEG or normal EEG) is output after softmax mapping;
if the result is abnormal EEG, an alarm is raised through the alarm module.
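The final two steps (inverse quantization and softmax mapping on the PS side) can be sketched as follows; the scale value Q_X = 1/127 and the class ordering (index 1 = abnormal) are illustrative assumptions:

```python
import math

def dequantize_features(h_q, q_x):
    """Inverse quantization: multiply the integer outputs by the scale."""
    return [q_x * h for h in h_q]

def softmax(z):
    """Numerically stable softmax mapping."""
    m = max(z)
    e = [math.exp(v - m) for v in z]   # subtract max to avoid overflow
    s = sum(e)
    return [v / s for v in e]

def detect(output_q, q_x=1 / 127):
    """Map accelerator outputs to the examination result (class order assumed)."""
    probs = softmax(dequantize_features(output_q, q_x))
    return "abnormal EEG" if probs[1] > probs[0] else "normal EEG"

print(detect([10, 120]))  # prints "abnormal EEG"
```

Keeping softmax on the PS side avoids implementing exponentials in the programmable logic; only integer arithmetic and table lookups remain in the PL.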
According to the invention, the training process of the efficiently quantizable LSTM network comprises:
first, EEG data for training are acquired by a data acquisition module consisting of an EEG amplifier and an A/D converter;
second, according to the set number of units in the network, the floating-point parameters and weight matrices of every unit are initialized, including:
the cell state parameter c_0 at the initial time step t = 0, initialized to 0;
the hidden state parameter h_0 at the initial time step t = 0, initialized to 0;
the forget-gate weight matrix W_f, input-gate weight matrix W_i, output-gate weight matrix W_o and cell candidate weight matrix W_g, initialized to random numbers;
the forget-gate recurrent weight matrix R_f, input-gate recurrent weight matrix R_i, output-gate recurrent weight matrix R_o and cell candidate recurrent weight matrix R_g, initialized to random numbers;
the forget-gate bias, input-gate bias, output-gate bias and cell candidate bias, all initialized to 0.
Then, the gate and state parameter values are calculated from these parameters and weight matrices, including:
For the data u_t of the t-th time step of the extracted EEG features, the floating-point value of each gate is calculated as follows:
the forget-gate value f_t is calculated as shown in formula (21):
f_t = σ_a(W_f u_t + R_f h_{t-1} + b_f) (21)
wherein h_{t-1} is the hidden state parameter of time step t-1;
the input-gate value i_t is calculated as shown in formula (22):
i_t = σ_a(W_i u_t + R_i h_{t-1} + b_i) (22)
the output-gate value o_t is calculated as shown in formula (23):
o_t = σ_a(W_o u_t + R_o h_{t-1} + b_o) (23)
the cell candidate value g_t is calculated as shown in formula (24):
g_t = σ_a(W_g u_t + R_g h_{t-1} + b_g) (24)
the cell state parameter c_t is calculated as shown in formula (25):
c_t = σ_a(f_t ⊙ c_{t-1} + i_t ⊙ g_t) (25)
the hidden state parameter h_t is calculated as shown in formula (26):
h_t = o_t ⊙ c_t (26)
Then the loss function of the efficiently quantizable LSTM network, formed by the sequentially connected efficiently quantizable long short-term memory units, is calculated, and the floating-point parameters are iteratively updated from the loss value using the back-propagation algorithm. Once the maximum number of iterations is reached, the network stops the iterative update, and the floating-point values of all parameters and weight matrices are fixed and stored.
According to the invention, the dequantized floating-point features are calculated as shown in formula (27):
h_t = Q_X · ĥ_t (27)
wherein ĥ_t is the quantized integer feature output by the LSTM network at the t-th time step and Q_X is the input quantization scale.
According to a preferred embodiment of the invention, the loss value is calculated as follows:
the current loss value is computed from the output hidden-layer feature h_t through a loss function E, defined as shown in formula (28):
wherein θ represents all learnable parameters of the efficiently quantizable LSTM network, h^(i)_{t,j} denotes the j-th hidden-layer feature value at time step t of the i-th sample, and m is the number of samples in the back-propagation optimization;
According to a preferred embodiment of the invention, the parameter update comprises:
updating all learnable parameters of the efficiently quantizable LSTM network from the calculated loss value according to formula (29):
θ_{v+1} = θ_v − μ ∂E(θ_v)/∂θ_v (29)
wherein μ is the learning rate, θ_v represents all learnable parameters at the v-th iteration, and ∂E(θ_v)/∂θ_v is the gradient of the loss function E(θ_v) with respect to θ_v. If v = N_max (with N_max = 200 the set maximum number of iterations), the network stops the iterative update and the weight-matrix parameter values are fixed; otherwise v is incremented by 1, the gate and state parameter values are computed again, and the iterative update continues.
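Formula (29) is plain gradient descent; it can be sketched on a toy scalar objective (the quadratic loss E(θ) = (θ − 3)² and the learning rate below are illustrative only):

```python
def gradient_descent(grad, theta0, mu=0.1, n_max=200):
    """Iterate theta_{v+1} = theta_v - mu * dE/dtheta, stopping after n_max steps."""
    theta = theta0
    for _ in range(n_max):
        theta = theta - mu * grad(theta)
    return theta

# E(theta) = (theta - 3)^2 has gradient 2 * (theta - 3); with mu = 0.1 the
# error shrinks by a factor 0.8 per step, so 200 iterations converge to ~3.
theta_star = gradient_descent(lambda t: 2 * (t - 3.0), theta0=0.0)
```

In the patent's setting θ is the full set of weight matrices and biases and the gradient comes from back-propagation through time, but the update rule per iteration is exactly this one.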
According to the invention, the EEG feature extraction process comprises:
a single-layer convolutional neural network is used for EEG feature extraction; it contains 8 single-channel one-dimensional convolution kernels of length 5, and maps the raw EEG signal of each time step into an EEG feature vector of dimension 1024.
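A plain-Python sketch of this feature extractor. The kernel weights and the per-time-step window length (132 samples, chosen so that 8 kernels × 128 valid outputs = 1024 features) are illustrative assumptions; the source specifies only the kernel count (8), kernel length (5) and output dimension (1024):

```python
def conv1d(signal, kernel):
    """Valid one-dimensional convolution (no padding, stride 1)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

# 8 single-channel 1-D kernels of length 5, as in the source.
kernels = [[0.1 * (n + 1)] * 5 for n in range(8)]   # illustrative weights
window = [1.0] * 132                                # illustrative EEG window

# Concatenate the 8 feature maps into one 1024-dimensional feature vector.
features = [v for ker in kernels for v in conv1d(window, ker)]
```

Each kernel of length 5 applied to a length-L window yields L − 4 values, so any (kernel count, window length) pair with 8·(L − 4) = 1024 would match the stated dimension.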
An abnormal electroencephalogram signal detection system based on the efficiently quantizable LSTM network FPGA hardware accelerator comprises:
a data acquisition module configured to collect the EEG signal to be examined through an EEG amplifier and an A/D converter;
a feature extraction module configured to extract features from the EEG signal, mapping the raw signal into feature vectors of a given dimension;
an abnormal EEG detection module configured to input the feature vectors into the efficiently quantizable LSTM network FPGA hardware accelerator, inverse-quantize the accelerator's output values to obtain dequantized floating-point features, and output the examination result (abnormal EEG or normal EEG) after softmax mapping;
an EEG abnormality alarm module configured to raise an alarm for detected abnormal EEG according to the class label output by the detection module.
The beneficial effects of the invention are as follows:
Quantizing the floating-point parameters of the efficiently quantizable LSTM network into low-bit-width signed integers markedly reduces the network's memory footprint and operating power consumption, facilitates its deployment and efficient operation on low-power edge hardware devices, and promotes real-time processing and response. In addition, the proposed network completes quantization without the data calibration step of conventional quantization methods and without any extra data, enhancing the flexibility of LSTM network quantization.
Drawings
FIG. 1 is a schematic diagram of the efficiently quantizable LSTM network FPGA hardware accelerator;
FIG. 2 is a schematic diagram of the efficiently quantizable long short-term memory unit with floating-point parameters according to the invention;
FIG. 3 is a schematic diagram of the efficiently quantizable long short-term memory unit after parameter quantization according to the invention;
FIG. 4 is a schematic diagram of the efficiently quantizable LSTM network consisting of several sequentially connected efficiently quantizable long short-term memory units;
FIG. 5 is a schematic flow diagram of the training process of the LSTM network of the invention;
FIG. 6 is a schematic diagram of ten-fold cross-validation accuracy at different quantization bit widths;
FIG. 7 is a schematic diagram of the abnormal EEG signal detection system based on the efficiently quantizable LSTM network FPGA hardware accelerator.
Detailed Description
The invention is further illustrated below with reference to the drawings and examples, to which it is not limited.
Example 1
An FPGA hardware acceleration method for an efficiently quantizable long short-term memory network runs on an FPGA hardware accelerator, as shown in FIG. 1. The accelerator uses a Xilinx Zynq ZedBoard, comprising the ARM processor system PS of the Zynq-7000 SoC and a 7-series programmable logic unit PL. The method comprises the following steps:
the efficiently quantizable LSTM network consists of several sequentially connected, efficiently quantizable long short-term memory units;
the trained network with floating-point parameters is quantized to obtain the quantized LSTM network;
the quantized network is compiled into Verilog code, an acceleration IP core is generated, and the IP core is deployed in the programmable logic unit PL;
the ARM processor system PS is responsible for data preparation and preprocessing, including receiving input data and loading weights and configuration; it is also responsible for the softmax mapping operation;
the configuration parameters of each unit of the quantized network, including the number of input/output channels, the feature-vector dimension, the number of neurons and the quantization coefficients, are transmitted from the PS side to the PL over an AXI-Lite bus;
the input signals, quantization biases and weights are transmitted from the PS side to the PL over an AXI bus; after the quantized long short-term memory unit finishes its computation on the PL side, the output data is transmitted back to the PS side over the same AXI bus;
the next quantized unit is then configured and a new round of feature-vector computation begins; this process repeats until the computation of the whole quantized network is complete;
the computation result is output.
The quantized long short-term memory units share one quantization lookup table, which is computed once and stored in the block RAM (BRAM) of the programmable logic PL; all weight data are stored in the DDR3 memory of the ARM processor system PS. Table 1 details the resource utilization of the designed LSTM network FPGA hardware acceleration method running on the Xilinx Zynq ZedBoard. The on-chip power consumption of the whole hardware acceleration system is 1.778 W, of which the ARM processor system PS accounts for 1.542 W and the programmable logic unit PL for 0.236 W.
TABLE 1
Example 2
The FPGA hardware acceleration method for the efficiently quantizable long short-term memory network according to Embodiment 1 is characterized in that:
the piecewise activation function of the efficiently quantizable long short-term memory unit is
σ_a(x) = 0, x ≤ -T_a; σ_a(x) = x/(2T_a) + 1/2, -T_a < x < T_a; σ_a(x) = 1, x ≥ T_a (1)
wherein T_a is an adjustable Sigmoid-function approximation parameter; T_a = 2.5;
the cell state parameter of the efficiently quantizable long short-term memory unit is calculated as
c_t = σ_a(f_t ⊙ c_{t-1} + i_t ⊙ g_t) (2)
wherein ⊙ is the Hadamard product, c_t is the cell state at time step t, and i_t, f_t, g_t are the input-gate value, forget-gate value and cell candidate value at time step t. The structure of the efficiently quantizable long short-term memory unit with floating-point parameters is shown in FIG. 2, its structure after quantization to low-bit-width integer parameters is shown in FIG. 3, and the efficiently quantizable LSTM network formed by several sequentially connected units is shown in FIG. 4.
The efficiently quantizable LSTM network differs from the traditional LSTM network in that:
the piecewise activation function of formula (1) requires less computation and is easier to quantize than the Sigmoid activation of the traditional network;
compared with the traditional network, the cell state computation of formula (2) adds an activation through the piecewise function σ_a.
The parameter quantization process of the trained efficiently quantizable LSTM network comprises the following steps:
(1) Initializing the quantization parameters, including:
letting the overall weight matrix W = [W_f; W_i; W_o; W_g];
letting the overall recurrent weight matrix R = [R_f; R_i; R_o; R_g];
initializing the weight quantization bit width B_WR = 8;
initializing the piecewise-function lookup-table quantization bit width B_LUT = 4;
initializing the fixed-point data width B_Fix = 24;
initializing the weight quantization factor M_WR;
initializing the weight-matrix quantization scale Q_W = max(|W|)/M_WR;
initializing the recurrent-weight-matrix quantization scale Q_R = max(|R|)/M_WR, wherein max(·) is the maximum-value function and |·| the absolute value;
initializing the input quantization scale Q_X = 1/M_WR;
initializing the hidden-unit quantization scale;
initializing the input-weight quantization scale;
initializing the input-recurrent-weight quantization scale;
initializing the candidate-unit quantization scale.
(2) Generating the quantization lookup table, comprising:
first, with -T_a as the start value and T_a as the end value, the increment (i.e., step size) is set to 2T_a/32, generating an arithmetic sequence I_d;
then the quantization lookup table L_d of the piecewise function σ_a is calculated over I_d. Here,
L_d = {0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64, 67, 71, 75, 79, 83, 87, 91, 95, 99, 103, 107, 111, 115, 119, 123, 127}
(3) Quantizing the weight matrix; comprising the following steps:
the quantized forgetting gate weight matrix is shown as formula (3):
the quantization input gate weight matrix is as shown in formula (4):
the quantization output gate weight matrix is as shown in formula (5):
the cell candidate weight matrix is quantified as shown in formula (6):
the quantized forgetting gate cyclic weight matrix is shown in formula (7):
the quantization is input into the gate cyclic weight matrix, as shown in formula (8):
The quantization output gate cycle weight matrix is as shown in formula (9):
quantifying a cell candidate cyclic weight matrix as shown in formula (10):
quantifying the forgetting gate bias as shown in equation (11):
the input gate bias is quantized as shown in equation (12):
The quantization output gate bias is as shown in equation (13):
quantifying the cell candidate bias as shown in formula (14):
(4) Calculating quantized gate control and state parameter values; comprising the following steps:
calculating a quantized forget gate gating value:
Calculating a quantized input gate control value:
Calculating a quantized output gate control value:
Calculating a quantized cell candidate unit value:
calculating quantitative cell state parameter values:
calculating quantized hidden state parameter values:
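The quantized gate computation of step (4) can be sketched as below. The integer multiply-accumulate, rescale, and look-up-table flow is an assumption (formulas (15) to (20) are not reproduced in the text), and the accumulator scale Q_acc is illustrative; the table L_d is the one listed in step (2).

```python
# Hypothetical sketch of step (4): a quantized gate value obtained by mapping
# a rescaled integer accumulator through the quantization look-up table.
T_a, B_LUT = 2.5, 4
n = 2 ** (B_LUT + 1)                     # 32 intervals over [-T_a, T_a]
# hard-sigmoid LUT as listed in the text (33 entries, 7-bit values)
L_d = [0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64,
       67, 71, 75, 79, 83, 87, 91, 95, 99, 103, 107, 111, 115, 119, 123, 127]

def lut_sigmoid(x):
    """Map a float pre-activation to a quantized gate value via the LUT."""
    idx = round((x + T_a) / (2 * T_a) * n)   # position in the sequence I_d
    idx = max(0, min(n, idx))                # clamp outside [-T_a, T_a]
    return L_d[idx]

# toy pre-activation: integer accumulator rescaled by a hypothetical scale
acc, Q_acc = 640, 1.0 / 512
gate_q = lut_sigmoid(acc * Q_acc)            # quantized gate value in [0, 127]
```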
Example 3
A long-short-time memory network FPGA hardware accelerator capable of being quantized efficiently comprises an ARM processor unit PS and a programmable logic unit PL of a Zynq-7000 SoC, connected through an advanced extensible interface (AXI-Lite) bus, and implements the long-short-time memory network FPGA hardware acceleration method capable of being quantized efficiently described above.
Example 4
An abnormal electroencephalogram signal detection method based on a long-short-term memory network FPGA hardware accelerator capable of being quantized efficiently comprises the following steps:
A data acquisition module consisting of an electroencephalogram amplifier and an A/D converter is adopted to acquire an electroencephalogram signal to be detected;
training a long-short-time memory network capable of being quantized efficiently;
The trained long-short-time memory network capable of being quantized efficiently is deployed on the FPGA hardware accelerator of the long-short-time memory network capable of being quantized efficiently, which is described in the embodiment 3;
extracting characteristics of an electroencephalogram signal to be detected;
inputting the characteristics of the electroencephalogram signals to be detected into a long-short-time memory network FPGA hardware accelerator capable of being quantized efficiently to obtain output values;
Inverse quantization is carried out on the output value of the long-short-time memory network FPGA hardware accelerator capable of being quantized efficiently to obtain the dequantized floating point number features, and the detection result (abnormal electroencephalogram or normal electroencephalogram) is output after softmax function mapping;
If the detection result is abnormal electroencephalogram, alarming is carried out through an alarm module.
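The last two detection steps above (inverse quantization per formula (27), then softmax mapping to a normal/abnormal label) can be sketched as follows; the two-class integer output, its values, and the decision rule are illustrative assumptions, not from the patent text.

```python
# Hypothetical sketch: dequantize the accelerator's integer output with the
# input scale Q_X, then map through softmax to a class label.
import math

def dequantize(q_features, q_x):
    """Formula (27)-style rescaling: integer accelerator output -> float."""
    return [q_x * q for q in q_features]

def softmax(z):
    m = max(z)                        # subtract max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

Q_X = 1.0 / 127
quantized_out = [381, -127]           # toy 2-class integer output
probs = softmax(dequantize(quantized_out, Q_X))
label = "abnormal EEG" if probs[0] > probs[1] else "normal EEG"
```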
As shown in fig. 5, the training process of the long-short-time memory network capable of being quantified efficiently includes:
firstly, acquiring electroencephalogram data for training by a data acquisition module consisting of an electroencephalogram amplifier and an A/D converter;
secondly, initializing parameters and weight matrixes of floating point formats of all units according to the set quantity of units in the long-short-time memory network capable of being quantized efficiently, wherein the parameters and weight matrixes comprise:
the cell state parameters at the initial time step t=0 are all initialized to 0;
the hidden state parameters at the initial time step t=0 are all initialized to 0;
the forgetting gate weight matrix, the input gate weight matrix, the output gate weight matrix and the cell candidate weight matrix are initialized to random numbers;
the forgetting gate cyclic weight matrix, the input gate cyclic weight matrix, the output gate cyclic weight matrix and the cell candidate cyclic weight matrix are initialized to random numbers;
the forgetting gate bias, the input gate bias, the output gate bias and the cell candidate bias are all initialized to 0;
Then, according to the parameter and the weight matrix, calculating various gating and state parameter values, including:
For the data u t of the t-th time step in the extracted electroencephalogram signal characteristics, calculating floating point parameter values of each gating according to the following formula:
Calculating a forget gate gating value f t as shown in formula (21):
wherein h t-1 is a hidden state parameter of the t-1 time step;
the input gate-control value i t is calculated as shown in equation (22):
the output gate gating value o t is calculated as shown in equation (23):
calculating a cell candidate unit value g t as shown in formula (24):
Calculating a cell state parameter c t as shown in formula (25):
Calculating a hidden state parameter h t as shown in formula (26):
h_t = o_t ⊙ c_t (26)
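A minimal floating-point sketch of formulas (21) to (26) for a single-feature cell. The gate formulas are the standard LSTM forms inferred from the defined W, R and b parameters (their formula images are not reproduced in the text); note that, per formula (26), the hidden state here is o_t ⊙ c_t rather than the conventional o_t ⊙ tanh(c_t). The scalar parameter values are illustrative.

```python
# Sketch of the floating-point gate computations (21)-(26), scalar case.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(u_t, h_prev, c_prev, p):
    """Single-feature cell; p holds scalar weights W*, R* and biases b* per gate."""
    f = sigmoid(p["Wf"] * u_t + p["Rf"] * h_prev + p["bf"])    # (21) forget gate
    i = sigmoid(p["Wi"] * u_t + p["Ri"] * h_prev + p["bi"])    # (22) input gate
    o = sigmoid(p["Wo"] * u_t + p["Ro"] * h_prev + p["bo"])    # (23) output gate
    g = math.tanh(p["Wg"] * u_t + p["Rg"] * h_prev + p["bg"])  # (24) candidate
    c = f * c_prev + i * g                                     # (25) cell state
    h = o * c                                                  # (26), per the text
    return h, c

params = {k: 0.5 for k in ("Wf", "Rf", "bf", "Wi", "Ri", "bi",
                           "Wo", "Ro", "bo", "Wg", "Rg", "bg")}
h, c = lstm_step(1.0, 0.0, 0.0, params)
```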
Then, the loss function of the efficiently quantizable long-short-time memory network, formed by a plurality of sequentially connected efficiently quantizable long-short-time memory units, is calculated, and the floating point parameters are iteratively updated according to the loss value in combination with the back-propagation algorithm; after the maximum number of iterations is reached, the efficiently quantizable long-short-time memory network stops the iterative updating, and the floating point values of all parameters and weight matrices are fixed and stored.
Calculating the floating point number characteristics after inverse quantization; as shown in formula (27):
Wherein the quantized integer feature is the one output by the long-short-time memory network at the t-th time step, and Q_X is the input quantization scale.
Calculating a loss value; comprising the following steps:
according to the output hidden layer characteristic h t, calculating a current loss value through a loss function E so as to execute back propagation optimization; the loss function E is defined as follows:
Wherein θ represents all the learnable parameters in the efficiently quantizable long-short-time memory network; the θ-dependent hidden layer output denotes the j-th feature value of the hidden layer at time step t for the i-th sample, and m represents the number of samples in the back-propagation optimization process;
Parameter updating, comprising:
Updating all the learnable parameters in the high-efficiency quantifiable long-short-term memory network according to the formula (29) from the calculated loss values:
wherein μ is the learning rate; θ_v represents all the learnable parameters in the efficiently quantizable long-short-time memory network at the v-th iteration, and the gradient term is the gradient of the loss function E(θ_v) with respect to θ_v; if v = N_max (where N_max = 200 is the set maximum number of iterations), the network stops the iterative updating and the parameter values of the weight matrices are fixed; otherwise, v is incremented by 1 and the gating and state parameter values continue to be calculated for further iterative updating.
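The update rule (29) described above is plain gradient descent. A toy sketch follows, using an illustrative one-dimensional quadratic loss rather than the network's actual loss E:

```python
# Minimal sketch of update rule (29): theta_{v+1} = theta_v - mu * dE/dtheta,
# iterated until v = N_max = 200. The loss E(theta) = theta**2 is illustrative.
mu = 0.1          # learning rate
N_max = 200       # maximum number of iterations, as in the text

theta = 5.0                       # toy scalar parameter
for v in range(N_max):
    grad = 2.0 * theta            # gradient of E(theta) = theta**2
    theta = theta - mu * grad     # formula (29)
```

After 200 iterations the toy parameter has converged essentially to the minimizer 0.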
The electroencephalogram signal characteristic extraction process comprises the following steps:
Feature extraction is carried out by adopting a single-layer convolutional neural network, wherein the single-layer convolutional neural network comprises 8 single-channel one-dimensional convolutional kernels with the convolutional kernel length of 5; the single-layer convolutional neural network maps the original electroencephalogram signal of each time step into an electroencephalogram signal characteristic with a dimension of 1024.
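A dependency-free sketch of the feature extractor described above. The padding and window length are assumptions chosen so that 8 output channels of a 128-sample window flatten to the stated 1024-dimensional feature (8 × 128 = 1024); the kernel values are illustrative, not trained.

```python
# Hypothetical sketch: 8 one-dimensional convolution kernels of length 5
# applied to a single-channel EEG window, flattened to a 1024-dim feature.
def conv1d_valid(signal, kernel):
    """Valid (no-padding) 1-D convolution of a list by a kernel."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

# assumed 128-sample window, zero-padded by 2 on each side so output length is 128
window = [0.0] * 2 + [float(n) for n in range(128)] + [0.0] * 2
kernels = [[0.2] * 5 for _ in range(8)]          # 8 kernels of length 5

feature = []
for ker in kernels:
    feature.extend(conv1d_valid(window, ker))     # 8 channels x 128 samples
```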
Floating point number featureAnd outputting the category label after mapping by the softmax function.
The quantized long-short-time memory network was tested on abnormal electroencephalogram classification data using ten-fold cross-validation, and the performance of the different networks is shown in fig. 6. All accuracies in fig. 6 are average accuracies over ten-fold cross-validation. For reference, the accuracy of the conventional long-short-time memory network with the original floating point parameters is 97.67%. It can be seen that the network performance at 5-bit quantization is improved by about 1% over the conventional long-short-time memory network. Moreover, with its quantization look-up table, the efficiently quantizable long-short-time memory network can be quantized to as low as 2 bits without losing model inference accuracy.
Example 5
An abnormal electroencephalogram signal detection system based on a long-short-term memory network FPGA hardware accelerator capable of being quantified with high efficiency is shown in fig. 7, and comprises:
a data acquisition module configured to: collecting an electroencephalogram signal to be detected through an electroencephalogram amplifier and an A/D converter;
a feature extraction module configured to: extracting features of the electroencephalogram signals to be detected, and mapping the original electroencephalogram signals into feature vectors with certain dimensions;
An abnormal electroencephalogram signal detection module configured to: input the feature vector into the efficiently quantizable long-short-time memory network FPGA hardware accelerator, dequantize the accelerator's output value to obtain the dequantized floating point number features, and output the detection result (abnormal electroencephalogram or normal electroencephalogram) after softmax function mapping;
an electroencephalogram abnormality alarm module configured to: and alarming the detected abnormal electroencephalogram according to the class label output by the abnormal electroencephalogram detection module.

Claims (10)

1. A method for accelerating long-short-time memory network FPGA hardware capable of being quantized efficiently, characterized in that the method runs in an FPGA hardware accelerator, the FPGA hardware accelerator comprising an ARM processor unit PS and a programmable logic unit PL; the method comprising the following steps:
the long-short time memory network capable of being quantized efficiently consists of a plurality of long-short time memory units which are connected in sequence and capable of being quantized efficiently;
Performing parameter quantization on the trained long-short-time memory network with floating point parameters and capable of being quantized efficiently to obtain a quantized long-short-time memory network;
compiling the quantized long-short-time memory network into Verilog codes, generating IP cores for acceleration, and deploying the IP cores in a programmable logic unit PL;
the ARM processor unit PS is responsible for data preparation and preprocessing work and also for softmax mapping operation;
The configuration parameters of each unit of the quantized long-short-time memory network comprise the number of input or output channels, the feature vector dimension, the number of neurons and the quantization coefficients; these quantized parameters are first transmitted from the ARM processor unit PS end to the programmable logic unit PL through a simplified advanced extensible interface (AXI-Lite) bus;
The input signals, quantization bias and weight are transmitted to the programmable logic unit PL at the ARM processor unit PS end through the advanced expansion interface bus; after the quantized long-short-time memory unit finishes calculation at the programmable logic unit PL end, the output data is transmitted back to the ARM processor unit PS end through the same AXI bus;
parameter configuration of a next quantized long-short-time memory unit is carried out, a new round of feature vector calculation is started, and the process is repeated until the whole quantized long-short-time memory network calculation is completed;
And outputting a calculation result.
2. The method for accelerating long-short-time memory network FPGA hardware capable of being quantized efficiently according to claim 1, characterized in that the piecewise activation function of the efficiently quantizable long-short-time memory unit is as follows:
wherein T_a is an adjustable Sigmoid function approximation parameter;
further preferably, T_a = 2.5.
3. The method for accelerating long-short-time memory network FPGA hardware capable of being quantized efficiently according to claim 2, characterized in that the calculation process of the cell state parameter in the efficiently quantizable long-short-time memory unit is: c_t = f_t ⊙ c_{t-1} + i_t ⊙ g_t;
wherein ⊙ is the Hadamard product, c_t is the cell state at the t-th time step in the efficiently quantizable long-short-time memory unit, and i_t, f_t and g_t are respectively the input gate value, the forget gate value and the cell candidate unit value at the t-th time step.
4. The method for accelerating the hardware of the FPGA of the long-short-time memory network capable of being quantized efficiently according to claim 1, wherein the parameter quantization process of the long-short-time memory network capable of being quantized efficiently after training comprises the following steps:
(1) Initializing quantization parameters, including:
let the overall weight matrix w= [ W f;Wi;Wo;Wg ], then
Let the total cyclic weight matrix r= [ R f;Ri;Ro;Rg ], then
Initializing all weight quantization bit widths B WR =8;
Initializing a piecewise function lookup table quantization bit width B LUT =4;
initializing a fixed point digital width B Fix =24;
Initializing a weight quantization factor
Initializing a weight matrix quantization scale Q W=max(|W|)/MWR;
initializing a cyclic weight matrix quantization scale Q R=max(|R|)/MWR; wherein, the max (·) function is a maximum value function, and the |·| function is an absolute value function;
initializing an input quantization scale Q X=1/MWR;
Initializing hidden unit quantization scales
Initializing input weight quantization scales
Initializing input cyclic weight quantization scales
Initializing candidate unit quantization scales
(2) Generating a quantization look-up table; comprising the following steps:
First, with −T_a as the start value and T_a as the end value, an arithmetic sequence I_d is generated with a fixed increment (step size);
then, the piecewise activation function is evaluated at each element of I_d to obtain its quantization look-up table;
(3) Quantizing the weight matrix; comprising the following steps:
the quantized forgetting gate weight matrix is shown as formula (3):
the quantization input gate weight matrix is as shown in formula (4):
the quantization output gate weight matrix is as shown in formula (5):
the cell candidate weight matrix is quantified as shown in formula (6):
the quantized forgetting gate cyclic weight matrix is shown in formula (7):
the quantization is input into the gate cyclic weight matrix, as shown in formula (8):
The quantization output gate cycle weight matrix is as shown in formula (9):
quantifying a cell candidate cyclic weight matrix as shown in formula (10):
quantifying the forgetting gate bias as shown in equation (11):
the input gate bias is quantized as shown in equation (12):
The quantization output gate bias is as shown in equation (13):
quantifying the cell candidate bias as shown in formula (14):
(4) Calculating quantized gate control and state parameter values; comprising the following steps:
calculating a quantized forget gate gating value:
Calculating a quantized input gate control value:
Calculating a quantized output gate control value:
Calculating a quantized cell candidate unit value:
calculating quantitative cell state parameter values:
calculating quantized hidden state parameter values:
5. A long-short-time memory network FPGA hardware accelerator capable of being quantized efficiently, characterized by comprising an ARM processor unit PS and a programmable logic unit PL connected through an advanced extensible interface bus, and implementing the method for accelerating long-short-time memory network FPGA hardware capable of being quantized efficiently.
6. An abnormal electroencephalogram signal detection method based on a long-short-term memory network FPGA hardware accelerator capable of being quantized with high efficiency is characterized by comprising the following steps:
A data acquisition module consisting of an electroencephalogram amplifier and an A/D converter is adopted to acquire an electroencephalogram signal to be detected;
training a long-short-time memory network capable of being quantized efficiently;
the trained long-short-time memory network capable of being quantized efficiently is deployed on the FPGA hardware accelerator of the long-short-time memory network capable of being quantized efficiently;
extracting characteristics of an electroencephalogram signal to be detected;
inputting the characteristics of the electroencephalogram signals to be detected into a long-short-time memory network FPGA hardware accelerator capable of being quantized efficiently to obtain output values;
Inverse quantization is carried out on the output value of the efficiently quantizable long-short-time memory network FPGA hardware accelerator to obtain the dequantized floating point number features, and the detection result is output after softmax function mapping;
If the detection result is abnormal electroencephalogram, alarming is carried out through an alarm module.
7. The method for detecting abnormal electroencephalogram signals based on the high-efficiency quantifiable long-short-time memory network FPGA hardware accelerator according to claim 6, wherein the training process of the high-efficiency quantifiable long-short-time memory network comprises the following steps:
firstly, acquiring electroencephalogram data for training by a data acquisition module consisting of an electroencephalogram amplifier and an A/D converter;
secondly, initializing parameters and weight matrixes of floating point formats of all units according to the set quantity of units in the long-short-time memory network capable of being quantized efficiently, wherein the parameters and weight matrixes comprise:
the cell state parameters at the initial time step t=0 are all initialized to 0;
the hidden state parameters at the initial time step t=0 are all initialized to 0;
the forgetting gate weight matrix, the input gate weight matrix, the output gate weight matrix and the cell candidate weight matrix are initialized to random numbers;
the forgetting gate cyclic weight matrix, the input gate cyclic weight matrix, the output gate cyclic weight matrix and the cell candidate cyclic weight matrix are initialized to random numbers;
the forgetting gate bias, the input gate bias, the output gate bias and the cell candidate bias are all initialized to 0;
Then, according to the parameter and the weight matrix, calculating various gating and state parameter values, including:
For the data u t of the t-th time step in the extracted electroencephalogram signal characteristics, calculating floating point parameter values of each gating according to the following formula:
Calculating a forget gate gating value f t as shown in formula (21):
wherein h t-1 is a hidden state parameter of the t-1 time step;
the input gate-control value i t is calculated as shown in equation (22):
the output gate gating value o t is calculated as shown in equation (23):
calculating a cell candidate unit value g t as shown in formula (24):
Calculating a cell state parameter c t as shown in formula (25):
Calculating a hidden state parameter h t as shown in formula (26):
h_t = o_t ⊙ c_t (26)
then, the loss function of the efficiently quantizable long-short-time memory network, formed by a plurality of sequentially connected efficiently quantizable long-short-time memory units, is calculated, and the floating point parameters are iteratively updated according to the loss value in combination with the back-propagation algorithm; after the maximum number of iterations is reached, the efficiently quantizable long-short-time memory network stops the iterative updating, and the floating point values of all parameters and weight matrices are fixed and stored;
Further preferably, the inverse quantized floating point number features are calculated; as shown in formula (27):
Wherein the quantized integer feature is the one output by the long-short-time memory network at the t-th time step, and Q_X is the input quantization scale.
8. The method for detecting abnormal electroencephalogram signals based on the efficiently quantizable long-short-time memory network FPGA hardware accelerator, characterized by calculating a loss value, comprising:
calculating a current loss value through a loss function E according to the output hidden layer characteristic h t; the loss function E is defined as follows:
Wherein θ represents all the learnable parameters in the efficiently quantizable long-short-time memory network; the θ-dependent hidden layer output denotes the j-th feature value of the hidden layer at time step t for the i-th sample, and m represents the number of samples in the back-propagation optimization process;
further preferably, the parameter updating includes:
Updating all the learnable parameters in the high-efficiency quantifiable long-short-term memory network according to the formula (29) from the calculated loss values:
wherein μ is the learning rate; θ_v represents all the learnable parameters in the efficiently quantizable long-short-time memory network at the v-th iteration, and the gradient term is the gradient of the loss function E(θ_v) with respect to θ_v; if v = N_max (where N_max = 200 is the set maximum number of iterations), the network stops the iterative updating and the parameter values of the weight matrices are fixed; otherwise, v is incremented by 1 and the gating and state parameter values continue to be calculated for further iterative updating.
9. The method for detecting abnormal electroencephalogram signals based on the high-efficiency quantized long-short-term memory network FPGA accelerator according to claim 6, wherein the electroencephalogram signal feature extraction process comprises the following steps:
Feature extraction is carried out by adopting a single-layer convolutional neural network, wherein the single-layer convolutional neural network comprises 8 single-channel one-dimensional convolutional kernels with the convolutional kernel length of 5; the single-layer convolutional neural network maps the original electroencephalogram signal of each time step into an electroencephalogram signal characteristic with a dimension of 1024.
10. An abnormal electroencephalogram signal detection system based on a long-short-term memory network FPGA hardware accelerator capable of being quantified efficiently is characterized by comprising:
a data acquisition module configured to: collecting an electroencephalogram signal to be detected through an electroencephalogram amplifier and an A/D converter;
a feature extraction module configured to: extracting features of the electroencephalogram signals to be detected, and mapping the original electroencephalogram signals into feature vectors with certain dimensions;
An abnormal electroencephalogram signal detection module configured to: inputting the feature vector into a long-short-time memory network FPGA hardware accelerator capable of being quantized efficiently, dequantizing the output value of the long-short-time memory network FPGA hardware accelerator to obtain dequantized floating point number features, and outputting a checking result after softmax function mapping;
an electroencephalogram abnormality alarm module configured to: and alarming the detected abnormal electroencephalogram according to the class label output by the abnormal electroencephalogram detection module.
CN202410548521.4A 2024-05-06 2024-05-06 Method and system for detecting abnormal electroencephalogram signals by accelerating long-short-term memory network FPGA hardware and capable of being quantized efficiently Pending CN118228789A (en)

Publications (1)

Publication Number Publication Date
CN118228789A true CN118228789A (en) 2024-06-21


