CN116822592A - Target tracking method based on event data and impulse neural network - Google Patents

Info

Publication number
CN116822592A
Authority
CN
China
Prior art keywords
pulse
data
neural network
time
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310725451.0A
Other languages
Chinese (zh)
Inventor
马德
周烨
李一涛
胡有能
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN202310725451.0A
Publication of CN116822592A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/048 Activation functions
    • G06N3/08 Learning methods
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a target tracking method based on event data and a pulse neural network, comprising the following steps: acquiring event data and applying event-frame time-dimension compression to the event data to obtain sample data; adjusting the structure of a SiamFC network to convert it into a pulse neural network, pre-training the SiamFC network with the sample data, and migrating the resulting weights and biases to the pulse neural network to optimize it; and acquiring real-time event data, adjusting the time step to obtain search data, and using the optimized pulse neural network to perform inference and comprehensive similarity estimation on the search data and the template data to obtain a real-time target tracking result. In this method, the pulse neural network is optimized through the pre-trained SiamFC network and accelerated on an FPGA, which improves the real-time performance and accuracy of pulse-neural-network target tracking at low power consumption.

Description

Target tracking method based on event data and impulse neural network
Technical Field
The invention belongs to the technical field of target tracking, and particularly relates to a target tracking method based on event data and a pulse neural network.
Background
Target tracking is an important part of intelligent video surveillance and is widely used in intelligent transportation, real-time monitoring, and human-computer interaction. Traditional sensors and tracking algorithms are based on camera image frames, which suffer from high redundancy, high latency, and large data volume. Under complex environmental conditions, high-speed accurate tracking is often difficult to achieve, and challenges such as illumination changes, deformation, occlusion, scale change, image blur, and rapid motion must be faced. Current target tracking algorithms fall into two main categories: correlation filtering and deep learning.
The fully convolutional Siamese network (SiamFC) has a paired network structure with two inputs: a template used as the reference and a candidate sample. In the single-target tracking task, the template is the object to be tracked, usually the target object selected in the first frame of the video sequence, and the candidate sample is the image search area in each subsequent frame. In each subsequent frame the Siamese network finds the candidate area most similar to the first-frame template, which is the target in that frame, and thereby tracks the target. Compared with other fully convolutional tracking algorithms, SiamFC has a small network structure, a low computational cost, and a high tracking speed.
The pulse neural network (spiking neural network, SNN) is a third-generation neural network characterized by neural nodes with temporal dynamics, synaptic structures that balance stability and plasticity, and function-specific network loops. A neuron in such a network is not activated at every propagation iteration; it is activated only when its membrane potential reaches a certain threshold. When a neuron is activated, it generates a signal that is transmitted to other neurons and raises or lowers their membrane potentials, thereby simulating the pulse firing of biological neurons. This gives the pulse neural network the low power consumption and high speed that traditional artificial neural networks lack, and it can compensate for the shortcomings of deep learning methods used for target tracking. Converting an artificial neural network into a pulse neural network makes full use of existing artificial neural network training algorithms and allows deeper networks to be trained.
Hardware platforms capable of running pulse neural networks fall into two main types: custom FPGA accelerators for a particular model and brain-like computers adapted to generic models. An FPGA supports quick hardware function verification and evaluation and allows rapid design iteration with few constraints on the neural network, which has made it the choice of many hardware designs. In addition, a successful FPGA implementation can serve as the step before a custom chip, providing a reference for chip design.
Disclosure of Invention
The invention aims to solve technical problems of existing target tracking methods, such as large computation cost and poor real-time performance, and provides a target tracking method based on event data and a pulse neural network.
To achieve the above object, an embodiment provides a target tracking method based on event data and a pulse neural network, including the steps of:
acquiring event data, and performing time dimension compression processing of an event frame on the event data to obtain sample data;
adjusting the structure of the SiamFC network to convert it into a pulse neural network, pre-training the SiamFC network with the sample data, and migrating the resulting weights and biases to the pulse neural network to optimize the pulse neural network;
acquiring real-time event data, adjusting the time step to obtain search data, and using the optimized pulse neural network to perform inference and comprehensive similarity estimation on the search data and the template data to obtain a real-time target tracking result.
Preferably, the event data is collected asynchronously by a dynamic vision sensor. Each event e includes the position (x, y) of the pixel, the event polarity p, and the time t, expressed as $e = [x, y, t, p]^T$, where the event polarity p is either positive or negative;
performing event-frame time-dimension compression on the event data comprises:
selecting the event data within a time period to form an event packet, counting the number of positive events and the number of negative events at each target pixel $(x_k, y_k)$, and storing events of different polarities in different channels, thereby obtaining two-dimensional data containing the positive-event counts and the negative-event counts.
Preferably, adjusting the structure of the SiamFC network to conform to the form of the pulse neural network comprises: making the output of each convolution layer in the SiamFC network non-negative by applying an absolute value function;
setting the bias of the convolution operation in each convolution layer of the SiamFC network to 0;
changing the max pooling operation in the SiamFC network so that the firing rate of the neurons is evaluated at each time step, that is, the absolute pulse firing rate accumulated over time is calculated, and the pulses of the neuron with the maximum firing rate in a pooling window are passed on.
Preferably, the absolute pulse delivery rate is calculated by:
$$f_s(t) = f_s(t-1) + x_s(t)\,(T - t)$$
where $f_s(t)$ denotes the firing rate of neuron s at time t; $x_s(t)$ indicates whether neuron s fires a pulse at time t, with $x_s(t) = 1$ meaning it fires and $x_s(t) = 0$ meaning it does not; the firing rate of a neuron is updated only if the neuron fires a pulse; and T denotes the time window length required for one inference.
Preferably, migrating the resulting weights and biases to the pulse neural network to optimize the pulse neural network comprises:
normalizing the weights and biases in the channel dimension of each layer using the maximum activation value, where i and j are dimension indices and l is the layer index. The weight of layer l in dimensions i, j is normalized by the maximum activation value of the current layer. For layers other than the first, the normalized activation must first be multiplied by the maximum activation value of the previous layer, which restores the input to its value before the previous layer's normalization, and only then is the current layer normalized, giving the normalized weight. The bias of layer l in the j-th dimension is likewise normalized, giving the normalized bias;
migrating the normalized weights and biases to the pulse neural network to optimize the pulse neural network.
Preferably, adjusting the time step of the real-time event data comprises: accumulating the real-time event data within each time step according to a preset time step length and a number of time steps N; for each of the N preset time steps, the pixel value is marked 1 if the number of events is greater than 0 and 0 otherwise, giving the search data.
Preferably, performing inference and comprehensive similarity estimation on the search data and the template data using the optimized pulse neural network comprises:
inputting the search data and the template data into the two branches of the optimized pulse neural network respectively; through forward inference, the activation function of the last layer of each branch accumulates membrane potential over a given time without firing pulses, which yields two pulse feature maps, one for each input; the potential similarity estimate $M_P(z, x)$ is calculated from the correlation between the membrane potentials of the two pulse feature maps, taken at each time step t from the pulse feature map of the template data and the pulse feature map of the search data;
at the same time, the temporal correlation between the two pulse feature maps is calculated, and the time similarity estimate is matched at each time step, where τ denotes the response period;
in the conversion from the SiamFC network to the pulse neural network, the error is inversely proportional to time, so the potential similarity estimate and the time similarity estimate are combined to obtain the comprehensive similarity estimate $f_{spike}(z, x)$, where T denotes the time period.
Preferably, the inference process of the pulse neural network is accelerated by an FPGA, and the pulse neural network is fixed-point quantized so that it is converted into data that can be input to the FPGA, comprising:
performing floating-point to fixed-point conversion by taking $V(t) = v(t) \cdot 2^{\beta}$ as the neuron state voltage, $W_i = w_i \cdot 2^{\beta}$ as the synaptic weight, and $V_{thr} = v_{thr} \cdot 2^{\beta}$ as the neuron threshold, where β is an integer that sets the scaling factor, $v(t)$ is the original neuron voltage, and $w_i$ is the original neuron weight. The neuron update is thereby converted into the following fixed-point form:
if $V(t) \ge V_{thr}$, a pulse is fired and the voltage is reset to $V(t) = V(t) - V_{thr}$;
if $V(t) < V_{min}$, the voltage is reset to $V(t) = 0$;
where $x_i(t-1)$ denotes the pulse input from the i-th neuron of the previous layer at time t-1, and i ranges over the neurons of the previous layer.
Preferably, when the inference process of the pulse neural network is accelerated by the FPGA, the implementation comprises two parts, data transmission and network computation. The data transmission part transfers the network's input pulse data from the PC to the FPGA and returns the output pulse results computed by the FPGA to the PC for display;
the network computation part is implemented on the FPGA: after pulse data is input to the network, output pulse data is produced by computation through the set number of layers. The network computation is concentrated in the pulse convolution layer, which is divided into a pulse convolution overall control module, a shift module, a calculation unit control module, a pulse calculation unit array, and a membrane potential calculation module;
the pulse convolution overall control module controls the data computation inside the pulse convolution layer; the pulse calculation unit array is composed of pulse calculation units, each of which accumulates the weights of a single convolution kernel according to the pulse data; the calculation unit control module controls the weight accumulation of a single convolution kernel; the shift module controls the shifting of pulse data between the pulse calculation units in the array; and the membrane potential calculation module computes the membrane potential from the accumulated weight sums of all convolution kernels in the pulse convolution layer.
Preferably, each convolution kernel is assigned one pulse calculation unit. The input pulse data is stored in a register inside the pulse calculation unit, and each pulse datum contains the pulse information of as many neurons as there are channels. When the control signal from the calculation unit control module indicates that computation should start, all channels are traversed, and for each channel the input weight is added to the accumulated sum only if the pulse of that channel in the register is 1 (not if it is 0). Once all channels have been traversed, the accumulated weight sum is output to the membrane potential calculation module and the pulse data in the register is sent on to the other pulse calculation units.
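The accumulation performed by a single pulse calculation unit can be illustrated in software. The following is a minimal sketch, not the hardware design: the function name `pulse_unit_accumulate`, the list-based register, and the integer weights are all assumptions made for illustration.

```python
def pulse_unit_accumulate(spike_bits, kernel_weights):
    """Sum the weights of the channels whose spike bit is 1.

    spike_bits     -- one 0/1 value per channel (the register contents)
    kernel_weights -- one fixed-point weight per channel (hypothetical values)
    """
    acc = 0
    for bit, weight in zip(spike_bits, kernel_weights):
        if bit == 1:          # a pulse gates the weight into the accumulator
            acc += weight     # no multiplication is needed for binary pulses
    return acc


# Example: channels 0, 2 and 3 carry a pulse, channel 1 does not.
weight_sum = pulse_unit_accumulate([1, 0, 1, 1], [3, -2, 5, 1])
```

Because the inputs are binary pulses, the convolution reduces to conditional additions, which is why one accumulator per convolution kernel is sufficient in this sketch.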
Compared with the prior art, the beneficial effects of the invention include at least the following:
the input event data is processed differently for training and for testing: during training, the event data stream is split by polarity and compressed into two-dimensional data, which preserves the accuracy of artificial network training; during testing, the event data stream is re-divided according to time steps, which reduces redundant data, preserves the pulse characteristics of the event data, and matches the pulse neural network better;
the SiamFC network is pre-trained and then converted, following the approach of converting a traditional artificial neural network into a pulse neural network, which avoids the problem that pulse firing is non-differentiable during training of a pulse neural network and allows mature artificial neural network training methods to be applied, so that the converted pulse neural network gives more accurate results;
the FPGA provides hardware acceleration for the pulse neural network and allows rapid design iteration with few constraints on the neural network. Compared with running the same algorithm on a CPU, the hardware architecture improves speed and reduces memory power consumption.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for target tracking based on event data and impulse neural networks provided by an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a SiamFC network according to an embodiment of the present invention;
FIG. 3 is a flow chart of target tracking using a pulsed neural network provided by an embodiment of the present invention;
FIG. 4 is a diagram of a structure and implementation process of a pulse calculation unit according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a pulse calculating unit array according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the detailed description is presented by way of example only and is not intended to limit the scope of the invention.
To address the technical problems of large computation cost and poor real-time performance in traditional target tracking methods, the embodiment of the invention provides a target tracking method based on event data and a pulse neural network, comprising event data input processing, pulse neural network construction, and analysis of results by comprehensive similarity estimation. The event data is fed to the SiamFC network for training after time-dimension compression and to the pulse neural network for inference after time-step adjustment. The pulse neural network structure is obtained by adjusting and converting the SiamFC network, and it uses the normalized weights obtained from the pre-trained SiamFC network. An FPGA hardware acceleration architecture is designed to accelerate the pulse neural network. The method can track targets in event data in real time, is suitable for targets moving at high speed, and offers low power consumption and high precision.
FIG. 1 is a flowchart of a target tracking method based on event data and an impulse neural network according to an embodiment of the present invention. As shown in FIG. 1, the target tracking method based on event data and a pulse neural network provided by the embodiment includes the following steps:
S110, acquiring event data, and performing event-frame time-dimension compression on the event data to obtain sample data.
In an embodiment, an asynchronous event data stream is collected using a dynamic vision sensor. The sensor senses the change in light intensity at each pixel, and when the change exceeds a certain threshold, it independently outputs the position (x, y), the event polarity p, and the time t for that pixel, expressed as four-dimensional data $e = [x, y, t, p]^T$, where the event polarity p is either positive or negative. Using the event data stream acquired by the dynamic vision sensor as input suits high-speed motion scenes better than traditional images and gives better real-time performance.
After the event data is obtained, it is preprocessed to construct the sample data for training the SiamFC network. Before the training event data stream is input into the SiamFC network, time-dimension compression is applied so that the four-dimensional data becomes a two-dimensional event frame: the event data within a time period is selected to form an event packet, the number of positive events and the number of negative events at each target pixel $(x_k, y_k)$ are counted, and events of different polarities are stored in different channels, giving two-dimensional data containing the positive-event counts and negative-event counts as the sample data.
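The compression of an event packet into a two-channel count frame can be sketched as follows. This is a minimal illustration under stated assumptions: events are tuples `(x, y, t, p)` with polarity encoded as `+1` or `-1`, and the function name `events_to_count_frame` and the half-open time window are hypothetical choices not specified in the text.

```python
def events_to_count_frame(events, width, height, t_start, t_end):
    """Compress one event packet into a 2 x height x width count frame.

    Channel 0 holds the number of positive events per pixel and channel 1
    the number of negative events. The polarity encoding (+1 / -1) and the
    half-open window [t_start, t_end) are assumptions for illustration.
    """
    positive = [[0] * width for _ in range(height)]
    negative = [[0] * width for _ in range(height)]
    for x, y, t, p in events:
        if not (t_start <= t < t_end):
            continue                      # event lies outside this packet
        if p > 0:
            positive[y][x] += 1
        else:
            negative[y][x] += 1
    return [positive, negative]


# Four events on a 2 x 2 sensor; the last one falls outside the time window.
events = [(0, 0, 0.1, 1), (0, 0, 0.2, 1), (1, 0, 0.3, -1), (1, 1, 2.0, 1)]
frame = events_to_count_frame(events, width=2, height=2, t_start=0.0, t_end=1.0)
```

The time dimension disappears in the result: only per-pixel counts remain, which is the form a conventional convolutional network such as SiamFC can be trained on.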
S120, performing structure adjustment on the SiamFC network to convert the SiamFC network into a pulse neural network.
In an embodiment, the SiamFC network is used for target tracking; its specific structure and parameters are shown in Table 1 and FIG. 2:
TABLE 1
Here each Conv2d denotes a convolutional layer, ReLU denotes a neuron layer, Maxpooling denotes a max pooling layer, Z denotes the template frame, and X denotes the search frame; size denotes the batch size and stride denotes the step size.
In the embodiment, the structure of the SiamFC network is adjusted to turn it into a pulse neural network, mainly in the following respects:
First, the weights and biases of a convolution operation may produce negative outputs, and the activation function may produce negative activation values, which the pulse neural network cannot represent. To address this, each convolution layer is followed by an absolute value function so that its output is non-negative.
Second, a bias is added in each convolution operation, but the pulse neural network cannot realize a constant bias input during membrane potential accumulation. To address this, all biases may be set to 0.
Third, max pooling in a convolutional network selects the maximum feature value in a fixed-size window, and the pulse neural network cannot perform this nonlinear operation directly. In a pulse neural network, max pooling instead passes on the pulses of the neuron with the maximum firing rate in a pooling window, so the firing rate of the neurons must be evaluated at each time step. The invention uses the absolute pulse firing rate accumulated over time, given by:
$$f_s(t) = f_s(t-1) + x_s(t)\,(T - t)$$
where $f_s(t)$ denotes the firing rate of neuron s at time t; $x_s(t)$ indicates whether neuron s fires a pulse at time t, with $x_s(t) = 1$ meaning it fires and $x_s(t) = 0$ meaning it does not; the firing rate of a neuron is updated only if the neuron fires a pulse; and T denotes the time window length required for one inference.
In an embodiment, the pulses of the neuron with the largest firing rate in a pooling window are passed on, based on this absolute firing rate. After each inference the firing rate of every neuron is reset to 0. Because of the weight (T - t), later-arriving pulses have less influence on the firing rate; nevertheless, a neuron that fires many times late in the window may still reach a higher firing rate than a neuron that fired earlier.
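The firing-rate-based pooling rule can be sketched for a single pooling window. This is an illustrative reading of the formula, with several assumptions: time steps run from 1 to T, ties are broken in favor of the lowest neuron index, and the function name `spike_max_pool` is hypothetical.

```python
def spike_max_pool(spike_trains, T):
    """Pool one window of neurons using f_s(t) = f_s(t-1) + x_s(t) * (T - t).

    spike_trains[s][t - 1] is 0 or 1 for neuron s at time step t (t = 1..T).
    At every step the pulse of the neuron with the highest accumulated rate
    is passed on. Tie-breaking by lowest index is an assumption.
    """
    n = len(spike_trains)
    rates = [0] * n                        # rates start at 0 for each inference
    pooled = []
    for t in range(1, T + 1):
        for s in range(n):
            rates[s] += spike_trains[s][t - 1] * (T - t)
        winner = max(range(n), key=lambda s: rates[s])
        pooled.append(spike_trains[winner][t - 1])
    return pooled, rates


# Neuron 0 fires once early; neuron 1 fires twice late in a window of T = 4.
pooled, rates = spike_max_pool([[1, 0, 0, 0], [0, 0, 1, 1]], T=4)
```

In this example the single early pulse (weight 3) outweighs the two late pulses (weights 1 and 0), so the output follows neuron 0 throughout the window.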
S130, after the SiamFC network is pre-trained by using the sample data, the obtained weight and bias are migrated to the impulse neural network to optimize the impulse neural network.
In the embodiment, the template data (template image) and the sample data (search area image) have sizes of 127×127 and 255×255 respectively. They are input to the corresponding branches of the SiamFC network, end-to-end forward inference produces a confidence score map, and the target position is obtained from the confidence score map. Each branch of the SiamFC network contains 5 convolution layers, with a max pooling layer after the 1st and 2nd layers. The final features from the two branches are convolved with each other to evaluate similarity; the greater the similarity, the better the tracking result. The SiamFC network is pre-trained with the sample data to obtain the pre-trained weights and biases.
In the embodiment, after the SiamFC network is pre-trained, the obtained weights and biases are migrated to the pulse neural network to optimize it. Specifically, in the channel dimension of each layer, the weights and biases are normalized using the maximum activation value, where i and j are dimension indices and l is the layer index. The weight of layer l in dimensions i, j is normalized by the maximum activation value of the current layer. For layers other than the first, the normalized activation must first be multiplied by the maximum activation value of the previous layer, which restores the input to its value before the previous layer's normalization, and only then is the current layer normalized to give the normalized weight; otherwise the transmitted information would become smaller and smaller from layer to layer. The bias of layer l in the j-th dimension is likewise normalized to give the normalized bias. The normalized weights and biases are then migrated to the pulse neural network to optimize it.
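The normalization formula itself is not reproduced in the text. The sketch below assumes a common channel-wise form that matches the verbal description: each weight is multiplied by the previous layer's maximum activation for its input channel i and divided by the current layer's maximum activation for its output channel j, and each bias is divided by the current layer's maximum activation. The names `normalize_layer`, `lam_prev`, and `lam_cur` are hypothetical, and the exact formula used in the patent may differ.

```python
def normalize_layer(weights, biases, lam_prev, lam_cur):
    """Channel-wise normalization of one layer (assumed form, see lead-in).

    weights[j][i] -- weight from input channel i to output channel j
    biases[j]     -- bias of output channel j
    lam_prev[i]   -- max activation of input channel i in the previous layer
                     (use 1.0 for every channel of the first layer)
    lam_cur[j]    -- max activation of output channel j in the current layer
    """
    norm_w = [
        [w * lam_prev[i] / lam_cur[j] for i, w in enumerate(row)]
        for j, row in enumerate(weights)
    ]
    norm_b = [b / lam_cur[j] for j, b in enumerate(biases)]
    return norm_w, norm_b


# One output channel, two input channels, with made-up activation maxima.
norm_w, norm_b = normalize_layer(
    weights=[[2.0, 4.0]], biases=[1.0], lam_prev=[1.0, 0.5], lam_cur=[4.0]
)
```

Multiplying by `lam_prev` undoes the scaling that the previous layer's normalization applied to the inputs, which is the step the text describes as necessary to keep the transmitted information from shrinking layer by layer.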
S140, acquiring real-time event data and performing time step adjustment to obtain search data.
In the embodiment, the event stream data used for inference is input into the converted SiamSNN network. The pulse neural network matches the asynchronous event stream format well, so the dimension of the event data does not need to be compressed; only the time step length and the number of time steps need to be re-divided. Specifically, a time step length (for example, 0.1 ms) and a number of time steps N (for example, N = 50) are preset, and the real-time event data within each time step is accumulated; for each of the N preset time steps, the pixel value is marked 1 if the number of events is greater than 0 and 0 otherwise, giving search data with the newly divided time steps.
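The re-division into time steps can be sketched as follows, using the example values from the text (0.1 ms steps, N = 50). Several details are assumptions for illustration: timestamps are integer microseconds, polarity is ignored when marking a pixel, and the function name `events_to_spike_frames` is hypothetical.

```python
def events_to_spike_frames(events, width, height, t0, step, n_steps):
    """Bin events into n_steps binary frames of size height x width.

    A pixel is 1 in frame k if at least one event occurred there during
    time step k, else 0. Timestamps are assumed to be integer microseconds
    and polarity is ignored; both are illustrative assumptions.
    """
    frames = [[[0] * width for _ in range(height)] for _ in range(n_steps)]
    for x, y, t, _polarity in events:
        k = (t - t0) // step               # index of the time step
        if 0 <= k < n_steps:
            frames[int(k)][y][x] = 1       # any event count > 0 marks the pixel
    return frames


# 0.1 ms steps (100 microseconds) and N = 50, as in the example values above.
events = [(0, 0, 50, 1), (0, 0, 60, -1), (1, 1, 250, 1), (1, 0, 6000, 1)]
frames = events_to_spike_frames(events, width=2, height=2, t0=0, step=100, n_steps=50)
```

Unlike the count frames used for training, the result keeps the time axis, so it can be fed to the pulse neural network one step at a time.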
And S150, utilizing the optimized impulse neural network to perform reasoning and comprehensive similarity estimation on the search data and the template data so as to obtain a real-time target tracking result.
In the embodiment, when the optimized impulse neural network is utilized to infer the search data and the template data, comprehensive similarity estimation is adopted so as to unify potential similarity estimation and time similarity estimation. Specifically, as shown in fig. 3, search data (search graph) and template data (template graph) are respectively input to two branches of an optimized pulse neural network, and through forward reasoning calculation, the last layer of activation function of each branch accumulates membrane potential in a given time without triggering a peak value, two pulse feature graphs corresponding to the two data are obtained at the same time, and potential similarity estimation is calculated through correlation between the membrane potential corresponding to the two pulse feature graphs, and is expressed as:
wherein ,pulse characteristic diagram corresponding to template data representing time step t,/->Representing a pulse characteristic diagram corresponding to the time step t search data, M P (z, x) represents potential similarity estimation;
meanwhile, the time correlation between the two pulse characteristic diagrams is calculated, and the time similarity estimation is matched at each time step, and the time similarity estimation is expressed as follows by a formula:
where τ represents the response period;
in the conversion from the SiamFC network to the impulse neural network, the error is inversely proportional to time; the potential similarity estimate and the time similarity estimate are combined to obtain the comprehensive similarity estimate, which is expressed as:
where $f_{spike}(z, x)$ denotes the comprehensive similarity estimate, and $T$ denotes the time period.
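As an illustration of the potential similarity term only, the sketch below accumulates last-layer inputs over time as membrane potential (no firing) and cross-correlates the template map over the search map, in the manner of a SiamFC correlation layer. The function names, the array shapes, and the use of a plain sliding-window cross-correlation are assumptions; the temporal term and the weighting that combines the two terms are not sketched.

```python
import numpy as np

def cross_correlate(template, search):
    """Slide template (C, h, w) over search (C, H, W), valid positions only."""
    c, h, w = template.shape
    _, H, W = search.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(template * search[:, i:i + h, j:j + w])
    return out

def potential_similarity(template_inputs, search_inputs):
    """Hypothetical potential-similarity sketch.

    template_inputs: (T, C, h, w) per-step inputs to the last layer (assumed).
    search_inputs:   (T, C, H, W) per-step inputs to the last layer (assumed).
    """
    # Last layer accumulates membrane potential over T steps without firing.
    z = template_inputs.sum(axis=0)
    x = search_inputs.sum(axis=0)
    return cross_correlate(z, x)

# Toy example: the target sits in the lower-right corner of the search map.
tp = np.ones((2, 1, 2, 2))
sp = np.zeros((2, 1, 3, 3))
sp[:, :, 1:, 1:] = 1
score = potential_similarity(tp, sp)
peak = np.unravel_index(score.argmax(), score.shape)
```

The position of the peak in `score` indicates where the template best matches the search region.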
In an embodiment, to improve computational efficiency, the inference process of the impulse neural network is accelerated on an FPGA. The impulse neural network therefore needs to be quantized to fixed point so that it can be converted into data that can be input into the FPGA, which specifically includes:
floating-point to fixed-point conversion is performed by taking $v(t) = V(t) \cdot 2^{\beta}$ as the neuron state voltage, $W_i = w_i \cdot 2^{\beta}$ as the synaptic weight, and $v_{thr} = V_{thr} \cdot 2^{\beta}$ as the neuron threshold. Here $\beta$ is the integer exponent of the scaling factor, and different values may be taken to represent different degrees of compression and quantization; for example, $\beta$ takes a value of 3, i.e., 8-bit quantization, to reduce on-chip memory space. $V(t)$ denotes the original neuron voltage and $w_i$ denotes the original neuron weight. The formula is converted into the following fixed-point form:
if $v(t) \ge v_{thr}$, a pulse is issued and $v(t)$ is reset to $v(t) - v_{thr}$;
if $v(t) < v_{min}$, $v(t)$ is reset to 0;
where $x_i(t-1)$ denotes the pulse input value of the i-th neuron of the previous layer at time $t-1$, and $i$ ranges over the neurons of the previous layer.
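The fixed-point update can be sketched in integer arithmetic as follows. The integration step $v(t) = v(t-1) + \sum_i W_i x_i(t-1)$ is an assumption (a plain integrate-and-fire update consistent with the symbols defined above); the function names and the default `v_min_q = 0` are likewise hypothetical.

```python
import numpy as np

def quantize(value, beta):
    """Scale a floating-point value by 2**beta and round to an integer."""
    return np.round(np.asarray(value) * (1 << beta)).astype(np.int32)

def if_step(v_prev, weights_q, spikes_in, v_thr_q, v_min_q=0):
    """One fixed-point integrate-and-fire update for a single neuron.

    Assumed integration rule: v(t) = v(t-1) + sum_i W_i * x_i(t-1).
    """
    v = int(v_prev) + int(np.dot(weights_q, spikes_in))
    fired = 0
    if v >= v_thr_q:
        # Issue a pulse and reset by subtracting the threshold.
        fired = 1
        v -= v_thr_q
    if v < v_min_q:
        # Clamp potentials below the lower bound back to 0.
        v = 0
    return v, fired

beta = 3
w_q = quantize([0.5, -0.25, 0.75], beta)      # integer weights W_i
v_thr_q = int(quantize(1.0, beta))            # integer threshold
v1, s1 = if_step(0, w_q, np.array([1, 0, 1]), v_thr_q)
v2, s2 = if_step(0, w_q, np.array([0, 1, 0]), v_thr_q)
```

Because every quantity is an integer, the update needs only additions, a comparison, and a subtraction, which is what makes it convenient for FPGA logic.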
In the embodiment, the FPGA-accelerated inference process of the pulse neural network comprises two parts, data transmission and network calculation. The data transmission part transmits the pulse data to be fed to the network from the PC to the FPGA, and transmits the pulse result computed by the FPGA back to the PC for display;
the network calculation part is implemented on the FPGA: after pulse data is input into the network, pulse data is output after computation through the established number of layers. The network calculation is concentrated in the pulse convolution layer, which is divided into a pulse convolution overall control module, a shift module, a calculation unit control module, a pulse calculation unit array, and a membrane potential calculation module;
the pulse convolution overall control module controls the data calculation inside the pulse convolution layer; the pulse calculation unit array is composed of a plurality of pulse calculation units, each of which accumulates the weights of a single convolution kernel based on the pulse data; the calculation unit control module controls the weight accumulation of the single convolution kernel; the shift module controls the shifting of pulse data between the pulse calculation units in the array; and the membrane potential calculation module performs the membrane potential calculation based on the accumulated weight sums of all convolution kernels in the pulse convolution layer.
In an embodiment, one pulse calculation unit (SCU) is provided for each convolution kernel. The pulse calculation unit decides whether to accumulate an input weight into the weight sum according to whether the pulse of the corresponding channel in its input register is 1 or 0. Specifically, as shown in fig. 4, the input pulse data is stored in a register inside the pulse calculation unit, and each pulse datum contains the pulse information of as many neurons as there are channels. When the control signal output by the calculation unit control module indicates that calculation should start, all channels are traversed, and for each channel the input weight is accumulated into the weight sum or not according to whether the pulse of that channel in the register is 1 or 0. Once all channels have been traversed, the accumulated weight sum is output to the membrane potential calculation module, and the pulse data in the register is sent to other pulse calculation units.
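The behaviour of a single SCU can be modelled in software as below. This is only a behavioural sketch under assumed data layouts: `spikes` holds one 0/1 value per input channel, and `weights[ch][o]` is the weight from input channel `ch` to output channel `o`. The function name `scu_accumulate` is hypothetical, and no hardware timing is modelled.

```python
def scu_accumulate(spikes, weights):
    """Behavioural model of one pulse calculation unit (SCU).

    spikes:  list of 0/1 values, one per input channel (latched register).
    weights: weights[ch][o] for input channel ch and output channel o.
    Returns the per-output-channel weight sums and the latched pulse data,
    which would be forwarded to the next SCU.
    """
    num_out = len(weights[0])
    sums = [0] * num_out
    for ch, spike in enumerate(spikes):
        # Accumulate only when the channel carries a pulse; a 0 is skipped,
        # so no multiplication is ever needed.
        if spike == 1:
            for o in range(num_out):
                sums[o] += weights[ch][o]
    return sums, spikes

sums, forwarded = scu_accumulate([1, 0, 1], [[1, 2], [3, 4], [5, 6]])
```

Since the inputs are binary, the unit reduces to a conditional adder, which is the main saving compared with a multiply-accumulate convolution.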
In the embodiment, a parallel computing structure for pulse convolution is also designed, using a pulse convolution kernel computing array and shift registers. The convolution kernel is a 5×5 window, and one SCU is set at each position of the convolution kernel, for a total of 25 SCUs. The SCUs of each row are connected in sequence from right to left, and the rightmost SCU receives pulses from outside. Each time an SCU finishes its calculation, it outputs the weight sums of all output channels and sends the pulse latched in its internal register to the SCU on its left. The weights of all SCUs are fetched from the weight storage unit at once and distributed to the SCUs. The weight sums calculated by all SCUs are aggregated and output to the membrane potential calculation module.
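The right-to-left shifting within one row of the array can be illustrated with a simplified software model. The sketch assumes a single row of SCUs, a single input channel, and that an output is produced only once the row is full; the name `stream_row` and these simplifications are not taken from the patent.

```python
from collections import deque

def stream_row(pulses, kernel_row):
    """Stream 0/1 pulses through one row of SCUs connected right to left.

    kernel_row[k] is the weight held by the k-th SCU, counted from the left.
    A new pulse enters the rightmost SCU each cycle while the latched pulses
    shift one position to the left.
    """
    k = len(kernel_row)
    regs = deque([0] * k, maxlen=k)  # regs[0] = leftmost, regs[-1] = rightmost
    outputs = []
    for n, p in enumerate(pulses):
        regs.append(p)  # shift left; the oldest pulse leaves the row
        if n >= k - 1:
            # Aggregate the weights of all SCUs that currently hold a pulse.
            outputs.append(sum(w for w, s in zip(kernel_row, regs) if s == 1))
    return outputs

row_sums = stream_row([1, 0, 1, 1, 0, 1], [1, 2, 3, 4, 5])
```

In the full 5×5 structure, five such rows would run in parallel and their sums would be added before being passed to the membrane potential calculation module.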
The foregoing detailed description sets out the technical solutions and advantages of the invention. It should be understood that the foregoing is merely illustrative of the presently preferred embodiments of the invention and is not intended to limit it; any modifications, additions, substitutions, and equivalents made within the scope of the principles of the invention are intended to be included within the scope of protection of the invention.

Claims (10)

1. A target tracking method based on event data and an impulse neural network, characterized by comprising the following steps:
acquiring event data, and performing time dimension compression processing of an event frame on the event data to obtain sample data;
adjusting the structure of a SiamFC network so that it can be converted into a pulse neural network, pre-training the SiamFC network with the sample data, and migrating the obtained weights and biases to the pulse neural network to optimize the pulse neural network;
acquiring real-time event data, adjusting the time step to obtain search data, and using the optimized pulse neural network to perform inference and comprehensive similarity estimation on the search data and the template data, so as to obtain a real-time target tracking result.
2. The method of claim 1, wherein the event data is asynchronously acquired by a dynamic vision sensor, and each event e includes the position (x, y) of the pixel, the event polarity p, and the time t, expressed as $e = [x, y, t, p]^T$, the event polarity p including positive events and negative events;
performing time dimension compression processing of an event frame on the event data comprises:
selecting the event data within a time period to form an event packet, and accumulating the number of positive events and the number of negative events at each target pixel point $(x_k, y_k)$; event data of different polarities are stored in different channels, yielding two-dimensional data containing the number of positive events and the number of negative events.
3. The method of claim 1, wherein adjusting the structure of the SiamFC network so that it conforms to a pulse neural network comprises: setting the output of each convolution layer in the SiamFC network to be positive through an absolute value function;
setting the bias of the convolution operation in each convolution layer in the SiamFC network to 0;
changing the max pooling operation in the SiamFC network to evaluate the firing rate of the neurons at each time step, i.e., calculating the absolute pulse firing rate accumulated over time, and passing on the pulses of the neuron with the maximum firing rate in a pooling window.
4. The method of claim 3, wherein the absolute pulse firing rate is calculated by:
$f_s(t) = f_s(t-1) + x_s(t)(T - t)$
where $f_s(t)$ denotes the firing rate of neuron s at time t, and $x_s(t)$ indicates whether neuron s fires a pulse at time t: $x_s(t) = 1$ indicates firing and $x_s(t) = 0$ indicates no firing. If a neuron fires a pulse, its firing rate is updated; otherwise it is not updated. $T$ denotes the time window length required for one inference.
5. The method of claim 1, wherein migrating the obtained weights and biases to the impulse neural network to optimize the impulse neural network comprises:
normalizing the weights and biases using the maximum activation value in the channel dimension of each layer, the maximum activation value being expressed as:
where i and j denote dimension indices and l denotes the layer index; the weight of layer l in the (i, j) dimension is normalized using the maximum activation value; for layers other than the first, the normalized activation value must first be multiplied by a restoring factor so that the input returns to its value before the previous layer's normalization, and the current layer is then normalized to obtain the normalized weight; the remaining symbols denote the normalized weight, the bias of layer l in the j-th dimension, and the normalized bias;
migrating the normalized weights and biases to the impulse neural network to optimize the impulse neural network.
6. The method of claim 1, wherein adjusting the time step for the real-time event data comprises: accumulating the real-time event data within each time step according to a preset time step length and a preset number of time steps N; for each of the N preset time steps, if the number of events at a pixel is greater than 0, the pixel value is marked as 1, otherwise it is marked as 0, thereby obtaining the search data.
7. The method of claim 1, wherein using the optimized impulse neural network to perform inference and comprehensive similarity estimation on the search data and the template data comprises:
inputting the search data and the template data into the two branches of the optimized pulse neural network respectively; during forward inference, the activation function of the last layer of each branch accumulates membrane potential over a given time without firing, so that the two pulse feature maps corresponding to the two inputs are obtained at the same time; the potential similarity estimate is calculated from the correlation between the membrane potentials corresponding to the two pulse feature maps, and is expressed as:
where the two feature-map symbols denote the pulse feature map corresponding to the template data at time step t and the pulse feature map corresponding to the search data at time step t, respectively, and $M_P(z, x)$ denotes the potential similarity estimate;
at the same time, the temporal correlation between the two pulse feature maps is calculated, with the time similarity estimate matched at each time step; it is expressed as:
where τ represents the response period;
in the conversion from the SiamFC network to the impulse neural network, the error is inversely proportional to time; the potential similarity estimate and the time similarity estimate are combined to obtain the comprehensive similarity estimate, which is expressed as:
where $f_{spike}(z, x)$ denotes the comprehensive similarity estimate, and $T$ denotes the time period.
8. The target tracking method based on event data and an impulse neural network according to claim 1, wherein the inference process of the impulse neural network is accelerated on an FPGA, and the impulse neural network is quantized to fixed point so that it can be converted into data that can be input into the FPGA, comprising:
performing floating-point to fixed-point conversion by taking $v(t) = V(t) \cdot 2^{\beta}$ as the neuron state voltage, $W_i = w_i \cdot 2^{\beta}$ as the synaptic weight, and $v_{thr} = V_{thr} \cdot 2^{\beta}$ as the neuron threshold, where $\beta$ is the integer exponent of the scaling factor, $V(t)$ denotes the original neuron voltage, and $w_i$ denotes the original neuron weight; the formula is converted into the following fixed-point form:
if $v(t) \ge v_{thr}$, a pulse is issued and $v(t)$ is reset to $v(t) - v_{thr}$;
if $v(t) < v_{min}$, $v(t)$ is reset to 0;
where $x_i(t-1)$ denotes the pulse input value of the i-th neuron of the previous layer at time $t-1$, and $i$ ranges over the neurons of the previous layer.
9. The target tracking method based on event data and a pulse neural network according to claim 1, wherein the FPGA-accelerated inference process of the pulse neural network comprises two parts, data transmission and network calculation, the data transmission part transmitting the pulse data to be fed to the network from the PC to the FPGA, and transmitting the pulse result computed by the FPGA back to the PC for display;
the network calculation part is implemented on the FPGA: after pulse data is input into the network, pulse data is output after computation through the established number of layers; the network calculation is concentrated in the pulse convolution layer, which is divided into a pulse convolution overall control module, a shift module, a calculation unit control module, a pulse calculation unit array, and a membrane potential calculation module;
the pulse convolution overall control module controls the data calculation inside the pulse convolution layer; the pulse calculation unit array is composed of a plurality of pulse calculation units, each of which accumulates the weights of a single convolution kernel based on the pulse data; the calculation unit control module controls the weight accumulation of the single convolution kernel; the shift module controls the shifting of pulse data between the pulse calculation units in the array; and the membrane potential calculation module performs the membrane potential calculation based on the accumulated weight sums of all convolution kernels in the pulse convolution layer.
10. The method according to claim 9, wherein one pulse calculation unit is provided for each convolution kernel; input pulse data is stored in a register inside the pulse calculation unit, each pulse datum containing the pulse information of as many neurons as there are channels; when the control signal output by the calculation unit control module indicates that calculation should start, all channels are traversed, and for each channel the input weight is accumulated into the weight sum or not according to the pulse of that channel in the register; once all channels have been traversed, the accumulated weight sum is output to the membrane potential calculation module, and the pulse data in the register is sent to other pulse calculation units.
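The firing-rate update recited in claim 4 and the pooling rule of claim 3 can be illustrated with a short sketch. It is a simplified model of a single pooling window; the function name `spiking_max_pool`, the zero-based time index, and the tie-breaking rule (the first neuron wins) are assumptions for the example.

```python
import numpy as np

def spiking_max_pool(spikes, T):
    """Pass on the pulses of the neuron with the highest accumulated rate.

    spikes: (T, n) array of 0/1 values for the n neurons of one pooling window.
    Returns the pooled output pulse train and the final firing rates.
    """
    rate = np.zeros(spikes.shape[1])
    out = np.zeros(T, dtype=np.uint8)
    for t in range(T):
        # f_s(t) = f_s(t-1) + x_s(t) * (T - t); unchanged when x_s(t) = 0.
        rate += spikes[t] * (T - t)
        # The neuron with the maximum firing rate gates the output.
        winner = int(np.argmax(rate))
        out[t] = spikes[t, winner]
    return out, rate

pooled, rates = spiking_max_pool(np.array([[0, 1], [1, 1], [1, 0]]), T=3)
```

The factor `(T - t)` weights early pulses more heavily, so a neuron that fires early keeps control of the pooling window even if another neuron fires later.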
CN202310725451.0A 2023-06-19 2023-06-19 Target tracking method based on event data and impulse neural network Pending CN116822592A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310725451.0A CN116822592A (en) 2023-06-19 2023-06-19 Target tracking method based on event data and impulse neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310725451.0A CN116822592A (en) 2023-06-19 2023-06-19 Target tracking method based on event data and impulse neural network

Publications (1)

Publication Number Publication Date
CN116822592A true CN116822592A (en) 2023-09-29

Family

ID=88113934

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310725451.0A Pending CN116822592A (en) 2023-06-19 2023-06-19 Target tracking method based on event data and impulse neural network

Country Status (1)

Country Link
CN (1) CN116822592A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117944043A (en) * 2023-11-22 2024-04-30 广州深度医疗器械科技有限公司 Robot control method and robot thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination