CN113568865B - Self-powered system-oriented integrated memory and calculation architecture and application method

Info

Publication number
CN113568865B
CN113568865B CN202110542683.3A CN202110542683A CN113568865B CN 113568865 B CN113568865 B CN 113568865B CN 202110542683 A CN202110542683 A CN 202110542683A CN 113568865 B CN113568865 B CN 113568865B
Authority
CN
China
Prior art keywords
energy
module
calculation
self
offline
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110542683.3A
Other languages
Chinese (zh)
Other versions
CN113568865A (en)
Inventor
邱柯妮
周坤雨
粟傈
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chen Renzhao
Original Assignee
Chen Renzhao
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chen Renzhao
Priority to CN202110542683.3A
Publication of CN113568865A
Application granted
Publication of CN113568865B
Active legal status
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00: Digital computers in general; Data processing equipment in general
    • G06F15/76: Architectures of general purpose stored program computers
    • G06F15/78: Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7839: Architectures of general purpose stored program computers comprising a single central processing unit with memory
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Hardware Design (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Supply And Distribution Of Alternating Current (AREA)
  • Power Sources (AREA)

Abstract

The invention discloses a self-powered system-oriented storage and calculation integrated architecture and a software optimization method. The architecture comprises an energy collection and management module, a CPU module and a storage and calculation integrated module, wherein the output end of the energy collection and management module is electrically connected with the input ends of the CPU module and the storage and calculation integrated module, the CPU module is bidirectionally electrically connected with the storage and calculation integrated module, and an STT-MRAM array is arranged in the storage and calculation integrated module. By relying on an accelerator module based on the STT-MRAM array, the edge device can run a binary neural network efficiently, which avoids the high data transmission cost incurred in existing methods, where the edge device wirelessly transmits a large amount of data to a higher-performance server for processing.

Description

Self-powered system-oriented integrated memory and calculation architecture and application method
Technical Field
The invention relates to the technical field of computer architectures and storage, in particular to a self-powered system-oriented storage and calculation integrated architecture and a software optimization method.
Background
Deep learning has shown excellent performance in intelligent applications such as natural language processing, computer vision and speech recognition. Applying deep learning to edge embedded devices makes those devices more intelligent and able to solve a wider range of problems. However, the high demands that deep learning algorithms place on memory capacity and computing power make complex neural network algorithms unsuitable for deployment on resource-limited devices. For devices based on the conventional von Neumann architecture, the "memory wall" problem, caused by the large gap between CPU processing speed and memory access speed, and the "power wall" problem, caused by large amounts of data movement and by the leakage power of the storage medium, further limit the possibility of applying deep learning to edge devices.
In addition, with the development of energy harvesting technology, self-powered devices can be supplied, without a battery, by collecting energy from the surrounding environment (such as solar energy, wind energy, radio frequency energy, thermal energy and the kinetic energy of the human body). Such devices are green and economical and need no battery replacement, maintenance or charging. However, harvested environmental energy is unstable, and how to make use of this unstable energy remains a major challenge.
When an edge device deploys a neural network algorithm, the conventional approach is to transmit a large amount of data to a higher-performance computer for processing. Data transmission, however, requires more energy than storage or computation and introduces delay, so this approach is unsuitable for devices with limited power, energy and bandwidth. Local intelligent processing on edge terminal devices in self-powered scenarios therefore faces great challenges.
Disclosure of Invention
It is an object of the present invention to solve at least the above problems and to provide at least the advantages to be described later.
Another object of the invention is to provide a storage and calculation integrated processing architecture and a software optimization method with which an edge device can run a binary neural network efficiently by relying on an STT-MRAM-based accelerator module, thereby avoiding the high data transmission cost incurred in the prior art, where the edge device wirelessly transmits a large amount of data to a higher-performance server for processing.
To achieve the above object and some other objects, the present invention adopts the following technical solutions:
A storage and calculation integrated processing architecture, comprising:
an energy collection and management module, a CPU module and a storage and calculation integrated module, wherein the output end of the energy collection and management module is electrically connected with the input ends of the CPU module and the storage and calculation integrated module, and the CPU module is bidirectionally electrically connected with the storage and calculation integrated module;
an STT-MRAM array is arranged in the storage and calculation integrated module; an energy collector and an energy management unit are arranged in the energy collection and management module; the energy management unit comprises an energy storage capacitor and a DC/DC converter, and the output end of the energy storage capacitor is connected with the input end of the DC/DC converter; the output end of the energy collector is electrically connected with the input end of the energy management unit; the energy collector comprises a photovoltaic solar panel, a wind power generation module, a wireless radio frequency charging module, a kinetic energy generation module and a thermal energy generation module, whose output ends are all electrically connected with the input end of the energy management unit; the STT-MRAM arrays are distributed in a crossed arrangement inside the storage and calculation integrated module, and 1T1MTJ units are arranged in each STT-MRAM array;
the self-powered system-oriented integrated memory and calculation processing architecture comprises a software optimization module, wherein the software optimization module comprises the following two parts:
Offline modeling part: the power consumption and delay of completing the binary convolution calculation with various logic combinations are analyzed offline, and an offline decision table is derived from the analysis results and the energy levels. The offline decision table contains, for each energy level, the optimal logic combination to execute together with its power consumption and delay, so that execution decisions can be made as the energy fluctuates during online simulation. The neural network used for offline modeling is a two-layer convolutional neural network, namely a LeNet network;
Online simulation part: the offline decision table and an energy trace table are taken as the inputs of the online simulation. The energy trace table simulates an unstable self-powered scenario, and the offline decision table serves as the basis for adapting to energy changes. The simulation targets the execution of a binary neural network on the in-memory processing architecture in a self-powered system;
in the offline modeling section, the establishment of the offline decision table includes the steps of:
step one: acquiring an energy trace, dividing energy levels according to the characteristics of the energy trace, and determining the energy level interval and the number of energy levels;
Step two: according to the obtained power consumption, delay and divided energy levels, obtaining logic combinations adapted to different energy levels, further obtaining an offline decision table, and establishing the offline decision table to provide input for online simulation;
in the online simulation part, the judging method of the logic combination comprises the following steps:
Step one: firstly, traversing an energy trace table, judging the energy level of the current energy, selecting a proper logic combination according to an offline decision table, and executing binary neural network convolution calculation so as to adapt to the energy fluctuation problem of a self-powered scene;
Step two: when the energy level is low, a logic combination with low power consumption is selected for execution, so that scarce energy is not wasted; when the energy level is higher, a logic combination with lower delay can be selected; when the energy is higher still, parallel execution can be selected, so that as much of the available energy as possible is used;
Step three: once the trace table has been fully traversed, the energy efficiency and throughput of the adopted architecture are calculated.
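The three online steps can be sketched in Python as a simple simulation loop. This is only an illustration under stated assumptions: the decision-table entries (power in μW, delay in μs), the sampling period, the names `energy_level` and `simulate`, and the definition of energy efficiency as operations per microjoule are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Decision:
    logic: str        # name of the logic combination to execute
    power_uw: float   # power drawn while executing, in microwatts
    delay_us: float   # time to finish one convolution pass, in microseconds

# Hypothetical offline decision table: energy level -> decision.
# None means the energy is too low to execute (data is backed up instead).
DECISION_TABLE = {
    1: None,
    2: Decision("low-power combination", 150.0, 40.0),
    3: Decision("low-delay combination", 350.0, 20.0),
    4: Decision("parallel execution", 600.0, 10.0),
}

def energy_level(power_uw, step_uw=200.0, levels=4):
    """Step one: map a sampled power to an energy level (equal-width intervals)."""
    return min(int(power_uw // step_uw) + 1, levels)

def simulate(trace_uw, period_us=100.0, ops_per_pass=150):
    """Traverse the energy trace; return (throughput in ops/us, ops per uJ)."""
    total_ops = 0
    total_energy_uj = 0.0
    for power in trace_uw:
        decision = DECISION_TABLE[energy_level(power)]   # step two
        if decision is None:
            continue  # nothing executes in this sampling period
        passes = int(period_us // decision.delay_us)
        total_ops += passes * ops_per_pass
        # 1 uW * 1 us = 1e-6 uJ
        total_energy_uj += decision.power_uw * passes * decision.delay_us * 1e-6
    elapsed_us = period_us * len(trace_uw)
    throughput = total_ops / elapsed_us if elapsed_us else 0.0      # step three
    efficiency = total_ops / total_energy_uj if total_energy_uj else 0.0
    return throughput, efficiency
```

Calling `simulate([50, 820, 360, 550])` walks a four-sample trace; the absolute numbers it returns depend entirely on the placeholder table above.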
The application method is based on the self-powered system oriented storage and calculation integrated processing architecture of claim 1, and comprises the following steps:
Step 1: based on STT-MRAM, a reconfigurable in-memory processing architecture oriented to self-powered scenarios is designed to support the efficient operation of a binary neural network;
Step 2: since the binary convolution calculation can be realized by different logic combinations, the convolution calculation of the binary neural network is mapped to the hardware platform;
Step 3: an adaptive software optimization method is applied to adapt to energy fluctuation, so that as much of the available energy as possible is used.
Preferably, in the step 2, the mapping manner of the binary neural network and the hardware platform includes the following steps:
Step 2.1: exploiting the reconfigurability of the hardware architecture, the multiply-accumulate operations of the binary neural network are completed with XOR logic, XNOR logic, or combinational logic built from AND, OR and NOT;
Step 2.2: different ways of mapping the binary neural network computation onto the hardware architecture are obtained;
Step 2.3: each mapping mode corresponds to a different system power consumption, which provides the basis for the offline modeling of the adaptive software optimization method.
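Why XOR/XNOR logic can stand in for multiply-accumulate in a binary neural network can be illustrated with a short Python sketch. It assumes the usual binarized-network convention that weights and activations take values in {-1, +1} and are stored as bits (1 for +1, 0 for -1); the patent does not spell out its encoding, so this convention and the function names are assumptions. Under it, the dot product of two n-element vectors equals n minus twice the number of bit positions where the operands differ.

```python
def encode(values):
    """Pack a sequence of -1/+1 values into an integer bit mask (bit i = 1 for +1)."""
    mask = 0
    for i, v in enumerate(values):
        if v == 1:
            mask |= 1 << i
    return mask

def binary_dot(a_bits, b_bits, n):
    """Dot product of two n-element {-1,+1} vectors using XOR and a bit count."""
    mismatches = bin(a_bits ^ b_bits).count("1")  # positions whose product is -1
    return n - 2 * mismatches

def reference_dot(a, b):
    """Ordinary multiply-accumulate, used only to check the bitwise version."""
    return sum(x * y for x, y in zip(a, b))

a = [1, -1, 1, 1, -1]
b = [1, 1, -1, 1, -1]
result = binary_dot(encode(a), encode(b), len(a))
```

Here `result` agrees with `reference_dot(a, b)`; using XNOR instead of XOR counts matching positions and gives the same value as `2 * matches - n`.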
The invention at least comprises the following beneficial effects:
1. The invention provides an energy collection and management module, a CPU module and a storage and calculation integrated module. STT-MRAM arrays are distributed in a crossed arrangement inside the storage and calculation integrated module, and 1T1MTJ units are arranged in each array. Each 1T1MTJ unit supports AND, OR, NOT and XOR logic, and different logic functions can be realized by combining multiple 1T1MTJ units; this reconfigurability provides hardware support for adapting to energy fluctuation. The module thus assists the CPU in processing data locally, directly connected to the edge device, with low energy consumption. The proposed in-memory processing architecture and adaptive software optimization method allow a binary neural network to run efficiently on the architecture. Non-volatile STT-MRAM ensures that data is not lost when the device loses power, and fluctuating energy is fully utilized, so that a self-powered embedded device can complete intelligent inference locally at the edge without relying on the cloud, which reduces the pressure on network transmission. This addresses the problem in existing approaches, where the edge device wirelessly transmits a large amount of data to a higher-performance computer for processing, even though data transmission requires more energy than storage or computation and introduces delay.
2. The energy collection and management module supplies energy to the storage and calculation integrated module and the CPU module. It comprises an energy collector and an energy management unit; the collector comprises a photovoltaic solar panel, a wind power generation module, a wireless radio frequency charging module, a kinetic energy generation module and a thermal energy generation module, which convert wind, solar, radio frequency, kinetic and thermal energy into electric energy for the storage and calculation integrated module and the CPU module. The system is thus self-powered, energy-saving and environmentally friendly, and avoids cumbersome battery maintenance. It is widely applicable and can meet the needs of various edge devices such as smart bracelets, wearable devices, and wildlife detection and exploration tools.
Drawings
FIG. 1 is a schematic diagram of the storage and calculation integrated processing architecture provided by the invention;
FIG. 2 is a process diagram of the binary neural network convolution calculation provided by the invention;
FIG. 3 is a storage schematic of a 1T1MTJ cell in the STT-MRAM array provided by the invention;
FIG. 4 is a schematic diagram of how the in-memory computing array for accelerating a binary neural network provided by the invention implements logic operations;
FIG. 5 is a diagram of the three calculation mapping modes of the binary convolutional neural network provided by the invention;
FIG. 6 is an example of an environmental energy sampling trace provided by the invention;
FIG. 7 is the offline decision table of the first-layer convolution operation provided by the invention.
Detailed Description
The present invention is described in detail below with reference to the drawings so as to enable one of ordinary skill in the art to practice the same after having read the specification.
As shown in FIGS. 1-7, a storage and calculation integrated processing architecture comprises an energy collection and management module 1, a CPU module 2 and a storage and calculation integrated module 3. The output end of the energy collection and management module 1 is electrically connected with the input ends of the CPU module 2 and the storage and calculation integrated module 3, and the CPU module 2 is bidirectionally electrically connected with the storage and calculation integrated module 3. STT-MRAM arrays 12 are arranged and distributed inside the storage and calculation integrated module 3, and 1T1MTJ units are arranged inside each STT-MRAM array 12. An energy collector 4 and an energy management unit are arranged inside the energy collection and management module 1. The energy collector 4 comprises a photovoltaic solar panel 5, a wind power generation module 6, a wireless radio frequency charging module 7, a kinetic energy generation module 8 and a thermal energy generation module 9, whose output ends are all electrically connected with the input end of the energy management unit. The energy management unit comprises an energy storage capacitor 10 and a DC/DC converter 11; the output end of the energy storage capacitor 10 is connected with the input end of the DC/DC converter 11, and the output end of the energy collector 4 is electrically connected with the input end of the energy management unit. The self-powered system-oriented storage and calculation integrated processing architecture includes a software optimization module, which comprises the following two parts:
Offline modeling part: the power consumption and delay of completing the binary convolution calculation with various logic combinations are analyzed offline, and an offline decision table is derived from the analysis results and the energy levels. The offline decision table contains, for each energy level, the optimal logic combination to execute together with its power consumption and delay, so that execution decisions can be made as the energy fluctuates during online simulation;
Online simulation part: the offline decision table and an energy trace table are taken as the inputs of the online simulation. The energy trace table simulates an unstable self-powered scenario, and the offline decision table serves as the basis for adapting to energy changes. The simulation targets the execution of a binary neural network on the in-memory processing architecture in a self-powered system.
In the offline modeling section, the establishment of the offline decision table includes the steps of:
step one: acquiring an energy trace, dividing energy levels according to the characteristics of the energy trace, and determining the energy level interval and the number of energy levels;
Step two: according to the obtained power consumption, delay and divided energy levels, the logic combinations suited to the different energy levels are obtained and the offline decision table is built; the offline decision table provides the input for online simulation.
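The two offline steps can be sketched in Python as follows. The patent does not state the rule used to pick the "optimal" combination for a level, so the rule here (choose the lowest-delay candidate whose power does not exceed the lower bound of the level, or nothing if none fits) is an assumption, as are the candidate names and their power and delay figures.

```python
from typing import NamedTuple

class Candidate(NamedTuple):
    logic: str        # hypothetical name of a logic combination
    power_uw: float   # power needed to execute it, in microwatts
    delay_us: float   # delay to complete it, in microseconds

def divide_levels(step_uw, levels):
    """Step one: equal-width intervals; the top level is open-ended."""
    bounds = [(i * step_uw, (i + 1) * step_uw) for i in range(levels - 1)]
    bounds.append(((levels - 1) * step_uw, float("inf")))
    return bounds

def build_decision_table(candidates, bounds):
    """Step two: per level, the fastest candidate that the level can always power."""
    table = {}
    for level, (low, _high) in enumerate(bounds, start=1):
        feasible = [c for c in candidates if c.power_uw <= low]
        table[level] = min(feasible, key=lambda c: c.delay_us) if feasible else None
    return table

# Placeholder candidates, not measurements from the patent.
CANDIDATES = [
    Candidate("combination A", 150.0, 40.0),
    Candidate("combination B", 350.0, 20.0),
    Candidate("combination C", 550.0, 10.0),
]

bounds = divide_levels(step_uw=200.0, levels=4)
table = build_decision_table(CANDIDATES, bounds)
```

With a 200 μW step and four levels, `bounds` reproduces the 0-200, 200-400, 400-600 and above-600 μW intervals used in the embodiment, and level 1 maps to `None` because no candidate can run on a budget that may be close to zero.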
In the above scheme, the collectors in the energy collection and management module are a photovoltaic solar panel, a wind power generation module, a wireless radio frequency charging module, a kinetic energy generation module and a thermal energy generation module. When applied to an edge device, they convert environmental energy (wind, solar, radio frequency, kinetic, thermal and the like) into electric energy, which is stored in the energy storage capacitor of the energy management unit, converted by the DC/DC converter, and then supplied to the CPU module and the storage and calculation integrated module. The system is thus self-powered, which is green and economical, saves energy, and avoids battery replacement, charging and cumbersome maintenance.
In this architecture, the CPU module serves as the main control module, and the storage and calculation integrated module handles storage. The storage and calculation integrated module is a reconfigurable binary neural network accelerator module based on STT-MRAM and is bidirectionally electrically connected with the CPU module. STT-MRAM arrays are distributed in a crossed arrangement inside the accelerator module, and 1T1MTJ units are arranged in each array. Each 1T1MTJ unit supports AND, OR, NOT and XOR logic, and different logic functions can be realized by combining multiple 1T1MTJ units, so the reconfigurability of the array provides hardware support for adapting to energy fluctuation. The module can therefore process data in place of a remote computer while directly connected to the edge device locally, with low energy consumption and without network transmission delay.
The buck-boost DC/DC converter is an LTC3129. The number of converters depends on how many environmental energy sources need to be converted, so as to keep the power supply stable. The device has an accurate RUN pin threshold and a maximum power point control function, provides voltage regulation, and ensures that maximum power is drawn from the energy collector.
The offline modeling part first obtains the power required for a 1T1MTJ unit in the STT-MRAM array to execute each logic operation and the delay required to complete it. The powers required to execute XOR, AND, OR and NOT logic are P_xor, P_and, P_or and P_not respectively, and the delays of completing XOR, AND, OR and NOT logic are T_xor, T_and, T_or and T_not respectively. Next, the environmental energy source is selected and divided into energy levels; the environmental energy used here is a home WiFi signal, as shown in FIG. 6. The offline decision table is then designed from the power required to execute each logic operation, the delay of completing it, and the energy level division, by analyzing offline the power and delay of the various logic combinations that realize XOR. The online simulation part uses the decision table generated by offline modeling, shown in FIG. 7, and is illustrated with four energy sampling periods whose sampled powers are 50 μW, 820 μW, 360 μW and 550 μW.
The neural network used for offline modeling is a two-layer convolutional neural network, the LeNet network. The first-layer convolution kernel is 6x5x5x1 and the second-layer convolution kernel is 16x5x5x6. After the LeNet network is binarized, one computation of the first layer requires 150 XOR operations and one computation of the second layer requires 2400 XOR operations. According to the sampling trace of the home WiFi environmental energy in FIG. 6, the harvested power ranges from 0 to 1000 μW and is divided into 4 energy levels: energy level 1 is 0-200 μW, energy level 2 is 200-400 μW, energy level 3 is 400-600 μW, and energy level 4 is above 600 μW.
According to the three logic mapping modes between the binary neural network and the hardware platform: with the first logic, the maximum power required to execute the first-layer convolution is 150 P_xor with a completion delay of T_xor, and the power required for the second-layer convolution is 2400 P_xor with a completion delay of T_xor; with the second logic, the maximum power required to execute the first-layer convolution is 150 P_and with a completion delay of T_and, and the power required for the second-layer convolution is 2400 P_and with a completion delay of T_and; with the third logic, the maximum power required to execute the first-layer convolution is 150 P_or with a completion delay of T_or, and the power required for the second-layer convolution is 2400 P_or with a completion delay of T_or. In the online simulation part, the method for judging the logic combination comprises the following steps:
Step one: firstly, traversing an energy trace table, judging the energy level of the current energy, selecting a proper logic combination according to an offline decision table, and executing binary neural network convolution calculation so as to adapt to the energy fluctuation problem of a self-powered scene;
Step two: when the energy level is low, a logic combination with low power consumption is selected for execution, so that scarce energy is not wasted; when the energy level is higher, a logic combination with lower delay can be selected; when the energy is higher still, parallel execution can be selected, so that as much of the available energy as possible is used;
Step three: once the trace table has been fully traversed, the energy efficiency and throughput of the adopted architecture are calculated.
When the first energy sampling period is entered, the layer whose convolution operation was completed last is obtained first; the sampling power is then obtained as 50 μW and judged to belong to energy level 1; according to the decision table, execution cannot continue at this energy level, and the data is backed up.
When the second energy sampling period is entered, the layer whose convolution operation was completed last is obtained first, and it is judged that the first-layer convolution operation should be executed; the sampling power is then obtained as 820 μW and judged to belong to energy level 4; according to the decision table of the first-layer convolution operation, execution can continue at this energy level, and the corresponding logic is selected to execute the first-layer convolution operation of the LeNet network. If the sampling period has not elapsed when the first-layer convolution operation finishes, the second-layer convolution operation is executed according to the decision table of the second-layer convolution operation and the combinational logic corresponding to the energy level, and this process repeats until the next sampling period is entered.
When the third energy sampling period is entered, the layer whose convolution operation was completed last is obtained first, and it is judged that the second-layer convolution operation should be executed; the sampling power is then obtained as 360 μW and judged to belong to energy level 2; according to the decision table of the second-layer convolution operation, the corresponding logic is selected to execute the second-layer convolution operation of the LeNet network. If the sampling period has not elapsed when the second-layer convolution operation finishes, the first-layer convolution operation is executed according to the decision table of the first-layer convolution operation and the combinational logic corresponding to the energy level, and this process repeats until the next sampling period is entered.
When the fourth energy sampling period is entered, the layer whose convolution operation was completed last is obtained first, and it is judged that the first-layer convolution operation should be executed; the sampling power is then obtained as 550 μW and judged to belong to energy level 3; according to the decision table of the first-layer convolution operation, execution can continue at this energy level, and the corresponding logic is selected to execute the first-layer convolution operation of the LeNet network. If the sampling period has not elapsed when the first-layer convolution operation finishes, the second-layer convolution operation is executed according to the decision table of the second-layer convolution operation and the combinational logic corresponding to the energy level, and this process repeats until the next sampling period is entered.
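The four sampling periods above can be replayed with a small Python sketch. It uses the sampled powers and energy level boundaries given in the description, but makes one simplifying assumption that is not in the text: each executable period completes exactly one layer, whereas the description allows further layers to run if time remains in the period. The function names are illustrative.

```python
SAMPLES_UW = [50, 820, 360, 550]          # sampled powers from the description
LEVEL_UPPER_BOUNDS_UW = [200, 400, 600]   # level 4 is anything above 600 uW

def classify(power_uw):
    """Return the energy level (1-4) of a sampled power."""
    for level, upper in enumerate(LEVEL_UPPER_BOUNDS_UW, start=1):
        if power_uw < upper:
            return level
    return len(LEVEL_UPPER_BOUNDS_UW) + 1

def replay(samples):
    """Record, per sampling period, the energy level and the action taken."""
    next_layer = 1   # layer to execute next, derived from the last completed layer
    log = []
    for power in samples:
        level = classify(power)
        if level == 1:
            log.append((power, level, "backup"))   # cannot execute: back up data
            continue
        log.append((power, level, f"layer {next_layer}"))
        next_layer = 2 if next_layer == 1 else 1   # the two LeNet layers alternate
    return log

log = replay(SAMPLES_UW)
```

Under this simplification the log shows a backup in the first period, then the first, second and first layers in the following periods, which matches the order in which the description says the layers are due.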
An application method is based on the self-powered system-oriented storage and calculation integrated processing architecture of claim 1, and comprises the following steps:
Step 1: based on STT-MRAM, a reconfigurable in-memory processing architecture oriented to self-power supply scene is designed to support the efficient operation of a binary neural network;
Step 2: the binary convolutional calculation can be realized by different logic combinations, and the binary neural network convolutional calculation is mapped to a hardware platform;
Step 3: an adaptive software optimization method is applied to adapt to energy fluctuation, so that as much of the available energy as possible is used.
In the above scheme, an in-memory computing architecture is first adopted to realize efficient binary neural network computation. For the accelerator module, an in-memory processing platform based on spin-transfer torque magnetic random access memory (STT-MRAM) is adopted. FIG. 4 shows the principle by which the platform realizes AND, OR, NOT and XOR logic: each 1T1MTJ cell can execute a single logic operation, and a control signal C switches the cell among AND, OR, NOT and XOR logic. The STT-MRAM array of the hardware platform is formed by a plurality of 1T1MTJ cells and is reconfigurable: by configuring the array, multiple 1T1MTJ cells can be combined to realize more complex logic operations.
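The idea of a cell whose function is selected by a control signal, and of combining several cells into more complex logic, can be modeled at the truth-table level in Python. This is purely a behavioral sketch: the string encoding of the control signal and the function names are assumptions, and nothing here models the sensing circuitry of a real 1T1MTJ cell.

```python
# Truth-table model of one cell; the control signal selects the function.
CELL_LOGIC = {
    "AND": lambda a, b: a & b,
    "OR":  lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
    "NOT": lambda a, b=0: 1 - a,   # the second operand is ignored
}

def cell(control, a, b=0):
    """Evaluate one cell configured by the (hypothetical) control signal."""
    return CELL_LOGIC[control](a, b)

def xor_from_basic_cells(a, b):
    """XOR composed from four cells: (a OR b) AND NOT (a AND b)."""
    return cell("AND", cell("OR", a, b), cell("NOT", cell("AND", a, b)))

# Compare the composed XOR with a single cell configured as XOR.
agreement = all(
    xor_from_basic_cells(a, b) == cell("XOR", a, b)
    for a in (0, 1)
    for b in (0, 1)
)
```

The composed version uses four cell evaluations where the directly configured cell uses one, which is the kind of trade-off between logic combinations that the offline power and delay analysis is meant to capture.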
In a preferred embodiment, in step 2, the mapping between the binary neural network and the hardware platform comprises the following steps:
Step 2.1: exploiting the reconfigurability of the hardware architecture, the multiply-accumulate operations of the binary neural network are completed using XOR, XNOR, or combinations of AND, OR and NOT logic;
Step 2.2: different mapping modes for mapping the binary neural network calculation onto the hardware architecture are obtained;
Step 2.3: each mapping mode corresponds to a different system power consumption, which provides a basis for the offline modeling of the adaptive software optimization method.
In the above scheme, the convolution calculation of the binary neural network can be implemented with XOR, and each 1T1MTJ cell of the accelerator module supports XOR, AND, OR and NOT logic, so there are multiple ways to map the convolution calculation onto the array. As shown in fig. 5, three mapping modes are implemented. The first maps the calculation directly to the first column, each cell of which supports XOR logic. The second is an XOR-NOT mode, in which columns 2 to 6 form the combinational logic that realizes the XOR logic. The third is also an XOR-NOT mode, in which columns N-7 to N form the combinational logic that realizes the XOR logic.
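To make the equivalence between mapping modes concrete, the following Python sketch computes a binary dot product (the core of a binary convolution) once with a direct XOR and once with an XOR assembled from AND, OR and NOT. It is a sketch under assumptions: values of +1 and -1 are encoded as bits 1 and 0, the function names are illustrative, and the gate decomposition shown is one standard choice that is not claimed to match the column layout of fig. 5.

```python
def xor_direct(a, b):
    """XOR evaluated as a single operation (direct mapping)."""
    return a ^ b


def xor_from_and_or_not(a, b):
    """XOR built from simpler gates: (a AND NOT b) OR (NOT a AND b)."""
    not_a, not_b = a ^ 1, b ^ 1
    return (a & not_b) | (not_a & b)


def binary_dot(acts, weights, xor):
    """Dot product of two +1/-1 vectors encoded as bits (1 -> +1, 0 -> -1).

    Each mismatching bit pair contributes -1 and each matching pair +1,
    so the result is n - 2 * (number of mismatches).
    """
    n = len(acts)
    mismatches = sum(xor(a, w) for a, w in zip(acts, weights))
    return n - 2 * mismatches


def reference_dot(acts, weights):
    """Plain arithmetic reference using explicit +1/-1 values."""
    def to_pm1(bit):
        return 2 * bit - 1
    return sum(to_pm1(a) * to_pm1(w) for a, w in zip(acts, weights))
```

Both XOR variants give the same numerical result; in hardware they would differ in the number of cells used and therefore in power consumption and delay, which is the trade-off that the offline modeling evaluates.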
Although embodiments of the present invention have been disclosed above, the invention is not limited to the details and embodiments shown. It is suited to various fields of use, and further modifications will be readily apparent to those skilled in the art. Therefore, without departing from the general concepts defined by the claims and their equivalents, the invention is not limited to the specific details and illustrations shown and described herein.

Claims (3)

1. A self-powered system-oriented integrated memory and calculation processing architecture, comprising:
an energy collection and management module, a CPU module and an integrated memory and calculation module, wherein the output end of the energy collection and management module is electrically connected with the input ends of the CPU module and the integrated memory and calculation module, and the CPU module is in bidirectional electrical connection with the integrated memory and calculation module; an energy collector and an energy management unit are arranged in the energy collection and management module; the energy management unit comprises an energy storage capacitor and a DC/DC converter, and the output end of the energy storage capacitor is connected with the input end of the DC/DC converter; the output end of the energy collector is electrically connected with the input end of the energy management unit; the energy collector comprises a photovoltaic solar panel, a wind power generation module, a wireless radio frequency charging module, a kinetic energy generation module and a thermal energy generation module, the output ends of which are each electrically connected with the input end of the energy management unit; STT-MRAM arrays are distributed inside the integrated memory and calculation module, and 1T1MTJ cells are arranged in each STT-MRAM array;
the self-powered system-oriented integrated memory and calculation processing architecture comprises a software optimization module, wherein the software optimization module comprises the following two parts:
Offline modeling part: the power consumption and delay with which each logic combination completes the binary convolution calculation are analyzed offline, and an offline decision table is derived from the analysis results and the energy levels, wherein the offline decision table contains the optimal execution logic combination, power consumption and delay corresponding to each energy level, so that it provides execution decisions as the energy fluctuates during online simulation; the neural network adopted for offline modeling is a convolutional neural network with two convolution layers: a LeNet network;
Online simulation part: the offline decision table and an energy trace table are taken as inputs of the online simulation, wherein the energy trace table is used to simulate an unstable self-powered scenario and the offline decision table serves as the basis for adapting to energy changes; the simulation models the execution of the binary neural network on the in-memory processing architecture in a self-powered system;
in the offline modeling part, the establishment of the offline decision table includes the following steps:
Step one: an energy trace is acquired, energy levels are divided according to the characteristics of the energy trace, and the energy level interval and the number of energy levels are determined;
Step two: according to the obtained power consumption and delay and the divided energy levels, the logic combinations adapted to the different energy levels are obtained, and the offline decision table is thereby established to provide input for the online simulation;
in the online simulation part, the method for selecting the logic combination comprises the following steps:
Step one: the energy trace table is traversed, the energy level of the current energy is determined, a suitable logic combination is selected according to the offline decision table, and the binary neural network convolution calculation is executed, thereby adapting to the energy fluctuation of the self-powered scenario;
Step two: when the energy level is low, a logic combination with low power consumption is selected for execution, so that energy is not wasted when it is scarce; when the energy level is higher, a logic combination with lower delay can be selected for execution; when the energy is higher still, parallel execution can be selected, so that as much of the energy as possible is used;
Step three: the trace table is traversed to obtain the energy efficiency and throughput of the adopted architecture.
2. An application method of a self-powered system-oriented integrated memory and calculation processing architecture, characterized in that the application method is based on the self-powered system-oriented integrated memory and calculation processing architecture according to claim 1 and comprises the following steps:
Step 1: an STT-MRAM-based integrated memory and calculation processing architecture is designed to support the efficient operation of the binary neural network;
Step 2: since binary convolution can be realized by different logic combinations, the convolution calculation of the binary neural network is mapped onto the hardware platform;
Step 3: an adaptive software optimization method is applied to adapt to the fluctuation of energy, so that as much of the energy as possible is used.
3. The application method of the self-powered system-oriented integrated memory and calculation processing architecture according to claim 2, wherein in step 2, the mapping between the binary neural network and the hardware platform comprises the following steps:
Step 2.1: exploiting the reconfigurability of the hardware architecture, the multiply-accumulate operations of the binary neural network are completed using XOR, XNOR, or combinations of AND, OR and NOT logic;
Step 2.2: different mapping modes for mapping the binary neural network calculation onto the hardware architecture are obtained;
Step 2.3: each mapping mode corresponds to a different system power consumption, which provides a basis for the offline modeling of the adaptive software optimization method.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110542683.3A CN113568865B (en) 2021-05-18 2021-05-18 Self-powered system-oriented integrated memory and calculation architecture and application method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110542683.3A CN113568865B (en) 2021-05-18 2021-05-18 Self-powered system-oriented integrated memory and calculation architecture and application method

Publications (2)

Publication Number Publication Date
CN113568865A CN113568865A (en) 2021-10-29
CN113568865B true CN113568865B (en) 2024-05-14

Family

ID=78161566

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110542683.3A Active CN113568865B (en) 2021-05-18 2021-05-18 Self-powered system-oriented integrated memory and calculation architecture and application method

Country Status (1)

Country Link
CN (1) CN113568865B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107667325A (en) * 2015-06-26 2018-02-06 英特尔公司 Opportunistic power management for managing intermittent power available to a data processing device having semi-non-volatile memory or non-volatile memory
CN108512894A (en) * 2018-02-05 2018-09-07 集能芯成科技(北京)有限公司 Distributed load balancing method and system for self-powered sensor networks
CN110998486A (en) * 2017-09-01 2020-04-10 高通股份有限公司 Ultra-low power neuromorphic artificial intelligence computing accelerator
CN111737053A (en) * 2020-06-22 2020-10-02 山东大学 Instruction analysis-based nonvolatile processor backup method and system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10591902B2 (en) * 2016-01-03 2020-03-17 Purdue Research Foundation Microcontroller energy management system
US20210004265A1 (en) * 2020-09-18 2021-01-07 Francesc Guim Bernat Elastic power scaling

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Keni Qiu et al. ResiRCA: A Resilient Energy Harvesting ReRAM Crossbar-Based Accelerator for Intelligent Embedded Processors. 2020 IEEE International Symposium on High Performance Computer Architecture (HPCA). 2020, pp. 1-13. *
Mimi Xie et al. A Novel STT-RAM-Based Hybrid Cache for Intermittently Powered Processors in IoT Devices. IEEE Micro (Volume: 39, Issue: 1, Jan.-Feb. 2019). 2018, pp. 1-9. *
Yu Pan et al. A Multilevel Cell STT-MRAM-Based Computing In-Memory Accelerator for Binary Convolutional Neural Network. IEEE Transactions on Magnetics. 2018, pp. 1-5. *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240131

Address after: 518000 1002, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province

Applicant after: Shenzhen Wanzhida Technology Co.,Ltd.

Country or region after: China

Address before: 100089 No.105, Xisanhuan North Road, Haidian District, Beijing

Applicant before: Capital Normal University

Country or region before: China

TA01 Transfer of patent application right

Effective date of registration: 20240415

Address after: No. 88, Lishan, Daiyun Village, Chishui Town, Dehua County, Quanzhou City, Fujian Province, 362000

Applicant after: Chen Renzhao

Country or region after: China

Address before: 518000 1002, Building A, Zhiyun Industrial Park, No. 13, Huaxing Road, Henglang Community, Longhua District, Shenzhen, Guangdong Province

Applicant before: Shenzhen Wanzhida Technology Co.,Ltd.

Country or region before: China

GR01 Patent grant