CN113554159A - Method and apparatus for implementing artificial neural networks in integrated circuits - Google Patents


Info

Publication number: CN113554159A
Application number: CN202110435481.9A
Authority: CN (China)
Prior art keywords: neural network, format, data, representation format, layer
Legal status: Pending (the status listed is an assumption by Google, not a legal conclusion)
Other languages: Chinese (zh)
Inventors: L·福里奥特, P·德马雅
Current and original assignee: STMicroelectronics Rousset SAS
Priority claimed from: French application FR2004070A (granted as FR3109651B1)
Application filed by: STMicroelectronics Rousset SAS
Publication of CN113554159A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/048 Activation functions


Abstract

Embodiments of the present disclosure relate to methods and apparatus for implementing artificial neural networks in integrated circuits. An embodiment method comprises: obtaining an initial digital file representing a neural network configured according to at least one data representation format; detecting at least one format used to represent at least part of the data of the neural network; converting the detected at least one representation format into a predefined representation format, thereby obtaining a modified digital file representing the neural network; and integrating the modified digital file into a memory of the integrated circuit.

Description

Method and apparatus for implementing artificial neural networks in integrated circuits
Cross Reference to Related Applications
The present application claims priority from French application No. 2004070, filed on April 23, 2020, which is hereby incorporated by reference.
Technical Field
Embodiments and implementations relate to artificial neural network devices and methods, and more particularly to their implementation in integrated circuits.
Background
Artificial neural networks typically include a series of layers of neurons.
Each layer receives input data, to which weights are applied, and outputs output data after processing by the activation functions of the neurons of the layer. These output data are transmitted to the next layer in the neural network.
The weights are data of the neurons, more particularly parameters of the neurons, which are configured so as to obtain good output data.
The weights are adjusted during a learning phase, which is typically supervised, in particular by executing the neural network with already-classified data from a reference database as input data.
Neural networks can be quantized to speed up their execution and reduce memory requirements. In particular, quantizing a neural network consists in defining the data representation formats of the neural network, such as those of the weights and of the inputs and outputs of each layer.
In particular, the neural network is quantized according to an integer representation format. However, there are many possible integer representation formats: integers may be represented according to signed or unsigned, symmetric or asymmetric representations. In addition, data of the same neural network may be represented in different integer representation formats.
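As a hypothetical illustration (the scales and zero points below are made-up example values, not from the patent), the signed/unsigned and symmetric/asymmetric variants can all be written with the same affine quantization rule:

```python
# Hypothetical sketch: the integer representation formats mentioned above all
# follow the same affine rule r = s * (q - zp), differing only in the range
# allowed for q and the constraint placed on the zero point zp.

def dequantize(q: int, s: float, zp: int) -> float:
    """Recover the real value r encoded by the quantized integer q."""
    return s * (q - zp)

def quantize(r: float, s: float, zp: int) -> int:
    """Encode a real value r as a quantized integer q."""
    return round(r / s) + zp

# The same real value 0.5 under four illustrative 8-bit formats:
examples = {
    "unsigned asymmetric": (0.01, 78),       # q in [0, 255], free zero point
    "unsigned symmetric-style": (0.01, 128), # zero point pinned to mid-range
    "signed asymmetric": (0.01, -28),        # q in [-128, 127], free zero point
    "signed symmetric": (0.025, 0),          # zero point forced to 0
}
for name, (s, zp) in examples.items():
    q = quantize(0.5, s, zp)
    assert abs(dequantize(q, s, zp) - 0.5) < 1e-9  # same real value each time
```

The point of the sketch is that the formats differ only in bookkeeping (range of q, value of zp), which is what makes the later conversions between them lossless shifts.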
Many industry participants are developing software infrastructures ("frameworks"), such as TensorFlow, developed by Google, or PyTorch, in order to develop quantized neural networks.
The choice of data representation format for a quantized neural network may vary according to the different actors developing these software infrastructures.
The quantized neural network is trained and then integrated into an integrated circuit, such as a microcontroller.
In particular, integration software may be provided to integrate the quantized neural network into an integrated circuit. For example, the integration software STM32Cube.AI developed by STMicroelectronics, and its extension X-CUBE-AI, are known.
Disclosure of Invention
The integration software may be configured to convert the quantized neural network into a neural network optimized for execution on a given integrated circuit.
However, in order to be able to handle quantized neural networks having different data representation formats, the integration software must be compatible with all these different representation formats.
For compatibility, one solution is to program the integration software specifically for each representation format.
However, this solution has the disadvantage of increasing the costs of development, validation and technical support. In addition, it also has the disadvantage of increasing the size of the integration software code.
It is therefore desirable to provide a method for implementing an artificial neural network in an integrated circuit that allows any type of representation format to be supported and that can be performed at low cost.
In addition, the neural network may be executed by a processor (referred to as software execution), or at least partially by dedicated electronic circuitry of the integrated circuit in order to accelerate its execution. The dedicated electronic circuitry may be, for example, a logic circuit.
The processor and the dedicated electronic circuitry may have different constraints. In particular, constraints that may be optimal for the processor may not be optimal for the dedicated electronic circuitry, and vice versa.
Therefore, it is also desirable to provide an implementation that allows improving, or even optimizing, the representation of the neural network according to its execution constraints.
According to an aspect, a method for implementing an artificial neural network in an integrated circuit is proposed, the method comprising obtaining an initial digital file representing a neural network configured according to at least one data representation format, then a) detecting at least one format for at least part of the data representing the neural network, then b) converting the detected at least one representation format into a predefined representation format, thereby obtaining a modified digital file representing the neural network, and then c) integrating the modified digital file into an integrated circuit memory.
The neural network may be quantized and trained by an end user, for example using a software infrastructure such as TensorFlow or PyTorch.
The implementation method may be performed by integrated software.
In particular, the neural network can be optimized by the integration software before being integrated into the integrated circuit.
The implementation method allows supporting any type of data representation format of the neural network and significantly reduces the cost of performing the implementation.
In particular, converting the detected data representation format of the neural network into a predefined data representation format allows limiting the number of representation formats supported by the integration software.
More particularly, the integration software may be programmed to support only predefined data representation formats, in particular for optimizing neural networks. The conversion allows the neural network to be adapted for use by integrated software.
The implementation method allows simplifying the programming of the integration software and reducing the memory size of the integration software code.
Thus, the implementation allows the integration software to support neural networks generated by any software infrastructure, independently of the quantization parameters selected by the end user.
Neural networks typically include a series of layers of neurons. Each neuron layer receives input data and outputs output data. These output data are used as inputs to at least one subsequent layer in the neural network.
In an advantageous embodiment, a conversion of the representation format of at least part of the data is performed for at least one layer of the neural network.
Preferably, for each layer of the neural network, a conversion of a representation format of at least part of the data is performed.
Advantageously, the neural network comprises a series of layers and the data of the neural network comprises weights assigned to the layers and input data and output data that can be generated and used by the neural network layers.
In particular, when the detection reveals that the weight representation format is an unsigned format, the conversion may comprise a modification of the weight representation into signed values, as well as a modification of the values of the data representing these weights.
Alternatively, when the detection reveals that the weight representation format is a signed format, the conversion may comprise a modification of the weight representation into unsigned values, and a modification of the values of the data representing the weights.
Further, when the detection reveals that the representation format of the input data and output data of each layer is an unsigned format, the conversion may comprise a modification of the representation of the input data and output data into signed values.
Alternatively, when the detection reveals that the representation format of the input data and output data of each layer is a signed format, the conversion may comprise a modification of the representation of the input data and output data into unsigned values.
In addition, the conversion may comprise the addition of a first conversion layer at the input of the neural network, configured to modify the values of the data input to the neural network according to the predefined representation format, and the addition of a second conversion layer at the output of the neural network, configured to modify the values of the output data of the last layer according to the format used to represent the output data in the initial digital file.
Preferably, the predefined representation format is selected according to the execution hardware of the neural network. In particular, the predefined representation format is selected depending on whether the neural network is executed by a processor or at least partly by dedicated electronic circuitry, thereby speeding up its execution.
In this way, constraints of the execution hardware may be taken into account to optimize execution of the neural network.
In particular, it is preferred that when a selection is made to execute the neural network by the processor, and when the neural network weights are represented according to an asymmetric representation format, the predefined representation formats of the weights are unsigned and asymmetric formats, and the predefined representation formats of the input data and the output data of each layer are unsigned and asymmetric formats.
Further, preferably, when a selection is made to execute the neural network by the processor, and when the neural network weights are represented according to a symmetric representation format, the predefined representation formats of the weights are signed and symmetric formats, and the predefined representation formats of the input data and the output data of each layer are unsigned and asymmetric formats. However, alternatively, the predefined representation format of the input data and the output data and the predefined representation format of the weights for each layer may be unsigned and asymmetric formats.
Furthermore, preferably, when a selection is made to implement the neural network using dedicated electronic circuitry, and when the neural network weights are represented according to a symmetric representation format, the predefined representation formats of the weights are signed and symmetric formats, and the predefined representation formats of the input data and the output data of each layer are signed and asymmetric formats, or if the dedicated electronic circuitry is configured to support unsigned arithmetic, the predefined representation formats of the input data and the output data of each layer are asymmetric and unsigned formats.
Further, preferably, when the selection is made to at least partially implement the neural network using dedicated electronic circuitry, and when the neural network weights are represented according to an asymmetric representation format, the predefined representation formats of the weights are signed and asymmetric, and the predefined representation formats of the input data and the output data of each layer are signed and asymmetric, or if the dedicated electronic circuitry is configured to support unsigned arithmetic, the predefined representation formats of the input data and the output data of each layer are asymmetric and unsigned formats.
According to another aspect, a computer program product is proposed, comprising instructions which, when the program is executed by a computer, cause the computer to carry out the steps a), b) and c) of the method as previously described.
According to another aspect, a computer-readable data medium is proposed, on which the above-mentioned computer program product is recorded.
According to another aspect, there is provided a computer-based tool, such as a computer, comprising: an input for receiving an initial digital file representing a neural network configured according to at least one data representation format; and a processing unit configured to perform: the method comprises the steps of detecting at least one format of at least part of data in the data representing the neural network, then carrying out conversion of the detected at least one representation format into a predefined representation format to obtain a modified digital file representing the neural network, and then carrying out integration of the modified digital file into the memory of the integrated circuit.
There is thus provided a computer-based tool comprising the above-mentioned data medium and a processing unit configured to execute the above-mentioned computer program product.
Drawings
Other advantages and features of the invention will appear after examining the detailed description of non-limiting implementations and embodiments and the accompanying drawings, in which:
FIG. 1 illustrates an embodiment implementing a method; and
FIG. 2 schematically illustrates an embodiment computer-based tool.
Detailed Description
FIG. 1 illustrates a method according to an implementation of the present invention. The implementation method may be performed by integration software.
The method first comprises an obtaining step 10 in which an initial digital file representing the neural network is obtained. The neural network is configured according to at least one data representation format.
In particular, neural networks generally include a series of layers of neurons. Each neuron layer receives input data to which weights are applied, and outputs output data.
The input data may be data received at an input of the neural network or may be output data from a previous layer.
The output data may be data output from the neural network or may be data generated by one layer of the neural network and input to a next layer of the neural network.
The weights are data of the neurons, more particularly parameters of the neurons, which can be configured to obtain good output data.
In particular, the neural network is defined and trained by the user, for example using a software infrastructure such as TensorFlow or PyTorch. In particular, the training allows the weights to be defined.
The neural network then has at least one representation format selected by the user, e.g., for the input data of each of its layers, the output data of each of its layers, and the weights of the neurons of each layer. In particular, the input data, output data and weights of each layer are integers that can be represented according to a signed or unsigned, symmetric or asymmetric format.
The initial digital file contains one or more indications that allow the representation format(s) to be identified. In particular, the initial digital file may take the form of a binary file, for example a .tflite file, which may contain quantization information fields such as a scale s and a zero-point value zp.
The initial digital file is provided to the integration software.
The integration software is programmed to optimize the neural network. In particular, the integration software allows, for example, optimizing the network topology, the execution order of the elements of the neural network, or optimizing the memory allocation, which may be performed during the execution of the neural network.
To simplify the programming of the integration software, the optimization of the neural network is programmed to operate with a limited number of data representation formats. These representation formats are predefined and are explained in detail below.
In order to support any type of data representation format, the integration software is programmed to be able to convert any type of data representation format into a predefined representation format before optimization.
In this manner, the integration software is configured to allow the optimization of a neural network that may be configured according to any type of data representation format.
The converting step is included in the method of implementation.
In particular, the converting step is adapted to modify the symmetric representation format into an asymmetric representation format. The converting step is also adapted to modify the signed representation format into the unsigned representation format and vice versa. The operation of this conversion step will be described in more detail below.
To better understand how the conversion works, it should be remembered that a floating-point number quantized on n bits can be expressed in the following form:
[mathematical formula 1]
r = s × (q - zp), where q and zp are integers on n bits, with the same signed or unsigned representation format, and s is a predefined floating-point scale. The scale s and the zero-point value zp may be contained in the initial digital file.
This form is known to the person skilled in the art and is described, for example, in the TensorFlow Lite quantization specification, visible at the following website: https://www.tensorflow.org/lite/performance/quantization_spec.
In particular, the value zp is zero for data in a symmetric representation format.
Thus, a symmetric representation format may be considered an asymmetric representation format with zp equal to 0.
As indicated below, a change of the weight representation format of each layer from an unsigned representation format to a signed representation format, and vice versa, may be obtained.
The weights of each layer in the unsigned representation format may be expressed in the form:
[mathematical formula 2]
r_w = s_w × (q_w - zp_w), where q_w and zp_w are unsigned data in the interval [0; 2^n - 1].
The signed representation format of the weights can be obtained by applying the following equation:
[mathematical formula 3]
r_w = s_w × (q_w2 - zp_w2), where q_w2 = q_w - 2^(n-1) and zp_w2 = zp_w - 2^(n-1) are signed data in the interval [-2^(n-1); 2^(n-1) - 1].
The unsigned representation format of the weights can be obtained by applying the following equation:
[mathematical formula 4]
r_w = s_w × (q_w - zp_w), where q_w = q_w2 + 2^(n-1) and zp_w = zp_w2 + 2^(n-1), with q_w2 and zp_w2 signed data in the interval [-2^(n-1); 2^(n-1) - 1].
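As a non-authoritative sketch of [mathematical formula 3] and [mathematical formula 4]: switching the weight representation between unsigned and signed amounts to shifting both q_w and zp_w by 2^(n-1), which leaves the real value unchanged (the bit width n = 8 and the numeric values are illustrative assumptions):

```python
# Sketch of formulas 3 and 4: shift q_w and zp_w by 2^(n-1) to flip the weight
# representation between unsigned and signed. The real value
# r_w = s_w * (q_w - zp_w) is invariant under the shift.

N = 8                # assumed bit width
HALF = 2 ** (N - 1)  # 128

def weights_unsigned_to_signed(q_w: int, zp_w: int) -> tuple[int, int]:
    """[mathematical formula 3]: q_w2 = q_w - 2^(n-1), zp_w2 = zp_w - 2^(n-1)."""
    return q_w - HALF, zp_w - HALF

def weights_signed_to_unsigned(q_w2: int, zp_w2: int) -> tuple[int, int]:
    """[mathematical formula 4]: q_w = q_w2 + 2^(n-1), zp_w = zp_w2 + 2^(n-1)."""
    return q_w2 + HALF, zp_w2 + HALF

s_w = 0.05
q_w, zp_w = 200, 120                      # unsigned weight, values in [0, 255]
q_w2, zp_w2 = weights_unsigned_to_signed(q_w, zp_w)
assert (q_w2, zp_w2) == (72, -8)          # now in [-128, 127]
assert s_w * (q_w - zp_w) == s_w * (q_w2 - zp_w2)              # same real value
assert weights_signed_to_unsigned(q_w2, zp_w2) == (q_w, zp_w)  # exact round trip
```

The same shift applies to the input and output data formulas that follow; only the subscripts change.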
As indicated below, a change in the representation format of the input data and the output data of each layer from a signed representation format to an unsigned representation format and vice versa may be obtained.
The input data of each layer according to the signed representation format may be expressed in the following form:
[mathematical formula 5]
r_i = s_i × (q_i - zp_i), where q_i and zp_i are signed data in the interval [-2^(n-1); 2^(n-1) - 1], n being the number of bits used to represent the signed input data.
The unsigned representation format of the input data can be obtained by applying the following equation:
[mathematical formula 6]
r_i = s_i × (q_i2 - zp_i2), where q_i2 = q_i + 2^(n-1) and zp_i2 = zp_i + 2^(n-1) are unsigned data in the interval [0; 2^n - 1].
The signed representation format of the input data can also be obtained by applying the following equation:
[mathematical formula 7]
r_i = s_i × (q_i - zp_i), where q_i = q_i2 - 2^(n-1) and zp_i = zp_i2 - 2^(n-1), with q_i2 and zp_i2 unsigned input data in the interval [0; 2^n - 1].
The output data of each layer according to the signed representation format may be expressed in the following form:
[mathematical formula 8]
r_o = s_o × (q_o - zp_o), where q_o and zp_o are signed output data in the interval [-2^(n-1); 2^(n-1) - 1].
The output data of each layer are in particular calculated according to the following equation:
[mathematical formula 9]
r_o = r_i × r_w.
Thus, an unsigned representation format of the output data of a neural network layer (e.g., a convolutional or dense layer) can be obtained by applying the following equation:
[mathematical formula 10]
r_o = s_o × (q_o2 - zp_o2), where q_o2 = q_o + 2^(n-1) and zp_o2 = zp_o + 2^(n-1), with q_o2 and zp_o2 unsigned data in the interval [0; 2^n - 1].
The signed representation format of the output data can also be obtained by applying the following equation:
[mathematical formula 11]
r_o = s_o × (q_o - zp_o), where q_o = q_o2 - 2^(n-1) and zp_o = zp_o2 - 2^(n-1), with q_o2 and zp_o2 unsigned data in the interval [0; 2^n - 1].
When the representation format of the input data of a layer is converted from unsigned to signed, the data zp_o and the output data q_o of the same layer, once converted, will also be signed.
Thus, modifying the representation format of the input data of the neural network from unsigned to signed, together with the data zp_o of each layer, allows the data q_o used as input data of the next layer to be obtained directly. It is therefore not necessary to modify the values of the output data between two successive layers during the execution of the neural network.
In particular, the conversion of the representation formats of the input data and output data may require the addition of a first conversion layer at the input of the neural network, to convert the data at the network input into the representation format desired for the execution of the neural network, and the addition of a second conversion layer at the output of the neural network, to convert to the representation format desired by the user at the output of the neural network.
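A minimal sketch of these two added layers, assuming 8-bit data and assuming the internal predefined format is reached by the 2^(n-1) shift described above (the function names and the identity stand-in for the network are illustrative, not from the patent):

```python
# Hypothetical sketch of the two added conversion layers: the first converts
# the user's unsigned inputs into the internal (here: signed) format, the
# second converts the signed outputs back to the user's unsigned format.

HALF = 128  # 2^(n-1) for an assumed bit width n = 8

def input_conversion_layer(q_in: list[int]) -> list[int]:
    """First added layer: unsigned [0, 255] -> signed [-128, 127]."""
    return [q - HALF for q in q_in]

def output_conversion_layer(q_out: list[int]) -> list[int]:
    """Second added layer: signed [-128, 127] -> unsigned [0, 255]."""
    return [q + HALF for q in q_out]

def quantized_network(q_in: list[int]) -> list[int]:
    """Stand-in for the optimized layers running in the internal format."""
    return q_in  # identity, for illustration only

user_input = [0, 127, 255]
result = output_conversion_layer(quantized_network(input_conversion_layer(user_input)))
assert result == user_input  # the user-facing format is preserved end to end
```

Only these two boundary layers deal with the user's format; every layer in between stays in the predefined internal format.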
To adapt the data representation format, the method comprises a step 11 of detecting the execution hardware with which the neural network must be executed, if this hardware is indicated by the user.
In particular, the neural network may be executed by a processor, which is then referred to as software execution, or at least partly by dedicated electronic circuitry. The dedicated electronic circuitry is configured to perform defined functions so as to accelerate the execution of the neural network. It may be obtained, for example, by programming in the VHDL language.
The method comprises the following steps: at least one format of data representing a neural network is detected.
Preferably, each data representation format of the neural network is detected.
Specifically, the representation format of the input data and output data of each layer and the representation format of the weights of the neurons of each layer are detected.
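A hedged sketch of this detection step, assuming the per-tensor quantization fields (data type, scale s, zero-point zp) have already been parsed out of the initial digital file; the QuantInfo record is an illustrative assumption, not a real file-format API:

```python
# Hypothetical sketch of the detection step: classify each tensor's
# representation format from its data type and zero point. QuantInfo is an
# assumed, simplified view of the fields read from the initial digital file.
from dataclasses import dataclass

@dataclass
class QuantInfo:
    dtype: str        # e.g. "int8" (signed) or "uint8" (unsigned)
    scale: float      # the scale s
    zero_point: int   # the value zp representing zero

def detect_format(info: QuantInfo) -> tuple[str, str]:
    sign = "signed" if info.dtype.startswith("int") else "unsigned"
    # A symmetric format is an asymmetric one with zp equal to 0.
    symmetry = "symmetric" if info.zero_point == 0 else "asymmetric"
    return sign, symmetry

assert detect_format(QuantInfo("int8", 0.02, 0)) == ("signed", "symmetric")
assert detect_format(QuantInfo("uint8", 0.05, 128)) == ("unsigned", "asymmetric")
```

The detected (sign, symmetry) pair per tensor is what the subsequent conversion step compares against the predefined target format.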
The implementation method then allows converting, if necessary, the detected representation format of the input data and output data of each layer and the detected representation format of the weights of the neurons of each layer.
The conversion may be performed in accordance with execution constraints of the neural network, in particular in accordance with whether the neural network is executed by a processor or by dedicated electronic circuitry.
Indeed, the processor and the dedicated electronic circuitry may have different constraints. For example, when a neural network is executed at least in part by dedicated electronic circuitry, it is not possible to change an asymmetric representation format to a symmetric representation format without modifying the number of bits representing the data.
Thus, for example, converting an asymmetric weight representation format into a symmetric one either results in the weights being represented on a higher number of bits in order to maintain accuracy, which increases the execution time of the neural network, or results in a loss of accuracy if the number of bits is kept in order to maintain the execution time.
Therefore, when the detected representation format of the data of the neural network is asymmetric, the asymmetric representation format is preferably maintained.
Furthermore, if the neural network uses an activation function such as ReLU (from "Rectified Linear Units") or Sigmoid, an unsigned representation format is preferably used. Indeed, the signed representation format increases the execution time of these activation functions.
Thus, the method comprises a step 13 of verifying the identification of the execution hardware. In this step, it is verified whether the user has indicated the execution hardware on which the neural network has to be executed.
In particular, the user may indicate whether the neural network should be executed by the processor or at least in part by dedicated electronic circuitry.
Alternatively, the user may not have indicated the execution hardware.
If it is determined in step 13 that the user has indicated the execution hardware to be used, the method comprises a determination step 14, in which it is determined whether the neural network has to be executed by the processor or by dedicated electronic circuitry.
More particularly, if it is determined in step 14 that the neural network has to be executed by the processor, the method comprises a step 15 in which it is determined whether the weight representation format is asymmetric.
If the answer in step 15 is yes, i.e. the weight representation format is asymmetric, the data representation format is converted according to conversion C1. Conversion C1 allows obtaining an unsigned and asymmetric weight representation format, and an unsigned and asymmetric representation format for the input data and output data of each layer. For this purpose, equation [mathematical formula 4] is applied to the weights if their original representation format is signed, and equations [mathematical formula 6] and [mathematical formula 10] are applied to the input data and output data of each layer if their original representation format is signed.
If the answer in step 15 is no, i.e. the weight representation format is symmetric, the data representation format is converted according to conversion C2. Conversion C2 allows obtaining a signed and symmetric weight representation format, and an unsigned and asymmetric predefined representation format for the input data and output data of each layer. For this purpose, equation [mathematical formula 3] is applied to the weights if their original representation format is unsigned, and equations [mathematical formula 6] and [mathematical formula 10] are applied to the input data and output data of each layer if their original representation format is signed.

If the answer in step 14 is no, i.e. the neural network must be executed using dedicated electronic circuitry, the method comprises a step 16 in which it is determined whether the weight representation format is symmetric.
If the answer in step 16 is yes, i.e. the weight representation format is symmetric, the data representation format is converted according to conversion C3. Conversion C3 allows obtaining a signed and symmetric weight representation format, and a signed and asymmetric representation format for the input data and output data of each layer. For this purpose, equation [mathematical formula 3] is applied to the weights if their original representation format is unsigned, and equations [mathematical formula 7] and [mathematical formula 11] are applied to the input data and output data of each layer if their original representation format is unsigned.
The conversion C2 may also be performed instead of the conversion C3 if the dedicated electronic circuitry supports unsigned arithmetic.
If the answer in step 16 is no, i.e. the weight representation format is asymmetric, the data representation format is converted according to conversion C4. Conversion C4 allows obtaining a signed and asymmetric weight representation format, and a signed and asymmetric representation format for the input data and output data of each layer. For this purpose, equation [mathematical formula 3] is applied to the weights if their original representation format is unsigned, and equations [mathematical formula 7] and [mathematical formula 11] are applied to the input data and output data of each layer if their original representation format is unsigned.
The conversion C1 may also be performed instead of the conversion C4 if the dedicated electronic circuitry supports unsigned arithmetic.
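The value changes prescribed by the equations of the description are not reproduced in this chunk. For 8-bit quantized values, the usual signed/unsigned re-parameterization underlying such conversions is a shift by 128 of both the stored integers and the zero-point, the scale being unchanged; the following is a minimal sketch under that assumption, with illustrative function names not taken from the patent:

```python
def unsigned_to_signed(values, zero_point):
    """Re-express 8-bit unsigned quantized integers as signed ones
    without changing the real values they encode
    (real = scale * (q - zero_point)): the scale is unchanged and
    both the stored integers and the zero-point shift by 128."""
    return [v - 128 for v in values], zero_point - 128

def signed_to_unsigned(values, zero_point):
    """Inverse shift: signed 8-bit -> unsigned 8-bit, same encoded real values."""
    return [v + 128 for v in values], zero_point + 128
```

Because the shift applies identically to values and zero-point, the decoded real values are bit-exact before and after such a conversion, which is why the patent can change the representation format without retraining or re-quantizing the network.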
If the answer in step 13 is no, i.e. the user has not indicated the execution hardware, the method comprises a step 17 of analyzing the neural network.
This analysis allows it to be determined in step 18 whether the neural network should be executed in its entirety using dedicated electronic circuitry, or partly by the processor and partly using dedicated electronic circuitry.
If the answer in step 18 is yes, i.e. the neural network must be executed entirely using dedicated electronic circuitry, the method includes step 19, in which it is determined whether the weight representation format is symmetric.

If the answer in step 19 is yes, i.e. the weight representation format is symmetric, the data representation formats are converted according to the above-described conversion C3, or, if the dedicated electronic circuitry supports unsigned arithmetic, according to the above-described conversion C2.

If the answer in step 19 is no, i.e. the weight representation format is asymmetric, the data representation formats are converted according to the above-described conversion C4, or, if the dedicated electronic circuitry supports unsigned arithmetic, according to the above-described conversion C1.

If the answer in step 18 is no, i.e. the neural network must be executed partly using dedicated electronic circuitry, the method includes step 20, in which it is determined whether the weight representation format is symmetric.

If the answer in step 20 is yes, i.e. the weight representation format is symmetric, the data representation formats are converted according to the above-described conversion C3, or, if the dedicated electronic circuitry supports unsigned arithmetic, according to the above-described conversion C2.

If the answer in step 20 is no, i.e. the weight representation format is asymmetric, the data representation formats are converted according to the above-described conversion C4, or, if the dedicated electronic circuitry supports unsigned arithmetic, according to the above-described conversion C1.
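The branching of steps 13 to 20 reduces to a small decision function. The sketch below is only a condensed restatement of the rules above (with the processor/asymmetric case yielding C1, as claim 10 indicates); the function and parameter names are illustrative assumptions, not part of the patent:

```python
from typing import Optional

def select_conversion(target: Optional[str],
                      weights_symmetric: bool,
                      unsigned_arith: bool = False) -> str:
    """Select among conversions C1-C4, following steps 13 to 20.

    target: "processor", "dedicated", or None when the user has not
    indicated the execution hardware (step 13 answered no; the analysis
    of steps 17-18 then applies the dedicated-circuitry rules of
    steps 19-20, whether execution is full or partial).
    unsigned_arith: True when the dedicated electronic circuitry
    supports unsigned arithmetic, enabling the C2/C1 alternatives.
    """
    if target == "processor":
        # Step 15: symmetric weights -> C2, asymmetric weights -> C1.
        return "C2" if weights_symmetric else "C1"
    # Dedicated circuitry, fully or partially (steps 16, 19 and 20
    # apply the same rule).
    if weights_symmetric:
        return "C2" if unsigned_arith else "C3"
    return "C1" if unsigned_arith else "C4"
```

Note that steps 19 and 20 prescribe identical conversions, so fully and partially dedicated execution collapse into one branch here.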
These conversions yield a modified digital file representing the neural network.

The implementation method then comprises a step 21 of generating optimized code.

The implementation method finally comprises a step 22 of integrating the optimized neural network into an integrated circuit.
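The overall flow of the method (obtain the file, detect the formats, convert only when they differ from the predefined ones, then hand the modified file to the code-generation and integration steps) can be sketched as follows; the class, function names and format strings are illustrative assumptions, and the arithmetic of the conversion equations is omitted:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class NetworkFile:
    """Illustrative stand-in for the digital file representing the network."""
    weights_format: str   # e.g. "unsigned/asymmetric"
    data_format: str      # e.g. "signed/asymmetric"

# A single predefined format pair is assumed here for the sketch.
PREDEFINED = ("signed/symmetric", "unsigned/asymmetric")

def detect_formats(f: NetworkFile):
    """Detection step: read the representation formats from the file."""
    return (f.weights_format, f.data_format)

def convert_formats(f: NetworkFile) -> NetworkFile:
    """Conversion step: rewrite the formats to the predefined pair
    (the value changes prescribed by the description's equations are
    omitted from this sketch)."""
    return replace(f, weights_format=PREDEFINED[0],
                   data_format=PREDEFINED[1])

def prepare_for_integration(f: NetworkFile) -> NetworkFile:
    """Detect, then convert only if needed; the modified file is what
    steps 21 and 22 turn into optimized code in the IC memory."""
    return f if detect_formats(f) == PREDEFINED else convert_formats(f)
```

Because conversion is a no-op when the detected formats already match the predefined ones, only networks produced with other conventions pay any preprocessing cost.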
This implementation allows neural networks using any type of data representation format to be supported, and significantly reduces the cost of the implementation.
In particular, converting the detected data representation formats of the neural network into predefined data representation formats makes it possible to limit the number of representation formats supported by the integration software.

More particularly, the integration software may be programmed to support only the predefined data representation formats, in particular for optimizing neural networks. The conversion allows the neural network to be adapted for use by the integration software.

This implementation simplifies the programming of the integration software and reduces the memory size of the integration software code.

The implementation thus enables the integration software to support neural networks generated by any software infrastructure.
Fig. 2 shows a computer-based tool ORD comprising an input E for receiving the initial digital file and a processing unit UT programmed to execute the above-described conversion method, which allows the modified digital file to be obtained, and to integrate the neural network from this modified digital file into the memory of an integrated circuit intended to implement the neural network, for example a microcontroller of the STM32 series from STMicroelectronics.
The integrated circuit may be incorporated, for example, within a cellular mobile phone or tablet computer.
While the invention has been described with reference to illustrative embodiments, this description is not intended to be construed in a limiting sense. Various modifications and combinations of the illustrative embodiments, as well as other embodiments of the invention, will be apparent to persons skilled in the art upon reference to the description. It is therefore intended that the appended claims cover any such modifications or embodiments.

Claims (20)

1. A method for implementing an artificial neural network in an integrated circuit, the method comprising:
obtaining an initial digital file representing a neural network configured according to one or more data representation formats;
detecting at least one representation format of at least part of the data representing the neural network;
converting the at least one detected representation format into a predefined representation format to obtain a modified digital file representing the neural network; and
integrating the modified digital file into a memory of the integrated circuit.
2. The method of claim 1, wherein the conversion of the at least one representation format detected of at least part of the data is performed for at least one layer of the neural network.
3. The method of claim 1, wherein the conversion of the at least one representation format detected of at least part of the data is performed for each layer of the neural network.
4. The method of claim 1, wherein the converting comprises:
adding a first conversion layer at an input of the neural network, the first conversion layer configured to modify values of input data input to the neural network according to the predefined representation format; and
adding a second conversion layer at an output of the neural network, the second conversion layer configured to modify values of output data of a last neural network layer according to the output data representation format of the initial digital file.
5. The method of claim 1, wherein the neural network comprises a series of neural network layers, and the data of the neural network comprises weights assigned to the neural network layers, and input data and output data generated and used by the neural network layers.
6. The method of claim 5, wherein the detecting comprises: detecting that a weight representation format of the weights is an unsigned format, and the converting comprises: a modification of the weight representation format to signed values, and a modification of the data values representing the weights.
7. The method of claim 5, wherein the detecting comprises: detecting that the weight representation format of the weights is a signed format, and the converting comprises: a modification of the weight representation format to unsigned values, and a modification of the data values representing the weights.
8. The method of claim 5, wherein the detecting comprises: detecting that the representation format of the input data and the output data of each layer is an unsigned format, and the converting comprises: a modification of the representation of the input data and the output data to signed values.
9. The method of claim 5, wherein the detecting comprises: detecting that the input data and the output data of each layer are represented in signed values, and the converting comprises: a modification of the representation of the input data and the output data to unsigned values.
10. The method of claim 5, further comprising:
selecting the neural network to be executed by a processor, the weights being represented according to an asymmetric representation format;
setting the predefined representation format of the weights to an unsigned and asymmetric format; and
setting the predefined representation format of the input data and the output data of each layer to the unsigned and asymmetric format.
11. The method of claim 5, further comprising:
selecting the neural network to be executed by a processor, the weights being represented according to a symmetric representation format;
setting the predefined representation format of the weights to a signed and symmetric format; and
setting the predefined representation format of the input data and the output data of each layer to an unsigned and asymmetric format.
12. The method of claim 5, further comprising:
selecting to use a dedicated electronic circuit to execute the neural network;
representing the weights according to a symmetric representation format;
setting the predefined representation format of the weights to a signed and symmetric format; and
setting the predefined representation format of the input data and the output data of each layer to an asymmetric and unsigned format in response to the dedicated electronic circuitry being configured to support unsigned arithmetic, or to a signed and asymmetric format in response to the dedicated electronic circuitry being configured to support signed arithmetic.
13. The method of claim 5, further comprising:
selecting to perform, at least in part, the neural network using dedicated electronic circuitry;
representing the weights according to an asymmetric representation format;
setting the predefined representation format of the weights to a signed and asymmetric format; and
setting the predefined representation format of the input data and the output data of each layer to an asymmetric and unsigned format in response to the dedicated electronic circuitry being configured to support unsigned arithmetic, or to a signed and asymmetric format in response to the dedicated electronic circuitry being configured to support signed arithmetic.
14. A computer program product comprising instructions that, when executed by a computer, direct the computer to perform:
detecting at least one representation format of at least part of the data representing the neural network represented by the initial digital file; and
converting the detected at least one representation format to a predefined representation format to obtain a modified digital file representing the neural network.
15. The computer program product of claim 14, further directing the computer to integrate the modified digital file into a memory of an integrated circuit.
16. A computer-based tool, comprising:
an input configured to receive an initial digital file representing a neural network configured according to one or more data representation formats; and
a processing unit communicatively coupled to the input and configured to:
detecting at least one representation format of at least part of the data representing the neural network;
converting the at least one detected representation format into a predefined representation format to obtain a modified digital file representing the neural network; and
integrating the modified digital file into a memory of an integrated circuit.
17. The computer-based tool of claim 16, wherein the processing unit is configured to: for at least one layer of the neural network, a conversion of the at least one detected representation format of at least part of the data is performed.
18. The computer-based tool of claim 16, wherein the processing unit is configured to: for each layer of the neural network, a conversion of the at least one detected representation format of at least part of the data is performed.
19. The computer-based tool of claim 16, wherein the processing unit configured to convert comprises the processing unit configured to:
adding a first conversion layer at an input of the neural network, the first conversion layer configured to modify values of input data input to the neural network according to the predefined representation format; and
adding a second conversion layer at an output of the neural network, the second conversion layer configured to modify values of output data of a last neural network layer according to the output data representation format of the initial digital file.
20. The computer-based tool of claim 16, wherein the neural network comprises a series of neural network layers, and the data of the neural network comprises weights assigned to the neural network layers, and input data and output data generated and used by the neural network layers.
CN202110435481.9A 2020-04-23 2021-04-22 Method and apparatus for implementing artificial neural networks in integrated circuits Pending CN113554159A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
FR2004070A FR3109651B1 (en) 2020-04-23 2020-04-23 METHOD FOR IMPLEMENTING AN ARTIFICIAL NEURON NETWORK IN AN INTEGRATED CIRCUIT
FR2004070 2020-04-23
US17/226,598 2021-04-09
US17/226,598 US20210334634A1 (en) 2020-04-23 2021-04-09 Method and apparatus for implementing an artificial neuron network in an integrated circuit

Publications (1)

Publication Number Publication Date
CN113554159A 2021-10-26

Family

ID=78101782

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110435481.9A Pending CN113554159A (en) 2020-04-23 2021-04-22 Method and apparatus for implementing artificial neural networks in integrated circuits

Country Status (1)

Country Link
CN (1) CN113554159A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108345939A (en) * 2017-01-25 2018-07-31 微软技术许可有限责任公司 Neural network based on fixed-point calculation
CN108985448A (en) * 2018-06-06 2018-12-11 北京大学 Neural Networks Representation standard card cage structure
CN109635916A (en) * 2017-09-20 2019-04-16 畅想科技有限公司 The hardware realization of deep neural network with variable output data format
US20190122100A1 (en) * 2017-10-19 2019-04-25 Samsung Electronics Co., Ltd. Method and apparatus with neural network parameter quantization
CN110059810A (en) * 2017-11-03 2019-07-26 畅想科技有限公司 The hard-wired every layer data format selection based on histogram of deep neural network


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
STMICROELECTRONICS: "Getting Started with STM32Cube.AI", Retrieved from the Internet <URL:https://www.youtube.com/watch?v=grgNXdkmzzQ> *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination