CN114707650B - Simulation implementation method for improving simulation efficiency - Google Patents

Simulation implementation method for improving simulation efficiency

Info

Publication number
CN114707650B
Authority
CN
China
Prior art keywords
neural network
point characteristic
simulation
file
folder
Prior art date
Legal status
Active
Application number
CN202210321357.4A
Other languages
Chinese (zh)
Other versions
CN114707650A
Inventor
朱旭东
吴春选
Current Assignee
Zhejiang Xinmai Microelectronics Co ltd
Original Assignee
Zhejiang Xinmai Microelectronics Co ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Xinmai Microelectronics Co ltd filed Critical Zhejiang Xinmai Microelectronics Co ltd
Priority claimed from CN202210321357.4A
Publication of CN114707650A
Application granted
Publication of CN114707650B
Legal status: Active


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G06N 3/065: Analogue means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a simulation implementation method for improving simulation efficiency, in the technical field of deep learning. The method comprises the following steps: the quantization set pictures quantize the neural network model through a neural network compiler to generate an executable file; a ten-thousand-image test set generates first input data, a first fixed-point feature file and a floating-point feature file through the neural network compiler; and if the statistics in the precision table fall within a preset precision range, the executable file and the first input data are read to simulate the neural network model. The method enables batch simulation of multiple neural network models of different types, ensures correctness when the models are ported to a chip or an FPGA, simulates each type of network layer by layer so that more simulation verification points are covered, reduces the risk at chip tape-out, and at the same time performs comprehensive precision verification through the precision table compiled for each neural network model.

Description

Simulation implementation method for improving simulation efficiency
This application is a divisional application of application No. 202111653883.2, filed on December 31, 2021, which discloses a simulation realization method based on a neural network compiler, the neural network compiler itself, and a computer-readable storage medium.
Technical Field
The application belongs to the technical field of deep learning, and particularly relates to a simulation implementation method for improving simulation efficiency.
Background
With the development of Internet technology, the mass data now collected provide enough scenarios for deep-learning training. The development of intelligent algorithms, chiefly convolutional neural networks, depends on such mass data; in fields such as image classification and object recognition, the accuracy of these algorithms now exceeds human recognition accuracy.
For neural network algorithms to be deployed in the security field, the algorithm model trained on a server must be parsed into a computer language that the embedded chip can execute, which facilitates the installation and monitoring of security cameras.
A convolutional neural network algorithm implemented on a CPU (Central Processing Unit) or GPU (Graphics Processing Unit) is ported to an FPGA (Field-Programmable Gate Array) or a chip for convenient, portable installation: the computing power of a CPU implementation cannot meet current requirements, and a GPU implementation cannot be applied to embedded devices. The forward pass is first implemented in Python or C++ with 32-bit floating point; then, to reduce chip area and cost without losing precision, it is quantized to 8-bit fixed point and implemented on the FPGA or chip in Verilog (an HDL, hardware description language). The whole neural network model therefore has to be simulated to verify whether its precision meets the requirements.
The prior art has the following defects. First, only the individual intermediate layers of one neural network model can be simulated; the information of each layer must be looked up manually beforehand and written into a configuration file; a precision test over a ten-thousand-image test set cannot be performed; and the value-range distributions of different data sets are not covered by the simulation. Second, when a different type of neural network model, or a test set from a different scene, is substituted, correct operation on the chip or FPGA cannot be guaranteed, which increases tape-out cost; and because the neural network is not quantized, floating-point multipliers are used and operation performance drops.
Disclosure of Invention
The application aims to provide a simulation implementation method for improving simulation efficiency, so as to solve the prior-art problems that only the individual intermediate layers of a neural network model can be simulated and that a precision test over a ten-thousand-image test set cannot be performed.
In order to achieve the technical purpose, the application adopts the following technical scheme:
A simulation implementation method for improving simulation efficiency comprises the following steps:
A neural network compiler is constructed, which is used for receiving quantization set pictures, a plurality of neural network models of different types and a ten-thousand-image test set; the neural network models are simulated layer by layer after the neural network compiler performs precision verification;
the quantization set pictures quantize the neural network model through the neural network compiler to generate an executable file, and the ten-thousand-image test set generates first input data, a first fixed-point feature file and a floating-point feature file through the neural network compiler;
the first fixed-point feature file is compared with the floating-point feature file, and a precision table with statistics for the neural network model is output;
and if the statistics in the precision table fall within a preset precision range, the executable file and the first input data are read to simulate the neural network model.
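The four steps above can be sketched as a small driver loop. This is only an illustration: the callable names (`compile_model`, `run_forward`, `run_simulation`) and the cosine-similarity gate are our assumptions, not interfaces named by the patent.

```python
import math

def cosine_similarity(a, b):
    # stand-in for the patent's (unspecified) similarity measure
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def precision_ok(fixed_layers, float_layers, threshold=0.99):
    # step 3: compare fixed-point and floating-point feature data per layer
    return all(cosine_similarity(a, b) >= threshold
               for a, b in zip(fixed_layers, float_layers))

def simulate_all(models, quant_pics, test_set,
                 compile_model, run_forward, run_simulation):
    results = {}
    for model in models:
        # step 2: quantize into an executable file; generate inputs + feature files
        executable = compile_model(model, quant_pics)
        inputs, fixed_feat, float_feat = run_forward(model, test_set)
        # step 4: simulate only when the precision check passes
        if precision_ok(fixed_feat, float_feat):
            results[model] = run_simulation(executable, inputs)
    return results
```

The gate before `run_simulation` mirrors the claim: simulation of a model starts only after its precision table meets the preset range.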
Preferably, the method further comprises the steps of:
building the environment of the neural network compiler, installing the neural network compiler, and testing whether the neural network compiler is successfully installed;
the environment of the neural network compiler is built on the same operating system as that of the simulation system.
Preferably, quantizing the neural network model from the quantization set pictures through the neural network compiler to generate an executable file specifically comprises the following steps:
preparing neural network models of different types and quantization set pictures from different scenes;
running the neural network compiler, and quantizing the neural network model according to the quantization set pictures to generate the executable file;
the executable file comprises a neural network name identifier, layer identifiers of the input layer, the intermediate layers and the output layer, quantized weight values, quantized offset values, layer operation names, layer parameter information, layer association information and layer memory information.
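The fields just listed can be pictured as one record per compiled network. The field names below are our translations for illustration; the patent does not specify a binary layout.

```python
from dataclasses import dataclass

@dataclass
class ExecutableFile:
    """Illustrative record of the contents of one executable file."""
    network_name: str          # neural network name identifier
    input_layer_ids: list      # layer identifiers of the input layer
    middle_layer_ids: list     # layer identifiers of the intermediate layers
    output_layer_ids: list     # layer identifiers of the output layer
    quantized_weights: list    # quantized weight values
    quantized_offsets: list    # quantized offset values
    layer_op_names: list       # convolution, pooling, activation, ...
    layer_params: dict         # kernel size, stride, padding, ...
    layer_links: dict          # layer association information
    layer_memory: dict         # per-layer memory size / reuse flags
```

Grouping all of this in one file is what later lets the simulator configure the hardware without manual per-layer lookup.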
Preferably, the method further comprises the steps of:
presetting the number of neural network models, initializing the loop count to 0, and judging whether the loop count equals the preset number of neural network models;
if the loop count does not equal the preset number, the quantization set pictures quantize the current neural network model through the neural network compiler to generate the executable file, and the ten-thousand-image test set generates the first input data, the first fixed-point feature file and the floating-point feature file through the neural network compiler;
and if the loop count equals the preset number, ending the flow.
Preferably, generating the first input data, the first fixed-point feature file and the floating-point feature file from the ten-thousand-image test set through the neural network compiler specifically comprises the following steps:
preparing a different ten-thousand-image test set for each different neural network model;
generating, from the ten-thousand-image test set through a scaling function, first input data at the network input resolution, and simulating the ten-thousand-image test set to generate the first fixed-point feature file and the floating-point feature file.
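The scaling step (producing input data at the network resolution) might look like the following pure-Python nearest-neighbour resample. The compiler's actual scaling function is not specified, so this is only an assumed stand-in.

```python
def scale_to_resolution(pixels, src_w, src_h, dst_w, dst_h):
    """Resize a row-major list of src_w*src_h pixel values to dst_w*dst_h,
    by nearest-neighbour sampling (an assumed, minimal scaling function)."""
    out = []
    for y in range(dst_h):
        sy = min(int(y * src_h / dst_h), src_h - 1)   # source row
        for x in range(dst_w):
            sx = min(int(x * src_w / dst_w), src_w - 1)  # source column
            out.append(pixels[sy * src_w + sx])
    return out
```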
Preferably, comparing the first fixed-point feature file with the floating-point feature file and outputting the precision table for the neural network model specifically comprises the following steps:
the floating-point feature file comprises first floating-point feature data; the fixed-point feature data in the first fixed-point feature file are converted into floating-point form to generate second floating-point feature data;
the similarity of the first and second floating-point feature data is compared: if the similarity lies within a preset bound, the precision requirement is met; if not, the precision requirement is not met;
and the similarity statistics of the first and second floating-point feature data are output in the form of a table.
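A minimal sketch of this comparison, under two assumptions of ours: the fixed-to-float conversion inverts formula three given later in the description (X_float ≈ (X − x′_m) / 2^f), and cosine similarity stands in for the unspecified similarity measure.

```python
import math

def dequantize(fixed, f, offset):
    # assumed inverse of X = round(X_float * 2**f) + x'_m
    return [(x - offset) / (1 << f) for x in fixed]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def precision_table(first_float, first_fixed, f, offset, bound=0.99):
    """One row per layer: (layer index, similarity, meets precision bound)."""
    rows = []
    for i, (ff, xf) in enumerate(zip(first_float, first_fixed)):
        sim = cosine_similarity(ff, dequantize(xf, f, offset))
        rows.append((i, sim, sim >= bound))
    return rows
```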
Preferably, if the statistics in the precision table fall within the preset precision range, reading the executable file and the first input data to simulate the neural network model specifically comprises the following steps:
collecting the statistics of the precision table, the statistics being required to fall within the preset precision range;
reading the executable file, configuring the hardware according to the executable file, reading the first input data, starting the simulation of the neural network model with the first input data, and generating a second fixed-point feature file;
and comparing the first fixed-point feature file with the second fixed-point feature file; if they differ, the erroneous data in the second fixed-point feature file are stored.
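The final check can be sketched as a line-by-line diff of the two fixed-point feature files, keeping only the mismatching entries as the stored "error data". The one-hex-value-per-line layout follows the description; the function name is illustrative.

```python
def diff_fixed_point(first_lines, second_lines):
    """Compare two fixed-point feature files given as lists of hex strings;
    return (line_no, expected, actual) for every mismatching line."""
    errors = []
    for i, (a, b) in enumerate(zip(first_lines, second_lines), start=1):
        if int(a, 16) != int(b, 16):   # compare values, not raw text
            errors.append((i, a, b))
    return errors
```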
Preferably, the method further comprises the steps of:
establishing a first folder, and automatically generating a first main folder under the first folder, the first main folder being used for storing the executable files;
automatically generating a first sub-folder under the first folder, the first sub-folder being used for storing the first fixed-point feature file;
and automatically generating an input data folder under the first folder, the input data folder being used for storing the first input data.
Preferably, preparing the neural network models of different types and the quantization set pictures specifically comprises the following steps:
and establishing a second folder, and generating a second main folder under the second folder, wherein the second main folder is used for storing the neural network models of different types, the quantized set pictures and the floating point characteristic files.
Preferably, preparing a different ten-thousand-image test set for each neural network model specifically comprises the following steps:
establishing a second auxiliary folder under the second main folder, the second auxiliary folder being used for storing the ten-thousand-image test set.
A neural network compiler, applied to the above simulation implementation method for improving simulation efficiency, comprises: a network analysis module, a network quantization module, a network merging module, a network storage module and a network forward execution module, connected in sequence;
the network analysis module is used for receiving the quantization set pictures, the plurality of neural network models of different types and the ten-thousand-image test set, analyzing and reconstructing the structure of the neural network model layer by layer, and acquiring at least one of the layer operation names, layer parameter information and layer association information of the input layer, the output layer and the intermediate layers of the neural network model;
the network quantization module is used for generating offset values and conversion values according to the reconstructed neural network model and converting floating-point weight values into fixed-point weight values;
the network merging module is used for merging the pipelined operation instructions of the convolution, pooling and activation layers in the neural network model;
the network storage module is used for storing the data from the network analysis module, the network quantization module and the network merging module to generate an executable file;
the network forward execution module is used for generating the first input data, the first fixed-point feature file and the floating-point feature file from the ten-thousand-image test set, comparing the first fixed-point feature file with the floating-point feature file, and outputting the precision table for the neural network model.
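The five sequentially connected modules can be pictured as a simple stage chain. Class and method names here are our own illustration, not the patent's implementation.

```python
class Compiler:
    """Assumed sketch: parse -> quantize -> merge -> store feed each other
    in sequence; the forward module then runs the stored executable."""

    def __init__(self, parse, quantize, merge, store, forward):
        self.stages = [parse, quantize, merge, store]
        self.forward = forward

    def compile(self, model):
        # each stage consumes the previous stage's output
        for stage in self.stages:
            model = stage(model)
        return model  # the stored executable

    def run(self, executable, test_set):
        # forward execution produces the inputs and feature files to compare
        return self.forward(executable, test_set)
```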
A computer readable storage medium having stored thereon computer instructions which when executed by a processor perform the steps of the method described above.
The beneficial effects provided by the application are as follows:
1. The quantization set pictures quantize different neural network models through the neural network compiler into different executable files, and if the statistics in the precision table fall within the preset precision range, the executable files and the first input data are read to simulate the neural network models. This enables batch simulation of multiple neural network models of different types, covers various edge cases during simulation, ensures correctness when the models are ported to a chip or FPGA, configures the hardware through the executable files, simulates each type of network layer by layer so that more simulation verification points are covered, reduces chip tape-out risk, saves cost, improves simulation efficiency, and at the same time performs comprehensive precision verification through the precision table compiled for each neural network model.
2. The number of neural network models is preset, the loop count is initialized to 0, and whether the loop count equals the preset number is judged. Judging against the number of models bounds the time spent generating the executable file, the first input data, the first fixed-point feature file and the floating-point feature file, avoiding the cost of re-quantizing models in the forward process. The different generated data are automatically stored under different folders along pre-stored paths, providing the corresponding data for simulating multiple types of neural network models, which simplifies the simulation flow and speeds up simulation.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a simulation implementation method for improving simulation efficiency in embodiment 1.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. The components of the embodiments of the present application generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the application, as presented in the figures, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Embodiment 1:
As shown in fig. 1, the present embodiment includes a simulation implementation method for improving simulation efficiency, including the following steps:
A neural network compiler is constructed, which is used for receiving the quantization set pictures, the neural network models of different types and the ten-thousand-image test set; the neural network models are simulated layer by layer after the neural network compiler performs precision verification.
The quantization set pictures quantize the neural network model through the neural network compiler to generate an executable file, and the ten-thousand-image test set generates the first input data, the first fixed-point feature file and the floating-point feature file through the neural network compiler.
The first fixed-point feature file is compared with the floating-point feature file, and a precision table with statistics for the neural network model is output; if the statistics in the precision table fall within the preset precision range, the executable file and the first input data are read to simulate the neural network model.
This enables batch simulation of multiple neural network models of different types, covers various edge cases during simulation, ensures correctness when the models are ported to a chip or FPGA, configures the hardware through the executable files, simulates each type of network layer by layer so that more simulation verification points are covered, reduces chip tape-out risk, saves cost, improves simulation efficiency, and at the same time performs comprehensive precision verification through the precision table compiled for the neural network models.
The method further comprises the following steps: setting up the environment of the neural network compiler, installing the compiler, and testing whether the installation succeeded; the environment is set up on the same operating system as that of the simulation system. Specifically, the neural network compiler is packaged in whl format, a compressed file format that makes installation and testing under the operating system convenient.
Quantizing the neural network model from the quantization set pictures through the neural network compiler to generate an executable file specifically comprises the following steps: preparing neural network models of different types and quantization set pictures from different scenes.
The neural network compiler is run, and the neural network model is quantized according to the quantization set pictures to generate an executable file. The executable file comprises a neural network name identifier, layer identifiers of the input layer, the intermediate layers and the output layer, quantized weight values, quantized offset values, layer operation names, layer parameter information, layer association information and layer memory information.
Specifically, the network analysis module of the neural network compiler analyzes and reconstructs the structure of the original neural network model layer by layer; offset values and conversion values are generated according to the reconstructed model, and floating-point weight values are converted into fixed-point weight values. The network merging module runs concurrently with the network quantization module and merges the pipelined operation instructions of the convolution, pooling and activation layers in the neural network model. The network storage module packs the quantized data produced by the network analysis, network quantization and network merging modules into an executable file.
The offset value is generated as follows:
Formula one: x′_m = (x′_max − x′_min) × 2^bw
where x′_m denotes the offset value, x′_max the maximum floating-point weight value, x′_min the minimum floating-point weight value, and bw the converted bit width; in this embodiment a bit width of 12 bits is currently supported.
The conversion value is generated as follows:
Formula two: f = max((bw − ceil(log2(x′_m) + 1)), bw)
where f denotes the conversion value, max, log2 and ceil (round up) are system-library built-in functions, bw denotes the converted bit width, and x′_m the offset value.
The floating-point weight values are converted into fixed-point weight values; the conversion of floating-point feature data into fixed-point feature data is expressed as follows:
Formula three: X = round(X_float × 2^f) + x′_m
where X denotes the fixed-point feature data (in this embodiment, a fixed-point weight value), X_float the floating-point feature data (in this embodiment, a floating-point weight value), round is the rounding system-library built-in function, f the conversion value, and x′_m the offset value.
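Formulas one to three transcribe directly into code, with bw = 12 as in the embodiment. Note that formula two is reproduced here exactly as printed (an outer max); some quantization schemes clamp with min instead, so treat the clamp direction as stated by the source rather than verified.

```python
import math

def offset_value(x_max, x_min, bw=12):
    """Formula one: x'_m = (x'_max - x'_min) * 2**bw."""
    return (x_max - x_min) * (1 << bw)

def conversion_value(x_m, bw=12):
    """Formula two, as printed: f = max(bw - ceil(log2(x'_m) + 1), bw)."""
    return max(bw - math.ceil(math.log2(x_m) + 1), bw)

def to_fixed(x_float, f, x_m):
    """Formula three: X = round(X_float * 2**f) + x'_m."""
    return round(x_float * (1 << f)) + x_m
```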
Specifically, the layer operation names include at least one of convolution, deconvolution, pooling, full connection, dropout, concatenation, point-wise addition, point-wise multiplication, normalization and activation operations. The layer parameter information includes at least one of the convolution kernel size, convolution kernel stride, grouping, padding value, whether an activation layer is attached, the quantized weight values and the quantized offset values. The layer association information includes at least one of the operation name and layer parameter information of the current layer's input layer, and the operation name and layer parameter information of the current layer's output layer. The layer memory information includes at least one of the memory size of the current layer and whether the memory of other layers is reused.
Specifically, the neural network models of different types include detection networks, recognition networks, classification networks and the like, and the quantization set comprises at least 50 pictures from different scenes.
The method further comprises the following steps: presetting the number of neural network models, initializing the loop count to 0, and judging whether the loop count equals the preset number of neural network models.
If the loop count does not equal the preset number, the quantization set pictures quantize the next neural network model through the neural network compiler to generate an executable file, and the ten-thousand-image test set generates the first input data, the first fixed-point feature file and the floating-point feature file through the neural network compiler.
If the loop count equals the preset number, the flow ends. The loop count is incremented by 1 each time an executable file is simulated.
Judging against the number of neural network models bounds the time spent generating the executable file, the first input data, the first fixed-point feature file and the floating-point feature file, avoiding the cost of re-quantizing models in the forward process.
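The loop control just described reduces to a counter compared against the preset model count. A minimal sketch, with an illustrative per-model callback:

```python
def run_batch(models, process_one):
    """Process each model until the loop count reaches the preset number.

    process_one is an assumed callback standing in for "quantize the model
    and generate the executable / input / feature files"."""
    count = 0
    while count != len(models):      # compare loop count with preset number
        process_one(models[count])
        count += 1                   # +1 per simulated executable
    return count
```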
Generating the first input data, the first fixed-point feature file and the floating-point feature file from the ten-thousand-image test set through the neural network compiler specifically comprises the following steps:
preparing a different ten-thousand-image test set for each neural network model; generating, through a scaling function, first input data at the network input resolution from the test set; and simulating the ten-thousand-image test set to generate the first fixed-point feature file and the floating-point feature file.
Specifically, the ten-thousand-image test set is a picture set containing ten thousand pictures, and it generates the first input data, the first fixed-point feature file and the floating-point feature file through the network forward execution module.
The method further comprises the following steps: establishing a first folder, and automatically generating under it a first main folder used for storing the executable files.
A first sub-folder is automatically generated under the first folder to store the first fixed-point feature file, and an input data folder is automatically generated under the first folder to store the first input data.
Preparing the neural network models of different types and the quantization set pictures specifically comprises the following steps: establishing a second folder, and generating under it a second main folder used for storing the neural network models of different types, the quantization set pictures and the floating-point feature files.
Preparing a different ten-thousand-image test set for each neural network model specifically comprises the following steps: establishing a second auxiliary folder under the second main folder, which is used for storing the ten-thousand-image test set.
Specifically, under the current PATH, a first folder and a second folder are established, the file name of the first folder is defined as SPE_PATH1, the file name of the second folder is defined as SPE_PATH2, under the SPE_PATH2 file, a second main folder named by the name of the neural network is established to store the neural network model and quantized set pictures generated by the GPU, and a second auxiliary folder is established under the second main folder to store the ten-thousand-person test set.
And generating a second main folder named by the neural network name under the SPE_PATH1 file by the neural network compiler every time an executable file is generated, and storing the executable file generated by the neural network compiler.
Automatically generating an input data folder under the SPE_PATH1 file, defining a neural network name which is analyzed by a neural network compiler as resnet in the embodiment, defining the file name of the generated input data folder as SPE_PATH1/resnet/data_input, storing a ten-thousand-person test set, generating first input data with the network resolution through a scaling function, and adopting a hexadecimal format and arranging each data line for the convenience of simulation.
A first sub-folder is automatically generated under the SPE_PATH1 file. With the parsed neural network name being resnet, the network layer name being conv1_1 and the layer serial number being 1, the file name of the generated first sub-folder is defined as SPE_PATH1/resnet/conv1_1, and it is used for storing the first fixed-point characteristic file generated by the intermediate layer and the output layer when the ten-thousand-person test set is simulated, so that the simulation can check the correctness of the data; the data are likewise arranged one per line in hexadecimal format.
The generated different data are automatically stored under different folders through pre-stored paths, corresponding data are provided for realizing simulation of multiple types of neural network models, the simulation flow is simplified, and the simulation efficiency is accelerated.
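The directory layout described above can be sketched in Python as follows. SPE_PATH1, SPE_PATH2, resnet, data_input and conv1_1 are taken from the embodiment; the helper name make_simulation_dirs and the test_set folder name are illustrative assumptions:

```python
from pathlib import Path


def make_simulation_dirs(root: Path, net_name: str, layer_names: list) -> dict:
    """Create the folder layout of the embodiment: SPE_PATH1 holds
    compiler outputs (executables, input data, fixed-point features),
    SPE_PATH2 holds models, quantized set pictures and the test set."""
    spe1 = root / "SPE_PATH1"
    spe2 = root / "SPE_PATH2"
    paths = {
        # first main folder: executable files, named after the network
        "executables": spe1 / net_name,
        # input data folder: scaled pictures in hex, one value per line
        "data_input": spe1 / net_name / "data_input",
        # second main folder: model + quantized set pictures
        "model": spe2 / net_name,
        # second auxiliary folder: the ten-thousand-person test set
        # ("test_set" is an assumed name)
        "test_set": spe2 / net_name / "test_set",
    }
    # first sub-folders: one per layer, e.g. SPE_PATH1/resnet/conv1_1
    for layer in layer_names:
        paths[layer] = spe1 / net_name / layer
    for p in paths.values():
        p.mkdir(parents=True, exist_ok=True)
    return paths
```

A call such as `make_simulation_dirs(Path("."), "resnet", ["conv1_1"])` then creates the paths used in this embodiment in one step.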
The method also comprises the steps of: and presetting the number of executable files, and judging whether the number of the executable files under the first main folder exceeds the number of the preset executable files.
If the number of the executable files under the first main folder does not exceed the number of the preset executable files, the neural network compiler simulates the ten-thousand-person test set to generate a first fixed-point characteristic file.
If the number of the executable files under the first main folder exceeds the number of the preset executable files, ending the process of simulating the ten-thousand-person test set by the neural network compiler.
Whether the simulation of the ten-thousand-person test set has finished is determined by judging the number of executable files under the first main folder; if the simulation has finished, the simulation flow is ended, thereby improving the simulation efficiency.
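The executable-count check above can be sketched as follows; the function name and the assumption that executables sit flat in the first main folder are illustrative:

```python
import os


def more_simulation_needed(main_folder: str, preset_count: int) -> bool:
    """Count the executable files under the first main folder.
    Per the embodiment: if the count does not exceed the preset
    number, the compiler simulates the test set once more; if it
    exceeds the preset number, the simulation flow ends."""
    count = sum(
        1
        for name in os.listdir(main_folder)
        if os.path.isfile(os.path.join(main_folder, name))
    )
    return count <= preset_count
```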
Comparing the first fixed point characteristic file with the floating point characteristic file, and outputting a precision table for the statistical neural network model, wherein the method specifically comprises the following steps:
the floating point characteristic file comprises first floating point characteristic data, the fixed point characteristic data in the first fixed point characteristic file is converted into floating point characteristic data, and second floating point characteristic data is generated;
Comparing the similarity of the first floating point characteristic data and the second floating point characteristic data, and if the similarity is within a preset variable, meeting the precision requirement; if the similarity is not in the preset variable, the accuracy requirement is not met;
And outputting the similarity statistics of the first floating point characteristic data and the second floating point characteristic data in the form of a table.
Specifically, the fixed point characteristic data in the first fixed point characteristic file is converted into floating point characteristic data through a conversion formula, wherein the conversion formula is as follows:
Equation four: X'_float = (X - X'_m) / 2^f
Wherein X'_float represents floating point feature data, which in this embodiment may be the second floating point feature data; X represents fixed point feature data, which in this embodiment may be the fixed point feature data in the first fixed-point characteristic file; X'_m represents an offset value; and f represents a conversion value.
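Equation four translates directly into code; the function name is an illustrative assumption:

```python
def fixed_to_float(x: int, offset: float, f: int) -> float:
    """Equation four of the embodiment: X'_float = (X - X'_m) / 2**f,
    mapping a fixed-point feature value back to floating point, where
    offset is X'_m and f is the conversion value."""
    return (x - offset) / (2 ** f)
```

For example, a fixed-point value 10 with offset 2 and conversion value 3 de-quantizes to (10 - 2) / 8 = 1.0.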
Specifically, the similarity between the first floating point feature data and the second floating point feature data is compared, and the similarity distance formula is as follows:
Formula five: θ = (Σ x_i·y_i) / (√(Σ x_i²) · √(Σ y_i²)), where each sum runs over i = 1, …, n
Where n represents the total number of floating point feature data, x_i represents the first floating point feature data, and y_i represents the second floating point feature data, i.e., the value of X'_float in equation four. θ represents the similarity distance, and a value closer to 1 indicates higher accuracy.
In this embodiment, the ten-thousand-person test set corresponding to a neural network model is tested, and the preset variable is set to a similarity distance of 0.8. The similarity of the first floating point feature data and the second floating point feature data is compared, that is, the similarity for each picture in the ten-thousand-person test set is counted; when the similarity distance is greater than or equal to 0.8, the precision requirement is met. The ratio of such pictures for each neural network model in the ten-thousand-person test set is counted, and a precision table for the neural network models is output. The statistical result of the precision table can be seen intuitively, so that whether the hardware design meets the precision requirement can be checked.
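The per-picture similarity statistics can be sketched as follows, assuming a cosine-style similarity distance (the exact form of formula five is not fully legible in the source) and illustrative function names:

```python
import math


def similarity(xs, ys):
    """Similarity distance between the floating point reference
    features xs and the de-quantized features ys; values close to 1
    indicate higher accuracy (cosine form assumed)."""
    dot = sum(x * y for x, y in zip(xs, ys))
    nx = math.sqrt(sum(x * x for x in xs))
    ny = math.sqrt(sum(y * y for y in ys))
    return dot / (nx * ny)


def precision_table(per_picture_pairs, threshold=0.8):
    """For each picture, compare reference and de-quantized features
    and count how many meet the preset variable (0.8 in the
    embodiment); report the ratio for the precision table."""
    passed = sum(
        1 for xs, ys in per_picture_pairs if similarity(xs, ys) >= threshold
    )
    return {
        "passed": passed,
        "total": len(per_picture_pairs),
        "ratio": passed / len(per_picture_pairs),
    }
```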
If the statistical result of the precision table accords with the preset precision range, reading the executable file and the first input data to simulate the neural network model, wherein the method specifically comprises the following steps:
The precision table is counted, and the statistical result is required to conform to the preset precision range. The executable file is read, the hardware is configured according to the executable file, the first input data is read, and the simulation of the neural network model is started according to the first input data to generate a second fixed-point characteristic file.
And comparing the first fixed point characteristic file with the second fixed point characteristic file, and if the first fixed point characteristic file and the second fixed point characteristic file are different, storing error data in the second fixed point characteristic file.
The simulation problem can be conveniently located through the error data in the second fixed-point characteristic file, the simulation efficiency can be improved, and the simulation coverage is wider.
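The comparison above can be sketched as follows, assuming the two fixed-point characteristic files are read as lists of hexadecimal lines as described in the embodiment; the function name and error-record layout are illustrative:

```python
def compare_fixed_point_files(first_lines, second_lines):
    """Compare the first fixed-point characteristic file (compiler
    reference) with the second one (hardware simulation output) line
    by line, and collect the mismatching entries as error data so the
    simulation problem can be located."""
    errors = []
    for i, (ref, got) in enumerate(zip(first_lines, second_lines)):
        if ref != got:
            errors.append({"line": i, "expected": ref, "got": got})
    return errors
```

An empty result means the hardware simulation reproduced the compiler reference exactly; any entries point at the first diverging layers or values.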
Embodiment 2:
the embodiment includes a neural network compiler, which is applied to the simulation implementation method for improving the simulation efficiency of embodiment 1, and includes: the system comprises a network analysis module, a network quantification module, a network merging module, a network storage module and a network forward execution module which are connected in sequence.
The network analysis module is used for receiving the quantized set pictures, the different types of neural network models and the ten-thousand-person test set, analyzing and reconstructing the structure of the neural network model layer by layer, and acquiring at least one of the input layer, the output layer, and the layer operation name, layer parameter information and layer association information of the intermediate layer of the neural network model.
Specifically, the network analysis module analyzes the structure of the original neural network model layer by layer and acquires at least one of the input layer, the output layer, and the layer operation name, layer parameter information and layer association information of the intermediate layer of the neural network model. After analysis, the structure executed in internal sequence is reconstructed and the data structures of the internal relevant network layers are redefined; the network layers comprise a convolution layer, a pooling layer and an activation layer, and content such as the layer execution sequence, layer operation type, layer operation name, layer parameter information and layer association information is filled into the data structures of the internal relevant network layers.
And the network quantization module is used for generating an offset value, a conversion value and converting a floating point type weight value into a fixed point type weight value according to the reconstructed neural network model.
Specifically, floating point characteristic data of the storage address space is converted into a data format supported by hardware, and conversion values are calculated, so that the calculated amount of hardware and the number of multipliers are reduced.
And the network merging module is used for merging the pipeline operation instructions of the convolution layer, the pooling layer and the activation layer in the neural network model.
Specifically, according to the principle of reducing the bandwidth of an external memory, pipeline operation instructions in a convolution layer, a pooling layer and an activation layer are optimized, equivalent transformation optimization is performed on the convolution layer, the pooling layer and the activation layer, and internal data structures are optimized and combined again, so that the resource consumption is reduced, and the execution efficiency is improved. And the data interaction between the internal memory and the external memory is reduced, so that the bandwidth utilization rate is improved, and the layers in the same pipeline stage are combined, wherein the main combined layers are a convolution layer and a pooling layer.
And the network storage module is used for storing the data in the network analysis module, the network quantization module and the network merging module to generate an executable file.
The network forward execution module is used for generating the first input data, the first fixed-point characteristic file and the floating point characteristic file from the ten-thousand-person test set, comparing the first fixed-point characteristic file and the floating point characteristic file, and outputting the precision table for counting the neural network model.
Specifically, the standard part is implemented by adopting an open-source deep learning framework so as to ensure a correct comparison standard, and the simulation part keeps the forward logic of the network consistent with the logic of the hardware execution network so as to ensure that the data simulation result is consistent with the hardware.
For related content, refer to the description of Embodiment 1.
Embodiment 3:
a computer readable storage medium having stored thereon computer instructions which when executed by a processor perform the steps of the method of embodiment 2.
It will be apparent to those skilled in the art that embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that:
Reference in the specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the application. Thus, the appearances of the phrase "one embodiment" or "an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
In addition, the specific embodiments described in the present specification may differ in terms of parts, shapes of components, names, and the like. All equivalent or simple changes of the structure, characteristics and principle according to the inventive concept are included in the protection scope of the present application. Those skilled in the art may make various modifications or additions to the described embodiments or substitutions in a similar manner without departing from the scope of the application as defined in the accompanying claims.

Claims (9)

1. The simulation implementation method for improving the simulation efficiency is characterized by comprising the following steps of:
A neural network compiler is constructed and used for receiving quantized set pictures, a plurality of different types of neural network models and a ten-thousand-person test set, constructing the environment of the neural network compiler and installing the neural network compiler;
after the neural network compiler performs accuracy verification, simulating the neural network model layer by layer;
the quantization set picture quantizes the neural network model through the neural network compiler to generate an executable file, and the ten-thousand-person test set generates first input data, a first fixed-point characteristic file and a floating-point characteristic file through the neural network compiler;
comparing the first fixed point characteristic file with the floating point characteristic file, and outputting a precision table for counting the neural network model;
if the statistical result of the precision table accords with a preset precision range, reading the executable file and the first input data to simulate the neural network model;
The method comprises the steps of presetting the number of neural network models, setting initial circulation times to 0, and judging whether the circulation times accord with the preset number of the neural network models;
If the cycle times do not accord with the preset quantity of the neural network models, the quantization set picture quantizes the neural network models through the neural network compiler to generate the executable file, and the ten-thousand-person test set generates the first input data, the first fixed-point characteristic file and the floating-point characteristic file through the neural network compiler;
if the cycle times accord with the preset quantity of the neural network models, ending the flow;
The quantized set pictures are pictures collected under different types of neural network models and different scenes, and the ten-thousand-person test set is a picture set.
2. The simulation implementation method for improving simulation efficiency according to claim 1, further comprising the steps of:
testing whether the neural network compiler is successfully installed;
the building environment of the neural network compiler is set to be the same operating system as that of the simulation system.
3. The simulation implementation method for improving simulation efficiency according to claim 1, wherein the quantization set picture quantizes the neural network model through the neural network compiler to generate an executable file, and specifically comprises the following steps:
preparing different types of neural network models and quantized set pictures in different scenes;
operating the neural network compiler, and quantizing the neural network model according to the quantized set picture to generate the executable file;
the executable file comprises a neural network name identifier, a layer identifier of an input layer, a layer identifier of an intermediate layer, a layer identifier of an output layer, a quantized weight value, a quantized offset value, a layer operation name, layer parameter information, layer association information and layer memory information.
4. The simulation implementation method for improving simulation efficiency according to claim 1, wherein the ten thousand person test set generates first input data, a first fixed point feature file and a floating point feature file through the neural network compiler, specifically comprising the following steps:
preparing different ten-thousand-person test sets according to different neural network models;
The ten-thousand-person test set generates first input data with the network resolution through a scaling function, and the ten-thousand-person test set is simulated to generate a first fixed-point characteristic file and a floating-point characteristic file.
5. The simulation implementation method for improving simulation efficiency according to claim 1, wherein comparing the first fixed point profile and the floating point profile, outputting a precision table for counting the neural network model, comprises the following steps:
the floating point characteristic file comprises first floating point characteristic data, the fixed point characteristic data in the first fixed point characteristic file is converted into floating point characteristic data, and second floating point characteristic data is generated;
comparing the similarity of the first floating point characteristic data and the second floating point characteristic data, and if the similarity is within a preset variable, meeting the precision requirement; if the similarity is not in the preset variable, the accuracy requirement is not met;
And outputting the similarity statistical result of the first floating point characteristic data and the second floating point characteristic data in a form of a table.
6. The simulation implementation method for improving simulation efficiency according to claim 1, wherein if the statistical result of the precision table accords with a preset precision range, the executable file and the first input data are read to perform the simulation of the neural network model, and specifically comprising the following steps:
counting the precision table, wherein the counting result is required to accord with a preset precision range;
Reading the executable file, configuring hardware according to the executable file, reading the first input data, starting simulation of the neural network model according to the first input data, and generating a second fixed-point characteristic file;
and comparing the first fixed point characteristic file with the second fixed point characteristic file, and if the first fixed point characteristic file and the second fixed point characteristic file are different, storing error data in the second fixed point characteristic file.
7. The simulation implementation method for improving simulation efficiency according to claim 1, further comprising the steps of:
establishing a first folder, and automatically generating a first main folder under the first folder, wherein the first main folder is used for storing the executable files;
automatically generating a first sub-folder under a first folder, wherein the first sub-folder is used for storing the first fixed-point characteristic file;
and automatically generating an input data folder under a first folder, wherein the input data folder is used for storing the first input data.
8. A simulation implementation method for improving simulation efficiency according to claim 3, wherein different types of neural network models and quantized set pictures are prepared, specifically comprising the steps of:
and establishing a second folder, and generating a second main folder under the second folder, wherein the second main folder is used for storing the neural network models of different types, the quantized set pictures and the floating point characteristic files.
9. The simulation implementation method for improving simulation efficiency according to claim 4, wherein different ten thousand person test sets are prepared according to different neural network models, specifically comprising the following steps:
Establishing a second folder, generating a second main folder under the second folder, and establishing a second auxiliary folder under the second main folder, wherein the second auxiliary folder is used for storing the ten-thousand-person test set.
CN202210321357.4A 2021-12-31 2021-12-31 Simulation implementation method for improving simulation efficiency Active CN114707650B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210321357.4A CN114707650B (en) 2021-12-31 2021-12-31 Simulation implementation method for improving simulation efficiency

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210321357.4A CN114707650B (en) 2021-12-31 2021-12-31 Simulation implementation method for improving simulation efficiency
CN202111653883.2A CN114004352B (en) 2021-12-31 2021-12-31 Simulation implementation method, neural network compiler and computer readable storage medium

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202111653883.2A Division CN114004352B (en) 2021-12-31 2021-12-31 Simulation implementation method, neural network compiler and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN114707650A CN114707650A (en) 2022-07-05
CN114707650B true CN114707650B (en) 2024-06-14

Family

ID=79932421

Family Applications (3)

Application Number Title Priority Date Filing Date
CN202210321357.4A Active CN114707650B (en) 2021-12-31 2021-12-31 Simulation implementation method for improving simulation efficiency
CN202111653883.2A Active CN114004352B (en) 2021-12-31 2021-12-31 Simulation implementation method, neural network compiler and computer readable storage medium
CN202210315323.4A Active CN114676830B (en) 2021-12-31 2021-12-31 Simulation implementation method based on neural network compiler

Family Applications After (2)

Application Number Title Priority Date Filing Date
CN202111653883.2A Active CN114004352B (en) 2021-12-31 2021-12-31 Simulation implementation method, neural network compiler and computer readable storage medium
CN202210315323.4A Active CN114676830B (en) 2021-12-31 2021-12-31 Simulation implementation method based on neural network compiler

Country Status (1)

Country Link
CN (3) CN114707650B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114386588B (en) * 2022-03-23 2022-07-29 杭州雄迈集成电路技术股份有限公司 Neural network reasoning method and system
CN116796674B (en) * 2023-08-24 2023-11-24 上海合见工业软件集团有限公司 Heterogeneous hardware simulation method and system
CN117034822B (en) * 2023-10-10 2023-12-15 北京云枢创新软件技术有限公司 Verification method based on three-step simulation, electronic equipment and medium

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103929210B (en) * 2014-04-25 2017-01-11 重庆邮电大学 Hard decision decoding method based on genetic algorithm and neural network
US10643126B2 (en) * 2016-07-14 2020-05-05 Huawei Technologies Co., Ltd. Systems, methods and devices for data quantization
WO2018015963A1 (en) * 2016-07-21 2018-01-25 Ramot At Tel-Aviv University Ltd. Method and system for comparing sequences
CN108510067B (en) * 2018-04-11 2021-11-09 西安电子科技大学 Convolutional neural network quantification method based on engineering realization
US20210097391A1 (en) * 2018-04-17 2021-04-01 Shenzhen Corerain Technologies Co., Ltd. Network model compiler and related product
US11645493B2 (en) * 2018-05-04 2023-05-09 Microsoft Technology Licensing, Llc Flow for quantized neural networks
CN109102064B (en) * 2018-06-26 2020-11-13 杭州雄迈集成电路技术股份有限公司 High-precision neural network quantization compression method
US11586883B2 (en) * 2018-12-14 2023-02-21 Microsoft Technology Licensing, Llc Residual quantization for neural networks
CN109740302B (en) * 2019-04-02 2020-01-10 深兰人工智能芯片研究院(江苏)有限公司 Simulation method and device of neural network
CN113228056B (en) * 2019-10-12 2023-12-22 深圳鲲云信息科技有限公司 Runtime hardware simulation method, device, equipment and storage medium
CN113272813B (en) * 2019-10-12 2023-05-05 深圳鲲云信息科技有限公司 Custom data stream hardware simulation method, device, equipment and storage medium
CN110795165A (en) * 2019-10-12 2020-02-14 苏州浪潮智能科技有限公司 Neural network model data loading method and related device
CN110750945B (en) * 2019-12-25 2020-11-13 安徽寒武纪信息科技有限公司 Chip simulation method and device, simulation chip and related product
CN111178512B (en) * 2019-12-31 2023-04-18 中科南京人工智能创新研究院 Device operation neural network test method and device
CN113326930B (en) * 2020-02-29 2024-05-03 华为技术有限公司 Data processing method, neural network training method, related device and equipment
CN111401550A (en) * 2020-03-10 2020-07-10 北京迈格威科技有限公司 Neural network model quantification method and device and electronic equipment
CN111523526A (en) * 2020-07-02 2020-08-11 杭州雄迈集成电路技术股份有限公司 Target detection method, computer equipment and readable storage medium
CN112232497A (en) * 2020-10-12 2021-01-15 苏州浪潮智能科技有限公司 Method, system, device and medium for compiling AI chip
CN112446491B (en) * 2021-01-20 2024-03-15 上海齐感电子信息科技有限公司 Real-time automatic quantification method and real-time automatic quantification system for neural network model
CN113159276B (en) * 2021-03-09 2024-04-16 北京大学 Model optimization deployment method, system, equipment and storage medium
CN113435570B (en) * 2021-05-07 2024-05-31 西安电子科技大学 Programmable convolutional neural network processor, method, device, medium and terminal

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Matlab Modeling and Hardware Implementation of a BP Neural Network Character Recognition System; Yu Fei; Zhao Jie; Wang Jingxia; Wen Guozhong; Song Rong; Journal of Shenzhen Polytechnic; 2019-05-20 (03); full text *

Also Published As

Publication number Publication date
CN114004352A (en) 2022-02-01
CN114004352B (en) 2022-04-26
CN114707650A (en) 2022-07-05
CN114676830A (en) 2022-06-28
CN114676830B (en) 2024-06-14

Similar Documents

Publication Publication Date Title
CN114707650B (en) Simulation implementation method for improving simulation efficiency
CN111709522B (en) Deep learning target detection system based on server-embedded cooperation
Capotondi et al. CMix-NN: Mixed low-precision CNN library for memory-constrained edge devices
CN110515739B (en) Deep learning neural network model load calculation method, device, equipment and medium
CN107480789B (en) Efficient conversion method and device of deep learning model
CN112101525A (en) Method, device and system for designing neural network through NAS
EP3846034B1 (en) Systems and methods for automated testing using artificial intelligence techniques
US20240161474A1 (en) Neural Network Inference Acceleration Method, Target Detection Method, Device, and Storage Medium
CN114429208A (en) Model compression method, device, equipment and medium based on residual structure pruning
CN116450486A (en) Modeling method, device, equipment and medium for nodes in multi-element heterogeneous computing system
CN115640851A (en) Neural network efficient reasoning method suitable for test instrument
Zhong et al. Reduced-order digital twin and latent data assimilation for global wildfire prediction
CN114970357A (en) Energy-saving effect evaluation method, system, device and storage medium
CN114492742A (en) Neural network structure searching method, model issuing method, electronic device, and storage medium
CN117196000A (en) Edge side model reasoning acceleration method for containerized deployment
US20220058530A1 (en) Method and device for optimizing deep learning model conversion, and storage medium
Anuradha et al. Efficient workload characterization technique for heterogeneous processors
CN117435870B (en) Load data real-time filling method, system, equipment and medium
CN116561696B (en) Multi-dimensional user adjustable load rapid aggregation method and system thereof
CN115146596B (en) Recall text generation method and device, electronic equipment and storage medium
CN115759197A (en) Neural network searching method and device and computer equipment
US20220414457A1 (en) Selective data structure encoding for deep neural network training
CN114691457A (en) Method, device, storage medium and electronic equipment for determining hardware performance
Westby FPGA Acceleration on Multilayer Perceptron (MLP) Neural Network for Handwritten Digit Recognition
CN117931211A (en) Model deployment method, device, apparatus, chip and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: 311400 4th floor, building 9, Yinhu innovation center, No.9 Fuxian Road, Yinhu street, Fuyang District, Hangzhou City, Zhejiang Province

Applicant after: Zhejiang Xinmai Microelectronics Co.,Ltd.

Address before: 311400 4th floor, building 9, Yinhu innovation center, No.9 Fuxian Road, Yinhu street, Fuyang District, Hangzhou City, Zhejiang Province

Applicant before: Hangzhou xiongmai integrated circuit technology Co.,Ltd.

Country or region before: China

GR01 Patent grant