WO2021068253A1 - Customized data flow hardware simulation method, apparatus, device, and storage medium - Google Patents

Customized data flow hardware simulation method, apparatus, device, and storage medium

Info

Publication number
WO2021068253A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
data
simulated
parameters
layer
Prior art date
Application number
PCT/CN2019/110858
Other languages
English (en)
French (fr)
Inventor
郭理源
黄炯凯
蔡权雄
牛昕宇
Original Assignee
深圳鲲云信息科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳鲲云信息科技有限公司
Priority to CN201980066982.5A (CN113272813B)
Priority to PCT/CN2019/110858
Publication of WO2021068253A1

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 — Digital computing or data processing equipment or methods, specially adapted for specific functions
    • Y — GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 — TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D — CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The invention belongs to the field of artificial intelligence technology, and in particular relates to a customized data flow hardware simulation method, device, equipment, and storage medium.
  • Integrated circuit verification methods mainly fall into three categories: simulation verification based on a verification platform or software, formal verification, and software-hardware co-verification.
  • Among these, simulation verification is the most widely used and is an indispensable part of integrated circuit design.
  • In simulation verification, test cases check whether the RTL hardware design produces the expected response under a specific stimulus.
  • The existing hardware system-level simulation method suffers from low simulation speed, resulting in low development efficiency for customized data flow products.
  • An embodiment of the present invention provides a customized data flow hardware simulation method that aims to solve this problem.
  • The embodiment of the present invention provides a customized data flow hardware simulation method, including the steps of:
  • acquiring customized data flow hardware parameters and data to be simulated, the customized data flow hardware parameters including register configuration parameters, a neural network structure diagram, and neural network parameters, the neural network structure diagram including the serial relationships between different neural network layers;
  • simulating and constructing a corresponding simulated neural network according to the neural network structure diagram, the simulated neural network including the data flow relationships between different neural network layers, the data flow relationships being obtained from the serial relationships;
  • inputting the data to be simulated and the neural network parameters in the corresponding register address into the simulated neural network for simulation calculation to obtain verification data, and returning the verification data to the corresponding register address.
  • the register configuration parameters include global flow configuration parameters and local flow configuration parameters;
  • the neural network parameters include the parameters of different neural network layers;
  • the corresponding register addresses are configured in the C language environment according to the register configuration parameters;
  • the specific steps of registering the data to be simulated, the neural network structure diagram, and the neural network parameters to the corresponding register addresses include:
  • the step of simulating and constructing a corresponding simulated neural network according to the neural network structure diagram in the corresponding register address specifically includes:
  • reading the neural network structure diagram and simulating and constructing the corresponding simulated neural network accordingly.
  • the step of inputting the data to be simulated and the neural network parameters in the corresponding register address into the simulated neural network for simulation calculation to obtain verification data, and returning it to the corresponding register address, specifically includes:
  • obtaining the verification data of the simulated neural network and returning it to the global flow register address corresponding to the data to be simulated.
  • the method further includes: quantizing the data to be simulated to obtain data to be simulated with an 8-bit unit length;
  • the step of inputting the data to be simulated into the simulated neural network specifically includes:
  • inputting the data to be simulated with an 8-bit unit length into the simulated neural network.
  • the parameters of each neural network layer in the corresponding local flow register address and the corresponding data to be simulated are respectively read and calculated to obtain the layer verification data corresponding to each neural network layer, and the layer verification data is returned to the corresponding local flow register address;
  • the step of returning the layer verification data of a neural network layer to the corresponding local flow register address specifically also includes:
  • quantizing the layer verification data corresponding to the previous neural network layer to obtain layer verification data with an 8-bit unit length;
  • the present invention also provides a customized data flow hardware simulation device, the device including:
  • an acquisition module, used to acquire customized data flow hardware parameters and data to be simulated;
  • the customized data flow hardware parameters include register configuration parameters, a neural network structure diagram, and neural network parameters;
  • the neural network structure diagram includes the serial relationships between different neural network layers;
  • a configuration module, used to configure the corresponding register addresses in the C language environment according to the register configuration parameters, and to register the data to be simulated, the neural network structure diagram, and the neural network parameters to the corresponding register addresses;
  • a construction module, used to simulate and construct a corresponding simulated neural network according to the neural network structure diagram in the corresponding register address;
  • the simulated neural network includes the data flow relationships between different neural network layers, and the data flow relationships are obtained from the serial relationships;
  • a calculation module, used to input the data to be simulated and the neural network parameters in the corresponding register address into the simulated neural network for simulation calculation, obtain verification data, and return it to the corresponding register address.
  • the register configuration parameters include global flow configuration parameters and local flow configuration parameters, and the neural network parameters include different neural network layer parameters;
  • the configuration module includes:
  • a first configuration unit, configured to respectively configure the global flow register addresses corresponding to the data to be simulated and the neural network structure diagram;
  • a second configuration unit, configured to configure the local flow register addresses corresponding to the different neural network layer parameters.
  • the present invention also provides a computer device, including a memory and a processor, the memory storing a computer program; when the processor executes the computer program, the steps of the customized data flow hardware simulation method according to any one of the embodiments of the present invention are implemented.
  • the present invention also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the customized data flow hardware simulation method according to any one of the embodiments of the present invention are implemented.
  • The beneficial effect achieved by the present invention: because the workflow of the customized data flow hardware is simulated in the C language environment, the simulation of the hardware part in system-level simulation verification is realized through the C language environment, and the software part and the hardware part can perform the data flow calculation in the same environment.
  • This facilitates coordinated verification of the software part and the hardware part during development, and improves development efficiency.
  • FIG. 1 is a schematic architecture diagram of customized data flow hardware provided by an embodiment of the present invention;
  • FIG. 2 is a schematic flowchart of a customized data flow hardware simulation method provided by an embodiment of the present invention;
  • FIG. 3 is a schematic flowchart of another customized data flow hardware simulation method provided by an embodiment of the present invention;
  • FIG. 4 is a schematic structural diagram of a customized data flow hardware simulation device provided by an embodiment of the present invention;
  • FIG. 5 is a schematic diagram of the specific structure of the configuration module 402 according to an embodiment of the present invention;
  • FIG. 6 is a schematic diagram of the specific structure of the construction module 403 provided by an embodiment of the present invention;
  • FIG. 7 is a schematic diagram of the specific structure of the calculation module 404 according to an embodiment of the present invention;
  • FIG. 8 is a schematic structural diagram of another customized data flow hardware simulation device provided by an embodiment of the present invention;
  • FIG. 9 is a schematic diagram of the specific structure of the calculation unit 4043 provided by an embodiment of the present invention;
  • FIG. 10 is a schematic structural diagram of an embodiment of a computer device according to an embodiment of the present invention.
  • The present invention simulates the workflow of customized data flow hardware in the C language environment, realizing the simulation of the hardware part of system-level simulation verification through the C language environment.
  • The software part and the hardware part can thus perform the data flow calculation in the same environment.
  • FIG. 1 is an architecture diagram of customized data flow hardware provided by an embodiment of the present invention.
  • The architecture 103 is connected to the off-chip memory module (DDR) 101 and the CPU 102 through an interconnect.
  • The architecture 103 includes: a first storage module 104, a global data flow network 105, and a data flow engine 106. The first storage module 104 is connected to the off-chip storage module 101 through an interconnect, and is also connected to the global data flow network 105 through an interconnect.
  • The data flow engine 106 is connected to the global data flow network 105 through an interconnect, so that multiple data flow engines 106 can run in parallel or in series.
  • The data flow engine 106 may include: a calculation core (also called a calculation module), a second storage module 108, and a local data flow network 107.
  • The calculation cores may include cores used for calculation, such as a convolution core 109, a pooling core 110, and an activation function core 111. Of course, other calculation cores besides these examples may also be included; this is not limited here, and any core used in neural network calculation may be included.
  • The first storage module 104 and the second storage module 108 may be on-chip cache modules, or may be DDR or high-speed DDR memory modules.
  • The data flow engine 106 can be understood as a calculation engine that supports, or is dedicated to, data flow processing.
  • The above data flow architecture can be implemented as customized hardware on an FPGA.
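The modules of FIG. 1 can be mirrored in the C simulation environment as plain data structures. The following sketch is purely illustrative — the patent gives no code, and every type and field name here is an assumption — but it shows one way the first storage module 104, the global data flow network 105, and the data flow engines 106 with their convolution/pooling/activation cores 109-111 might be modeled:

```c
#include <stddef.h>

/* Hypothetical model of the Fig. 1 architecture; all names are
   illustrative assumptions, not the patent's implementation. */

typedef enum { CORE_CONV, CORE_POOL, CORE_ACT } core_kind_t;

typedef struct {                  /* data flow engine 106 */
    core_kind_t cores[3];         /* convolution 109, pooling 110, activation 111 */
    unsigned char *local_mem;     /* second storage module 108 */
    size_t local_mem_size;
} dataflow_engine_t;

typedef struct {                  /* architecture 103 */
    unsigned char *shared_mem;    /* first storage module 104 */
    size_t shared_mem_size;
    dataflow_engine_t *engines;   /* linked via global data flow network 105 */
    size_t n_engines;
} dataflow_arch_t;

/* Initialize one engine with the three example cores and its local memory. */
static void engine_init(dataflow_engine_t *e, unsigned char *mem, size_t n) {
    e->cores[0] = CORE_CONV;
    e->cores[1] = CORE_POOL;
    e->cores[2] = CORE_ACT;
    e->local_mem = mem;
    e->local_mem_size = n;
}
```

Because the engines all hang off the global data flow network, running them in parallel or in series is then a question of how the simulation iterates over `engines`.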
  • Referring to FIG. 2, it is a flowchart of an embodiment of the customized data flow hardware simulation method according to the present application.
  • The customized data flow hardware simulation method includes the steps:
  • S101: Acquire customized data flow hardware parameters and data to be simulated.
  • The customized data flow hardware parameters include register configuration parameters, a neural network structure diagram, and neural network parameters.
  • The register configuration parameters are used to open up corresponding storage areas in the memory of the C language environment to form the corresponding register addresses.
  • The neural network structure diagram can be of a recognition type, such as face recognition or vehicle recognition, or of a detection type, such as object detection or vehicle detection.
  • The neural network structure diagram can be understood as a neural network structure, and further, as the neural network structure used by various neural network models.
  • The neural network structure uses layers as its calculation units, including but not limited to: convolutional layers, pooling layers, ReLU, fully connected layers, and so on.
  • The neural network structure diagram includes the serial relationships between different neural network layers, such as the serial relationship between a convolutional layer, a bias layer, and a pooling layer.
  • The neural network parameters refer to the parameters corresponding to each layer in the neural network structure, and may be weight parameters, bias parameters, and so on.
  • The various neural network models described above can be pre-trained neural network models. Since a neural network model is pre-trained, its neural network parameters are already trained; therefore, the neural network configured in the simulation software can be used directly with the configured parameters, with no need to train it. From the pre-trained neural network model, the neural network structure diagram and parameters can be described uniformly.
  • The neural network structure diagram and neural network parameters can be acquired locally or from a cloud server.
  • For example, they can be stored locally and loaded automatically or selected by the user when needed; alternatively, the neural network structure diagram and neural network parameters can be uploaded to a cloud server and downloaded through the network when in use.
  • S102: Configure the corresponding register addresses in the C language environment according to the register configuration parameters, and register the data to be simulated, the neural network structure diagram, and the neural network parameters to the corresponding register addresses.
  • The register configuration parameters here are those obtained in step S101; they specify how many register addresses to configure and the size of each.
  • Each register address can store corresponding data. For example, one register address can be configured to store the data to be simulated and another to store the parameters of a neural network layer; when needed, the corresponding data is simply read directly from the register address.
  • A register address is also used to store the corresponding calculation result.
  • For example, the register address corresponding to a convolutional neural network layer stores the corresponding weight parameters, and is also used to store the corresponding convolution result after the convolution calculation is completed.
  • The register configuration parameters include global flow configuration parameters and local flow configuration parameters, and the neural network parameters include the parameters of different neural network layers.
  • The global flow register addresses corresponding to the data to be simulated and the neural network structure diagram are configured respectively.
  • Each neural network layer needs different configuration parameters for its calculation: the convolutional layer needs weight parameters and input data, and the bias layer needs bias parameters and input data.
  • Therefore, different local flow register addresses can be configured for the different neural network layer parameters; when calculating a given neural network layer, the parameters and input data at the corresponding register address are read for the calculation. For example, when performing a convolution calculation, the weight parameters are read from the register address where the weight parameters of the corresponding convolution layer are stored, and the calculation is performed with the input data.
  • The register addresses configured in the C language environment can be opened up as storage areas on, for example, the computer's hard disk, or in virtual memory space configured in the C language environment.
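Step S102 can be pictured as a table of host-memory blocks, each standing in for one register address. This is a hedged sketch, not the patent's implementation; `reg_addr_t`, `reg_configure`, and `reg_write` are invented names:

```c
#include <stdlib.h>
#include <string.h>

/* One configured register address: a storage area opened up in the
   C environment's memory, with its configured size. */
typedef struct {
    void  *base;
    size_t size;
} reg_addr_t;

/* Configure n register addresses of the given sizes (how many addresses
   and how large each is would come from the register configuration
   parameters of step S101). */
static reg_addr_t *reg_configure(const size_t *sizes, size_t n) {
    reg_addr_t *regs = malloc(n * sizeof *regs);
    if (!regs) return NULL;
    for (size_t i = 0; i < n; i++) {
        regs[i].base = calloc(1, sizes[i]);
        regs[i].size = sizes[i];
    }
    return regs;
}

/* Register (write) data to a configured address; the same call can
   later write a calculation result back to its register address. */
static int reg_write(reg_addr_t *r, const void *data, size_t len) {
    if (len > r->size) return -1;   /* data must fit the configured size */
    memcpy(r->base, data, len);
    return 0;
}
```

Reading simply dereferences `base`, which matches the text's point that the data is read directly from the register address when needed.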
  • S103: Simulate and construct a corresponding simulated neural network according to the neural network structure diagram in the corresponding register address.
  • The neural network structure diagram includes the serial relationships between different neural network layers. From these serial relationships, the data flow relationships between the different neural network layers are determined, so that the corresponding simulated neural network can be constructed. It should be noted that the simulated neural network is a customized data flow simulated neural network.
  • The neural network structure diagram is read from the global flow register address where it is stored, and the corresponding simulated neural network is simulated and constructed from it. In this way, the neural network structure diagram in the global flow register address can be reused.
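Step S103 can be sketched as turning the serial order in the structure diagram into a linked chain of layers, where each `next` pointer is the data flow relationship (output of one layer feeds the next). This is an illustrative assumption, not the patent's code:

```c
#include <stdlib.h>

/* Illustrative layer kinds taken from the examples in the text. */
typedef enum { LAYER_CONV, LAYER_BIAS, LAYER_POOL } layer_kind_t;

typedef struct sim_layer {
    layer_kind_t kind;
    struct sim_layer *next;   /* data flow relationship: output feeds next layer */
} sim_layer_t;

/* Build the simulated network from the serial order given by the
   neural network structure diagram (here: an array of layer kinds). */
static sim_layer_t *build_sim_network(const layer_kind_t *graph, size_t n) {
    sim_layer_t *head = NULL, **tail = &head;
    for (size_t i = 0; i < n; i++) {
        sim_layer_t *l = malloc(sizeof *l);
        if (!l) return head;
        l->kind = graph[i];
        l->next = NULL;
        *tail = l;            /* append in serial order */
        tail = &l->next;
    }
    return head;
}
```

Since the structure diagram stays in its global flow register address, `build_sim_network` can be called again on the same diagram, which is the reuse the text describes.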
  • S104: Input the data to be simulated and the neural network parameters in the corresponding register address into the simulated neural network for simulation calculation, obtain verification data, and return it to the corresponding register address.
  • The corresponding data to be simulated is read from the register address where it is stored, and after the simulated neural network finishes its calculation, the verification data to be compared with the real hardware's calculation result is obtained. The verification data is then written back to the corresponding register address, from which the upper-layer software can read it and provide it to the user.
  • Specifically, the data to be simulated is read from the corresponding global flow register address.
  • The verification data of the simulated neural network is obtained and returned to the corresponding global flow register address.
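Putting S104 together, a minimal sketch might read the data from its register address, push it through each layer in turn, and leave the verification data in the same buffer to be written back. The toy per-layer operation below stands in for real convolution/pooling/activation, and all names are assumptions:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for one layer's parameters as read from its local flow
   register address. */
typedef struct {
    int32_t param;
} sim_layer_params_t;

/* Toy layer op: a real simulated layer would perform convolution,
   bias addition, pooling, etc. with the layer's parameters. */
static void run_layer(const sim_layer_params_t *p,
                      const int32_t *in, int32_t *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] + p->param;
}

/* Simulate: the buffer holding the data to be simulated doubles as the
   register address that receives the verification data at the end. */
static void simulate(int32_t *reg_data, size_t n,
                     const sim_layer_params_t *layers, size_t n_layers) {
    for (size_t l = 0; l < n_layers; l++)
        run_layer(&layers[l], reg_data, reg_data, n);  /* layer by layer, in place */
}
```

After `simulate` returns, upper-layer software reads the buffer as the verification data to compare against the real hardware's result.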
  • In this embodiment, the customized data flow hardware parameters and the data to be simulated are acquired; the customized data flow hardware parameters include register configuration parameters, a neural network structure diagram, and neural network parameters, and the neural network structure diagram includes the serial relationships between different neural network layers.
  • The corresponding register addresses are configured in the C language environment according to the register configuration parameters, and the data to be simulated, the neural network structure diagram, and the neural network parameters are registered to the corresponding register addresses; the corresponding simulated neural network is then simulated and constructed according to the neural network structure diagram in the corresponding register address.
  • The simulated neural network includes the data flow relationships between different neural network layers, obtained from the serial relationships; the data to be simulated and the neural network parameters in the corresponding register address are input into the simulated neural network for simulation calculation to obtain verification data, which is returned to the corresponding register address.
  • Since the workflow of the customized data flow hardware is simulated in the C language environment, the hardware part of system-level simulation verification is realized through the C language environment, and the software part and the hardware part can perform the data flow calculation in the same environment. This facilitates coordinated verification of the software and hardware parts during development and improves development efficiency.
  • Referring to FIG. 3, it is a flowchart of an embodiment of another customized data flow hardware simulation method according to the present application.
  • This customized data flow hardware simulation method includes the steps:
  • S201: Acquire customized data flow hardware parameters and data to be simulated.
  • S202 Configure a corresponding register address in the C language environment according to the register configuration parameter, and register the data to be simulated, the neural network structure diagram, and the neural network parameter to the corresponding register address.
  • S203 Simulate and construct a corresponding simulated neural network according to the neural network structure diagram in the corresponding registered address.
  • The quantization mentioned above is quantization based on quantization information; the quantization information is included in the neural network structure diagram and includes the information needed to quantize data into an 8-bit unit length.
  • The data to be simulated can be quantized into 8-bit data according to this quantization information.
  • The quantization can be done by a compiler.
  • In the quantization relationship, r refers to the floating-point value, which is the data input by the user; q refers to the quantized data; z is the offset (zero point); and s is the scale factor. The values of s and z are generated by the compiler.
  • The input data to be simulated is quantized by the compiler to obtain the quantized simulation input data, which is of the same data type as the neural network parameters.
  • Before quantization, the neural network parameters differ in type from the data to be simulated: the neural network parameters are of the hardware data type, that is, an integer data type, while the data to be simulated is of a floating-point data type. After being quantized by the compiler, the data to be simulated becomes simulation input data of the same data type as the neural network parameters.
  • Specifically, the data to be simulated is obtained and, according to the quantization information, converted into data with an 8-bit unit length, yielding simulation input data with an 8-bit unit length.
  • The neural network parameters are also 8-bit unit-length data, and include weight parameters and bias parameters.
  • S205: Input the data to be simulated with an 8-bit unit length into the simulated neural network.
  • S206: Read the parameters of each neural network layer and the corresponding data from the corresponding local flow register address for calculation; the layer verification data obtained is int32 data, which is then quantized to obtain layer verification data with an 8-bit unit length.
  • S207: Input the 8-bit unit-length layer verification data corresponding to the previous neural network layer into the current neural network layer, and after the calculation is completed, return the resulting layer verification data to the register address corresponding to the current neural network layer.
  • The previous neural network layer is the layer immediately before the current neural network layer.
  • After the simulation calculation is completed, the verification data corresponding to the data to be simulated is obtained; the verification data is used for comparison with the real hardware's calculation results.
  • The obtained verification data is written back to the corresponding register address, from which the upper-layer software can read it and provide it to the user.
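The int32-to-8-bit step between layers can be sketched as follows. The shift-based rescale is an illustrative assumption; in the described method the actual rescaling factors would come from the compiler's quantization information:

```c
#include <stdint.h>
#include <stddef.h>

/* Requantize one int32 accumulator to 8 bits: rescale (here a simple
   arithmetic right shift as a stand-in for dividing by the scale),
   add the zero point, and clamp to the 8-bit range. */
static uint8_t requant_one(int32_t acc, int shift, int32_t zero_point) {
    int32_t q = (acc >> shift) + zero_point;
    if (q < 0)   q = 0;
    if (q > 255) q = 255;
    return (uint8_t)q;
}

/* Quantize a whole layer's int32 layer verification data down to 8-bit
   unit length before it is fed to the next neural network layer. */
static void requantize_layer(const int32_t *acc, uint8_t *out, size_t n,
                             int shift, int32_t zero_point) {
    for (size_t i = 0; i < n; i++)
        out[i] = requant_one(acc[i], shift, zero_point);
}
```

Keeping every layer boundary at 8 bits is what makes the simulated calculation match the hardware's integer calculation mode, as the summary below notes.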
  • In this embodiment as well, the customized data flow hardware parameters and the data to be simulated are acquired; the customized data flow hardware parameters include register configuration parameters, a neural network structure diagram, and neural network parameters, and the neural network structure diagram includes the serial relationships between different neural network layers.
  • The corresponding register addresses are configured in the C language environment according to the register configuration parameters, and the data to be simulated, the neural network structure diagram, and the neural network parameters are registered to the corresponding register addresses; the corresponding simulated neural network is then simulated and constructed according to the neural network structure diagram in the corresponding register address.
  • The simulated neural network includes the data flow relationships between different neural network layers, obtained from the serial relationships; the data to be simulated and the neural network parameters in the corresponding register address are input into the simulated neural network for simulation calculation to obtain verification data, which is returned to the corresponding register address.
  • Since the workflow of the customized data flow hardware is simulated in the C language environment, the hardware part of system-level simulation verification is realized through the C language environment, and the software part and the hardware part can perform the data flow calculation in the same environment. This facilitates coordinated verification of the software and hardware parts during development and improves development efficiency.
  • Because the calculation uses the hardware data type, the simulation calculation is closer to the result of the hardware calculation, and the amount of computation for the hardware data type is less than that for floating point, which also improves the calculation speed of the neural network simulation.
  • The entire calculation process is closer to the hardware's calculation mode, reducing content irrelevant to the hardware in floating-point calculations and facilitating use of the output for hardware verification.
  • Since the calculation mode and operation mode are consistent with the hardware, the final calculation result of the hardware can be directly simulated.
  • The computer program can be stored in a computer-readable storage medium, and when executed, may include the procedures of the above method embodiments.
  • the aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.
  • This is a schematic structural diagram of a customized data flow hardware simulation device provided by this embodiment; the device 400 includes:
  • the acquisition module 401, used to acquire customized data flow hardware parameters and data to be simulated;
  • the customized data flow hardware parameters include register configuration parameters, a neural network structure diagram, and neural network parameters, and the neural network structure diagram includes the serial relationships between different neural network layers;
  • the configuration module 402 is configured to configure a corresponding register address in the C language environment according to the registered configuration parameters, and register the data to be simulated, the neural network structure diagram, and the neural network parameters to the corresponding registered address;
  • the construction module 403, configured to simulate and construct a corresponding simulated neural network according to the neural network structure diagram in the corresponding register address;
  • the simulated neural network includes the data flow relationships between different neural network layers, and the data flow relationships are obtained from the serial relationships;
  • the calculation module 404, configured to input the data to be simulated and the neural network parameters in the corresponding register address into the simulated neural network for simulation calculation, obtain verification data, and return it to the corresponding register address.
  • the register configuration parameters include global flow configuration parameters and local flow configuration parameters, and the neural network parameters include different neural network layer parameters;
  • the configuration module 402 includes:
  • the first configuration unit 4021, configured to respectively configure the global flow register addresses corresponding to the data to be simulated and the neural network structure diagram;
  • the second configuration unit 4022, configured to configure the local flow register addresses corresponding to the different neural network layer parameters.
  • the construction module 403 includes:
  • the first reading unit 4031, configured to read the neural network structure diagram from the corresponding global flow register address;
  • the construction unit 4032, configured to simulate and construct a corresponding simulated neural network according to the neural network structure diagram.
  • the calculation module 404 includes:
  • the second reading unit 4041, configured to read the data to be simulated from the corresponding global flow register address;
  • the input unit 4042, configured to input the data to be simulated into the simulated neural network;
  • the calculation unit 4043, configured to read the parameters of each neural network layer and the corresponding data to be simulated from the corresponding local flow register address for calculation, obtain the layer verification data corresponding to each neural network layer, and return the layer verification data to the corresponding local flow register address;
  • the returning unit 4044, configured to obtain the verification data of the simulated neural network after all neural network layers have been calculated, and return it to the corresponding global flow register address.
  • the device further includes:
  • the quantization module 405 is configured to quantize the data to be simulated to obtain data to be simulated with an 8-bit unit length
  • the calculation module 404 is further configured to input the 8-bit-unit-length data to be simulated into the simulated neural network.
  • the calculation unit 4043 further includes:
  • the reading subunit 40431 is configured to read the parameters of each neural network layer from the corresponding local-stream register address together with the corresponding data to be simulated for calculation;
  • the quantization subunit 40432 is configured to quantize the layer verification data of the previous neural network layer, after that layer has computed it, to obtain layer verification data with an 8-bit unit length;
  • the calculation subunit 40433 is configured to input the previous layer's 8-bit-unit-length layer verification data into the current neural network layer and, after the calculation completes, return the resulting layer verification data to the register address corresponding to the current layer.
  • a customized data stream hardware simulation simulation device provided by an embodiment of the present application can implement various implementation manners in the method embodiments of FIGS. 2 to 3 and corresponding beneficial effects. To avoid repetition, details are not described herein again.
  • FIG. 10 is a block diagram of the basic structure of the computer device in this embodiment.
  • the computer device 10 includes a memory 1001, a processor 1002, and a network interface 1003 that communicate with each other through a system bus. It should be pointed out that the figure only shows the computer device 10 with the components 1001-1003, but it should be understood that implementing all the shown components is not required; more or fewer components may be implemented instead. Those skilled in the art can understand that the computer device here is a device that can automatically perform numerical calculation and/or information processing in accordance with preset or stored instructions.
  • its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), embedded devices, etc.
  • the computer device can be a computing device such as a desktop computer, notebook, palmtop computer, or cloud server, and can interact with the user through a keyboard, mouse, remote control, touchpad, or voice-control device.
  • the memory 1001 includes at least one type of readable storage medium.
  • the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory (RAM), static random access memory ( SRAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), programmable read only memory (PROM), magnetic memory, magnetic disks, optical disks, etc.
  • the memory 1001 may be an internal storage unit of the computer device 10, such as a hard disk or a memory of the computer device 10.
  • the memory 1001 may also be an external storage device of the computer device 10, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card, etc.
  • the memory 1001 may also include both an internal storage unit of the computer device 10 and an external storage device thereof.
  • the memory 1001 is generally used to store an operating system and various application software installed in the computer device 10, such as a program code for a customized data stream hardware simulation simulation method.
  • the memory 1001 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 1002 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips in some embodiments.
  • the processor 1002 is generally used to control the overall operation of the computer device 10.
  • the processor 1002 is used to run the program code stored in the memory 1001 or process data, for example, run the program code of a customized data stream hardware simulation simulation method.
  • the network interface 1003 may include a wireless network interface or a wired network interface, and the network interface 1003 is generally used to establish a communication connection between the computer device 10 and other electronic devices.
This application also provides another implementation, namely a computer-readable storage medium storing a customized dataflow hardware simulation program, where the program can be executed by at least one processor so that the at least one processor executes the steps of the customized dataflow hardware simulation method described above.
  • the technical solution of this application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product; the computer software product is stored in a storage medium (such as ROM/RAM, magnetic disk, or optical disc) and includes a number of instructions to make a terminal device (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) execute the customized dataflow hardware simulation method of each embodiment of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A customized dataflow hardware simulation method, apparatus, computer device, and storage medium in the field of artificial intelligence. The method includes: obtaining customized dataflow hardware parameters and data to be simulated (S101); configuring corresponding register addresses in a C language environment according to register configuration parameters, and storing the data to be simulated, a neural network structure graph, and neural network parameters at the corresponding register addresses (S102); constructing a corresponding simulated neural network according to the neural network structure graph in the corresponding register address (S103); and inputting the data to be simulated and the neural network parameters from the corresponding register addresses into the simulated neural network for simulation calculation to obtain verification data, which is returned to the corresponding register address (S104). Because the workflow of the customized dataflow hardware is simulated in a C language environment, coordinated verification of the software and hardware parts during development is facilitated, improving development efficiency.

Description

Customized dataflow hardware simulation method, apparatus, device, and storage medium
Technical Field
The present invention belongs to the field of artificial intelligence technology, and in particular relates to a customized dataflow hardware simulation method, apparatus, device, and storage medium.
Background Art
In integrated circuit design for artificial intelligence, verification can take even longer than design itself, accounting for more than 50% of the schedule. For products involving hardware-software cooperation, verification becomes especially important and complex. The design of a customized dataflow artificial intelligence system requires close collaboration between hardware and software. Since the hardware and software designs are separate, waiting until both sides are fully developed before merging them for verification makes the whole development process long and cumbersome: in the meantime, the upper-layer software does not know how to control the hardware, and the lower-layer hardware cannot obtain accurate data for test verification.
Verification methods widely used in industry today include simulation-based verification (on a verification platform or in software), formal verification, and hardware-software co-verification. Simulation-based verification is the most widespread and is an indispensable part of integrated circuit design: test cases are built to check whether the RTL hardware design produces the expected response under specific stimuli.
As hardware designs grow larger, the cost of building a system-level simulation environment keeps increasing. Moreover, the very nature of simulation means that once a hardware design reaches a certain scale, simulation time becomes the bottleneck of design verification; at such scales, hardware simulation takes extremely long.
Complete system-level hardware verification cannot do without software support. In the workflow of a dataflow artificial intelligence accelerator chip, the hardware system needs the input data of every neural network layer in order to compute. If these data cannot be generated promptly and correctly and delivered to the hardware during system simulation, hardware development is greatly hindered and development efficiency drops.
Therefore, existing system-level hardware simulation methods suffer from low simulation speed, resulting in low development efficiency for customized dataflow products.
Summary of the Invention
Embodiments of the present invention provide a customized dataflow hardware simulation method, aiming to solve the problem that existing system-level hardware simulation methods are slow, resulting in low development efficiency for customized dataflow products.
An embodiment of the present invention provides a customized dataflow hardware simulation method, including the steps of:
obtaining customized dataflow hardware parameters and data to be simulated, where the customized dataflow hardware parameters include register configuration parameters, a neural network structure graph, and neural network parameters, and the neural network structure graph includes serial relations between different neural network layers;
configuring corresponding register addresses in a C language environment according to the register configuration parameters, and storing the data to be simulated, the neural network structure graph, and the neural network parameters at the corresponding register addresses;
constructing a corresponding simulated neural network according to the neural network structure graph in the corresponding register address, where the simulated neural network includes dataflow relations between different neural network layers, the dataflow relations being derived from the serial relations;
inputting the data to be simulated and the neural network parameters from the corresponding register addresses into the simulated neural network for simulation calculation, obtaining verification data, and returning it to the corresponding register address.
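The four steps above can be sketched end to end as follows. This is an illustrative sketch only: the patent describes a C language environment, while the sketch uses Python, and every name here (`simulate`, `layer_compute`, the dictionary-of-buffers register model, the affine stand-in layer) is a hypothetical illustration, not the patent's implementation.

```python
# Illustrative sketch of the four-step flow (S101-S104): configure
# register addresses, store inputs/graph/parameters, build the network
# from the structure graph, stream the data through it, write back.

def layer_compute(params, data):
    # Stand-in for a real layer (convolution/pooling/...): affine op.
    return [params["w"] * x + params["b"] for x in data]

def simulate(hardware_params, input_data):
    graph = hardware_params["structure_graph"]      # serial layer order
    layer_params = hardware_params["layer_params"]

    # S102: one "register address" (buffer) per item of global/local data.
    registers = {"input": list(input_data), "graph": graph}
    registers.update(layer_params)

    # S103: the simulated network is the layers in graph (serial) order.
    network = registers["graph"]

    # S104: stream the data layer by layer; write verification data back.
    data = registers["input"]
    for name in network:
        data = layer_compute(registers[name], data)
        registers[name + "_out"] = data             # layer verification data
    registers["output"] = data                      # network verification data
    return registers

params = {"structure_graph": ["conv", "bias"],
          "layer_params": {"conv": {"w": 2, "b": 0}, "bias": {"w": 1, "b": 1}}}
regs = simulate(params, [1, 2, 3])
print(regs["output"])   # [3, 5, 7]
```

Upper-layer software then reads the verification data back out of `regs["output"]`, mirroring how the patent's software reads the global-stream register address.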
Further, the register configuration parameters include global-stream configuration parameters and local-stream configuration parameters, the neural network parameters include parameters of different neural network layers, and the step of configuring corresponding register addresses in the C language environment according to the register configuration parameters and storing the data to be simulated, the neural network structure graph, and the neural network parameters at the corresponding register addresses specifically includes:
configuring global-stream register addresses corresponding to the data to be simulated and to the neural network structure graph, respectively;
configuring local-stream register addresses corresponding to the parameters of the different neural network layers.
Further, the step of constructing a corresponding simulated neural network according to the neural network structure graph in the corresponding register address specifically includes:
reading the neural network structure graph from the corresponding global-stream register address;
constructing the corresponding simulated neural network according to the neural network structure graph.
Further, the step of inputting the data to be simulated and the neural network parameters from the corresponding register addresses into the simulated neural network for simulation calculation, obtaining verification data, and returning it to the corresponding register address specifically includes:
reading the data to be simulated from the corresponding global-stream register address;
inputting the data to be simulated into the simulated neural network;
reading the parameters of each neural network layer from the corresponding local-stream register address together with the corresponding data to be simulated for calculation, obtaining layer verification data for each layer, and returning each layer's layer verification data to the corresponding local-stream register address;
after all neural network layers have finished computing, obtaining the verification data of the simulated neural network and returning it to the global-stream register address corresponding to the data to be simulated.
Further, before inputting the data to be simulated into the simulated neural network, the method also includes:
quantizing the data to be simulated to obtain data to be simulated with an 8-bit unit length;
the step of inputting the data to be simulated into the neural network specifically includes:
inputting the 8-bit-unit-length data to be simulated into the simulated neural network.
Further, the step of reading the parameters of each neural network layer from the corresponding local-stream register address together with the corresponding data to be simulated for calculation, obtaining each layer's layer verification data, and returning it to the corresponding local-stream register address specifically further includes:
reading the parameters of each neural network layer from the corresponding local-stream register address together with the corresponding data to be simulated for calculation;
after the previous neural network layer computes its layer verification data, quantizing the previous layer's layer verification data to obtain layer verification data with an 8-bit unit length;
inputting the previous layer's 8-bit-unit-length layer verification data into the current neural network layer and, after the calculation completes, returning the resulting layer verification data to the register address corresponding to the current layer.
The present invention also provides a customized dataflow hardware simulation apparatus, the apparatus comprising:
an obtaining module, configured to obtain customized dataflow hardware parameters and data to be simulated, where the customized dataflow hardware parameters include register configuration parameters, a neural network structure graph, and neural network parameters, and the neural network structure graph includes serial relations between different neural network layers;
a configuration module, configured to configure corresponding register addresses in the C language environment according to the register configuration parameters, and to store the data to be simulated, the neural network structure graph, and the neural network parameters at the corresponding register addresses;
a construction module, configured to construct a corresponding simulated neural network according to the neural network structure graph in the corresponding register address, where the simulated neural network includes dataflow relations between different neural network layers, the dataflow relations being derived from the serial relations;
a calculation module, configured to input the data to be simulated and the neural network parameters from the corresponding register addresses into the simulated neural network for simulation calculation, obtain verification data, and return it to the corresponding register address.
Further, the register configuration parameters include global-stream configuration parameters and local-stream configuration parameters, the neural network parameters include parameters of different neural network layers, and the configuration module includes:
a first configuration unit, configured to configure global-stream register addresses corresponding to the data to be simulated and to the neural network structure, respectively;
a second configuration unit, configured to configure local-stream register addresses corresponding to the parameters of the different neural network layers.
The present invention also provides a computer device, including a memory and a processor, where a computer program is stored in the memory, and when the processor executes the computer program, the steps of the customized dataflow hardware simulation method described in any embodiment of the present invention are implemented.
The present invention also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the customized dataflow hardware simulation method described in any embodiment of the present invention are implemented.
Beneficial effects achieved by the present invention: because the workflow of the customized dataflow hardware is simulated in a C language environment, the hardware part of system-level simulation verification is implemented in the C language environment, and the software and hardware parts can form dataflow computation in the same environment, which facilitates coordinated verification of the software and hardware parts during development and improves development efficiency.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a customized dataflow hardware simulation method provided by an embodiment of the present invention;
FIG. 2 is a schematic flowchart of another customized dataflow hardware simulation method provided by an embodiment of the present invention;
FIG. 3 is a schematic flowchart of another customized dataflow hardware simulation method provided by an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a customized dataflow hardware simulation apparatus provided by an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a configuration module 402 provided by an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a construction module 403 provided by an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a calculation module 404 provided by an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a customized dataflow hardware simulation apparatus provided by an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a calculation unit 4043 provided by an embodiment of the present invention;
FIG. 10 is a schematic structural diagram of an embodiment of the computer device of an embodiment of the present invention.
Detailed Description of the Embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present invention, not to limit it.
Existing system-level hardware simulation methods suffer from low simulation speed, resulting in low development efficiency for customized dataflow products. Because the present invention simulates the workflow of the customized dataflow hardware in a C language environment, the hardware part of system-level simulation verification is implemented in the C language environment; the software and hardware parts can form dataflow computation in the same environment, which facilitates coordination between the software part and the hardware part during development and improves development efficiency.
As shown in FIG. 1, FIG. 1 is an architecture diagram of customized dataflow hardware provided by an embodiment of the present invention. The architecture 103 is connected through interconnects to an off-chip storage module (DDR) 101 and a CPU 102. The architecture 103 includes: a first storage module 104, a global dataflow network 105, and dataflow engines 106. The first storage module 104 is connected through interconnects both to the off-chip storage module 101 and to the global dataflow network 105; the dataflow engines 106 are connected through interconnects to the global dataflow network 105 so that the dataflow engines 106 can operate in parallel or in series. A dataflow engine 106 may include: computation kernels (also called computation modules), a second storage module 108, and a local dataflow network 107. The computation kernels may include kernels used for computation, such as a convolution kernel 109, a pooling kernel 110, and an activation-function kernel 111; of course, other computation kernels beyond these examples may also be included without limitation here, including all kernels used for computation in a neural network. The first storage module 104 and the second storage module 108 may be on-chip cache modules, or DDR or high-speed DDR storage modules, etc. The dataflow engine 106 can be understood as a computation engine that supports dataflow processing, or as a computation engine dedicated to dataflow processing.
The above dataflow architecture may be customized on an FPGA.
As shown in FIG. 2, which is a flowchart of an embodiment of a customized dataflow hardware simulation method according to the present application, the method includes the steps:
S101: obtaining customized dataflow hardware parameters and data to be simulated.
The customized dataflow hardware parameters include register configuration parameters, a neural network structure graph, and neural network parameters.
The register configuration parameters are used to allocate corresponding storage regions in the memory of the C language environment, forming the corresponding register addresses.
The neural network structure graph may be that of a recognition network, such as face recognition or vehicle recognition, or that of a detection network, such as object detection or vehicle detection.
The neural network structure graph can be understood as a neural network structure and, further, as a structure used by various kinds of neural network models. The neural network structure takes layers as its computation units, including but not limited to: convolutional layers, pooling layers, ReLU, fully connected layers, etc.
The neural network structure graph includes the serial relations between different neural network layers, for example the serial relations between a convolutional layer, a bias layer, a pooling layer, and other neural network layers.
The neural network parameters refer to the parameters corresponding to each layer in the neural network structure, such as weight parameters and bias parameters. The various neural network models may be corresponding pre-trained models; since a neural network model is pre-trained, the attributes of its parameters are also trained. Therefore, the neural network configured in the simulation software can be used directly with the configured neural network parameters, without training the network again, and the pre-trained neural network model can be described uniformly by the neural network structure graph and its parameters.
The neural network structure graph and neural network parameters may be obtained locally or from a cloud server. For example, the structure graph and parameters may be stored together locally and selected automatically or by the user at use time, or they may be uploaded to a cloud server and downloaded over the network when needed.
S102: configuring corresponding register addresses in the C language environment according to the register configuration parameters, and storing the data to be simulated, the neural network structure graph, and the neural network parameters at the corresponding register addresses.
The register configuration parameters are those obtained in step S101 and include how many register addresses to configure and the size of each register address. Each register address can store corresponding data; for example, one register address is configured to store the simulation data, another register address is configured to store the parameters of one neural network layer, and so on. When needed, the corresponding data is read directly from the register address.
In addition, register addresses are also used to store corresponding computation results. For example, besides storing the corresponding weight parameters, the register address corresponding to a convolutional layer is also used to store the corresponding convolution result after the convolution computation completes.
In one embodiment, the register configuration parameters include global-stream configuration parameters and local-stream configuration parameters, and the neural network parameters include parameters of different neural network layers.
Global-stream register addresses corresponding to the data to be simulated and to the neural network structure graph are configured, respectively.
Since the data to be simulated and the neural network structure graph are global data, different global-stream register addresses can be configured for the data to be simulated and for the neural network structure graph, respectively.
Local-stream register addresses corresponding to the parameters of the different neural network layers are configured.
Since a neural network contains multiple neural network layers, each layer needs different configuration parameters for its computation; for example, a convolutional layer needs weight parameters to compute with the input data, and a bias layer needs bias parameters to compute with the input data.
Therefore, different local-stream register addresses can be configured according to the parameters of the different layers; when computing a given layer, the parameters are read from the corresponding register address and computed with the input data. For example, during a convolution computation, the weight parameters are read from the register address holding the convolutional layer's weight parameters and computed with the input data.
The register addresses configured in the C language environment may be storage regions allocated in the C language environment, for example on the computer's hard disk, or storage regions allocated within a virtual storage space configured in the C language environment.
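The register-address configuration described above can be sketched as follows. The names and sizes are illustrative assumptions (the patent does not specify a configuration format); the sketch only shows the idea that each configured "register address" is a buffer of a given size, split into global-stream and local-stream entries.

```python
# Minimal sketch of register-address configuration: each entry of the
# register configuration gives an address name and a size, and a buffer
# of that size is allocated to serve as the "register address".

def configure_registers(reg_config):
    # reg_config: {address_name: size_in_units}
    return {name: [0] * size for name, size in reg_config.items()}

config = {
    "global_input": 8,    # global stream: data to be simulated
    "global_graph": 4,    # global stream: network structure graph
    "local_conv_w": 16,   # local stream: convolution layer weights
    "local_bias_b": 4,    # local stream: bias layer parameters
}
registers = configure_registers(config)
print(len(registers["local_conv_w"]))   # 16
```

After configuration, reading a layer's parameters is just a lookup of the corresponding local-stream entry, and computation results can be written back into the same buffers for reuse.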
S103: constructing a corresponding simulated neural network according to the neural network structure graph in the corresponding register address.
The neural network structure graph includes serial relations between different neural network layers; according to these serial relations, the dataflow relations of data between the different layers are determined, and the corresponding simulated neural network is thereby constructed. It should be noted that the simulated neural network is a customized-dataflow simulated neural network.
The dataflow relations between the different neural network layers describe the direction of data flow.
In one embodiment, the corresponding neural network structure graph is read from the global-stream register address where it is stored, and the corresponding simulated neural network is constructed according to the structure graph that was read. In this way, the neural network structure graph in the global-stream register address can be reused.
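Deriving the execution order of the simulated network from the serial relations can be sketched as follows. The graph format (a list of producer-to-consumer pairs forming one chain) is an assumption for illustration; the patent does not specify how the structure graph is encoded.

```python
# Sketch: turn the serial relations of a structure graph into the
# dataflow (execution) order of the simulated network.

def build_pipeline(serial_relations):
    # serial_relations: [(layer, next_layer), ...] forming one chain.
    successors = dict(serial_relations)
    sources = set(successors) - set(successors.values())
    assert len(sources) == 1, "expected a single serial chain"
    order, layer = [], sources.pop()
    while layer is not None:
        order.append(layer)
        layer = successors.get(layer)
    return order

relations = [("conv1", "pool1"), ("pool1", "relu1"), ("relu1", "fc1")]
print(build_pipeline(relations))   # ['conv1', 'pool1', 'relu1', 'fc1']
```

The resulting order is exactly the dataflow relation: each layer's output is streamed to its successor.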
S104: inputting the data to be simulated and the neural network parameters from the corresponding register addresses into the simulated neural network for simulation calculation, obtaining verification data, and returning it to the corresponding register address.
The corresponding data to be simulated is read from the register address where it is stored; after the computation through the simulated neural network completes, verification data is obtained for comparison with the real hardware's computation results. After the computation completes, the resulting verification data is written back to the corresponding register address, from which the upper-layer software can read it and provide it to the user.
In one embodiment, the data to be simulated is read from the corresponding global-stream register address.
The data to be simulated is input into the simulated neural network described above.
The parameters of each neural network layer are read from the corresponding local-stream register address and computed with the corresponding data to be simulated, obtaining the layer verification data corresponding to each layer, and each layer's layer verification data is returned to the corresponding local-stream register address. In this way, each layer's layer verification data can be reused.
After all neural network layers have finished computing, the verification data of the simulated neural network is obtained and returned to the corresponding global-stream register address.
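The per-layer loop just described, including the write-back of each layer's verification data for reuse, can be sketched as follows. The affine layer math is a stand-in and all names are illustrative; only the read-compute-write-back pattern reflects the text.

```python
# Sketch of S104's inner loop: for each layer, read its parameters from
# the local-stream register address, compute with the incoming data, and
# write the layer verification data back so it can be reused later.

def run_layers(local_regs, layer_order, data):
    for name in layer_order:
        w, b = local_regs[name]["params"]
        data = [w * x + b for x in data]           # stand-in layer compute
        local_regs[name]["layer_verif"] = data     # write back for reuse
    return data                                    # network verification data

regs = {"conv": {"params": (3, 0)}, "bias": {"params": (1, 2)}}
out = run_layers(regs, ["conv", "bias"], [1, 2])
print(out)                           # [5, 8]
print(regs["conv"]["layer_verif"])   # [3, 6]
```

The final return value plays the role of the network-level verification data that is written back to the global-stream register address.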
In the embodiment of the present invention: customized dataflow hardware parameters and data to be simulated are obtained, where the customized dataflow hardware parameters include register configuration parameters, a neural network structure graph, and neural network parameters, and the structure graph includes serial relations between different neural network layers; corresponding register addresses are configured in a C language environment according to the register configuration parameters, and the data to be simulated, the structure graph, and the parameters are stored at the corresponding register addresses; a corresponding simulated neural network is constructed according to the structure graph in the corresponding register address, where the simulated network includes dataflow relations between layers derived from the serial relations; and the data to be simulated and the parameters are input from the corresponding register addresses into the simulated network for simulation calculation, yielding verification data that is returned to the corresponding register address. Because the workflow of the customized dataflow hardware is simulated in a C language environment, the hardware part of system-level simulation verification is implemented in the C language environment; the software and hardware parts can form dataflow computation in the same environment, which facilitates coordinated verification of the software and hardware parts during development and improves development efficiency.
As shown in FIG. 3, which is a flowchart of an embodiment of another customized dataflow hardware simulation method according to the present application, the method includes the steps:
S201: obtaining customized dataflow hardware parameters and data to be simulated.
S202: configuring corresponding register addresses in the C language environment according to the register configuration parameters, and storing the data to be simulated, the neural network structure graph, and the neural network parameters at the corresponding register addresses.
S203: constructing a corresponding simulated neural network according to the neural network structure graph in the corresponding register address.
S204: quantizing the data to be simulated to obtain data to be simulated with an 8-bit unit length.
The quantization is performed according to quantization information, which is included in the neural network structure graph; the quantization information includes the information for quantizing data to an 8-bit unit length.
The data to be simulated can be quantized according to the quantization information, e.g., quantized to 8-bit data.
The quantization may be performed by a compiler.
Specifically, the calculation can use the formula r = s × (q − z), where r is the floating-point value, i.e., the data input by the user; q is the quantized data; z is the offset (zero point); and s is the scale. Both s and z are produced by the compiler.
From the formula r = s × (q − z), the quantized data is q = r / s + z.
Since s and z are produced by the compiler and r is the user-input data to be simulated, the compiler quantizes the input data to be simulated, obtaining the quantized simulation input data.
After quantization, the resulting simulation input data has the same data type as the neural network parameters.
It should be noted that the neural network parameters differ from the data to be simulated: the neural network parameters are of the hardware data type, i.e., an integer type, while the data to be simulated is a floating-point type. After compiler quantization, the data to be simulated becomes simulation input data of the same data type as the neural network parameters.
Further, the data to be simulated is obtained and, according to the quantization information, converted to data with an 8-bit unit length, yielding 8-bit-unit-length simulation input data.
The neural network parameters are also 8-bit-unit-length data; they include weight parameters and bias parameters.
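The quantization formula above, r = s × (q − z) and hence q = r / s + z, is sketched below. Rounding and clamping to an 8-bit range are added here because they are typical practice for 8-bit quantization; they are an assumption, not stated in the text (which also does not say whether the 8-bit values are signed or unsigned; unsigned is assumed).

```python
# The patent's quantization: r = s * (q - z), so q = r / s + z.
# Rounding and clamping to [0, 255] are conventional additions.

def quantize(r_values, s, z):
    # s (scale) and z (zero point / offset) come from the compiler.
    return [max(0, min(255, round(r / s + z))) for r in r_values]

def dequantize(q_values, s, z):
    return [s * (q - z) for q in q_values]

s, z = 0.5, 128
q = quantize([-1.0, 0.0, 2.5], s, z)
print(q)                     # [126, 128, 133]
print(dequantize(q, s, z))   # [-1.0, 0.0, 2.5]
```

With these values the round trip is exact; in general, quantization introduces an error of at most half a scale step per value.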
S205: inputting the 8-bit-unit-length data to be simulated into the simulated neural network.
S206: after each neural network layer computes its layer verification data, quantizing the layer verification data of the previous layer to obtain 8-bit-unit-length layer verification data.
After the previous neural network layer's computation, the resulting layer verification data is int32 data; when it needs to be input into the current layer's computation, the int32 data is quantized to obtain layer verification data with an 8-bit unit length.
S207: inputting the previous layer's 8-bit-unit-length layer verification data into the current neural network layer; after the computation completes, returning the corresponding layer verification data to the register address corresponding to the current layer.
The previous neural network layer is the layer immediately preceding the current layer. Each layer's layer verification data is written back to the register address corresponding to that layer; for example, the convolution result computed by the convolutional layer is written to the local-stream register address holding the convolution parameters, enabling fast reuse of the convolution parameters.
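The inter-layer flow of S206-S207, where a layer's int32 accumulator output is requantized to 8-bit unit length before entering the next layer and is also written back to a register address, can be sketched as follows. The per-layer scale and zero point here are illustrative assumptions.

```python
# Sketch of S206-S207: the previous layer accumulates in int32, the
# result is requantized to 8 bits, fed to the current layer, and written
# back to that layer's register address.

def requantize_int32_to_8bit(acc, scale, zero_point):
    # acc: int32 accumulator values from the previous layer.
    return [max(0, min(255, round(a * scale) + zero_point)) for a in acc]

regs = {"layer2": {}}
int32_acc = [300, -40, 1024]                  # previous layer's raw output
q8 = requantize_int32_to_8bit(int32_acc, scale=0.1, zero_point=10)
regs["layer2"]["input"] = q8                  # 8-bit data into current layer
print(q8)   # [40, 6, 112]
```

Keeping every inter-layer tensor at 8-bit unit length is what lets the simulation match the hardware's integer datapath value for value.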
S208: after all neural network layers have finished computing, obtaining the verification data of the simulated neural network and returning it to the global-stream register address corresponding to the data to be simulated.
When the last neural network layer finishes computing, the simulation computation is complete and the verification data corresponding to the data to be simulated is obtained. The verification data is used for comparison with the real hardware's computation results. After the computation completes, the resulting verification data is written back to the corresponding register address, from which the upper-layer software can read it and provide it to the user.
In the embodiment of the present invention: customized dataflow hardware parameters and data to be simulated are obtained; corresponding register addresses are configured in a C language environment according to the register configuration parameters, and the data to be simulated, the neural network structure graph, and the neural network parameters are stored at the corresponding register addresses; a corresponding simulated neural network is constructed according to the structure graph in the corresponding register address, with dataflow relations between layers derived from the serial relations; and the data to be simulated and the parameters are input from the corresponding register addresses into the simulated network for simulation calculation, yielding verification data that is returned to the corresponding register address. Because the workflow of the customized dataflow hardware is simulated in a C language environment, the hardware part of system-level simulation verification is implemented in the C language environment; the software and hardware parts can form dataflow computation in the same environment, which facilitates coordinated verification during development and improves development efficiency. In addition, because the data to be simulated is quantized to the same hardware data type as the neural network parameters, software simulation results are closer to the hardware's computation results, and since integer computation is cheaper than floating-point computation, the computation speed of the neural network simulation is also improved. The entire computation flow is closer to the hardware's computation mode, removes irrelevant aspects of floating-point computation, and is convenient for checking the hardware's output. Moreover, since the computation mode and operation mode are consistent with the hardware, the hardware's final computation result can be simulated directly.
Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by instructing the relevant hardware through a computer program; the computer program can be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc, or a read-only memory (ROM), or a random access memory (RAM), etc.
It should be understood that although the steps in the flowcharts of the drawings are displayed in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated. Unless explicitly stated herein, there is no strict order restriction on their execution, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; their execution order is not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
As shown in FIG. 4, which is a schematic structural diagram of a customized dataflow hardware simulation apparatus provided by this embodiment, the apparatus 400 includes:
an obtaining module 401, configured to obtain customized dataflow hardware parameters and data to be simulated, where the customized dataflow hardware parameters include register configuration parameters, a neural network structure graph, and neural network parameters, and the neural network structure graph includes serial relations between different neural network layers;
a configuration module 402, configured to configure corresponding register addresses in the C language environment according to the register configuration parameters, and to store the data to be simulated, the neural network structure graph, and the neural network parameters at the corresponding register addresses;
a construction module 403, configured to construct a corresponding simulated neural network according to the neural network structure graph in the corresponding register address, where the simulated neural network includes dataflow relations between different neural network layers, the dataflow relations being derived from the serial relations;
a calculation module 404, configured to input the data to be simulated and the neural network parameters from the corresponding register addresses into the simulated neural network for simulation calculation, obtain verification data, and return it to the corresponding register address.
Further, as shown in FIG. 5, the register configuration parameters include global-stream configuration parameters and local-stream configuration parameters, the neural network parameters include parameters of different neural network layers, and the configuration module 402 includes:
a first configuration unit 4021, configured to configure global-stream register addresses corresponding to the data to be simulated and to the neural network structure, respectively;
a second configuration unit 4022, configured to configure local-stream register addresses corresponding to the parameters of the different neural network layers.
Further, as shown in FIG. 6, the construction module 403 includes:
a first reading unit 4031, configured to read the neural network structure graph from the corresponding global-stream register address;
a construction unit 4032, configured to construct a corresponding simulated neural network according to the neural network structure graph.
Further, as shown in FIG. 7, the calculation module 404 includes:
a second reading unit 4041, configured to read the data to be simulated from the corresponding global-stream register address;
an input unit 4042, configured to input the data to be simulated into the simulated neural network;
a calculation unit 4043, configured to read the parameters of each neural network layer from the corresponding local-stream register address and compute them with the corresponding data to be simulated, obtain the layer verification data corresponding to each layer, and return each layer's layer verification data to the corresponding local-stream register address;
a returning unit 4044, configured to obtain the verification data of the simulated neural network after all neural network layers have finished computing, and return the verification data of the simulated neural network to the corresponding global-stream register address.
Further, as shown in FIG. 8, the apparatus also includes:
a quantization module 405, configured to quantize the data to be simulated to obtain data to be simulated with an 8-bit unit length;
the calculation module 404 is further configured to input the 8-bit-unit-length data to be simulated into the simulated neural network.
As shown in FIG. 9, the calculation unit 4043 further includes:
a reading subunit 40431, configured to read the parameters of each neural network layer from the corresponding local-stream register address and compute them with the corresponding data to be simulated;
a quantization subunit 40432, configured to quantize the previous neural network layer's layer verification data, after that layer computes it, to obtain layer verification data with an 8-bit unit length;
a calculation subunit 40433, configured to input the previous layer's 8-bit-unit-length layer verification data into the current neural network layer and, after the computation completes, return the resulting layer verification data to the register address corresponding to the current layer.
The customized dataflow hardware simulation apparatus provided by the embodiments of the present application can implement the various implementations of the method embodiments of FIG. 2 to FIG. 3 and their corresponding beneficial effects; to avoid repetition, details are not repeated here.
To solve the above technical problem, an embodiment of the present application also provides a computer device. Refer specifically to FIG. 10, a block diagram of the basic structure of the computer device of this embodiment.
The computer device 10 includes a memory 1001, a processor 1002, and a network interface 1003 that are communicatively connected to each other through a system bus. It should be pointed out that the figure only shows the computer device 10 with the components 1001-1003, but it should be understood that implementing all the shown components is not required; more or fewer components may be implemented instead. Those skilled in the art can understand that the computer device here is a device capable of automatically performing numerical calculation and/or information processing in accordance with preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), embedded devices, etc.
The computer device may be a desktop computer, a notebook, a palmtop computer, a cloud server, or another computing device. The computer device can interact with the user through a keyboard, mouse, remote control, touchpad, or voice-control device.
The memory 1001 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disc, etc. In some embodiments, the memory 1001 may be an internal storage unit of the computer device 10, such as the hard disk or internal memory of the computer device 10. In other embodiments, the memory 1001 may also be an external storage device of the computer device 10, such as a plug-in hard disk, smart media card (SMC), secure digital (SD) card, or flash card equipped on the computer device 10. Of course, the memory 1001 may also include both the internal storage unit of the computer device 10 and its external storage device. In this embodiment, the memory 1001 is generally used to store the operating system and the various application software installed on the computer device 10, such as the program code of the customized dataflow hardware simulation method. In addition, the memory 1001 can also be used to temporarily store various data that has been output or will be output.
The processor 1002 may, in some embodiments, be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 1002 is generally used to control the overall operation of the computer device 10. In this embodiment, the processor 1002 is used to run the program code stored in the memory 1001 or to process data, for example to run the program code of the customized dataflow hardware simulation method.
The network interface 1003 may include a wireless network interface or a wired network interface and is generally used to establish communication connections between the computer device 10 and other electronic devices.
The present application also provides another implementation, namely a computer-readable storage medium storing a customized dataflow hardware simulation program that can be executed by at least one processor, so that the at least one processor executes the steps of the customized dataflow hardware simulation method described above.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus the necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product; the computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions to make a terminal device (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) execute the customized dataflow hardware simulation method of each embodiment of the present application.
The terms "comprising" and "having" in the specification and claims of the present application and in the above description of the drawings, and any variations thereof, are intended to cover non-exclusive inclusion. The terms "first", "second", etc. in the specification and claims of the present application or in the above drawings are used to distinguish different objects, not to describe a specific order. Reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. The appearance of this phrase in various places in the specification does not necessarily all refer to the same embodiment, nor to independent or alternative embodiments mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.

Claims (10)

  1. A customized dataflow hardware simulation method, characterized by comprising the steps of:
    obtaining customized dataflow hardware parameters and data to be simulated, wherein the customized dataflow hardware parameters include register configuration parameters, a neural network structure graph, and neural network parameters, and the neural network structure graph includes serial relations between different neural network layers;
    configuring corresponding register addresses in a C language environment according to the register configuration parameters, and storing the data to be simulated, the neural network structure graph, and the neural network parameters at the corresponding register addresses;
    constructing a corresponding simulated neural network according to the neural network structure graph in the corresponding register address, wherein the simulated neural network includes dataflow relations between different neural network layers, the dataflow relations being derived from the serial relations;
    inputting the data to be simulated and the neural network parameters from the corresponding register addresses into the simulated neural network for simulation calculation, obtaining verification data, and returning it to the corresponding register address.
  2. The customized dataflow hardware simulation method according to claim 1, characterized in that the register configuration parameters include global-stream configuration parameters and local-stream configuration parameters, the neural network parameters include parameters of different neural network layers, and the step of configuring corresponding register addresses in the C language environment according to the register configuration parameters and storing the data to be simulated, the neural network structure graph, and the neural network parameters at the corresponding register addresses specifically includes:
    configuring global-stream register addresses corresponding to the data to be simulated and to the neural network structure graph, respectively;
    configuring local-stream register addresses corresponding to the parameters of the different neural network layers.
  3. The customized dataflow hardware simulation method according to claim 2, characterized in that the step of constructing a corresponding simulated neural network according to the neural network structure graph in the corresponding register address specifically includes:
    reading the neural network structure graph from the corresponding global-stream register address;
    constructing the corresponding simulated neural network according to the neural network structure graph.
  4. The customized dataflow hardware simulation method according to claim 2, characterized in that the step of inputting the data to be simulated and the neural network parameters from the corresponding register addresses into the simulated neural network for simulation calculation, obtaining verification data, and returning it to the corresponding register address specifically includes:
    reading the data to be simulated from the corresponding global-stream register address;
    inputting the data to be simulated into the simulated neural network;
    reading the parameters of each neural network layer from the corresponding local-stream register address together with the corresponding data to be simulated for calculation, obtaining the layer verification data corresponding to each layer, and returning each layer's layer verification data to the corresponding local-stream register address;
    after all neural network layers have finished computing, obtaining the verification data of the simulated neural network and returning it to the global-stream register address corresponding to the data to be simulated.
  5. The customized dataflow hardware simulation method according to claim 4, characterized in that before inputting the data to be simulated into the simulated neural network, the method further includes:
    quantizing the data to be simulated to obtain data to be simulated with an 8-bit unit length;
    the step of inputting the data to be simulated into the neural network specifically includes:
    inputting the 8-bit-unit-length data to be simulated into the simulated neural network.
  6. The customized dataflow hardware simulation method according to claim 5, characterized in that the step of reading the parameters of each neural network layer from the corresponding local-stream register address together with the corresponding data to be simulated for calculation, obtaining each layer's layer verification data, and returning each layer's layer verification data to the corresponding local-stream register address specifically further includes:
    reading the parameters of each neural network layer from the corresponding local-stream register address together with the corresponding data to be simulated for calculation;
    after the previous neural network layer computes its layer verification data, quantizing the previous layer's layer verification data to obtain layer verification data with an 8-bit unit length;
    inputting the previous layer's 8-bit-unit-length layer verification data into the current neural network layer and, after the calculation completes, returning the resulting layer verification data to the register address corresponding to the current layer.
  7. A customized dataflow hardware simulation apparatus, characterized in that the apparatus comprises:
    an obtaining module, configured to obtain customized dataflow hardware parameters and data to be simulated, wherein the customized dataflow hardware parameters include register configuration parameters, a neural network structure graph, and neural network parameters, and the neural network structure graph includes serial relations between different neural network layers;
    a configuration module, configured to configure corresponding register addresses in a C language environment according to the register configuration parameters, and to store the data to be simulated, the neural network structure graph, and the neural network parameters at the corresponding register addresses;
    a construction module, configured to construct a corresponding simulated neural network according to the neural network structure graph in the corresponding register address, wherein the simulated neural network includes dataflow relations between different neural network layers, the dataflow relations being derived from the serial relations;
    a calculation module, configured to input the data to be simulated and the neural network parameters from the corresponding register addresses into the simulated neural network for simulation calculation, obtain verification data, and return it to the corresponding register address.
  8. The customized dataflow hardware simulation apparatus according to claim 7, characterized in that the register configuration parameters include global-stream configuration parameters and local-stream configuration parameters, the neural network parameters include parameters of different neural network layers, and the configuration module includes:
    a first configuration unit, configured to configure global-stream register addresses corresponding to the data to be simulated and to the neural network structure, respectively;
    a second configuration unit, configured to configure local-stream register addresses corresponding to the parameters of the different neural network layers.
  9. A computer device, comprising a memory and a processor, wherein a computer program is stored in the memory, and when the processor executes the computer program, the steps of the customized dataflow hardware simulation method according to any one of claims 1 to 6 are implemented.
  10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the customized dataflow hardware simulation method according to any one of claims 1 to 6 are implemented.
PCT/CN2019/110858 2019-10-12 2019-10-12 Customized dataflow hardware simulation method, apparatus, device, and storage medium WO2021068253A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201980066982.5A CN113272813B (zh) 2019-10-12 2019-10-12 定制数据流硬件模拟仿真方法、装置、设备及存储介质
PCT/CN2019/110858 WO2021068253A1 (zh) 2019-10-12 2019-10-12 定制数据流硬件模拟仿真方法、装置、设备及存储介质

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/110858 WO2021068253A1 (zh) 2019-10-12 2019-10-12 定制数据流硬件模拟仿真方法、装置、设备及存储介质

Publications (1)

Publication Number Publication Date
WO2021068253A1 true WO2021068253A1 (zh) 2021-04-15

Family

ID=75437657

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/110858 WO2021068253A1 (zh) 2019-10-12 2019-10-12 定制数据流硬件模拟仿真方法、装置、设备及存储介质

Country Status (2)

Country Link
CN (1) CN113272813B (zh)
WO (1) WO2021068253A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114707650B (zh) * 2021-12-31 2024-06-14 浙江芯劢微电子股份有限公司 一种提高仿真效率的仿真实现方法

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170074932A1 (en) * 2014-05-29 2017-03-16 Universiteit Gent Integrated circuit verification using parameterized configuration
CN107016175A (zh) * 2017-03-23 2017-08-04 中国科学院计算技术研究所 Automated design method and apparatus for neural network processors, and optimization method
CN109496319A (zh) * 2018-01-15 2019-03-19 深圳鲲云信息科技有限公司 Hardware optimization method, system, storage medium, and terminal for an artificial intelligence processing apparatus
CN110245750A (zh) * 2019-06-14 2019-09-17 西南科技大学 An FPGA-based neural network numerical simulation method


Also Published As

Publication number Publication date
CN113272813A (zh) 2021-08-17
CN113272813B (zh) 2023-05-05

Similar Documents

Publication Publication Date Title
TWI529552B (zh) 用於實施具有電感知之電子電路設計的約束驗證之方法、系統及製造物
US7765500B2 (en) Automated generation of theoretical performance analysis based upon workload and design configuration
US10007492B2 (en) System and method for automatically generating device drivers for run time environments
US8838430B1 (en) Detection of memory access violation in simulations
CN107436762A (zh) A register code file generation method, apparatus, and electronic device
CN104598659B (zh) Method and device for simulating digital circuits
US8271252B2 (en) Automatic verification of device models
WO2022126902A1 (zh) Model compression method and apparatus, electronic device, and medium
CN106803799A (zh) A performance testing method and apparatus
CN114462338A (zh) An integrated circuit verification method and apparatus, computer device, and storage medium
CN109840878B (zh) A SystemC-based GPU-oriented parameter management method
US20210350230A1 (en) Data dividing method and processor for convolution operation
US20190065962A1 (en) Systems And Methods For Determining Circuit-Level Effects On Classifier Accuracy
CN109582906A (zh) Method, apparatus, device, and storage medium for determining data reliability
CN114707650A (zh) A simulation implementation method for improving simulation efficiency
US20120166168A1 (en) Methods and systems for fault-tolerant power analysis
WO2021031137A1 (zh) Artificial intelligence application development system, computer device, and storage medium
WO2021068253A1 (zh) Customized dataflow hardware simulation method, apparatus, device, and storage medium
WO2021068249A1 (zh) Runtime hardware simulation method, apparatus, device, and storage medium
CN116168403A (zh) Medical data classification model training method, classification method, apparatus, and related medium
US12001771B2 (en) Variant model-based compilation for analog simulation
CN112242959B (zh) Microservice rate-limiting control method, apparatus, device, and computer storage medium
CN109426503A (zh) Method and apparatus for providing simulation stimuli
US8527923B1 (en) System, method, and computer program product for hierarchical formal hardware verification of floating-point division and/or square root algorithmic designs using automatic sequential equivalence checking
CN111859985A (zh) AI customer-service model testing method, apparatus, electronic device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19948811

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 16/09/2022)

122 Ep: pct application non-entry in european phase

Ref document number: 19948811

Country of ref document: EP

Kind code of ref document: A1