WO2023070301A1 - Method, apparatus and device for logic simulation - Google Patents

Method, apparatus and device for logic simulation

Info

Publication number
WO2023070301A1
Authority
WO
WIPO (PCT)
Prior art keywords
logic
time frame
output
circuit
level
Prior art date
Application number
PCT/CN2021/126318
Other languages
French (fr)
Chinese (zh)
Inventor
王柳峥
黄宇
张炜铭
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN202180100883.1A (CN117751295A)
Priority to PCT/CN2021/126318 (WO2023070301A1)
Publication of WO2023070301A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01R MEASURING ELECTRIC VARIABLES; MEASURING MAGNETIC VARIABLES
    • G01R31/00 Arrangements for testing electric properties; Arrangements for locating electric faults; Arrangements for electrical testing characterised by what is being tested not provided for elsewhere
    • G01R31/28 Testing of electronic circuits, e.g. by signal tracer
    • G01R31/317 Testing of digital circuits
    • G01R31/3181 Functional testing
    • G01R31/3185 Reconfiguring for testing, e.g. LSSD, partitioning

Definitions

  • the calculation time of these logic gates can be omitted when no calculation is required, thereby further reducing the total time of logic simulation and also reducing the consumption of computing resources, so that the limited computing resources can be used to process other logic gates that require computation. This further reduces the overall time for logic simulation.
  • the generation unit is further configured to: sequentially calculate the output of each level circuit, time frame by time frame; and generate the logic simulation output set based on the output of each level circuit in the last time frame.
  • FIG. 2 shows a schematic block diagram of a simulation process of a logic circuit according to some embodiments of the present disclosure.
  • FIG. 8 shows a schematic diagram of a simulation process executed by a graphics processor according to some embodiments of the present disclosure.
  • the ATPG device 20 is configured to generate ATPG data for logic simulation and transmit the ATPG data to the electronic device 10.
  • the ATPG device 20 may be integrated with the electronic device 10, which is not limited by the present disclosure.
  • the electronic device 10 may include input devices, communication devices, displays, audio devices and other components not shown here.
  • the electronic device 10 may include, for example, devices with computing functions such as desktop computers, notebooks, workstations, and servers.
  • the netlist file used to describe the logic circuit can be transmitted to the electronic device 10 in various wired or wireless ways. Alternatively, the electronic device 10 may also use a storage medium storing the netlist file to read the netlist file.
  • the ATPG device 20 can generate different ATPG data for different logic circuits.
  • FIG. 3 shows an example circuit diagram of an illustrative logic circuit 30 according to some embodiments of the present disclosure.
  • the logic circuit 30 is only used to illustrate the principle of the present disclosure, but not to limit the scope of the present disclosure. It is understood that other configurations of logic circuits are also possible.
  • the logic circuit 30 may include, for example, a first primary input PI1, an AND gate 31, a first flip-flop U1, a second primary input PI2, an inverter 32, a second flip-flop U2, a first buffer 33, a second buffer 34 and a primary output PO.
  • the input of the AND gate 31 is coupled to the first primary input PI1 and the output of the first flip-flop U1.
  • primary inputs, primary outputs and sequential logic gates may be divided into different levels.
  • the primary outputs can be grouped in the last level of combinational logic gates, or distributed so that each is placed in the level immediately following its driving source.
  • the CPU determines whether the current time frame is the last time frame. If it is, the process proceeds to 620 and the logic simulation ends. If it is not, the process proceeds to 618. At 618, the CPU sets the next time frame as the current time frame, and loops through 608-616 until the last time frame is reached and the logic simulation ends.
  • the primary inputs, the primary outputs and the sequential logic gates may be divided into circuits of different levels based on their respective connection relationships.
  • the second-level circuits including combinational logic gates may also be divided into different sub-level circuits. This disclosure is not limited in this regard.
  • FIG. 13 shows a schematic block diagram of an electronic device 1300 according to some embodiments of the present disclosure.
  • the electronic device 1300 may include multiple modules for performing corresponding steps in the methods discussed in FIGS. 6-12.
  • an electronic device 1300 includes a receiving unit 1302 and a generating unit 1304.
  • the receiving unit 1302 is used for receiving hierarchical data of the logic circuit.
  • the hierarchical data represents multiple level circuits of the logic circuit, and the multiple level circuits are divided based on the connection relationships of multiple logic gates in the logic circuit.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Tests Of Electronic Circuits (AREA)

Abstract

A method and device for simulating a logic circuit. The method comprises: dividing a logic circuit into levels, and using a GPU to compute in parallel, with a test vector and within the same time frame, a plurality of logic gates located in the same level, so as to finally obtain a logic simulation output set. Compared with conventional serial simulation executed by a CPU, such as hard-coded simulation, the GPU simulates a plurality of logic gates of the levelized circuit in parallel, so that the logic simulation time may be greatly shortened.

Description

Method, apparatus and device for logic simulation
Technical field
The present disclosure relates to the field of electronics, and more particularly to methods, apparatuses and devices for the simulation of integrated circuits.
Background
A variety of electronic design automation (EDA) tools have been developed to carry out the design flow of very large scale integration (VLSI) chips, including functional design, synthesis, verification, and physical design (placement, routing, layout, design rule checking, and so on). In the design of integrated circuits such as digital integrated circuits, an important stage is the simulation of the logic circuit, which verifies the correctness of the circuit design before tape-out. Simulation of a logic circuit usually includes two stages: logic simulation and fault simulation. Logic simulation is a technique for deriving the outputs of a logic circuit from the circuit structure and given input stimuli. It is commonly used in digital circuit design: a computer simulates the behavior of the circuit to check its correctness. It can also be used to check the correctness of test vectors (which contain input stimuli and expected outputs). Logic simulation can be performed on circuits at different levels of abstraction, such as the behavior level, the register transfer level (RTL), the gate level, and the transistor level.
Conventional logic simulation includes, for example, hard-coded logic simulation and event-driven logic simulation, and is usually executed serially on a central processing unit (CPU). Integrated circuits usually contain tens of millions of logic gates, so conventional logic simulation takes a considerable amount of time.
Summary
In view of the above problems, embodiments of the present disclosure aim to provide a method, a storage medium, a program product, and an electronic device for logic simulation.
According to a first aspect of the present disclosure, a method for logic simulation is provided. The method includes receiving hierarchical data of a logic circuit. The hierarchical data represents a plurality of level circuits of the logic circuit, and the plurality of level circuits are divided based on the connection relationships of a plurality of logic gates in the logic circuit. The method further includes generating a logic simulation output set based on the hierarchical data and a test vector set for the logic circuit. Generating the logic simulation output set includes computing in parallel the logic output values of a plurality of logic gates located in the same level circuit in the same time frame, the logic output values of the plurality of logic gates being associated with the logic simulation output set. By using an accelerator such as a graphics processing unit (GPU) to compute the outputs of multiple logic gates in the same time frame in parallel, the processing time can be significantly reduced compared with conventional serial computation on a CPU. In addition, dividing the logic circuit into levels ensures the correctness of the logic simulation, because the logic gates processed in parallel are located in the same level circuit and their simulation results have no causal dependence on one another.
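The levelized evaluation described above can be illustrated with a minimal sketch. It is not the implementation of the disclosure: the netlist layout, the names `NETLIST`, `GATE_FUNCS`, `levelize` and `simulate`, and the rule "level = 1 + highest level among the inputs" are assumptions made for illustration, and the per-level loop runs serially on a CPU where the disclosure would assign the gates of one level to parallel GPU threads.

```python
from collections import defaultdict

# Hypothetical netlist: gate name -> (gate type, list of input signal names).
NETLIST = {
    "g1": ("AND", ["a", "b"]),
    "g2": ("NOT", ["c"]),
    "g3": ("OR", ["g1", "g2"]),
}

GATE_FUNCS = {
    "AND": lambda ins: int(all(ins)),
    "OR": lambda ins: int(any(ins)),
    "NOT": lambda ins: 1 - ins[0],
}

def levelize(netlist, primary_inputs):
    """Group gates into levels; a gate's level is 1 + the highest level of its inputs."""
    level = {pi: 0 for pi in primary_inputs}  # primary inputs sit at level 0

    def visit(name):
        if name not in level:
            level[name] = 1 + max(visit(src) for src in netlist[name][1])
        return level[name]

    levels = defaultdict(list)
    for gate in netlist:
        levels[visit(gate)].append(gate)
    return [levels[k] for k in sorted(levels)]

def simulate(netlist, levels, test_vector):
    """Evaluate one time frame, level by level, for a single test vector."""
    values = dict(test_vector)
    for gates in levels:
        # Gates in one level read only values from earlier levels, so they are
        # mutually independent; on a GPU each gate could map to one thread.
        new_values = {
            g: GATE_FUNCS[netlist[g][0]]([values[s] for s in netlist[g][1]])
            for g in gates
        }
        values.update(new_values)
    return values

levels = levelize(NETLIST, ["a", "b", "c"])
result = simulate(NETLIST, levels, {"a": 1, "b": 1, "c": 1})
```

Because every gate in a level is computed from `values` as it stood before that level started, the order of evaluation inside a level does not matter, which is the property that makes the parallel mapping safe.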
In a possible implementation of the first aspect, the test vector set includes a first test vector and a second test vector, and the logic simulation output set includes a first output subset and a second output subset. Generating the logic simulation output set further includes generating the first output subset based on the hierarchical data and the first test vector, and generating the second output subset based on the hierarchical data and the second test vector. The generation of the first output subset and the generation of the second output subset are performed in parallel. In addition to the parallel computation of multiple logic gates within the same level in the same time frame, performing logic simulation for multiple test vectors in parallel further shortens the logic simulation time.
In a possible implementation of the first aspect, generating the first output subset includes computing in parallel, based on the hierarchical data and the first test vector, a first plurality of outputs of the plurality of logic gates located in the same level circuit in the same time frame, the first plurality of outputs being associated with the first output subset. In addition to the parallel computation of multiple logic gates within the same level in the same time frame, performing logic simulation for multiple test vectors in parallel further shortens the logic simulation time.
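One way to picture simulating several test vectors at once is bit-parallel evaluation, sketched below. This is a swapped-in CPU technique used only for illustration, not the GPU scheme of the disclosure: each signal is stored as an integer whose bit i holds that signal's value under test vector i, so one bitwise operation evaluates a gate for all vectors. The three-gate circuit and the names `pack`, `unpack` and `simulate_packed` are hypothetical.

```python
def pack(bits):
    """Pack one signal's values across test vectors into an integer (vector i -> bit i)."""
    word = 0
    for i, b in enumerate(bits):
        word |= (b & 1) << i
    return word

def unpack(word, n):
    """Recover the per-vector values of a signal from its packed integer."""
    return [(word >> i) & 1 for i in range(n)]

def simulate_packed(a, b, c, n_vectors):
    """Evaluate out = (a AND b) OR (NOT c) for all test vectors at once."""
    mask = (1 << n_vectors) - 1   # keep only the bits that carry a test vector
    g1 = a & b                    # level 1: AND gate
    g2 = ~c & mask                # level 1: inverter
    return (g1 | g2) & mask       # level 2: OR gate

# Four test vectors, listed per signal.
a = pack([1, 0, 1, 0])
b = pack([1, 1, 0, 0])
c = pack([1, 1, 0, 1])
outputs = unpack(simulate_packed(a, b, c, 4), 4)
```

Each test vector's result occupies its own bit and never interacts with the others, mirroring the independence between the first and second output subsets.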
In a possible implementation of the first aspect, generating the logic simulation output set further includes: determining whether a first level circuit in the logic circuit is to be computed in a first time frame; if it is determined that the first level circuit is to be computed, computing a first output set of the first level circuit, the logic simulation output set being associated with the first output set; and if it is determined that the first level circuit is not to be computed, using the output set of the first level circuit in the time frame immediately preceding the first time frame as the first output set of the first level circuit in the first time frame. By determining whether a level circuit in the logic circuit needs to be computed in a given time frame, the computation time for all logic gates of that level circuit can be saved when no computation is needed. This reduces the total logic simulation time and the consumption of computing resources, so that the limited computing resources can be used for other logic gates that do need computation, which reduces the total logic simulation time further.
In a possible implementation of the first aspect, determining whether the first level circuit in the logic circuit is to be computed in the first time frame includes at least one of the following: determining whether the value, in the first time frame, of a first level flag bit corresponding to the first level circuit is a first value, where the first value of the first level flag bit indicates that the input set of the first level circuit in the first time frame differs at least in part from its input set in the time frame immediately preceding the first time frame; and determining whether all inputs or factor combinations of the first level circuit in the first time frame have changed compared with those in the time frame immediately preceding the first time frame. Using a flag bit makes it simple and efficient to determine whether the logic gates in a level circuit need to be computed. Because the flag check is simple, it also shortens the time spent deciding whether computation is needed, which further reduces the total logic simulation time.
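The level-skipping decision can be sketched as follows, under stated assumptions: the flag is derived by comparing the level's current inputs with those of the previous time frame, gates are modeled as plain Python functions of the input dictionary, and the names `level_flag` and `run_level` are hypothetical. How the disclosure actually maintains the flag bit is not specified in this passage.

```python
def level_flag(prev_inputs, cur_inputs):
    """Return 1 if any input of the level differs from the previous frame, else 0."""
    return int(any(cur_inputs[k] != prev_inputs.get(k) for k in cur_inputs))

def run_level(gates, prev_inputs, cur_inputs, prev_outputs):
    """Compute one level in one time frame, or reuse the previous frame's outputs.

    gates: gate name -> function taking the input dict and returning 0 or 1.
    Returns (outputs, recomputed).
    """
    if prev_outputs is not None and level_flag(prev_inputs, cur_inputs) == 0:
        return dict(prev_outputs), False   # inputs unchanged: skip the whole level
    return {name: fn(cur_inputs) for name, fn in gates.items()}, True

gates = {"g1": lambda v: v["a"] & v["b"]}
out0, recomputed0 = run_level(gates, {}, {"a": 1, "b": 1}, None)
out1, recomputed1 = run_level(gates, {"a": 1, "b": 1}, {"a": 1, "b": 1}, out0)
out2, recomputed2 = run_level(gates, {"a": 1, "b": 1}, {"a": 0, "b": 1}, out1)
```

In the second call the inputs match the previous frame, so the stored outputs are returned without evaluating any gate; the third call changes `a` and forces a recomputation.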
In a possible implementation of the first aspect, computing the first output set of the first level circuit includes: determining whether a first logic gate in the first level circuit is to be computed in the first time frame; if it is determined that the first logic gate is to be computed, computing a first output of the first logic gate; and if it is determined that the first logic gate is not to be computed, using the output of the first logic gate in the time frame immediately preceding the first time frame as the first output of the first logic gate in the first time frame, the logic simulation output set being associated with the first output. By determining whether individual logic gates in the logic circuit need to be computed in a given time frame, the computation time for those gates can be saved when no computation is needed. This reduces the total logic simulation time and the consumption of computing resources, so that the limited computing resources can be used for other logic gates that do need computation, which reduces the total logic simulation time further.
In a possible implementation of the first aspect, determining whether the first logic gate in the first level circuit is to be computed in the first time frame includes at least one of the following: determining whether the value, in the first time frame, of a first logic gate flag bit corresponding to the first logic gate is a first value, where the first value of the first logic gate flag bit indicates that the inputs of the first logic gate in the first time frame differ at least in part from its inputs in the time frame immediately preceding the first time frame; and determining whether all inputs or factor combinations of the first logic gate in the first time frame have changed compared with those in the time frame immediately preceding the first time frame. Using a flag bit makes it simple and efficient to determine whether a logic gate in a level circuit needs to be computed. Because the flag check is simple, it also shortens the time spent deciding whether computation is needed, which further reduces the total logic simulation time.
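The same idea applied per gate rather than per level might look like the sketch below. It assumes the gate flag is obtained by comparing each gate's own inputs with the previous time frame; the data layout and the names `OPS` and `eval_level` are illustrative, not taken from the disclosure.

```python
OPS = {
    "AND": lambda xs: int(all(xs)),
    "OR": lambda xs: int(any(xs)),
}

def eval_level(level, values, prev_values):
    """Evaluate only the gates of a level whose inputs changed since the last frame.

    level: gate name -> (operation, input signal names).
    values: signal values already known in the current frame.
    prev_values: all signal values (inputs and gate outputs) of the previous frame.
    Returns (outputs, list of gates that were actually evaluated).
    """
    outputs, evaluated = {}, []
    for gate, (op, ins) in level.items():
        flag = int(any(values[s] != prev_values[s] for s in ins))  # per-gate flag
        if flag:
            outputs[gate] = OPS[op]([values[s] for s in ins])
            evaluated.append(gate)
        else:
            outputs[gate] = prev_values[gate]   # reuse the previous frame's output
    return outputs, evaluated

level = {"g1": ("AND", ["a", "b"]), "g2": ("OR", ["c", "d"])}
prev = {"a": 1, "b": 1, "c": 0, "d": 0, "g1": 1, "g2": 0}
cur = {"a": 1, "b": 1, "c": 1, "d": 0}
outputs, evaluated = eval_level(level, cur, prev)
```

Only `g2` sees a changed input (`c`), so it is the only gate evaluated; `g1` keeps its previous output.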
In a possible implementation of the first aspect, generating the logic simulation output set based on the hierarchical data and the test vector set for the logic circuit further includes: computing the outputs of the level circuits in sequence, time frame by time frame; and generating the logic simulation output set based on the outputs of the level circuits in the last time frame. Computing the logic outputs of each level frame by frame preserves the causal order of the logic operations in the simulation, which improves the accuracy of the logic simulation.
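A frame-by-frame loop can be sketched with a deliberately tiny circuit. The circuit (one flip-flop `q` fed by `d = q XOR en`, with primary output `po = q`), the function name `simulate_frames`, and the convention that the flip-flop captures `d` at the end of each frame are all assumptions chosen for illustration.

```python
def simulate_frames(frames, q0=0):
    """Simulate a hypothetical one-flip-flop circuit over consecutive time frames.

    frames: list of per-frame primary-input dicts, each {"en": 0 or 1}.
    Returns (signal values of the last frame, final flip-flop state).
    """
    q = q0
    history = []
    for inputs in frames:          # time frames are processed strictly in order
        d = q ^ inputs["en"]       # combinational level uses the state left by the previous frame
        po = q                     # primary output observed in this frame
        history.append({"d": d, "po": po})
        q = d                      # flip-flop captures d for the next frame
    return history[-1], q

last_frame, final_state = simulate_frames([{"en": 1}, {"en": 0}, {"en": 1}])
```

Each frame reads the state produced by the frame before it, so the frames cannot be reordered; the output set is taken from the last frame only.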
In a possible implementation of the first aspect, computing the outputs of the level circuits in sequence, time frame by time frame, includes: determining whether the number of times the output of a level circuit that includes sequential logic gates has been computed within one time frame exceeds a threshold number; and if that number exceeds the threshold number, generating a fault indication, the fault indication indicating that the logic simulation has failed. Setting a loop threshold prevents the logic simulation from being trapped in an erroneous or infinite loop over sequential logic gates and allows simulation errors to be reported promptly, thereby reducing the logic simulation time.
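The threshold check can be sketched as a bounded settling loop. The exception class `SimulationFault`, the function `settle`, the default threshold of 8, and the "re-evaluate until the state stops changing" strategy are assumptions for illustration; the disclosure only states that exceeding a threshold count produces a fault indication.

```python
class SimulationFault(RuntimeError):
    """Hypothetical fault indication raised when a level fails to settle."""

def settle(update, state, max_iterations=8):
    """Re-evaluate a sequential level until its state is stable.

    update: function mapping the current state dict to the next state dict.
    Raises SimulationFault once the evaluation count exceeds max_iterations.
    Returns (stable state, number of evaluations performed).
    """
    count = 0
    while True:
        count += 1
        if count > max_iterations:
            raise SimulationFault(
                f"level did not settle within {max_iterations} evaluations"
            )
        new_state = update(state)
        if new_state == state:
            return state, count
        state = new_state

# A level that settles after one change.
stable_state, evaluations = settle(lambda s: {"q": s["q"] | 1}, {"q": 0})

# A level that oscillates forever is reported instead of looping endlessly.
try:
    settle(lambda s: {"q": 1 - s["q"]}, {"q": 0})
    fault_reported = False
except SimulationFault:
    fault_reported = True
```

Raising an exception is one possible form of the fault indication; a status flag returned to the caller would serve equally well.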
According to a second aspect of the present disclosure, a computer-readable storage medium is provided. The computer-readable storage medium stores a plurality of programs configured to be executed by one or more processors, the plurality of programs including instructions for performing the method of the first aspect.
According to a third aspect of the present disclosure, a computer program product is provided. The computer program product includes a plurality of programs configured to be executed by one or more processors, the plurality of programs including instructions for performing the method of the first aspect.
According to a fourth aspect of the present disclosure, an electronic device is provided. The electronic device includes one or more processors and a memory including computer instructions. The computer instructions, when executed by the one or more processors of the electronic device, cause the electronic device to perform the method of the first aspect.
According to a fifth aspect of the present disclosure, an electronic device is provided. The electronic device includes a receiving unit and a generating unit. The receiving unit is configured to receive hierarchical data of a logic circuit; the hierarchical data represents a plurality of level circuits of the logic circuit, and the plurality of level circuits are divided based on the connection relationships of a plurality of logic gates in the logic circuit. The generating unit is configured to generate a logic simulation output set based on the hierarchical data and a test vector set for the logic circuit, where generating the logic simulation output set includes computing in parallel the logic output values of a plurality of logic gates located in the same level circuit in the same time frame, the logic output values of the plurality of logic gates being associated with the logic simulation output set. By using an accelerator such as a GPU to compute the outputs of multiple logic gates in the same time frame in parallel, the processing time can be significantly reduced compared with conventional serial computation on a CPU. In addition, dividing the logic circuit into levels ensures the correctness of the logic simulation, because the logic gates processed in parallel are located in the same level circuit and their simulation results have no causal dependence on one another.
In a possible implementation of the fifth aspect, the test vector set includes a first test vector and a second test vector, and the logic simulation output set includes a first output subset and a second output subset. The generating unit is further configured to generate the first output subset based on the hierarchical data and the first test vector, and to generate the second output subset based on the hierarchical data and the second test vector, the generation of the first output subset and the generation of the second output subset being performed in parallel. In addition to the parallel computation of multiple logic gates within the same level in the same time frame, performing logic simulation for multiple test vectors in parallel further shortens the logic simulation time.
In a possible implementation of the fifth aspect, the generating unit is further configured to compute in parallel, based on the hierarchical data and the first test vector, a first plurality of outputs of the plurality of logic gates located in the same level circuit in the same time frame, the first plurality of outputs being associated with the first output subset. In addition to the parallel computation of multiple logic gates within the same level in the same time frame, performing logic simulation for multiple test vectors in parallel further shortens the logic simulation time.
In a possible implementation of the fifth aspect, the generating unit is further configured to: determine whether a first level circuit in the logic circuit is to be computed in a first time frame; if it is determined that the first level circuit is to be computed, compute a first output set of the first level circuit, the logic simulation output set being associated with the first output set; and if it is determined that the first level circuit is not to be computed, use the output set of the first level circuit in the time frame immediately preceding the first time frame as the first output set of the first level circuit in the first time frame. By determining whether a level circuit in the logic circuit needs to be computed in a given time frame, the computation time for all logic gates of that level circuit can be saved when no computation is needed. This reduces the total logic simulation time and the consumption of computing resources, so that the limited computing resources can be used for other logic gates that do need computation, which reduces the total logic simulation time further.
In a possible implementation of the fifth aspect, the generating unit is further configured to: determine whether the value, in the first time frame, of a first level flag bit corresponding to the first level circuit is a first value, where the first value of the first level flag bit indicates that the input set of the first level circuit in the first time frame differs at least in part from its input set in the time frame immediately preceding the first time frame; and determine whether all inputs or factor combinations of the first level circuit in the first time frame have changed compared with those in the time frame immediately preceding the first time frame. Using a flag bit makes it simple and efficient to determine whether the logic gates in a level circuit need to be computed. Because the flag check is simple, it also shortens the time spent deciding whether computation is needed, which further reduces the total logic simulation time.
In a possible implementation of the fifth aspect, the generating unit is further configured to: determine whether a first logic gate in the first level circuit is to be computed in the first time frame; if it is determined that the first logic gate is to be computed, compute a first output of the first logic gate; and if it is determined that the first logic gate is not to be computed, use the output of the first logic gate in the time frame immediately preceding the first time frame as the first output of the first logic gate in the first time frame, the logic simulation output set being associated with the first output. By determining whether individual logic gates in the logic circuit need to be computed in a given time frame, the computation time for those gates can be saved when no computation is needed. This reduces the total logic simulation time and the consumption of computing resources, so that the limited computing resources can be used for other logic gates that do need computation, which reduces the total logic simulation time further.
In a possible implementation of the fifth aspect, the generating unit is further configured to: determine whether the value, in the first time frame, of a first logic gate flag bit corresponding to the first logic gate is a first value, where the first value of the first logic gate flag bit indicates that the inputs of the first logic gate in the first time frame differ at least in part from its inputs in the time frame immediately preceding the first time frame; and determine whether all inputs or factor combinations of the first logic gate in the first time frame have changed compared with those in the time frame immediately preceding the first time frame. Using a flag bit makes it simple and efficient to determine whether a logic gate in a level circuit needs to be computed. Because the flag check is simple, it also shortens the time spent deciding whether computation is needed, which further reduces the total logic simulation time.
In a possible implementation of the fifth aspect, the generating unit is further configured to, based on the hierarchical data and the test vector set for the logic circuit: compute the outputs of the level circuits in sequence, time frame by time frame; and generate the logic simulation output set based on the outputs of the level circuits in the last time frame. Computing the logic outputs of each level frame by frame preserves the causal order of the logic operations in the simulation, which improves the accuracy of the logic simulation.
In a possible implementation of the fifth aspect, the generating unit is further configured to determine whether the number of times the output of a level circuit that includes sequential logic gates has been computed within one time frame exceeds a threshold number; if it does, a fault indication is generated, the fault indication signalling that the logic simulation has failed. Setting a threshold on the number of iterations prevents the logic simulation from becoming stuck in an erroneous or infinite loop over the sequential logic gates, and allows simulation errors to be reported promptly, which reduces the logic simulation time.
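The iteration threshold described above can be sketched as follows. This is a minimal illustration only: the names `SimulationFault`, `settle` and `max_iterations`, and the idea of re-evaluating until the state stops changing, are assumptions made for the example and are not taken from the disclosure.

```python
class SimulationFault(Exception):
    """Hypothetical fault indication: the sequential level did not settle."""


def settle(evaluate_once, state, max_iterations=8):
    """Re-evaluate until the state stops changing or the threshold is exceeded.

    evaluate_once and max_iterations are illustrative assumptions.
    """
    for _ in range(max_iterations):
        new_state = evaluate_once(state)
        if new_state == state:
            return state          # stable: no further computation needed
        state = new_state
    raise SimulationFault(f"no stable state after {max_iterations} evaluations")
```

A state that converges returns normally, while a state that keeps oscillating raises the fault instead of looping forever.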
It should be understood that the content described in the Summary is not intended to identify key or essential features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will become readily understood from the following description.
Description of Drawings
The above and other features, advantages and aspects of the embodiments of the present disclosure will become more apparent from the following detailed description taken in conjunction with the accompanying drawings. In the drawings, identical or similar reference numerals denote identical or similar elements, wherein:
FIG. 1 shows a schematic diagram of a simulation system 100 for a logic circuit according to some embodiments of the present disclosure.
FIG. 2 shows a schematic block diagram of a simulation flow for a logic circuit according to some embodiments of the present disclosure.
FIG. 3 shows an example circuit diagram of an illustrative logic circuit according to some embodiments of the present disclosure.
FIG. 4 shows a schematic diagram of the levelization of the logic circuit in FIG. 3.
FIG. 5 shows a schematic diagram of the logic circuit in FIG. 3 unrolled by time frame.
FIG. 6 shows a schematic diagram of a simulation process performed by a general-purpose processor according to some embodiments of the present disclosure.
FIG. 7 shows a schematic flowchart of a logic simulation method according to some embodiments of the present disclosure.
FIG. 8 shows a schematic diagram of a simulation process performed by a graphics processor according to some embodiments of the present disclosure.
FIG. 9 shows a schematic diagram of the combinational logic calculation process in the process of FIG. 8 according to some embodiments of the present disclosure.
FIG. 10 shows a schematic diagram of an implementation of the logic calculation process in the process of FIG. 8 according to some embodiments of the present disclosure.
FIG. 11 shows a schematic block diagram of some examples of logic simulation according to some embodiments of the present disclosure.
FIG. 12 shows a schematic block diagram of some examples of logic simulation according to other embodiments of the present disclosure.
FIG. 13 shows a schematic block diagram of an electronic device according to some embodiments of the present disclosure.
FIG. 14 shows a block diagram of an example device that may be used to implement some embodiments of the present disclosure.
Detailed Description
Embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of protection of the present disclosure.
In the description of the embodiments of the present disclosure, the term "including" and similar terms should be understood as open-ended inclusion, that is, "including but not limited to". The term "based on" should be understood as "based at least in part on". The terms "one embodiment" and "the embodiment" should be understood as "at least one embodiment". The terms "first", "second" and the like may refer to different objects or to the same object. The term "and/or" denotes at least one of the two items it connects; for example, "A and/or B" means A, B, or A and B. Other explicit and implicit definitions may also be included below.
It should be understood that, in the following description of specific embodiments of the technical solutions provided by the embodiments of the present application, some repeated content may be omitted; the specific embodiments should nevertheless be regarded as referring to one another and may be combined with one another.
As described above, conventional logic simulation consumes a considerable amount of simulation time. For example, in conventional hard-coded logic simulation, the hard-coded program starts from the primary inputs and traverses the logic circuit, computing the logic value of each logic gate in turn. This is usually executed serially on a CPU and takes a long time, because the CPU essentially computes the simulated logic value of each logic gate one by one until the final logic simulation output set is obtained.
In some embodiments of the present disclosure, after reading the netlist file, the processor levelizes the logic gates of the logic circuit described by the netlist file according to gate type, placing the sequential logic gates, primary input ports and primary output ports in a first level circuit and placing the combinational logic gates in a second level circuit. By using an accelerator such as a GPU to compute the outputs of multiple logic gates in the same time frame in parallel, the processing time can be reduced significantly compared with conventional serial computation on a CPU. In addition, levelizing the simulated logic circuit ensures the correctness of the logic simulation, because the logic gates processed in parallel are located in the same level circuit and their simulation results have no causal dependence on one another.
FIG. 1 shows a schematic diagram of a simulation system 100 for a logic circuit according to some embodiments of the present disclosure. In one embodiment, the simulation system 100 includes, for example, an electronic device 10 and an ATPG device 20. In one embodiment, the electronic device 10 is, for example, a computer. The electronic device 10 includes a general-purpose processor 14, such as a CPU, and an accelerator 12, such as a GPU; the general-purpose processor 14 and the accelerator 12 may each include a cache. Alternatively, in some embodiments, the cache may be separate from the general-purpose processor 14 and the accelerator 12; the scope of the present disclosure is not limited in this respect. An accelerator such as a GPU typically includes a very large number of computing units, and for certain types of data processing it has a significant advantage over a general-purpose processor such as a CPU. For example, a GPU may have hundreds or thousands of processing engines or threads, which suit the parallel processing of data with identical or similar formats, such as image pixels. In some embodiments of the present disclosure, the GPU is used to compute multiple logic gates in the same level circuit in parallel. For example, if one level circuit includes 100 logic gates, parallel computation on a GPU can in theory reduce the processing time to about 1/100 of the serial computation time. Although the present disclosure describes parallel computation using a GPU, it should be understood that this is merely illustrative and does not limit the scope of the present disclosure. In some embodiments, other accelerators, such as a field programmable gate array (FPGA) or an application-specific integrated circuit, may be used to compute the logic gates in parallel.
The ATPG device 20 is configured to generate ATPG data for the logic simulation and to transmit the ATPG data to the electronic device 10. Although the electronic device 10 and the ATPG device 20 are shown as separate devices in FIG. 1, in some embodiments the ATPG device 20 may be integrated with the electronic device 10; the present disclosure is not limited in this respect. The electronic device 10 may include other components not shown here, such as an input device, a communication device, a display and an audio device. The electronic device 10 may be, for example, a desktop computer, a notebook computer, a workstation, a server or another device with computing capability. The netlist file describing the logic circuit may be delivered to the electronic device 10 by various wired or wireless means. Alternatively, the electronic device 10 may read the netlist file from a storage medium on which it is stored. The ATPG device 20 may generate different ATPG data for different logic circuits.
FIG. 2 shows a schematic diagram of a simulation flow 200 according to some embodiments of the present disclosure. In one embodiment, the simulation flow 200 is executed by the electronic device 10 of FIG. 1, so the description given for the electronic device 10 also applies to the simulation flow 200. The simulation flow may include, for example, a logic simulation 210 and a fault simulation that is not shown. The logic simulation 210 uses ATPG data 202 from the ATPG device 20 and a netlist file 204 obtained wirelessly, over a wired connection, or by reading a storage medium. The netlist file 204 includes data describing the logic gates in the logic circuit (including sequential logic gates and combinational logic gates), the primary inputs, the primary outputs, and the coupling relationships among the components. The electronic device 10 may execute the logic simulation 210 to generate a logic simulation output set. The specific process of the logic simulation 210 is described below.
FIG. 3 shows an example circuit diagram of an illustrative logic circuit 30 according to some embodiments of the present disclosure. The logic circuit 30 serves only to illustrate the principles of the present disclosure and does not limit its scope; logic circuits with other configurations are also possible. The logic circuit 30 may include, for example, a first primary data input PI1, an AND gate 31, a first flip-flop U1, a second primary data input PI2, an inverter 32, a second flip-flop U2, a first buffer 33, a second buffer 34 and a primary output PO. The inputs of the AND gate 31 are coupled to the first primary data input PI1 and to the output of the first flip-flop U1. The clock port C1 of the first flip-flop U1 is configured to receive a first clock signal, the reset terminal of the first flip-flop U1 is coupled to the output of the second buffer 34, and the output of the first flip-flop U1 is coupled to the primary output PO. The input of the inverter 32 is coupled to the second primary data input PI2, and the output of the inverter 32 is coupled to the input of the second flip-flop U2. The clock port C2 of the second flip-flop U2 is configured to receive a second clock signal, the output of the second flip-flop U2 is coupled to the input of the first buffer 33, and the output of the first buffer 33 is coupled to the input of the second buffer 34. The logic circuit 30 includes a first type of gate, sequential logic gates, and a second type of gate, combinational logic gates. The sequential logic gates are the first flip-flop U1 and the second flip-flop U2, and the combinational logic gates are the AND gate 31, the inverter 32, the first buffer 33 and the second buffer 34.
FIG. 4 shows a schematic diagram of the levelization of the logic circuit in FIG. 3. In some embodiments of the present disclosure, during logic simulation the processor splits the logic circuit into two levels of circuits according to the coupling relationships among the logic gates described by the netlist file, where the first level circuit includes the sequential logic gates, primary inputs and primary outputs, and the second level circuit includes the combinational logic gates. In other embodiments, the second level circuit may be levelized further according to the relationship between the combinational logic gates and the first level circuit and the relationships among the combinational logic gates themselves. For example, the second level circuit includes a first sub-level circuit, a second sub-level circuit, ..., and an N-th sub-level circuit, where N is an integer greater than 1 whose value depends on the logic circuit to be simulated. In one embodiment, the first sub-level circuit includes the combinational logic gates directly coupled to the first level circuit, the second sub-level circuit includes the combinational logic gates directly coupled to the first sub-level circuit, and so on.
In the embodiment shown in FIG. 4, the logic circuit 30 may be divided into three levels: level 0 in FIG. 4 corresponds to the first level circuit, level 1 corresponds to the first sub-level circuit of the second level circuit, and level 2 corresponds to the second sub-level circuit of the second level circuit. The first level circuit includes the primary output PO, the first primary data input PI1, the second primary data input PI2, the first flip-flop U1 and the second flip-flop U2. The first sub-level circuit includes the AND gate 31, the first buffer 33 and the inverter 32. The second sub-level circuit includes the second buffer 34. In some cases, a clock signal is not applied directly to the clock port of a sequential logic gate but reaches the clock port of a sequential logic gate in the first level circuit through one or more combinational logic gates. In that case, the second level circuit may include the combinational logic gates between the primary clock input port and the clock port of the sequential logic gate.
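The levelization of FIG. 4 can be reproduced with a short sketch. The node names, the fan-in table and the rule "a gate's level is one more than the highest level among its fan-in nodes" are assumptions made for this example; the disclosure does not prescribe a particular algorithm or data layout. The data input of U1 is not stated in the text and is therefore left out.

```python
# Level 0 of FIG. 4: primary inputs, primary output and flip-flops.
LEVEL0 = {"PI1", "PI2", "U1", "U2", "PO"}

# Fan-in of each combinational gate of logic circuit 30, as described for FIG. 3.
COMB_FANIN = {
    "AND31": ["PI1", "U1"],
    "INV32": ["PI2"],
    "BUF33": ["U2"],
    "BUF34": ["BUF33"],
}


def levelize(level0, comb_fanin):
    """Assign each combinational gate a level of 1 + max(level of its fan-in)."""
    level = {node: 0 for node in level0}
    remaining = dict(comb_fanin)
    while remaining:
        # A gate is ready once every one of its fan-in nodes has a level.
        ready = [g for g, ins in remaining.items() if all(i in level for i in ins)]
        if not ready:
            raise ValueError("combinational loop or undriven input")
        new = {g: 1 + max(level[i] for i in remaining[g]) for g in ready}
        level.update(new)
        for g in ready:
            del remaining[g]
    return level


levels = levelize(LEVEL0, COMB_FANIN)
```

Under these assumptions the AND gate 31, inverter 32 and first buffer 33 land in level 1 and the second buffer 34 lands in level 2, matching the assignment described for FIG. 4.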
Although FIG. 4 shows one specific way of levelizing the circuit, this does not limit the scope of the present disclosure; other levelization schemes may also be used. In one embodiment, the primary inputs, primary outputs and sequential logic gates may be assigned to different levels. For example, the primary outputs may either be gathered in the last level after the combinational logic gates, or be distributed so that each one sits in the level immediately after its own driver.
FIG. 5 shows a schematic diagram of the logic circuit in FIG. 3 unrolled by time frame. FIG. 4 shows the levelization of the logic circuit within one time frame, but logic simulation usually covers multiple time frames rather than a single one, so as to simulate the logic outputs under different inputs. Moreover, for a sequential logic gate, a single logic level on the clock port generally does not indicate whether the gate is triggered; several successive logic levels are needed. For example, a register is triggered by a transition on its clock port from low level (logic "0") to high level (logic "1"). For a logic circuit containing sequential logic gates, multiple time frames are therefore needed to determine whether a trigger exists.
A clock signal is usually provided as pulses and comprises a series of high and low pulses such as "...10101010...", as shown at the top of FIG. 5. In one embodiment, a "010" segment of the clock signal may be used to determine whether a trigger exists. For example, the second half of a period in which the clock signal is "0" (low level) may be taken as the first "0" of the "010" segment, the immediately following full "1" (high level) period as the "1", and the first half of the next "0" period as the second "0". In this way, one clock cycle corresponds to one cycle used to determine whether a trigger exists. That clock cycle contains three logic values and therefore corresponds to three time frames. Alternatively, a "101" segment of the clock signal may be used to determine whether a trigger exists.
FIG. 5 shows the logic circuit 30 unrolled into the three time frames that correspond to one cycle. Frame 0 corresponds to the first "0" of the "010" segment, frame 1 to the "1", and frame 2 to the second "0". Logic simulation usually involves multiple primary input sets in order to determine the simulation results of the logic circuit under different inputs. For example, the logic input values for the first primary data input PI1 form a first logic input set, which may include, for example, 64 bit values. M cycles may therefore be needed for the simulation, where M is an integer greater than 1, for example 64. For the frame unrolling of FIG. 5, the logic circuit 30 then has to be unrolled into 3M time frames. The unrolled logic circuit 30 has essentially the same levelized form in every time frame, so the description is not repeated here.
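As a rough illustration of the time-frame unrolling, the sketch below builds the clock value for every frame from a "010" segment per cycle and locates the frames in which a low-to-high transition occurs. The function names are hypothetical, and rising-edge triggering is assumed only because it is the register example given in the text.

```python
def expand_clock(num_cycles, pattern=(0, 1, 0)):
    """Return the clock value per time frame: len(pattern) frames per cycle."""
    return [bit for _ in range(num_cycles) for bit in pattern]


def rising_edges(clock_frames):
    """Indices of frames whose clock is 1 while the previous frame's clock was 0."""
    return [
        t
        for t in range(1, len(clock_frames))
        if clock_frames[t - 1] == 0 and clock_frames[t] == 1
    ]


frames = expand_clock(num_cycles=2)   # two cycles give six time frames
edges = rising_edges(frames)          # one trigger frame per cycle
```

With M = 64 cycles, `expand_clock(64)` yields 3M = 192 time frames, consistent with the count given above.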
FIG. 6 shows a schematic diagram of a method 600 for simulation performed by a general-purpose processor according to some embodiments of the present disclosure. In one embodiment, the method 600 may be executed by a general-purpose processor such as a CPU for logic simulation, and may be, for example, at least part of one implementation of the logic simulation 210 in FIG. 2; the aspects described above with reference to FIG. 1 to FIG. 5 therefore apply to the method 600 and are not repeated here. The general-purpose processor 14 may receive the netlist file describing the logic circuit over a wired connection, wirelessly, or by reading a storage medium. The netlist file includes data describing the logic gates, the primary inputs and primary outputs, and the connection relationships among the components. Although the method 600 is described here as being carried out by a general-purpose processor, this is merely illustrative and does not limit the present disclosure. In some embodiments, at least part or all of the method 600 may be executed by an accelerator.
At 602, the CPU constructs a circuit interconnection matrix from the netlist data and levelizes the circuit. In one embodiment, the netlist file may be, for example, a gate-level netlist file. Specifically, the CPU may construct the circuit interconnection matrix from the netlist data and levelize the logic circuit in the manner shown in FIG. 4.
At 604, the CPU transmits the hierarchical data to the GPU. For example, the CPU may transfer the circuit interconnection matrix and the hierarchical data to the memory of the GPU. Although the circuit interconnection matrix and the hierarchical data are described separately here, this is merely illustrative and does not limit the scope of the present disclosure. In other embodiments, the circuit interconnection matrix may be merged into the hierarchical data.
At 606, the CPU sets time frame 0 as the current frame. Because the logic gates in a logic simulation depend on one another, the outputs of the logic gates may be computed frame by frame to keep the simulation correct. At the start of the simulation, the CPU may set time frame 0 as the current time frame.
At 608, the CPU obtains the initial values of the current time frame from the test vectors and transmits them to the GPU. In one embodiment, the test vectors may include the test stimulus values supplied to the primary inputs of the logic circuit. It should be understood that the test stimulus values received by the same primary input may be the same or different in different time frames. The test vectors therefore include the initial values of the primary inputs for the different time frames. In some embodiments, because the test vectors may amount to a large volume of data, only the initial primary-input values for the current time frame are supplied to the GPU. In other embodiments, if the GPU has enough memory, the initial primary-input values for all or some of the time frames may be supplied to the GPU. The present disclosure is not limited in this respect.
At 610, the CPU copies the flag bit arrays to the GPU. In some embodiments, the flag bit arrays may include a level flag bit array and a logic gate flag bit array, where the level flag bit array indicates whether each corresponding level circuit needs to be computed in a given time frame, and the logic gate flag bit array indicates whether each corresponding logic gate needs to be computed in a given time frame. Where no computation is needed, the output from the time frame immediately preceding the current time frame may be used as the output of the current time frame. This reduces the computation time and power consumption of the GPU. It should be understood that in some embodiments the flag bit arrays are not required, and the GPU may compute every level circuit and logic gate in every time frame; the present disclosure is not limited in this respect.
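How the two flag bit arrays could be consulted is sketched below. The function `simulate_level`, its parameters and the dictionary-based layout are assumptions made for illustration; the disclosure only states that a flag indicates whether a level circuit or a logic gate needs to be computed, and that the previous frame's output may be reused otherwise.

```python
def simulate_level(gates, level_flag, gate_flags, prev_outputs, evaluate):
    """Return the outputs of one level circuit for the current time frame.

    level_flag:   truthy if this level needs computing in the current frame.
    gate_flags:   per-gate flag, truthy if the gate's inputs changed.
    prev_outputs: outputs from the immediately preceding time frame.
    evaluate:     hypothetical callback that computes one gate's output.
    """
    if not level_flag:
        # Whole level unchanged: reuse every output from the previous frame.
        return {g: prev_outputs[g] for g in gates}
    outputs = {}
    for g in gates:
        outputs[g] = evaluate(g) if gate_flags[g] else prev_outputs[g]
    return outputs


calls = []

def evaluate(gate):
    calls.append(gate)   # record which gates were actually computed
    return 1

out = simulate_level(
    ["AND31", "INV32"],
    level_flag=1,
    gate_flags={"AND31": 1, "INV32": 0},
    prev_outputs={"AND31": 0, "INV32": 0},
    evaluate=evaluate,
)
```

In this usage only `AND31` is evaluated; `INV32` keeps its previous output, which is the saving the flag bits are meant to provide.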
At 612, the CPU invokes the GPU to simulate the current time frame. The logic simulation performed by the GPU is described in detail below. At 614, the CPU receives from the GPU the final values of the logic gates in the current time frame. After receiving the final values of the logic gates for all time frames, the CPU can obtain the logic simulation output set. The logic simulation output set may include at least some or all of the final values of the logic gates in all time frames. In some embodiments, the CPU may output the logic simulation output set to an external device, such as a display or a memory.
At 616, the CPU determines whether the current time frame is the last time frame. If it is, the method proceeds to 620 and the logic simulation ends. If it is not, the method proceeds to 618. At 618, the CPU sets the next time frame as the current time frame and repeats 608 to 616 until the last time frame is reached and the logic simulation ends.
Although an exemplary process executed by the CPU is shown here, this is merely illustrative and does not limit the scope of the present disclosure. In some embodiments, the method 600 may include other steps, or may omit some of the steps shown in FIG. 6, for example 610.
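The frame loop of steps 606 to 620 can be summarised in a short host-side sketch. The function `run_host_loop` and the callback `simulate_frame` are hypothetical stand-ins for the CPU-side control flow and the GPU call; data transfer (604, 610) is omitted, and a non-empty list of test vectors is assumed.

```python
def run_host_loop(test_vectors, simulate_frame):
    """Drive the per-frame simulation; test_vectors[t] holds frame t's inputs."""
    outputs = []
    frame = 0                                     # 606: start at time frame 0
    while True:
        initial = test_vectors[frame]             # 608: initial values of this frame
        final = simulate_frame(frame, initial)    # 612/614: simulate, get final values
        outputs.append(final)
        if frame == len(test_vectors) - 1:        # 616: last time frame?
            return outputs                        # 620: end of logic simulation
        frame += 1                                # 618: advance to the next frame


# Stub "accelerator" that simply forwards PI1 to PO, for illustration only.
result = run_host_loop(
    [{"PI1": 0}, {"PI1": 1}],
    lambda frame, values: {"PO": values["PI1"]},
)
```

The returned list plays the role of the logic simulation output set, holding the final values of each time frame in order.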
FIG. 7 shows a schematic flowchart of a logic simulation method 700 according to some embodiments of the present disclosure. In one embodiment, the method 700 may be executed by an accelerator such as a GPU for logic simulation, and may be, for example, at least part of one implementation of the logic simulation 210 in FIG. 2; the aspects described above with reference to FIG. 1 to FIG. 5 therefore apply to the method 700 and are not repeated here. Although the method 700 is described below with a GPU as the executing entity, this is merely illustrative and does not limit the scope of the present disclosure. In other embodiments, the method 700 may be executed by an accelerator such as an FPGA.
At 702, the GPU receives the hierarchical data of the logic circuit. The hierarchical data represents multiple level circuits of the logic circuit. The level circuits are divided according to the connection relationships among the logic gates in the logic circuit, for example as shown in FIG. 4. In one embodiment, the primary inputs, primary outputs and sequential logic gates are assigned to the first level circuit, that is, the level 0 circuit, and the combinational logic gates are assigned to the second level circuit. The combinational logic gates may be further divided, according to their connection relationships, into a first sub-level circuit, a second sub-level circuit, and so on. Although one levelization scheme is shown here, this is merely illustrative and does not limit the scope of the present disclosure. In other embodiments, the primary inputs, primary outputs and sequential logic gates may be assigned to different level circuits according to their respective connection relationships. In addition, the second level circuit containing the combinational logic gates may be divided into sub-level circuits in a different way. The present disclosure is not limited in this respect.
At 704, the GPU generates a logic simulation output set based on the hierarchical data and the test vector set for the logic circuit, as described in detail below. Generating the logic simulation output set includes computing in parallel the logic output values of multiple logic gates that are located in the same level circuit in the same time frame, the logic output values of those logic gates being associated with the logic simulation output set. For example, the GPU may use three of its processing engines or threads to compute, in parallel, the logic outputs of the AND gate 31, the first buffer 33 and the inverter 32 of the level 1 circuit in FIG. 4 in the same time frame. The outputs of these three logic gates in that time frame may in turn affect subsequent level circuits, such as the output of the second buffer 34, and the logic outputs of subsequent time frames. The logic output values of these three logic gates therefore affect the final logic simulation output set. In other words, the logic simulation output set depends at least in part on the logic outputs of these three logic gates in that time frame.
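The per-level parallel evaluation can be illustrated with a CPU-side sketch in which a thread pool stands in for the GPU's processing engines. The gate table, the function `evaluate_level` and the use of `concurrent.futures.ThreadPoolExecutor` are assumptions made for this example; Python threads only model the structure of the computation and do not deliver the speed-up a GPU would.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative gate functions operating on single-bit values (0 or 1).
GATE_FUNCS = {
    "AND": lambda a, b: a & b,
    "NOT": lambda a: 1 - a,
    "BUF": lambda a: a,
}


def evaluate_level(level_gates, values):
    """Evaluate every gate of one level concurrently.

    level_gates: {gate name: (gate kind, [fan-in node names])}
    values:      already-known node values from earlier levels.
    """
    def eval_gate(item):
        name, (kind, fanin) = item
        return name, GATE_FUNCS[kind](*(values[n] for n in fanin))

    # Gates in the same level do not depend on each other, so order is irrelevant.
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(eval_gate, level_gates.items()))


level1 = {
    "AND31": ("AND", ["PI1", "U1"]),
    "INV32": ("NOT", ["PI2"]),
    "BUF33": ("BUF", ["U2"]),
}
values = {"PI1": 1, "U1": 1, "PI2": 0, "U2": 0}
level1_out = evaluate_level(level1, values)
```

Because the three level 1 gates read only level 0 values, they can be evaluated in any order or simultaneously, which is the property the levelization is meant to guarantee.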
FIG. 8 shows a schematic diagram of a simulation process 800 performed by a graphics processor according to some embodiments of the present disclosure. In one embodiment, process 800 may be, for example, at least a part of 704. Process 800 may therefore be executed by an accelerator such as a GPU for logic simulation and may be, for example, at least a part of one implementation of the logic simulation 210 in FIG. 2. Accordingly, the aspects described above with reference to FIGS. 1-5 and FIG. 7 may apply to process 800 and are not repeated here.
At 802, the GPU invokes process 900 to compute the values of the combinational logic gates in the circuit. For example, referring to FIG. 4, within one time frame, after the primary inputs of level 0 are obtained, the GPU may compute the output logic values of the combinational logic gates in level 1 and level 2 that follow level 0. The computation flow is described below with reference to FIG. 9. If the logic circuit contains no combinational logic gates, this step may be omitted.
At 804, the GPU determines whether the next level needs to be computed. For example, referring to FIG. 5, after computing the logic output of the second buffer 34 in the second level of frame 0, the GPU determines whether the next level is to be computed. For example, if the GPU determines that the sequential logic gates in the level-0 circuit of the following frame 1 need to be computed, it proceeds to 806. If no computation is needed, for example as determined from the flag array or because the last level of the time frames has been reached, the output of the combinational logic is returned to the CPU.
At 806, the GPU invokes process 1000 to compute the values of the sequential logic gates in the circuit. As described above, because the sequential logic gates and the combinational logic gates are placed in different levels, the logic outputs of the sequential logic gates need to be computed here. In one embodiment, the process 1000 shown in FIG. 10 may be used to compute the logic outputs of the sequential logic gates. If the logic circuit contains no sequential logic gates, this step may be omitted.
At 808, the GPU determines whether the number of loop iterations has exceeded a limit. In some cases, the computation of sequential logic may loop more times than the limit allows, so that a correct or deterministic logic output cannot be obtained. Therefore, in some embodiments, the number of times the sequential logic gates are computed during logic simulation may be bounded. Exceeding the threshold number indicates that the computation of the sequential logic gates may be erroneous. In that case, the process may proceed to 810, where the GPU reports the error. For example, the GPU may send error information to the CPU, and the CPU may display it on a screen or record it in a simulation log. If the threshold is not exceeded, the GPU returns to 802 to compute the logic output values of the next round of combinational logic gates. At 812, the GPU returns the computed logic outputs to the CPU. In some embodiments, process 800 may omit 808 and 810, for example when the logic circuit has no sequential logic gates or in other cases where no error reporting is needed.
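The control flow of 802 through 812 can be summarized in a short Python sketch. The function name `run_process_800` and its four parameters are hypothetical names introduced here; the callables stand in for process 900, process 1000, and the check at 804, and the error raised at 810 is modeled as an exception rather than a message sent to the CPU.

```python
def run_process_800(eval_combinational, eval_sequential,
                    next_level_needed, max_iterations):
    """Outer simulation loop, a sketch of steps 802-812.

    All four parameters are placeholders (assumptions), not names
    taken from the disclosure.
    """
    iterations = 0
    while True:
        outputs = eval_combinational()       # 802: process 900
        if not next_level_needed():          # 804: nothing left to compute
            return outputs                   # 812: hand result back
        eval_sequential()                    # 806: process 1000
        iterations += 1
        if iterations > max_iterations:      # 808: loop count exceeded?
            # 810: report the error; here modeled as an exception.
            raise RuntimeError("sequential logic did not settle "
                               f"within {max_iterations} iterations")
```

Bounding the loop this way turns a non-settling sequential computation into an explicit error instead of an endless simulation.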
FIG. 9 shows a schematic diagram of a combinational logic computation process 900 of the process in FIG. 8 according to some embodiments of the present disclosure. In one embodiment, process 900 may be, for example, at least a part of 802. Process 900 may therefore be executed by an accelerator such as a GPU for logic simulation and may be, for example, at least a part of one implementation of the logic simulation 210 in FIG. 2. Accordingly, the aspects described above with reference to FIGS. 1-5 and FIGS. 7-8 may apply to process 900 and are not repeated here.
At 902, the GPU sets the first level of combinational logic as the current level. For example, the GPU sets level 1 in FIG. 4 as the current level. At 904, the GPU determines whether the current level needs to be computed. In one embodiment, the GPU may make this determination by checking the level flag corresponding to the current level. If the level flag has a first value, the input set of that level circuit in the time frame differs at least in part from its input set in the immediately preceding time frame. Because the inputs differ, the output set of that level circuit may also differ at least in part from the output set of the preceding frame. In this case, the GPU may determine that the current level needs to be computed. If the level flag has a second value different from the first value, the GPU may determine that the current level does not need to be computed. In one embodiment, the GPU may then use the output set of the current level in the preceding frame as its output set in the current frame and proceed to 908, thereby saving computing resources and computation time. Although a level flag is used here to determine whether computation is needed, this is merely illustrative and does not limit the scope of the present disclosure. Alternatively, whether recomputation is needed may be determined according to whether the logic inputs received by the current level in the current frame have changed relative to the logic inputs in the preceding frame. When at least some of the inputs have changed, it may be determined that the current level needs to be recomputed.
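The decision at 904 can be sketched in Python as follows. The constants `NEEDS_EVAL` and `CLEAN` are illustrative stand-ins for the "first value" and "second value" of the flag, and the function names and the `compute` callback are assumptions; the second function shows the alternative of comparing inputs directly instead of consulting a flag.

```python
NEEDS_EVAL = 1   # stands in for the "first value" of a flag (assumed encoding)
CLEAN = 0        # stands in for the "second value" (assumed encoding)

def level_outputs(level_gates, level_flag, prev_outputs, compute):
    """Return the level's outputs for the current frame.

    If the flag says nothing changed, reuse the previous frame's outputs
    instead of recomputing. `compute` is a hypothetical per-gate callback.
    """
    if level_flag == NEEDS_EVAL:
        return {g: compute(g) for g in level_gates}
    return {g: prev_outputs[g] for g in level_gates}

def inputs_changed(cur_inputs, prev_inputs):
    """Alternative check without flags: did any input differ from last frame?"""
    return any(cur_inputs[k] != prev_inputs.get(k) for k in cur_inputs)
```

For instance, `level_outputs(["g1"], CLEAN, {"g1": 1}, compute)` returns the stored value without ever calling `compute`.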
Further, in some embodiments, even when the GPU determines that the current level needs to be computed, a further determination may be made for each logic gate in the current level. As with the level circuits, the GPU may store a logic gate flag array with entries corresponding to the individual logic gates in each level. By consulting the relevant entry in the logic gate flag array, it can be determined whether each logic gate needs to be computed in the time frame. In one embodiment, if the logic gate flag corresponding to a logic gate has the first value, the input set of that logic gate in the time frame differs at least in part from its input set in the immediately preceding time frame. Because the inputs differ, the output set of that logic gate may also differ at least in part from the output set of the preceding frame. In this case, the GPU may determine that the logic gate needs to be computed. If the logic gate flag has a second value different from the first value, the GPU may determine that the logic gate does not need to be computed. In one embodiment, the GPU may then use the output set of the logic gate in the preceding frame as its output set in the current frame and proceed to 908, thereby saving computing resources and computation time. Although a logic gate flag is used here to determine whether computation is needed, this is merely illustrative and does not limit the scope of the present disclosure. Alternatively, whether recomputation is needed may be determined according to whether the logic inputs received in the current frame have changed relative to the logic inputs in the preceding frame. When at least some of the inputs have changed, it may be determined that the logic gate needs to be recomputed.
It will be understood that although level flags and logic gate flags are used in the description above, this is merely illustrative and does not limit the scope of the present disclosure. In some embodiments, the GPU may perform the computation directly without determining whether it is needed, for example when looking up the flag takes longer than computing the logic gate itself.
At 906, upon determining that the logic output of a logic gate needs to be computed, the GPU computes the value of the logic gate for which the current thread is responsible and updates the flag array. In one embodiment, for example, once the logic output of a logic gate has been computed, if that output differs from the logic output in the preceding time frame, the value of the logic gate has changed and the flag array needs to be updated. First, the flag corresponding to the logic gate is set to the second value, indicating that the gate has been computed. Then, the flags corresponding to all logic gates in the direct fan-out of the logic gate (i.e., those driven directly by it) are set to the first value, indicating that they need to be computed. In addition, the GPU sets the level flags of the levels containing the gates in the direct fan-out to the first value, indicating that these level circuits need to be computed in the next round. These flag operations do not require atomic operations. Moreover, in some embodiments, for example where no flags are used, the flag array need not be updated.
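The flag update at 906 can be sketched in Python as follows. The data layout (dictionaries for values, flags, fan-out, and level membership) and all names are assumptions chosen for readability; a GPU implementation would more likely use flat arrays. The text describes the update only for the case where the output changed, so this sketch leaves the flags untouched otherwise.

```python
NEEDS_EVAL = 1   # assumed encoding of the "first value"
CLEAN = 0        # assumed encoding of the "second value"

def evaluate_and_mark(gate, new_value, values, gate_flags,
                      level_flags, fanout, level_of):
    """Store a gate's new output and mark what it drives for re-evaluation.

    values:      gate -> output in the previous frame (updated in place)
    fanout:      gate -> gates driven directly by it (hypothetical layout)
    level_of:    gate -> index of the level that contains it
    Returns True if the output changed.
    """
    changed = values.get(gate) != new_value
    values[gate] = new_value
    if changed:
        gate_flags[gate] = CLEAN                 # this gate is done
        for succ in fanout.get(gate, ()):
            gate_flags[succ] = NEEDS_EVAL        # direct fan-out must rerun
            level_flags[level_of[succ]] = NEEDS_EVAL
    return changed
```

One plausible reason the flag writes need no atomic operations is that every thread marking a shared fan-out gate stores the same value, so the result is the same whichever write lands last; the passage itself does not state the reason.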
At 908, the GPU waits for all GPU threads to reach this point. Different threads in the GPU may perform, in parallel, the simulation computations of different combinational logic gates in the same level in the same time frame. To avoid logic errors, the GPU may wait at 908 until the logic computations of all combinational logic gates in the same level are complete before proceeding to 910. At 910, the GPU determines whether the current level is the last level of combinational logic gates. If it is not, for example because level 2 follows level 1 in FIG. 4, there are combinational logic gates in other levels that have not yet been computed. Therefore, at 912, the GPU sets the next level as the current level and repeats 904-910. If it is the last level, then at 914 the GPU returns to 804 in FIG. 8. Although one computation process for combinational logic gates is shown here, it is merely illustrative and does not limit the present disclosure. In other embodiments, some parts of process 900 may be omitted, for example the update of the flag array. In addition, process 900 may include other steps not shown in the figure, and the present disclosure is not limited in this regard.
FIG. 10 shows a schematic diagram of a sequential logic computation process 1000 of the process in FIG. 8 according to some embodiments of the present disclosure. In one embodiment, process 1000 may be, for example, at least a part of 806. Process 1000 may therefore be executed by an accelerator such as a GPU for logic simulation and may be, for example, at least a part of one implementation of the logic simulation 210 in FIG. 2. Accordingly, the aspects described above with reference to FIGS. 1-5 and FIGS. 7-9 may apply to process 1000 and are not repeated here.
At 1002, the GPU sets the level containing the sequential logic gates as the current level. For example, the GPU sets level 0 of frame 1 in FIG. 5 as the current level. At 1004, the GPU determines whether the current level needs to be computed. In one embodiment, the GPU may make this determination by checking the level flag corresponding to the current level. If the level flag has the first value, the factor combination of that level circuit in the time frame differs at least in part from the corresponding combination in the immediately preceding time frame. Because the output of a sequential logic gate depends not only on its inputs but also on its output in the preceding time frame, the clock signal, control-terminal signals, and so on, the term "factor combination" is used here to cover the input set coming from the primary inputs or driven by the preceding level, the output set of the sequential logic gate in the preceding frame, the clock signal, the control signals, and so on.
Because the factor combinations differ, the output set of that level circuit may also differ at least in part from the output set of the preceding frame. In this case, the GPU may determine that the current level needs to be computed. If the level flag has a second value different from the first value, the GPU may determine that the current level does not need to be computed. In one embodiment, the GPU may then use the output set of the current level in the preceding frame as its output set in the current frame and proceed to 1008, thereby saving computing resources and computation time. Although a level flag is used here to determine whether computation is needed, this is merely illustrative and does not limit the scope of the present disclosure. Alternatively, whether recomputation is needed may be determined according to whether the factor combination of the current level in the current frame has changed relative to the factor combination in the preceding frame. When at least some of the inputs have changed, it may be determined that the current level needs to be recomputed.
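To make the notion of a factor combination concrete, the following Python sketch models one possible sequential gate. The choice of a rising-edge D flip-flop with an active-high asynchronous reset is an assumption for illustration only; the passage does not specify the types of sequential gates, and the function and parameter names are hypothetical.

```python
def dff_output(d, clk, prev_clk, reset, prev_q):
    """Output of a hypothetical rising-edge D flip-flop for one time frame.

    The result depends on the data input, the clock (and its previous
    value), a control signal, and the previous output: together these
    form the gate's factor combination.
    """
    if reset:                         # control signal takes priority
        return 0
    if prev_clk == 0 and clk == 1:    # rising clock edge: capture data
        return d
    return prev_q                     # otherwise hold the stored value

def factors_changed(cur_factors, prev_factors):
    """Recompute only if any element of the factor combination differs."""
    return cur_factors != prev_factors
```

For example, with `d = 1` and no reset, the output follows `d` only in the frame where the clock rises; in a frame where the clock stays high, the gate keeps `prev_q`.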
Further, in some embodiments, even when the GPU determines that the current level needs to be computed, a further determination may be made for each logic gate in the current level. As with the level circuits, the GPU may store a logic gate flag array with entries corresponding to the individual logic gates in each level. By consulting the relevant entry in the logic gate flag array, it can be determined whether each logic gate needs to be computed in the time frame. In one embodiment, if the logic gate flag corresponding to a logic gate has the first value, the input set of that logic gate in the time frame differs at least in part from its input set in the immediately preceding time frame. Because the inputs differ, the output set of that logic gate may also differ at least in part from the output set of the preceding frame. In this case, the GPU may determine that the logic gate needs to be computed. If the logic gate flag has a second value different from the first value, the GPU may determine that the logic gate does not need to be computed. In one embodiment, the GPU may then use the output set of the logic gate in the preceding frame as its output set in the current frame and proceed to 1008, thereby saving computing resources and computation time. Although a logic gate flag is used here to determine whether computation is needed, this is merely illustrative and does not limit the scope of the present disclosure. Alternatively, whether recomputation is needed may be determined according to whether the logic inputs received in the current frame have changed relative to the logic inputs in the preceding frame. When at least some of the inputs have changed, it may be determined that the logic gate needs to be recomputed.
It will be understood that although level flags and logic gate flags are used in the description above, this is merely illustrative and does not limit the scope of the present disclosure. In some embodiments, the GPU may perform the computation directly without determining whether it is needed, for example when looking up the flag takes longer than computing the logic gate itself.
At 1006, upon determining that the logic output of a logic gate needs to be computed, the GPU computes the value of the logic gate for which the current thread is responsible and updates the flag array. In one embodiment, for example, once the logic output of a logic gate has been computed, if that output differs from the logic output in the preceding time frame, the value of the logic gate has changed and the flag array needs to be updated. First, the flag corresponding to the logic gate is set to the second value, indicating that the gate has been computed. Then, the flags corresponding to all logic gates in the direct fan-out of the logic gate (i.e., those driven directly by it) are set to the first value, indicating that they need to be computed. In addition, the GPU sets the level flags of the levels containing the gates in the direct fan-out to the first value, indicating that these level circuits need to be computed in the next round. These flag operations do not require atomic operations. Moreover, in some embodiments, for example where no flags are used, the flag array need not be updated.
At 1008, the GPU waits for all GPU threads to reach this point. Different threads in the GPU may perform, in parallel, the simulation computations of different sequential logic gates in the same level in the same time frame. To avoid logic errors, the GPU may wait at 1008 until the logic computations of all sequential logic gates in the same level are complete before proceeding to 1010. At 1010, the GPU returns to 808 or 802. Where the sequential logic gates are divided into multiple levels, the GPU may also determine whether the current level is the last level of sequential logic gates. If it is not, there are sequential logic gates in other levels that have not yet been computed, so the GPU sets the next level as the current level and repeats 1002-1008. Although one computation process for sequential logic gates is shown here, it is merely illustrative and does not limit the present disclosure. In other embodiments, some parts of process 1000 may be omitted, for example the update of the flag array. In addition, process 1000 may include other steps not shown in the figure, and the present disclosure is not limited in this regard.
FIG. 11 shows a schematic block diagram of some examples of logic simulation according to some embodiments of the present disclosure. As shown in FIG. 11, the test vector set may include p test vectors, where p is an integer greater than 1. Test vector 1 has r time frames, test vector 2 has s time frames, and test vector p has t time frames, where r, s, and t are integers greater than 0 and may be the same as or different from one another. In one embodiment, when the logic circuit contains a large number of logic gates (for example, more than a threshold number), a "logic gate parallel" approach may be used in which different logic gates are assigned to different GPU threads, while the test vectors are processed serially. That is, after the parallel logic simulation of the multiple logic gates is completed for one test vector, the parallel logic simulation of the multiple logic gates is performed for the next test vector, until logic simulation has been completed for all test vectors in the test vector set.
Although parallel logic simulation is shown here with the test vectors processed serially, this is merely illustrative and does not limit the scope of the present disclosure. The test vectors may also be processed in parallel. For example, in some cases, for the logic simulation of one logic circuit, the test vector set may have multiple test vectors, each test vector may have multiple time frames, and the numbers of time frames of the test vectors in the set may be the same or different. For example, the test vector set may include a first test vector and a second test vector, and the logic simulation output set may include a first output subset and a second output subset. Generating the logic simulation output set then further includes generating the first output subset based on the hierarchical data and the first test vector, and generating the second output subset based on the hierarchical data and the second test vector. The generation of the first output subset and the generation of the second output subset are performed in parallel.
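Processing test vectors in parallel, with each vector yielding its own output subset, can be sketched in Python as follows. A thread pool again stands in for GPU threads, and the circuit is reduced to a single hypothetical NAND of two inputs per time frame so that the example stays short; neither the circuit nor the function names come from the disclosure.

```python
from concurrent.futures import ThreadPoolExecutor

def simulate_vector(vector):
    """Simulate one test vector frame by frame.

    `vector` is a list of (a, b) input pairs, one per time frame.
    The circuit here is a stand-in: output = NOT(a AND b).
    """
    return [1 - (a & b) for a, b in vector]

def simulate_vector_set(vectors):
    """Produce one output subset per test vector, computed concurrently."""
    with ThreadPoolExecutor() as pool:
        # pool.map preserves the order of the input vectors.
        return list(pool.map(simulate_vector, vectors))
```

The vectors may have different numbers of time frames, as in the description; each output subset then simply has the same length as its vector.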
For example, FIG. 12 shows a schematic block diagram of some examples of logic simulation according to other embodiments of the present disclosure. If the number of logic gates in the logic circuit is large (for example, higher than a first threshold number) and the number of test vectors in the test vector set is also large (for example, higher than a second threshold number), the threads in the GPU may be divided evenly into multiple thread blocks, each containing multiple threads and each responsible for computing one test vector. The thread blocks for different test vectors run in parallel; for example, each thread block processes the logic simulation for its test vector in parallel with the others. For the same test vector, the threads within a thread block also perform the logic simulation of multiple logic gates in parallel. For example, based on the hierarchical data and the first test vector, a first plurality of outputs of multiple logic gates located in the same level circuit in the same time frame are computed in parallel, the first plurality of outputs being associated with the first output subset. In other words, the logic simulation output set that includes the first output subset may depend on the first plurality of outputs. In this way, the logic simulation time can be further reduced. Although parallel logic simulation is shown here with the test vectors processed in parallel, this is merely illustrative and does not limit the scope of the present disclosure. Alternatively, when the number of logic gates in the logic circuit is lower than the first threshold number but the number of test vectors in the test vector set is higher than the second threshold number, a "test vector parallel" approach may be used for logic simulation. That is, the GPU uses multiple GPU threads to simulate the test vectors in parallel, with each thread responsible for simulating one test vector.
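The choice among the three parallelization schemes described above can be summarized in a small Python sketch. The function name, the mode labels, and the handling of boundary cases are assumptions: the description only speaks of counts "higher than" or "lower than" the thresholds and does not say what happens when both counts are small, so that case is returned as unspecified.

```python
def choose_mode(num_gates, num_vectors, gate_threshold, vector_threshold):
    """Pick a parallelization scheme from gate and test-vector counts.

    Threshold values and mode labels are illustrative assumptions.
    """
    many_gates = num_gates > gate_threshold
    many_vectors = num_vectors > vector_threshold
    if many_gates and many_vectors:
        # One thread block per vector, threads in the block share the gates.
        return "block_per_vector"
    if many_gates:
        # Gates spread over threads, vectors handled one after another.
        return "gate_parallel"
    if many_vectors:
        # One thread per test vector.
        return "vector_parallel"
    return "unspecified"   # not covered by the description
```

A hybrid schedule, as discussed for mixed serial and parallel use of test vectors, could call such a selector repeatedly as the number of remaining vectors shrinks.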
Although examples of serial and parallel simulation using multiple test vectors are shown above, they are merely examples and do not limit the scope of the present disclosure. In some embodiments, hybrid simulation may be performed with multiple test vectors. That is, for some of the test vectors, the parallel logic simulation shown in FIG. 12 may be used; after some GPU threads have finished the parallel simulation, logic simulation may continue with the remaining test vectors. Alternatively, some test vectors may first be used for serial logic simulation, followed by parallel logic simulation. In still other embodiments, the test vectors may be used in a serial, then parallel, then serial manner. The present disclosure does not limit the specific hybrid scheme, which may be adjusted dynamically based on factors such as the number of test vectors, the number of threads in the GPU, and the processing capability, so as to reduce the logic simulation time as much as possible.
FIG. 13 shows a schematic block diagram of an electronic device 1300 according to some embodiments of the present disclosure. The electronic device 1300 may include multiple modules for performing the corresponding steps of the methods discussed with reference to FIGS. 6-12. As shown in FIG. 13, in one embodiment, the electronic device 1300 includes a receiving unit 1302 and a generating unit 1304. The receiving unit 1302 is configured to receive hierarchical data of a logic circuit, the hierarchical data representing multiple level circuits of the logic circuit, the level circuits being divided based on the connection relationships among the logic gates in the logic circuit. The generating unit 1304 is configured to generate a logic simulation output set based on the hierarchical data and a test vector set for the logic circuit, where generating the logic simulation output set includes computing, in parallel, the logic output values of multiple logic gates located in the same level circuit in the same time frame, the logic output values being associated with the logic simulation output set. By using an accelerator such as a GPU to compute the outputs of multiple logic gates in the same time frame in parallel, the processing time can be significantly reduced compared with conventional serial computation on a CPU. In addition, leveling the logic circuit ensures the correctness of the logic simulation, because the logic gates processed in parallel are located in the same level circuit and have no causal dependence on one another with respect to the logic simulation results.
In one embodiment, the test vector set includes a first test vector and a second test vector, and the logic simulation output set includes a first output subset and a second output subset. The generating unit 1304 is further configured to: generate the first output subset based on the hierarchical data and the first test vector; and generate the second output subset based on the hierarchical data and the second test vector, where the generation of the first output subset and the generation of the second output subset are performed in parallel. Performing logic simulation for multiple test vectors in parallel, in addition to computing multiple logic gates of the same level in the same time frame in parallel, can further shorten the logic simulation time.
In one embodiment, the generating unit 1304 is further configured to: based on the hierarchical data and the first test vector, compute in parallel a first plurality of outputs of multiple logic gates located in the same hierarchical circuit in the same time frame, the first plurality of outputs being associated with the first output subset. Performing logic simulation for multiple test vectors in parallel, in addition to computing multiple logic gates of the same level in the same time frame in parallel, can further shorten the logic simulation time.
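One well-known way to process several test vectors at once is bit-parallel (parallel-pattern) simulation, in which bit k of every signal word belongs to test vector k. The sketch below uses that technique purely as an illustration; the disclosure does not state that vectors are packed this way, and the names `pack`, `unpack` and `simulate_packed` as well as the netlist format are assumptions.

```python
def pack(vectors):
    """Pack the test vectors column-wise: bit k of word j is input j of vector k."""
    n_inputs = len(vectors[0])
    return [sum(vec[j] << k for k, vec in enumerate(vectors)) for j in range(n_inputs)]


def unpack(word, n_vectors):
    """Return the per-vector 0/1 values stored in one packed signal word."""
    return [(word >> k) & 1 for k in range(n_vectors)]


def simulate_packed(levels, netlist, primary_inputs, vectors):
    """Simulate all test vectors together, one bitwise operation per gate.

    levels is a list of gate-name lists in evaluation order (hypothetical format);
    netlist maps a gate name to (operation, [input signal names]).
    """
    mask = (1 << len(vectors)) - 1  # keeps NOT results within the used bits
    ops = {
        "AND": lambda a, b: a & b,
        "OR": lambda a, b: a | b,
        "XOR": lambda a, b: a ^ b,
        "NOT": lambda a: ~a & mask,
    }
    values = dict(zip(primary_inputs, pack(vectors)))
    for gates in levels:
        for g in gates:
            op, ins = netlist[g]
            values[g] = ops[op](*(values[i] for i in ins))
    return values
```

For a half adder with sum `s` and carry `c`, simulating the four vectors `[0,0]`, `[0,1]`, `[1,0]`, `[1,1]` in a single pass yields the sum bits `[0, 1, 1, 0]` and the carry bits `[0, 0, 0, 1]` after unpacking.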
In one embodiment, the generating unit 1304 is further configured to: determine whether a first hierarchical circuit in the logic circuit is to be computed in a first time frame; if it is determined that the first hierarchical circuit is to be computed, compute a first output set of the first hierarchical circuit, the logic simulation output set being associated with the first output set; and if it is determined that the first hierarchical circuit is not to be computed, use the output set of the first hierarchical circuit in the time frame immediately preceding the first time frame as the first output set of the first hierarchical circuit in the first time frame. By determining whether a hierarchical circuit needs to be computed in a given time frame, the computation time for all logic gates of that hierarchical circuit can be saved whenever no computation is needed. This reduces the total logic simulation time and also lowers the consumption of computing resources, so that the limited computing resources can be used for other logic gates that do need to be computed, which in turn further reduces the total logic simulation time.
In one embodiment, the generating unit 1304 is further configured to: determine whether the value, in the first time frame, of a first hierarchy flag bit corresponding to the first hierarchical circuit is a first value, where the first value of the first hierarchy flag bit indicates that the input set of the first hierarchical circuit in the first time frame differs at least partially from its input set in the time frame immediately preceding the first time frame; and determine whether all inputs or the factor combination of the first hierarchical circuit in the first time frame have changed compared with all inputs or the factor combination of the first hierarchical circuit in the preceding time frame. Using a flag bit makes it simple and efficient to determine whether the logic gates in a hierarchical circuit need to be computed. Because the flag-bit check is simple and efficient, it also reduces the time needed to determine whether the logic circuit needs to be computed, which in turn further reduces the total logic simulation time.
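The level-skipping decision can be illustrated with a small sketch. It assumes a purely combinational circuit, so the factor combination for sequential gates (clock signal, control signal, previous output) is not modeled, and the flag is derived by comparing the level's inputs with the previous time frame. The function name `simulate_frame` and the data layout are hypothetical.

```python
# Hypothetical gate library: operation name -> function on 0/1 values.
OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "NOT": lambda a: a ^ 1,
}


def simulate_frame(levels, netlist, inputs, prev_values):
    """Evaluate one time frame, skipping every level whose inputs did not change.

    prev_values holds all signal values of the previous frame, or None for the
    first frame. Returns the new values and the indices of the computed levels.
    """
    values = dict(inputs)
    computed_levels = []
    for index, gates in enumerate(levels):
        level_inputs = {i for g in gates for i in netlist[g][1]}
        # Level flag: set when at least one input differs from the previous frame.
        flag = prev_values is None or any(
            values[i] != prev_values[i] for i in level_inputs
        )
        if flag:
            for g in gates:
                op, ins = netlist[g]
                values[g] = OPS[op](*(values[i] for i in ins))
            computed_levels.append(index)
        else:
            # Reuse the outputs of the previous time frame for the whole level.
            for g in gates:
                values[g] = prev_values[g]
    return values, computed_levels
```

When only an input of a later level changes between two frames, the earlier levels are skipped and their previous outputs are carried over unchanged.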
In one embodiment, the generating unit 1304 is further configured to: determine whether a first logic gate in the first hierarchical circuit is to be computed in the first time frame; if it is determined that the first logic gate is to be computed, compute a first output of the first logic gate; and if it is determined that the first logic gate is not to be computed, use the output of the first logic gate in the time frame immediately preceding the first time frame as the first output of the first logic gate in the first time frame, the logic simulation output set being associated with the first output. By determining whether individual logic gates need to be computed in a given time frame, the computation time for those logic gates can be saved whenever no computation is needed. This reduces the total logic simulation time and also lowers the consumption of computing resources, so that the limited computing resources can be used for other logic gates that do need to be computed, which in turn further reduces the total logic simulation time.
In one embodiment, the generating unit 1304 is further configured to determine whether all inputs of the first logic gate in the first time frame have changed compared with all inputs of the first logic gate in the time frame immediately preceding the first time frame, which includes: determining whether the value, in the first time frame, of a first logic gate flag bit corresponding to the first logic gate is a first value, where the first value of the first logic gate flag bit indicates that the inputs of the first logic gate in the first time frame differ at least partially from its inputs in the preceding time frame; and determining whether all inputs or the factor combination of the first logic gate in the first time frame have changed compared with all inputs or the factor combination of the first logic gate in the preceding time frame. Using a flag bit makes it simple and efficient to determine whether a logic gate in a hierarchical circuit needs to be computed. Because the flag-bit check is simple and efficient, it also reduces the time needed to determine whether the logic circuit needs to be computed, which in turn further reduces the total logic simulation time.
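A per-gate flag can be maintained by marking the fan-out gates of every signal whose value changes. The sketch below shows one possible scheme for a combinational circuit. It is an assumption about how such flags could be set, not the mechanism claimed, and the names `build_fanout` and `simulate_frame` are hypothetical.

```python
# Hypothetical gate library: operation name -> function on 0/1 values.
OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "NOT": lambda a: a ^ 1,
}


def build_fanout(netlist):
    """Map every signal to the gates that read it."""
    fanout = {}
    for gate, (_, ins) in netlist.items():
        for signal in ins:
            fanout.setdefault(signal, []).append(gate)
    return fanout


def simulate_frame(levels, netlist, fanout, inputs, prev_values):
    """Evaluate one time frame, computing only gates whose flag is set.

    prev_values holds all signal values of the previous frame ({} for the first
    frame). Returns the new values and the list of gates actually computed.
    """
    values = dict(prev_values)  # unflagged gates keep their previous output
    flagged = set()
    for name, value in inputs.items():
        if prev_values.get(name) != value:
            flagged.update(fanout.get(name, []))  # an input of these gates changed
        values[name] = value
    computed = []
    for gates in levels:
        for gate in gates:
            if gate not in flagged:
                continue
            op, ins = netlist[gate]
            new_value = OPS[op](*(values[i] for i in ins))
            computed.append(gate)
            if new_value != values.get(gate):
                values[gate] = new_value
                flagged.update(fanout.get(gate, []))  # propagate the change
    return values, computed
```

Because the fan-out of a gate always lies in a later level, every flag is set before the flagged gate is visited in the same frame.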
In one embodiment, based on the hierarchical data and the test vector set for the logic circuit, the generating unit 1304 is further configured to: compute the outputs of the hierarchical circuits sequentially, time frame by time frame; and generate the logic simulation output set based on the outputs of the hierarchical circuits in the last time frame. Computing the logic outputs of each level frame by frame preserves the causal order of the logic operations in the simulation and thereby improves the accuracy of the logic simulation.
In one embodiment, the generating unit 1304 is further configured to: determine whether the number of times the output of a hierarchical circuit that includes sequential logic gates has been computed in one time frame exceeds a threshold number; and if the number of computations in that time frame exceeds the threshold number, generate a fault indication, the fault indication indicating that a fault has occurred in the logic simulation. Setting a threshold on the number of iterations prevents the logic simulation from getting stuck in an erroneous or infinite loop over sequential logic gates and allows simulation errors to be reported promptly, which reduces the logic simulation time.
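The frame-by-frame computation with an iteration threshold can be sketched as follows. The example models a level that contains feedback (a NOR-based SR latch) by re-evaluating it until its outputs settle. The names `settle`, `run_frames`, `latch_update` and `SimulationFault`, and the default threshold of 8, are assumptions made for illustration.

```python
class SimulationFault(Exception):
    """Fault indication: a level with feedback did not settle within the threshold."""


def settle(update, state, max_iterations=8):
    """Re-evaluate a level until its outputs stop changing.

    Raises SimulationFault when max_iterations evaluations were performed and a
    further evaluation would still be required.
    """
    for count in range(1, max_iterations + 1):
        new_state = update(state)
        if new_state == state:
            return state, count
        state = new_state
    raise SimulationFault(f"no stable state after {max_iterations} evaluations")


def run_frames(frames, initial_state, make_update, max_iterations=8):
    """Process the time frames in order; the last frame's state is the output."""
    state = initial_state
    for inputs in frames:
        state, _ = settle(make_update(inputs), state, max_iterations)
    return state


def nor(a, b):
    return 1 - (a | b)


def latch_update(inputs):
    """SR latch built from two cross-coupled NOR gates (illustrative example)."""
    def update(state):
        return {
            "q": nor(inputs["r"], state["qn"]),
            "qn": nor(inputs["s"], state["q"]),
        }
    return update
```

A latch driven through set, hold and reset frames settles in a few evaluations per frame, whereas an inverter that feeds its own input never settles and triggers the fault indication instead of looping forever.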
FIG. 14 shows a schematic block diagram of an example device 1400 that may be used to implement embodiments of the present disclosure. The device 1400 may be used to implement the electronic device 10 or 1300. As shown, the device 1400 includes a computing unit 1401, which can perform various appropriate actions and processing according to computer program instructions stored in a random access memory (RAM) 1403 and/or a read-only memory (ROM) 1402, or computer program instructions loaded from a storage unit 1408 into the RAM 1403 and/or the ROM 1402. The RAM 1403 and/or the ROM 1402 may also store various programs and data required for the operation of the device 1400. The computing unit 1401 and the RAM 1403 and/or the ROM 1402 are connected to one another through a bus 1404. An input/output (I/O) interface 1405 is also connected to the bus 1404.
Multiple components of the device 1400 are connected to the I/O interface 1405, including: an input unit 1406, such as a keyboard or a mouse; an output unit 1407, such as various types of displays and speakers; a storage unit 1408, such as a magnetic disk or an optical disc; and a communication unit 1409, such as a network card, a modem, or a wireless communication transceiver. The communication unit 1409 allows the device 1400 to exchange information and data with other devices through a computer network such as the Internet and/or various telecommunication networks.
The computing unit 1401 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Examples of the computing unit 1401 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 1401 performs the methods and processing described above, for example methods 600, 700, 800, 900 and/or 1000. For example, in some embodiments, methods 600, 700, 800, 900 and/or 1000 may be implemented as a computer software program that is tangibly embodied in a machine-readable medium, such as the storage unit 1408. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 1400 via the RAM and/or the ROM and/or the communication unit 1409. When the computer program is loaded into the RAM and/or the ROM and executed by the computing unit 1401, one or more steps of methods 600, 700, 800, 900 and/or 1000 described above may be performed. Alternatively, in other embodiments, the computing unit 1401 may be configured to perform methods 600, 700, 800, 900 and/or 1000 in any other suitable manner (for example, by means of firmware).
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, so that the program code, when executed by the processor or controller, causes the functions and operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine, or entirely on a remote machine or server.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in connection with, an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In addition, although operations are depicted in a particular order, this should be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve the desired results. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although the above discussion contains several specific implementation details, these should not be construed as limitations on the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations, separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.

Claims (19)

  1. A method for logic simulation, characterized in that the method comprises:
    receiving hierarchical data of a logic circuit, the hierarchical data representing a plurality of hierarchical circuits of the logic circuit, the plurality of hierarchical circuits being divided based on connection relationships of a plurality of logic gates in the logic circuit; and
    generating a logic simulation output set based on the hierarchical data and a test vector set for the logic circuit, wherein generating the logic simulation output set comprises computing in parallel logic output values of a plurality of logic gates located in a same hierarchical circuit in a same time frame, the logic output values of the plurality of logic gates being associated with the logic simulation output set.
  2. The method according to claim 1, characterized in that the test vector set comprises a first test vector and a second test vector, the logic simulation output set comprises a first output subset and a second output subset, and generating the logic simulation output set further comprises:
    generating the first output subset based on the hierarchical data and the first test vector; and
    generating the second output subset based on the hierarchical data and the second test vector, the generation of the first output subset and the generation of the second output subset being performed in parallel.
  3. The method according to claim 2, characterized in that generating the first output subset comprises:
    based on the hierarchical data and the first test vector, computing in parallel a first plurality of outputs of a plurality of logic gates located in a same hierarchical circuit in a same time frame, the first plurality of outputs being associated with the first output subset.
  4. The method according to any one of claims 1-3, characterized in that generating the logic simulation output set further comprises:
    determining whether a first hierarchical circuit in the logic circuit is to be computed in a first time frame;
    if it is determined that the first hierarchical circuit is to be computed, computing a first output set of the first hierarchical circuit, the logic simulation output set being associated with the first output set; and
    if it is determined that the first hierarchical circuit is not to be computed, using an output set of the first hierarchical circuit in the time frame immediately preceding the first time frame as the first output set of the first hierarchical circuit in the first time frame.
  5. The method according to claim 4, characterized in that determining whether the first hierarchical circuit in the logic circuit is to be computed in the first time frame comprises at least one of the following:
    determining whether a value, in the first time frame, of a first hierarchy flag bit corresponding to the first hierarchical circuit is a first value, the first value of the first hierarchy flag bit indicating that an input set of the first hierarchical circuit in the first time frame differs at least partially from an input set of the first hierarchical circuit in the time frame immediately preceding the first time frame; and
    determining whether all inputs or a factor combination of the first hierarchical circuit in the first time frame have changed compared with all inputs or a factor combination of the first hierarchical circuit in the time frame immediately preceding the first time frame, the factor combination of the first hierarchical circuit comprising, for a sequential logic gate in the first hierarchical circuit, its input in the first time frame, a clock signal, a control signal, and its output in the preceding time frame.
  6. The method according to claim 4 or 5, characterized in that computing the first output set of the first hierarchical circuit comprises:
    determining whether a first logic gate in the first hierarchical circuit is to be computed in the first time frame;
    if it is determined that the first logic gate is to be computed, computing a first output of the first logic gate; and
    if it is determined that the first logic gate is not to be computed, using an output of the first logic gate in the time frame immediately preceding the first time frame as the first output of the first logic gate in the first time frame, the logic simulation output set being associated with the first output.
  7. The method according to claim 6, characterized in that determining whether the first logic gate in the first hierarchical circuit is to be computed in the first time frame comprises at least one of the following:
    determining whether all inputs of the first logic gate in the first time frame have changed compared with all inputs of the first logic gate in the time frame immediately preceding the first time frame, which comprises: determining whether a value, in the first time frame, of a first logic gate flag bit corresponding to the first logic gate is a first value, the first value of the first logic gate flag bit indicating that the inputs of the first logic gate in the first time frame differ at least partially from the inputs of the first logic gate in the time frame immediately preceding the first time frame; and
    determining whether all inputs or a factor combination of the first logic gate in the first time frame have changed compared with all inputs or a factor combination of the first logic gate in the time frame immediately preceding the first time frame, the factor combination of the first logic gate comprising an input of a sequential logic gate in the first time frame, a clock signal, a control signal, and an output in the preceding time frame.
  8. The method according to any one of claims 1-7, characterized in that generating the logic simulation output set based on the hierarchical data and the test vector set for the logic circuit further comprises:
    computing outputs of the hierarchical circuits sequentially, time frame by time frame; and
    generating the logic simulation output set based on the outputs of the hierarchical circuits in the last time frame.
  9. The method according to claim 8, characterized in that computing the outputs of the hierarchical circuits sequentially, time frame by time frame, comprises:
    determining whether a number of times an output of a hierarchical circuit comprising a sequential logic gate has been computed in one time frame exceeds a threshold number; and
    if the number of times the output of the hierarchical circuit comprising the sequential logic gate has been computed in the one time frame exceeds the threshold number, generating a fault indication, the fault indication indicating that a fault has occurred in the logic simulation.
  10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a plurality of programs, the plurality of programs being configured to be executed by one or more processors and comprising instructions for performing the method according to any one of claims 1-9.
  11. A computer program product, characterized in that the computer program product comprises a plurality of programs, the plurality of programs being configured to be executed by one or more processors and comprising instructions for performing the method according to any one of claims 1-9.
  12. An electronic device, characterized in that the electronic device comprises:
    one or more processors; and
    a memory comprising computer instructions which, when executed by the one or more processors of the electronic device, cause the electronic device to perform the method according to any one of claims 1-9.
  13. An electronic device, characterized in that the electronic device comprises:
    a receiving unit, configured to receive hierarchical data of a logic circuit, the hierarchical data representing a plurality of hierarchical circuits of the logic circuit, the plurality of hierarchical circuits being divided based on connection relationships of a plurality of logic gates in the logic circuit; and
    a generating unit, configured to generate a logic simulation output set based on the hierarchical data and a test vector set for the logic circuit, wherein generating the logic simulation output set comprises computing in parallel logic output values of a plurality of logic gates located in a same hierarchical circuit in a same time frame, the logic output values of the plurality of logic gates being associated with the logic simulation output set.
  14. The electronic device according to claim 13, characterized in that the test vector set comprises a first test vector and a second test vector, the logic simulation output set comprises a first output subset and a second output subset, and the generating unit is further configured to:
    generate the first output subset based on the hierarchical data and the first test vector; and
    generate the second output subset based on the hierarchical data and the second test vector, the generation of the first output subset and the generation of the second output subset being performed in parallel.
  15. The electronic device according to claim 14, characterized in that the generating unit is further configured to:
    based on the hierarchical data and the first test vector, compute in parallel a first plurality of outputs of a plurality of logic gates located in a same hierarchical circuit in a same time frame, the first plurality of outputs being associated with the first output subset.
  16. The electronic device according to any one of claims 13-15, characterized in that the generating unit is further configured to:
    determine whether a first hierarchical circuit in the logic circuit is to be computed in a first time frame;
    if it is determined that the first hierarchical circuit is to be computed, compute a first output set of the first hierarchical circuit, the logic simulation output set being associated with the first output set; and
    if it is determined that the first hierarchical circuit is not to be computed, use an output set of the first hierarchical circuit in the time frame immediately preceding the first time frame as the first output set of the first hierarchical circuit in the first time frame.
  17. The electronic device according to claim 16, characterized in that the generating unit is further configured to:
    determine whether a first logic gate in the first hierarchical circuit is to be computed in the first time frame;
    if it is determined that the first logic gate is to be computed, compute a first output of the first logic gate; and
    if it is determined that the first logic gate is not to be computed, use an output of the first logic gate in the time frame immediately preceding the first time frame as the first output of the first logic gate in the first time frame, the logic simulation output set being associated with the first output.
  18. The electronic device according to any one of claims 13-17, characterized in that, based on the hierarchical data and the test vector set for the logic circuit, the generating unit is further configured to:
    compute outputs of the hierarchical circuits sequentially, time frame by time frame; and
    generate the logic simulation output set based on the outputs of the hierarchical circuits in the last time frame.
  19. The electronic device according to claim 18, characterized in that the generating unit is further configured to:
    determine whether a number of times an output of a hierarchical circuit comprising a sequential logic gate has been computed in one time frame exceeds a threshold number; and
    if the number of times the output of the hierarchical circuit comprising the sequential logic gate has been computed in the one time frame exceeds the threshold number, generate a fault indication, the fault indication indicating that a fault has occurred in the logic simulation.
PCT/CN2021/126318 2021-10-26 2021-10-26 Method, apparatus and device for logic simulation WO2023070301A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202180100883.1A CN117751295A (en) 2021-10-26 2021-10-26 Method, device and equipment for logic simulation
PCT/CN2021/126318 WO2023070301A1 (en) 2021-10-26 2021-10-26 Method, apparatus and device for logic simulation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/126318 WO2023070301A1 (en) 2021-10-26 2021-10-26 Method, apparatus and device for logic simulation

Publications (1)

Publication Number Publication Date
WO2023070301A1 (en) 2023-05-04

Family

ID=86158957

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/126318 WO2023070301A1 (en) 2021-10-26 2021-10-26 Method, apparatus and device for logic simulation

Country Status (2)

Country Link
CN (1) CN117751295A (en)
WO (1) WO2023070301A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07294604A (en) * 1994-04-28 1995-11-10 Nec Corp Testing circuit for lsi
US20030110457A1 (en) * 2001-10-30 2003-06-12 Benoit Nadeau-Dostie Method and program product for designing hierarchical circuit for quiescent current testing and circuit produced thereby
US7181705B2 (en) * 2000-01-18 2007-02-20 Cadence Design Systems, Inc. Hierarchical test circuit structure for chips with multiple circuit blocks
CN110007200A (en) * 2019-01-11 2019-07-12 华为技术有限公司 A kind of test circuit, equipment and system
CN112394281A (en) * 2021-01-20 2021-02-23 北京燧原智能科技有限公司 Test signal parallel loading conversion circuit and system-on-chip

Also Published As

Publication number Publication date
CN117751295A (en) 2024-03-22

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21961683; Country of ref document: EP; Kind code of ref document: A1)
WWE WIPO information: entry into national phase (Ref document number: 202180100883.1; Country of ref document: CN)