WO2017185388A1 - Apparatus and method for generating a random vector obeying a certain distribution - Google Patents

Apparatus and method for generating a random vector obeying a certain distribution

Info

Publication number
WO2017185388A1
Authority
WO
WIPO (PCT)
Prior art keywords
random vector
instruction
vector generation
generation instruction
random
Prior art date
Application number
PCT/CN2016/080970
Other languages
English (en)
French (fr)
Inventor
刘道福
张潇
刘少礼
陈天石
陈云霁
Original Assignee
北京中科寒武纪科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京中科寒武纪科技有限公司
Priority to EP16899899.5A (patent EP3451158B1)
Publication of WO2017185388A1
Priority to US16/171,284 (patent US11501158B2)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036 Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001 Arithmetic instructions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3838 Dependency mechanisms, e.g. register scoreboarding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Definitions

  • The present invention relates to the field of computer technology, and in particular to an apparatus and method for generating a random vector obeying a certain distribution. A random vector of arbitrary length obeying a certain distribution can be generated according to an instruction; many distributions are possible, including but not limited to the uniform distribution and the Gaussian distribution.
  • In a random vector, each value is generated according to some random distribution.
  • In the restricted Boltzmann machine of artificial neural networks, for example, there is a step that samples a vector formed by a group of neurons: each neuron in the vector is compared with a random number, and the result is 1 if the neuron's value is greater than the random number and 0 otherwise. This requires a random vector, the same size as the neuron vector, whose elements are random numbers obeying a certain distribution.
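The sampling step described above can be sketched in software as follows. This is only an illustration of why a random vector of the same length as the neuron vector is needed; the function name `sample_neurons`, the uniform distribution on [0, 1), and the fixed seed are assumptions made for the example and are not taken from the patent.

```python
import random

def sample_neurons(activations, rng):
    """Binarize each activation by comparing it with its own random number."""
    # One random number per neuron: the random vector matches the input length.
    random_vector = [rng.random() for _ in activations]
    # A neuron becomes 1 if its value exceeds the random number, otherwise 0.
    return [1 if a > r else 0 for a, r in zip(activations, random_vector)]

rng = random.Random(0)  # fixed seed, for reproducibility of the sketch only
states = sample_neurons([0.0, 1.0, 0.5], rng)
```

An activation of 0.0 can never exceed a number drawn from [0, 1), and an activation of 1.0 always does, so only the third neuron's state depends on the random draw.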
  • As another example, when a set of 32-bit single-precision floating-point numbers is converted to 16-bit half-precision floating-point numbers using the random carry method, the truncated part must be compared with a random number satisfying a certain distribution, and a carry of 1 is applied if the truncated part is greater than the random number. This likewise requires a set of random numbers satisfying a certain distribution, that is, a random vector.
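A minimal sketch of the random carry idea follows. It does not manipulate floating-point mantissa bits; instead it rounds to a grid of spacing `step`, which is an assumption chosen to keep the example short. The name `stochastic_round` is likewise hypothetical.

```python
import math
import random

def stochastic_round(x, step, rng):
    """Round x to a multiple of step, carrying up with probability equal
    to the truncated fraction."""
    lower = math.floor(x / step) * step
    truncated = (x - lower) / step      # the part that would be cut off, in [0, 1)
    # Carry when the truncated part is greater than the random number.
    return lower + step if truncated > rng.random() else lower

rng = random.Random(1)  # fixed seed, illustrative only
samples = [stochastic_round(2.3, 0.5, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)  # stays close to 2.3 on average
```

Each result is either 2.0 or 2.5, yet the average stays near the original value, which is the property that makes this rounding mode attractive and the reason one random number per converted element is needed.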
  • In the prior art, one of the most commonly used methods for generating random vectors is to generate random numbers satisfying a certain distribution one by one on a general-purpose processor.
  • However, this method generates only one random number at a time and is inefficient when a large quantity is required.
  • In addition, several instructions must work together to generate each random number.
  • In view of this, the present invention provides an apparatus and method for generating random vectors obeying a certain distribution, capable of generating a random vector of any length that satisfies a certain distribution.
  • The distribution and the length are selected by the instruction from a variety of distributions and arbitrary lengths.
  • According to a first aspect of the present invention, an apparatus for generating a random vector is provided, comprising:
  • a storage unit configured to store vector data related to a random vector generation instruction;
  • a register unit configured to store scalar data related to the random vector generation instruction;
  • a control unit configured to decode the random vector generation instruction and to control its execution; and
  • a random vector generation unit configured to generate a random vector obeying a specified distribution according to the decoded random vector generation instruction;
  • wherein the random vector generation unit is a customized hardware circuit.
  • Preferably, the scalar data stored by the register unit includes the random vector storage address, the random vector length, and the distribution parameters associated with the random vector generation instruction, wherein the random vector storage address is an address in the storage unit.
  • Preferably, the control unit comprises:
  • an instruction queue module configured to store the decoded random vector generation instructions in order and to acquire the scalar data related to each random vector generation instruction.
  • Preferably, the control unit comprises:
  • a dependency processing unit configured to determine, before the random vector generation unit acquires the current random vector generation instruction, whether that instruction has a dependency on an earlier operation instruction that has not finished executing.
  • Preferably, the control unit comprises:
  • a storage queue module configured to hold the current random vector generation instruction temporarily while it has a dependency on an earlier operation instruction that has not finished executing, and to send the held instruction to the random vector generation unit once the dependency is eliminated.
  • Preferably, the apparatus further comprises:
  • an instruction cache unit configured to store random vector generation instructions to be executed; and
  • an input/output unit configured to store vector data related to the random vector generation instruction in the storage unit, or to obtain such vector data from the storage unit.
  • Preferably, the random vector generation instruction includes an operation code and an operation field;
  • the operation code indicates that a random vector generation operation for a specified distribution is to be performed;
  • the operation field includes an immediate value and/or a register number indicating the scalar data related to random vector generation, wherein the register number points to an address in the register unit.
  • Preferably, the storage unit is a scratchpad memory.
  • According to a second aspect of the present invention, an apparatus for generating a random vector is provided, comprising:
  • an instruction fetch module configured to fetch the next random vector generation instruction to be executed from the instruction sequence and to pass it to the decoding module;
  • a decoding module configured to decode the random vector generation instruction and to pass the decoded instruction to the instruction queue module;
  • an instruction queue module configured to hold the decoded random vector generation instruction temporarily and to obtain the scalar data related to it from the instruction itself or from the scalar registers; after the scalar data is obtained, the random vector generation instruction is sent to the dependency processing unit;
  • a scalar register file including a plurality of scalar registers for storing scalar data related to random vector generation instructions;
  • a dependency processing unit configured to determine whether the random vector generation instruction has a dependency on an earlier operation instruction that has not finished executing; if so, the random vector generation instruction is sent to the storage queue module, and if not, it is sent to the random vector generation unit;
  • a storage queue module configured to store a random vector generation instruction that has a dependency on an earlier operation instruction and to send it to the random vector generation unit after the dependency is released;
  • a random vector generation unit configured to generate a random vector obeying a specified distribution according to the received random vector generation instruction;
  • a scratchpad memory for storing the generated random vector; and
  • an input/output access module for directly accessing the scratchpad memory, responsible for writing the generated random vector into the scratchpad memory.
  • Preferably, the random vector generation unit is a customized hardware circuit.
  • According to a third aspect of the present invention, a method for generating a random vector is provided, comprising:
  • the instruction fetch module fetches the next random vector generation instruction to be executed from the instruction sequence and passes it to the decoding module;
  • the decoding module decodes the random vector generation instruction and passes the decoded instruction to the instruction queue module;
  • the instruction queue module holds the decoded random vector generation instruction temporarily and obtains the scalar data related to its operation from the instruction itself or from the scalar registers; after the scalar data is obtained, the random vector generation instruction is sent to the dependency processing unit;
  • the dependency processing unit determines whether the random vector generation instruction has a dependency on an earlier operation instruction that has not finished executing; if so, the instruction is sent to the storage queue module, and if not, it is sent to the random vector generation unit;
  • the storage queue module stores a random vector generation instruction that has a dependency on an earlier operation instruction and, after the dependency is released, sends it to the random vector generation unit;
  • the random vector generation unit generates a random vector obeying the specified distribution according to the received random vector generation instruction and writes the generated random vector into the scratchpad memory through the input/output access module.
  • The random vector generation apparatus and method provided by the invention implement the complete process of a streamlined random vector generation instruction in a customized hardware circuit; that is, a random vector generation operation can be carried out by a single streamlined random vector generation instruction.
  • By temporarily storing the vector data involved in the computation in a scratchpad memory (Scratchpad Memory), vector data of different widths can be supported more flexibly and effectively, and the customized random number generation unit can generate random data obeying various distributions more efficiently, which improves the performance of algorithms that require large numbers of random vectors.
  • The instructions used in the present invention are streamlined: one instruction generates a whole random vector.
  • The present invention can be applied in the following scenarios (including but not limited to): electronic products such as data processing devices, robots, computers, printers, scanners, telephones, tablets, smart terminals, mobile phones, driving recorders, navigators, sensors, cameras, cloud servers, camcorders, projectors, watches, headsets, mobile storage, and wearable devices; vehicles such as aircraft, ships, and road vehicles; household appliances such as televisions, air conditioners, microwave ovens, refrigerators, rice cookers, humidifiers, washing machines, electric lights, gas stoves, and range hoods; and various types of medical equipment including nuclear magnetic resonance instruments, B-mode ultrasound scanners, and electrocardiographs.
  • FIG. 1 is a schematic structural diagram of a random vector generating apparatus provided by the present invention.
  • FIG. 2 is a schematic diagram of a format of a random vector generation instruction provided by the present invention.
  • FIG. 3 is a schematic structural diagram of a random vector generating apparatus according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of a method for generating a random vector according to an embodiment of the present invention.
  • The present invention provides an apparatus for generating a random vector, comprising a storage unit, a register unit, a control unit, and a random vector generation unit. The storage unit stores vectors; the register unit stores the vector storage address and other scalar parameters; the control unit performs decoding and controls each module according to the instruction. According to the random vector generation instruction, the random vector generation unit obtains the vector storage address, distribution parameters, length, and other parameters from the instruction or from the register unit, and then generates a random vector of the specified length that satisfies the specified distribution.
  • The storage unit is a scratchpad memory. The present invention temporarily stores the generated vector data in the scratchpad memory, so that vector data of different widths can be supported more flexibly and effectively during operation, improving the execution performance of algorithms that require large amounts of random vector data.
  • FIG. 1 is a schematic structural diagram of a device for generating a random vector provided by the present invention. As shown in FIG. 1, the device includes:
  • The storage unit may be a scratchpad memory (Scratchpad Memory) capable of supporting vector data of different sizes; the present invention temporarily stores the data needed for computation in the scratchpad memory, so that the device can support data of different widths more flexibly and efficiently.
  • The vector data related to the random vector generation instruction includes the generated random vector.
  • The scratchpad memory can be implemented with a variety of memory devices, such as SRAM, DRAM, eDRAM, memristor, 3D-DRAM, and nonvolatile memory.
  • a register unit for storing scalar data related to random vector generation, such as the storage address of the generated random vector, and for storing scalar data used in other operations, such as the distribution parameters specified by a random vector generation instruction, for example the upper and lower bounds of a uniform distribution or the mean and variance of a Gaussian distribution.
  • The storage address of the generated random vector is the address in the storage unit at which the vector is stored;
  • a control unit configured to decode the random vector generation instruction and to control its execution. Control of the execution process is achieved mainly by controlling the behavior of each module in the device. In one embodiment, the control unit reads the prepared instruction, decodes it to generate control signals, and sends them to the other modules in the device, which perform the corresponding operations according to the control signals they receive.
  • a random vector generation unit that generates, according to the instruction, a random vector of a specified length obeying a specified distribution.
  • This unit is a vector operation unit that generates each element of the random vector.
  • The random vector generation unit is a customized hardware circuit, including but not limited to an FPGA, a CGRA, an application-specific integrated circuit (ASIC), an analog circuit, a memristor, and so on. Working together with the other modules in the device, the random vector generation unit can generate random vectors of any length that obey a specified distribution.
  • The random vector generation unit actually contains a plurality of parallel random number generation modules, each of which generates one random number per execution. When a random vector is generated, the parallel random number generation modules therefore produce a succession of random vector segments until a random vector of the required length is obtained. Each random number generation module consists of two main parts so that it can generate random numbers of an arbitrary distribution.
  • The random vector generation unit comprises two modules:
  • an LFSR module, which is used to generate uniformly distributed random numbers; true random numbers can also be generated by detecting resistor thermal noise;
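As an illustration of how a linear feedback shift register (LFSR) yields uniformly distributed numbers, the sketch below uses a 16-bit Fibonacci LFSR with taps at bits 16, 14, 13, and 11. The register width, the tap positions, the seed, and the scaling to the interval (0, 1) are all assumptions for this example; the patent does not specify them.

```python
def lfsr16_step(state):
    """Advance a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11) by one step."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)

def lfsr_uniform_vector(seed, length):
    """Return `length` pseudo-random values in (0, 1) taken from successive states."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("an all-zero state never changes")
    values = []
    for _ in range(length):
        state = lfsr16_step(state)
        values.append(state / 65536.0)  # scale the 16-bit state into (0, 1)
    return values

vector = lfsr_uniform_vector(0xACE1, 8)  # hypothetical seed
```

With these taps the register cycles through all 65535 non-zero states before repeating, so the scaled values cover the interval evenly over a full period. In hardware, several such registers with different seeds could run side by side, one per parallel generation module.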
  • The apparatus further includes an instruction cache unit configured to store operation instructions to be executed.
  • An instruction remains cached in the instruction cache unit while it executes. When an instruction finishes executing, if it is also the oldest uncommitted instruction in the instruction cache unit, it is committed.
  • The control unit in the apparatus further includes an instruction queue module configured to store the decoded random vector generation instructions in order. Using the operation fields of a random vector generation instruction, it acquires the instruction-related scalar data, such as the specified distribution parameters, the random vector length, and the random vector storage address; once these are filled in, the random vector generation instruction is sent to the dependency processing unit.
  • The control unit of the apparatus further includes a dependency processing unit configured to determine, before the random vector generation unit acquires an instruction, whether the random vector generation instruction has a dependency on an earlier instruction that has not finished executing, for example whether both access the same vector storage address. If there is a dependency, the random vector generation instruction is stored in the storage queue module, and after the instruction it depends on has finished executing, the storage queue module provides the random vector generation instruction to the random vector generation unit; otherwise, the instruction is provided to the random vector generation unit directly. Specifically, when random vector generation instructions access the scratchpad memory, successive instructions may access the same block of storage. To ensure the correctness of the execution result, if the current instruction is found to have a data dependency on an earlier instruction, it must wait in the storage queue until the dependency is eliminated.
  • The control unit of the apparatus further includes a storage queue module containing an ordered queue. An instruction that has a data dependency on an earlier instruction is held in the ordered queue until the dependency is eliminated, after which the module provides the instruction to the random vector generation unit.
  • The apparatus further includes an input/output unit configured to store the generated random vector in the storage unit; it is also responsible for reading vector data from, and writing vector data to, the memory.
  • The instruction set of the device is streamlined: a single instruction can generate a random vector of arbitrary length.
  • The device fetches the instruction, decodes it, and sends it to the instruction queue for storage. According to the decoding result, each parameter of the instruction is obtained; a parameter may be written directly in an operation field of the instruction, or read from the register specified by the register number in the operation field.
  • The dependency processing unit determines whether the data the instruction actually needs has a dependency on an earlier instruction, which decides whether the instruction can be sent to the execution unit immediately. Once a dependency on earlier data is found, the instruction must wait until the instruction it depends on has finished executing before it can be sent to the operation unit. In the customized operation unit the instruction executes quickly, and the result, that is, the generated random vector, is written back to the address provided by the instruction, which completes the instruction.
  • The random vector generation instruction includes an operation code and at least one operation field. The operation code indicates which distribution the generated random vector obeys, such as a Gaussian distribution or a uniform distribution. The operation field indicates the data information of the instruction, which may be an immediate value or a register number. For example, when a vector is to be generated, the start address and length of the output vector and the parameters of the distribution are read from the corresponding registers according to the register numbers, and the random vector generated from the distribution is then stored at the specified address.
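The choice between an immediate value and a register number in each operation field can be sketched as below. The `Operand` class, the `resolve` helper, the register contents, and the ordering of the four fields are hypothetical and serve only to illustrate the lookup; the patent does not define a concrete encoding.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Operand:
    is_register: bool  # True: value is a register number; False: value is an immediate
    value: float

def resolve(operand, scalar_registers):
    """Return the scalar an operation field refers to."""
    if operand.is_register:
        return scalar_registers[int(operand.value)]
    return operand.value

# Hypothetical register contents: r0 holds a storage address, r1 a vector length.
scalar_registers = {0: 0x100, 1: 64}

# Four fields of a uniform-distribution instruction: address, length, lower, upper.
fields = [Operand(True, 0), Operand(True, 1), Operand(False, -1.0), Operand(False, 1.0)]
address, length, lower, upper = (resolve(f, scalar_registers) for f in fields)
```

Here the address and length come from registers while the two bounds are immediates, showing that the two forms can be mixed within one instruction.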
  • The following random vector generation instructions can be implemented:
  • Uniform distribution instruction (UNIF): according to this instruction, the device reads the upper and lower bound parameters of the uniform distribution, together with the size and storage address of the random vector to be generated, from the instruction or from the register file; the random vector generation unit then generates a random vector obeying that uniform distribution, and the result is written back to the specified storage address in the scratchpad memory.
  • Gaussian distribution instruction (GUS)
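A software model of what these two instructions do might look like the sketch below. The scratchpad is modeled as a plain Python list, and the function names `execute_unif` and `execute_gaussian` are hypothetical. The Gaussian parameters (mean and variance) follow the register unit description, since this text gives no detailed semantics for the Gaussian instruction itself.

```python
import math
import random

def execute_unif(scratchpad, address, length, lower, upper, rng):
    """Write `length` uniform samples from [lower, upper] starting at `address`."""
    for i in range(length):
        scratchpad[address + i] = rng.uniform(lower, upper)

def execute_gaussian(scratchpad, address, length, mean, variance, rng):
    """Write `length` Gaussian samples with the given mean and variance."""
    sigma = math.sqrt(variance)  # random.gauss expects a standard deviation
    for i in range(length):
        scratchpad[address + i] = rng.gauss(mean, sigma)

scratchpad = [0.0] * 16          # stand-in for the scratchpad memory
rng = random.Random(0)           # fixed seed, illustrative only
execute_unif(scratchpad, 4, 8, -1.0, 1.0, rng)
```

After the call, only addresses 4 through 11 hold new values, mirroring the write-back of the result to the storage address named by the instruction.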
  • FIG. 3 is a schematic structural diagram of a random vector generation apparatus according to an embodiment of the present invention.
  • The apparatus includes an instruction fetch module, a decoding module, an instruction queue, a scalar register file, a dependency processing unit, a storage queue, a random vector generation unit, a scratchpad memory, and an IO memory access module;
  • an instruction fetch module, which is responsible for fetching the next instruction to be executed from the instruction sequence and passing it to the decoding module;
  • a decoding module, which is responsible for decoding the instruction and passing the decoded instruction to the instruction queue;
  • an instruction queue module, which holds the instruction received from the decoding module temporarily and obtains the data for the instruction's operation from the instruction itself or from the scalar registers, including the length of the random vector to generate, its storage address, and the distribution parameters; after the scalar data is obtained, the instruction is sent to the dependency processing unit;
  • a scalar register file, which provides the scalar registers the device needs during operation;
  • the various scalar parameters can be given directly in the operation fields of the instruction or read from the scalar register file;
  • a dependency processing unit, which handles storage dependencies that may exist between a random vector generation instruction and earlier instructions that have not finished executing.
  • Random vector generation instructions access the scratchpad memory, for example to store the generated random vector, and successive instructions may access the same block of storage.
  • If a dependency exists, the instruction is sent to the storage queue module until the dependency is eliminated. The unit checks whether the storage range of the current instruction's input data overlaps the storage range of the output data of an earlier instruction that has not finished executing, where a storage range is determined by its start address and data length. An overlap means that the current instruction needs the result of the earlier instruction as input, so it must wait until the earlier instruction has finished before it can start executing. During this time the instruction is held in the storage queue module.
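The overlap test described above reduces to an interval intersection check. The sketch below assumes each storage range is a half-open interval given by a start address and a length; the function names `ranges_overlap` and `has_dependency` are hypothetical.

```python
def ranges_overlap(start_a, length_a, start_b, length_b):
    """True if [start_a, start_a + length_a) intersects [start_b, start_b + length_b)."""
    return start_a < start_b + length_b and start_b < start_a + length_a

def has_dependency(current_range, pending_ranges):
    """True if the current instruction's range overlaps any unfinished instruction's range."""
    start, length = current_range
    return any(ranges_overlap(start, length, s, n) for s, n in pending_ranges)

# Outputs of two earlier, unfinished instructions: (start address, length).
pending = [(0, 8), (18, 4)]
must_wait = has_dependency((16, 4), pending)  # overlaps addresses 18 and 19
```

Ranges that merely touch, such as (0, 8) and (8, 8), do not overlap under this convention, so back-to-back vectors in the scratchpad do not stall each other.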
  • a storage queue module, which is an ordered queue: instructions that have a data dependency on earlier instructions are held in the queue until the dependency is eliminated, after which the random vector generation instruction is sent to the random vector generation unit;
  • a random vector generation unit, implemented as a customized hardware circuit, including but not limited to an FPGA, a CGRA, an application-specific integrated circuit (ASIC), an analog circuit, and a memristor;
  • a scratchpad memory, which is a temporary storage device dedicated to vector data and capable of supporting vector data of different sizes; it can be used to store the generated random vectors;
  • an IO memory access module, which directly accesses the scratchpad memory and is responsible for reading data from, or writing data to, the scratchpad memory.
  • FIG. 4 is a flowchart of executing a uniform distribution instruction to generate a random vector satisfying a uniform distribution, according to an embodiment of the present invention. As shown in FIG. 4, executing the uniform distribution instruction includes:
  • The instruction fetch module fetches the random vector generation instruction and sends it to the decoding module.
  • The decoding module decodes the instruction and sends it to the instruction queue.
  • In the instruction queue, the scalar data corresponding to the four operation fields of the instruction are acquired from the instruction itself or from the scalar register file: the storage address of the generated random vector, the length of the generated random vector, and the upper and lower bounds of the uniform distribution.
  • The dependency processing unit analyzes whether the instruction has a data dependency on an earlier instruction that has not finished executing. If so, the instruction is sent to the storage queue module and waits there until it no longer has a data dependency on any earlier unfinished instruction. If not, the instruction is sent directly to the random vector generation unit.
  • The random vector generation unit uses its hardware circuit to generate, according to the upper and lower bound parameters, a random vector segment of a certain length that satisfies the uniform distribution.
  • The random vector generation unit keeps generating random vector segments of that length until a random vector of the specified length is complete.
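The last two steps, generating fixed-size segments until the requested length is reached, can be sketched as follows. The parameter `lanes` stands for the number of parallel random number generation modules and is an assumption, as is the function name `generate_in_segments`; Python's `random.uniform` stands in for the hardware generators.

```python
import random

def generate_in_segments(length, lanes, lower, upper, rng):
    """Build a uniform random vector segment by segment, `lanes` elements at a time."""
    vector = []
    segment_lengths = []
    while len(vector) < length:
        # The final segment may be shorter than the number of lanes.
        segment_length = min(lanes, length - len(vector))
        segment = [rng.uniform(lower, upper) for _ in range(segment_length)]
        vector.extend(segment)
        segment_lengths.append(segment_length)
    return vector, segment_lengths

vector, segment_lengths = generate_in_segments(10, 4, 0.0, 1.0, random.Random(0))
```

With 4 lanes and a requested length of 10, the loop runs three times and produces segments of 4, 4, and 2 elements, which is how a fixed-width unit can serve vectors of arbitrary length.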
  • In summary, the present invention provides a random vector generation device which, together with the corresponding instructions, addresses the growing number of computing tasks in the computer field that require generating random vectors satisfying a certain distribution.
  • Compared with existing conventional solutions, the present invention offers simple instructions, convenient use, flexible support for vector length, and ample on-chip buffering.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Complex Calculations (AREA)
  • Executing Machine-Instructions (AREA)
  • Image Analysis (AREA)

Abstract

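The present invention discloses an apparatus and method for generating a random vector. The apparatus comprises: a storage unit for storing vector data related to a random vector generation instruction; a register unit for storing scalar data related to the random vector generation instruction; a control unit for decoding the random vector generation instruction and controlling its execution; and a random vector generation unit for generating, according to the decoded random vector generation instruction, a random vector obeying a specified distribution; wherein the random vector generation unit is a customized hardware circuit. The random vector generation apparatus and method provided by the invention implement the complete process of a streamlined random vector generation instruction in a customized hardware circuit; that is, a random vector generation operation can be carried out by a single streamlined random vector generation instruction.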

Description

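Device and Method for Generating Random Vectors Conforming to a Certain Distribution
Technical Field
The present invention relates to the field of computer technology, and in particular to a device and method for generating random vectors that follow a certain distribution. According to an instruction, the device can generate a random vector of arbitrary length that follows a given distribution; multiple distributions are supported, including but not limited to the uniform distribution and the Gaussian distribution.
Background
A random vector is a vector in which every value is drawn from some random distribution. In the restricted Boltzmann machine of artificial neural networks, for example, one step samples a vector formed by a group of neurons: each neuron in the vector is compared with a random number, and the result is 1 if the neuron's value is greater than that random number and 0 otherwise. This requires a random vector of the same size as the neuron vector, composed of random numbers that follow a given distribution. As another example, when a group of 32-bit single-precision floating-point numbers is converted to 16-bit half-precision floating-point numbers using stochastic rounding, the truncated part must be compared with a random number drawn from a given distribution, and the value is rounded up if the truncated part is greater than the random number. This likewise requires a group of random numbers following a given distribution, that is, a random vector.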
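The two use cases in the background can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the thresholds are taken to be uniform on [0, 1), and stochastic rounding is shown for rounding to integers rather than for the 32-bit to 16-bit floating-point conversion mentioned in the text.

```python
import math
import random


def sample_neurons(activations, rng):
    """Binarize each neuron by comparing it with its own random number."""
    # The thresholds form the random vector: one value per neuron.
    thresholds = [rng.random() for _ in activations]
    return [1 if a > t else 0 for a, t in zip(activations, thresholds)]


def stochastic_round(values, rng):
    """Round up when the truncated part exceeds a uniform random number."""
    rounded = []
    for v in values:
        base = math.floor(v)
        truncated = v - base  # the part that plain truncation would drop
        rounded.append(base + 1 if truncated > rng.random() else base)
    return rounded


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
bits = sample_neurons([0.0, 0.3, 0.9, 1.0], rng)
ints = stochastic_round([2.0, 2.5, -1.25], rng)
```

In both functions the work is dominated by producing one random number per element, which is the cost the patent aims to move into hardware.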
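In the prior art, the most common way to generate a random vector is to generate random numbers following a given distribution one at a time on a general-purpose processor. However, this approach produces only one random number at a time and is inefficient when many are required. In addition, generating each random number requires several instructions working together.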
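Summary
In view of this, the present invention provides a device and method for generating random vectors that follow a certain distribution. The device can generate a random vector of arbitrary length following a given distribution, and the distribution and length are selected by the instruction.
According to a first aspect of the present invention, a device for generating random vectors is provided, the device comprising:
a storage unit for storing vector data related to a random vector generation instruction;
a register unit for storing scalar data related to the random vector generation instruction;
a control unit for decoding the random vector generation instruction and controlling its execution;
a random vector generation unit for generating, according to the decoded random vector generation instruction, a random vector that follows a specified distribution;
wherein the random vector generation unit is a custom hardware circuit.
Preferably, the scalar data stored in the register unit includes the random vector storage address, the random vector length, and the distribution parameters related to the random vector generation instruction, wherein the random vector storage address is an address in the storage unit.
Preferably, the control unit comprises:
an instruction queue module for storing decoded random vector generation instructions in order and obtaining the scalar data related to each random vector generation instruction.
Preferably, the control unit comprises:
a dependency processing unit for determining, before the random vector generation unit obtains the current random vector generation instruction, whether the current random vector generation instruction depends on an earlier operation instruction that has not finished executing.
Preferably, the control unit comprises:
a store queue module for temporarily holding the current random vector generation instruction when it depends on an earlier operation instruction that has not finished executing, and for sending the held instruction to the random vector generation unit once the dependency is cleared.
Preferably, the device further comprises:
an instruction cache unit for storing random vector generation instructions to be executed;
an input/output unit for storing vector data related to the random vector generation instruction in the storage unit, or for retrieving such vector data from the storage unit.
Preferably, the random vector generation instruction comprises an opcode and operand fields;
the opcode indicates that a random vector generation operation for a specified distribution is to be performed;
the operand fields comprise immediates and/or register numbers and indicate the scalar data related to random vector generation, wherein a register number points to an address in the register unit.
Preferably, the storage unit is a scratchpad memory.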
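According to a second aspect of the present invention, a device for generating random vectors is provided, the device comprising:
an instruction fetch module for fetching the next random vector generation instruction to be executed from an instruction sequence and passing it to a decoding module;
a decoding module for decoding the random vector generation instruction and passing the decoded instruction to an instruction queue module;
an instruction queue module for temporarily holding the decoded random vector generation instruction and obtaining the scalar data related to it from the instruction itself or from scalar registers, and, once the scalar data has been obtained, sending the instruction to a dependency processing unit;
a scalar register file comprising a plurality of scalar registers for storing scalar data related to the random vector generation instruction;
a dependency processing unit for determining whether the random vector generation instruction depends on an earlier operation instruction that has not finished executing, sending the instruction to a store queue module if such a dependency exists, and sending it to a random vector generation unit otherwise;
a store queue module for holding random vector generation instructions that depend on earlier operation instructions and sending each of them to the random vector generation unit once its dependency is cleared;
a random vector generation unit for generating, according to the received random vector generation instruction, a random vector that follows a specified distribution;
a scratchpad memory for storing the generated random vector;
an input/output access module for directly accessing the scratchpad memory and writing the generated random vector into it.
Preferably, the random vector generation unit is a custom hardware circuit.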
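According to a third aspect of the present invention, a method for generating random vectors is provided, the method comprising:
an instruction fetch module fetching the next random vector generation instruction to be executed from an instruction sequence and passing it to a decoding module;
the decoding module decoding the random vector generation instruction and passing the decoded instruction to an instruction queue module;
the instruction queue module temporarily holding the decoded random vector generation instruction and obtaining the scalar data related to its operation from the instruction itself or from scalar registers, and, once the scalar data has been obtained, sending the instruction to a dependency processing unit;
the dependency processing unit determining whether the random vector generation instruction depends on an earlier operation instruction that has not finished executing, sending the instruction to a store queue module if such a dependency exists, and sending it to a random vector generation unit otherwise;
the store queue module holding random vector generation instructions that depend on earlier operation instructions and sending each of them to the random vector generation unit once its dependency is cleared;
the random vector generation unit generating, according to the received random vector generation instruction, a random vector that follows a specified distribution, and writing the generated random vector into a scratchpad memory through an input/output access module.
With the custom hardware circuit, the random vector generation device and method provided by the present invention carry out the entire process of a concise random vector generation instruction, so that a single concise instruction suffices to perform a random vector generation operation. By temporarily holding the vector data involved in the computation in a scratchpad memory, the invention supports vector data of different widths more flexibly and effectively. The custom random number generation unit also generates random data following various distributions more efficiently, which improves the performance of algorithms that need large numbers of random vectors. The instructions used are concise: a single instruction suffices to generate a random vector.
The present invention can be applied in (but is not limited to) the following scenarios: electronic products such as data processing equipment, robots, computers, printers, scanners, telephones, tablet computers, smart terminals, mobile phones, dashboard cameras, navigation devices, sensors, webcams, cloud servers, cameras, video cameras, projectors, watches, earphones, mobile storage, and wearable devices; vehicles such as aircraft, ships, and automobiles; household appliances such as televisions, air conditioners, microwave ovens, refrigerators, rice cookers, humidifiers, washing machines, electric lamps, gas stoves, and range hoods; and medical equipment such as nuclear magnetic resonance instruments, B-mode ultrasound scanners, and electrocardiographs.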
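Brief Description of the Drawings
Fig. 1 is a schematic structural diagram of the random vector generation device provided by the present invention.
Fig. 2 is a schematic diagram of the format of the random vector generation instruction provided by the present invention.
Fig. 3 is a schematic structural diagram of the random vector generation device provided by an embodiment of the present invention.
Fig. 4 is a flowchart of the random vector generation method provided by an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings.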
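The present invention provides a device for generating random vectors, comprising a storage unit, a register unit, a control unit, and a random vector generation unit. The storage unit stores vectors. The register unit stores vector storage addresses and other scalar parameters. The control unit performs decoding and controls each module according to the instruction. The random vector generation unit obtains, according to the random vector generation instruction, the vector storage address, distribution parameters, length, and other parameters from the instruction or from the register unit, and then generates a random vector with the distribution and length specified by the instruction. In the present invention the storage unit is a scratchpad memory; holding the generated vector data in the scratchpad memory allows vector data of different widths to be supported more flexibly and effectively during computation, and improves the performance of algorithms that need large amounts of random vector data.
Fig. 1 is a schematic structural diagram of the device for generating random vectors provided by the present invention. As shown in Fig. 1, the device comprises:
a storage unit for storing vector data related to the random vector generation instruction. In one embodiment, the storage unit may be a scratchpad memory capable of holding vector data of different sizes; the invention holds the necessary computation data in the scratchpad memory so that the device supports data of different widths more flexibly and effectively. The vector data related to the random vector generation instruction includes the generated random vector. The scratchpad memory can be implemented with a variety of storage devices, such as SRAM, DRAM, eDRAM, memristors, 3D-DRAM, and non-volatile memory.
a register unit for storing scalar data related to random vector generation, such as the storage address of the generated random vector, and also other scalar data used during computation, for example the distribution parameters specified by the random vector generation instruction, such as the upper and lower bounds of a uniform distribution or the mean and variance of a Gaussian distribution. The storage address of the generated random vector is the address at which the vector is stored in the storage unit;
a control unit for decoding the random vector generation instruction and controlling its execution, mainly by controlling the behavior of each module in the device. In one embodiment, the control unit reads a prepared instruction, decodes it into control signals, and sends them to the other modules of the device, which perform the corresponding operations according to the control signals they receive.
a random vector generation unit, which generates, according to the instruction, a random vector of the specified length following the specified distribution. The unit is a vector operation unit that generates the elements of the random vector in parallel. The random vector generation unit is a custom hardware circuit, including but not limited to an FPGA, a CGRA, an application-specific integrated circuit (ASIC), an analog circuit, or a memristor-based circuit; by cooperating with the other modules of the device, it can generate a random vector of arbitrary length following the specified distribution.
Note that, to meet different requirements for random vector generation, the random vector generation unit actually contains multiple parallel random number generation modules, each of which produces one random number per pass. A random vector is therefore produced as a series of vector segments generated by the parallel modules, pass after pass, until the required length is reached. Each random number generation module contains two main parts so that it can generate random numbers of any distribution.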
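The segment-by-segment generation described above can be modelled in software as follows. This is a minimal sketch, not the hardware design: the lane count of 4, the function name `generate_vector`, and the use of Python's `random.Random` as a stand-in for each hardware random number module are all assumptions made for illustration.

```python
import random


def generate_vector(length, lanes, draw):
    """Build `length` values from segments of at most `lanes` values each."""
    if lanes < 1:
        raise ValueError("at least one lane is required")
    out = []
    while len(out) < length:
        # The last segment may be shorter than the number of lanes.
        segment_len = min(lanes, length - len(out))
        # Each active lane contributes one random number per pass.
        out.extend(draw(lane) for lane in range(segment_len))
    return out


# One independent generator per lane, each with its own seed.
lane_rngs = [random.Random(seed) for seed in range(4)]
vec = generate_vector(10, 4, lambda lane: lane_rngs[lane].uniform(-1.0, 1.0))
```

A length of 10 with 4 lanes takes three passes (4, 4, and 2 values), which is how an arbitrary vector length can be served by a fixed number of parallel modules.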
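In one embodiment, the random vector generation unit comprises two modules:
The first is an LFSR (linear-feedback shift register) module, which generates uniformly distributed random numbers; alternatively, true random numbers can be generated by sensing the thermal noise of a resistor.
The second is a Ziggurat algorithm module, which generates random numbers following an arbitrary distribution (such as the Gaussian distribution) and calls the LFSR module during execution. The module that generates uniformly distributed random numbers must be configured with a random seed at initialization, and different modules can be configured with different seeds.
According to one embodiment of the present invention, the device further comprises an instruction cache unit for storing operation instructions to be executed. While an instruction is executing it is also held in the instruction cache unit; once it finishes executing, it is committed if it is also the earliest uncommitted instruction in the instruction cache unit.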
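As an illustration of the LFSR module, the following sketch models a 16-bit Fibonacci LFSR in software and scales its state into a requested interval. The patent does not specify the register width, the tap positions, or the scaling step; the taps used here (the widely published maximal-length polynomial x^16 + x^14 + x^13 + x^11 + 1) and the class name `Lfsr16` are assumptions. The Ziggurat module is not modelled.

```python
class Lfsr16:
    """Software model of a 16-bit Fibonacci LFSR (assumed taps 16, 14, 13, 11)."""

    def __init__(self, seed):
        # An all-zero state would never change, so the seed must be non-zero.
        if not 0 < seed < (1 << 16):
            raise ValueError("seed must be a non-zero 16-bit value")
        self.state = seed

    def next_state(self):
        s = self.state
        # XOR the tapped bits to form the bit shifted in at the top.
        bit = (s ^ (s >> 2) ^ (s >> 3) ^ (s >> 5)) & 1
        self.state = (s >> 1) | (bit << 15)
        return self.state

    def uniform(self, lower, upper):
        # The state lies in 1..65535, so the result stays inside (lower, upper).
        return lower + (upper - lower) * self.next_state() / 65536.0


# Different modules can be given different seeds, as the text describes.
lane_a = Lfsr16(0xACE1)
lane_b = Lfsr16(0x1234)
samples = [lane_a.uniform(-1.0, 1.0) for _ in range(8)]
```

Taking each successive state as a sample keeps the sketch short, but consecutive states share 15 bits, so a practical design would more likely collect several output bits per sample.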
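According to one embodiment of the present invention, the control unit of the device further comprises an instruction queue module for storing decoded random vector generation instructions in order. It obtains the scalar data related to a random vector generation instruction, such as the specified distribution parameters, the random vector length, and the random vector storage address, through the operand fields of the instruction, fills these values into the instruction, and then sends the instruction to the dependency processing unit.
According to one embodiment of the present invention, the control unit of the device further comprises a dependency processing unit, which determines, before the random vector generation unit obtains an instruction, whether that random vector generation instruction depends on an earlier instruction that has not finished executing, for example whether the two access the same vector storage address. If a dependency exists, the random vector generation instruction is stored in the store queue module, and the store queue module provides it to the random vector generation unit after the operation instruction it depends on has finished executing; otherwise, the instruction is provided to the random vector generation unit directly. Specifically, when random vector generation instructions access the scratchpad memory, consecutive instructions may access the same block of storage. To guarantee correct results, an instruction that is found to have a data dependency on an earlier instruction must wait in the store queue until the dependency is cleared.
According to one embodiment of the present invention, the control unit of the device further comprises a store queue module containing an ordered queue. Instructions that have a data dependency on earlier instructions are held in this ordered queue until the dependency is cleared, after which the module provides the operation instruction to the random vector generation unit.
According to one embodiment of the present invention, the device further comprises an input/output unit for storing the generated random vector in the storage unit. It is also responsible for reading vector data from, and writing vector data to, memory.
According to one embodiment of the present invention, the instruction set of the device is designed to be concise: one instruction generates one random vector of arbitrary length.
When the device generates a random vector, it fetches an instruction, decodes it, and sends it to the instruction queue for storage. According to the decoding result, it obtains each parameter of the instruction; a parameter may be written directly in an operand field of the instruction, or read from the register designated by a register number in the operand field. The advantage of holding parameters in registers is that the instruction itself need not change: most loops can be implemented simply by using instructions to change the values in the registers, which greatly reduces the number of instructions needed to solve some practical problems. After all operands have been obtained, the dependency processing unit determines whether the data actually used by the instruction depends on an earlier instruction, which decides whether the instruction can be sent to the operation unit for execution immediately. If a dependency on earlier data is found, the instruction must wait until the instruction it depends on has finished before it can be sent to the operation unit. In the custom operation unit the instruction executes quickly, and the result, that is, the generated random vector, is written back to the address provided by the instruction, at which point the instruction is complete.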
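The point about register operands can be made concrete with a short sketch. The `Imm` and `Reg` wrappers, the `resolve` helper, and the eight-entry register file are hypothetical names and sizes chosen for illustration; the patent only states that an operand field holds either an immediate or a register number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Imm:
    value: float  # the parameter is written directly in the operand field


@dataclass(frozen=True)
class Reg:
    index: int  # the operand field names a scalar register


def resolve(operand, register_file):
    """Return the scalar value an operand field refers to."""
    if isinstance(operand, Imm):
        return operand.value
    return register_file[operand.index]


registers = [0] * 8
# One fixed instruction: address taken from register 0, length given inline.
operands = {"addr": Reg(0), "length": Imm(4)}

resolved = []
for base in (0, 4, 8):
    registers[0] = base  # only the register changes between iterations
    resolved.append({name: resolve(op, registers) for name, op in operands.items()})
```

The same instruction is reused on every loop iteration and writes to a different address each time, which is the saving in instruction count that the paragraph describes.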
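Fig. 2 is a schematic diagram of the format of the random vector generation instruction provided by the present invention. As shown in Fig. 2, the random vector generation instruction comprises one opcode and at least one operand field. The opcode indicates which distribution the generated random vector follows, such as the Gaussian distribution or the uniform distribution. The operand fields indicate the data information of the instruction, which may be an immediate or a register number. For example, to generate a vector, the start address for storing the output vector, the vector length, and the distribution parameters can be read from the registers designated by the register numbers, and the random vector generated according to that distribution is then stored at the specified address.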
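One way to picture an opcode followed by operand fields is as a packed instruction word. The layout below is purely hypothetical: the patent gives no bit widths or opcode values, so the 8-bit opcode, the four 14-bit fields (one flag bit plus 13 payload bits), and the numeric opcodes are assumptions chosen only so that the word fits in 64 bits.

```python
OPCODES = {"UNIF": 0x01, "GAUS": 0x02}  # hypothetical numeric values
FIELD_BITS = 14                          # assumed: 1 flag bit + 13 payload bits
PAYLOAD_MASK = (1 << 13) - 1


def encode(opcode, fields):
    """Pack an opcode and four (is_register, payload) fields into one integer."""
    if len(fields) != 4:
        raise ValueError("this sketch assumes exactly four operand fields")
    word = OPCODES[opcode]
    for is_register, payload in fields:
        if not 0 <= payload <= PAYLOAD_MASK:
            raise ValueError("payload does not fit in 13 bits")
        word = (word << FIELD_BITS) | (int(is_register) << 13) | payload
    return word


def decode(word):
    """Recover the opcode name and the four operand fields from a word."""
    fields = []
    for _ in range(4):
        raw = word & ((1 << FIELD_BITS) - 1)
        fields.append((bool(raw >> 13), raw & PAYLOAD_MASK))
        word >>= FIELD_BITS
    fields.reverse()  # fields were read from the least significant end
    names = {code: name for name, code in OPCODES.items()}
    return names[word], fields


# Address in register 0, immediate length 256, bounds in registers 1 and 2.
example = [(True, 0), (False, 256), (True, 1), (True, 2)]
word = encode("UNIF", example)
```

Real distribution parameters such as floating-point bounds would not fit in a 13-bit immediate, which is one practical reason to pass them through registers.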
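In one embodiment of the present invention, the following random vector generation instructions can be implemented:
Uniform distribution instruction (UNIF): according to this instruction, the device reads the upper bound and lower bound of the uniform distribution, together with the size and storage address of the random vector to be generated, from the instruction or from the register file. The random vector generation unit then generates a random vector following that uniform distribution and writes the result back to the specified storage address in the scratchpad memory.
Gaussian distribution instruction (GAUS): according to this instruction, the device reads the mean and variance of the Gaussian distribution, together with the size and storage address of the random vector to be generated, from the instruction or from the register file. The random vector generation unit then generates a random vector following that Gaussian distribution and writes the result back to the specified storage address in the scratchpad memory.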
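The observable effect of the two instructions can be approximated with a small software model. This is a sketch under assumptions: the scratchpad is modelled as a Python list, the function name `execute` and its parameter order are hypothetical, and Python's `random.uniform` and `random.gauss` stand in for the hardware generators. Note that GAUS takes a variance, while `random.gauss` expects a standard deviation.

```python
import math
import random


def execute(opcode, addr, length, p0, p1, scratchpad, rng):
    """Write `length` random values to scratchpad[addr:addr + length]."""
    if addr < 0 or addr + length > len(scratchpad):
        raise IndexError("destination range lies outside the scratchpad")
    if opcode == "UNIF":
        # p0 is the lower bound, p1 the upper bound.
        values = [rng.uniform(p0, p1) for _ in range(length)]
    elif opcode == "GAUS":
        # p0 is the mean, p1 the variance; convert to a standard deviation.
        sigma = math.sqrt(p1)
        values = [rng.gauss(p0, sigma) for _ in range(length)]
    else:
        raise ValueError(f"unknown opcode: {opcode}")
    scratchpad[addr:addr + length] = values


scratchpad = [0.0] * 16
rng = random.Random(42)
execute("UNIF", 2, 5, 1.0, 3.0, scratchpad, rng)
execute("GAUS", 8, 4, 5.0, 0.25, scratchpad, rng)
```

Only the destination range is modified, so two instructions that target disjoint ranges do not interfere with each other.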
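To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
Fig. 3 is a schematic structural diagram of the random vector generation device provided by an embodiment of the present invention. As shown in Fig. 3, the device comprises an instruction fetch module, a decoding module, an instruction queue, a scalar register file, a dependency processing unit, a store queue, a random vector generation unit, a scratchpad memory, and an IO memory access module;
an instruction fetch module, which fetches the next instruction to be executed from the instruction sequence and passes it to the decoding module;
a decoding module, which decodes the instruction and passes the decoded instruction to the instruction queue;
an instruction queue module, which temporarily holds the instruction received from the decoding module and obtains the data required by the instruction from the instruction itself or from the scalar registers, including the length of the random vector to be generated, its storage address, and the distribution parameters. Once the scalar data has been obtained, the instruction is sent to the dependency processing unit;
a scalar register file, which provides the scalar registers the device needs during operation; each scalar parameter can be given directly in an operand field of the instruction or read from the scalar register file;
a dependency processing unit, which handles possible storage dependencies between a random vector generation instruction and earlier instructions that have not finished executing. A random vector generation instruction may access the scratchpad memory, for example to store the generated random vector there, and consecutive instructions may access the same block of storage. To guarantee correct results, an instruction found to have a data dependency on an earlier instruction is sent to the store queue module to wait until the dependency is cleared. That is, the unit checks whether the storage range of the current instruction's input data overlaps the storage range of the output data of an earlier instruction that has not finished; a storage range is determined by a start address and a data length. An overlap means that the current instruction actually needs the result of the earlier instruction as input, so it can start only after that instruction has finished. In the meantime the instruction is held in the store queue module.
a store queue module, which is an ordered queue. Instructions that have a data dependency on earlier instructions are held in the queue until the dependency is cleared; a random vector generation instruction whose dependency has been cleared is sent to the random vector generation unit;
a random vector generation unit, which generates a random vector following the specified distribution according to the instruction; the unit is implemented as a custom hardware circuit, including but not limited to an FPGA, a CGRA, an application-specific integrated circuit (ASIC), an analog circuit, or a memristor-based circuit;
a scratchpad memory, which is a temporary storage device dedicated to vector data and can hold vector data of different sizes; the scratchpad memory can be used to store the generated random vector;
an IO memory access module, which directly accesses the scratchpad memory and is responsible for reading data from and writing data to the scratchpad memory.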
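The interplay between the dependency processing unit and the store queue described in the module list above can be sketched as follows. The class and method names are hypothetical, each instruction is reduced to a single storage range given by a start address and a length, and the model checks a new instruction against both executing and queued instructions; that last choice is an assumption made to keep queued instructions in order.

```python
from collections import deque


def overlaps(start_a, len_a, start_b, len_b):
    """True if the half-open ranges [start, start + len) intersect."""
    return start_a < start_b + len_b and start_b < start_a + len_a


class DependencyUnit:
    def __init__(self):
        self.in_flight = []         # (name, start, length) of executing instructions
        self.store_queue = deque()  # ordered queue of waiting instructions

    def issue(self, name, start, length):
        """Return True if the instruction goes straight to the generation unit."""
        pending = self.in_flight + list(self.store_queue)
        if any(overlaps(start, length, s, n) for _, s, n in pending):
            self.store_queue.append((name, start, length))
            return False
        self.in_flight.append((name, start, length))
        return True

    def retire(self, name):
        """Mark an instruction finished and release queued ones in order."""
        self.in_flight = [e for e in self.in_flight if e[0] != name]
        released = []
        while self.store_queue:
            head_name, start, length = self.store_queue[0]
            if any(overlaps(start, length, s, n) for _, s, n in self.in_flight):
                break  # the head still depends on an executing instruction
            self.store_queue.popleft()
            self.in_flight.append((head_name, start, length))
            released.append(head_name)
        return released


unit = DependencyUnit()
issued_a = unit.issue("A", 0, 8)   # nothing pending, issues at once
issued_b = unit.issue("B", 4, 8)   # overlaps A's range, waits in the queue
issued_c = unit.issue("C", 16, 4)  # disjoint range, issues at once
after_a = unit.retire("A")         # B's dependency is now cleared
```

Because the queue is ordered, a waiting instruction is released only when it reaches the head, even if its own dependency clears earlier.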
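Fig. 4 is a flowchart of the device provided by an embodiment of the present invention executing a uniform distribution instruction to generate a uniformly distributed random vector. As shown in Fig. 4, executing the uniform distribution instruction comprises:
S1, the instruction fetch module fetches the random vector generation instruction and sends it to the decoding module.
S2, the decoding module decodes the instruction and sends it to the instruction queue.
S3, in the instruction queue, the scalar data corresponding to the four operand fields of the random vector generation instruction is obtained from the instruction itself or from the scalar register file: the storage address of the random vector to be generated, its length, and the upper and lower bounds of the uniform distribution.
S4, once the required scalar data has been obtained, the instruction is sent to the dependency processing unit.
S5, the dependency processing unit analyzes whether the instruction has a data dependency on earlier instructions that have not finished executing. If a dependency exists, the instruction is sent to the store queue module, where it waits until it no longer has a data dependency on any earlier unfinished instruction. If no dependency exists, the instruction is sent directly to the random vector generation unit.
S6, the random vector generation unit uses its hardware circuit to generate a fixed-length segment of random values that follow the uniform distribution defined by the upper and lower bound parameters.
S7, the random vector generation unit continues to generate fixed-length segments until a random vector of the specified length has been produced.
S8, after the operation is complete, the result vector is written back to the specified address in the scratchpad memory.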
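Steps S1 to S8 can be traced end to end in a compact software model. This is a sketch under assumptions rather than the patented circuit: the instruction is represented as an already decoded tuple, each operand field is a hypothetical `("imm", value)` or `("reg", index)` pair, the segment size of 4 is arbitrary, and `random.uniform` stands in for the hardware generator.

```python
import random

SEGMENT = 4  # assumed number of values produced per pass


def run_unif(instruction, registers, scratchpad, busy_ranges, rng):
    """Trace one UNIF instruction through steps S1 to S8."""
    # S1-S2: fetch and decode (the instruction arrives here already decoded).
    opcode, fields = instruction
    if opcode != "UNIF":
        raise ValueError("this sketch only models the UNIF instruction")
    # S3: read the four scalars from immediates or from the register file.
    addr, length, lower, upper = (
        value if kind == "imm" else registers[value] for kind, value in fields
    )
    # S4-S5: wait if the destination overlaps an unfinished instruction's range.
    for start, size in busy_ranges:
        if addr < start + size and start < addr + length:
            return "queued"
    # S6-S7: generate fixed-size segments until the vector is complete.
    result = []
    while len(result) < length:
        count = min(SEGMENT, length - len(result))
        result.extend(rng.uniform(lower, upper) for _ in range(count))
    # S8: write the result vector back to the scratchpad.
    scratchpad[addr:addr + length] = result
    return "done"


registers = [8, 6, 0, 0]  # register 0 holds the address, register 1 the length
instruction = ("UNIF", [("reg", 0), ("reg", 1), ("imm", -1.0), ("imm", 1.0)])
scratchpad = [None] * 32
status = run_unif(instruction, registers, scratchpad, [], random.Random(7))
```

Passing a non-empty `busy_ranges` list that overlaps addresses 8 to 13 makes the function return `"queued"` instead, mirroring the store queue branch of S5.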
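In summary, the present invention provides a random vector generation device which, together with the corresponding instructions, addresses the growing number of computing tasks in the computer field that require random vectors following a given distribution. Compared with existing conventional solutions, the present invention offers concise instructions, ease of use, flexible vector length support, and ample on-chip buffering.
The specific embodiments described above further explain the objectives, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.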

Claims (11)

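  1. A device for generating random vectors, characterized in that the device comprises:
    a storage unit for storing vector data related to a random vector generation instruction;
    a register unit for storing scalar data related to the random vector generation instruction;
    a control unit for decoding the random vector generation instruction and controlling its execution;
    a random vector generation unit for generating, according to the decoded random vector generation instruction, a random vector that follows a specified distribution;
    wherein the random vector generation unit is a custom hardware circuit.
  2. The device according to claim 1, characterized in that the scalar data stored in the register unit includes the random vector storage address, the random vector length, and the distribution parameters related to the random vector generation instruction, wherein the random vector storage address is an address in the storage unit.
  3. The device according to claim 1, characterized in that the control unit comprises:
    an instruction queue module for storing decoded random vector generation instructions in order and obtaining the scalar data related to each random vector generation instruction.
  4. The device according to claim 1, characterized in that the control unit comprises:
    a dependency processing unit for determining, before the random vector generation unit obtains the current random vector generation instruction, whether the current random vector generation instruction depends on an earlier operation instruction that has not finished executing.
  5. The device according to claim 1, characterized in that the control unit comprises:
    a store queue module for temporarily holding the current random vector generation instruction when it depends on an earlier operation instruction that has not finished executing, and for sending the held instruction to the random vector generation unit once the dependency is cleared.
  6. The device according to any one of claims 1 to 5, characterized in that the device further comprises:
    an instruction cache unit for storing random vector generation instructions to be executed;
    an input/output unit for storing vector data related to the random vector generation instruction in the storage unit, or for retrieving such vector data from the storage unit.
  7. The device according to claim 1, characterized in that the random vector generation instruction comprises an opcode and operand fields;
    the opcode indicates that a random vector generation operation for a specified distribution is to be performed;
    the operand fields comprise immediates and/or register numbers and indicate the scalar data related to random vector generation, wherein a register number points to an address in the register unit.
  8. The device according to any one of claims 1 to 5 and 7, characterized in that the storage unit is a scratchpad memory.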
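  9. A device for generating random vectors, characterized in that it comprises:
    an instruction fetch module for fetching the next random vector generation instruction to be executed from an instruction sequence and passing it to a decoding module;
    a decoding module for decoding the random vector generation instruction and passing the decoded instruction to an instruction queue module;
    an instruction queue module for temporarily holding the decoded random vector generation instruction and obtaining the scalar data related to it from the instruction itself or from scalar registers, and, once the scalar data has been obtained, sending the instruction to a dependency processing unit;
    a scalar register file comprising a plurality of scalar registers for storing scalar data related to the random vector generation instruction;
    a dependency processing unit for determining whether the random vector generation instruction depends on an earlier operation instruction that has not finished executing, sending the instruction to a store queue module if such a dependency exists, and sending it to a random vector generation unit otherwise;
    a store queue module for holding random vector generation instructions that depend on earlier operation instructions and sending each of them to the random vector generation unit once its dependency is cleared;
    a random vector generation unit for generating, according to the received random vector generation instruction, a random vector that follows a specified distribution;
    a scratchpad memory for storing the generated random vector;
    an input/output access module for directly accessing the scratchpad memory and writing the generated random vector into it.
  10. The device according to claim 9, characterized in that the random vector generation unit is a custom hardware circuit.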
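  11. A method for generating random vectors, characterized in that the method comprises:
    an instruction fetch module fetching the next random vector generation instruction to be executed from an instruction sequence and passing it to a decoding module;
    the decoding module decoding the random vector generation instruction and passing the decoded instruction to an instruction queue module;
    the instruction queue module temporarily holding the decoded random vector generation instruction and obtaining the scalar data related to its operation from the instruction itself or from scalar registers, and, once the scalar data has been obtained, sending the instruction to a dependency processing unit;
    the dependency processing unit determining whether the random vector generation instruction depends on an earlier operation instruction that has not finished executing, sending the instruction to a store queue module if such a dependency exists, and sending it to a random vector generation unit otherwise;
    the store queue module holding random vector generation instructions that depend on earlier operation instructions and sending each of them to the random vector generation unit once its dependency is cleared;
    the random vector generation unit generating, according to the received random vector generation instruction, a random vector that follows a specified distribution, and writing the generated random vector into a scratchpad memory through an input/output access module.
PCT/CN2016/080970 2016-04-26 2016-05-04 Device and method for generating random vectors conforming to certain distribution WO2017185388A1 (zh)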

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP16899899.5A EP3451158B1 (en) 2016-04-26 2016-05-04 Device and method for generating random vectors conforming to certain distribution
US16/171,284 US11501158B2 (en) 2016-04-26 2018-10-25 Apparatus and methods for generating random vectors

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610266608.8A CN107315565B (zh) 2016-04-26 2016-04-26 Device and method for generating random vectors conforming to certain distribution
CN201610266608.8 2016-04-26

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/171,284 Continuation-In-Part US11501158B2 (en) 2016-04-26 2018-10-25 Apparatus and methods for generating random vectors

Publications (1)

Publication Number Publication Date
WO2017185388A1 true WO2017185388A1 (zh) 2017-11-02

Family

ID=60161755

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/080970 WO2017185388A1 (zh) 2016-04-26 2016-05-04 Device and method for generating random vectors conforming to certain distribution

Country Status (4)

Country Link
US (1) US11501158B2 (zh)
EP (1) EP3451158B1 (zh)
CN (2) CN111857821A (zh)
WO (1) WO2017185388A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11501158B2 (en) 2016-04-26 2022-11-15 Cambricon (Xi'an) Semiconductor Co., Ltd. Apparatus and methods for generating random vectors
CN115437603A (zh) * 2021-06-04 2022-12-06 中科寒武纪科技股份有限公司 Method for generating random numbers and related products

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110764733B (zh) * 2019-10-15 2023-06-30 天津津航计算技术研究所 FPGA-based multi-distribution random number generation device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101609715A (zh) * 2009-05-11 2009-12-23 中国人民解放军国防科学技术大学 Matrix register file with separated row and column access ports
CN101776988A (zh) * 2010-02-01 2010-07-14 中国人民解放军国防科学技术大学 Reconfigurable matrix register file with variable block size

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2553540B1 (fr) * 1983-10-13 1986-01-03 Centre Nat Rech Scient Random test device for logic circuits, in particular microprocessors
JP3182177B2 (ja) * 1991-09-12 2001-07-03 株式会社日立製作所 Central processing unit having a vector operation processing function and vector operation processing method
US6931400B1 (en) * 2001-08-21 2005-08-16 At&T Corp. Method and system for identifying representative trends using sketches
US7822797B2 (en) * 2002-07-29 2010-10-26 Broadcom Corporation System and method for generating initial vectors
CN100545804C (zh) * 2003-08-18 2009-09-30 上海海尔集成电路有限公司 CISC-based microcontroller and method for implementing its instruction set
CN100428665C (zh) * 2003-09-10 2008-10-22 联想(北京)有限公司 Method for secure data transmission
CN101178644B (zh) * 2006-11-10 2012-01-25 上海海尔集成电路有限公司 Microprocessor architecture based on a complex instruction set computer structure
CN101515301B (zh) * 2008-02-23 2011-05-04 炬力集成电路设计有限公司 Method and apparatus for system-on-chip verification
CN102156637A (zh) * 2011-05-04 2011-08-17 中国人民解放军国防科学技术大学 Vector interleaved multithreading processing method and vector interleaved multithreading microprocessor
US9165328B2 (en) * 2012-08-17 2015-10-20 International Business Machines Corporation System, method and computer program product for classification of social streams
US9268563B2 (en) * 2012-11-12 2016-02-23 International Business Machines Corporation Verification of a vector execution unit design
CN111857821A (zh) 2016-04-26 2020-10-30 中科寒武纪科技股份有限公司 Device and method for generating random vectors conforming to certain distribution
US11062215B2 (en) * 2017-03-17 2021-07-13 Microsoft Technology Licensing, Llc Using different data sources for a predictive model
US10860796B2 (en) * 2017-05-16 2020-12-08 Gluru Limited Method and system for vector representation of linearly progressing entities
GB201710877D0 (en) * 2017-07-06 2017-08-23 Nokia Technologies Oy A method and an apparatus for evaluating generative machine learning model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3451158A4 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11501158B2 (en) 2016-04-26 2022-11-15 Cambricon (Xi'an) Semiconductor Co., Ltd. Apparatus and methods for generating random vectors
CN115437603A (zh) * 2021-06-04 2022-12-06 中科寒武纪科技股份有限公司 Method for generating random numbers and related products
CN115437603B (zh) * 2021-06-04 2023-12-19 中科寒武纪科技股份有限公司 Method for generating random numbers and related products

Also Published As

Publication number Publication date
US11501158B2 (en) 2022-11-15
EP3451158B1 (en) 2021-10-06
CN107315565B (zh) 2020-08-07
CN111857821A (zh) 2020-10-30
EP3451158A1 (en) 2019-03-06
CN107315565A (zh) 2017-11-03
US20190065952A1 (en) 2019-02-28
EP3451158A4 (en) 2020-04-29

Similar Documents

Publication Publication Date Title
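CN107315715B (zh) Device and method for performing matrix addition/subtraction operations
CN109240746B (zh) Device and method for performing matrix multiplication operations
WO2017185395A1 (zh) Device and method for performing vector comparison operations
WO2017185385A1 (zh) Device and method for performing vector merge operations
WO2017185384A1 (zh) Device and method for performing vector circular shift operations
CN107315718B (zh) Device and method for performing vector inner product operations
CN107315717B (zh) Device and method for performing the four basic arithmetic operations on vectors
WO2017185390A1 (zh) Device and method for performing vector transcendental function operations
CN107315568B (zh) Device for performing vector logic operations
WO2017185388A1 (zh) Device and method for generating random vectors conforming to certain distribution
EP3451161B1 (en) Apparatus and method for executing operations of maximum value and minimum value of vectors
KR102467544B1 (ko) Computing device and operating method thereof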

Legal Events

Date Code Title Description
NENP Non-entry into the national phase Ref country code: DE
WWE Wipo information: entry into national phase Ref document number: 2016899899; Country of ref document: EP
121 Ep: the epo has been informed by wipo that ep was designated in this application Ref document number: 16899899; Country of ref document: EP; Kind code of ref document: A1
ENP Entry into the national phase Ref document number: 2016899899; Country of ref document: EP; Effective date: 20181126