WO2019223383A1 - Direct memory access method, device, dedicated computing chip, and heterogeneous computing system
- Publication number
- WO2019223383A1 (application PCT/CN2019/076252)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- dma control
- length
- control block
- output data
- input data
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/20—Handling requests for interconnection or transfer for access to input/output bus
- G06F13/28—Handling requests for interconnection or transfer for access to input/output bus using burst mode transfer, e.g. direct memory access DMA, cycle steal
Definitions
- One or more embodiments of the present specification relate to the field of computer technology, and in particular, to a direct memory access method, device, dedicated computing chip, and heterogeneous computing system.
- Heterogeneous computing refers to a mode in which a general-purpose central processing unit (CPU) controls the overall data processing flow and calls a special-purpose computing chip for specific calculations.
- In this process, the general-purpose CPU needs to use Direct Memory Access (DMA), a method of transferring memory data through a dedicated hardware module, to transfer the input data of a dedicated calculation from system memory to device memory. After the dedicated computing chip completes the calculation, the output data is transferred back to system memory.
- The transfer of the input data may proceed as follows: 1) access the pointer of the queue where the DMA descriptor is located to read the DMA descriptor of the input data (which describes the address and length of the input data); 2) access the DMA descriptor of the input data to read the address and length of the input data; 3) read the input data according to its address and length.
- The transfer of the output data may proceed as follows: 1) access the pointer of the queue where the DMA descriptor is located to read the DMA descriptor of the output data (which describes the address and length of the output data); 2) access the DMA descriptor of the output data to read the address and length of the output data; 3) write the output data according to its address and length.
- One heterogeneous computation therefore requires six access operations.
- One or more embodiments of the present specification describe a direct memory access method, a device, a dedicated computing chip, and a heterogeneous computing system, which can reduce the number of data accesses in a DMA transfer, thereby improving the performance of heterogeneous computing.
- a direct memory access method including:
- a corresponding DMA control block is determined in system memory, and the content of the DMA control block includes DMA control information and input data;
- the system memory refers to the storage space used to store data used by the general-purpose central processing unit (CPU);
- the device memory refers to a storage space for storing data of a dedicated computing chip
- a dedicated computing chip including: a direct memory access DMA length register, a DMA control block pointer queue, a DMA data transmission module, and a dedicated computing module;
- the DMA length register is used to store the length of the input data and the length of the output data
- the DMA control block pointer queue is used to store multiple DMA control block pointers; the DMA control block pointer points to a DMA control block in system memory; the content of the DMA control block includes DMA control information and input data;
- a DMA data transmission module configured to move the DMA control information and the input data from system memory to device memory according to the length of the input data, the length of the DMA control information, and the DMA control block pointer, and further configured to move the output data from the device memory to the system memory according to the DMA control information and the length of the output data;
- the dedicated calculation module is configured to calculate the input data and obtain the output data.
- a heterogeneous computing system including: a general-purpose central processing unit CPU, system memory, a dedicated computing chip and device memory as provided in the second aspect above;
- the general-purpose CPU is configured to call the dedicated computing chip for heterogeneous computing
- the system memory is used to store data used by the general-purpose CPU
- the device memory is configured to store data used by the dedicated computing chip.
- a direct memory access device including:
- a determining unit configured to determine a corresponding DMA control block in system memory according to the DMA control block pointer read by the reading unit, and the content of the DMA control block includes DMA control information and input data;
- the system memory refers to the storage space used to store data used by the general-purpose central processing unit (CPU);
- the determining unit is further configured to determine a total length of the DMA control information and the input data
- a moving unit configured to move the DMA control information and the input data to a device memory according to the DMA control block pointer read by the reading unit and the total length determined by the determining unit;
- Device memory refers to the storage space used to store data for dedicated computing chips;
- a calculation unit configured to perform corresponding calculation on the input data to obtain output data
- a writing unit configured to write the output data calculated by the calculation unit into the device memory
- An obtaining unit configured to obtain a length of the output data
- the moving unit is further configured to move the output data from the device memory to the DMA control block according to the DMA control information and a length of the output data obtained by the obtaining unit.
- In the direct memory access method, device, dedicated computing chip, and heterogeneous computing system provided by one or more embodiments of this specification, a DMA control block pointer is read from a DMA control block pointer queue.
- According to the DMA control block pointer, the corresponding DMA control block is determined in the system memory, and the total length of the DMA control information and input data in the DMA control block is determined.
- According to the DMA control block pointer and the total length, the DMA control information and input data are moved to the device memory. The corresponding calculation is performed on the input data to obtain the output data, the output data is written to the device memory, and the length of the output data is obtained.
- According to the DMA control information and the length of the output data, the output data is moved from the device memory to the DMA control block.
- The scheme provided in this specification performs two access operations during the input data transfer: first, the DMA control block pointer queue is accessed to read the DMA control block pointer; second, the location indicated by the DMA control block pointer is accessed to read the DMA control information and input data.
- Compared with the descriptor-based transfer, the separate access to the DMA descriptor is eliminated.
- The output data can be moved to the DMA control block directly according to the DMA control information. That is, the output data transfer needs only one access, the one that writes the output data; it no longer needs to access the pointer of the queue where the DMA descriptor is located or to access the output data descriptor.
- Across the two DMA transfers (input and output), three access operations are saved. This can greatly improve DMA transfer efficiency, which in turn can improve the performance of heterogeneous computing.
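The access counts above can be tallied in a short sketch. It only restates the operations enumerated in the text; the dictionary names and the operation labels are illustrative and not taken from the specification.

```python
# Access operations per transfer, as enumerated in the text.
# The labels are illustrative summaries, not terms from the specification.
DESCRIPTOR_SCHEME = {
    "input": [
        "read pointer of the descriptor queue",
        "read input descriptor (address, length)",
        "read input data",
    ],
    "output": [
        "read pointer of the descriptor queue",
        "read output descriptor (address, length)",
        "write output data",
    ],
}

CONTROL_BLOCK_SCHEME = {
    "input": [
        "read DMA control block pointer from the pointer queue",
        "read DMA control information and input data together",
    ],
    "output": [
        "write output data into the DMA control block",
    ],
}


def count_accesses(scheme):
    """Total number of access operations for one heterogeneous calculation."""
    return sum(len(operations) for operations in scheme.values())


baseline = count_accesses(DESCRIPTOR_SCHEME)
proposed = count_accesses(CONTROL_BLOCK_SCHEME)
saved = baseline - proposed
print(baseline, proposed, saved)
```

The tally matches the text: six accesses with descriptors, three with the control block, so three are saved per calculation.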
- FIG. 1 is a schematic structural diagram of a heterogeneous computing system provided in this specification
- FIG. 2 is a flowchart of a direct memory access method according to an embodiment of the present specification
- FIG. 3 is a schematic diagram of a direct memory access device according to an embodiment of the present specification.
- the direct memory access method provided by an embodiment of this specification can be applied to a heterogeneous computing system as shown in FIG. 1.
- The heterogeneous computing system may include a general-purpose CPU 10, a system memory 20, a dedicated computing chip 30, and a device memory 40. The general-purpose CPU 10 and the dedicated computing chip 30 may also be referred to as the two computing units of the heterogeneous computing system.
- The general-purpose CPU 10 controls the main data processing flow of heterogeneous computing. Specifically, this includes: a. preprocessing and preparing the input data for heterogeneous computing; b. calling the dedicated computing chip for heterogeneous computing; c. querying and returning the heterogeneous calculation results (also called output data); d. post-processing and outputting the output data of the heterogeneous calculation.
- the system memory 20 is used to store data used by the general-purpose CPU 10.
- The system memory 20 can store this data in the form of a DMA control block (a data structure), which occupies a physically contiguous address space in the system memory 20.
- the content of the DMA control block may include DMA control information, input data, and output data.
- the space occupied by the input data may also be referred to as an input data block.
- the space occupied by output data can also be referred to as an output data block.
- The general-purpose CPU 10 may determine the length of the input data and the length of the output data according to the current heterogeneous calculation method.
- That is, once the heterogeneous calculation method is determined, the corresponding input data, the length of the input data, and the length of the output data are usually determined as well.
- The DMA control information can then be constructed; the specific construction process is described later.
- At this point, part of the content of the DMA control block (the DMA control information and the input data) has been obtained.
- The general-purpose CPU 10 can write this part of the content into a physically contiguous address space of the system memory 20. Since the length of the output data is also determined, an address space of that length is usually reserved immediately after this content for writing the output data.
- The DMA control block consists of the address space in the system memory 20 into which this part of the content is written, together with the reserved address space.
- In this way, a DMA control block is formed in the system memory 20.
- the above DMA control information may include: an offset address of input data, an offset address of output data, and a calculation completion flag.
- the offset address of the input data can occupy 32 bits (that is, 4 bytes), which can refer to the offset of the space occupied by the input data (or the input data block) relative to the start address of the DMA control block.
- the actual address of the input data in the system memory 20 may be determined according to the DMA control block start address and the offset address.
- the definition of the offset address of the output data is the same as the definition of the offset address of the input data.
- the calculation completion flag can occupy 1 bit (expanded to 4 bytes), which can be cleared by the general-purpose CPU 10 before heterogeneous calculation. After the heterogeneous calculation is completed, the dedicated calculation chip 30 rewrites the flag bit to 1.
- the CPU 10 polls the calculation completion flag to confirm whether the heterogeneous calculation is completed.
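The control block layout described above can be sketched with Python's `struct` module. The field sizes (two 32-bit offsets and a flag expanded to 4 bytes) come from the text; the field order, the little-endian encoding, the placement of the output area directly after the input data, and the names `build_control_block` and `is_complete` are assumptions made for illustration.

```python
import struct

# Assumed encoding: little-endian, fields in the order
# (input offset, output offset, completion flag), 4 bytes each.
CONTROL_INFO_FORMAT = "<III"
CONTROL_INFO_SIZE = struct.calcsize(CONTROL_INFO_FORMAT)


def build_control_block(input_data: bytes, output_len: int) -> bytearray:
    """Lay out control information, input data, and a reserved output area."""
    # Offsets are relative to the start address of the control block.
    input_offset = CONTROL_INFO_SIZE
    output_offset = input_offset + len(input_data)  # assumed placement
    completion_flag = 0  # cleared by the CPU before the calculation
    header = struct.pack(
        CONTROL_INFO_FORMAT, input_offset, output_offset, completion_flag
    )
    return bytearray(header + input_data + bytes(output_len))


def is_complete(block) -> bool:
    """What the CPU checks when it polls the completion flag."""
    _, _, flag = struct.unpack_from(CONTROL_INFO_FORMAT, block, 0)
    return flag == 1


block = build_control_block(b"\x01\x02\x03\x04", output_len=8)
fields = struct.unpack_from(CONTROL_INFO_FORMAT, block, 0)
```

Because both offsets are relative to the block start, the same control information remains valid after the block is copied to a different base address.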
- The dedicated computing chip 30 cooperates with the general-purpose CPU 10 to perform special-purpose calculations (for example, matrix multiplication and large-number multiplication).
- The dedicated computing chip 30 may be, for example, a Field Programmable Gate Array (FPGA) chip, an Application Specific Integrated Circuit (ASIC) chip, a Graphics Processing Unit (GPU) chip, or the like.
- For such calculations the general-purpose CPU 10 has lower calculation efficiency, whereas performing them on the dedicated computing chip 30 offers better cost performance.
- the device memory 40 is used to store data of the dedicated computing chip 30. Specifically, when the heterogeneous calculation is started, the dedicated computing chip 30 may read the input data from the device memory 40. When the heterogeneous calculation is completed, the output data may be written into the device memory 40.
- the dedicated computing chip 30 may specifically include: a DMA length register 31, a DMA control block pointer queue 32, a DMA data transmission module 33, and a dedicated computing module 34.
- the DMA length register 31 is used to store the length of the input data and the length of the output data. Usually for a specific heterogeneous calculation, the length of the input data and the length of the output data are fixed. That is, the general-purpose CPU 10 may determine the foregoing length according to the heterogeneous calculation method currently performed.
- the DMA control block pointer queue 32 is used to store a plurality of DMA control block pointers.
- Each DMA control block pointer points to a DMA control block in the system memory 20 and can occupy 32 bits. Specifically, each time a DMA control block is formed in the system memory 20, the general-purpose CPU 10 can write the corresponding DMA control block pointer into the DMA control block pointer queue 32. Since one DMA control block is formed for one heterogeneous calculation, each DMA control block pointer also corresponds to one heterogeneous calculation.
- Multiple DMA control block pointers in the DMA control block pointer queue 32 can be read at the same time, so that multiple heterogeneous calculations can be performed in parallel, which greatly improves the efficiency of heterogeneous computing. Note that these heterogeneous calculations belong to the same type; for example, they are all encryption calculations.
- The DMA data transmission module 33 moves the DMA control information and input data from the system memory 20 to the device memory 40 according to the length of the input data, the length of the DMA control information, and the DMA control block pointer. It also moves the output data from the device memory 40 to the system memory 20 according to the DMA control information and the length of the output data.
- the special-purpose calculation module 34 is used to implement a special-purpose calculation function. Specifically, it is used to calculate input data and obtain output data.
- the general-purpose CPU 10 in FIG. 1 can call a special-purpose computing chip 30 to perform heterogeneous calculations.
- This specification improves the direct memory access method. Before executing the direct memory access method for heterogeneous computing provided in this specification, the following steps can be performed first:
- 1) The general-purpose CPU 10 determines the length of the input data and the length of the output data according to the current heterogeneous calculation method, and writes both lengths into the DMA length register 31.
- 2) The general-purpose CPU 10 prepares the input data for the heterogeneous calculation and constructs the DMA control information.
- The DMA control information may include: an offset address of input data, an offset address of output data, and a calculation completion flag.
- The offset address of the input data can be determined according to the length of the DMA control information.
- After the DMA control information and input data are written into the system memory 20, an address space equal to the length of the output data can be reserved contiguously for writing the output data.
- The address space in the system memory 20 into which data is written and the reserved address space together constitute a DMA control block.
- 3) The general-purpose CPU 10 writes a DMA control block pointer into the DMA control block pointer queue 32; the pointer points to the start address of the DMA control block just formed.
- Step 1) needs to be performed only once, whereas steps 2) and 3) can be executed multiple times, depending on the number of heterogeneous calculations.
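The CPU-side preparation can be simulated as a minimal sketch. System memory is modelled as a `bytearray`, the DMA length register as a dictionary, and the pointer queue as a `deque`; the bump allocator, the little-endian field encoding, and the names `configure_lengths` and `submit` are hypothetical and not part of the specification.

```python
import struct
from collections import deque

CONTROL_INFO_SIZE = 12  # two 32-bit offsets plus a flag expanded to 4 bytes

system_memory = bytearray(256)   # stand-in for system memory 20
length_register = {}             # stand-in for DMA length register 31
pointer_queue = deque()          # stand-in for DMA control block pointer queue 32
next_free = 0                    # hypothetical bump allocator for contiguous blocks


def configure_lengths(input_len, output_len):
    """Write both lengths once per calculation type."""
    length_register["input"] = input_len
    length_register["output"] = output_len


def submit(input_data):
    """Build one control block and enqueue its pointer."""
    global next_free
    assert len(input_data) == length_register["input"]
    start = next_free
    input_offset = CONTROL_INFO_SIZE
    output_offset = input_offset + len(input_data)
    # Control information: offsets plus a cleared completion flag.
    struct.pack_into("<III", system_memory, start, input_offset, output_offset, 0)
    system_memory[start + input_offset:start + output_offset] = input_data
    # Reserve room for the output right after the input data.
    next_free = start + output_offset + length_register["output"]
    # The pointer is the start address of the block.
    pointer_queue.append(start)
    return start


configure_lengths(input_len=4, output_len=4)
first = submit(b"abcd")
second = submit(b"efgh")
```

Here the lengths are configured once and two calculations of the same type are submitted, mirroring the remark that the first step runs once while the other two repeat per calculation.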
- FIG. 2 is a flowchart of a direct memory access method according to an embodiment of the present specification.
- the execution subject of the method may be a dedicated computing chip 30 in FIG. 1.
- the method may specifically include:
- Step 210: Read the DMA control block pointer from the DMA control block pointer queue.
- The DMA control block pointer here points to the start address of the DMA control block, so the content of the DMA control block can be accessed directly through the pointer. This reduces the number of accesses to the system memory 20 and therefore the DMA transfer latency.
- The dedicated computing chip 30 may poll the DMA control block pointer queue 32 to check whether it is empty. If it is not empty, a DMA control block pointer can be read from the head of the queue. Because the DMA control block pointer queue 32 can store multiple DMA control block pointers at the same time, it more conveniently supports asynchronous DMA operations and lets multiple processes perform DMA operations independently, which improves DMA transfer efficiency.
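A minimal sketch of the polling in step 210, assuming the pointer queue behaves like a first-in, first-out queue. The function name `poll_pointer_queue` and the example addresses are hypothetical.

```python
from collections import deque


def poll_pointer_queue(queue):
    """Return the pointer at the head of the queue, or None if it is empty."""
    if not queue:
        return None
    return queue.popleft()


# Two pending calculations, enqueued by the CPU in this order.
queue = deque([0x1000, 0x2000])
first = poll_pointer_queue(queue)
second = poll_pointer_queue(queue)
third = poll_pointer_queue(queue)  # queue is now empty
```

Returning `None` on an empty queue lets the caller keep polling without blocking.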
- Step 220: Determine the corresponding DMA control block in the system memory according to the DMA control block pointer.
- For example, when DMA control block pointer A is read, DMA control block A can be determined; when DMA control block pointer B is read, DMA control block B can be determined. It can be understood that the content of the DMA control block read in this step includes only the DMA control information and the input data.
- Step 230: Determine the total length of the DMA control information and the input data.
- The DMA control information in this specification may include an offset address of input data, an offset address of output data, and a calculation completion flag, and it has a fixed length.
- The fixed length is the sum of the lengths of these three fields.
- The total length may be determined as follows: read the length of the input data from the DMA length register 31, then determine the total length from the fixed length and the length of the input data.
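The total length calculation in step 230 is a simple sum. In the sketch below the field sizes follow the text (4 bytes each); representing the DMA length register as a dictionary and the name `total_length` are illustrative assumptions.

```python
# Sizes in bytes of the three control information fields:
# input offset, output offset, completion flag (expanded to 4 bytes).
CONTROL_INFO_FIELD_BYTES = (4, 4, 4)
FIXED_LENGTH = sum(CONTROL_INFO_FIELD_BYTES)


def total_length(length_register):
    """Bytes to fetch from system memory: control information plus input data."""
    return FIXED_LENGTH + length_register["input"]


# Hypothetical register contents for one calculation type.
register = {"input": 64, "output": 32}
total = total_length(register)
```

The output length plays no part here, because the reserved output area is not transferred to the device.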
- Step 240: Move the DMA control information and input data to the device memory according to the DMA control block pointer and the total length.
- The DMA data transmission module 33 may move the DMA control information and input data to the device memory 40 according to the DMA control block pointer and the total length.
- Specifically, a physically contiguous address space may first be allocated in the device memory 40. The DMA control information and input data can then be read from the corresponding DMA control block according to the DMA control block pointer and the total length.
- This read can be understood as a contiguous read operation.
- Finally, the DMA control information and the input data are written into the previously allocated physically contiguous address space in the device memory 40. After this write operation, the start address of the DMA control information and the input data in the device memory 40 is determined.
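Step 240 amounts to one contiguous copy. The sketch below models both memories as `bytearray` objects; the function name `move_to_device` and the chosen addresses are hypothetical.

```python
def move_to_device(system_memory, device_memory, block_pointer, total_length, device_start):
    """Copy control information and input data in one contiguous read and write."""
    # One contiguous read starting at the control block pointer.
    chunk = system_memory[block_pointer:block_pointer + total_length]
    # Write into the pre-allocated contiguous region of device memory.
    device_memory[device_start:device_start + total_length] = chunk
    # This start address is what later steps add offsets to.
    return device_start


system_memory = bytearray(range(64))  # recognisable test pattern
device_memory = bytearray(64)
start = move_to_device(
    system_memory, device_memory, block_pointer=8, total_length=16, device_start=32
)
```

The returned start address is the base against which the offsets in the control information are later resolved.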
- Step 250: Perform the corresponding calculation on the input data to obtain output data.
- The dedicated calculation module 34 may be called to perform the corresponding calculation on the input data.
- Specifically, the actual address of the input data in the device memory 40 may be determined according to the start address determined in step 240 and the offset address of the input data in the DMA control information. The input data can then be read from the device memory 40 at that address, and the dedicated calculation module 34 is called to perform the corresponding calculation on it.
- Step 260: Write the output data to the device memory.
- the actual address of the output data in the device memory 40 may be determined according to the start address determined in step 240 and the offset address of the output data in the DMA control information. After that, the output data can be written into the storage space corresponding to the actual address in the device memory 40.
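Steps 250 and 260 both resolve an actual address as start address plus offset. The sketch below shows this on a `bytearray` standing in for device memory; the little-endian field layout, the use of `bytes.upper` as a stand-in calculation, and the name `run_calculation` are assumptions for illustration.

```python
import struct


def run_calculation(device_memory, device_start, input_len, compute):
    """Resolve addresses from the control information, compute, store the output."""
    input_offset, output_offset, _flag = struct.unpack_from(
        "<III", device_memory, device_start
    )
    # Actual address = start address in device memory + offset.
    input_addr = device_start + input_offset
    output_addr = device_start + output_offset
    output = compute(bytes(device_memory[input_addr:input_addr + input_len]))
    device_memory[output_addr:output_addr + len(output)] = output
    return output_addr, len(output)


device_memory = bytearray(64)
device_start = 8
# Control information and input data as they would look after step 240.
struct.pack_into("<III", device_memory, device_start, 12, 16, 0)
device_memory[20:24] = b"abcd"

# bytes.upper stands in for the dedicated calculation module.
output_addr, output_len = run_calculation(device_memory, device_start, 4, bytes.upper)
```

In this example the input sits at address 20 and the output lands at address 24, both derived from the same start address.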
- Step 270: Obtain the length of the output data.
- the length of the output data may be read from the DMA length register 31.
- Step 280: Move the output data from the device memory to the DMA control block according to the DMA control information and the length of the output data.
- The output data may be moved from the device memory 40 to the DMA control block according to the offset address of the output data in the DMA control information and the length of the output data.
- The moving process may specifically include: obtaining the start address of the DMA control information and the input data in the device memory 40.
- The first actual address of the output data in the device memory 40 is determined according to the offset address of the output data and this start address.
- The second actual address of the output data in the DMA control block is determined.
- the dedicated computing chip 30 may rewrite the calculation completion flag in the DMA control information, for example, the calculation completion flag may be rewritten to 1.
- the general-purpose CPU 10 may poll the calculation completion flag. When the calculation completion flag is 1, it indicates that the heterogeneous calculation is completed, and the output data in the system memory 20 may be used.
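The write-back and completion signalling can be sketched as follows. Computing the second actual address as control block pointer plus output offset, writing the flag into the copy of the control information in system memory, the flag's position as the third 4-byte field, and the names `write_back` and `cpu_sees_completion` are all assumptions made for illustration.

```python
import struct

FLAG_OFFSET = 8  # assumed: the flag is the third 4-byte field of the control information


def write_back(system_memory, device_memory, block_pointer, device_start, output_len):
    """Move the output into the control block, then set the completion flag."""
    _in_off, output_offset, _flag = struct.unpack_from("<III", device_memory, device_start)
    first_addr = device_start + output_offset    # output in device memory
    second_addr = block_pointer + output_offset  # reserved area in the control block
    system_memory[second_addr:second_addr + output_len] = (
        device_memory[first_addr:first_addr + output_len]
    )
    # Set the flag only after the output data is in place.
    struct.pack_into("<I", system_memory, block_pointer + FLAG_OFFSET, 1)


def cpu_sees_completion(system_memory, block_pointer):
    """One polling check by the general-purpose CPU."""
    return struct.unpack_from("<I", system_memory, block_pointer + FLAG_OFFSET)[0] == 1


system_memory = bytearray(64)
block_pointer = 4
struct.pack_into("<III", system_memory, block_pointer, 12, 16, 0)

device_memory = bytearray(64)
device_start = 0
struct.pack_into("<III", device_memory, device_start, 12, 16, 0)
device_memory[16:20] = b"WXYZ"  # output produced by the calculation

before = cpu_sees_completion(system_memory, block_pointer)
write_back(system_memory, device_memory, block_pointer, device_start, output_len=4)
after = cpu_sees_completion(system_memory, block_pointer)
```

Setting the flag after the copy means a CPU that observes the flag as 1 can read the output straight from the control block.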
- The direct memory access method provided in the embodiment of the present specification avoids a separate access to the system memory for the DMA transfer of the output data: the offset address of the output data is obtained at the same time as the input data is fetched from the system memory. Therefore, after the heterogeneous calculation is completed, the output data is moved directly according to that offset address. This avoids intervention by the general-purpose CPU and reduces the latency of the entire heterogeneous calculation.
- The DMA control block pointer queue provided in this specification requires only a 32-bit DMA control block pointer to be written at a time. This amount of data is very small and corresponds directly to one atomic write operation of a general-purpose CPU, which improves the efficiency of concurrent operations by multiple processes.
- an embodiment of the present specification further provides a direct memory access device.
- the device may include:
- the reading unit 301 is configured to read a DMA control block pointer from a direct memory access DMA control block pointer queue.
- the determining unit 302 is configured to determine a corresponding DMA control block in the system memory according to the DMA control block pointer read by the reading unit 301, and the content of the DMA control block includes DMA control information and input data.
- the above system memory refers to a storage space for storing data used by a general-purpose central processing unit CPU.
- the determining unit 302 is further configured to determine a total length of the DMA control information and the input data.
- the DMA control information may have a fixed length.
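- The determining unit 302 may be specifically configured to: read the length of the input data from the DMA length register, where the length of the input data is determined by the general-purpose CPU according to the heterogeneous calculation method currently performed; and determine the total length according to the fixed length and the length of the input data.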
- the moving unit 303 is configured to move the DMA control information and the input data to the device memory according to the DMA control block pointer read by the reading unit 301 and the total length determined by the determining unit 302.
- the device memory refers to the storage space used to store data for a dedicated computing chip.
- the moving unit 303 here may be implemented by the DMA data transmission module 33 in FIG. 1.
- the calculation unit 304 is configured to perform corresponding calculation on the input data to obtain output data.
- the calculation unit 304 here may be implemented by a dedicated calculation module 34 in FIG. 1.
- the writing unit 305 is configured to write output data calculated by the calculation unit 304 into a device memory.
- the obtaining unit 306 is configured to obtain a length of the output data.
- the moving unit 303 is further configured to move the output data from the device memory to the DMA control block according to the DMA control information and the length of the output data obtained by the obtaining unit 306.
- the above DMA control information may include an offset address of output data.
- The moving unit 303 may be specifically configured to: move the output data from the device memory to the DMA control block according to the offset address of the output data and the length of the output data.
- the reading unit 301 reads a DMA control block pointer from a direct memory access DMA control block pointer queue.
- the determining unit 302 determines a corresponding DMA control block in the system memory according to the DMA control block pointer.
- the determining unit 302 is further configured to determine a total length of the DMA control information and the input data.
- the moving unit 303 moves the DMA control information and the input data to the device memory according to the DMA control block pointer and the total length.
- the calculation unit 304 performs corresponding calculations on the input data to obtain output data.
- the writing unit 305 writes the output data into the device memory.
- the obtaining unit 306 obtains the length of the output data.
- the moving unit 303 moves the output data from the device memory to the DMA control block according to the DMA control information and the length of the output data. This improves the performance of heterogeneous computing.
- the direct memory access device provided in the embodiment of the present specification may be a module or a unit in the dedicated computing chip 30 in FIG. 1.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Bus Control (AREA)
Claims (8)
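- A direct memory access method, characterized by comprising: reading a DMA control block pointer from a direct memory access (DMA) control block pointer queue; determining, according to the DMA control block pointer, a corresponding DMA control block in system memory, the content of the DMA control block including DMA control information and input data; the system memory being a storage space for storing data used by a general-purpose central processing unit (CPU); determining a total length of the DMA control information and the input data; moving the DMA control information and the input data to device memory according to the DMA control block pointer and the total length; the device memory being a storage space for storing data of a dedicated computing chip; performing a corresponding calculation on the input data to obtain output data; writing the output data into the device memory; obtaining a length of the output data; and moving the output data from the device memory to the DMA control block according to the DMA control information and the length of the output data.
- The method according to claim 1, characterized in that the DMA control information includes an offset address of the output data; and the moving of the output data from the device memory to the DMA control block according to the DMA control information and the length of the output data comprises: moving the output data from the device memory to the DMA control block according to the offset address of the output data and the length of the output data.
- The method according to claim 1, characterized in that the DMA control information has a fixed length; and the determining of the total length of the DMA control information and the input data comprises: reading the length of the input data from a DMA length register, the length of the input data being determined by the general-purpose CPU according to the heterogeneous calculation method currently performed; and determining the total length according to the fixed length and the length of the input data.
- A dedicated computing chip, characterized by comprising: a direct memory access (DMA) length register, a DMA control block pointer queue, a DMA data transmission module, and a dedicated computing module; the DMA length register being used to store a length of input data and a length of output data; the DMA control block pointer queue being used to store a plurality of DMA control block pointers; the DMA control block pointer pointing to a DMA control block in system memory; the content of the DMA control block including DMA control information and input data; the DMA data transmission module being used to move the DMA control information and the input data from the system memory to device memory according to the length of the input data, the length of the DMA control information, and the DMA control block pointer, and further used to move the output data from the device memory to the system memory according to the DMA control information and the length of the output data; and the dedicated computing module being used to perform calculation on the input data and obtain the output data.
- A heterogeneous computing system, characterized by comprising: a general-purpose central processing unit (CPU), system memory, the dedicated computing chip according to claim 4, and device memory; the general-purpose CPU being used to call the dedicated computing chip for heterogeneous computing; the system memory being used to store data used by the general-purpose CPU; and the device memory being used to store data used by the dedicated computing chip.
- A direct memory access device, characterized by comprising: a reading unit, used to read a DMA control block pointer from a direct memory access (DMA) control block pointer queue; a determining unit, used to determine a corresponding DMA control block in system memory according to the DMA control block pointer read by the reading unit, the content of the DMA control block including DMA control information and input data; the system memory being a storage space for storing data used by a general-purpose central processing unit (CPU); the determining unit being further used to determine a total length of the DMA control information and the input data; a moving unit, used to move the DMA control information and the input data to device memory according to the DMA control block pointer read by the reading unit and the total length determined by the determining unit; the device memory being a storage space for storing data of a dedicated computing chip; a calculation unit, used to perform a corresponding calculation on the input data to obtain output data; a writing unit, used to write the output data calculated by the calculation unit into the device memory; an obtaining unit, used to obtain a length of the output data; the moving unit being further used to move the output data from the device memory to the DMA control block according to the DMA control information and the length of the output data obtained by the obtaining unit.
- The device according to claim 6, characterized in that the DMA control information includes an offset address of the output data; and the moving unit is specifically used to: move the output data from the device memory to the DMA control block according to the offset address of the output data and the length of the output data.
- The device according to claim 6, characterized in that the DMA control information has a fixed length; and the determining unit is specifically used to: read the length of the input data from a DMA length register, the length of the input data being determined by the general-purpose CPU according to the heterogeneous calculation method currently performed; and determine the total length according to the fixed length and the length of the input data.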
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810488487.0A CN110515872B (zh) | 2018-05-21 | 2018-05-21 | Direct memory access method, device, dedicated computing chip, and heterogeneous computing system |
CN201810488487.0 | 2018-05-21 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019223383A1 (zh) | 2019-11-28 |
Family
ID=68616539
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2019/076252 WO2019223383A1 (zh) | 2018-05-21 | 2019-02-27 | Direct memory access method, device, dedicated computing chip, and heterogeneous computing system |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN110515872B (zh) |
TW (1) | TWI696949B (zh) |
WO (1) | WO2019223383A1 (zh) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP4024202A4 (en) | 2019-09-18 | 2022-10-26 | Huawei Technologies Co., Ltd. | METHOD FOR CONSTRUCTING AN INTERMEDIATE REPRESENTATION, COMPILER AND SERVER |
CN111190842B (zh) * | 2019-12-30 | 2021-07-20 | Oppo广东移动通信有限公司 | Direct memory access, processor, electronic device, and data transfer method |
CN113342721B (zh) * | 2021-07-06 | 2022-09-23 | 无锡众星微系统技术有限公司 | DMA design method for a storage controller |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
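CN1474568A (zh) * | 2002-08-06 | 2004-02-11 | 华为技术有限公司 | Multi-channel data direct memory access system and method |
CN1641613A (zh) * | 2003-12-05 | 2005-07-20 | 联发科技股份有限公司 | Virtual first-in-first-out direct memory access device |
CN105512005A (zh) * | 2015-12-12 | 2016-04-20 | 中国航空工业集团公司西安航空计算技术研究所 | Circuit and method for synchronous operation of control/remote nodes and bus monitoring nodes |
CN106339338A (zh) * | 2016-08-31 | 2017-01-18 | 天津国芯科技有限公司 | Data transmission method and device capable of improving system performance |
CN106569736A (zh) * | 2015-10-10 | 2017-04-19 | 北京忆芯科技有限公司 | NVMe protocol processor and processing method thereof |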
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5953538A (en) * | 1996-11-12 | 1999-09-14 | Digital Equipment Corporation | Method and apparatus providing DMA transfers between devices coupled to different host bus bridges |
GB2359906B (en) * | 2000-02-29 | 2004-10-20 | Virata Ltd | Method and apparatus for DMA data transfer |
US6904473B1 (en) * | 2002-05-24 | 2005-06-07 | Xyratex Technology Limited | Direct memory access controller and method of filtering data during data transfer from a source memory to a destination memory |
US7533198B2 (en) * | 2005-10-07 | 2009-05-12 | International Business Machines Corporation | Memory controller and method for handling DMA operations during a page copy |
CN100395737C (zh) * | 2006-06-08 | 2008-06-18 | 杭州华三通信技术有限公司 | Method for transferring data between memory and a digital signal processor |
US8250252B1 (en) * | 2010-06-29 | 2012-08-21 | Qlogic, Corporation | System and methods for using a DMA module for a plurality of virtual machines |
CN102467473B (zh) * | 2010-11-03 | 2015-02-11 | Tcl集团股份有限公司 | Method and device for transmitting data between user space and kernel |
US9239796B2 (en) * | 2011-05-24 | 2016-01-19 | Ixia | Methods, systems, and computer readable media for caching and using scatter list metadata to control direct memory access (DMA) receiving of network protocol data |
CN103377170B (zh) * | 2012-04-26 | 2015-12-02 | 上海宝信软件股份有限公司 | SPI high-speed bidirectional peer-to-peer data communication system between heterogeneous processors |
CN103500149A (zh) * | 2013-09-29 | 2014-01-08 | 华为技术有限公司 | Direct memory access controller and direct memory access control method |
CN104317754B (zh) * | 2014-10-15 | 2017-03-15 | 中国人民解放军国防科学技术大学 | Strided data transmission optimization method for heterogeneous computing systems |
CN105656805B (zh) * | 2016-01-20 | 2018-09-25 | 中国人民解放军国防科学技术大学 | Packet receiving method and device based on control block pre-allocation |
- 2018-05-21: CN application CN201810488487.0A, patent CN110515872B (zh), status: Active
- 2019-02-21: TW application TW108105818A, patent TWI696949B (zh), status: active
- 2019-02-27: WO application PCT/CN2019/076252, publication WO2019223383A1 (zh), status: Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN110515872A (zh) | 2019-11-29 |
TW202004494A (zh) | 2020-01-16 |
TWI696949B (zh) | 2020-06-21 |
CN110515872B (zh) | 2020-07-31 |
Legal Events
Code | Title | Description |
---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19806451; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | Ref document number: 19806451; Country of ref document: EP; Kind code of ref document: A1 |
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 21/05/2021) |
122 | Ep: pct application non-entry in european phase | Ref document number: 19806451; Country of ref document: EP; Kind code of ref document: A1 |