WO2019000716A1 - Computing control method, network card and electronic device - Google Patents

Computing control method, network card and electronic device

Info

Publication number
WO2019000716A1
Authority
WO
WIPO (PCT)
Prior art keywords
data block
cpu
network card
storage
calculation result
Prior art date
Application number
PCT/CN2017/106871
Other languages
English (en)
French (fr)
Inventor
Li Bo (李波)
Original Assignee
Lenovo (Beijing) Limited (联想(北京)有限公司)
Priority date
Filing date
Publication date
Application filed by Lenovo (Beijing) Limited (联想(北京)有限公司)
Publication of WO2019000716A1

Links

Images

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 - Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 - Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 - Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1012 - Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices using codes or arrangements adapted for a specific type of error
    • G06F11/1032 - Simple parity
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present application relates to the field of data processing technologies, and in particular, to a computing control method, a network card, and an electronic device.
  • In cloud computing environments with explosive growth in data volume, the Erasure Code algorithm has begun to replace Redundant Arrays of Independent Disks (RAID) in computers, so as to obtain higher storage fault tolerance and better storage space utilization.
  • However, the central processing unit (CPU) performs the parity calculation, which keeps CPU utilization high and thereby increases data input/output latency.
  • The purpose of the present application is to provide a computing control method, a network card, and an electronic device that solve the technical problem in the prior art that parity calculation performed by the CPU keeps CPU utilization high and thereby increases data input/output latency.
  • The application provides a computing control method applied to a network card, the method including: receiving a target data block sent by a CPU; and
  • calculating the target data block to obtain a calculation result data block.
  • Preferably, the above method further includes:
  • presetting a storage space, the storage space having an address mapping with the memory space of the CPU.
  • Preferably, the storage space includes a circular-queue data storage structure.
  • Preferably, the address mapping between the storage space and the memory space of the CPU includes:
  • establishing the address mapping between the storage space and the memory space of the CPU through memory-mapped input/output (MMIO).
  • Preferably, the above method further includes:
  • storing the calculation result data block in the storage space.
  • Preferably, the above method further includes:
  • sending the address information corresponding to the storage of the calculation result data block to the CPU.
  • Preferably, the corresponding address information includes: address information in the CPU memory space, obtained based on the address mapping, corresponding to the storage address at which the calculation result data block is stored in the storage space.
  • Preferably, the above method further includes: receiving, from the CPU, the storage addresses of the target data block and the calculation result data block; and sending the target data block and the calculation result data block based on the storage addresses.
  • Preferably, calculating the target data block to obtain a calculation result data block includes:
  • performing parity calculation on the target data block to obtain a check data block.
  • The application also provides a network card, including:
  • one or more calculators; and
  • a memory for storing one or more applications and data generated when the one or more applications run, wherein the one or more applications, when executed by the one or more calculators, cause the one or more calculators to perform the above method.
  • The application also provides an electronic device, including: a CPU; and the above network card.
  • The computing control method, network card, and electronic device provided by the present application obtain a calculation result data block by sending a target data block that requires data calculation to a device other than the CPU, such as a network card, for calculation.
  • The data calculation task is handed over to a device other than the CPU, so the CPU's computing resources are no longer occupied, which reduces data input/output latency and achieves the purpose of the present application.
  • FIG. 1 is a flowchart of an implementation of a calculation control method according to an embodiment of the present application
  • FIG. 2 is another flowchart of a calculation control method according to an embodiment of the present application.
  • FIG. 3 is a partial flowchart of a calculation control method according to an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a network card of an electronic device according to an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
  • FIG. 6 and FIG. 7 are respectively an application example diagram of an embodiment of the present application.
  • the techniques of this disclosure may be implemented in the form of hardware and/or software (including firmware, microcode, etc.). Additionally, the techniques of this disclosure may take the form of a computer program product on a computer readable medium storing instructions for use by or in connection with an instruction execution system.
  • a computer readable medium can be any medium that can contain, store, communicate, propagate or transport the instructions.
  • a computer readable medium can include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium.
  • Specific examples of the computer readable medium include: a magnetic storage device such as a magnetic tape or a hard disk (HDD); an optical storage device such as a compact disk (CD-ROM); a memory such as a random access memory (RAM) or a flash memory; and/or a wired/wireless communication link.
  • Referring to FIG. 1, which is an implementation flowchart of a computing control method according to an embodiment of the present application, the method is applied to a network card (Network Interface Card, NIC) or another device capable of performing data calculation.
  • the calculation control method may include the following steps:
  • Step 101 Receive a target data block sent by the CPU.
  • The target data block can be obtained by the CPU when distributed data storage is required. The CPU then sends the target data block to the device capable of performing data calculation in this embodiment; for example, the CPU can send the target data to the network card.
  • The target data block received by the network card may contain a plurality of data blocks that have already been divided, for example D1, D2, D3, D4, and D5; alternatively, the target data block may contain data that has not yet been divided into blocks.
  • Step 102 Perform calculation on the target data block to obtain a calculation result data block.
  • If the target data block is a combination of data blocks already divided by the CPU, then after receiving the target data block the network card directly performs data calculation, such as parity calculation, on the divided blocks to obtain a calculation result data block, for example a check data block.
  • Alternatively, if the target data block contains undivided data, then after receiving it the network card first divides the target data block, for example into the five blocks D1, D2, D3, D4, and D5, and then performs data calculation, such as parity calculation, on these blocks to obtain a calculation result data block, for example a check data block.
  • When calculating the target data block, the network card may perform parity calculation on it, thereby obtaining a calculation result data block that includes a check data block.
  • The solution in this embodiment can be implemented by a network card device connected to the CPU.
  • When the CPU needs to store a target data block, it conventionally performs the parity calculation on the target data block itself and sends the resulting check data, together with the target data block, to the storage location, such as a cloud disk or a server.
  • In this embodiment, the parity calculation on the target data block is instead performed by the network card that transmits the data, so no parity calculation is performed on the CPU; CPU resources such as CPU memory need not be occupied heavily or for long periods, which improves the CPU's input/output rate.
  • The computing control method provided by this embodiment of the present application obtains a calculation result data block by sending a target data block that requires data calculation to a device other than the CPU, such as a network card, for calculation.
  • The data calculation task is handed over to a device other than the CPU, so the CPU's computing resources are no longer occupied, which reduces data input/output latency and achieves the purpose of this embodiment.
  • In one implementation, the device that performs data calculation in this embodiment, such as the network card, may preset a storage space. The storage space is set on a device other than the CPU; that is, it is distinct from the CPU's memory space.
  • For example, the storage space may be a storage space on the network card, or a storage space on another device that the network card can access.
  • The storage space and the memory space of the CPU have an address mapping between them.
  • The storage space can operate in a queue mode.
  • For example, the storage space can be a circular-queue data storage structure.
  • With this storage structure, data is stored in the storage space according to the first-in, first-out storage rule of a circular queue.
  • The address mapping between the preset storage space on the network card and the memory space of the CPU can be established through MMIO.
  • Correspondingly, after the calculation result data block is obtained in this embodiment, as shown in FIG. 2, the following steps may also be included:
  • Step 103 Store the calculation result data block into the storage space.
  • The storage space may be the preset storage space described above. After the network card stores the calculation result data block in the storage space, the calculation result data block need not be sent to the CPU for storage, so the memory space of the CPU need not be occupied.
  • The network card can then obtain the address information of the calculation result data block in the memory space according to the address mapping between the storage space and the memory space of the CPU; for example, the calculation result data is stored at a first address in the storage space.
  • The network card can obtain, based on the address mapping, the address information in the CPU memory space corresponding to the first address. After step 103, the method may further include the following step:
  • Step 104 Send address information corresponding to the calculation result data block to the CPU.
  • The corresponding address information includes the address information in the CPU memory space, obtained based on the address mapping, that corresponds to the storage address at which the calculation result data block is stored in the storage space.
  • For example, the network card side obtains, based on the address mapping, the address information in the CPU memory space corresponding to the storage address in the storage space, and then sends the corresponding address information to the CPU.
  • Alternatively, the address information at which the calculation result data block is stored in the storage space may be sent to the CPU, and the CPU then obtains, based on the address mapping, the address information in the CPU memory space corresponding to that storage address.
  • In a specific implementation, the network card can asynchronously notify the CPU of the address information corresponding to the calculation result data block through the driver.
  • After receiving that address information, the CPU can determine the storage addresses of the target data block and the calculation result data block based on it, and then send the storage addresses.
  • The storage addresses determined by the CPU are distinct from the address at which the target data block is stored in the memory space and the address at which the calculation result data block is stored in the storage space; they refer to the target addresses at which the target data block and the calculation result data block need to be, or are about to be, stored.
  • the method in this embodiment may further include the following steps, as shown in FIG. 3:
  • Step 301: Receive the storage addresses of the target data block and the calculation result data block sent by the CPU, where the storage addresses are the target addresses at which the target data block and the calculation result data block are to be stored.
  • When sending the storage addresses to a device such as the network card in this embodiment, the CPU can start an interrupt process and send the storage addresses through that interrupt process.
  • Step 302 Send the target data block and the calculation result data block based on the storage address.
  • For example, if the target address is X, the device such as the network card in this embodiment can send the target data block received from the CPU, together with the calculation result data block in its storage space, toward the location of X, so that the target data block and the calculation result data block are stored at the storage address X.
  • In this process, the network card does not need to send the calculation result data block back to the CPU for transmission, which reduces the CPU's data input/output pressure and reduces the latency of the CPU's data input and output.
  • FIG. 4 is a schematic structural diagram of a network card according to an embodiment of the present disclosure.
  • The network card may be a device with data calculation and storage capabilities, and the network card is connected to the CPU.
  • the network card may include a memory 401 and a calculator 402.
  • the network card can perform the method described above with reference to FIGS. 1 to 3.
  • the calculator 402 can include, for example, a general purpose microprocessor, an instruction set processor, and/or a related chipset and/or a special purpose microprocessor (eg, an application specific integrated circuit (ASIC)), and the like.
  • the calculator 402 can also include an onboard memory for caching purposes.
  • the calculator 402 may be a single processing unit or a plurality of processing units for performing different actions of the method flow according to the embodiments of the present disclosure described with reference to FIGS. 1 to 3.
  • The memory 401 is used to store one or more applications and the data generated by the one or more applications.
  • It can be, for example, any medium that can contain, store, communicate, propagate, or transport instructions.
  • For example, a readable storage medium may include, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium.
  • Specific examples of the readable storage medium include: a magnetic storage device such as a magnetic tape or a hard disk (HDD); an optical storage device such as a compact disk (CD-ROM); a memory such as a random access memory (RAM) or a flash memory; and/or a wired/wireless communication link.
  • Memory 401 can include a computer program that can include code/computer executable instructions that, when executed by calculator 402, cause calculator 402 to perform, for example, the method flow described above in connection with Figures 1-3 and any variations thereof.
  • The network card in this embodiment includes a component with a data storage function, such as the memory 401.
  • A storage space can be preset in the memory 401, and the preset storage space can be used to store the calculation result data block.
  • The preset storage space in the memory 401 may be a circular-queue data storage structure with first-in, first-out storage characteristics, used to buffer the calculation result data block.
  • The calculator 402 in the network card can be implemented by a Field-Programmable Gate Array (FPGA) computing unit, which performs data calculation on the target data block, such as parity calculation, to obtain a calculation result data block such as a check data block.
  • After the network card obtains the calculation result data block, the address information corresponding to the calculation result data block in the memory space of the CPU is sent to the CPU, and the CPU determines the storage addresses of the target data block and the calculation result data block.
  • The network card can then send the target data block received from the CPU and the calculation result data block in its storage space to the location of the storage addresses sent by the CPU, thereby storing the target data block and the calculation result data block at those storage addresses.
  • In this embodiment, the data calculation and the storage of the calculation result data block are performed by the network card, so the computing and memory resources of the CPU need not be occupied, which reduces the latency of the CPU's data input and output and achieves the purpose of this embodiment.
  • FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
  • the electronic device may include a CPU 501 and a network card 502.
  • the CPU 501 can be used to obtain a target data block and send the target data block to the network card 502.
  • the network card 502 is configured to calculate a target data block sent by the CPU to obtain a calculation result data block.
  • The network card 502 may include a storage space capable of data storage and a processor or calculator capable of performing data calculation, such as an FPGA computing unit, used to perform data calculation on the target data block, such as parity calculation, to obtain a calculation result data block such as a check data block. In this process, the data calculation task does not need to be performed on the CPU 501, and therefore does not occupy the computing resources and memory resources of the CPU 501.
  • After computing the calculation result data block, the network card 502 can store it in its own storage space and does not need to return it to the CPU 501 for storage in memory space, so no CPU memory space resources are occupied.
  • The network card 502 can send the address information representing the calculation result data block to the CPU 501; the CPU 501 can determine the storage addresses at which the target data block and the calculation result data block are to be stored and send them to the network card 502. After receiving the storage addresses, the network card 502 sends the target data block received from the CPU and the calculation result data block in its own storage space directly to the storage addresses, thereby implementing parity-protected storage of the target data block. In this process, the network card 502 does not need to send the calculation result data block back to the CPU 501 for transmission, which reduces the CPU's data input/output pressure and reduces the latency of the CPU's data input and output.
  • As shown in FIG. 6, the memory space of the CPU holds a target data block to be stored, which contains five pieces of data, D1, D2, D3, D4, and D5, on which parity calculation needs to be performed to obtain three check data blocks.
  • Because the data is stored in a distributed manner and must therefore be sent through the network card, the CPU first sends the data to the network card, and the network card, rather than the CPU, performs the parity calculation, which is the third step, thereby computing the check data blocks P1, P2, and P3 and storing them in the network card's ring buffer to await transmission, as shown in FIG. 7.
  • After the network card performs the parity calculation and obtains the check data blocks, it asynchronously notifies the CPU (operating system) of the check data block information (such as the storage addresses and related data block information) through the driver, which is the fourth step.
  • The CPU (operating system) then determines the storage locations (storage addresses) of the target data block and the check data blocks according to the distributed-storage configuration, and sends the storage addresses to the network card through an interrupt, which is the fifth step.
  • Finally, based on the storage addresses, the network card sends the target data block and the calculation result data blocks to the storage addresses on the network.
  • Compared with a conventional network card, the network card used for distributed storage in this embodiment may carry an FPGA gate array programmed to support the Erasure Code algorithm, so that the FPGA chip on the network card can perform the parity data calculation and thereby reduce the computational pressure on the CPU.
  • The distributed-storage network card can also have a storage space of volatile random access memory (RAM). After the system starts, this part of the storage space is mapped into the system's memory space through MMIO, and the check data blocks are stored in it through the ring buffer, which avoids the traditional copy of data from memory to the network card, realizes zero copy of the check data, and thereby reduces data input/output latency.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Detection And Correction Of Errors (AREA)

Abstract

A computing control method, a network card (502), and an electronic device. The method includes: receiving a target data block sent by a CPU (101); and calculating the target data block to obtain a calculation result data block (102). The data calculation task is handed over to a device other than the CPU, so the CPU's computing resources are no longer occupied, which reduces data input/output latency.

Description

Computing control method, network card and electronic device
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a computing control method, a network card, and an electronic device.
Background
In response to the explosive growth of data volume in cloud computing environments, computers have begun to use the Erasure Code algorithm in place of Redundant Arrays of Independent Disks (RAID), so as to obtain higher storage fault tolerance and better storage space utilization.
However, in implementations of the Erasure Code algorithm, the parity calculation is performed by the central processing unit (CPU), which keeps CPU utilization high and thereby increases the latency of data input and output.
Summary
In view of this, the purpose of the present application is to provide a computing control method, a network card, and an electronic device, which are used to solve the technical problem in the prior art that parity calculation performed by the CPU keeps CPU utilization high and thereby increases data input/output latency.
The present application provides a computing control method, applied to a network card, the method including:
receiving a target data block sent by a CPU; and
calculating the target data block to obtain a calculation result data block.
Preferably, the above method further includes:
presetting a storage space, the storage space having an address mapping with a memory space of the CPU.
Preferably, in the above method, the storage space includes a circular-queue data storage structure.
Preferably, in the above method, the storage space having an address mapping with the memory space of the CPU includes:
establishing the address mapping between the storage space and the memory space of the CPU through memory-mapped input/output (MMIO).
Preferably, the above method further includes:
storing the calculation result data block into the storage space.
Preferably, the above method further includes:
sending address information corresponding to the storage of the calculation result data block to the CPU.
Preferably, in the above method, the corresponding address information includes:
address information in the memory space of the CPU, obtained based on the address mapping, corresponding to the storage address at which the calculation result data block is stored in the storage space.
Preferably, the above method further includes:
receiving the storage addresses of the target data block and the calculation result data block sent by the CPU; and
sending the target data block and the calculation result data block based on the storage addresses.
Preferably, in the above method, calculating the target data block to obtain a calculation result data block includes:
performing parity calculation on the target data block to obtain a check data block.
The present application further provides a network card, including:
one or more calculators; and
a memory for storing one or more applications and data generated when the one or more applications run,
wherein the one or more applications, when executed by the one or more calculators, cause the one or more calculators to perform the above method. The present application further provides an electronic device, including:
a CPU; and
the above network card.
As can be seen from the above solutions, the computing control method, network card, and electronic device provided by the present application obtain a calculation result data block by sending a target data block that requires data calculation to a device other than the CPU, such as a network card, for calculation. In the present application, the data calculation task is handed over to a device other than the CPU, so the CPU's computing resources are no longer occupied, which reduces data input/output latency and achieves the purpose of the present application.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application more clearly, the accompanying drawings needed in the description of the embodiments are briefly introduced below. Obviously, the accompanying drawings described below are merely some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is an implementation flowchart of a computing control method according to an embodiment of the present application;
FIG. 2 is another flowchart of a computing control method according to an embodiment of the present application;
FIG. 3 is a partial flowchart of a computing control method according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a network card of an electronic device according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
FIG. 6 and FIG. 7 are diagrams of an application example of an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are merely some rather than all of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
The terms used herein are intended only to describe specific embodiments and are not intended to limit the present disclosure. The terms "include", "comprise", and the like used herein indicate the presence of the stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the meanings commonly understood by those skilled in the art, unless otherwise defined. It should be noted that the terms used herein should be interpreted as having meanings consistent with the context of this specification, and should not be interpreted in an idealized or overly rigid manner.
Where an expression such as "at least one of A, B, and C" is used, it should generally be interpreted in the sense in which those skilled in the art commonly understand the expression (for example, "a system having at least one of A, B, and C" includes, but is not limited to, systems having A alone, B alone, C alone, A and B, A and C, B and C, and/or A, B, and C, etc.). Where an expression such as "at least one of A, B, or C" is used, it should likewise be interpreted in the sense in which those skilled in the art commonly understand the expression (for example, "a system having at least one of A, B, or C" includes, but is not limited to, systems having A alone, B alone, C alone, A and B, A and C, B and C, and/or A, B, and C, etc.). Those skilled in the art should further understand that essentially any disjunctive word and/or phrase presenting two or more alternative items, whether in the specification, the claims, or the drawings, should be understood as contemplating the possibility of including one of the items, either of the items, or both items. For example, the phrase "A or B" should be understood as including the possibility of "A", "B", or "A and B".
Some block diagrams and/or flowcharts are shown in the accompanying drawings. It should be understood that some blocks in the block diagrams and/or flowcharts, or combinations thereof, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, so that the instructions, when executed by the processor, create means for implementing the functions/operations described in these block diagrams and/or flowcharts.
Therefore, the techniques of the present disclosure can be implemented in the form of hardware and/or software (including firmware, microcode, etc.). In addition, the techniques of the present disclosure can take the form of a computer program product on a computer-readable medium storing instructions, the computer program product being usable by, or in connection with, an instruction execution system. In the context of the present disclosure, a computer-readable medium can be any medium that can contain, store, communicate, propagate, or transport instructions. For example, the computer-readable medium can include, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. Specific examples of the computer-readable medium include: magnetic storage devices such as magnetic tape or hard disks (HDD); optical storage devices such as optical discs (CD-ROM); memories such as random access memory (RAM) or flash memory; and/or wired/wireless communication links.
Referring to FIG. 1, which is an implementation flowchart of a computing control method according to an embodiment of the present application, the method is applied to a network card (Network Interface Card, NIC) or another device capable of performing data calculation.
In this embodiment, the computing control method may include the following steps:
Step 101: Receive a target data block sent by the CPU.
The target data block can be obtained by the CPU when distributed data storage is required. The CPU then sends the target data block to the device capable of performing data calculation in this embodiment; for example, the CPU can send the target data to the network card.
The target data block received by the network card may contain a plurality of data blocks that have already been divided, for example D1, D2, D3, D4, and D5; alternatively, the target data block may contain data that has not been divided into blocks.
Step 102: Calculate the target data block to obtain a calculation result data block.
If the target data block is a combination of data blocks already divided by the CPU, then in this embodiment, after receiving the target data block, the network card directly performs data calculation, such as parity calculation, on the divided blocks to obtain a calculation result data block, for example a check data block.
Alternatively, if the target data block is data that has not been divided, then in this embodiment, after receiving the target data block, the network card first divides it into blocks, for example into the five blocks D1, D2, D3, D4, and D5, and then performs data calculation, such as parity calculation, on these blocks to obtain a calculation result data block, for example a check data block.
In this embodiment, when calculating the target data block, the network card may perform parity calculation on the target data block, thereby obtaining a calculation result data block that includes a check data block.
In other words, the solution in this embodiment can be implemented by a network card device connected to the CPU. When the CPU needs to store a target data block, it conventionally performs parity calculation on the target data block itself and sends the resulting check data, together with the target data block, through the network card to the storage location, such as a cloud disk or a server. In this embodiment, the parity calculation on the target data block is instead performed by the network card that transmits the data; no parity calculation needs to be performed on the CPU, CPU resources such as CPU memory need not be occupied heavily or for long periods, and the input/output rate of the CPU is improved.
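To make the offloaded calculation concrete, the following is a minimal sketch, in C, of a single-parity check computed over equally sized data blocks, roughly what a NIC-side calculator would do in place of the CPU. The application does not prescribe a particular erasure code; plain XOR parity (one check block) is shown only as an illustration, and codes such as Reed-Solomon generalize it to several check blocks (for example P1, P2, and P3). The function name and block layout are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch: XOR parity over n equally sized data blocks.
 * blocks[i] points to the i-th data block (e.g., D1..D5), each blk_len
 * bytes long; the single parity block is written into the caller's
 * buffer. A real erasure code (e.g., Reed-Solomon) would produce
 * several independent check blocks instead of one. */
static void compute_xor_parity(const uint8_t *const blocks[], size_t n,
                               size_t blk_len, uint8_t *parity)
{
    for (size_t i = 0; i < blk_len; i++) {
        uint8_t p = 0;
        for (size_t b = 0; b < n; b++)
            p ^= blocks[b][i];
        parity[i] = p;
    }
}
```

With such a routine, losing any single data block leaves it recoverable by XOR-ing the remaining blocks with the parity block; recovering from multiple lost blocks is what motivates the stronger Erasure Code schemes the application refers to.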
As can be seen from the above solution, the computing control method provided by this embodiment of the present application obtains a calculation result data block by sending a target data block that requires data calculation to a device other than the CPU, such as a network card, for calculation. In the present application, the data calculation task is handed over to a device other than the CPU, so the CPU's computing resources are no longer occupied, which reduces data input/output latency and achieves the purpose of this embodiment.
In one implementation, the device that performs data calculation in this embodiment, such as the network card, may preset a storage space. The storage space is set on a device other than the CPU; that is, it is a storage space distinct from the CPU's memory space. For example, it may be a storage space on the network card, or a storage space on another device that the network card can access. There is an address mapping between this storage space and the memory space of the CPU.
The storage space can operate in a queue mode. For example, the storage space can be a circular-queue data storage structure, and when storing data it can follow the first-in, first-out storage rule of a circular queue.
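As an illustration of that circular-queue behavior, the sketch below shows a minimal first-in, first-out ring buffer of check-block slots such as the network card might keep while calculation result data blocks wait to be sent. The slot size, capacity, and field names are assumptions made for the example; the application only requires FIFO circular-queue storage.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS  8        /* illustrative capacity               */
#define SLOT_BYTES  4096     /* illustrative check-block size       */

struct ring_buf {
    uint8_t slot[RING_SLOTS][SLOT_BYTES];
    size_t  head;            /* next slot to dequeue (oldest entry) */
    size_t  tail;            /* next slot to enqueue                */
    size_t  count;           /* number of occupied slots            */
};

/* Enqueue one check block; fails if the ring is full. */
static bool ring_put(struct ring_buf *r, const uint8_t *blk, size_t len)
{
    if (r->count == RING_SLOTS || len > SLOT_BYTES)
        return false;
    memcpy(r->slot[r->tail], blk, len);
    r->tail = (r->tail + 1) % RING_SLOTS;
    r->count++;
    return true;
}

/* Dequeue the oldest check block (first in, first out). */
static bool ring_get(struct ring_buf *r, uint8_t *out, size_t len)
{
    if (r->count == 0 || len > SLOT_BYTES)
        return false;
    memcpy(out, r->slot[r->head], len);
    r->head = (r->head + 1) % RING_SLOTS;
    r->count--;
    return true;
}
```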
The address mapping between the preset storage space on the network card and the memory space of the CPU can be established through MMIO. Correspondingly, after the calculation result data block is obtained in this embodiment, as shown in FIG. 2, the following steps may also be included:
Step 103: Store the calculation result data block into the storage space.
The storage space may be the preset storage space described above. After the network card stores the calculation result data block into the storage space, the calculation result data block need not be sent to the CPU for storage, so the memory space of the CPU need not be occupied.
Afterwards, in this embodiment, the network card can obtain the address information of the calculation result data block in the memory space according to the address mapping between the storage space and the memory space of the CPU. For example, if the calculation result data is stored at a first address in the storage space, the network card can obtain, based on the address mapping, the address information in the CPU memory space corresponding to the first address. After step 103, the method may further include the following step:
Step 104: Send the address information corresponding to the calculation result data block to the CPU.
According to this embodiment of the present disclosure, the corresponding address information includes the address information in the CPU memory space, obtained based on the address mapping, corresponding to the storage address at which the calculation result data block is stored in the storage space. For example, the network card side obtains, based on the address mapping, the address information in the CPU memory space corresponding to the storage address in the storage space, and then sends the corresponding address information to the CPU. It can be understood that, in this embodiment of the present disclosure, the address information at which the calculation result data block is stored in the storage space may instead be sent to the CPU, after which the CPU obtains, based on the address mapping, the address information in the CPU memory space corresponding to the storage address in the storage space.
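Once the mapping has been established, the correspondence described here amounts to a fixed offset: a location inside the network card's storage space has exactly one image in the CPU's address space. The sketch below shows that translation under the assumption of a single contiguous mapped window; the structure and helper name are illustrative only and are not defined by the application.

```c
#include <stdint.h>

/* One contiguous window of NIC storage mapped into the CPU address
 * space (e.g., via MMIO). nic_base and cpu_base are the two bases of
 * the same window, so translation is a constant offset within it. */
struct addr_map {
    uint64_t nic_base;   /* start of the window on the NIC side     */
    uint64_t cpu_base;   /* where that window appears to the CPU    */
    uint64_t length;     /* window size in bytes                    */
};

/* Translate a NIC-side storage address (e.g., where a check block was
 * written) into the corresponding CPU-memory-space address. Returns 0
 * if the address lies outside the mapped window. */
static uint64_t nic_to_cpu_addr(const struct addr_map *m, uint64_t nic_addr)
{
    if (nic_addr < m->nic_base || nic_addr >= m->nic_base + m->length)
        return 0;
    return m->cpu_base + (nic_addr - m->nic_base);
}
```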
In a specific implementation, the network card in this embodiment can asynchronously notify the CPU of the address information corresponding to the calculation result data block through a driver.
It should be noted that, after receiving the address information corresponding to the calculation result data block, the CPU can determine the storage addresses of the target data block and the calculation result data block based on that address information, and then send the storage addresses.
The storage addresses determined by the CPU are distinct from the address at which the target data block is stored in the memory space and the address at which the calculation result data block is stored in the storage space. The storage addresses of the target data block and the calculation result data block determined by the CPU here refer to the target addresses at which the target data block and the calculation result data block need to be, or are about to be, stored.
Accordingly, the method in this embodiment may further include the following steps, as shown in FIG. 3:
Step 301: Receive the storage addresses of the target data block and the calculation result data block sent by the CPU, where the storage addresses are the target addresses at which the target data block and the calculation result data block are to be stored.
When sending the storage addresses to a device such as the network card in this embodiment, the CPU can start an interrupt process and then send the storage addresses through the interrupt process.
Step 302: Send the target data block and the calculation result data block based on the storage addresses.
For example, if the target address is X, the device such as the network card in this embodiment can send the target data block received from the CPU, together with the calculation result data block in its storage space, toward the location of X, so that the target data block and the calculation result data block are stored at the storage address X. In this process, the network card does not need to send the calculation result data block back to the CPU for transmission, which reduces the CPU's data input/output pressure and reduces the latency of the CPU's data input and output.
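The point of step 302 is that the check blocks are transmitted directly from the network card's own storage space instead of being bounced back through host memory. One way to picture this is a scatter list that mixes host-resident data blocks with NIC-resident check blocks, all aimed at the same storage target X; the descriptor layout and helper below are hypothetical illustrations, not an interface defined by the application.

```c
#include <stddef.h>
#include <stdint.h>

enum src_kind { SRC_HOST_MEM, SRC_NIC_RING };   /* where the bytes live */

struct send_seg {
    enum src_kind kind;   /* host memory (D1..D5) or NIC ring (P1..P3) */
    uint64_t      addr;   /* source address within that space          */
    size_t        len;    /* segment length in bytes                   */
};

/* Hypothetical walk over the descriptor list; a real NIC would DMA or
 * transmit each segment toward the storage target X. Here the lengths
 * are only summed so the sketch compiles and shows the shape of it. */
static size_t send_to_target(uint64_t target_x, const struct send_seg *segs,
                             size_t nsegs)
{
    size_t total = 0;
    (void)target_x;               /* destination, e.g., the address X  */
    for (size_t i = 0; i < nsegs; i++)
        total += segs[i].len;     /* placeholder for the actual send   */
    return total;
}
```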
Referring to FIG. 4, which is a schematic structural diagram of a network card according to an embodiment of the present application, the network card may be a device with data calculation and storage capabilities, and the network card is connected to the CPU.
In this embodiment, the network card may include a memory 401 and a calculator 402. The network card can perform the methods described above with reference to FIGS. 1 to 3.
Specifically, the calculator 402 may include, for example, a general-purpose microprocessor, an instruction set processor, and/or a related chipset and/or a special-purpose microprocessor (for example, an application-specific integrated circuit (ASIC)), and the like. The calculator 402 may also include onboard memory for caching purposes. The calculator 402 may be a single processing unit or a plurality of processing units for performing the different actions of the method flows according to the embodiments of the present disclosure described with reference to FIGS. 1 to 3.
The memory 401 is used to store one or more applications and data generated by the one or more applications. It can be, for example, any medium that can contain, store, communicate, propagate, or transport instructions. For example, a readable storage medium can include, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. Specific examples of the readable storage medium include: magnetic storage devices such as magnetic tape or hard disks (HDD); optical storage devices such as optical discs (CD-ROM); memories such as random access memory (RAM) or flash memory; and/or wired/wireless communication links.
The memory 401 can include a computer program, which can include code/computer-executable instructions that, when executed by the calculator 402, cause the calculator 402 to perform, for example, the method flows described above with reference to FIGS. 1 to 3 and any variations thereof. The network card in this embodiment includes a component with a data storage function, such as the memory 401; a storage space can be preset in the memory 401, and the preset storage space can be used to store the calculation result data block.
The preset storage space in the memory 401 may be a circular-queue data storage structure with first-in, first-out storage characteristics, used to buffer the calculation result data block.
In addition, the calculator 402 in the network card can be implemented by a Field-Programmable Gate Array (FPGA) computing unit, which performs data calculation on the target data block, such as parity calculation, to obtain a calculation result data block such as a check data block.
After the network card obtains the calculation result data block, it sends the address information corresponding to the calculation result data block in the memory space of the CPU to the CPU. The CPU then determines the storage addresses at which the target data block and the calculation result data block need to be stored and sends them to the network card, and the network card can send the target data block received from the CPU and the calculation result data block in its storage space toward the location of the storage addresses sent by the CPU, so that the target data block and the calculation result data block are stored at those storage addresses.
As can be seen from the above solution, in this embodiment of the present application, the network card carries out the data calculation and the task of storing the calculation result data block, so the computing resources and memory resources of the CPU need not be occupied, which reduces the latency of the CPU's data input and output and achieves the purpose of this embodiment.
Referring to FIG. 5, which is a schematic structural diagram of an electronic device according to an embodiment of the present application, the electronic device may include a CPU 501 and a network card 502.
The CPU 501 can be used to obtain a target data block and send the target data block to the network card 502.
The network card 502 is configured to calculate the target data block sent by the CPU to obtain a calculation result data block.
The network card 502 may include a storage space capable of data storage and a processor or calculator capable of performing data calculation, such as an FPGA computing unit, used to perform data calculation on the target data block, such as parity calculation, to obtain a calculation result data block such as a check data block. In this process, the data calculation task does not need to be performed on the CPU 501 and therefore does not occupy the computing resources and memory resources of the CPU 501.
In addition, after computing the calculation result data block, the network card 502 can store the calculation result data block in its own storage space and does not need to return it to the CPU 501 for storage in memory space, so no CPU memory space resources are occupied.
Afterwards, the network card 502 can send the address information representing the calculation result data block to the CPU 501. The CPU 501 can then determine the storage addresses at which the target data block and the calculation result data block need to be stored and send them to the network card 502. After receiving the storage addresses, the network card 502 sends the target data block received from the CPU and the calculation result data block in the network card 502's own storage space directly to the storage addresses, thereby implementing parity-protected storage of the target data block. In this process, the network card 502 does not need to send the calculation result data block back to the CPU 501 for transmission, which reduces the CPU's data input/output pressure and reduces the latency of the CPU's data input and output.
The above solution is illustrated below with an example:
As shown in FIG. 6, the memory space of the CPU holds a target data block to be stored, which contains five pieces of data: D1, D2, D3, D4, and D5. Parity calculation needs to be performed on them to obtain three check data blocks. Because the data is stored in a distributed manner, it must be sent through the network card; therefore, in the first and second steps, the CPU first sends the data (D1, D2, D3, D4, and D5) to the network card by starting an interrupt routine, and the network card performs the parity calculation in place of the CPU, which is the third step, thereby computing the check data blocks P1, P2, and P3 and storing them in the network card's ring buffer to await transmission, as shown in FIG. 7.
After the network card performs the parity calculation and obtains the check data blocks, it asynchronously notifies the CPU (operating system) of the check data block information (such as the storage addresses and related data block information) through the driver, which is the fourth step.
Afterwards, the CPU (operating system) determines the storage locations (storage addresses) of the target data block and the check data blocks according to the distributed-storage configuration, and then sends the storage addresses to the network card through an interrupt, which is the fifth step.
Finally, based on the storage addresses, the network card sends the target data block and the calculation result data blocks to the storage addresses on the network.
Based on the above example, compared with a conventional network card, the network card used for distributed storage in this embodiment may carry an FPGA gate array programmed to support the Erasure Code algorithm, so that the FPGA chip on the network card can perform the parity data calculation and thereby reduce the computational pressure on the CPU.
Moreover, the distributed-storage network card can have a storage space of volatile random access memory (RAM). After the system starts, this part of the storage space is mapped into the system's memory space through MMIO, and the check data blocks are stored in it through the ring buffer, which avoids the traditional copy of data from memory to the network card, realizes zero copy of the check data, and thereby reduces the latency of data input and output.
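On Linux, one common way software can reach a mapped window like the one just described (a PCIe device's on-board RAM exposed through a BAR) is to mmap the corresponding sysfs resource file. The sketch below is a generic illustration of that idiom under an assumed, hypothetical device path and window size; the application itself only states that the network card's RAM is mapped into system memory space through MMIO and does not prescribe this particular mechanism.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical path: BAR2 of the NIC that exposes its on-board RAM. */
    const char *bar = "/sys/bus/pci/devices/0000:03:00.0/resource2";
    const size_t win_len = 1 << 20;          /* assumed 1 MiB window     */

    int fd = open(bar, O_RDWR | O_SYNC);
    if (fd < 0) { perror("open"); return 1; }

    /* Map the device window into this address space (MMIO mapping). */
    volatile uint8_t *win = mmap(NULL, win_len, PROT_READ | PROT_WRITE,
                                 MAP_SHARED, fd, 0);
    if (win == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    /* The host can now read check blocks the NIC placed in its ring
     * buffer directly through 'win', without first copying them into a
     * separate host buffer (the "zero copy" described above). */
    printf("first mapped byte: 0x%02x\n", (unsigned)win[0]);

    munmap((void *)win, win_len);
    close(fd);
    return 0;
}
```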
It should be noted that the embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts among the embodiments, reference may be made to one another.
Finally, it should also be noted that, herein, relational terms such as first and second are used only to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a list of elements includes not only those elements but also other elements not explicitly listed, or further includes elements inherent to such a process, method, article, or device. In the absence of further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, method, article, or device that includes that element.
The computing control method, network card, and electronic device provided by the present invention have been described in detail above. The above description of the disclosed embodiments enables a person skilled in the art to implement or use the present invention. Various modifications to these embodiments will be obvious to a person skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (11)

  1. A computing control method, applied to a network card, the method comprising:
    receiving a target data block sent by a CPU; and
    calculating the target data block to obtain a calculation result data block.
  2. The method according to claim 1, further comprising:
    presetting a storage space, the storage space having an address mapping with a memory space of the CPU.
  3. The method according to claim 2, wherein the storage space comprises a circular-queue data storage structure.
  4. The method according to claim 2, wherein the storage space having an address mapping with the memory space of the CPU comprises:
    establishing the address mapping between the storage space and the memory space of the CPU through memory-mapped input/output (MMIO).
  5. The method according to claim 2, further comprising:
    storing the calculation result data block into the storage space.
  6. The method according to claim 5, further comprising:
    sending address information corresponding to the calculation result data block to the CPU.
  7. The method according to claim 6, wherein the corresponding address information comprises:
    address information in the memory space of the CPU, obtained based on the address mapping, corresponding to a storage address at which the calculation result data block is stored in the storage space.
  8. The method according to claim 1, further comprising:
    receiving storage addresses of the target data block and the calculation result data block sent by the CPU; and
    sending the target data block and the calculation result data block based on the storage addresses.
  9. The method according to claim 1, wherein calculating the target data block to obtain a calculation result data block comprises:
    performing parity calculation on the target data block to obtain a check data block.
  10. A network card, comprising:
    one or more calculators; and
    a memory for storing one or more applications and data generated when the one or more applications run,
    wherein the one or more applications, when executed by the one or more calculators, cause the one or more calculators to perform the method according to any one of claims 1 to 9.
  11. An electronic device, comprising:
    a CPU; and
    the network card according to claim 10.
PCT/CN2017/106871 2017-06-27 2017-10-19 Computing control method, network card and electronic device WO2019000716A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710500553.7A CN107273213B (zh) 2017-06-27 2017-06-27 Computing control method, network card and electronic device
CN201710500553.7 2017-06-27

Publications (1)

Publication Number Publication Date
WO2019000716A1 true WO2019000716A1 (zh) 2019-01-03

Family

ID=60068827

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/106871 WO2019000716A1 (zh) 2017-06-27 2017-10-19 Computing control method, network card and electronic device

Country Status (2)

Country Link
CN (1) CN107273213B (zh)
WO (1) WO2019000716A1 (zh)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108599907B (zh) * 2018-03-30 2021-03-02 上海兆芯集成电路有限公司 Network interface controller
CN109815029B (zh) * 2019-01-10 2023-03-28 西北工业大学 Method for implementing inter-partition communication in an embedded partitioned operating system
CN110267260B (zh) * 2019-06-17 2022-03-01 Oppo广东移动通信有限公司 Firmware flashing method, apparatus, terminal, and computer-readable storage medium
CN111541783B (zh) 2020-07-08 2020-10-20 支付宝(杭州)信息技术有限公司 Transaction forwarding method and apparatus based on a blockchain all-in-one machine
CN111539829B (zh) 2020-07-08 2020-12-29 支付宝(杭州)信息技术有限公司 Method and apparatus for identifying transactions to be filtered based on a blockchain all-in-one machine
CN111541789A (zh) 2020-07-08 2020-08-14 支付宝(杭州)信息技术有限公司 Data synchronization method and apparatus based on a blockchain all-in-one machine
CN111541726B (zh) * 2020-07-08 2021-05-18 支付宝(杭州)信息技术有限公司 Replay transaction identification method and apparatus based on a blockchain all-in-one machine
CN113726875B (zh) 2020-07-08 2024-06-21 支付宝(杭州)信息技术有限公司 Transaction processing method and apparatus based on a blockchain all-in-one machine
CN113296718B (zh) * 2021-07-27 2022-01-04 阿里云计算有限公司 Data processing method and apparatus

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7389398B2 (en) * 2005-12-14 2008-06-17 Intel Corporation Methods and apparatus for data transfer between partitions in a computer system
CN100342353C (zh) * 2006-04-07 2007-10-10 浙江大学 Method for implementing process mapping in an embedded operating system
CN101873337A (zh) * 2009-04-22 2010-10-27 电子科技大学 Zero-copy data capture technique based on an RT8169 gigabit network card and the Linux operating system
CN101902504A (zh) * 2009-05-27 2010-12-01 北京神州飞航科技有限责任公司 Avionics full-duplex switched Ethernet network card and integration method thereof
CN101729202B (zh) * 2009-12-21 2013-04-03 杭州合众信息技术股份有限公司 Device and method for reliable purely one-way data transmission based on optical splitting
CN102866935B (zh) * 2011-07-07 2014-11-12 北京飞杰信息技术有限公司 iSCSI-based instant replication method and storage system
CN102291408B (zh) * 2011-08-15 2014-03-26 华为数字技术(成都)有限公司 Method and apparatus for processing iSCSI protocol packets
CN102402473A (zh) * 2011-10-28 2012-04-04 武汉供电公司变电检修中心 Computer hardware and software fault diagnosis and repair system
CN102523164B (zh) * 2011-12-19 2015-09-23 曙光信息产业(北京)有限公司 System for implementing complex same-source, same-destination traffic splitting in a network card
CN106302201A (zh) * 2015-05-14 2017-01-04 华为技术有限公司 Flow control method, device and system
CN105868121B (zh) * 2016-03-28 2019-05-17 北京联想核芯科技有限公司 Information processing method and electronic device

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101911613A (zh) * 2008-01-10 2010-12-08 住友电工网络株式会社 Network card and information processing apparatus
CN102523205A (zh) * 2011-12-05 2012-06-27 中兴通讯股份有限公司 Method and apparatus for determining a content checksum
CN102541803A (zh) * 2011-12-31 2012-07-04 曙光信息产业股份有限公司 Data transmission method and computer
CN103946828A (zh) * 2013-10-29 2014-07-23 华为技术有限公司 Data processing system and data processing method
US20160261526A1 (en) * 2015-03-03 2016-09-08 Fujitsu Limited Communication apparatus and processor allocation method for the same

Also Published As

Publication number Publication date
CN107273213B (zh) 2024-04-19
CN107273213A (zh) 2017-10-20

Similar Documents

Publication Publication Date Title
WO2019000716A1 (zh) 一种计算控制方法、网卡及电子设备
US10896086B2 (en) Maximizing use of storage in a data replication environment
US8904061B1 (en) Managing storage operations in a server cache
US10031808B2 (en) Raid 10 reads optimized for solid state drives
US10824574B2 (en) Multi-port storage device multi-socket memory access system
WO2015192685A1 (zh) 一种存储数据的方法及网络接口卡
US20210255924A1 (en) Raid storage-device-assisted deferred parity data update system
US11494266B2 (en) Raid storage-device-assisted parity update data storage system
US20110282963A1 (en) Storage device and method of controlling storage device
US10936420B1 (en) RAID storage-device-assisted deferred Q data determination system
CN110168513B (zh) 在不同存储***中对大文件的部分存储
US11163501B2 (en) Raid storage multi-step command system
US9760577B2 (en) Write-behind caching in distributed file systems
US11340989B2 (en) RAID storage-device-assisted unavailable primary data/Q data rebuild system
US11093175B1 (en) Raid data storage device direct communication system
US11429573B2 (en) Data deduplication system
JP2019525349A (ja) コンピューティングデバイスにおける、外部で管理される入出力のスタベーションの回避
US20240012585A1 (en) Drive-assisted raid system
US10776033B2 (en) Repurposable buffers for target port processing of a data transfer
US11327683B2 (en) RAID storage-device-assisted read-modify-write system
US10891244B2 (en) Method and apparatus for redundant array of independent drives parity quality of service improvements
US9697059B2 (en) Virtualized communication sockets for multi-flow access to message channel infrastructure within CPU
US20210294496A1 (en) Data mirroring system
US20210096766A1 (en) Data-transfer-based raid data update system
CN114745438B (zh) 多数据中心的缓存数据处理方法、装置、***和电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17916086

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 14/07/2020)

122 Ep: pct application non-entry in european phase

Ref document number: 17916086

Country of ref document: EP

Kind code of ref document: A1