CN111490946A - FPGA connection implementation method and device based on OpenCL framework

Publication number
CN111490946A
Authority
CN
China
Prior art keywords
fpga
memory
memory object
module
connection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910083309.4A
Other languages
Chinese (zh)
Other versions
CN111490946B (en)
Inventor
蒋佳立
龙欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201910083309.4A
Publication of CN111490946A
Application granted
Publication of CN111490946B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00: Packet switching elements
    • H04L49/10: Packet switching elements characterised by the switching fabric construction
    • H04L49/102: Packet switching elements characterised by the switching fabric construction using shared medium, e.g. bus or ring
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L49/00: Packet switching elements
    • H04L49/90: Buffering arrangements
    • H04L49/9084: Reactions to storage capacity overflow
    • H04L49/9089: Reactions to storage capacity overflow replacing packets in a storage arrangement, e.g. pushout
    • H04L49/9094: Arrangements for simultaneous transmit and receive, e.g. simultaneous reading/writing from/to the storage element
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Stored Programmes (AREA)

Abstract

The invention provides an FPGA connection implementation method based on the OpenCL framework. By establishing a direct FPGA connection channel among a plurality of FPGAs, the kernel of one FPGA can directly access the DDR memory of another FPGA. This solves the prior-art problem that data calls among a plurality of FPGAs must pass through a server, enables more efficient and flexible data exchange between chips, reduces data-exchange latency, and improves data transmission efficiency. The method is also well compatible with the original OpenCL framework, so a user does not need to specially port the design.

Description

FPGA connection implementation method and device based on OpenCL framework
Technical Field
The invention relates to the field of data transmission, and in particular to an FPGA connection implementation method and device based on the OpenCL framework.
Background
In OpenCL-based field programmable gate array (FPGA) acceleration services in data centers, when a user's application involves multiple cards and the OpenCL kernel in the current card needs to use the output of the OpenCL kernel in another card as its input, that is, when data is called among a plurality of FPGAs, the prior art can only transfer the data through a server.
However, in many OpenCL-based FPGA acceleration scenarios, exchanging data through a server brings only higher latency and lower throughput, offers no benefit, and results in a poor user experience.
Disclosure of Invention
In view of this, an object of the present invention is to provide an FPGA connection implementation method based on the OpenCL framework, so as to solve the prior-art problem that data calls among multiple FPGAs must go through a server, which results in higher latency and lower throughput. By establishing an FPGA connection channel, data can be called directly among multiple FPGAs, which provides higher bandwidth and more links for multi-chip communication under the OpenCL framework of the FPGA, effectively reduces transmission latency, improves transmission efficiency, and improves user experience.
In order to solve the above technical problems, the proposed solution is as follows:
An FPGA connection implementation method based on the OpenCL framework comprises the following steps:
creating a first memory object in a first FPGA;
creating an FPGA connecting channel;
and responding to a request of a second FPGA for accessing the first memory object, and accessing the first memory object in the first FPGA through the FPGA connecting channel.
Preferably, the creating a first memory object in the first FPGA specifically includes:
and allocating a space in the off-chip memory of the first FPGA, and associating the space information into the first memory object.
Preferably, the memory space information includes: the FPGA connection channel, the off-chip memory address offset, and the memory size information.
Preferably, the creating of the FPGA connecting channel specifically includes:
calling an Application Program Interface (API), and inputting the associated first memory object and the associated FPGA connecting channel in the API;
and associating the spatial information of the first memory object and the FPGA connecting channel in a second memory object of the second FPGA.
Preferably, accessing the first memory object in the first FPGA through the FPGA connection channel in response to the request of the second FPGA for accessing the first memory object specifically includes:
the kernel of the second FPGA judges the position of the first memory object;
if the first memory object is not in the off-chip memory of the second FPGA, calculating the relative address of the first memory space;
and the kernel initiates access to the first memory object according to the relative address.
Preferably, a local relative address is calculated according to the spatial information of the first memory object, and the calculation formula of the relative address is as follows: relative address = (FPGA connection channel number + 1) × total size of off-chip memory + off-chip memory address offset.
Preferably, the first FPGA and/or the second FPGA each comprise at least one Serdes module, and the FPGA connection realizes the direct connection between the FPGAs through the Serdes modules.
An FPGA connection implementation device based on the OpenCL framework comprises:
the memory creating module is used for creating a first memory object in the first FPGA;
the path creation module is used for creating an FPGA connecting channel;
and the transmission control module responds to a request of a second FPGA for accessing the first memory object and accesses the first memory object in the first FPGA through the FPGA connecting channel.
Preferably, the memory creating module creates a first memory object in the first FPGA, and specifically includes:
and allocating a space in the off-chip memory of the first FPGA, and associating the space information into the first memory object.
Preferably, the memory space information includes: the FPGA connection channel, the off-chip memory address offset, and the memory size information.
Preferably, the path creating module specifically includes:
the API calling module is used for calling an Application Program Interface (API), and the associated first memory object and the associated FPGA connecting channel are input into the API;
and the information association module is used for associating the spatial information of the first memory object and the FPGA connecting channel in a second memory object of a second FPGA.
Preferably, the transmission control module specifically includes:
the judging module is used for judging the position of the first memory object by the kernel of the second FPGA;
a calculating module, configured to calculate a relative address of the first memory space if the first memory object is not in an off-chip memory of the second FPGA;
and the connection module is used for initiating the access to the first memory object by the kernel according to the relative address.
Preferably, a local relative address is calculated according to the spatial information of the first memory object, and the calculation formula of the relative address is as follows: relative address = (FPGA connection channel number + 1) × total size of off-chip memory + off-chip memory address offset.
Preferably, the first FPGA and/or the second FPGA each include at least one serdes module, and the FPGA connection realizes direct connection between the FPGAs through the serdes modules.
An FPGA cell, comprising:
one or more kernels, a PCIe controller, a data interconnect bus, a DDR controller, a conversion module, and a Serdes module;
the PCIe controller is used for connecting the FPGA with a server;
the kernel is used for controlling a program running on the FPGA;
the data interconnection bus is used for connecting the data bus on the kernel to corresponding equipment and routing the request through an address on the data bus;
and the DDR controller is used for controlling the off-chip memory of the FPGA.
The conversion module is used for translating the memory mapping bus into a serdes bus;
and the Serdes module is used for transmitting data between the two FPGA cards.
Preferably, the FPGA unit includes one or more serdes modules, and a plurality of FPGAs can be simultaneously interconnected through the serdes modules.
An FPGA connection implementation apparatus based on the OpenCL framework includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the OpenCL-framework-based FPGA connection implementation method when executing the computer program.
A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the OpenCL-framework-based FPGA connection implementation method.
According to the above technical solution, the FPGA connection implementation method based on the OpenCL framework provided by the embodiments of the present application establishes direct FPGA connection channels among a plurality of FPGAs through Serdes interfaces, so that the kernel of one FPGA can directly access the DDR memory of another FPGA. This enables more efficient and flexible data exchange between chips, simplifies the data transmission flow, and improves data transmission efficiency. The method is also well compatible with the original OpenCL framework, so a user does not need to specially port the design.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a diagram comparing the FPGA connection implementation method based on the OpenCL framework of the present invention with the prior art.
Fig. 2 is a logical block diagram of an FPGA cell of the present invention.
Fig. 3 is a technical framework diagram of an FPGA connection implementation method based on the OpenCL framework according to the present invention.
Fig. 4 is a flowchart of an FPGA connection implementation method based on the OpenCL framework according to the present invention.
Fig. 5 is a second flowchart of an FPGA connection implementation method based on the OpenCL framework according to the present invention.
Fig. 6 is a schematic structural diagram of an FPGA connection implementation apparatus based on the OpenCL framework according to the present invention.
Fig. 7 is a second schematic structural diagram of an FPGA connection implementation apparatus based on the OpenCL framework according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The FPGA connection implementation method based on the OpenCL framework is suitable for direct communication among a plurality of FPGAs under the OpenCL framework.
OpenCL stands for Open Computing Language, a framework for writing programs for heterogeneous platforms, which may be composed of CPUs, GPUs, or other types of processors. Memory objects in OpenCL are generally classified into buffers and images; the buffer is a memory object defined in the OpenCL standard and is the type generally used in FPGA-based development.
OpenCL consists of a language for writing kernels (functions that run on OpenCL devices) and a set of APIs for defining and controlling the platform.
As shown in the upper half of FIG. 1, the system comprises a server, a plurality of FPGA units, and a PCIe switch based on PCI-Express, the high-speed serial computer expansion bus standard. When one FPGA is to transmit data to another FPGA in the system, the data sent by the first FPGA is conventionally transmitted to the server through the PCIe switch, and the CPU of the server then forwards the data to the other FPGA, thereby completing the data transmission and exchange. In this approach the transmission path is long and the data must be forwarded through the host, so the transmission latency is high and the transmission efficiency is low.
As shown in the lower half of fig. 1, in the method of the present invention, when one FPGA is to transmit data to another FPGA in the system, the FPGA connection between the two FPGAs is directly established to perform data exchange, and a PCIe switch and a server are not required, thereby shortening a transmission path, reducing transmission delay, and improving transmission efficiency.
FIG. 2 shows the logical structural framework inside the FPGA. One FPGA unit at least comprises the following functional modules: one or more kernels, a PCIe controller, a data interconnect bus, a DDR controller, a conversion module, and a Serdes module.
The PCIe controller is used for connecting the FPGA card with a CPU of the server, and the FPGA card is used as PCIe terminal equipment.
Like a CUDA program, an OpenCL program is divided into two parts: one part runs on the server (the host) and the other part, the kernel, runs on the device.
The data interconnection bus is used for connecting the data bus on the kernel to the corresponding equipment and routing each request according to the address on the data bus. As shown in the figure, it connects to the DDR controller and to the conversion module of the Serdes.
And the DDR controller is used for controlling the FPGA off-chip memory.
The conversion module is used for translating the memory mapping bus into the serdes bus.
SERDES is short for SERializer/DESerializer. The Serdes module is the SERDES controller; the interconnection cable between two FPGA cards is plugged into it, and it is used for transmitting data between the two FPGA cards. SERDES is the physical medium that connects the FPGAs.
The FPGA unit can comprise one or more serdes modules, and a plurality of FPGAs can be simultaneously interconnected through the plurality of serdes modules.
Specifically, the server may control the kernel through a register channel of the PCIe controller (path 1 -> path 0 in the figure).
The server can also access the global memory of the board card through the DMA channel of the PCIe controller (path 1 -> path 2 -> path 4).
The global memory is an on-board storage resource of the FPGA board, usually DDR memory, and the memory object is created in the designated global memory.
The kernel can directly access the Serdes module (path 3 -> path 7 -> path 8) and can also access the global memory (DDR) of the board card (path 3 -> path 4).
The Serdes module can access the global memory (DDR) of the board card (path 8 -> path 5).
With reference to FIG. 3 to FIG. 5, the method for implementing FPGA connection based on the OpenCL framework provided by the present invention specifically includes:
step 101, a first memory object is created in a first FPGA.
In step 1011, a first memory space of the first FPGA is initialized.
When creating an OpenCL memory object, a memory object called DATA, that is, the first memory object, is first created on the context of the first FPGA through clCreateBuffer.
At step 1012, the spatial information of the first memory object is associated.
The hardware adaptation layer of OpenCL allocates a space in the off-chip memory of the first FPGA, and then associates information of the space, including an off-chip memory address offset, a memory size, and the like, with the first memory object.
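Step 1012 can be pictured with a minimal sketch, assuming a simple bump allocator stands in for the hardware adaptation layer. The names `MemObject`, `OffChipMemory`, and `create_buffer`, as well as the 4 GiB DDR size, are hypothetical and are not part of the OpenCL API or of the patent text; only the recorded fields (address offset and memory size) come from the description above.

```python
from dataclasses import dataclass


@dataclass
class MemObject:
    """Hypothetical record of the space information tied to a memory object."""
    name: str
    offset: int  # off-chip memory address offset, in bytes
    size: int    # memory size, in bytes


class OffChipMemory:
    """Toy bump allocator standing in for the hardware adaptation layer."""

    def __init__(self, total_size: int):
        self.total_size = total_size
        self._next_free = 0

    def create_buffer(self, name: str, size: int) -> MemObject:
        # Reserve `size` bytes in the DDR and associate the space
        # information with the newly created memory object.
        if size <= 0 or self._next_free + size > self.total_size:
            raise MemoryError("not enough off-chip memory")
        obj = MemObject(name, self._next_free, size)
        self._next_free += size
        return obj


ddr_fpga1 = OffChipMemory(total_size=4 * 1024**3)  # assumed 4 GiB of DDR
scratch = ddr_fpga1.create_buffer("SCRATCH", 4096)
data = ddr_fpga1.create_buffer("DATA", 1024 * 1024)
print(data)
```

A real hardware adaptation layer would also handle alignment and freeing; the sketch only shows that the offset and size end up stored alongside the memory object.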
Step 102, an FPGA connecting channel is created.
Step 1021, call API function.
An application program interface (API) function for mapping is implemented; through this API, OpenCL is informed that the first memory object will be accessed by another FPGA through the FPGA connection.
Step 1022, associate the spatial information of the first memory object and the FPGA connection channel in the second memory object of the second FPGA.
The API takes as input the memory object to be associated and the FPGA connection channel to be associated, and returns a memory object, i.e., the second memory object. At this point, the second memory object is associated with the information of the memory object DATA, which specifically includes the FPGA connection channel, the off-chip memory address offset, and the memory size information.
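The behaviour of steps 1021 and 1022 can be sketched as follows. This is only an illustration under assumptions: the patent does not name the mapping API, so `map_remote_buffer` and the `MemObject` record are hypothetical names, and a `channel` value of `None` is an assumed convention for "local to this FPGA".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MemObject:
    """Hypothetical record of a memory object's space information."""
    name: str
    offset: int                    # off-chip memory address offset, in bytes
    size: int                      # memory size, in bytes
    channel: Optional[int] = None  # FPGA connection channel; None means local


def map_remote_buffer(remote: MemObject, channel: int) -> MemObject:
    """Hypothetical mapping API.

    Takes the first memory object and an FPGA connection channel, and
    returns a second memory object that carries the same offset and size
    plus the channel through which the data is reached.
    """
    if channel < 0:
        raise ValueError("channel number must be non-negative")
    return MemObject(remote.name, remote.offset, remote.size, channel)


# First memory object, as created on the first FPGA.
data = MemObject("DATA", offset=0x1000, size=1024 * 1024)

# Second memory object, as seen from the second FPGA over channel 0.
data_on_fpga2 = map_remote_buffer(data, channel=0)
print(data_on_fpga2)
```

The second memory object holds no data of its own; it is a descriptor that lets the second FPGA later work out where DATA lives.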
Step 103, responding to a request for accessing the first memory object by the second FPGA, and accessing the first memory object in the first FPGA through the FPGA connection.
Step 1031, the kernel of the second FPGA judges the position of the first memory object.
When the kernel of the second FPGA configures kernel parameters, firstly, whether the memory object needing to be called is located in the storage medium of the second FPGA is judged.
Step 1032, if the first memory object is not in the second FPGA, calculate its relative address.
If the memory object (the first memory object associated with the second memory object) is not local, i.e. not stored in the off-chip memory DDR of the second FPGA, the hardware adaptation layer of OpenCL calculates the local relative address according to the FPGA connection channel and the off-chip memory address offset.
Relative address = (FPGA connection channel number + 1) × total size of off-chip memory + off-chip memory address offset.
The off-chip memory is a global memory or a DDR memory.
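As a worked instance of the relative address calculation in step 1032, the sketch below evaluates it for one case. The 4 GiB DDR size, the offset 0x1000, and the function name `relative_address` are assumptions chosen for illustration; the sketch also assumes that both FPGAs have off-chip memory of the same total size, which the text does not state explicitly.

```python
def relative_address(channel: int, ddr_total_size: int, offset: int) -> int:
    """Relative address = (channel number + 1) * total DDR size + offset."""
    if channel < 0:
        raise ValueError("channel number must be non-negative")
    if not 0 <= offset < ddr_total_size:
        raise ValueError("offset must lie inside the off-chip memory")
    return (channel + 1) * ddr_total_size + offset


DDR_SIZE = 4 * 1024**3  # assumed: 4 GiB of off-chip memory per FPGA

addr = relative_address(channel=0, ddr_total_size=DDR_SIZE, offset=0x1000)
print(hex(addr))  # 0x100001000
```

Under this reading, addresses below the total DDR size refer to the local off-chip memory, and each connection channel occupies its own DDR-sized window above it, so the "+ 1" keeps remote windows from overlapping the local one.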
In step 1033, the kernel initiates access to the first memory object according to the relative address.
Specifically, when the kernel of the second FPGA initiates access by using the relative address, the bus interconnection module routes the access to the corresponding conversion module, and the conversion module converts the memory-mapped access into a Serdes access.
The memory mapping bus is a memory-mapped bus through which the active side can actively access the address space of the passive side.
The conversion module of the first FPGA converts the access received by the serdes into a memory mapping request, and accesses a first memory object in a corresponding global memory (DDR).
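The routing decision in step 1033 can be sketched as the inverse of the relative address calculation. The patent does not spell out how the interconnect decodes an address, so the function `route`, its return format, and the window layout (window 0 for local DDR, window k + 1 for connection channel k) are assumptions made for illustration.

```python
def route(address: int, ddr_total_size: int, num_channels: int):
    """Decide where the interconnect should send an access.

    Returns ("ddr", offset) for the local off-chip memory, or
    ("serdes", channel, offset) for an access forwarded to the
    conversion module of the given FPGA connection channel.
    """
    if address < 0:
        raise ValueError("address must be non-negative")
    window, offset = divmod(address, ddr_total_size)
    if window == 0:
        return ("ddr", offset)
    channel = window - 1
    if channel >= num_channels:
        raise ValueError("address outside every mapped window")
    return ("serdes", channel, offset)


DDR_SIZE = 4 * 1024**3  # assumed: 4 GiB of off-chip memory per FPGA

local_hit = route(0x1000, DDR_SIZE, num_channels=2)
remote_hit = route(0x100001000, DDR_SIZE, num_channels=2)
print(local_hit, remote_hit)
```

On the receiving side, the recovered offset is what the conversion module of the first FPGA would present to its own DDR controller as a memory-mapped request.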
Based on the same concept as the OpenCL-framework-based FPGA connection implementation method provided by the invention, the invention also provides an FPGA connection implementation device based on the OpenCL framework. As shown in FIGS. 6 and 7, the device comprises a memory creation module 100, a path creation module 200, and a transmission control module 300, wherein:
a memory creation module 100, configured to create a first memory object in a first FPGA.
A space is allocated in the off-chip memory of the first FPGA, and the space information is associated with the first memory object.
A path creating module 200 is used for creating an FPGA connecting channel. It specifically includes:
and the API calling module 201 is used for calling an Application Program Interface (API) and inputting the associated first memory object and the associated FPGA connecting channel in the API.
The information association module 202 is configured to associate, in a second memory object of a second FPGA, spatial information of the first memory object and the FPGA connection channel.
The transmission control module 300 responds to a request for accessing the first memory object by the second FPGA, and accesses the first memory object in the first FPGA through the FPGA connection. It specifically includes:
the judging module 301 is configured to judge, by the kernel of the second FPGA, a position of the first memory object;
a calculating module 302, configured to calculate a relative address of the first memory space if the first memory object is not in the off-chip memory of the second FPGA;
the connection module 303 is configured to initiate, by the kernel, access to the first memory object according to the relative address.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the device-like embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of methods, apparatus, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart and block diagrams may represent a module, segment, or portion of code, which comprises one or more computer-executable instructions for implementing the logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. It will also be noted that each block of the block diagrams and flowchart illustrations, and combinations of blocks in the block diagrams and flowchart illustrations, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Claims (18)

1. An FPGA connection implementation method based on the OpenCL framework, characterized by comprising the following steps:
creating a first memory object in a first FPGA;
creating an FPGA connecting channel;
and responding to a request of a second FPGA for accessing the first memory object, and accessing the first memory object in the first FPGA through the FPGA connecting channel.
2. The method of claim 1, wherein: the creating a first memory object in the first FPGA specifically includes:
and allocating a space in the off-chip memory of the first FPGA, and associating the space information into the first memory object.
3. The method of claim 2, wherein: the memory space information includes: the FPGA connection channel, the off-chip memory address offset, and the memory size information.
4. The method of claim 1, wherein: the creating of the FPGA connection channel specifically includes:
calling an Application Program Interface (API), and inputting the associated first memory object and the associated FPGA connecting channel in the API;
and associating the spatial information of the first memory object and the FPGA connecting channel in a second memory object of the second FPGA.
5. The method of claim 1, wherein: the accessing the first memory object in the first FPGA through the FPGA connection in response to the request for accessing the first memory object by the second FPGA specifically includes:
the kernel of the second FPGA judges the position of the first memory object;
if the first memory object is not in the off-chip memory of the second FPGA, calculating the relative address of the first memory space;
and the kernel initiates access to the first memory object according to the relative address.
6. The method of claim 5, wherein: according to the spatial information of the first memory object, a local relative address is calculated, wherein the calculation formula of the relative address is as follows: relative address = (FPGA connection channel number + 1) × total size of off-chip memory + off-chip memory address offset.
7. The method according to any one of claims 1-6, wherein: the first FPGA and/or the second FPGA respectively comprise at least one serdes module, and the FPGA connection realizes the direct connection between the FPGAs through the serdes modules.
8. An FPGA connection implementation device based on the OpenCL framework, the device comprising:
the memory creating module is used for creating a first memory object in the first FPGA;
the path creation module is used for creating an FPGA connecting channel;
and the transmission control module responds to a request of a second FPGA for accessing the first memory object and accesses the first memory object in the first FPGA through the FPGA connecting channel.
9. The apparatus of claim 8, wherein: the memory creation module creates a first memory object in a first FPGA, and specifically includes:
and allocating a space in the off-chip memory of the first FPGA, and associating the space information into the first memory object.
10. The apparatus of claim 9, wherein: the memory space information includes: the FPGA connection channel, the off-chip memory address offset, and the memory size information.
11. The apparatus of claim 8, wherein: the path creation module specifically includes:
the API calling module is used for calling an Application Program Interface (API), and the associated first memory object and the associated FPGA connecting channel are input into the API;
and the information association module is used for associating the spatial information of the first memory object and the FPGA connecting channel in a second memory object of a second FPGA.
12. The apparatus of claim 8, wherein the transmission control module specifically includes:
a judging module, configured for the kernel of the second FPGA to determine the location of the first memory object;
a calculating module, configured to calculate a relative address of the first memory space if the first memory object is not in the off-chip memory of the second FPGA;
and a connection module, configured for the kernel to initiate access to the first memory object according to the relative address.
13. The apparatus of claim 12, wherein a local relative address is calculated according to the space information of the first memory object, using the formula: relative address = (FPGA connection channel number + 1) × total size of off-chip memory + off-chip memory address offset.
14. The apparatus according to any one of claims 8-13, wherein the first FPGA and/or the second FPGA each comprise at least one SerDes module, and the FPGA connection implements a direct connection between the FPGAs through the SerDes modules.
15. An FPGA unit, comprising:
one or more kernels, a PCIe controller, a data interconnect bus, a DDR controller, a conversion module, and a SerDes module;
the PCIe controller is used for connecting the FPGA to a server;
the kernel is used for controlling a program running on the FPGA;
the data interconnect bus is used for connecting the data bus of the kernel to the corresponding device and routing each request by the address on the data bus;
the DDR controller is used for controlling the off-chip memory of the FPGA;
the conversion module is used for translating the memory-mapped bus into a SerDes bus;
and the SerDes module is used for transmitting data between two FPGA cards.
16. The FPGA unit of claim 15, wherein the FPGA unit comprises one or more SerDes modules, and a plurality of FPGAs can be interconnected simultaneously through the SerDes modules.
17. An FPGA connection implementation apparatus based on the OpenCL framework, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the steps of the method according to any one of claims 1-7 are implemented when the computer program is executed by the processor.
18. A computer-readable storage medium storing a computer program which, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
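
For illustration only, the address handling recited in claims 6, 10, 12 and 13 can be sketched as follows. This is a minimal sketch and not the claimed implementation. The names SpaceInfo, relative_address and route are hypothetical, as is the convention that a channel of None marks an object in local off-chip memory. The sketch also assumes that every FPGA has the same total off-chip memory size, and that the data interconnect bus decodes an address by inverting the formula of the claims; the claims themselves only state that requests are routed by address.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SpaceInfo:
    """Space information associated with a memory object (claim 10)."""
    channel: Optional[int]  # FPGA connection channel number; None = local (assumption)
    offset: int             # off-chip memory address offset, in bytes
    size: int               # memory size, in bytes


def relative_address(info: SpaceInfo, total_size: int) -> int:
    """Address a kernel would issue to reach the memory object."""
    if info.offset < 0 or info.size < 0 or info.offset + info.size > total_size:
        raise ValueError("memory object does not fit in off-chip memory")
    if info.channel is None:
        # Object is in the local off-chip memory: use the offset directly.
        return info.offset
    # Object is on a peer FPGA: apply the formula from claims 6 and 13.
    return (info.channel + 1) * total_size + info.offset


def route(address: int, total_size: int) -> Tuple[Optional[int], int]:
    """Hypothetical bus-side decode: split an address into (channel, offset)."""
    window, offset = divmod(address, total_size)
    channel = None if window == 0 else window - 1
    return channel, offset


TOTAL = 1 << 32  # assumed 4 GiB of off-chip memory per FPGA
remote = SpaceInfo(channel=0, offset=0x1000, size=64)
addr = relative_address(remote, TOTAL)
print(hex(addr))           # 0x100001000
print(route(addr, TOTAL))  # (0, 4096)
```

Under these assumptions, addresses below the total off-chip memory size stay on the local DDR controller, and each connection channel occupies the next window of the same size, so the channel number and offset can be recovered with a single division.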
CN201910083309.4A 2019-01-28 2019-01-28 FPGA connection realization method and device based on OpenCL framework Active CN111490946B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910083309.4A CN111490946B (en) 2019-01-28 2019-01-28 FPGA connection realization method and device based on OpenCL framework

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910083309.4A CN111490946B (en) 2019-01-28 2019-01-28 FPGA connection realization method and device based on OpenCL framework

Publications (2)

Publication Number Publication Date
CN111490946A (en) 2020-08-04
CN111490946B (en) 2023-08-11

Family

ID=71794250

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910083309.4A Active CN111490946B (en) 2019-01-28 2019-01-28 FPGA connection realization method and device based on OpenCL framework

Country Status (1)

Country Link
CN (1) CN111490946B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110161495A1 (en) * 2009-12-26 2011-06-30 Ralf Ratering Accelerating opencl applications by utilizing a virtual opencl device as interface to compute clouds
CN104850866A (en) * 2015-06-08 2015-08-19 电子科技大学 SoC-FPGA-based self-reconstruction K-means cluster technology realization method
CN105335326A (en) * 2015-10-10 2016-02-17 广州慧睿思通信息科技有限公司 PCIE-SATA interface array device based on FPGA
CN106250349A (en) * 2016-08-08 2016-12-21 浪潮(北京)电子信息产业有限公司 A kind of high energy efficiency heterogeneous computing system
CN107295343A (en) * 2017-06-27 2017-10-24 郑州云海信息技术有限公司 A kind of palette becomes optimization method, the apparatus and system of scaling method
CN107341053A (en) * 2017-06-01 2017-11-10 深圳大学 The programmed method of heterogeneous polynuclear programmable system and its memory configurations and computing unit
US20180181443A1 (en) * 2016-12-27 2018-06-28 Seoul National University R&Db Foundation METHOD OF PROCESSING OpenCL KERNEL AND COMPUTING DEVICE THEREFOR
CN108776649A (en) * 2018-06-11 2018-11-09 山东超越数控电子股份有限公司 One kind being based on CPU+FPGA heterogeneous computing systems and its accelerated method
CN108897630A (en) * 2018-06-06 2018-11-27 郑州云海信息技术有限公司 A kind of global memory's caching method, system and device based on OpenCL

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
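Jeff Fifield et al.: "Optimizing OpenCL applications on Xilinx FPGA", IWOCL '16: Proceedings of the 4th International Workshop on OpenCL, pages 1 - 2 *
Bao Yunfeng; Zeng Zhangfan; Tang Wenlong; Tian Mao: "Research on the Sobel algorithm based on the OpenCL and FPGA heterogeneous mode", Computer Measurement & Control (计算机测量与控制), no. 01 *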

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112001494A (en) * 2020-08-20 2020-11-27 浪潮电子信息产业股份有限公司 Method for realizing support of FPGA (field programmable Gate array) back-end equipment by nGraph framework
US11762721B2 (en) 2020-08-20 2023-09-19 Inspur Electronic Information Industry Co., Ltd. Method for realizing nGraph framework supporting FPGA rear-end device
CN111740847A (en) * 2020-08-24 2020-10-02 常州楠菲微电子有限公司 High-speed network data transmission system and method based on FPGA
CN111740847B (en) * 2020-08-24 2020-12-11 常州楠菲微电子有限公司 High-speed network data transmission system and method based on FPGA
CN112306775A (en) * 2020-11-19 2021-02-02 山东云海国创云计算装备产业创新中心有限公司 Method, device, equipment and medium for testing communication link between two-way CPUs (central processing unit)
CN112306775B (en) * 2020-11-19 2023-03-14 山东云海国创云计算装备产业创新中心有限公司 Method, device, equipment and medium for testing communication link between two-way CPUs (central processing unit)
WO2023051248A1 (en) * 2021-09-30 2023-04-06 华为技术有限公司 Data access system and method, and related device

Also Published As

Publication number Publication date
CN111490946B (en) 2023-08-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40035203; Country of ref document: HK)
GR01 Patent grant