CN115473861B - High-performance processing system and method based on communication and calculation separation and storage medium - Google Patents

High-performance processing system and method based on communication and calculation separation and storage medium

Info

Publication number
CN115473861B
CN115473861B (application CN202210991611.1A)
Authority
CN
China
Prior art keywords
communication
node
tcp
application
application layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210991611.1A
Other languages
Chinese (zh)
Other versions
CN115473861A (en)
Inventor
张建军
杨少波
耿世磊
余军
赵洋
范建超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Comleader Information Technology Co Ltd
Original Assignee
Zhuhai Comleader Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuhai Comleader Information Technology Co Ltd
Priority to CN202210991611.1A
Publication of CN115473861A
Application granted
Publication of CN115473861B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 49/00: Packet switching elements
    • H04L 49/10: Packet switching elements characterised by the switching fabric construction
    • H04L 49/15: Interconnection of switching modules
    • H04L 69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L 69/16: Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L 69/30: Definitions, standards or architectural aspects of layered protocol stacks
    • H04L 69/32: Architecture of open systems interconnection [OSI] 7-layer type protocol stacks, e.g. the interfaces between the data link level and the physical level
    • H04L 69/322: Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions
    • H04L 69/329: Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer [OSI layer 7]
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer And Data Communications (AREA)

Abstract

The application discloses a high-performance processing system, method, and storage medium based on the separation of communication and computation. The system comprises a TOE node, a PCIe switching node, and a CPU node. The TOE node handles the system's external TCP or IP communication and terminates the TCP or IP data; the PCIe switching node transfers application-layer data to the software's user buffer by DMA; the CPU node runs application-based functional software to complete the system's application computing functions. The application can improve the processing performance of the whole system, reduce the load on the CPU's processing capacity, and provides a practical way to build a high-performance processing system in engineering.

Description

High-performance processing system and method based on communication and calculation separation and storage medium
Technical Field
The application relates to the field of data processing, in particular to a high-performance processing system and method based on communication and calculation separation and a storage medium.
Background
Common data processing systems today are built as CPU + Ethernet switch + interface chips. The interface chip handles the system's external communication and serves as the system's unified gateway to the outside. The CPU node terminates the Ethernet interface, sends and receives data through a software TCP/IP protocol stack, and at the same time performs application computation on the data according to the application scenario. The Ethernet switch interconnects the nodes of the whole system. To achieve reliable end-to-end data transfer, the communication protocols are typically based on TCP/IP.
This model is built on mature Ethernet switching and TCP/IP technology and can meet certain application requirements. However, as communication technology develops, communication bandwidth grows rapidly and data loads increase greatly, the drawbacks of the model become more and more obvious, and it can no longer meet current high-performance processing requirements. Specifically:
(1) Processing of packet data by the kernel's software TCP/IP protocol stack, in particular the checksum calculation at each TCP/IP layer, the repeated interactions of the TCP protocol, error control, and retransmission timeouts, puts heavy protocol-processing pressure on the CPU. Under high-throughput, high-concurrency application scenarios in particular, protocol-stack processing can consume a large share of the CPU's processing capacity. Moreover, because software processing latency is non-deterministic, processing delay worsens further as the load increases.
(2) With the operating-system-kernel-based processing model, the whole system suffers from a high interrupt rate, repeated copying of data between memory regions, and frequent application context switches, which further degrade the CPU's processing performance.
(3) Because the nodes generally communicate with each other using the TCP/IP protocol, every node has to send and receive TCP/IP traffic, which adds processing redundancy to the whole system.
Accordingly, the above technical problems in the related art remain to be solved.
Disclosure of Invention
The present application is directed to solving one of the technical problems in the related art. To that end, the embodiments of the application provide a high-performance processing system, a high-performance processing method, and a storage medium based on the separation of communication and computation, which can improve the processing performance of the system and reduce the load on the CPU's processing capacity.
According to an aspect of an embodiment of the present application, there is provided a high-performance processing system based on the separation of communication and computation, the system including a TOE node, a PCIe switching node, and a CPU node;
the TOE node is used for the system's external TCP or IP communication and for terminating the TCP or IP data;
the PCIe switching node is used for transferring application-layer data to the software's user buffer by DMA;
the CPU node is used for running application-based functional software to complete the system's application computing functions.
In one embodiment, the TOE node being used for the system's external TCP or IP communication includes:
the TOE node parses out the application-layer data and sends it to a PCIe bus for internal communication;
the TOE node receives data from the PCIe bus, encapsulates it in the TCP/IP protocol, and sends it to an external network.
In one embodiment, the TOE node transfers the application layer data to the corresponding memory space over a PCIe bus.
In one embodiment, the PCIe switching node transfers the application layer data to a user buffer of the software by DMA, including:
the PCIe switching node conveys the application layer data to a user buffer of the corresponding software through a PCIe bus.
In one embodiment, the processing tasks of the system include computing tasks, which are performed by the CPU node and its functional software, and communication tasks, which are completed by hardware.
According to an aspect of an embodiment of the present application, there is provided a high-performance processing method based on the separation of communication and computation, the method including:
performing TCP or IP communication externally and terminating the TCP or IP data;
transferring the application-layer data to the software's user buffer by DMA;
and running application-based functional software to complete the system's application computing functions.
In one embodiment, the performing TCP or IP communication externally includes:
parsing out the application-layer data and sending it to a PCIe bus for internal communication;
and receiving data from the PCIe bus, encapsulating it in the TCP/IP protocol, and sending it to an external network.
In one embodiment, the method further comprises:
transferring the application-layer data to the corresponding memory space over the PCIe bus.
In one embodiment, transferring the application-layer data to the software's user buffer by DMA includes the following step:
the application-layer data is carried over the PCIe bus into the user buffer of the corresponding software.
According to an aspect of an embodiment of the present application, there is provided a storage medium storing a processor-executable program which, when executed by a processor, implements the high-performance processing method based on the separation of communication and computation as described in any of the above embodiments.
The high-performance processing system, method, and storage medium based on the separation of communication and computation provided by the embodiments of the application have the following beneficial effects: the system comprises a TOE node, a PCIe switching node, and a CPU node; the TOE node handles the system's external TCP or IP communication and terminates the TCP or IP data; the PCIe switching node transfers application-layer data to the software's user buffer by DMA; the CPU node runs application-based functional software to complete the system's application computing functions. The application can improve the processing performance of the whole system, reduce the load on the CPU's processing capacity, and provides a practical way to build a high-performance processing system in engineering.
Additional aspects and advantages of the application will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a conventional data processing system architecture;
FIG. 2 is a schematic diagram of a high performance processing system architecture based on communication and computing separation according to an embodiment of the present application;
FIG. 3 is a flowchart of a high-performance processing method based on the separation of communication and computation according to an embodiment of the present application.
Detailed Description
In order that those skilled in the art may better understand the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some embodiments of the present application, not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort shall fall within the scope of the present application.
The terms "first," "second," "third," and "fourth" and the like in the description and in the claims and drawings are used for distinguishing between different objects and not necessarily for describing a particular sequential or chronological order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
The technical terms appearing in the present application are explained as follows:
DMA: DMA (Direct Memory Access) allows hardware devices of different speeds to communicate without placing a heavy interrupt and copy load on the CPU. Without DMA, the CPU would have to copy each piece of data from the source into its registers and then write it back out to the destination, and would be unavailable for other tasks while doing so.
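To make the contrast concrete, the following C sketch compares a CPU-driven copy with programming a DMA engine. The register layout (src_addr, dst_addr, len, ctrl, status) is purely hypothetical and not taken from the patent; it only illustrates that with DMA the CPU hands the transfer off and remains free for other work.

```c
#include <stdint.h>
#include <stddef.h>

/* Without DMA: the CPU itself moves every word and is busy for the whole copy. */
static void cpu_copy(volatile uint32_t *dst, const volatile uint32_t *src, size_t words)
{
    for (size_t i = 0; i < words; i++)
        dst[i] = src[i];            /* CPU is blocked here until the copy finishes */
}

/* Hypothetical memory-mapped DMA engine registers (layout is illustrative only). */
struct dma_regs {
    volatile uint64_t src_addr;     /* physical source address        */
    volatile uint64_t dst_addr;     /* physical destination address   */
    volatile uint32_t len;          /* transfer length in bytes       */
    volatile uint32_t ctrl;         /* bit 0 = start                  */
    volatile uint32_t status;       /* bit 0 = done                   */
};

/* With DMA: the CPU only programs the transfer and is then free for other tasks. */
static void dma_copy(struct dma_regs *dma, uint64_t src, uint64_t dst, uint32_t len)
{
    dma->src_addr = src;
    dma->dst_addr = dst;
    dma->len      = len;
    dma->ctrl     = 1;              /* kick off the transfer in hardware */
    /* ... the CPU can do useful work here; completion is signalled via status or an interrupt ... */
    while (!(dma->status & 1))
        ;                           /* simplified: poll for completion */
}
```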
TOE: TOE (TCP Offload Engine) is a TCP acceleration technique used in a network interface controller (NIC) to offload processing of the TCP/IP stack to the NIC, where it is done in hardware. TOE functions are common on high-speed Ethernet interfaces, such as Gigabit Ethernet (GbE) or 10 Gigabit Ethernet (10GbE), where the work of processing TCP/IP packet headers becomes heavier; having the hardware do this work eases the burden on the processor.
PCIe: PCIe (Peripheral Component Interconnect Express) is a high-speed serial computer expansion bus standard. Compared with earlier bus standards, it offers higher maximum system bus throughput, a lower I/O pin count and smaller physical size, better per-device performance scaling, more detailed error detection and reporting (Advanced Error Reporting, AER), and native hot-plug functionality. PCIe also provides hardware support for I/O virtualization.
As shown in FIG. 1, a currently common data processing system is built as CPU + Ethernet switch + interface chips. The interface chip handles the system's external communication and serves as the system's unified gateway to the outside. The CPU node terminates the Ethernet interface, sends and receives data through a software TCP/IP protocol stack, and at the same time performs application computation on the data according to the application scenario. The Ethernet switch interconnects the nodes of the whole system. To achieve reliable end-to-end data transfer, the communication protocols are typically based on TCP/IP.
This model is built on mature Ethernet switching and TCP/IP technology and can meet certain application requirements. However, as communication technology develops, communication bandwidth grows rapidly and data loads increase greatly, the drawbacks of the model become more and more obvious, and it can no longer meet current high-performance processing requirements. Specifically:
(1) Processing of packet data by the kernel's software TCP/IP protocol stack, in particular the checksum calculation at each TCP/IP layer, the repeated interactions of the TCP protocol, error control, and retransmission timeouts, puts heavy protocol-processing pressure on the CPU. Under high-throughput, high-concurrency application scenarios in particular, protocol-stack processing can consume a large share of the CPU's processing capacity. Moreover, because software processing latency is non-deterministic, processing delay worsens further as the load increases.
(2) With the operating-system-kernel-based processing model, the whole system suffers from a high interrupt rate, repeated copying of data between memory regions, and frequent application context switches, which further degrade the CPU's processing performance.
(3) Because the nodes generally communicate with each other using the TCP/IP protocol, every node has to send and receive TCP/IP traffic, which adds processing redundancy to the whole system.
Based on the above facts and analysis, the conventional processing model cannot meet scenarios that require high performance, high concurrency, and low latency. To face these practical processing challenges and build a high-performance processing platform, the application proposes a design and implementation scheme for a high-performance processing system based on the separation of communication and computation. The computing problem, the intra-system communication problem, and the external communication problem are decoupled and handled separately. The application is characterized in that: the computing problem is handled by the CPU and the functional software it runs; the communication problem is completely separated from the CPU and handled uniformly by hardware; the system's external communication is processed and terminated by a hardware TCP/IP protocol stack (namely a TOE hardware engine); the content communicated inside the system is application-layer data, which is carried and switched with high performance over a PCIe bus; and the application-layer data is delivered directly into the software's user buffers by DMA.
Specifically, as shown in FIG. 2, the high-performance processing system based on the separation of communication and computation proposed by the present application includes a TOE node, a PCIe switching node, and a CPU node. The TOE node is used for the system's external TCP or IP communication and terminates the TCP or IP data; the PCIe switching node transfers application-layer data to the software's user buffer by DMA; the CPU node runs application-based functional software to complete the system's application computing functions.
Specifically, the TOE node in this embodiment handles the system's external TCP or IP communication: the TOE node parses out the application-layer data and sends it to the PCIe bus for internal communication, and it receives data from the PCIe bus, encapsulates it in the TCP/IP protocol, and sends it to the external network. The TOE node of this embodiment is a full-featured TCP/IP hardware protocol stack. Externally, it is responsible for the whole system's TCP/IP communication; internally, the complete TCP/IP traffic is terminated at this node: on one hand, the application-layer data is parsed out and sent to the PCIe bus for internal communication; on the other hand, data from the PCIe bus is received, carried over the TCP/IP protocol, and sent to the external network. The TOE node also has a DMA function and can move the application-layer data into the corresponding memory space over the PCIe bus.
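As a rough illustration of this receive path, the following C sketch shows what the host side might see once the TOE engine has terminated TCP/IP and DMA-written only the application-layer payload into a descriptor ring in host memory. The toe_rx_desc layout and field names are assumptions made for illustration, not an interface defined by the patent.

```c
#include <stdint.h>
#include <stdbool.h>

#define RX_RING_SIZE 1024

/* Hypothetical descriptor filled in by the TOE hardware over PCIe/DMA:
 * TCP/IP headers have already been stripped; only application-layer data remains. */
struct toe_rx_desc {
    volatile uint32_t ready;        /* set by hardware when the payload is in place */
    uint32_t          conn_id;      /* which TCP connection this payload belongs to */
    uint32_t          length;       /* application-layer payload length in bytes    */
    uint8_t          *payload;      /* points into a DMA-able user buffer           */
};

/* Host-side consumption: no protocol processing, just application data. */
static bool toe_rx_poll(struct toe_rx_desc ring[RX_RING_SIZE], uint32_t *head,
                        void (*deliver)(uint32_t conn, const uint8_t *data, uint32_t len))
{
    struct toe_rx_desc *d = &ring[*head];
    if (!d->ready)
        return false;               /* nothing new from the TOE engine yet */

    deliver(d->conn_id, d->payload, d->length);   /* hand payload to functional software */

    d->ready = 0;                   /* return the descriptor to the hardware */
    *head = (*head + 1) % RX_RING_SIZE;
    return true;
}
```

The point of the sketch is that the software never touches TCP/IP headers, checksums, or retransmissions; it only consumes application-layer payloads that the hardware has already placed in memory.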
Specifically, the PCIe switching node of this embodiment transfers the application-layer data to the software's user buffer by DMA: the PCIe switching node moves the application-layer data over the PCIe bus into the user buffer of the corresponding software. The PCIe switching node uses the PCIe bus to carry the system's internal data communication efficiently and supports direct access to the memory space of the CPU node.
Therefore, the processing tasks of the system provided in this embodiment consist of computing tasks and communication tasks: the computing tasks are performed by the CPU node and its functional software, and the communication tasks are completed by hardware. The CPU node is responsible for running the various application-based functional software that completes the system's application computing functions. Because the communication tasks are decoupled into hardware, and the data-source model seen by the functional software is memory-based rather than I/O-based, a given piece of functional software only sends and receives the data associated with it, so the communication path can achieve higher throughput. On the other hand, in order to "pass through" data to the functional software, an efficient kernel-bypass driver model needs to be built; on the software side, a model similar to DPDK can be used for development.
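A DPDK-like model here essentially means busy-polling a shared-memory queue from user space instead of taking interrupts and copying data through the kernel. The sketch below shows that pattern in plain C and deliberately avoids the real DPDK API; app_ring, app_msg, and process_message are hypothetical names, and a production version would also need memory barriers and a back-off strategy.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical single-producer/single-consumer ring shared with the DMA engine. */
struct app_msg  { uint32_t conn_id; uint32_t len; uint8_t *data; };
struct app_ring {
    struct app_msg    slots[4096];
    volatile uint32_t prod;         /* advanced by the hardware/DMA side */
    volatile uint32_t cons;         /* advanced by the application       */
};

static void process_message(const struct app_msg *m) { (void)m; /* application logic */ }

/* Kernel-bypass receive loop: no system calls, no interrupts, no extra copies.
 * The application polls the ring that the PCIe/DMA path fills directly. */
static void rx_loop(struct app_ring *r)
{
    for (;;) {
        uint32_t cons = r->cons;
        if (cons == r->prod)
            continue;               /* ring empty: keep polling (could yield or pause here) */

        process_message(&r->slots[cons % 4096]);
        r->cons = cons + 1;         /* release the slot back to the producer */
    }
}
```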
The application provides a design and implementation scheme for a high-performance processing system based on the idea of separating communication from computation, dividing communication into communication outside the system and communication inside the system. The computing problem, the intra-system communication problem, and the external communication problem are decoupled and handled separately. This provides a practical technical scheme for the engineering construction of a high-performance processing system, and the resulting system can meet practical requirements for high concurrency, high throughput, and low latency.
Fig. 3 is a flowchart of a high performance processing method based on communication and computation separation according to an embodiment of the present application, and as shown in fig. 3, the present application provides a high performance processing method based on communication and computation separation, including:
s301, carrying out TCP or IP communication externally and terminating TCP or IP data.
S302, the application layer data is transported to a user buffer area of the software in a DMA mode.
S303, running function software based on the application to complete the application calculation function of the system.
Optionally, the performing TCP or IP communication to the outside includes:
parsing out the application-layer data and sending it to a PCIe bus for internal communication; and receiving data from the PCIe bus, encapsulating it in the TCP/IP protocol, and sending it to an external network.
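For the transmit direction, the following C sketch gives a hedged view of how software might hand application-layer data to the TOE engine: the host fills a descriptor with the connection handle and payload location, and the hardware performs the TCP/IP encapsulation and sends the frame to the external network. The toe_tx_desc layout and the doorbell register are assumptions for illustration only, not the patent's interface.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical transmit descriptor consumed by the TOE engine over PCIe. */
struct toe_tx_desc {
    uint32_t          conn_id;       /* established TCP connection to send on          */
    uint32_t          length;        /* application-layer payload length               */
    uint64_t          payload_pa;    /* DMA-able physical address of the payload       */
    volatile uint32_t owned_by_hw;   /* 1 = handed to hardware, 0 = free for software  */
};

/* Software only supplies application-layer data; TCP/IP encapsulation,
 * checksums, retransmission, etc. are all handled by the TOE hardware. */
static bool toe_tx_submit(struct toe_tx_desc *d, volatile uint32_t *doorbell,
                          uint32_t conn_id, uint64_t payload_pa, uint32_t len)
{
    if (d->owned_by_hw)
        return false;                /* descriptor still in flight */

    d->conn_id     = conn_id;
    d->length      = len;
    d->payload_pa  = payload_pa;
    d->owned_by_hw = 1;
    *doorbell = 1;                   /* notify the TOE engine that work is queued */
    return true;
}
```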
Optionally, the method of this embodiment further includes: transferring the application-layer data to the corresponding memory space over the PCIe bus.
It should be noted that transferring the application-layer data to the software's user buffer by DMA includes: carrying the application-layer data over the PCIe bus into the user buffer of the corresponding software.
Further, the present embodiment also provides a storage medium storing a processor-executable program which, when executed by a processor, implements the communication-and-computation-separation-based high-performance processing method as described in the previous embodiment.
The content of the method embodiment is applicable to the storage medium embodiment; the functions implemented by the storage medium embodiment and the beneficial effects achieved are the same as those of the method embodiment.
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present application are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, while the application is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the functions and/or features may be integrated in a single physical device and/or software module or may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present application. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the application as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the application, which is to be defined in the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
Logic and/or steps represented in the flowcharts or otherwise described herein, for example an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch the instructions from the instruction execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). Additionally, the computer-readable medium may even be paper or another suitable medium upon which the program is printed, as the program may be captured electronically, for example by optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one of the following techniques, or a combination thereof, as is well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
In the description of this specification, references to the terms "one embodiment/example", "another embodiment/example", "certain embodiments/examples", and the like mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present application have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the application, the scope of which is defined by the claims and their equivalents.
The above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (5)

1. A high performance processing system based on separation of communications and computing, the system comprising: TOE node, PCIe switching node, CPU node;
the TOE node is used for external TCP or IP communication of the system, transmitting application layer data to a corresponding memory space through a PCIe bus, and terminating the TCP or IP data;
the PCIe switching node is used for transmitting the application layer data to a user buffer area of the software in a DMA mode;
the CPU node is used for running functional software based on the application to complete the application calculation function of the system;
the processing tasks of the system comprise a computing task and a communication task, wherein the computing task is performed by the CPU node and functional software, and the communication task is completed by hardware;
the TOE node being used for the system's external TCP or IP communication comprises:
the TOE node parses out the application layer data and sends it to a PCIe bus for internal communication;
the TOE node receives data from the PCIe bus, encapsulates it in the TCP/IP protocol, and sends it to an external network.
2. The communication and computation separation-based high performance processing system of claim 1, wherein the PCIe switching node DMA-transfers application layer data to a user buffer of software, comprising:
the PCIe switching node conveys the application layer data to a user buffer of the corresponding software through a PCIe bus.
3. A high performance processing method based on separation of communication and computation, applied to the high performance processing system according to any one of claims 1 to 2, characterized in that the method comprises:
performing TCP or IP communication externally by using the TOE node, transferring the application layer data to a corresponding memory space over a PCIe bus, and terminating the TCP or IP data;
transferring the application layer data to a user buffer of the software by DMA by using the PCIe switching node;
running application-based functional software by using the CPU node to complete the application computing functions of the system;
the processing tasks of the high-performance processing system comprise a computing task and a communication task, wherein the computing task is performed by the functional software and the communication task is completed by hardware;
the adopting the TOE node to externally perform TCP or IP communication comprises the following steps:
analyzing the application layer data by adopting the TOE node and sending the application layer data to a PCIe bus for internal communication;
and receiving data sent by the PCIe bus by adopting the TOE node, carrying a TCP/IP protocol, and sending the data sent by the PCIe bus to an external network.
4. A communication and computation separation based high performance processing method according to claim 3, wherein the application layer data is transferred to the user buffer of the software by DMA, comprising:
the application layer data is carried over the PCIe bus to the user buffers of the corresponding software.
5. A storage medium storing a processor-executable program which, when executed by a processor, implements the communication-and computation-separation-based high-performance processing method according to any one of claims 3 to 4.
CN202210991611.1A 2022-08-18 2022-08-18 High-performance processing system and method based on communication and calculation separation and storage medium Active CN115473861B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210991611.1A CN115473861B (en) 2022-08-18 2022-08-18 High-performance processing system and method based on communication and calculation separation and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210991611.1A CN115473861B (en) 2022-08-18 2022-08-18 High-performance processing system and method based on communication and calculation separation and storage medium

Publications (2)

Publication Number Publication Date
CN115473861A CN115473861A (en) 2022-12-13
CN115473861B true CN115473861B (en) 2023-11-03

Family

ID=84365900

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210991611.1A Active CN115473861B (en) 2022-08-18 2022-08-18 High-performance processing system and method based on communication and calculation separation and storage medium

Country Status (1)

Country Link
CN (1) CN115473861B (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1667601A (en) * 2004-03-11 2005-09-14 国际商业机器公司 Apparatus and method for sharing a network I/O adapter between logical partitions
CN1819584A (en) * 2004-11-12 2006-08-16 微软公司 Method and apparatus for secure internet protocol (ipsec) offloading with integrated host protocol stack management
WO2017046582A1 (en) * 2015-09-16 2017-03-23 Nanospeed Technologies Limited Tcp/ip offload system
WO2018018611A1 (en) * 2016-07-29 2018-02-01 华为技术有限公司 Task processing method and network card
CN109491934A (en) * 2018-09-28 2019-03-19 方信息科技(上海)有限公司 A kind of storage management system control method of integrated computing function
CN109714302A (en) * 2017-10-25 2019-05-03 阿里巴巴集团控股有限公司 The discharging method of algorithm, device and system
CN110109852A (en) * 2019-04-03 2019-08-09 华东计算技术研究所(中国电子科技集团公司第三十二研究所) System and method for realizing TCP _ IP protocol by hardware
CN111031011A (en) * 2019-11-26 2020-04-17 中科驭数(北京)科技有限公司 Interaction method and device of TCP/IP accelerator
CN111163121A (en) * 2019-11-19 2020-05-15 核芯互联科技(青岛)有限公司 Ultra-low-delay high-performance network protocol stack processing method and system
CN112953967A (en) * 2021-03-30 2021-06-11 扬州万方电子技术有限责任公司 Network protocol unloading device and data transmission system
CN113225307A (en) * 2021-03-18 2021-08-06 西安电子科技大学 Optimization method, system and terminal for pre-reading descriptors in offload engine network card
CN113312283A (en) * 2021-05-28 2021-08-27 北京航空航天大学 Heterogeneous image learning system based on FPGA acceleration
CN113347017A (en) * 2021-04-09 2021-09-03 中科创达软件股份有限公司 Network communication method and device, network node equipment and hybrid network
CN114238187A (en) * 2022-02-24 2022-03-25 苏州浪潮智能科技有限公司 FPGA-based full-stack network card task processing system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11537541B2 (en) * 2018-09-28 2022-12-27 Xilinx, Inc. Network interface device and host processing device

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1667601A (en) * 2004-03-11 2005-09-14 国际商业机器公司 Apparatus and method for sharing a network I/O adapter between logical partitions
CN1819584A (en) * 2004-11-12 2006-08-16 微软公司 Method and apparatus for secure internet protocol (ipsec) offloading with integrated host protocol stack management
WO2017046582A1 (en) * 2015-09-16 2017-03-23 Nanospeed Technologies Limited Tcp/ip offload system
WO2018018611A1 (en) * 2016-07-29 2018-02-01 华为技术有限公司 Task processing method and network card
CN109714302A (en) * 2017-10-25 2019-05-03 阿里巴巴集团控股有限公司 The discharging method of algorithm, device and system
CN109491934A (en) * 2018-09-28 2019-03-19 方信息科技(上海)有限公司 A kind of storage management system control method of integrated computing function
CN110109852A (en) * 2019-04-03 2019-08-09 华东计算技术研究所(中国电子科技集团公司第三十二研究所) System and method for realizing TCP _ IP protocol by hardware
CN111163121A (en) * 2019-11-19 2020-05-15 核芯互联科技(青岛)有限公司 Ultra-low-delay high-performance network protocol stack processing method and system
CN111031011A (en) * 2019-11-26 2020-04-17 中科驭数(北京)科技有限公司 Interaction method and device of TCP/IP accelerator
CN113225307A (en) * 2021-03-18 2021-08-06 西安电子科技大学 Optimization method, system and terminal for pre-reading descriptors in offload engine network card
CN112953967A (en) * 2021-03-30 2021-06-11 扬州万方电子技术有限责任公司 Network protocol unloading device and data transmission system
CN113347017A (en) * 2021-04-09 2021-09-03 中科创达软件股份有限公司 Network communication method and device, network node equipment and hybrid network
CN113312283A (en) * 2021-05-28 2021-08-27 北京航空航天大学 Heterogeneous image learning system based on FPGA acceleration
CN114238187A (en) * 2022-02-24 2022-03-25 苏州浪潮智能科技有限公司 FPGA-based full-stack network card task processing system

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Decentralized Attribute-Based Encryption and Data Sharing Scheme in Cloud Storage; Xiehua Li, Yanlong Wang, Ming Xu, Yaping Cui; China Communications (中国通信), No. 02; full text *
User-Level Device Drivers: Achieved Performance; Ben Leslie, Peter Chubb, Nicholas Fitzroy-Dale, Stefan Götz, Charles Gray, Luke Macpherson, Daniel Potts, Kevin Elphinstone, Gernot Heiser; Journal of Computer Science and Technology, No. 05; full text *
A zero-copy data transfer method for TCP/IP offload (一种TCP/IP卸载的数据零拷贝传输方法); 王小峰, 时向泉, 苏金树; Computer Engineering & Science (计算机工程与科学), No. 02; full text *
TCP data receive offloading based on a multi-core NPU (基于多核NPU的TCP数据接收卸载); 李杰, 陈曙晖; Computer Engineering & Science (计算机工程与科学), No. 07; full text *

Also Published As

Publication number Publication date
CN115473861A (en) 2022-12-13

Similar Documents

Publication Publication Date Title
US8477806B2 (en) Method and system for transmission control packet (TCP) segmentation offload
US9110668B2 (en) Enhanced buffer-batch management for energy efficient networking based on a power mode of a network interface
JP2019036298A (en) Intelligent high bandwidth memory system and logic dies therefor
CN109960671B (en) Data transmission system, method and computer equipment
WO2022025966A1 (en) Receiver-based precision congestion control
US20230080588A1 (en) Mqtt protocol simulation method and simulation device
US20220166698A1 (en) Network resource monitoring
WO2022139930A1 (en) Resource consumption control
US20220311711A1 (en) Congestion control based on network telemetry
US20230127722A1 (en) Programmable transport protocol architecture
DE102022126611A1 (en) SERVICE MESH OFFSET TO NETWORK DEVICES
DE102022129250A1 (en) Transmission rate based on detected available bandwidth
US8161126B2 (en) System and method for RDMA QP state split between RNIC and host software
CN115686836A (en) Unloading card provided with accelerator
CN115202573A (en) Data storage system and method
CN115473861B (en) High-performance processing system and method based on communication and calculation separation and storage medium
CN113347017A (en) Network communication method and device, network node equipment and hybrid network
WO2023207295A1 (en) Data processing method, data processing unit, system and related device
US20220321491A1 (en) Microservice data path and control path processing
US20220291928A1 (en) Event controller in a device
Gilfeather et al. Modeling protocol offload for message-oriented communication
WO2023075930A1 (en) Network interface device-based computations
CN115913952A (en) Efficient parallelization and deployment method of multi-target service function chain based on CPU + DPU platform
Jin et al. Impact of protocol overheads on network throughput over high-speed interconnects: measurement, analysis, and improvement
CN113572575A (en) Self-adaptive data transmission method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant