CN115473861B - High-performance processing system and method based on communication and calculation separation and storage medium - Google Patents

High-performance processing system and method based on communication and calculation separation and storage medium

Info

Publication number
CN115473861B
CN115473861B (application CN202210991611.1A)
Authority
CN
China
Prior art keywords
communication
node
tcp
application
application layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210991611.1A
Other languages
Chinese (zh)
Other versions
CN115473861A (en)
Inventor
张建军
杨少波
耿世磊
余军
赵洋
范建超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Comleader Information Technology Co Ltd
Original Assignee
Zhuhai Comleader Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuhai Comleader Information Technology Co Ltd
Priority to CN202210991611.1A
Publication of CN115473861A
Application granted
Publication of CN115473861B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 49/00: Packet switching elements
    • H04L 49/10: Packet switching elements characterised by the switching fabric construction
    • H04L 49/15: Interconnection of switching modules
    • H04L 69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L 69/16: Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L 69/30: Definitions, standards or architectural aspects of layered protocol stacks
    • H04L 69/32: Architecture of open systems interconnection [OSI] 7-layer type protocol stacks, e.g. the interfaces between the data link level and the physical level
    • H04L 69/322: Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions
    • H04L 69/329: Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer [OSI layer 7]
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer And Data Communications (AREA)

Abstract

The application discloses a high-performance processing system, method, and storage medium based on the separation of communication and computation. The system comprises a TOE node, a PCIe switching node, and a CPU node. The TOE node handles the system's external TCP or IP communication and terminates the TCP or IP data; the PCIe switching node transfers application-layer data to the software's user buffer by DMA; the CPU node runs application-based functional software to complete the system's application computing functions. The application can improve the processing performance of the whole system, reduce the load on the CPU's processing capacity, and provides a practical way to build a high-performance processing system in engineering.

Description

High-performance processing system and method based on communication and calculation separation and storage medium
Technical Field
The application relates to the field of data processing, in particular to a high-performance processing system and method based on communication and calculation separation and a storage medium.
Background
Common data processing systems today are built as CPU + Ethernet switch + interface chips. The interface chip handles the system's external communication and serves as the system's unified gateway to the outside. The CPU node terminates the Ethernet interface, sends and receives data through a software TCP/IP protocol stack, and at the same time performs application computation on the data according to the application scenario. The Ethernet switch interconnects the nodes of the whole system. To achieve reliable end-to-end data transfer, the communication protocols are typically based on TCP/IP.
This model is built on mature Ethernet switching and TCP/IP technology and can meet certain application requirements. However, as communication technology develops, communication bandwidth grows rapidly and data loads increase greatly, the drawbacks of the model become more and more obvious, and it can no longer meet current high-performance processing requirements. Specifically:
(1) Processing of packet data by the kernel's software TCP/IP protocol stack, in particular the checksum calculation at each TCP/IP layer, the repeated interactions of the TCP protocol, error control, and retransmission timeouts, puts heavy protocol-processing pressure on the CPU. Under high-throughput, high-concurrency application scenarios in particular, protocol-stack processing can consume a large share of the CPU's processing capacity. Moreover, because software processing latency is non-deterministic, processing delay worsens further as the load increases.
(2) With the operating-system-kernel-based processing model, the whole system suffers from a high interrupt rate, repeated copying of data between memory regions, and frequent application context switches, which further degrade the CPU's processing performance.
(3) Because the nodes generally communicate with each other using the TCP/IP protocol, every node has to send and receive TCP/IP traffic, which adds processing redundancy to the whole system.
Accordingly, the above technical problems in the related art remain to be solved.
Disclosure of Invention
The present application is directed to solving one of the technical problems in the related art. To that end, the embodiments of the application provide a high-performance processing system, a high-performance processing method, and a storage medium based on the separation of communication and computation, which can improve the processing performance of the system and reduce the load on the CPU's processing capacity.
According to an aspect of an embodiment of the present application, there is provided a high-performance processing system based on the separation of communication and computation, the system including a TOE node, a PCIe switching node, and a CPU node;
the TOE node is used for the system's external TCP or IP communication and for terminating the TCP or IP data;
the PCIe switching node is used for transferring application-layer data to the software's user buffer by DMA;
the CPU node is used for running application-based functional software to complete the system's application computing functions.
In one embodiment, the TOE node being used for the system's external TCP or IP communication includes:
the TOE node parses out the application-layer data and sends it to a PCIe bus for internal communication;
the TOE node receives data from the PCIe bus, encapsulates it in the TCP/IP protocol, and sends it to an external network.
In one embodiment, the TOE node transfers the application layer data to the corresponding memory space over a PCIe bus.
In one embodiment, the PCIe switching node transfers the application layer data to a user buffer of the software by DMA, including:
the PCIe switching node conveys the application layer data to a user buffer of the corresponding software through a PCIe bus.
In one embodiment, the processing tasks of the system include computing tasks, which are performed by the CPU node and its functional software, and communication tasks, which are completed by hardware.
According to an aspect of an embodiment of the present application, there is provided a high-performance processing method based on the separation of communication and computation, the method including:
performing TCP or IP communication externally and terminating the TCP or IP data;
transferring the application-layer data to the software's user buffer by DMA;
and running application-based functional software to complete the system's application computing functions.
In one embodiment, the performing TCP or IP communication externally includes:
parsing out the application-layer data and sending it to a PCIe bus for internal communication;
and receiving data from the PCIe bus, encapsulating it in the TCP/IP protocol, and sending it to an external network.
In one embodiment, the method further comprises:
transferring the application-layer data to the corresponding memory space over the PCIe bus.
In one embodiment, transferring the application-layer data to the software's user buffer by DMA includes the following step:
the application-layer data is carried over the PCIe bus into the user buffer of the corresponding software.
According to an aspect of an embodiment of the present application, there is provided a storage medium storing a processor-executable program which, when executed by a processor, implements the high-performance processing method based on the separation of communication and computation as described in any of the above embodiments.
The high-performance processing system, method, and storage medium based on the separation of communication and computation provided by the embodiments of the application have the following beneficial effects: the system comprises a TOE node, a PCIe switching node, and a CPU node; the TOE node handles the system's external TCP or IP communication and terminates the TCP or IP data; the PCIe switching node transfers application-layer data to the software's user buffer by DMA; the CPU node runs application-based functional software to complete the system's application computing functions. The application can improve the processing performance of the whole system, reduce the load on the CPU's processing capacity, and provides a practical way to build a high-performance processing system in engineering.
Additional aspects and advantages of the application will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the application.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a conventional data processing system architecture;
FIG. 2 is a schematic diagram of a high performance processing system architecture based on communication and computing separation according to an embodiment of the present application;
FIG. 3 is a flowchart of a high-performance processing method based on the separation of communication and computation according to an embodiment of the present application.
Detailed Description
In order that those skilled in the art may better understand the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some embodiments of the present application, not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort shall fall within the scope of the present application.
The terms "first," "second," "third," and "fourth" and the like in the description and in the claims and drawings are used for distinguishing between different objects and not necessarily for describing a particular sequential or chronological order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
The technical terms appearing in the present application are explained as follows:
DMA: DMA (Direct Memory Access) allows hardware devices of different speeds to communicate without placing a heavy interrupt and copy load on the CPU. Without DMA, the CPU would have to copy each piece of data from the source into its registers and then write it back out to the destination, and would be unavailable for other tasks while doing so.
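To make the contrast concrete, the following C sketch compares a CPU-driven copy with programming a DMA engine. The register layout (src_addr, dst_addr, len, ctrl, status) is purely hypothetical and not taken from the patent; it only illustrates that with DMA the CPU hands the transfer off and remains free for other work.

```c
#include <stdint.h>
#include <stddef.h>

/* Without DMA: the CPU itself moves every word and is busy for the whole copy. */
static void cpu_copy(volatile uint32_t *dst, const volatile uint32_t *src, size_t words)
{
    for (size_t i = 0; i < words; i++)
        dst[i] = src[i];            /* CPU is blocked here until the copy finishes */
}

/* Hypothetical memory-mapped DMA engine registers (layout is illustrative only). */
struct dma_regs {
    volatile uint64_t src_addr;     /* physical source address        */
    volatile uint64_t dst_addr;     /* physical destination address   */
    volatile uint32_t len;          /* transfer length in bytes       */
    volatile uint32_t ctrl;         /* bit 0 = start                  */
    volatile uint32_t status;       /* bit 0 = done                   */
};

/* With DMA: the CPU only programs the transfer and is then free for other tasks. */
static void dma_copy(struct dma_regs *dma, uint64_t src, uint64_t dst, uint32_t len)
{
    dma->src_addr = src;
    dma->dst_addr = dst;
    dma->len      = len;
    dma->ctrl     = 1;              /* kick off the transfer in hardware */
    /* ... the CPU can do useful work here; completion is signalled via status or an interrupt ... */
    while (!(dma->status & 1))
        ;                           /* simplified: poll for completion */
}
```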
TOE: TOE (TCP Offload Engine) is a TCP acceleration technique used in a network interface controller (NIC) to offload processing of the TCP/IP stack to the NIC, where it is done in hardware. TOE functions are common on high-speed Ethernet interfaces, such as Gigabit Ethernet (GbE) or 10 Gigabit Ethernet (10GbE), where the work of processing TCP/IP packet headers becomes heavier; having the hardware do this work eases the burden on the processor.
PCIe: PCIe (Peripheral Component Interconnect Express) is a high-speed serial computer expansion bus standard. Compared with earlier bus standards, it offers higher maximum system bus throughput, a lower I/O pin count and smaller physical size, better per-device performance scaling, more detailed error detection and reporting (Advanced Error Reporting, AER), and native hot-plug functionality. PCIe also provides hardware support for I/O virtualization.
As shown in FIG. 1, a currently common data processing system is built as CPU + Ethernet switch + interface chips. The interface chip handles the system's external communication and serves as the system's unified gateway to the outside. The CPU node terminates the Ethernet interface, sends and receives data through a software TCP/IP protocol stack, and at the same time performs application computation on the data according to the application scenario. The Ethernet switch interconnects the nodes of the whole system. To achieve reliable end-to-end data transfer, the communication protocols are typically based on TCP/IP.
This model is built on mature Ethernet switching and TCP/IP technology and can meet certain application requirements. However, as communication technology develops, communication bandwidth grows rapidly and data loads increase greatly, the drawbacks of the model become more and more obvious, and it can no longer meet current high-performance processing requirements. Specifically:
(1) Processing of packet data by the kernel's software TCP/IP protocol stack, in particular the checksum calculation at each TCP/IP layer, the repeated interactions of the TCP protocol, error control, and retransmission timeouts, puts heavy protocol-processing pressure on the CPU. Under high-throughput, high-concurrency application scenarios in particular, protocol-stack processing can consume a large share of the CPU's processing capacity. Moreover, because software processing latency is non-deterministic, processing delay worsens further as the load increases.
(2) With the operating-system-kernel-based processing model, the whole system suffers from a high interrupt rate, repeated copying of data between memory regions, and frequent application context switches, which further degrade the CPU's processing performance.
(3) Because the nodes generally communicate with each other using the TCP/IP protocol, every node has to send and receive TCP/IP traffic, which adds processing redundancy to the whole system.
Based on the above facts and analysis, the conventional processing model cannot meet scenarios that require high performance, high concurrency, and low latency. To face these practical processing challenges and build a high-performance processing platform, the application proposes a design and implementation scheme for a high-performance processing system based on the separation of communication and computation. The computing problem, the intra-system communication problem, and the external communication problem are decoupled and handled separately. The application is characterized in that: the computing problem is handled by the CPU and the functional software it runs; the communication problem is completely separated from the CPU and handled uniformly by hardware; the system's external communication is processed and terminated by a hardware TCP/IP protocol stack (namely a TOE hardware engine); the content communicated inside the system is application-layer data, which is carried and switched with high performance over a PCIe bus; and the application-layer data is delivered directly into the software's user buffers by DMA.
Specifically, as shown in FIG. 2, the high-performance processing system based on the separation of communication and computation proposed by the present application includes a TOE node, a PCIe switching node, and a CPU node. The TOE node is used for the system's external TCP or IP communication and terminates the TCP or IP data; the PCIe switching node transfers application-layer data to the software's user buffer by DMA; the CPU node runs application-based functional software to complete the system's application computing functions.
Specifically, the TOE node in this embodiment handles the system's external TCP or IP communication: the TOE node parses out the application-layer data and sends it to the PCIe bus for internal communication, and it receives data from the PCIe bus, encapsulates it in the TCP/IP protocol, and sends it to the external network. The TOE node of this embodiment is a full-featured TCP/IP hardware protocol stack. Externally, it is responsible for the whole system's TCP/IP communication; internally, the complete TCP/IP traffic is terminated at this node: on one hand, the application-layer data is parsed out and sent to the PCIe bus for internal communication; on the other hand, data from the PCIe bus is received, carried over the TCP/IP protocol, and sent to the external network. The TOE node also has a DMA function and can move the application-layer data into the corresponding memory space over the PCIe bus.
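As a rough illustration of this receive path, the following C sketch shows what the host side might see once the TOE engine has terminated TCP/IP and DMA-written only the application-layer payload into a descriptor ring in host memory. The toe_rx_desc layout and field names are assumptions made for illustration, not an interface defined by the patent.

```c
#include <stdint.h>
#include <stdbool.h>

#define RX_RING_SIZE 1024

/* Hypothetical descriptor filled in by the TOE hardware over PCIe/DMA:
 * TCP/IP headers have already been stripped; only application-layer data remains. */
struct toe_rx_desc {
    volatile uint32_t ready;        /* set by hardware when the payload is in place */
    uint32_t          conn_id;      /* which TCP connection this payload belongs to */
    uint32_t          length;       /* application-layer payload length in bytes    */
    uint8_t          *payload;      /* points into a DMA-able user buffer           */
};

/* Host-side consumption: no protocol processing, just application data. */
static bool toe_rx_poll(struct toe_rx_desc ring[RX_RING_SIZE], uint32_t *head,
                        void (*deliver)(uint32_t conn, const uint8_t *data, uint32_t len))
{
    struct toe_rx_desc *d = &ring[*head];
    if (!d->ready)
        return false;               /* nothing new from the TOE engine yet */

    deliver(d->conn_id, d->payload, d->length);   /* hand payload to functional software */

    d->ready = 0;                   /* return the descriptor to the hardware */
    *head = (*head + 1) % RX_RING_SIZE;
    return true;
}
```

The point of the sketch is that the software never touches TCP/IP headers, checksums, or retransmissions; it only consumes application-layer payloads that the hardware has already placed in memory.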
Specifically, the PCIe switching node of this embodiment transfers the application-layer data to the software's user buffer by DMA: the PCIe switching node moves the application-layer data over the PCIe bus into the user buffer of the corresponding software. The PCIe switching node uses the PCIe bus to carry the system's internal data communication efficiently and supports direct access to the memory space of the CPU node.
Therefore, the processing tasks of the system provided in this embodiment consist of computing tasks and communication tasks: the computing tasks are performed by the CPU node and its functional software, and the communication tasks are completed by hardware. The CPU node is responsible for running the various application-based functional software that completes the system's application computing functions. Because the communication tasks are decoupled into hardware, and the data-source model seen by the functional software is memory-based rather than I/O-based, a given piece of functional software only sends and receives the data associated with it, so the communication path can achieve higher throughput. On the other hand, in order to "pass through" data to the functional software, an efficient kernel-bypass driver model needs to be built; on the software side, a model similar to DPDK can be used for development.
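A DPDK-like model here essentially means busy-polling a shared-memory queue from user space instead of taking interrupts and copying data through the kernel. The sketch below shows that pattern in plain C and deliberately avoids the real DPDK API; app_ring, app_msg, and process_message are hypothetical names, and a production version would also need memory barriers and a back-off strategy.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical single-producer/single-consumer ring shared with the DMA engine. */
struct app_msg  { uint32_t conn_id; uint32_t len; uint8_t *data; };
struct app_ring {
    struct app_msg    slots[4096];
    volatile uint32_t prod;         /* advanced by the hardware/DMA side */
    volatile uint32_t cons;         /* advanced by the application       */
};

static void process_message(const struct app_msg *m) { (void)m; /* application logic */ }

/* Kernel-bypass receive loop: no system calls, no interrupts, no extra copies.
 * The application polls the ring that the PCIe/DMA path fills directly. */
static void rx_loop(struct app_ring *r)
{
    for (;;) {
        uint32_t cons = r->cons;
        if (cons == r->prod)
            continue;               /* ring empty: keep polling (could yield or pause here) */

        process_message(&r->slots[cons % 4096]);
        r->cons = cons + 1;         /* release the slot back to the producer */
    }
}
```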
The application provides a design and implementation scheme for a high-performance processing system based on the idea of separating communication from computation, dividing communication into communication outside the system and communication inside the system. The computing problem, the intra-system communication problem, and the external communication problem are decoupled and handled separately. This provides a practical technical scheme for the engineering construction of a high-performance processing system, and the resulting system can meet practical requirements for high concurrency, high throughput, and low latency.
Fig. 3 is a flowchart of a high performance processing method based on communication and computation separation according to an embodiment of the present application, and as shown in fig. 3, the present application provides a high performance processing method based on communication and computation separation, including:
s301, carrying out TCP or IP communication externally and terminating TCP or IP data.
S302, the application layer data is transported to a user buffer area of the software in a DMA mode.
S303, running function software based on the application to complete the application calculation function of the system.
Optionally, the performing TCP or IP communication to the outside includes:
parsing out the application-layer data and sending it to a PCIe bus for internal communication; and receiving data from the PCIe bus, encapsulating it in the TCP/IP protocol, and sending it to an external network.
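For the transmit direction, the following C sketch gives a hedged view of how software might hand application-layer data to the TOE engine: the host fills a descriptor with the connection handle and payload location, and the hardware performs the TCP/IP encapsulation and sends the frame to the external network. The toe_tx_desc layout and the doorbell register are assumptions for illustration only, not the patent's interface.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical transmit descriptor consumed by the TOE engine over PCIe. */
struct toe_tx_desc {
    uint32_t          conn_id;       /* established TCP connection to send on          */
    uint32_t          length;        /* application-layer payload length               */
    uint64_t          payload_pa;    /* DMA-able physical address of the payload       */
    volatile uint32_t owned_by_hw;   /* 1 = handed to hardware, 0 = free for software  */
};

/* Software only supplies application-layer data; TCP/IP encapsulation,
 * checksums, retransmission, etc. are all handled by the TOE hardware. */
static bool toe_tx_submit(struct toe_tx_desc *d, volatile uint32_t *doorbell,
                          uint32_t conn_id, uint64_t payload_pa, uint32_t len)
{
    if (d->owned_by_hw)
        return false;                /* descriptor still in flight */

    d->conn_id     = conn_id;
    d->length      = len;
    d->payload_pa  = payload_pa;
    d->owned_by_hw = 1;
    *doorbell = 1;                   /* notify the TOE engine that work is queued */
    return true;
}
```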
Optionally, the method of this embodiment further includes: transferring the application-layer data to the corresponding memory space over the PCIe bus.
It should be noted that transferring the application-layer data to the software's user buffer by DMA includes: carrying the application-layer data over the PCIe bus into the user buffer of the corresponding software.
Further, the present embodiment also provides a storage medium storing a processor-executable program which, when executed by a processor, implements the communication-and-computation-separation-based high-performance processing method as described in the previous embodiment.
The content of the method embodiment is applicable to the storage medium embodiment; the functions implemented by the storage medium embodiment and the beneficial effects achieved are the same as those of the method embodiment.
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present application are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, while the application is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the functions and/or features may be integrated in a single physical device and/or software module or may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present application. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the application as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the application, which is to be defined in the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
Logic and/or steps represented in the flowcharts or otherwise described herein, for example an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch the instructions from the instruction execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). Additionally, the computer-readable medium may even be paper or another suitable medium upon which the program is printed, as the program may be captured electronically, for example by optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one of the following techniques, or a combination thereof, as is well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
In the description of this specification, references to the terms "one embodiment/example", "another embodiment/example", "certain embodiments/examples", and the like mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present application have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the application, the scope of which is defined by the claims and their equivalents.
The above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (5)

1. A high performance processing system based on separation of communications and computing, the system comprising: TOE node, PCIe switching node, CPU node;
the TOE node is used for external TCP or IP communication of the system, transmitting application layer data to a corresponding memory space through a PCIe bus, and terminating the TCP or IP data;
the PCIe switching node is used for transmitting the application layer data to a user buffer area of the software in a DMA mode;
the CPU node is used for running functional software based on the application to complete the application calculation function of the system;
the processing tasks of the system comprise a computing task and a communication task, wherein the computing task is performed by the CPU node and functional software, and the communication task is completed by hardware;
the TOE node being used for the system's external TCP or IP communication comprises:
the TOE node parses out the application layer data and sends it to a PCIe bus for internal communication;
the TOE node receives data from the PCIe bus, encapsulates it in the TCP/IP protocol, and sends it to an external network.
2. The communication and computation separation-based high performance processing system of claim 1, wherein the PCIe switching node DMA-transfers application layer data to a user buffer of software, comprising:
the PCIe switching node conveys the application layer data to a user buffer of the corresponding software through a PCIe bus.
3. A high performance processing method based on separation of communication and computation, applied to the high performance processing system according to any one of claims 1 to 2, characterized in that the method comprises:
performing TCP or IP communication externally by using the TOE node, transferring the application layer data to a corresponding memory space over a PCIe bus, and terminating the TCP or IP data;
transferring the application layer data to a user buffer of the software by DMA by using the PCIe switching node;
running application-based functional software by using the CPU node to complete the application computing functions of the system;
the processing tasks of the high-performance processing system comprise a computing task and a communication task, wherein the computing task is performed by the functional software and the communication task is completed by hardware;
the adopting the TOE node to externally perform TCP or IP communication comprises the following steps:
analyzing the application layer data by adopting the TOE node and sending the application layer data to a PCIe bus for internal communication;
and receiving data sent by the PCIe bus by adopting the TOE node, carrying a TCP/IP protocol, and sending the data sent by the PCIe bus to an external network.
4. A communication and computation separation based high performance processing method according to claim 3, wherein the application layer data is transferred to the user buffer of the software by DMA, comprising:
the application layer data is carried over the PCIe bus to the user buffers of the corresponding software.
5. A storage medium storing a processor-executable program which, when executed by a processor, implements the communication-and computation-separation-based high-performance processing method according to any one of claims 3 to 4.
CN202210991611.1A 2022-08-18 2022-08-18 High-performance processing system and method based on communication and calculation separation and storage medium Active CN115473861B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210991611.1A CN115473861B (en) 2022-08-18 2022-08-18 High-performance processing system and method based on communication and calculation separation and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210991611.1A CN115473861B (en) 2022-08-18 2022-08-18 High-performance processing system and method based on communication and calculation separation and storage medium

Publications (2)

Publication Number Publication Date
CN115473861A CN115473861A (en) 2022-12-13
CN115473861B true CN115473861B (en) 2023-11-03

Family

ID=84365900

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210991611.1A Active CN115473861B (en) 2022-08-18 2022-08-18 High-performance processing system and method based on communication and calculation separation and storage medium

Country Status (1)

Country Link
CN (1) CN115473861B (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1667601A (en) * 2004-03-11 2005-09-14 国际商业机器公司 Apparatus and method for sharing a network I/O adapter between logical partitions
CN1819584A (en) * 2004-11-12 2006-08-16 微软公司 Method and apparatus for secure internet protocol (ipsec) offloading with integrated host protocol stack management
WO2017046582A1 (en) * 2015-09-16 2017-03-23 Nanospeed Technologies Limited Tcp/ip offload system
WO2018018611A1 (en) * 2016-07-29 2018-02-01 华为技术有限公司 Task processing method and network card
CN109491934A (en) * 2018-09-28 2019-03-19 方信息科技(上海)有限公司 A kind of storage management system control method of integrated computing function
CN109714302A (en) * 2017-10-25 2019-05-03 阿里巴巴集团控股有限公司 The discharging method of algorithm, device and system
CN110109852A (en) * 2019-04-03 2019-08-09 华东计算技术研究所(中国电子科技集团公司第三十二研究所) System and method for realizing TCP _ IP protocol by hardware
CN111031011A (en) * 2019-11-26 2020-04-17 中科驭数(北京)科技有限公司 Interaction method and device of TCP/IP accelerator
CN111163121A (en) * 2019-11-19 2020-05-15 核芯互联科技(青岛)有限公司 Ultra-low-delay high-performance network protocol stack processing method and system
CN112953967A (en) * 2021-03-30 2021-06-11 扬州万方电子技术有限责任公司 Network protocol unloading device and data transmission system
CN113225307A (en) * 2021-03-18 2021-08-06 西安电子科技大学 Optimization method, system and terminal for pre-reading descriptors in offload engine network card
CN113312283A (en) * 2021-05-28 2021-08-27 北京航空航天大学 Heterogeneous image learning system based on FPGA acceleration
CN113347017A (en) * 2021-04-09 2021-09-03 中科创达软件股份有限公司 Network communication method and device, network node equipment and hybrid network
CN114238187A (en) * 2022-02-24 2022-03-25 苏州浪潮智能科技有限公司 FPGA-based full-stack network card task processing system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11537541B2 (en) * 2018-09-28 2022-12-27 Xilinx, Inc. Network interface device and host processing device

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1667601A (en) * 2004-03-11 2005-09-14 国际商业机器公司 Apparatus and method for sharing a network I/O adapter between logical partitions
CN1819584A (en) * 2004-11-12 2006-08-16 微软公司 Method and apparatus for secure internet protocol (ipsec) offloading with integrated host protocol stack management
WO2017046582A1 (en) * 2015-09-16 2017-03-23 Nanospeed Technologies Limited Tcp/ip offload system
WO2018018611A1 (en) * 2016-07-29 2018-02-01 华为技术有限公司 Task processing method and network card
CN109714302A (en) * 2017-10-25 2019-05-03 阿里巴巴集团控股有限公司 The discharging method of algorithm, device and system
CN109491934A (en) * 2018-09-28 2019-03-19 方信息科技(上海)有限公司 A kind of storage management system control method of integrated computing function
CN110109852A (en) * 2019-04-03 2019-08-09 华东计算技术研究所(中国电子科技集团公司第三十二研究所) System and method for realizing TCP _ IP protocol by hardware
CN111163121A (en) * 2019-11-19 2020-05-15 核芯互联科技(青岛)有限公司 Ultra-low-delay high-performance network protocol stack processing method and system
CN111031011A (en) * 2019-11-26 2020-04-17 中科驭数(北京)科技有限公司 Interaction method and device of TCP/IP accelerator
CN113225307A (en) * 2021-03-18 2021-08-06 西安电子科技大学 Optimization method, system and terminal for pre-reading descriptors in offload engine network card
CN112953967A (en) * 2021-03-30 2021-06-11 扬州万方电子技术有限责任公司 Network protocol unloading device and data transmission system
CN113347017A (en) * 2021-04-09 2021-09-03 中科创达软件股份有限公司 Network communication method and device, network node equipment and hybrid network
CN113312283A (en) * 2021-05-28 2021-08-27 北京航空航天大学 Heterogeneous image learning system based on FPGA acceleration
CN114238187A (en) * 2022-02-24 2022-03-25 苏州浪潮智能科技有限公司 FPGA-based full-stack network card task processing system

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Decentralized Attribute-Based Encryption and Data Sharing Scheme in Cloud Storage; Xiehua Li, Yanlong Wang, Ming Xu, Yaping Cui; China Communications (中国通信), No. 02; full text *
User-Level Device Drivers: Achieved Performance; Ben Leslie, Peter Chubb, Nicholas Fitzroy-Dale, Stefan Götz, Charles Gray, Luke Macpherson, Daniel Potts, Kevin Elphinstone, Gernot Heiser; Journal of Computer Science and Technology, No. 05; full text *
A zero-copy data transfer method for TCP/IP offload (一种TCP/IP卸载的数据零拷贝传输方法); 王小峰, 时向泉, 苏金树; Computer Engineering & Science (计算机工程与科学), No. 02; full text *
TCP data receive offloading based on a multi-core NPU (基于多核NPU的TCP数据接收卸载); 李杰, 陈曙晖; Computer Engineering & Science (计算机工程与科学), No. 07; full text *

Also Published As

Publication number Publication date
CN115473861A (en) 2022-12-13

Similar Documents

Publication Publication Date Title
US8477806B2 (en) Method and system for transmission control packet (TCP) segmentation offload
US9110668B2 (en) Enhanced buffer-batch management for energy efficient networking based on a power mode of a network interface
JP2019036298A (en) Intelligent high bandwidth memory system and logic dies therefor
CN109960671B (en) Data transmission system, method and computer equipment
WO2022025966A1 (en) Receiver-based precision congestion control
US20230080588A1 (en) Mqtt protocol simulation method and simulation device
US20220166698A1 (en) Network resource monitoring
WO2022139930A1 (en) Resource consumption control
US20220311711A1 (en) Congestion control based on network telemetry
US20230127722A1 (en) Programmable transport protocol architecture
DE102022126611A1 (en) SERVICE MESH OFFSET TO NETWORK DEVICES
DE102022129250A1 (en) Transmission rate based on detected available bandwidth
US8161126B2 (en) System and method for RDMA QP state split between RNIC and host software
CN115686836A (en) Unloading card provided with accelerator
CN115202573A (en) Data storage system and method
CN115473861B (en) High-performance processing system and method based on communication and calculation separation and storage medium
CN113347017A (en) Network communication method and device, network node equipment and hybrid network
WO2023207295A1 (en) Data processing method, data processing unit, system and related device
US20220321491A1 (en) Microservice data path and control path processing
US20220291928A1 (en) Event controller in a device
Gilfeather et al. Modeling protocol offload for message-oriented communication
WO2023075930A1 (en) Network interface device-based computations
CN115913952A (en) Efficient parallelization and deployment method of multi-target service function chain based on CPU + DPU platform
Jin et al. Impact of protocol overheads on network throughput over high-speed interconnects: measurement, analysis, and improvement
CN113572575A (en) Self-adaptive data transmission method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant