WO2023046141A1 - Acceleration framework and acceleration method for database network load performance, and device - Google Patents

Acceleration framework and acceleration method for database network load performance, and device

Info

Publication number
WO2023046141A1
Authority
WO
WIPO (PCT)
Prior art keywords
database
data
network
protocol stack
thread
Prior art date
Application number
PCT/CN2022/121232
Other languages
French (fr)
Chinese (zh)
Inventor
梁家琦
吕温
钟舟
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2023046141A1 publication Critical patent/WO2023046141A1/en

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 - Design, administration or maintenance of databases
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 - Integrating or interfacing systems involving database management systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/48 - Program initiating; Program switching, e.g. by interrupt
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/54 - Interprogram communication

Definitions

  • the present application relates to the field of databases, and in particular to an acceleration framework, acceleration method and equipment for database network load performance.
  • a database is an organic collection of large amounts of shared data organized according to a certain structure and stored in a computer for a long time.
  • a database system (database system, DBS) is a system composed of computer software, hardware and data resources that realizes organized and dynamic storage of a large amount of associated data and facilitates multi-user access.
  • the main factors affecting database network load performance include network load delay, central processing unit (CPU) usage, disk input/output (I/O), memory usage efficiency, and database kernel technology.
  • the current technologies to accelerate database network load performance mainly include: deep mining of database kernel technology at the software level, combination of software and hardware CPU, and new storage media.
  • For example, method 1: Ali OceanBase (a high-performance distributed database system supporting massive amounts of data) accelerates on-line transaction processing (OLTP) database load performance at the software level through its high-performance Libeasy network framework.
  • Method 2: Ali PolarDB (Alibaba Cloud's self-developed cloud-native relational database) builds a database kernel engine based on remote direct memory access (RDMA), a new hardware technology: local memory is written directly to the memory address of another machine through RDMA, and the encoding/decoding of the intermediate communication protocol and the retransmission mechanism are completed entirely by the RDMA network card without CPU participation.
  • However, method 1 causes frequent switching between user mode and kernel mode and multiple memory copies of kernel-mode protocol stack data, which wastes system resources and adds network load latency, thereby degrading database performance.
  • Method 2 uses RDMA interaction so that the database bypasses the operating system (OS) kernel to accelerate load performance, but it relies on RDMA network card hardware, a new hardware technology: in practice, end-to-end physical hardware cooperation is required, which gives poor flexibility and versatility.
  • Moreover, at the software level, implementing the RDMA protocol requires extensive and complex adaptation and modification of the application-layer database kernel to ensure usability.
  • In view of this, an embodiment of the present application provides an acceleration framework, acceleration method and device for database network load performance.
  • The acceleration framework replaces the kernel-mode network protocol stack with a user-mode network protocol stack to realize operating-system kernel bypass. This purely software-based technique does not rely on new network devices and is controllable and friendly.
  • The framework decouples the database from the user-mode network protocol stack to cope with the high concurrency of the user-mode network, and decouples the business and communication of a traditional database to reduce system overhead.
  • A first aspect of the embodiments of the present application provides an acceleration framework for database network load performance, which can be used in the field of databases.
  • The acceleration framework provided by the embodiment of the present application runs on a computer device, which is composed of hardware and software; the software mainly includes an operating system and a database.
  • The acceleration framework provided by the embodiment of the present application realizes data transmission and reception from the client (referring to a device other than the computer device) to the computer device through the network card device, and uses the user-mode network protocol stack and the database software to provide database services such as adding, deleting, modifying and querying data.
  • The acceleration framework includes: a user-mode network protocol stack, a database network architecture (also referred to as a database network communication framework) and a database multi-thread architecture, where the database network architecture includes at least one database network thread and the database multi-thread architecture includes at least one database business thread.
  • the database multi-thread architecture and the database network architecture are connected through a communication control transceiver interface, and both the database network architecture and the database multi-thread architecture are included in the database kernel.
  • The user-mode network protocol stack is used to receive the initial data sent by the network card device (for example, one or more initial data packets) and to parse the initial data through the TCP/IP protocol stack in it to obtain first data. At least one database network thread included in the database network architecture is used to obtain the first data and to instruct the database multi-thread architecture to read the first data from the database network architecture. The database multi-thread architecture is used to read the first data through the communication control transceiver interface between the database multi-thread architecture and the database network architecture, so as to execute the service corresponding to the first data in the database (which may be referred to as a first service).
  • In the above embodiment, the acceleration framework replaces the kernel-mode network protocol stack with the user-mode network protocol stack. Since the user-mode network protocol stack resides in the user mode of the operating system, operating-system kernel bypass is realized; this purely software-based technique does not rely on new network devices and is controllable and friendly. Moreover, the acceleration framework decouples the database from the user-mode network protocol stack, which allows it to cope with the high concurrency of the user-mode network. In addition, the interaction between the database multi-thread architecture and the database network threads in the database network architecture decouples the business and communication of a traditional database (the database network threads and the database business threads act as each other's consumers and producers), thereby reducing system overhead.
  • the acceleration framework may further include a user mode network configuration module configured to configure the user mode network protocol stack by creating a daemon process.
  • In this embodiment, the acceleration framework may also include a user-mode network configuration module, which is responsible for giving the current operating system of the computer device the capability of the user-mode network protocol stack, thereby realizing automatic deployment of the user-mode network protocol stack and making the configuration process more convenient.
  • The user-mode network configuration module is specifically configured to perform at least one of the following configuration operations: setting the data plane development kit (DPDK) user-mode driver, setting huge page memory (huge pages), setting timing tasks, setting the kernel virtual network card (kernel NIC interface, KNI), setting the control permissions of user-mode components, and so on.
  • the timing task is responsible for managing and controlling the configuration environment of the user-mode network protocol stack to ensure the high availability of the database during the use of the user-mode network protocol stack.
  • In a possible implementation, after the database multi-thread architecture executes an upper-layer service of the database (which may be referred to as a second service), the resulting data may be called second data. The database multi-thread architecture can also be used to send the second data to the database network architecture through the communication control transceiver interface between the database multi-thread architecture and the database network architecture, and at least one database network thread in the database network architecture is used to further send the second data to the user-mode network protocol stack.
  • In the above implementation, the interaction between the database multi-thread architecture and the database network threads in the database network architecture is described for the case where the database on the computer device writes (sends) data; the business and communication of a traditional database are decoupled (the database network threads and the database business threads act as each other's consumers and producers), thereby reducing system overhead.
  • In a possible implementation, the user-mode network protocol stack includes a user-mode process (also called the Ltran process) and a network protocol stack component (also called the dynamic library Lstack.so); that is, at the software level the user-mode network protocol stack is embodied as the Ltran process in user-mode space and the dynamic library Lstack.so.
  • The user-mode network protocol stack is specifically used to: start the user-mode process in user-mode space, receive the initial data sent by the network card device through the user-mode process, and store the initial data in shared memory; afterwards, the network protocol stack component parses the initial data in the shared memory based on the TCP/IP protocol stack to obtain the first data (that is, the initial data is resolved into a data packet format that the computer device can recognize), and the obtained first data remains in the shared memory.
  • In this implementation, the user-mode network protocol stack may further include a user-mode process and a network protocol stack component; the user-mode process shares memory with the network protocol stack component, and the two interact through the shared memory, including during the TCP/IP protocol stack data parsing process, which makes the scheme readily implementable.
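  • As an illustration of the shared-memory interaction described above, the following minimal C sketch shows a single-producer/single-consumer ring placed in POSIX shared memory through which a user-mode process could hand raw frames to a protocol-stack component. The ring layout and the name lstack_ring_demo are assumptions for illustration only; the actual Ltran/Lstack.so interface is not specified in this text. (Build with gcc -std=c11 file.c -lrt.)

```c
/* Sketch only: an SPSC ring in POSIX shared memory for handing raw frames
 * from a user-mode process (producer) to a protocol-stack component
 * (consumer). Names and layout are hypothetical. */
#include <fcntl.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define RING_SLOTS 1024              /* must be a power of two */
#define PKT_MAX    2048

struct pkt { uint32_t len; uint8_t data[PKT_MAX]; };

struct lstack_ring {
    _Atomic uint32_t head;           /* written by the producer  */
    _Atomic uint32_t tail;           /* written by the consumer  */
    struct pkt slots[RING_SLOTS];
};

static struct lstack_ring *ring_attach(const char *name, int create)
{
    int fd = shm_open(name, O_RDWR | (create ? O_CREAT : 0), 0600);
    if (fd < 0) return NULL;
    if (create && ftruncate(fd, sizeof(struct lstack_ring)) != 0) return NULL;
    void *p = mmap(NULL, sizeof(struct lstack_ring),
                   PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    return p == MAP_FAILED ? NULL : p;
}

/* Producer side: store one raw frame received from the NIC. */
static int ring_put(struct lstack_ring *r, const void *buf, uint32_t len)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SLOTS || len > PKT_MAX) return -1;   /* full */
    struct pkt *slot = &r->slots[head & (RING_SLOTS - 1)];
    memcpy(slot->data, buf, len);
    slot->len = len;
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 0;
}

/* Consumer side: take one frame for protocol-stack parsing. */
static int ring_get(struct lstack_ring *r, void *buf, uint32_t *len)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail == head) return -1;                                 /* empty */
    struct pkt *slot = &r->slots[tail & (RING_SLOTS - 1)];
    *len = slot->len;
    memcpy(buf, slot->data, slot->len);
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 0;
}

int main(void)
{
    struct lstack_ring *r = ring_attach("/lstack_ring_demo", 1);
    if (!r) { perror("ring_attach"); return 1; }
    const char frame[] = "raw-ethernet-frame";
    ring_put(r, frame, sizeof(frame));
    uint8_t out[PKT_MAX]; uint32_t n;
    if (ring_get(r, out, &n) == 0)
        printf("dequeued %u bytes\n", n);
    shm_unlink("/lstack_ring_demo");
    return 0;
}
```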
  • In a possible implementation, for the second data, the database network thread in the communication pool is specifically used to send the second data to the network protocol stack component, and the network protocol stack component is specifically used to store the second data in the shared memory so that the network card device can read the second data from the shared memory.
  • In this implementation, the memory shared by the user-mode process and the network protocol stack component can also be used to store the second data produced by upper-layer business processing of the database, so that the network card device can read the second data from the shared memory; this gives the scheme wide applicability.
  • In a possible implementation, the database network architecture may further include a data sharing buffer, which may be called a data resource pool; that is, data resource pooling is realized by creating a data sharing buffer. The data resource pool is used to store the first data from the user-mode network protocol stack.
  • The data resource pool is responsible for packet aggregation and/or batch sending and receiving of data to and from the user-mode network protocol stack, so as to achieve dynamic flow control and elastic scaling.
  • In a possible implementation, the data resource pool is also used to store the second data from the database multi-thread architecture.
  • When the database network architecture includes a data resource pool, the database network thread in the communication pool is used to read the first data, put it into the data resource pool, and instruct the database multi-thread architecture to read the first data from the data resource pool, thereby completing the data interaction with the user-mode network protocol stack.
  • the first data obtained by the database network thread is stored in the data resource pool, and the database service thread also calls the first data from the data resource pool, thereby realizing the flow control of the database.
  • A second aspect of the embodiments of the present application provides a method for accelerating database network load performance.
  • The method includes: when a peer device sends data (which may be referred to as initial data) to the computer device through the network card device, the computer device receives the initial data sent by the peer device from the network card device through the user-mode network protocol stack, and further parses the initial data through the TCP/IP protocol stack in the user-mode network protocol stack to obtain first data.
  • After the user-mode network protocol stack receives the initial data and parses out the first data, the computer device obtains the first data from the user-mode network protocol stack through the communication pool (the communication pool is composed of at least one database network thread, and each database network thread in the communication pool is responsible for message control processing and message sending/receiving processing). For example, a database network thread in the communication pool may obtain the first data from the user-mode network protocol stack in polling mode (other modes, such as periodic checking or wake-up checking, are also possible), and further instruct a database business thread in the database multi-thread architecture (the database multi-thread architecture includes at least one database business thread) to read the first data from the database network architecture, where the communication pool belongs to the database network architecture, that is, the database network thread belongs to the database network architecture.
  • Finally, the computer device reads the first data through the database multi-thread architecture via the communication control transceiver interface between the database multi-thread architecture and the database network architecture, and executes the task corresponding to the first data in the database according to the first data.
  • In this way, operating-system kernel bypass can be realized; this purely software-based technique does not rely on new network devices and is controllable and friendly.
  • In a traditional database, the business and the network run in the same thread, while the user-mode network capability is an independent process and data resource in user space, which makes the RTC (run-to-completion) model unusable in user mode. Therefore, the acceleration method provided by the embodiment of the present application decouples the business and the network of the database.
  • the communication pool is used in the database network architecture to improve resource reuse and reduce system overhead.
  • the user-mode network protocol stack is configured by the user-mode network configuration module deployed on the computer device by creating a daemon process.
  • In this method embodiment, the user-mode network configuration module is responsible for giving the current operating system of the computer device the capability of the user-mode network protocol stack, thereby realizing automatic deployment of the user-mode network protocol stack and making the configuration process more convenient.
  • The daemon process performs at least one of the following configuration operations: setting the data plane development kit (DPDK) user-mode driver, setting huge page memory, setting timing tasks, setting the kernel virtual network card (KNI), and setting the control permissions of user-mode components.
  • the timing task is responsible for managing and controlling the configuration environment of the user-mode network protocol stack to ensure the high availability of the database during the use of the user-mode network protocol stack.
  • In a possible implementation, the acceleration method may also include: after the database multi-thread architecture executes an upper-layer service of the database on the computer device (which may be called a second service), the resulting data may be called second data; after that, the computer device sends the second data to the database network architecture through the database multi-thread architecture via the communication control transceiver interface between the database multi-thread architecture and the database network architecture, and then the database network thread in the communication pool of the database network architecture further sends the second data to the user-mode network protocol stack.
  • In this implementation, the interaction between the database multi-thread architecture and the database network threads in the database network architecture is described for the case where the database on the computer device writes (sends) data; the business and communication of a traditional database are decoupled (the database network threads and the database business threads act as each other's consumers and producers), thereby reducing system overhead.
  • the user-mode network protocol stack may further include a user-mode process (also referred to as an Ltran process) and a network protocol stack component (also referred to as a dynamic library Lstack.so), That is to say, the user mode network protocol stack is embodied as the Ltran process in the user mode space and the dynamic library Lstack.so at the software level.
  • the user mode process shares memory with the network protocol stack component, and the two exchange messages through the shared memory.
  • In this case, the computer device receives the initial data sent by the network card device through the user-mode process and stores it in the memory shared by the user-mode process and the network protocol stack component; the computer device then uses the network protocol stack component to parse the initial data in the shared memory based on the TCP/IP protocol stack to obtain the first data (that is, to parse the initial data into a data packet format that the computer device can recognize), and the obtained first data is still stored in the shared memory.
  • In this implementation, the user-mode network protocol stack may further include a user-mode process and a network protocol stack component; the user-mode process shares memory with the network protocol stack component, and the two interact through the shared memory, including during the TCP/IP protocol stack data parsing process, which makes the scheme readily implementable.
  • In a possible implementation, the way in which the computer device sends the second data to the user-mode network protocol stack through the database network thread may specifically be: the computer device sends the second data to the network protocol stack component through the database network thread, and the network protocol stack component then stores the received second data in the shared memory.
  • In this implementation, the memory shared by the user-mode process and the network protocol stack component can also be used to store the second data produced by upper-layer business processing of the database, so that the network card device can read the second data from the shared memory; this gives the scheme wide applicability.
  • the database network architecture may also include a data sharing buffer.
  • The data sharing buffer may be called a data resource pool; that is, data resource pooling is realized by creating a data sharing buffer.
  • The data resource pool is responsible for packet aggregation and/or batch sending and receiving of the data of the user-mode network protocol stack, so as to realize dynamic flow control and elastic scaling.
  • After obtaining the first data (for example in polling mode, or in other modes such as periodic checking or wake-up checking), the database network thread in the communication pool stores the first data in the data resource pool.
  • Through resource sharing and buffer multiplexing of the network protocol stack data (i.e., the first data), the overhead of data copying and resource creation is reduced.
  • In a possible implementation, after the computer device sends the second data to the database network architecture through the database multi-thread architecture, the acceleration method may further include: the computer device stores the second data in the data sharing buffer (that is, the data resource pool) through the database network thread.
  • a third aspect of the embodiments of the present application provides a computer device, where the computer device has a function of implementing the method of the second aspect or any possible implementation manner of the second aspect.
  • This function may be implemented by hardware, or may be implemented by executing corresponding software on the hardware.
  • the hardware or software includes one or more modules corresponding to the above functions.
  • A fourth aspect of the embodiments of the present application provides a computer device, which may include a memory, a processor and a bus system, where the memory is used to store a program and the processor is used to call the program stored in the memory to execute the method of the second aspect or any possible implementation of the second aspect of the embodiments of the present application.
  • A fifth aspect of the embodiments of the present application provides a computer-readable storage medium storing instructions which, when run on a computer, cause the computer to execute the method of the above second aspect or any possible implementation of the second aspect.
  • the sixth aspect of the embodiments of the present application provides a computer program, which, when running on a computer, causes the computer to execute the method of the above-mentioned second aspect or any possible implementation manner of the second aspect.
  • A seventh aspect of the embodiments of the present application provides a chip, which includes at least one processor and at least one interface circuit coupled to the processor. The at least one interface circuit is used to perform sending and receiving functions and to send instructions to the at least one processor, and the at least one processor is used to run computer programs or instructions that have the function of realizing the method of the above second aspect or any possible implementation of the second aspect. This function may be realized by hardware, by software, or by a combination of hardware and software, and the hardware or software includes one or more modules corresponding to the above functions.
  • the interface circuit is used to communicate with other modules outside the chip.
  • Fig. 1 is a schematic diagram of a system architecture of the Libeasy network framework
  • Figure 2 is a schematic diagram of an implementation architecture of the libev-based reactor model of the Libeasy network framework
  • Fig. 3 is a schematic diagram of the thread sharing model of the Libeasy network framework
  • Figure 4 is a schematic diagram of a system architecture based on the RDMA-based database kernel engine to accelerate database load performance
  • FIG. 5 is a schematic diagram of an acceleration framework for database network load performance provided by an embodiment of the present application.
  • FIG. 6 is another schematic diagram of the acceleration framework of the database network load performance provided by the embodiment of the present application.
  • Fig. 7 is a system structure diagram of the acceleration framework provided by the embodiment of the present application.
  • FIG. 8 is a schematic diagram of the interaction between the database multi-thread architecture and the database network architecture provided by the embodiment of the present application.
  • FIG. 9 is a schematic flowchart of a method for accelerating database network load performance provided by an embodiment of the present application.
  • Fig. 10 is a core implementation flowchart of the method for accelerating the database network load performance provided by the embodiment of the present application.
  • FIG. 11 is an implementation flowchart of a method for accelerating database network load performance provided by an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • FIG. 13 is another schematic structural diagram of a computer device provided by an embodiment of the present application.
  • Embodiments of the present application provide an acceleration framework, acceleration method, and device for database network load performance.
  • The acceleration framework replaces the kernel-mode network protocol stack with a user-mode network protocol stack to realize OS kernel bypass. This purely software-based technique does not rely on new network equipment and is controllable and friendly.
  • the framework decouples the database from the user-mode protocol stack to cope with the high concurrency of the user-mode network; it decouples the business and communication of traditional databases to reduce system overhead.
  • the embodiment of the present application involves a lot of relevant knowledge about databases.
  • The following first introduces related terms and concepts that may be involved in the embodiments of the present application. It should be understood that the interpretation of related concepts may be constrained by the specific circumstances of the embodiments, but this does not mean that the application is limited to those specific circumstances, and the specific circumstances of different embodiments may differ; no limitation is imposed here.
  • the database is a warehouse that organizes, stores and manages data according to the data structure. Its essence is a file system.
  • the data is stored in a specific format. Users can add, modify, delete and query the data in the database.
  • A database stores data according to a data structure. In relational databases such as Oracle, SQL Server and DB2, the stored data is mainly structured data, which has a regular format and is generally stored in the form of rows and columns.
  • a database system is a system composed of computer software, hardware and data resources that can organize, dynamically store a large amount of associated data, and facilitate multi-user access.
  • OLTP is a typical application of database.
  • OLTP is a highly transactional workload, dominated by frequent, small transactions. In such a system a single database often handles thousands of transactions per second, and query statements may execute tens of thousands of times per second, so OLTP is also known as a transaction-oriented processing system. Its basic characteristic is that newly arriving customer data can be transmitted immediately to the computing center for processing and the result is returned within a very short time; the biggest advantage is that incoming data can be processed instantly with a prompt response. Typical OLTP applications include e-commerce systems, bank transactions, securities trading, and so on. OLTP processing is performed by the database engine.
  • An important indicator for measuring OLTP is system performance, embodied in real-time response time (RT), that is, the time the computer device needs to reply to a request after the user submits data at a terminal.
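  • For concreteness, the small sketch below shows one common way to measure the RT metric on the client side: wall-clock time around a request/reply round trip. The do_request() placeholder is hypothetical.

```c
/* Measure response time (RT) for one request/reply round trip. */
#include <stdio.h>
#include <time.h>

static void do_request(void) { /* placeholder: send SQL, wait for the reply */ }

int main(void)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    do_request();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double rt_ms = (t1.tv_sec - t0.tv_sec) * 1e3
                 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
    printf("response time: %.3f ms\n", rt_ms);
    return 0;
}
```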
  • The operating system requires two CPU states: one is called the user state and the other the kernel state, where the kernel state runs operating-system programs and the user state runs user programs.
  • Kernel mode and user mode are two operating levels of the operating system.
  • When a program runs at privilege level 3 it is said to run in user mode, because this is the lowest privilege level, the level of ordinary user processes; most programs that users directly interact with run in user mode. Conversely, when a program runs at privilege level 0 it is said to run in kernel mode.
  • Programs running in user mode cannot directly access operating system kernel data structures and programs.
  • When a user executes a program, it runs in user mode most of the time and switches to kernel mode only when it needs the operating system's help to complete work that it does not have the privilege or ability to do itself.
  • The main difference between user mode and kernel mode is: when executing in user mode, the memory space and objects a process can access are limited and the processor it occupies can be preempted; when executing in kernel mode, the process can access all memory space and objects, and the processor it occupies is not allowed to be preempted.
  • the user-mode process requests to use the service program provided by the operating system to complete the work through the system call.
  • the core of the system call mechanism is implemented by using an interrupt specially opened by the operating system for users, such as the int 80h interrupt of Linux.
  • When a peripheral device completes the operation requested by the user, it sends a corresponding interrupt signal to the CPU. The CPU then suspends execution of the next instruction and turns to the handler associated with the interrupt signal; if the previously executing instructions belonged to a user-mode program, the transition naturally occurs from user mode to kernel mode. For example, when a hard disk read/write operation completes, the system switches to the hard disk read/write interrupt handler to perform subsequent operations.
  • The above three mechanisms (system call, exception, and peripheral interrupt) are the most important ways the operating system transitions from user mode to kernel mode at run time; the system call is initiated actively by the user process, while exceptions and peripheral interrupts are passive.
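  • The following small C program illustrates the system-call path: the same kernel write service is invoked through the C library wrapper and through the raw syscall(2) interface; each call switches the CPU from user mode to kernel mode and back (on modern x86-64 Linux the trap uses the syscall instruction rather than the legacy int 80h gate).

```c
/* Two ways to enter the kernel for the same service: the libc wrapper and
 * the raw system-call interface. Both cause a user-mode to kernel-mode
 * transition and a return to user mode with the result. */
#define _GNU_SOURCE
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
    const char msg[] = "hello from user mode\n";

    /* 1. Library wrapper around the write system call. */
    write(STDOUT_FILENO, msg, strlen(msg));

    /* 2. Raw system call invoked by number. */
    syscall(SYS_write, STDOUT_FILENO, msg, strlen(msg));

    return 0;
}
```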
  • the user-mode network protocol can also be called the user-mode network protocol stack.
  • the kernel mode and the user mode are two operating levels of the operating system.
  • When a task (that is, a process) executes a system call and traps into kernel code, the process is said to be in the kernel running state, that is, kernel mode; when the process is executing the user's own code, it is said to be in the user running state, that is, user mode.
  • The traditional transmission control protocol/internet protocol (TCP/IP) stack runs in kernel mode, while a user-mode network protocol stack is a TCP/IP protocol stack running in the user mode of the operating system.
  • Ali OceanBase (a high-performance distributed database system supporting massive data) proposes a high-performance Libeasy network framework, which is implemented based on the event-driven model Libev of the kernel-state network protocol stack, which uses coroutines to manage task scheduling.
  • The system architecture of this network framework is shown in Figure 1 and includes the following software modules: the database server (DB in Figure 1), the Libeasy network framework (Libeasy in Figure 1), the event-driven model libev (libev in Figure 1) and the network card device (nic in Figure 1).
  • the database server is responsible for receiving structured query language (structured query language, SQL) request processing from the client, and performing data receiving/sending interaction through the data reading/writing of the network card.
  • structured query language structured query language
  • the Libeasy network framework is based on the event-driven model libev, responsible for organizing connection, message, request and other message processing and resource management. Threads in libeasy are divided into business logic threads and network I/O threads, responsible for business processing and network I/O processing.
  • the event-driven model libev is implemented based on the reactor mode. It performs multiple I/O multiplexing by calling the TCP/IP protocol stack interface in the operating system kernel state, and completes the control of sending and receiving data packets of the network card.
  • the client requests to reach the network card device of the database server through Ethernet, and the server obtains the data of the network device (also called nic data) through direct memory access (DMA) and interrupt wake-up technology.
  • libev processes nic data.
  • The Libeasy network framework is implemented based on libev's reactor model; its main implementation architecture is shown in Figure 2 and includes the following modules: 1. EventHandler: an interface for event data, such as timer events and I/O events. 2. Reactor: the reactor uses I/O multiplexing and a Timer; when an EventHandler is registered it calls the corresponding interface, and Reactor's HandleEvents first calls I/O multiplexing and the Timer to obtain ready events and finally invokes each EventHandler. 3. Timer: manages timers, mainly responsible for registering events and obtaining the list of timed-out events, generally implemented by network framework developers. 4. The multi-channel I/O multiplexing model, which reads and writes data from the operating system kernel through epoll and realizes kernel-mode TCP/IP data transmission and reception over multiple listening handles.
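  • The following minimal C sketch shows the general reactor idea described above: a single epoll loop over kernel-mode TCP sockets that dispatches ready events to handlers. It is a generic illustration, not the Libeasy/libev implementation; the port number 5432 is arbitrary.

```c
/* Generic single-threaded reactor: epoll multiplexes kernel-mode sockets
 * and dispatches ready events (accept new connections, echo request data). */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

#define MAX_EVENTS 64

int main(void)
{
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET,
                                .sin_port = htons(5432),
                                .sin_addr.s_addr = htonl(INADDR_ANY) };
    int one = 1;
    setsockopt(listener, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
    if (bind(listener, (struct sockaddr *)&addr, sizeof(addr)) != 0 ||
        listen(listener, 128) != 0) {
        perror("listen");
        return 1;
    }

    int ep = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listener };
    epoll_ctl(ep, EPOLL_CTL_ADD, listener, &ev);

    struct epoll_event events[MAX_EVENTS];
    for (;;) {                                    /* the reactor event loop */
        int n = epoll_wait(ep, events, MAX_EVENTS, -1);
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            if (fd == listener) {                 /* connection event handler */
                int conn = accept(listener, NULL, NULL);
                struct epoll_event cev = { .events = EPOLLIN, .data.fd = conn };
                epoll_ctl(ep, EPOLL_CTL_ADD, conn, &cev);
            } else {                              /* I/O event handler */
                char buf[4096];
                ssize_t r = read(fd, buf, sizeof(buf));
                if (r <= 0) { close(fd); continue; }
                write(fd, buf, (size_t)r);        /* echo back the request */
            }
        }
    }
}
```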
  • the Libeasy network framework supports two common thread models. One is that the network I/O thread and the worker thread share the same thread, and the other is that the network I/O thread and the worker thread are separated.
  • Figure 3 is a schematic diagram of the thread sharing model of the Libeasy network framework, where Process I/O Read: process read I/O. Process: Parse the request and calculate the result. Process I/O Write: Used to process write I/O, return network data and calculation results.
  • each network I/O thread is responsible for an event_loop for data interaction and reading and writing.
  • Process I/O Read processes the read data, then parses the request, generates a task, pushes it to the queue of the worker thread, and then notifies the worker thread for processing in the form of an asynchronous event.
  • When the worker thread receives the asynchronous event, it takes tasks out of its work queue and processes them sequentially; after processing is completed, the results are placed in the queue of the corresponding network I/O thread, which is then notified in the form of an asynchronous event.
  • When the I/O thread receives the notification, Process I/O Write handles the write-data requests in sequence.
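  • The separated-thread interaction described above (an I/O thread pushes parsed tasks to a worker queue and signals the worker with an asynchronous event) can be sketched in C as follows. This is a simplified illustration that uses an eventfd as the asynchronous notification and a mutex-protected queue; the return path from the worker back to the I/O thread is symmetric and omitted. (Build with gcc file.c -pthread.)

```c
/* Simplified I/O-thread / worker-thread handoff: the I/O side enqueues
 * parsed requests and wakes the worker through an eventfd. */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define QCAP 128

struct task { char request[64]; };

static struct task queue[QCAP];
static unsigned qhead, qtail;        /* no overflow check in this sketch */
static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;
static int notify_fd;                /* asynchronous wake-up for the worker */

static void push_task(const char *req)          /* called by the I/O thread */
{
    pthread_mutex_lock(&qlock);
    snprintf(queue[qtail % QCAP].request, sizeof(queue[0].request), "%s", req);
    qtail++;
    pthread_mutex_unlock(&qlock);
    uint64_t one = 1;
    write(notify_fd, &one, sizeof(one));        /* asynchronous notification */
}

static void *worker_main(void *arg)
{
    (void)arg;
    for (;;) {
        uint64_t n;
        read(notify_fd, &n, sizeof(n));         /* block until tasks arrive */
        pthread_mutex_lock(&qlock);
        while (qhead != qtail) {
            struct task t = queue[qhead % QCAP];
            qhead++;
            pthread_mutex_unlock(&qlock);
            printf("worker: processing \"%s\"\n", t.request);
            pthread_mutex_lock(&qlock);
        }
        pthread_mutex_unlock(&qlock);
    }
    return NULL;
}

int main(void)
{
    notify_fd = eventfd(0, 0);
    pthread_t worker;
    pthread_create(&worker, NULL, worker_main, NULL);

    /* The I/O thread would normally produce these in Process I/O Read. */
    push_task("SELECT 1");
    push_task("INSERT INTO t VALUES (42)");
    sleep(1);                                   /* let the worker drain */
    return 0;
}
```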
  • the implementation of method 1 is mainly to accelerate the load performance of OLTP database through the high-performance Libeasy network framework.
  • The disadvantages of this technical solution are as follows:
  • (a) The event-driven model libev used here is a communication framework based on the kernel-mode TCP/IP network protocol stack; it causes frequent switching between user mode and kernel mode and multiple memory copies of kernel-mode protocol stack data, which wastes system resources and adds network load latency, thereby degrading database performance under OLTP.
  • (b) In the thread-sharing model of the Libeasy network framework, a request can be processed directly in the same thread after parsing, which saves thread-switching overhead; however, this is only well suited to requests whose processing takes little time.
  • Ali PolarDB (Alibaba Cloud's self-developed cloud-native relational database) builds an RDMA-based database kernel engine on new hardware technology. Through the RDMA network, local memory is written directly to the memory address of another machine; the intermediate communication protocol encoding/decoding and retransmission mechanism are completed entirely by the RDMA network card without CPU participation, and a complete set of user-mode I/O and network protocol stacks is provided. As shown in Figure 4, the implementation is as follows: PolarDB adopts a distributed cluster architecture in which computing nodes and storage nodes are interconnected by a high-speed network.
  • Data transmission is performed through the RDMA protocol, so that I/O performance is no longer a bottleneck, and the CPU bypass of the operating system kernel is realized, thereby accelerating the performance of the database kernel.
  • DB data files, redolog, etc. are transmitted to the remote data server through the user state file system, through the block device data management route, and relying on the high-speed network and RDMA protocol.
  • The data server (Chunk Server) keeps multiple copies of the data to ensure reliability and uses the Parallel-Raft protocol to ensure consistency. This method relies on the RDMA network card to transmit data from other servers directly into the local machine's storage area; the data transfer happens entirely at the user layer, without entering kernel mode, without consuming system memory, and without any impact on the operating system.
  • Although method 2 uses RDMA interaction so that the database bypasses the operating system kernel to accelerate load performance, it relies on RDMA network card hardware, a new hardware technology; in practical applications, end-to-end physical hardware cooperation is required, which gives poor flexibility and versatility. At the same time, at the software level, implementing the RDMA protocol requires extensive and complex adaptation and modification of the application-layer database kernel to ensure usability.
  • the embodiment of the present application firstly provides an acceleration framework for database network load performance.
  • the acceleration framework replaces the kernel-mode network protocol stack with the user-mode network protocol stack to realize operating system kernel bypass.
  • This pure soft technology does not rely on new network equipment and is controllable and friendly.
  • the framework decouples the database from the user-mode protocol stack to cope with the high concurrency of the user-mode network; it decouples the business and communication of traditional databases to reduce system overhead.
  • the database network load performance acceleration framework provided by the embodiment of the present application runs on computer equipment, which is composed of hardware and software, and the software mainly includes an operating system and a database.
  • The acceleration framework provided by the embodiment of the present application realizes data transmission and reception from the client (referring to a device other than the computer device) to the computer device through the network card device, and uses the user-mode network protocol stack and the database software to provide database services such as adding, deleting, modifying and querying data. Please refer to FIG. 5 for details.
  • FIG. 5 is a schematic diagram of an acceleration framework for database network load performance provided by an embodiment of the present application.
  • the acceleration framework 500 is deployed on a computer device (such as a server) with a network card (such as a 1822 network card device).
  • the computer equipment has been deployed with an operating system and a database.
  • the user state and kernel state (including kernel state applications, such as the Linux kernel) in Figure 5 are two operating states of the operating system.
  • The acceleration framework 500 runs in user mode and may specifically include the following modules: a user-mode network protocol stack 501, a database network architecture (also referred to as a database network communication framework) 502, and a database multi-thread architecture 503, where the database network architecture 502 includes at least one database network thread, the database multi-thread architecture 503 includes at least one database business thread, the database multi-thread architecture 503 and the database network architecture 502 are connected through a communication control transceiver interface, and both the database network architecture 502 and the database multi-thread architecture 503 are included in the database kernel.
  • Optionally, the acceleration framework 500 may further include a user-mode network configuration module 504, which is responsible for giving the current operating system the capability of the user-mode network protocol stack 501; specifically, it automatically configures the user-mode network protocol stack 501 by creating a daemon process.
  • Specifically, the user-mode network configuration module 504 is used to perform at least one of the following configuration operations: setting the DPDK user-mode driver, setting huge page memory, setting scheduled tasks, setting KNI, setting the control permissions of user-mode components, and so on.
  • the network card device of the computer device deployed with the acceleration framework 500 needs to support the DPDK driver, but the type of the network card device is not limited.
  • DPDK is an open source data plane development tool set, which provides an efficient packet processing library function in user mode.
  • Through multiple techniques such as memory/buffer/queue management, load balancing based on network card multi-queue, and flow identification, DPDK realizes high-performance packet forwarding on x86 (a complex instruction set architecture introduced by Intel) and ARM processors, allowing users to develop various high-speed network frameworks in user-mode space.
  • the physical network card loads the DPDK driver, and maps the network card hardware registers to the user mode to realize the DPDK network card takeover.
  • the user mode network protocol stack 501 may also provide a KNI driver.
  • The acceleration framework 500 in the embodiment of the present application mainly uses daemon technology to keep the DPDK takeover of the network card and the loading of the KNI driver highly available. By configuring the number of huge pages and mounting hugetlbfs, DPDK is given its huge page memory. In addition, timing-task technology can be used to manage the processes involved in the user-mode network configuration.
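  • As a hedged sketch of what such a configuration daemon could do (the text above does not give concrete commands), the C program below reserves huge pages, mounts hugetlbfs, loads the rte_kni and vfio-pci modules, binds a NIC to the DPDK user-mode driver and then periodically re-checks the takeover. The PCI address, file paths and the use of dpdk-devbind.py are assumptions based on common DPDK deployment practice, not details taken from this text.

```c
/* Hypothetical user-mode network configuration daemon: the commands below
 * follow common DPDK deployment practice and are assumptions, not the
 * patent's actual implementation. */
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static int run(const char *cmd)
{
    fprintf(stderr, "config: %s\n", cmd);
    return system(cmd);
}

int main(void)
{
    /* Detach so the configuration logic keeps running as a daemon. */
    if (daemon(0, 0) != 0) { perror("daemon"); return 1; }

    /* 1. Huge page memory: reserve 2 MB pages and mount hugetlbfs. */
    run("echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages");
    run("mkdir -p /mnt/huge && mount -t hugetlbfs nodev /mnt/huge");

    /* 2. Kernel virtual NIC (KNI): load the rte_kni module. */
    run("modprobe rte_kni");

    /* 3. DPDK user-mode driver: bind the NIC (PCI address is an example). */
    run("modprobe vfio-pci");
    run("dpdk-devbind.py --bind=vfio-pci 0000:3b:00.0");

    /* 4. Control permissions for user-mode components (example paths). */
    run("chmod 660 /dev/vfio/*");

    /* 5. Timing task: periodically re-check that the takeover still holds. */
    for (;;) {
        run("dpdk-devbind.py --status-dev net > /var/run/usernet_status 2>&1");
        sleep(30);
    }
}
```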
  • In actual applications, a peer device can write data into the database deployed on the computer device through the network card, which is the process in which the database reads (receives) data; the computer device can also send database data to the peer device through the network card, which is the process in which the database writes (sends) data.
  • Based on the acceleration framework 500 described above in FIG. 5, the operations performed by the acceleration framework provided by the embodiment of the present application are described in detail below for these two different data processing situations:
  • the user mode network protocol stack 501 is used to receive the initial data sent by the network card device (such as one or a plurality of initial data packets), and parse the initial data through the TCP/IP protocol stack therein to obtain the first data.
  • Specifically, as shown in FIG. 6, the user-mode network protocol stack 501 may further include a user-mode process (also referred to as the Ltran process) 5011 and a network protocol stack component (also referred to as the dynamic library Lstack.so) 5012; that is, at the software level the user-mode network protocol stack 501 is embodied as the Ltran process in user-mode space and the dynamic library Lstack.so.
  • the user state process 5011 shares memory with the network protocol stack component 5012, and the two exchange messages through the shared memory, including the TCP/IP protocol stack data analysis process.
  • Specifically, the user-mode network protocol stack 501 is used to: start the user-mode process 5011 in user-mode space; the user-mode process 5011 receives the initial data sent by the network card device and stores the initial data in the shared memory; afterwards, the network protocol stack component 5012 parses the initial data in the shared memory based on the TCP/IP protocol stack to obtain the first data (that is, resolves the initial data into a data packet format that the computer device can recognize), and the obtained first data remains in the shared memory.
  • The related service process in the database (that is, the collection of threads in the database, embodied as the database multi-thread architecture 503 in FIG. 5) is dynamically linked to the network protocol stack component 5012 in order to realize the communication interface calls of the entire user-mode network protocol stack 501.
  • The first data parsed by the network protocol stack component 5012 is handed over, through the communication interface, to the dedicated database network thread for data sending/receiving control.
  • Here, dynamic linking means that the component is needed only at runtime, without recompilation or other build steps, and the database software does not otherwise depend on the network protocol stack component 5012.
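  • A possible shape of such run-time linking is sketched below: the database process opens the dynamic library only if it is present and resolves communication entry points by name, falling back to kernel-mode sockets otherwise. The library name liblstack.so and the symbols lstack_recv/lstack_send are hypothetical; they are not defined by this text. (Build with gcc file.c -ldl.)

```c
/* Hedged sketch of run-time attachment to a user-mode protocol stack
 * component; library and symbol names are hypothetical. */
#include <dlfcn.h>
#include <stdio.h>
#include <sys/types.h>

typedef ssize_t (*stack_recv_fn)(int conn, void *buf, size_t len);
typedef ssize_t (*stack_send_fn)(int conn, const void *buf, size_t len);

int main(void)
{
    void *h = dlopen("liblstack.so", RTLD_NOW | RTLD_LOCAL);
    if (!h) {
        /* Database keeps working without the user-mode stack: fall back to
         * the kernel-mode sockets it already uses. */
        fprintf(stderr, "user-mode stack unavailable: %s\n", dlerror());
        return 0;
    }
    stack_recv_fn stack_recv = (stack_recv_fn)dlsym(h, "lstack_recv");
    stack_send_fn stack_send = (stack_send_fn)dlsym(h, "lstack_send");
    if (!stack_recv || !stack_send) {
        fprintf(stderr, "symbols not found: %s\n", dlerror());
        dlclose(h);
        return 1;
    }
    printf("user-mode protocol stack attached at runtime\n");
    dlclose(h);
    return 0;
}
```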
  • The network threads in the database network architecture 502 constitute the communication pool 5021 of the database network architecture 502; a network thread in the communication pool 5021 is used to obtain the first data parsed by the user-mode network protocol stack 501 and to instruct the database multi-thread architecture 503 to read the first data from the database network architecture 502.
  • In this way, the resource reuse rate is improved and system overhead is reduced.
  • Specifically, a network thread in the communication pool 5021 of the database network architecture 502 is used to obtain the first data from the memory shared by the user-mode process 5011 and the network protocol stack component 5012.
  • In addition, the database network architecture 502 may also include a data sharing buffer, which may be called the data resource pool 5022; that is, data resource pooling is realized by creating a data sharing buffer. The data resource pool 5022 can be used to store the first data from the user-mode network protocol stack 501.
  • The data resource pool 5022 is responsible for packet aggregation and/or batch sending and receiving of data to and from the user-mode network protocol stack 501, so as to realize dynamic flow control and elastic scaling.
  • That is, when the database network architecture 502 includes the data resource pool 5022, the database network thread in the communication pool 5021 puts the first data read from the user-mode network protocol stack 501 into the data resource pool 5022 and instructs the database multi-thread architecture 503 to read the first data from the data resource pool 5022, thereby completing the data interaction with the user-mode network protocol stack 501.
  • Finally, based on the instruction message from the database network architecture 502, the database multi-thread architecture 503 reads the first data through the communication control transceiver interface between the database multi-thread architecture 503 and the database network architecture 502 in order to execute the service corresponding to the first data in the database (which may be referred to as the first service).
  • the acceleration framework provided by the embodiment of the present application decouples the business and network of the database into the above-mentioned database network architecture 502 and database multi-thread architecture 503 .
  • The communication pool 5021 and the data-sharing resource pool 5022 are used in the database network architecture 502 to handle the high concurrency of the user-mode protocol network load and to realize communication flow control and data sending/receiving.
  • After the database multi-thread architecture 503 executes an upper-layer service of the database on the computer device (which may be called the second service), the resulting data may be called the second data. The database multi-thread architecture 503 sends the second data to the database network architecture 502 through the communication control transceiver interface between the database multi-thread architecture 503 and the database network architecture 502, and the database network thread in the communication pool 5021 of the database network architecture 502 further sends the second data to the user-mode network protocol stack 501.
  • Similarly, in the case where the user-mode network protocol stack 501 further includes the user-mode process 5011 and the network protocol stack component 5012, and the user-mode process 5011 shares memory with the network protocol stack component 5012, the database network thread in the communication pool 5021 is specifically used to send the second data to the network protocol stack component 5012, and the network protocol stack component 5012 is used to store the second data in the shared memory so that the network card device can read the second data from the shared memory.
  • Similarly, when the database network architecture 502 includes the data sharing buffer (that is, the data resource pool 5022), the data resource pool 5022 may further be used to store the second data from the database multi-thread architecture 503.
  • the network protocol stack in the user mode is used to replace the network protocol stack in the kernel mode, so as to avoid system performance loss caused by mode switching and memory copying. That is to say, it eliminates the switching overhead of the operating system and reduces the data copying from the kernel to the user process, thus liberating memory bandwidth and CPU cycles to improve application system performance and database network load performance.
  • the main thread (one of the business threads) is responsible for the execution and scheduling of the entire business layer;
  • CommProxyLayer is the communication interface layer of the communication pool (that is, the communication control transceiver interface between the database business thread and the data resource pool 5022 in Figure 6), responsible for providing calls to the business layer;
  • CommCoreLayer is the dedicated network sending/receiving thread entity layer of the communication pool (that is, the database network architecture 502 in Figure 6; the buffer in Figure 7 is the data resource pool, the communicator is the database network thread, and multiple communicators make up the communication pool), responsible for controlling network message sending and receiving and completing data communication between the protocol stack and the database;
  • LtranProcess refers to the network thread of the user-mode network protocol stack (that is, the user-mode process 5011 in Figure 6), responsible for data exchange between the user-mode network protocol stack and the network card device.
  • FIG. 8 is a schematic diagram of the interaction between the database multi-threaded architecture and the database network architecture provided by the embodiment of the present application.
  • The communication pool (that is, the communication pool 5021 in Figure 6 above) composed of the sending/receiving processes (that is, the database network threads) comm_proxy provides network thread control processing, simplex receiving and simplex sending; it is responsible for sending and receiving data with the user-mode protocol stack and for interacting with the database business threads (that is, the workers in Figure 8).
  • the proxy1, proxy2, proxy3, ... in Figure 8 are different database network threads in the communication pool.
  • the data resource pool formed by the ring buffer of the data buffer (that is, the data resource pool 5022 in the above-mentioned Figure 6) is responsible for caching network and business communication data, and realizes multi-concurrent data sending and receiving control and read/write of database services through atomic operations. The buffer can be dynamically expanded, and supports data flow control and message batch processing.
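A data resource pool built on a ring buffer with atomic read/write control could, for example, look like the minimal single-producer/single-consumer sketch below, where one database network thread pushes and one database business thread pops; the slot count, message layout and function names are illustrative assumptions, and dynamic expansion and batch processing would be layered on top of such a primitive.

    /* SPSC ring buffer sketch using C11 atomics: the producer (network thread)
     * publishes with a release store, the consumer (business thread) observes
     * with an acquire load, so no locks are needed for the handoff. */
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stddef.h>

    #define RING_SLOTS 1024                  /* power of two for cheap masking */

    struct msg { void *data; size_t len; };

    struct ring {
        struct msg     slots[RING_SLOTS];
        _Atomic size_t head;                 /* next slot to pop  (consumer) */
        _Atomic size_t tail;                 /* next slot to push (producer) */
    };

    static bool ring_push(struct ring *r, struct msg m)
    {
        size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
        size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
        if (tail - head == RING_SLOTS)       /* full: caller applies flow control */
            return false;
        r->slots[tail & (RING_SLOTS - 1)] = m;
        atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
        return true;
    }

    static bool ring_pop(struct ring *r, struct msg *out)
    {
        size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
        size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
        if (head == tail)                    /* empty */
            return false;
        *out = r->slots[head & (RING_SLOTS - 1)];
        atomic_store_explicit(&r->head, head + 1, memory_order_release);
        return true;
    }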
  • FIG. 9 is a schematic diagram of the process flow of the method for accelerating database network load performance provided by the embodiment of the present application, which may specifically include the following steps:
  • the computer device obtains initial data from the network card device through the user mode network protocol stack, and parses the initial data through the TCP/IP protocol stack to obtain first data.
  • when the peer device sends data (which can be referred to as initial data) to the computer device through the network card device, the computer device receives the initial data sent by the peer device from the network card device through the user-mode network protocol stack, and further parses the initial data through the TCP/IP protocol stack in the user-mode network protocol stack to obtain the first data.
  • the user-mode network protocol stack can be configured by the user-mode network configuration module deployed on the computer device by creating a daemon process.
  • the database on the computer device is firstly responsible for enabling user-mode network configuration during the installation and startup phase.
  • the created daemon process performs at least one of the following configuration operations: setting the data plane development kit (DPDK) user-mode driver, setting huge page memory, setting scheduled tasks, setting the kernel virtual network card (KNI), and setting the control permissions of user-mode components.
  • the specific process of the above-mentioned configuration operation performed by the daemon process can refer to the above-mentioned operation process of the user-mode network configuration module 504 , which will not be repeated here.
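The daemon itself can be created with the conventional POSIX double-fork pattern; the sketch below only illustrates that part in C and leaves each configuration action named in the text as a placeholder function, since the concrete commands (driver binding, huge page reservation, KNI setup, permissions) are deployment-specific and not defined by this sketch.

    /* Configuration daemon sketch: detach from the database startup process,
     * then apply the user-mode network configuration steps listed above.
     * The five action functions are placeholders with assumed names. */
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <unistd.h>

    static void bind_dpdk_user_mode_driver(void) { /* NIC driver takeover      */ }
    static void reserve_huge_pages(void)         { /* huge page memory setup   */ }
    static void install_scheduled_task(void)     { /* periodic health checking */ }
    static void configure_kni(void)              { /* kernel virtual NIC (KNI) */ }
    static void set_component_permissions(void)  { /* user-mode access rights  */ }

    static int daemonize(void)
    {
        pid_t pid = fork();
        if (pid < 0) return -1;
        if (pid > 0) _exit(0);            /* parent returns to the installer */
        if (setsid() < 0) return -1;      /* new session, no controlling tty */
        pid = fork();
        if (pid < 0) return -1;
        if (pid > 0) _exit(0);            /* first child exits */
        umask(0);
        if (chdir("/") < 0) return -1;
        return 0;                          /* grandchild continues as daemon */
    }

    int main(void)
    {
        if (daemonize() != 0) return 1;
        bind_dpdk_user_mode_driver();
        reserve_huge_pages();
        install_scheduled_task();
        configure_kni();
        set_component_permissions();
        pause();                           /* stay resident for high availability */
        return 0;
    }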
  • the user-mode network protocol stack may further include a user-mode process (also referred to as an Ltran process) and a network protocol stack component (also referred to as a dynamic library Lstack.so), that is to say, the user-mode network protocol stack is embodied as the Ltran process in the user-mode space and the dynamic library Lstack.so at the software level.
  • the user state process shares memory with the network protocol stack components, and the two exchange messages through the shared memory, including the protocol stack data analysis process of TCP/IP.
  • the computer device receives the initial data sent by the network card device through the user state process, and stores it in the memory shared by the user state process and the network protocol stack component.
  • the computer device parses the initial data in the shared memory through the network protocol stack component based on the TCP/IP protocol stack to obtain the first data (that is, the initial data is parsed into a data packet format that the computer device can recognize), and the obtained first data is still stored in the shared memory.
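The application does not prescribe how the shared memory between the user-mode process and the network protocol stack component is established; purely as an illustration, the sketch below uses POSIX shared memory (shm_open plus mmap), which is one conventional way two user-space parties can map the same packet region. The segment name and size are assumptions.

    /* Sketch: create or attach a shared packet region usable by both the
     * user-mode process and the protocol stack component (assumed mechanism). */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define SHM_NAME "/ustack_pkt_region"   /* hypothetical segment name */
    #define SHM_SIZE (4u << 20)             /* 4 MiB of packet buffers   */

    static void *map_packet_region(int create)
    {
        int fd = shm_open(SHM_NAME, O_RDWR | (create ? O_CREAT : 0), 0600);
        if (fd < 0) { perror("shm_open"); return NULL; }
        if (create && ftruncate(fd, SHM_SIZE) < 0) { perror("ftruncate"); close(fd); return NULL; }
        void *base = mmap(NULL, SHM_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);                           /* the mapping outlives the descriptor */
        return base == MAP_FAILED ? NULL : base;
    }

    int main(void)
    {
        /* The owning side (e.g. the user-mode process) creates the region;
         * the stack component would attach with map_packet_region(0). */
        void *base = map_packet_region(1);
        if (base == NULL) return 1;
        memset(base, 0, SHM_SIZE);           /* both sides now see the same bytes */
        munmap(base, SHM_SIZE);
        return 0;
    }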
  • after the computer device completes the configuration of the user-mode network protocol stack (for example, through the user-mode network configuration module deployed on the computer device), further configuration is required: the communication pool in the database network architecture (the communication pool is composed of at least one database network thread) is created and initialized, and the required communication control transceiver interface between the database network architecture and the database multi-thread architecture is initialized.
  • the computer device obtains the first data from the user-mode network protocol stack through at least one database network thread, and instructs the database multi-thread architecture to read the first data from the database network architecture.
  • the database network thread belongs to the database network architecture, and the database multi-thread architecture includes at least one database business thread.
  • after the user-mode network protocol stack receives the initial data and parses out the first data, the computer device obtains the first data from the user-mode network protocol stack through the communication pool (the communication pool is composed of at least one database network thread, and each database network thread in the communication pool is responsible for message control processing and message sending and receiving processing); for example, the database network thread in the communication pool can obtain the first data from the user-mode network protocol stack based on a polling mode (or other modes, such as periodic checking or wake-up checking), and further instruct the database business thread in the database multi-thread architecture (the database multi-thread architecture includes at least one database business thread) to read the first data from the database network architecture, wherein the communication pool belongs to the database network architecture, that is, the database network thread belongs to the database network architecture.
  • the database can start the back-end listening process and the background business threads required by its own business (both of which belong to different types of database business threads), realize communication event listening of the communication pool under high concurrency based on I/O multiplexing, and perform data interaction with the data resource pool in the database network architecture by calling the communication control transceiver interface provided by the communication pool (for example, reading the first data from the database network architecture).
  • the database network architecture may also include a data sharing buffer in addition to the communication pool, and the data sharing buffer may be called a data resource pool; that is, pooling is realized by creating a data sharing buffer.
  • the data resource pool is responsible for packet aggregation and/or batch sending and receiving of the data of the user-mode network protocol stack, so as to realize dynamic flow control and scaling and expansion.
  • after the database network thread in the communication pool obtains the first data from the user-mode network protocol stack based on the polling mode (or other modes, such as periodic checking or wake-up checking), the first data is stored in the data resource pool.
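For one database network thread, the polling variant described above could be organized roughly as in the sketch below, which drains the user-mode protocol stack and stores each message in the data resource pool before notifying a business thread; stack_poll_packet(), pool_put() and notify_business_thread() are placeholders whose real shape is not defined by this application (one possible notification mechanism is sketched further below).

    /* Polling loop sketch for a database network thread (illustrative only). */
    #include <sched.h>
    #include <stdbool.h>
    #include <stddef.h>

    struct packet { void *data; size_t len; };

    bool stack_poll_packet(struct packet *out);   /* pull parsed data from the user-mode stack */
    bool pool_put(const struct packet *pkt);      /* store into the data resource pool */
    void notify_business_thread(void);            /* wake the consumer side */

    void *network_proxy_main(void *arg)
    {
        (void)arg;
        for (;;) {
            struct packet pkt;
            bool got_any = false;
            /* batch: drain everything currently available before notifying */
            while (stack_poll_packet(&pkt)) {
                while (!pool_put(&pkt))           /* pool full: crude flow control */
                    sched_yield();
                got_any = true;
            }
            if (got_any)
                notify_business_thread();
            else
                sched_yield();                    /* nothing pending, yield the CPU */
        }
        return NULL;
    }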
  • in the case where the database network architecture also includes a data resource pool, after the computer device completes the configuration of the user-mode network protocol stack (for example, through the user-mode network configuration module deployed on the computer device), in addition to creating and initializing the communication pool in the database network architecture, the data resource pool in the database network architecture also needs to be created and initialized, and the required communication control transceiver interface between the database network architecture and the database multi-thread architecture is initialized.
  • the computer device reads the first data through the database multi-thread architecture via the communication control transceiver interface between the database multi-thread architecture and the database network architecture, and executes, according to the first data, the first task corresponding to the first data in the database.
  • in addition to the data reading situation (that is, receiving data from the network card device), the database on the computer device may also be in a data writing situation. Therefore, the method of the embodiment of the present application may also include: after the database multi-thread architecture executes the upper-layer business of the database on the computer device (which may be called the second business), the obtained data may be called the second data; the computer device then sends the second data to the database network architecture through the database multi-thread architecture via the communication control transceiver interface between the database multi-thread architecture and the database network architecture, and the database network thread in the communication pool of the database network architecture further sends the second data to the user-mode network protocol stack.
  • the way in which the computer device sends the second data to the user-mode network protocol stack through the database network thread may specifically be: the computer device sends the second data to the network protocol stack component through the database network thread, and the network protocol stack component then stores the received second data in the shared memory, so that the network card device reads the second data from the shared memory.
  • in the case where the database network architecture also includes a data sharing buffer (that is, a data resource pool), after the second data is sent to the database network architecture via the communication control transceiver interface, the acceleration method further includes: the computer device stores the second data in the data resource pool through the database network thread.
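For the write direction, one way a comm_send-style call could hand the second data over asynchronously is sketched below: the business thread appends to a per-fd send buffer held in the data resource pool and returns, and a database network thread later flushes that buffer toward the user-mode protocol stack. The buffer layout and helper names are assumptions.

    /* Simplex asynchronous send sketch (assumed layout, not the application's
     * actual interface): append, mark the fd pending, return immediately. */
    #include <pthread.h>
    #include <string.h>
    #include <sys/types.h>

    struct send_buf {
        pthread_mutex_t lock;
        char            data[64 * 1024];
        size_t          used;
    };

    struct send_buf *lookup_send_buf(int fd);   /* assumed fd -> buffer mapping */
    void mark_fd_pending(int fd);               /* ask the network thread to flush */

    ssize_t comm_send(int fd, const void *buf, size_t len)
    {
        struct send_buf *sb = lookup_send_buf(fd);
        if (sb == NULL)
            return -1;
        pthread_mutex_lock(&sb->lock);
        size_t room = sizeof(sb->data) - sb->used;
        size_t n = len < room ? len : room;     /* naive flow control: partial write */
        memcpy(sb->data + sb->used, buf, n);
        sb->used += n;
        pthread_mutex_unlock(&sb->lock);
        if (n > 0)
            mark_fd_pending(fd);                /* the network thread picks it up later */
        return (ssize_t)n;
    }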
  • in the following, the description takes as an example a computer device that includes a user-mode network protocol stack, a database network architecture, a database multi-thread architecture and a user-mode network configuration module, where the database network architecture includes a communication pool and a data resource pool, and the user-mode network protocol stack includes a user-mode process and a network protocol stack component.
  • FIG. 10 is a core implementation flowchart of the method for accelerating database network load performance provided by the embodiment of the present application, which may specifically include the following core steps:
  • the database on the computer device is installed and started, and the user mode network configuration module is firstly responsible for enabling the user mode network configuration.
  • Create a daemon process to automate the deployment of the user-mode network protocol stack (such as DPDK takeover, driver loading and huge page memory configuration), start the user-mode process, and achieve high availability of the user-mode network.
  • the communication pool and the data resource pool in the database network architecture are created and initialized, and the communication control transceiver interface required by the upper-layer business application (that is, the database business threads in the database multi-thread architecture) is initialized.
  • Step 3: the database starts the back-end listening process and the background business threads required by its own business (both of which belong to the database business threads), realizes communication event listening under high concurrency based on I/O multiplexing, and starts data interaction by invoking the communication control transceiver interface provided by the communication pool.
  • Step 4: the upper-layer business calls the communication pool control interface. Specifically, each database network thread in the communication pool is responsible for message control processing and message sending and receiving processing, and the data is stored in the data resource pool.
  • Step 5: the data resource pool is responsible for packet aggregation and batch sending and receiving of the data of the network protocol stack, so as to realize dynamic flow control and scaling.
  • Step 6: when the upper-layer database business thread senses the communication event, the database business thread uses the simplex receive blocking interface to read data from the data resource pool, or uses the simplex send asynchronous interface to write data into the data resource pool, so as to complete the entire data interaction process between the business layer and the communication layer (a usage sketch follows this step list).
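As a usage-level illustration of step 6, the fragment below shows how a database business thread might consume a communication event with the simplex blocking receive interface and answer through the simplex asynchronous send interface; comm_recv and comm_send follow the replacement naming used in this description, while wait_comm_event() and run_database_task() are assumed helpers added only for the sketch.

    /* Business-thread view of step 6 (illustrative): wait for an event from the
     * communication pool, read the request, run the task, write the reply. */
    #include <stddef.h>
    #include <sys/types.h>

    ssize_t comm_recv(int fd, void *buf, size_t len);         /* simplex blocking receive */
    ssize_t comm_send(int fd, const void *buf, size_t len);   /* simplex asynchronous send */
    int     wait_comm_event(void);                            /* assumed: returns a ready session fd */
    size_t  run_database_task(const char *req, size_t req_len,
                              char *reply, size_t reply_cap); /* assumed task handler */

    void *business_thread_main(void *arg)
    {
        (void)arg;
        char req[8192], reply[8192];
        for (;;) {
            int fd = wait_comm_event();
            ssize_t n = comm_recv(fd, req, sizeof(req));
            if (n <= 0)
                continue;
            size_t m = run_database_task(req, (size_t)n, reply, sizeof(reply));
            if (m > 0)
                comm_send(fd, reply, m);
        }
        return NULL;
    }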
  • Figure 11 is an implementation process of an acceleration method for database network load performance based on the acceleration framework provided above. As shown in the figure, the main implementation steps of the acceleration framework are as follows:
  • Step 1: in the business calls of the database, the comm_XXX interfaces are used to replace the original call interfaces, for example, comm_recv replaces recv, comm_send replaces send, and comm_socket replaces socket. Externally, only the change at the interface call layer is perceived.
  • Step 2: create a socket request: comm_proxy_socket uses REUSEPORT to create a server fd for each network proxy thread, and realizes the logical mapping of the user-mode network protocol stack fd.
  • For the server fd listening established in the embodiment of this application, in order to cope with the limitation that an fd cannot be used across threads while still allowing all network threads to listen on this fd and thereby accept new connections, this application uses the REUSEPORT form to let each network thread entity listen/bind on this server fd address (a sketch with SO_REUSEPORT and an epoll loop follows this step list).
  • Step 3: fd broadcast: the server fd is broadcast to each network proxy thread.
  • Step 4: event-driven model based on I/O multiplexing: the epoll multiplexed I/O model is used to realize event processing of the protocol stack data.
  • Step 5: the database network thread processes fd control messages: socket communication control transceiver interfaces such as socket/accept/poll/epoll_wait/epoll_ctl are implemented here. All business session fds are created and modified from here, so that fd processing never crosses threads.
  • Step 6: data receiving: a simplex receiving mode is implemented; all data fds are added to the epoll fd of the network proxy thread, data requests are received from the user-mode network protocol stack by polling, and the data from the user-mode network protocol stack is put into the recv buffer.
  • Step 7: data sending: a simplex sending mode is implemented; when a business session needs to send data, the corresponding data is appended to the send buffer of the corresponding fd, and the network proxy thread uniformly processes all the monitored data of the fds and finally packs and sends it as needed.
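To make the per-thread server fd and the event-driven model of Step 2 through Step 6 concrete, the sketch below creates a listening socket with SO_REUSEPORT in each proxy thread and drives it with an epoll loop. This is ordinary Linux kernel socket usage shown only as an illustration; in the application these calls would be routed through the user-mode protocol stack's equivalents rather than the kernel, and the error handling here is deliberately minimal.

    /* Per-network-thread listener sketch: every proxy thread binds the same
     * address with SO_REUSEPORT, so new connections and all subsequent fd
     * handling stay on one thread. */
    #define _GNU_SOURCE
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/epoll.h>
    #include <sys/socket.h>
    #include <unistd.h>

    #define MAX_EVENTS 64

    static int make_listener(unsigned short port)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        int one = 1;
        setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(port);
        if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 || listen(fd, 128) < 0) {
            close(fd);
            return -1;
        }
        return fd;
    }

    void *proxy_thread_main(void *arg)
    {
        unsigned short port = *(unsigned short *)arg;
        int lfd = make_listener(port);        /* same address in every proxy thread */
        int ep = epoll_create1(0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = lfd };
        epoll_ctl(ep, EPOLL_CTL_ADD, lfd, &ev);
        for (;;) {
            struct epoll_event evs[MAX_EVENTS];
            int n = epoll_wait(ep, evs, MAX_EVENTS, -1);
            for (int i = 0; i < n; i++) {
                if (evs[i].data.fd == lfd) {  /* new connection stays on this thread */
                    int cfd = accept(lfd, NULL, NULL);
                    struct epoll_event cev = { .events = EPOLLIN, .data.fd = cfd };
                    epoll_ctl(ep, EPOLL_CTL_ADD, cfd, &cev);
                } else {
                    /* a data fd is readable: drain it into the corresponding
                     * session's recv buffer (omitted in this sketch) */
                }
            }
        }
        return NULL;
    }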
  • the user-mode network protocol stack is used to replace the kernel-mode network protocol stack, and the operating system bypass is realized.
  • this pure soft technology does not rely on new network equipment, and is controllable and friendly.
  • the communication resource pooling technology realizes a communication pool with no cross-thread fd handling, message batch processing, and ring buffer data read/write, effectively reducing the performance loss caused by switching between database business threads and database network threads.
  • the database network load performance acceleration framework specifically includes a user-mode network protocol stack, a database network architecture, and a database multi-thread architecture.
  • it may additionally include a user-mode network configuration module , which is used to automatically configure the user-mode network protocol stack by creating a daemon process.
  • the network card device receives the initial data sent by the peer device and sends the initial data to the user-mode process of the acceleration framework, and the user-mode process further puts the initial data into the memory shared with the network protocol stack component.
  • the network protocol stack component will parse the initial data into a format that the server can recognize, and the first data obtained after parsing will also be put into the shared memory.
  • the database network thread in the communication pool takes the first data out of the shared memory in a polling manner (or other manners, not limited here) and puts it into the data resource pool, and then notifies the corresponding database business thread in the database multi-thread architecture to read the first data; the corresponding database business thread reads the first data and uses it to execute the first task corresponding to the first data, and after the first task is executed, the communication pool may or may not be notified, which is not limited in this application.
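How the database network thread notifies the corresponding database business thread is not spelled out above; one plausible Linux mechanism, shown here purely as an assumption, is an eventfd that the proxy side writes after placing data in the data resource pool and the business side blocks on.

    /* Assumed notification path (not specified by the application): eventfd
     * used as a counting wakeup between proxy and business threads. */
    #include <stdint.h>
    #include <sys/eventfd.h>
    #include <unistd.h>

    static int wakeup_fd = -1;

    int init_notification(void)
    {
        wakeup_fd = eventfd(0, 0);         /* counting semaphore semantics */
        return wakeup_fd < 0 ? -1 : 0;
    }

    void notify_business_thread(void)      /* called by the database network thread */
    {
        uint64_t one = 1;
        (void)write(wakeup_fd, &one, sizeof(one));
    }

    void wait_for_work(void)               /* called by the database business thread */
    {
        uint64_t count;
        (void)read(wakeup_fd, &count, sizeof(count));   /* blocks until notified */
    }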
  • the business thread in the database multi-thread architecture actively puts the data obtained after executing the task (that is, the second data) directly into the data resource pool in the database network architecture.
  • the database network thread in the communication pool then puts the second data from the data resource pool into the memory shared by the user-mode process and the network protocol stack component, so that the network card device can read the second data from the shared memory.
  • FIG. 12 is a schematic structural diagram of a computer device provided by the embodiment of the present application, which may specifically include: a parsing module 1201, an obtaining module 1202 and a read-write module 1203, wherein the parsing module 1201 is used for the computer device to obtain initial data from the network card device through the user-mode network protocol stack and to parse the initial data through the TCP/IP protocol stack to obtain the first data;
  • the obtaining module 1202 is used to obtain the first data from the user state network protocol stack through at least one database network thread
  • the read-write module 1203 is used to read the first data through the database multi-thread architecture via the communication control transceiver interface between the database multi-thread architecture and the database network architecture, and to execute, according to the first data, the first task corresponding to the first data in the database.
  • the user-mode network protocol stack is configured by a user-mode network configuration module deployed on the computer device by creating a daemon process.
  • the daemon process performs at least one of the following configuration operations: setting the data plane development kit DPDK user mode driver, setting huge page memory, setting scheduled tasks, setting the kernel virtual network card KNI, and setting the control authority of user mode components .
  • the read-write module 1203 is also configured to: send second data to the database network architecture via the communication control transceiver interface through the database multi-thread architecture, the second data being the data obtained after the at least one database business thread executes the second business in the database; and send the second data to the user-mode network protocol stack through the at least one database network thread.
  • the user mode network protocol stack includes a user mode process and a network protocol stack component, the user mode process and the network protocol stack component share memory, and the parsing module 1201 is specifically used to:
  • the user-mode process receives the initial data sent by the network card device and stores it in the memory;
  • the network protocol stack component parses the initial data in the memory based on the TCP/IP protocol stack to obtain the first data, and the first data is stored in the memory.
  • the user-mode network protocol stack includes a user-mode process and a network protocol stack component, the user-mode process and the network protocol stack component share memory, and the read-write module 1203 is specifically configured to pass the at least one The database network thread sends the second data to the network protocol stack component; the parsing module 1201 is further configured to store the second data in the memory through the network protocol stack component.
  • in the case where the database network architecture further includes a data sharing buffer, the obtaining module 1202 is specifically configured to: after acquiring the first data from the user-mode network protocol stack through the at least one database network thread, store the first data in the data sharing buffer through the at least one database network thread.
  • in the case where the database network architecture further includes a data sharing buffer, the obtaining module 1202 is specifically configured to: store the second data in the data sharing buffer through the at least one database network thread.
  • the information interaction and execution process among the modules/units in the computer device 1200 described in the embodiment corresponding to FIG. 12 is based on the same idea as the method embodiments corresponding to FIGS. 9 to 11 in this application.
  • FIG. 13 is a schematic structural diagram of the computer device provided by the embodiment of the present application.
  • the computer device 1300 can be deployed with the modules described in the embodiment corresponding to FIG. 12, so as to implement the functions of the computer device 1200 in the embodiment corresponding to FIG. 12.
  • the computer device 1300 is realized by one or more servers, and the computer device 1300 may vary greatly due to different configurations or performance; it may include one or more central processing units (CPU) 1322 (for example, one or more processors), memory 1332, and one or more storage media 1330 (such as one or more mass storage devices) for storing application programs 1342 or data 1344.
  • the memory 1332 and the storage medium 1330 may be temporary storage or persistent storage.
  • the program stored in the storage medium 1330 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the computer device 1300 .
  • the central processing unit 1322 may be configured to communicate with the storage medium 1330 , and execute a series of instruction operations in the storage medium 1330 on the computer device 1300 .
  • Computer device 1300 can also include one or more power supplies 1326, one or more wired or wireless network interfaces 1350, one or more input and output interfaces 1358, and/or, one or more operating systems 1341, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM, etc.
  • the computer device 1300 can be used to execute the steps performed by the computer device in the embodiments corresponding to FIGS. 9 to 11.
  • the central processing unit 1322 can be used to: when the peer device sends data (which can be referred to as initial data) to the computer device through the network card device, receive the initial data sent by the peer device from the network card device through the user-mode network protocol stack, and further parse the initial data through the TCP/IP protocol stack in the user-mode network protocol stack to obtain the first data.
  • after the user-mode network protocol stack receives the initial data and parses out the first data, the first data is obtained from the user-mode network protocol stack through the communication pool (the communication pool is composed of at least one database network thread, and each database network thread in the communication pool is responsible for message control processing and message sending and receiving processing); for example, the database network thread in the communication pool can obtain the first data from the user-mode network protocol stack based on a polling mode (or other modes, such as periodic checking or wake-up checking), and further instruct the database business thread in the database multi-thread architecture (the database multi-thread architecture includes at least one database business thread) to read the first data from the database network architecture, wherein the communication pool belongs to the database network architecture, that is, the database network thread belongs to the database network architecture.
  • the first data is read through the communication control transceiver interface between the database multi-thread architecture and the database network architecture, and the first task corresponding to the first data in the database is executed according to the first data.
  • the central processing unit 1322 is configured to execute any one of the steps executed by the computer device in the embodiments corresponding to FIG. 9 to FIG. 11 .
  • An embodiment of the present application also provides a computer-readable storage medium, the computer-readable storage medium stores a program for signal processing, and when it is run on a computer, the computer is caused to execute the steps performed by the computer device described in the foregoing embodiments.
  • the device embodiments described above are only illustrative, and the units described as separate components may or may not be physically separated; the components shown as units may or may not be physical units, and can be located in one place or distributed to multiple network units. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • the connection relationship between the modules indicates that they have communication connections, which can be specifically implemented as one or more communication buses or signal lines.
  • the essence of the technical solution of this application, or the part that contributes to the prior art, can be embodied in the form of a software product; the computer software product is stored in a readable storage medium, such as a floppy disk, USB flash drive, mobile hard disk, read-only memory (ROM), random access memory (RAM), magnetic disk or optical disk of a computer, and includes several instructions to make a computer device (which can be a personal computer, a training device, or a network device, etc.) execute the methods described in the various embodiments of the present application.
  • all or part of them may be implemented by software, hardware, firmware or any combination thereof.
  • When implemented using software, it may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from a website, computer, training device or data center to another website, computer, training device or data center via wired (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless (e.g., infrared, radio, microwave) means.
  • the computer-readable storage medium may be any available medium that can be stored by a computer, or a data storage device such as a training device or a data center integrated with one or more available media.
  • the available medium may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, a high-density digital video disc (digital video disc, DVD)), or a semiconductor medium (for example, a solid state disk (solid state disk, SSD)), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer And Data Communications (AREA)

Abstract

The present application discloses an acceleration framework and acceleration method for database network load performance, and a device. The framework comprises: a user mode network protocol stack, a database network architecture, and a database multi-thread architecture. The database network architecture comprises a database network thread. The database multi-thread architecture comprises a database service thread. The user mode network protocol stack is used for receiving initial data sent by a network card device, and analyzing the initial data by means of a TCP/IP protocol stack to obtain first data. The database network thread is used for obtaining the first data. The database multi-thread architecture is used for reading the first data by means of a communication control transceiving interface between the database multi-thread architecture and the database network architecture, to execute a corresponding service. According to the present application, the user mode network protocol stack is used to replace a kernel mode network protocol stack to implement kernel bypass of an operating system. Such a pure software technique does not rely on a novel network device, and has good controllability. Moreover, the acceleration framework decouples a database from the user mode network protocol stack, such that high user mode network concurrency can be dealt with.

Description

An acceleration framework, acceleration method and device for database network load performance
This application claims priority to the Chinese patent application with application number 202111136877.X, entitled "An acceleration framework, acceleration method and device for database network load performance", filed with the China Patent Office on September 27, 2021, the entire contents of which are incorporated by reference in this application.
Technical Field
The present application relates to the field of databases, and in particular to an acceleration framework, acceleration method and device for database network load performance.
Background Art
A database (database, DB) is an organic collection of large amounts of shared data organized according to a certain structure and stored in a computer for a long time. A database system (database system, DBS) is a system composed of computer software, hardware and data resources that realizes organized and dynamic storage of a large amount of associated data and facilitates multi-user access. The main factors affecting database network load performance include network load delay, central processing unit (CPU) usage, disk input/output (I/O), memory usage efficiency, and database kernel technology. The current technologies for accelerating database network load performance mainly include deep optimization of database kernel technology at the software level, software and hardware co-design around the CPU, and new storage media.
Communication between network protocols and databases is an important factor affecting the performance of database systems. For this reason, the ways to improve database system performance based on network protocols and the communication between databases include: 1) Ali OceanBase (a high-performance distributed database system supporting massive data) proposes to accelerate the load performance of on-line transaction processing (OLTP) through the high-performance Libeasy network framework; this network framework is implemented based on Libev, an event-driven model of the kernel-mode network protocol stack, and uses coroutines to manage task scheduling. 2) Ali PolarDB (Alibaba Cloud's self-developed cloud-native relational database) builds a database kernel engine based on remote direct memory access (RDMA) on top of new hardware technology; the local memory is written directly to the memory address of another machine through RDMA, and the encoding and decoding of the communication protocol in the middle as well as the retransmission mechanism are all completed by the RDMA network card without CPU participation.
However, the above method 1 will cause frequent switching between user mode and kernel mode, multiple memory copies of kernel-mode protocol stack data, and so on, resulting in system resource loss and network load time delay, thereby losing database performance. Although the above method 2 uses RDMA interaction to bypass the operating system (OS) kernel of the database and accelerate load performance, it relies on RDMA network card hardware and is a new hardware technology. In practical applications, end-to-end physical hardware cooperation is required, so flexibility and versatility are poor; at the same time, at the software level, the implementation of the RDMA protocol requires a lot of complex adaptation and modification of the application-layer database kernel to ensure its usability.
Summary of the Invention
The embodiments of the present application provide an acceleration framework, acceleration method and device for database network load performance. The acceleration framework replaces the kernel-mode network protocol stack with a user-mode network protocol stack to realize operating system kernel bypass; this purely software-based technique does not rely on new network devices and is controllable and friendly. Moreover, the framework decouples the database from the user-mode network protocol stack to cope with the high concurrency of the user-mode network, and decouples the business and communication of traditional databases to reduce system overhead.
Based on this, the embodiments of the present application provide the following technical solutions:
In the first aspect, the embodiment of the present application firstly provides an acceleration framework for database network load performance, which can be used in the field of databases. The acceleration framework for database network load performance provided by the embodiment of the present application runs on a computer device; the computer device is composed of hardware and software, and the software mainly includes an operating system and a database. The acceleration framework provided by the embodiment of the present application realizes data transmission and reception from a client (referring to a device other than the computer device) to the computer device through a network card device, and utilizes the user-mode network protocol stack and database software to provide services such as database data addition, deletion, modification and query. The acceleration framework includes: a user-mode network protocol stack, a database network architecture (also referred to as a database network communication framework) and a database multi-thread architecture, where the database network architecture includes at least one database network thread, the database multi-thread architecture includes at least one database business thread, the database multi-thread architecture and the database network architecture are connected through a communication control transceiver interface, and both the database network architecture and the database multi-thread architecture are included in the database kernel. When the peer device sends data (which may be referred to as initial data) to the computer device through the network card device, the user-mode network protocol stack is used to receive the initial data sent by the network card device (for example, one or more initial data packets) and parse the initial data through the TCP/IP protocol stack in it to obtain first data; the at least one database network thread included in the database network architecture is used to obtain the first data and instruct the database multi-thread architecture to read the first data from the database network architecture; and the database multi-thread architecture is used to read the first data through the communication control transceiver interface between the database multi-thread architecture and the database network architecture, so as to execute the business in the database corresponding to the first data (which may be referred to as a first business).
In the above embodiments of the present application, the acceleration framework replaces the kernel-mode network protocol stack with the user-mode network protocol stack. Since the user-mode network protocol stack is in the user mode of the operating system, operating system kernel bypass can be realized; this purely software-based technique does not rely on new network devices and is controllable and friendly. Moreover, the acceleration framework decouples the database from the user-mode network protocol stack and can cope with the high concurrency of the user-mode network. In addition, the interaction between the database multi-thread architecture and the database network threads in the database network architecture decouples the business and communication of the traditional database (the database network thread and the database business thread are consumers and producers of each other), thereby reducing system overhead.
In a possible implementation manner of the first aspect, the acceleration framework may further include a user-mode network configuration module, which is configured to configure the user-mode network protocol stack by creating a daemon process.
In the above embodiments of the present application, the acceleration framework may also include a user-mode network configuration module, which is responsible for enabling the current operating system of the computer device to have the capability of the user-mode network protocol stack, thereby realizing automatic deployment of the user-mode network protocol stack and making the configuration process more convenient.
In a possible implementation of the first aspect, the user-mode network configuration module is specifically configured to perform at least one of the following configuration operations: setting the data plane development kit (data plane development kit, DPDK) user-mode driver, setting huge page memory (huge pages), setting scheduled tasks, setting the kernel virtual network card (kernel nic interface, KNI), setting the control permissions of user-mode components, and so on. Among them, the scheduled task is responsible for managing and controlling the configuration environment of the user-mode network protocol stack to ensure the high availability of the database during the use of the user-mode network protocol stack.
In the above-mentioned embodiments of the present application, the aspects that may be included in the configuration operations performed by the user-mode network configuration module are specifically described, which provides operability and flexibility.
In a possible implementation of the first aspect, after the database multi-thread architecture executes the upper-layer business of the database on the computer device (which can be called the second business), the obtained data can be called the second data. The database multi-thread architecture can also be used to send the second data to the database network architecture through the communication control transceiver interface between the database multi-thread architecture and the database network architecture. The at least one database network thread in the database network architecture is then used to further send the second data to the user-mode network protocol stack.
In the above-mentioned embodiments of the present application, the interaction process between the database multi-thread architecture and the database network threads in the database network architecture when the database on the computer device writes data is specifically described, which decouples the business and communication of the traditional database (the database network thread and the database business thread are consumers and producers of each other), thereby reducing system overhead.
In a possible implementation of the first aspect, the user-mode network protocol stack includes a user-mode process (also called an Ltran process) and a network protocol stack component (also called a dynamic library Lstack.so); that is to say, at the software level the user-mode network protocol stack is embodied as the Ltran process in the user-mode space and the dynamic library Lstack.so. In this case, the user-mode network protocol stack is specifically used to: start the user-mode process in the user-mode space, receive the initial data sent by the network card device through the user-mode process, and store the initial data in the shared memory; afterwards, the initial data is parsed in the shared memory by the network protocol stack component based on the TCP/IP protocol stack to obtain the first data (that is, the initial data is parsed into a data packet format that the computer device can recognize), and the obtained first data is still stored in the shared memory.
In the above-mentioned embodiment of the present application, the user-mode network protocol stack may further include a user-mode process and a network protocol stack component; the user-mode process shares memory with the network protocol stack component, and the two perform message interaction through the shared memory, including the TCP/IP protocol stack data parsing process, which is implementable.
In a possible implementation of the first aspect, the user-mode network protocol stack includes a user-mode process (also called an Ltran process) and a network protocol stack component (also called a dynamic library Lstack.so); that is to say, at the software level the user-mode network protocol stack is embodied as the Ltran process in the user-mode space and the dynamic library Lstack.so. In this case, when the database on the computer device is writing data, the database network thread in the communication pool is specifically used to send the second data to the network protocol stack component, and the network protocol stack component is further specifically used to store the second data in the shared memory.
In the above embodiments of the present application, it is specifically explained that the memory shared by the user-mode process and the network protocol stack component can also be used to store the second data obtained after business processing by the upper layer of the database, so that the network card device can read the second data from the shared memory, which has wide applicability.
In a possible implementation of the first aspect, the database network architecture may further include a data sharing buffer, and the data sharing buffer may be called a data resource pool; that is, data resource pooling is realized by creating a data sharing buffer, and the data resource pool is used to store the first data from the user-mode network protocol stack.
In the above embodiments of the present application, by creating a data sharing buffer, resource sharing and buffer multiplexing of network protocol stack data (that is, the first data) in a multi-concurrency scenario are realized, reducing the overhead of data copying and resource creation.
In a possible implementation of the first aspect, the database network architecture may further include a data sharing buffer, which may be called a data resource pool; that is, data resource pooling is realized by creating a data sharing buffer, and the data resource pool is responsible for packet aggregation and/or batch sending and receiving of the data of the user-mode network protocol stack, so as to achieve dynamic flow control and scaling. Specifically, the data resource pool is used to store the second data from the database multi-thread architecture.
In the above embodiments of the present application, by creating a data sharing buffer, resource sharing and buffer reuse of database business data (that is, the second data) in multi-concurrency scenarios are realized, reducing the overhead of data copying and resource creation.
In a possible implementation of the first aspect, in the case where the database network architecture further includes a data resource pool, the database network thread in the communication pool is used to put the first data read from the user-mode network protocol stack into the data resource pool, and to instruct the database multi-thread architecture to read the first data from the data resource pool, thereby completing the data interaction with the user-mode network protocol stack.
In the above embodiments of the present application, it is specifically explained that the first data obtained by the database network thread is stored in the data resource pool, and the database business thread also retrieves the first data from the data resource pool, thereby realizing the flow control of the database.
In the second aspect, the embodiment of the present application provides a method for accelerating database network load performance. The method includes: when the peer device sends data (which can be referred to as initial data) to the computer device through the network card device, the computer device receives the initial data sent by the peer device from the network card device through the user-mode network protocol stack, and further parses the initial data through the TCP/IP protocol stack in the user-mode network protocol stack to obtain the first data. After the user-mode network protocol stack receives the initial data and parses out the first data, the computer device obtains the first data from the user-mode network protocol stack through the communication pool (the communication pool is composed of at least one database network thread, and each database network thread in the communication pool is responsible for message control processing and message sending and receiving processing); for example, the database network thread in the communication pool can obtain the first data from the user-mode network protocol stack based on a polling mode (or other modes, such as periodic checking or wake-up checking), and further instruct the database business thread in the database multi-thread architecture (the database multi-thread architecture includes at least one database business thread) to read the first data from the database network architecture, wherein the communication pool belongs to the database network architecture, that is, the database network thread belongs to the database network architecture. Finally, the computer device reads the first data through the database multi-thread architecture via the communication control transceiver interface between the database multi-thread architecture and the database network architecture, and executes, according to the first data, the first task corresponding to the first data in the database.
In the above embodiments of the present application, since the user-mode network protocol stack is in the user mode of the operating system, operating system kernel bypass can be realized; this purely software-based technique does not rely on new network devices and is controllable and friendly. In addition, in the traditional RTC communication model the business of the database and the network are in the same thread, while the user-mode network capability exists in user space as independent processes and data resources, resulting in the unavailability of RTC in user mode. Therefore, the acceleration method provided by the embodiment of the present application decouples the business and network of the database. In addition, in order to cope with the high concurrency of the network and adapt to the multi-threaded architecture of the database, a communication pool is used in the database network architecture to improve resource reuse and reduce system overhead.
In a possible implementation manner of the second aspect, the user-mode network protocol stack is configured by the user-mode network configuration module deployed on the computer device by creating a daemon process.
In the above embodiments of the present application, the acceleration framework may also include a user-mode network configuration module, which is responsible for enabling the current operating system of the computer device to have the capability of the user-mode network protocol stack, thereby realizing automatic deployment of the user-mode network protocol stack and making the configuration process more convenient.
In a possible implementation of the second aspect, the daemon process performs at least one of the following configuration operations: setting the data plane development kit DPDK user-mode driver, setting huge page memory, setting scheduled tasks, setting the kernel virtual network card KNI, and setting the control permissions of user-mode components. Among them, the scheduled task is responsible for managing and controlling the configuration environment of the user-mode network protocol stack to ensure the high availability of the database during the use of the user-mode network protocol stack.
In the above implementation manners of the present application, the aspects that may be included in the configuration operations performed by the daemon process are described in detail, which provides operability and flexibility.
In a possible implementation of the second aspect, the acceleration method may also include: after the database multi-thread architecture executes the upper-layer business of the database on the computer device (which may be called the second business), the obtained data may be called the second data; after that, the computer device sends the second data to the database network architecture through the database multi-thread architecture via the communication control transceiver interface between the database multi-thread architecture and the database network architecture, and then the database network thread in the communication pool within the database network architecture further sends the second data to the user-mode network protocol stack.
In the above-mentioned embodiments of the present application, the interaction process between the database multi-thread architecture and the database network threads in the database network architecture when the database on the computer device writes data is described, which decouples the business and communication of the traditional database (the database network thread and the database business thread are consumers and producers of each other), thereby reducing system overhead.
In a possible implementation of the second aspect, the user-mode network protocol stack may further include a user-mode process (also referred to as an Ltran process) and a network protocol stack component (also referred to as a dynamic library Lstack.so); that is to say, at the software level the user-mode network protocol stack is embodied as the Ltran process in the user-mode space and the dynamic library Lstack.so. The user-mode process shares memory with the network protocol stack component, and the two exchange messages through the shared memory. Specifically, the computer device receives the initial data sent by the network card device through the user-mode process and stores it in the memory shared by the user-mode process and the network protocol stack component; after that, the computer device parses the initial data in the shared memory through the network protocol stack component based on the TCP/IP protocol stack to obtain the first data (that is, the initial data is parsed into a data packet format that the computer device can recognize), and the obtained first data is still stored in the shared memory.
In the above-mentioned embodiment of the present application, the user-mode network protocol stack may further include a user-mode process and a network protocol stack component; the user-mode process shares memory with the network protocol stack component, and the two perform message interaction through the shared memory, including the TCP/IP protocol stack data parsing process, which is implementable.
In a possible implementation of the second aspect, in the case where the user-mode network protocol stack further includes a user-mode process and a network protocol stack component and the user-mode process shares memory with the network protocol stack component, the way in which the computer device sends the second data to the user-mode network protocol stack through the database network thread may specifically be: the computer device sends the second data to the network protocol stack component through the database network thread, and then the network protocol stack component stores the received second data in the shared memory.
In the above embodiments of the present application, it is specifically explained that the memory shared by the user-mode process and the network protocol stack component can also be used to store the second data obtained after business processing by the upper layer of the database, so that the network card device can read the second data from the shared memory, which has wide applicability.
在第二方面的一种可能的实现方式中,数据库网络架构除了包括通信池外,还可以包括数据共享buffer,该数据共享buffer可称为数据资源池,即通过创建数据共享buffer实现数据资源池化。该数据资源池负责将用户态网络协议栈的数据进行包聚合和/或批量收发,以实现动态流控和缩扩容。具体地,在计算机设备通过通信池从用户态网络协议栈获取第一数据之后,例如,通信池中的数据库网络线程可以基于轮询模式(也可以是别的模式,如周期性查看、唤醒查看等模式)从用户态网络协议栈中获取该第一数据之后,就将该第一数据存放在该数据资源池中。In a possible implementation of the second aspect, in addition to the communication pool, the database network architecture may also include a data sharing buffer. The data sharing buffer may be called a data resource pool, that is, the data resource pool is realized by creating a data sharing buffer change. The data resource pool is responsible for packet aggregation and/or batch sending and receiving of the data of the user-mode network protocol stack, so as to realize dynamic flow control and scaling and expansion. Specifically, after the computer device obtains the first data from the user mode network protocol stack through the communication pool, for example, the database network thread in the communication pool can be based on the polling mode (or other modes, such as periodic checking, wake-up checking) After the first data is obtained from the user mode network protocol stack, the first data is stored in the data resource pool.
在本申请上述实施方式中,通过创建的数据共享buffer,实现在多并发场景下的网络协议栈数据(即第一数据)的资源共享和缓存区复用,减少数据拷贝和资源创建的开销。In the above embodiments of the present application, through the created data sharing buffer, the resource sharing and buffer multiplexing of the network protocol stack data (ie, the first data) in a multi-concurrency scenario are realized, and the overhead of data copying and resource creation is reduced.
在第二方面的一种可能的实现方式中,在数据库网络架构还可以包括数据共享buffer(即数据资源池)的情况下,在计算机设备通过数据库多线程架构,经由通信控制收发接口将第二数据向数据库网络架构发送之后,该加速方法还可以包括:计算机设备通过数据库网络线程将该第二数据存储在数据资源池中。In a possible implementation of the second aspect, in the case that the database network architecture may also include a data sharing buffer (that is, a data resource pool), the computer device uses the database multi-thread architecture to send the second After the data is sent to the database network architecture, the acceleration method may further include: the computer device stores the second data in the data resource pool through the database network thread.
在本申请上述实施方式中,通过创建的数据共享buffer,实现在多并发场景下的数据库 业务数据(即第二数据)的资源共享和缓存区复用,减少数据拷贝和资源创建的开销。In the above embodiments of the present application, through the created data sharing buffer, resource sharing and buffer reuse of database business data (that is, second data) in multiple concurrent scenarios are realized, reducing the overhead of data copying and resource creation.
本申请实施例第三方面提供一种计算机设备,该计算机设备具有实现上述第二方面或第二方面任意一种可能实现方式的方法的功能。该功能可以通过硬件实现,也可以通过硬件执行相应的软件实现。该硬件或软件包括一个或多个与上述功能相对应的模块。A third aspect of the embodiments of the present application provides a computer device, where the computer device has a function of implementing the method of the second aspect or any possible implementation manner of the second aspect. This function may be implemented by hardware, or may be implemented by executing corresponding software on the hardware. The hardware or software includes one or more modules corresponding to the above functions.
本申请实施例第四方面提供一种计算机设备,可以包括存储器、处理器以及总线***,其中,存储器用于存储程序,处理器用于调用该存储器中存储的程序以执行本申请实施例第二方面或第二方面任意一种可能实现方式的方法。The fourth aspect of the embodiment of the present application provides a computer device, which may include a memory, a processor, and a bus system, wherein the memory is used to store a program, and the processor is used to call the program stored in the memory to execute the second aspect of the embodiment of the present application Or any possible implementation method of the second aspect.
本申请实施例第五方面提供一种计算机可读存储介质,该计算机可读存储介质中存储有指令,当其在计算机上运行时,使得计算机可以执行上述第二方面或第二方面任意一种可能实现方式的方法。The fifth aspect of the embodiment of the present application provides a computer-readable storage medium, the computer-readable storage medium stores instructions, and when it is run on a computer, the computer can execute any one of the above-mentioned second aspect or the second aspect. method of possible implementation.
本申请实施例第六方面提供了一种计算机程序,当其在计算机上运行时,使得计算机执行上述第二方面或第二方面任意一种可能实现方式的方法。The sixth aspect of the embodiments of the present application provides a computer program, which, when running on a computer, causes the computer to execute the method of the above-mentioned second aspect or any possible implementation manner of the second aspect.
本申请实施例第七方面提供了一种芯片,该芯片包括至少一个处理器和至少一个接口电路,该接口电路和该处理器耦合,至少一个接口电路用于执行收发功能,并将指令发送给至少一个处理器,至少一个处理器用于运行计算机程序或指令,其具有实现如上述第二方面或第二方面任意一种可能实现方式的方法的功能,该功能可以通过硬件实现,也可以通过软件实现,还可以通过硬件和软件组合实现,该硬件或软件包括一个或多个与上述功能相对应的模块。此外,该接口电路用于与该芯片之外的其它模块进行通信。The seventh aspect of the embodiment of the present application provides a chip, the chip includes at least one processor and at least one interface circuit, the interface circuit is coupled to the processor, and the at least one interface circuit is used to perform the function of sending and receiving, and send instructions to At least one processor, at least one processor is used to run computer programs or instructions, which has the function of realizing the method of the second aspect or any possible implementation mode of the second aspect above, and this function can be realized by hardware or by software Realization can also be achieved through a combination of hardware and software, where the hardware or software includes one or more modules corresponding to the above functions. In addition, the interface circuit is used to communicate with other modules outside the chip.
Description of Drawings
FIG. 1 is a schematic diagram of a system architecture of the Libeasy network framework;
FIG. 2 is a schematic diagram of an implementation architecture of the libev-based reactor model of the Libeasy network framework;
FIG. 3 is a schematic diagram of the thread-sharing model of the Libeasy network framework;
FIG. 4 is a schematic diagram of a system architecture in which an RDMA-based database kernel engine accelerates database load performance;
FIG. 5 is a schematic diagram of an acceleration framework for database network load performance according to an embodiment of this application;
FIG. 6 is another schematic diagram of the acceleration framework for database network load performance according to an embodiment of this application;
FIG. 7 is a system structure diagram of the acceleration framework according to an embodiment of this application;
FIG. 8 is a schematic diagram of the interaction between the database multi-thread architecture and the database network architecture according to an embodiment of this application;
FIG. 9 is a schematic flowchart of a method for accelerating database network load performance according to an embodiment of this application;
FIG. 10 is a core implementation flowchart of the method for accelerating database network load performance according to an embodiment of this application;
FIG. 11 is an implementation flowchart of the method for accelerating database network load performance according to an embodiment of this application;
FIG. 12 is a schematic structural diagram of a computer device according to an embodiment of this application;
FIG. 13 is another schematic structural diagram of a computer device according to an embodiment of this application.
Description of Embodiments
Embodiments of this application provide an acceleration framework, an acceleration method, and a device for database network load performance. The acceleration framework replaces the kernel-mode network protocol stack with a user-mode network protocol stack to implement operating system kernel bypass. As a software-only technique, it does not depend on new types of network devices and offers good controllability. In addition, the framework decouples the database from the user-mode protocol stack to cope with high concurrency on the user-mode network, and decouples the service and communication of a traditional database, thereby reducing system overhead.
The terms "first", "second", and the like in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that terms used in this way are interchangeable where appropriate; this is merely the manner in which objects with the same attribute are distinguished when describing the embodiments of this application. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, system, product, or device that includes a series of units is not necessarily limited to those units, but may include other units that are not expressly listed or that are inherent to the process, method, product, or device.
The embodiments of this application involve a great deal of knowledge related to databases. To better understand the solutions in the embodiments of this application, related terms and concepts that may be involved are first introduced below. It should be understood that the explanations of related concepts may be limited by the specific circumstances of the embodiments of this application, but this does not mean that this application is limited to those specific circumstances; the specific circumstances may also differ between embodiments, which is not limited here.
(1) Database (DB)
A database is a repository that organizes, stores, and manages data according to a data structure; in essence it is a file system in which data is stored in a specific format. Users can add, modify, delete, and query the data in the database. Data in the database is stored according to its data structure. For databases, especially relational databases such as Oracle, SQL Server, and DB2, the stored data is mainly structured data with a regular format, generally stored in rows and columns.
(2) Database system (DBS)
A database system is a system composed of computer software, hardware, and data resources that stores a large amount of associated data in an organized and dynamic manner and facilitates access by multiple users.
(3) Online transaction processing (OLTP)
OLTP is a typical database application. An OLTP system is highly transactional and is dominated by frequent, numerous, small transactions. In such a system, a single database often processes more than several thousand transactions per second, and query statements may be executed tens of thousands of times per second. OLTP is therefore also called a transaction-oriented processing system; its basic characteristic is that a customer's original data can be transmitted immediately to the computing center for processing, with the result returned in a very short time, the greatest advantage being that input data can be processed and answered promptly. Typical OLTP applications include e-commerce systems such as bank transactions and securities transactions. OLTP is carried out by the database engine.
An important performance indicator for measuring OLTP is system performance, embodied as the real-time response time (RT), that is, the time from the moment a user submits data at a terminal to the moment the computer device replies to the request.
(4) User mode and kernel mode
An operating system requires two CPU states: one is called user mode and the other kernel mode. The kernel mode runs operating system programs, and the user mode runs user programs.
Kernel mode and user mode are two execution levels of the operating system. When a program runs at privilege level 3, it is said to run in user mode; this is the lowest privilege level, the level at which ordinary user processes run, and most programs that users directly face run in user mode. Conversely, when a program runs at privilege level 0, it is said to run in kernel mode. A program running in user mode cannot directly access the operating system's kernel data structures and programs. When a user executes a program in the system, it runs in user mode most of the time and switches to kernel mode only when it needs the operating system's help to complete work that it has neither the privilege nor the ability to complete itself.
The main differences between the two states are as follows: when executing in user mode, the memory space and objects that a process can access are limited, and the processor it occupies can be preempted; when executing in kernel mode, a process can access all memory space and objects, and the processor it occupies is not allowed to be preempted.
Generally, the following three situations cause a switch from user mode to kernel mode:
a. System call
This is a way in which a user-mode process actively requests to switch to kernel mode: through a system call, the user-mode process requests the operating system's service routines to complete work on its behalf. At its core, the system call mechanism is implemented using an interrupt that the operating system specifically opens to users, for example the int 80h interrupt in Linux.
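As a concrete illustration of this switching path (independent of the framework of this application), the following minimal C snippet issues a write system call through the generic syscall() wrapper; executing it traps from user mode into kernel mode and returns once the kernel has completed the I/O.

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
    const char msg[] = "hello from user mode\n";
    /* Equivalent to write(1, msg, sizeof msg - 1), issued as a raw system
     * call; the CPU runs in kernel mode for the duration of the call. */
    syscall(SYS_write, 1, msg, sizeof msg - 1);
    return 0;
}
```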
b. Exception
When the CPU is executing a program running in user mode and an exception that cannot be known in advance occurs, the currently running process switches to the kernel routine that handles the exception and thus enters kernel mode, for example on a page fault.
c. Peripheral device interrupt
After a peripheral device completes an operation requested by the user, it sends a corresponding interrupt signal to the CPU. The CPU then suspends the next instruction to be executed and instead executes the handler corresponding to the interrupt signal. If the previously executed instruction belonged to a user-mode program, this transition naturally involves a switch from user mode to kernel mode. For example, when a hard disk read/write operation completes, the system switches to the hard disk read/write interrupt handler to perform subsequent operations.
The above three ways are the main ways in which the operating system switches from user mode to kernel mode at run time; a system call can be regarded as actively initiated by the user process, whereas exceptions and peripheral device interrupts are passive.
(5) User-mode network protocol
A user-mode network protocol may also be called a user-mode network protocol stack. As described above, kernel mode and user mode are two execution levels of the operating system. When a task (that is, a process) executes a system call and traps into operating system kernel code, the process is in the kernel running state, that is, kernel mode; when the process executes the user's own code, it is in the user running state, that is, user mode. The traditional Transmission Control Protocol/Internet Protocol (TCP/IP) stack runs in kernel mode, whereas a user-mode network protocol is a TCP/IP protocol stack that runs in the user mode of the operating system.
In addition, before the embodiments of this application are introduced, several common existing approaches to accelerating the network performance of database systems are briefly introduced, to facilitate the subsequent understanding of the embodiments of this application.
Approach 1: Accelerating OLTP database load performance through the high-performance Libeasy network framework
Alibaba OceanBase (a high-performance distributed database system supporting massive data) proposes the high-performance Libeasy network framework, which is implemented on libev, an event-driven model over the kernel-mode network protocol stack, and which uses coroutines to manage task scheduling. The system architecture of this network framework is shown in FIG. 1 and contains the following software modules: the database server (DB in FIG. 1), the Libeasy network framework (Libeasy in FIG. 1), the event-driven model libev (libev in FIG. 1), and the NIC device (nic in FIG. 1). The database server is responsible for receiving and processing structured query language (SQL) requests from clients and performs data receiving/sending interaction through data reads/writes on the NIC. The Libeasy network framework, built on the event-driven model libev, is responsible for organizing the processing of packets such as connections, messages, and requests, as well as resource management; the threads in libeasy are divided into business-logic threads and network I/O threads, responsible for service processing and network I/O processing respectively. The event-driven model libev is implemented on the reactor pattern: it performs multiplexed I/O by calling the kernel-mode TCP/IP protocol stack interfaces of the operating system and completes the controlled sending and receiving of NIC data packets. A client request reaches the NIC device of the database server over Ethernet, and the server obtains the network device data (also called nic data) through direct memory access (DMA) and interrupt wake-up techniques; libev then processes the nic data.
Specifically, the Libeasy network framework is implemented on the libev reactor model. Its main implementation architecture is shown in FIG. 2 and may include the following modules: (1) EventHandler: the interface for event data, such as timer events and I/O events. (2) Reactor: the Reactor uses multiplexed I/O and the Timer; when an EventHandler is registered, the corresponding interface is called. The Reactor's HandleEvents first calls the multiplexed I/O and the Timer to obtain the events that are ready, and finally invokes each EventHandler. (3) Timer: manages timers and is mainly responsible for registering events, obtaining the list of timed-out events, and so on; it is generally implemented by the network framework developer. (4) Multiplexed I/O model: data is read from and written to the operating system kernel through epoll, implementing kernel-mode TCP/IP data sending and receiving over multiple listening handles.
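To make the multiplexed-I/O model in item (4) above concrete, the following minimal C sketch shows an epoll-based reactor loop in the spirit of FIG. 2. The type and function names are illustrative only and are not the Libeasy source code.

```c
#include <sys/epoll.h>

typedef void (*event_handler_fn)(int fd, unsigned events);

struct event_handler {
    int fd;                          /* watched descriptor */
    event_handler_fn handle;         /* callback invoked when fd is ready */
};

/* Register one EventHandler with the reactor (the epoll instance). */
int reactor_register(int epfd, struct event_handler *h, unsigned events)
{
    struct epoll_event ev = { .events = events, .data.ptr = h };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, h->fd, &ev);
}

/* HandleEvents: wait for ready events, then dispatch each EventHandler. */
void reactor_run(int epfd)
{
    struct epoll_event ready[64];
    for (;;) {
        int n = epoll_wait(epfd, ready, 64, 1000 /* timer tick in ms */);
        for (int i = 0; i < n; i++) {
            struct event_handler *h = ready[i].data.ptr;
            h->handle(h->fd, ready[i].events);
        }
    }
}
```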
The Libeasy network framework supports two common thread models: in the first, the network I/O threads and the worker threads share the same threads; in the second, the network I/O threads and the worker threads are separate. FIG. 3 is a schematic diagram of the thread-sharing model of the Libeasy network framework, where Process I/O Read handles read I/O, Process parses requests and computes results, and Process I/O Write handles write I/O and returns network data and computation results. Specifically, in the shared-thread architecture of the Libeasy network framework, each network I/O thread runs one event_loop for interactive data reading and writing. Process I/O Read handles the read data, parses the request, generates a task, pushes it into the worker thread's queue, and then notifies the worker thread by means of an asynchronous event. After the worker thread receives the asynchronous event, Process takes tasks out of its work queue and processes them in sequence; when processing is complete, it generates the results, places them in the I/O thread's queue, and then notifies the I/O thread by means of an asynchronous event, corresponding to the network I/O thread. After the I/O thread receives the notification, Process I/O Write handles the write-data requests in sequence.
The implementation of Approach 1 mainly accelerates OLTP database load performance through the high-performance Libeasy network framework. Its drawbacks are as follows. a. The event-driven model libev it uses is a communication framework built on the kernel-mode TCP/IP network protocol stack, which brings frequent switching between user mode and kernel mode and multiple memory copies of kernel-mode protocol stack data, causing system resource loss and network load latency and thus degrading database performance under OLTP. b. In the thread-sharing model of the Libeasy network framework, a request can be processed directly in the same thread once it has been parsed, saving the overhead of thread switching; this suits requests whose Process stage takes little time. However, the working mechanism of this run-to-completion (RTC) forwarding model, in which a physical CPU core is responsible for the entire life cycle of a packet, cannot be used under a user-mode network protocol stack and cannot cope with the high-concurrency network transceiving of OLTP. c. The thread-separation model of the Libeasy network framework is unsuitable for dense small-task requests: a large amount of time is spent on thread-switching overhead, bringing additional performance loss.
Approach 2: Building an RDMA-based database kernel engine to accelerate database load performance
Alibaba PolarDB (a cloud-native relational database developed by Alibaba Cloud) builds an RDMA-based database kernel engine on new hardware technology. Over the RDMA network, local memory is written directly to a memory address of another machine; the encoding/decoding of the intermediate communication protocol and the retransmission mechanism are completed entirely by the RDMA NIC without CPU involvement, providing a complete set of I/O and network protocol stacks that run in user mode. As shown in FIG. 4, the implementation is as follows. PolarDB adopts a distributed cluster architecture in which compute nodes and storage nodes are interconnected by a high-speed network. Data is transmitted through the RDMA protocol, so that I/O performance is no longer a bottleneck and the operating system kernel CPU is bypassed, thereby accelerating database kernel performance. The DB's data files, redo logs, and the like pass through the user-mode file system and the block-device data management routing, and are transmitted to remote data servers over the high-speed network using the RDMA protocol. The data on the Chunk Server data servers is kept in multiple replicas to ensure reliability, and data consistency is guaranteed through the Parallel-Raft protocol. This approach relies on the RDMA NIC to transfer data from other servers directly into the local machine's storage area; data transmission takes place directly at the user layer, without entering kernel mode, without using system memory, and without any impact on the operating system.
Although the implementation of Approach 2 uses RDMA interaction to let the database bypass the operating system kernel and accelerate load performance, it depends on RDMA NIC hardware and is a new hardware technology. In practical applications it requires end-to-end cooperation of physical hardware, so its flexibility and generality are poor; at the software level, implementing the RDMA protocol requires a large amount of complex adaptation and modification of the application-layer database kernel to ensure usability.
In summary, to solve the above problems, the embodiments of this application first provide an acceleration framework for database network load performance. The acceleration framework replaces the kernel-mode network protocol stack with a user-mode network protocol stack to implement operating system kernel bypass. As a software-only technique, it does not depend on new types of network devices and offers good controllability. In addition, the framework decouples the database from the user-mode protocol stack to cope with high concurrency on the user-mode network, and decouples the service and communication of a traditional database, thereby reducing system overhead.
The embodiments of this application are described below with reference to the accompanying drawings. A person of ordinary skill in the art will appreciate that, with the development of technology and the emergence of new scenarios, the technical solutions provided in the embodiments of this application are equally applicable to similar technical problems.
The acceleration framework for database network load performance provided in the embodiments of this application runs on a computer device composed of hardware and software, the software mainly including an operating system and a database. The acceleration framework provided in the embodiments of this application implements data transceiving from a client machine (that is, a device other than the computer device) to the computer device through a NIC device, and uses the user-mode network protocol stack and the database software to provide services such as adding, deleting, modifying, and querying database data. For details, refer to FIG. 5, which is a schematic diagram of an acceleration framework for database network load performance according to an embodiment of this application. The acceleration framework 500 is deployed on a computer device (for example, a server) that has a NIC (for example, a 1822 NIC device) and on which an operating system and a database have been deployed. The user mode and the kernel mode (including kernel-mode applications, such as the Linux kernel) in FIG. 5 are two operating states of the operating system. The acceleration framework 500 runs in user mode and may specifically include the following modules: a user-mode network protocol stack 501, a database network architecture (which may also be called a database network communication framework) 502, and a database multi-thread architecture 503, where the database network architecture 502 includes at least one database network thread, the database multi-thread architecture 503 includes at least one database service thread, the database multi-thread architecture 503 and the database network architecture 502 are connected through a communication control transceiver interface, and both the database network architecture 502 and the database multi-thread architecture 503 are part of the database kernel.
It should be noted that, in some implementations of this application, the acceleration framework 500 may further include a user-mode network configuration module 504, which is responsible for enabling the current operating system to have the capability of the user-mode network protocol stack 501. Specifically, it automatically configures the user-mode network protocol stack 501 by creating a daemon process. For example, the user-mode network configuration module 504 is configured to perform at least one of the following configuration operations: setting the DPDK user-mode driver, setting huge-page memory (huge pages), setting scheduled tasks, setting KNI, setting the control permissions of user-mode components, and so on. It should be noted here that, in some implementations of this application, the NIC device of the computer device on which the acceleration framework 500 is deployed needs to support the DPDK driver, but the type of NIC device is not limited.
Specifically, DPDK is an open-source data plane development kit that provides an efficient user-mode packet-processing library. Through techniques such as bypassing the kernel-mode network protocol stack, interrupt-free packet transceiving in polling mode, optimized memory/buffer/queue management, and load balancing based on NIC multi-queue and flow identification, it achieves high-performance packet forwarding on x86 (a complex instruction set introduced by Intel for programs that control how the chip runs) and ARM processor architectures, allowing users to develop all kinds of high-speed network frameworks in user space. The physical NIC loads the DPDK driver and the NIC hardware registers are mapped into user mode, so that the NIC is taken over by DPDK. To inherit the original kernel interfaces, the user-mode network protocol stack 501 may also provide a KNI driver. The acceleration framework 500 of the embodiments of this application mainly uses daemon-process techniques to achieve high availability of the DPDK NIC takeover and KNI driver loading. Huge-page memory is configured for DPDK through the configured number of huge pages and the mounting of hugetlbfs. In addition, scheduled-task techniques can be used to implement process management under the user-mode network configuration.
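For illustration only, the following C sketch shows how a daemon of the kind described above might prepare the user-mode network environment: reserving huge pages, mounting hugetlbfs, and binding the NIC to a DPDK-compatible user-mode driver. The sysfs path, mount point, PCI address, and the call to the dpdk-devbind.py tool are assumptions used for the example and are not the actual implementation of the user-mode network configuration module 504.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/mount.h>

static int set_hugepages(int nr)
{
    /* Reserve 2 MB huge pages for the user-mode stack's memory pools. */
    FILE *f = fopen("/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages", "w");
    if (f == NULL)
        return -1;
    fprintf(f, "%d\n", nr);
    fclose(f);
    /* Mount hugetlbfs so that DPDK can map the reserved pages. */
    return mount("nodev", "/dev/hugepages", "hugetlbfs", 0, NULL);
}

static int bind_nic_to_dpdk(const char *pci_addr)
{
    /* Hand the physical NIC over to a DPDK-compatible user-mode driver. */
    char cmd[256];
    snprintf(cmd, sizeof(cmd), "dpdk-devbind.py --bind=vfio-pci %s", pci_addr);
    return system(cmd);
}

int main(void)
{
    if (set_hugepages(1024) != 0 || bind_nic_to_dpdk("0000:3b:00.0") != 0) {
        fprintf(stderr, "user-mode network environment setup failed\n");
        return 1;
    }
    /* A real daemon would also load the KNI module, register a periodic
     * health-check task, and keep re-applying this configuration so that the
     * user-mode stack stays highly available. */
    return 0;
}
```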
In the embodiments of this application, a peer device may write data to the database deployed on the computer device through the NIC, which is the process in which the database reads data; the computer device may also send the database's data to the peer device through the NIC, which is the process in which the database writes data. On the basis of the acceleration framework 500 described in FIG. 5 above, the operations performed by the acceleration framework provided in the embodiments of this application are described in detail below for these two data-processing situations:
1. The situation in which the database on the computer device reads data
When the peer device sends data (which may be referred to as initial data) to the computer device through the NIC device, the user-mode network protocol stack 501 is configured to receive the initial data sent by the NIC device (for example, one or more initial data packets sent by the peer device) and parse the initial data through the TCP/IP protocol stack inside it to obtain first data.
It should be noted that, in some implementations of this application, the user-mode network protocol stack 501 may further include a user-mode process and a network protocol stack component. As shown in FIG. 6, the user-mode network protocol stack 501 may further include a user-mode process (which may also be called the Ltran process) 5011 and a network protocol stack component (which may also be called the dynamic library Lstack.so) 5012; that is, at the software level the user-mode network protocol stack 501 is embodied as the Ltran process in user space and the dynamic library Lstack.so. The user-mode process 5011 and the network protocol stack component 5012 share memory, and the two exchange packets through the shared memory, including the TCP/IP protocol stack parsing process.
Specifically, in this implementation, the user-mode network protocol stack 501 is configured to: start the user-mode process 5011 in user space, where the user-mode process 5011 receives the initial data sent by the NIC device and stores the initial data in the shared memory; the network protocol stack component 5012 then parses the initial data within the shared memory based on the TCP/IP protocol stack to obtain the first data (that is, the initial data is parsed into a packet format that the computer device can recognize), and the obtained first data remains in the shared memory.
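As an illustration of this shared-memory handoff, the following C sketch shows one side depositing a raw frame into a shared slot and the other side parsing it in place, so the "first data" is produced without an extra copy. The shared-memory name, the single-slot layout, and the state encoding are assumptions for the example and are not the actual Ltran/Lstack.so implementation.

```c
#include <fcntl.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SHM_NAME  "/ustack_pkt_shm"   /* hypothetical shared-memory name */
#define MAX_FRAME 2048

struct pkt_slot {
    atomic_int state;                 /* 0 = empty, 1 = raw frame, 2 = parsed */
    uint32_t   len;
    uint8_t    data[MAX_FRAME];
};

/* Map (or create) the slot shared by the user-mode process and the stack. */
struct pkt_slot *map_shared_slot(void)
{
    int fd = shm_open(SHM_NAME, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return NULL;
    if (ftruncate(fd, sizeof(struct pkt_slot)) != 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, sizeof(struct pkt_slot),
                   PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    return p == MAP_FAILED ? NULL : p;
}

/* Side A: the user-mode process deposits a raw frame taken from the NIC. */
void deposit_raw_frame(struct pkt_slot *s, const uint8_t *frame, uint32_t n)
{
    memcpy(s->data, frame, n);
    s->len = n;
    atomic_store(&s->state, 1);       /* publish: raw frame ready */
}

/* Side B: the protocol stack component parses the frame in place. */
void parse_in_place(struct pkt_slot *s)
{
    if (atomic_load(&s->state) != 1)
        return;
    /* ... Ethernet/IP/TCP header parsing would run here, turning the raw
     * frame into the "first data" without copying it out of shared memory. */
    atomic_store(&s->state, 2);       /* publish: first data ready */
}
```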
It should also be noted that, in the embodiments of this application, the relevant service processes inside the database (that is, the collection of the database's various threads, embodied as the database multi-thread architecture 503 in FIG. 5) dynamically link the network protocol stack component 5012 to implement the communication interface calls of the entire user-mode network protocol stack 501. The first data parsed by the network protocol stack component 5012 is handed over through this communication interface to the dedicated database network threads for transceiving control. Note that dynamic linking here means the component is needed only at run time, with no compilation step required, and the database software does not depend on the network protocol stack component 5012. In the embodiments of this application, the network threads of the database network architecture 502 constitute the communication pool 5021 of the database network architecture 502; the network threads in the communication pool 5021 are configured to obtain the first data parsed by the user-mode network protocol stack 501 and instruct the database multi-thread architecture 503 to read the first data from the database network architecture 502. Pooling the database network threads (that is, forming the communication pool) improves resource reuse and reduces system overhead.
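The runtime-only dependency described above can be pictured with the following C sketch, in which the database process explicitly loads the stack component and resolves a receive entry point at run time. The symbol name lstack_recv is hypothetical, and the actual framework may simply rely on ordinary load-time dynamic linking; this is a sketch of the idea, not the framework's code.

```c
#include <dlfcn.h>
#include <stdio.h>

typedef long (*stack_recv_fn)(int fd, void *buf, unsigned long len);

int main(void)
{
    /* Load the stack component only at run time; the database binary itself
     * is built without any compile-time reference to it. */
    void *handle = dlopen("Lstack.so", RTLD_NOW | RTLD_GLOBAL);
    if (handle == NULL) {
        fprintf(stderr, "cannot load stack component: %s\n", dlerror());
        return 1;
    }
    /* Resolve a hypothetical receive entry point exposed by the component. */
    stack_recv_fn stack_recv = (stack_recv_fn)dlsym(handle, "lstack_recv");
    if (stack_recv == NULL) {
        fprintf(stderr, "symbol not found: %s\n", dlerror());
        dlclose(handle);
        return 1;
    }
    /* Database network threads would now call stack_recv() instead of the
     * kernel's recv() to pull parsed data out of the shared memory. */
    dlclose(handle);
    return 0;
}
```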
Note that, when the user-mode network protocol stack 501 includes the user-mode process 5011 and the network protocol stack component 5012, the network threads in the communication pool 5021 of the database network architecture 502 obtain the first data from the memory shared by the two.
It should be noted that, in other implementations of this application, the database network architecture 502 may further include a data sharing buffer, which may be called the data resource pool 5022; that is, data resource pooling is implemented by creating the data sharing buffer. The data resource pool 5022 may be used to store the first data from the user-mode network protocol stack 501. Specifically, the data resource pool 5022 is responsible for packet aggregation and/or batch transceiving of the data of the user-mode network protocol stack 501, so as to implement dynamic flow control and scaling.
It should also be noted that, in other implementations of this application, when the database network architecture 502 further includes the data resource pool 5022, the database network threads in the communication pool 5021 place the first data read from the user-mode network protocol stack 501 into the data resource pool 5022 and instruct the database multi-thread architecture 503 to read the first data from the data resource pool 5022, thereby completing the data interaction with the user-mode network protocol stack 501.
In the embodiments of this application, based on the indication message of the database network architecture 502, the database multi-thread architecture 503 reads the first data through the communication control transceiver interface between the database multi-thread architecture 503 and the database network architecture 502, so as to execute the service in the database corresponding to the first data (which may be called the first service).
In the traditional RTC communication model, the database's service and network run in the same thread, whereas in user space the user-mode network capability consists of independent processes and data resources, which makes RTC unusable in user mode. The acceleration framework provided in the embodiments of this application therefore decouples the database's service and network into the database network architecture 502 and the database multi-thread architecture 503 described above. In addition, to cope with high network concurrency and adapt to the database multi-thread architecture 503, the communication pool 5021 and the shared-data data resource pool 5022 are used in the database network architecture 502 to handle the high concurrency of the user-mode protocol network load and to implement communication flow control and transceiving.
2. The situation in which the database on the computer device writes data
After the database multi-thread architecture 503 has executed an upper-layer service of the database on the computer device (which may be called a second service), the resulting data may be called second data. The database multi-thread architecture 503 sends the second data to the database network architecture 502 through the communication control transceiver interface between the database multi-thread architecture 503 and the database network architecture 502, and the database network threads in the communication pool 5021 of the database network architecture 502 further send the second data to the user-mode network protocol stack 501.
It should be noted that, in some implementations of this application, when the user-mode network protocol stack 501 further includes the user-mode process 5011 and the network protocol stack component 5012, which share memory, the database network thread in the communication pool 5021 is specifically configured to send the second data to the network protocol stack component 5012, and the network protocol stack component 5012 stores the second data in the shared memory, so that the NIC device can read the second data from the shared memory.
It should also be noted that, in other implementations of this application, when the database network architecture 502 further includes the data sharing buffer (that is, the data resource pool 5022), the data resource pool 5022 may further be used to store the second data from the database multi-thread architecture 503.
For the computer device on which the database is located, the operating system by default uses the kernel-mode network protocol stack to accept data from the NIC device. In the foregoing implementations of this application, the user-mode network protocol stack replaces the kernel-mode network protocol stack, avoiding the system performance loss caused by mode switching and memory copying. That is, the overhead of operating system mode switching is eliminated and data copying from the kernel to user processes is reduced, freeing memory bandwidth and CPU cycles to improve application system performance and thus improving database network load performance.
To facilitate a further understanding of the above acceleration framework, the system structure of the acceleration framework is introduced below using a specific example. For details, refer to FIG. 7, which is a system structure diagram of the acceleration framework according to an embodiment of this application. Here, postmaster is the main database service thread (a type of service thread), responsible for the execution and scheduling of the entire service layer; CommProxyLayer is the communication interface layer of the communication pool (that is, the communication control transceiver interface between the database service threads and the data resource pool 5022 in FIG. 6), responsible for providing calls to the service layer; CommCoreLayer is the dedicated network transceiving thread entity layer of the communication pool (that is, the database network architecture 502 in FIG. 6; in FIG. 7, buffer is the data resource pool, communicator is a database network thread, and multiple communicators form the communication pool), responsible for controlling network packet transceiving and completing the data communication processing between the protocol stack and the database; LtranProcess is the network thread of the user-mode network protocol stack (that is, the user-mode process 5011 in FIG. 6), responsible for the data interaction between the user-mode network protocol stack and the NIC device; and Physical Nic is the physical NIC of the computer device.
The database network architecture provided in the embodiments of this application is further described below. For details, refer to FIG. 8, which is a schematic diagram of the interaction between the database multi-thread architecture and the database network architecture according to an embodiment of this application. The communication pool formed by the dedicated network transceiving processes (that is, the database network threads) comm_proxy (that is, the communication pool 5021 in FIG. 6 above) provides network thread control processing, simplex receiving, and simplex sending, and is responsible both for data transceiving with the user-mode protocol stack and for data interaction with the database service threads (that is, the workers in FIG. 8); proxy1, proxy2, proxy3, and so on in FIG. 8 are different database network threads in the communication pool. The data resource pool composed of ring data buffers (that is, the data resource pool 5022 in FIG. 6 above) is responsible for caching network and service communication data and uses atomic operations to implement transceiving control and read/write of data under highly concurrent database services; the whole buffer supports dynamic scaling, data flow control, and batch message processing.
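The following C sketch gives a minimal single-producer/single-consumer ring buffer of the kind the data resource pool described above could be built from, with atomic head/tail indices providing lock-free read/write control between a comm_proxy network thread and a worker thread. The capacity and names are illustrative only; the actual buffer additionally supports dynamic scaling and batch processing.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 1024               /* illustrative; must be a power of two */

struct ring {
    _Atomic size_t head;              /* next slot to write (producer side) */
    _Atomic size_t tail;              /* next slot to read  (consumer side) */
    void *slot[RING_SLOTS];
};

/* Producer side: called by a comm_proxy thread for each received message. */
bool ring_push(struct ring *r, void *msg)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SLOTS)
        return false;                 /* full: caller applies flow control */
    r->slot[head & (RING_SLOTS - 1)] = msg;
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: called by a database worker thread. */
void *ring_pop(struct ring *r)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail == head)
        return NULL;                  /* empty */
    void *msg = r->slot[tail & (RING_SLOTS - 1)];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return msg;
}
```

Because each index is written by only one side, no locks are needed; a full ring is also the natural point at which the pool exerts back-pressure for flow control.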
On the basis of the acceleration framework described above, the method for accelerating database network load performance provided in the embodiments of this application is described below. For details, refer to FIG. 9, which is a schematic flowchart of a method for accelerating database network load performance according to an embodiment of this application. The method may specifically include the following steps:
901. The computer device obtains initial data from the NIC device through the user-mode network protocol stack and parses the initial data through the TCP/IP protocol stack to obtain first data.
When a peer device sends data (which may be called initial data) to the computer device through the NIC device, the computer device receives, through the user-mode network protocol stack, the initial data sent by the peer device from the NIC device, and further parses the initial data through the TCP/IP protocol stack within the user-mode network protocol stack to obtain the first data.
It should be noted that, in some implementations of this application, the user-mode network protocol stack may be configured by the user-mode network configuration module deployed on the computer device through the creation of a daemon process. Specifically, during the installation and startup phase, the database on the computer device is first responsible for enabling the user-mode network configuration; for example, the created daemon process performs at least one of the following configuration operations: setting the Data Plane Development Kit (DPDK) user-mode driver, setting huge-page memory, setting scheduled tasks, setting the kernel virtual NIC (KNI), and setting the control permissions of user-mode components, so as to implement automated deployment of the user-mode network protocol stack and achieve high availability of the user-mode network. Note that, in the embodiments of this application, for the specific process of the foregoing configuration operations performed by the daemon process, refer to the operation process of the user-mode network configuration module 504 described above; details are not repeated here.
It should also be noted that, in some implementations of this application, the user-mode network protocol stack may further include a user-mode process (which may also be called the Ltran process) and a network protocol stack component (which may also be called the dynamic library Lstack.so); that is, at the software level the user-mode network protocol stack is embodied as the Ltran process in user space and the dynamic library Lstack.so. The user-mode process and the network protocol stack component share memory, and the two exchange packets through the shared memory, including the TCP/IP protocol stack parsing process. Specifically, the computer device receives, through the user-mode process, the initial data sent by the NIC device and stores it in the memory shared by the user-mode process and the network protocol stack component; then, within the shared memory, the computer device parses the initial data based on the TCP/IP protocol stack through the network protocol stack component to obtain the first data (that is, the initial data is parsed into a packet format that the computer device can recognize), and the obtained first data remains in the shared memory.
It should be noted that, in this embodiment of the present application, after the computer device completes the configuration of the user-mode network protocol stack (for example, after the user-mode network configuration module deployed on the computer device completes the configuration), the communication pool in the database network architecture (the communication pool consists of at least one database network thread) further needs to be created and initialized, and the communication control transceiver interface required between the database network architecture and the database multi-thread architecture needs to be initialized.
902. The computer device obtains the first data from the user-mode network protocol stack through at least one database network thread, and instructs the database multi-thread architecture to read the first data from the database network architecture, where the database network thread belongs to the database network architecture, and the database multi-thread architecture includes at least one database service thread.
After the user-mode network protocol stack receives the initial data and parses it to obtain the first data, the computer device obtains the first data from the user-mode network protocol stack through a network thread in the communication pool (the communication pool consists of at least one database network thread, and each database network thread in the communication pool is responsible for packet control processing and packet transceiving processing). For example, a database network thread in the communication pool may obtain the first data from the user-mode network protocol stack based on a polling mode (other modes, such as periodic checking or wake-up checking, are also possible), and further instruct a database service thread in the database multi-thread architecture (the database multi-thread architecture includes at least one database service thread) to read the first data from the database network architecture. The communication pool belongs to the database network architecture; that is, the database network threads belong to the database network architecture.
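The polling hand-off between a database network thread and a database service thread can be illustrated with the following C sketch, which uses an in-process mutex and condition variable; the poll_user_stack() stub and all names are assumptions of this sketch, not the interfaces of this application.
/* Sketch: one network thread polls the user-mode stack, one service thread
 * is woken to read the data. All names here are illustrative. */
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ready = PTHREAD_COND_INITIALIZER;
static char first_data[256];
static int  has_data = 0;

/* Stand-in for reading parsed data out of the user-mode protocol stack. */
static int poll_user_stack(char *out, size_t cap) {
    usleep(1000);                         /* pretend a packet arrived */
    strncpy(out, "SELECT 1;", cap);
    return 1;
}

static void *network_thread(void *arg) {
    char buf[256];
    if (poll_user_stack(buf, sizeof buf)) {       /* polling mode */
        pthread_mutex_lock(&lock);
        memcpy(first_data, buf, sizeof buf);      /* place into the pool */
        has_data = 1;
        pthread_cond_signal(&ready);              /* notify a service thread */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

static void *service_thread(void *arg) {
    pthread_mutex_lock(&lock);
    while (!has_data)
        pthread_cond_wait(&ready, &lock);         /* wait for the first data */
    printf("service thread executes the task for: %s\n", first_data);
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void) {
    pthread_t n, s;
    pthread_create(&s, NULL, service_thread, NULL);
    pthread_create(&n, NULL, network_thread, NULL);
    pthread_join(n, NULL);
    pthread_join(s, NULL);
    return 0;
}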
As an example, the database may start the back-end listening process and background service threads required by its own services (both of which are different types of processes within the database service threads), implement communication event listening under high concurrency based on multiplexed I/O, and perform data interaction with the data resource pool in the database network architecture by calling the communication control transceiver interface provided by the communication pool (for example, reading the first data from the database network architecture).
It should be noted that, in some embodiments of the present application, in addition to the communication pool, the database network architecture may further include a data-sharing buffer, which may be called a data resource pool; that is, data resource pooling is implemented by creating the data-sharing buffer. The data resource pool is responsible for packet aggregation and/or batch transceiving of data from the user-mode network protocol stack, so as to implement dynamic flow control and capacity scaling. Specifically, after the computer device obtains the first data from the user-mode network protocol stack through the communication pool (for example, after a database network thread in the communication pool obtains the first data from the user-mode network protocol stack based on a polling mode or another mode such as periodic checking or wake-up checking), the first data is stored in the data resource pool.
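A minimal C sketch of such a data resource pool follows, assuming a fixed-capacity array that aggregates packets and hands them out in batches; the capacity and batch threshold are arbitrary illustrative values.
/* Illustrative data resource pool: aggregate packets, drain them in batches. */
#include <stdio.h>
#include <string.h>

#define POOL_CAP   64        /* arbitrary capacity for the sketch */
#define BATCH_SIZE 8         /* arbitrary batch threshold */

struct pkt { size_t len; char data[1500]; };

struct resource_pool {
    struct pkt slots[POOL_CAP];
    int count;
};

/* Network-thread side: aggregate one packet into the pool. */
int pool_put(struct resource_pool *p, const void *data, size_t len) {
    if (p->count == POOL_CAP) return -1;          /* back-pressure: flow control */
    struct pkt *s = &p->slots[p->count++];
    s->len = len < sizeof s->data ? len : sizeof s->data;
    memcpy(s->data, data, s->len);
    return 0;
}

/* Service-thread side: take up to BATCH_SIZE packets in one call. */
int pool_get_batch(struct resource_pool *p, struct pkt out[], int max) {
    int n = p->count < max ? p->count : max;
    memcpy(out, p->slots, (size_t)n * sizeof out[0]);
    memmove(p->slots, p->slots + n, (size_t)(p->count - n) * sizeof p->slots[0]);
    p->count -= n;
    return n;
}

int main(void) {
    struct resource_pool pool = { .count = 0 };
    for (int i = 0; i < 10; i++) pool_put(&pool, "payload", 7);
    struct pkt batch[BATCH_SIZE];
    int n = pool_get_batch(&pool, batch, BATCH_SIZE);
    printf("drained %d packets, %d left\n", n, pool.count);
    return 0;
}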
It should be noted here that, when the database network architecture further includes the data resource pool, after the computer device completes the configuration of the user-mode network protocol stack (for example, through the user-mode network configuration module deployed on the computer device), the data resource pool in the database network architecture also needs to be created and initialized in addition to the communication pool, and the communication control transceiver interface required between the database network architecture and the database multi-thread architecture needs to be initialized.
903. The computer device reads the first data through the database multi-thread architecture via the communication control transceiver interface between the database multi-thread architecture and the database network architecture, and executes, according to the first data, a first task in the database corresponding to the first data.
Finally, the computer device reads the first data through the database multi-thread architecture via the communication control transceiver interface between the database multi-thread architecture and the database network architecture, and executes, according to the first data, the first task in the database corresponding to the first data.
It should be noted that, in some embodiments of the present application, in addition to reading data (that is, receiving data from the network card device), the database on the computer device may also write data. Therefore, the method in this embodiment of the present application may further include: after the database multi-thread architecture finishes executing an upper-layer service of the database on the computer device (which may be called a second service), the obtained data may be called second data; the computer device then sends the second data to the database network architecture through the database multi-thread architecture via the communication control transceiver interface between the database multi-thread architecture and the database network architecture, and further sends the second data to the user-mode network protocol stack through a database network thread in the communication pool of the database network architecture.
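The write path can be sketched symmetrically: a service thread hands the second data to the database network architecture, and a database network thread later forwards it toward the user-mode protocol stack. In the C sketch below, the structures standing in for the data resource pool and the shared transmit memory, as well as the function names, are assumptions for illustration only.
/* Sketch of the write path: second data flows from a service thread to the
 * network architecture and on to the user-mode stack. All names assumed. */
#include <stdio.h>
#include <string.h>

struct net_arch   { char staging[4096];   size_t len; };  /* stands in for the pool   */
struct user_stack { char tx_shared[4096]; size_t len; };  /* stands in for shared mem */

/* Hop 1: a service thread pushes its result over the control/transceive interface. */
void comm_send_to_net_arch(struct net_arch *na, const void *data, size_t len) {
    na->len = len < sizeof na->staging ? len : sizeof na->staging;
    memcpy(na->staging, data, na->len);
}

/* Hop 2: a database network thread forwards it to the user-mode protocol stack. */
void net_thread_flush(struct net_arch *na, struct user_stack *us) {
    memcpy(us->tx_shared, na->staging, na->len);
    us->len = na->len;            /* the NIC-facing side reads from here */
    na->len = 0;
}

int main(void) {
    struct net_arch na = { .len = 0 };
    struct user_stack us = { .len = 0 };
    const char second_data[] = "query result rows";       /* produced by a service */
    comm_send_to_net_arch(&na, second_data, sizeof second_data);
    net_thread_flush(&na, &us);
    printf("user-mode stack holds %zu bytes for transmission\n", us.len);
    return 0;
}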
It should be noted that, in some embodiments of the present application, when the user-mode network protocol stack further includes the user-mode process and the network protocol stack component, and the user-mode process shares memory with the network protocol stack component, the manner in which the computer device sends the second data to the user-mode network protocol stack through the database network thread may specifically be: the computer device sends the second data to the network protocol stack component through the database network thread, and the network protocol stack component then stores the received second data in the shared memory, so that the network card device can read the second data from the shared memory.
It should also be noted that, in other embodiments of the present application, when the database network architecture further includes the data-sharing buffer (that is, the data resource pool), after the computer device sends the second data to the database network architecture through the database multi-thread architecture via the communication control transceiver interface, the acceleration method further includes: storing, by the computer device, the second data in the data resource pool through the database network thread.
For ease of understanding, the following summarizes the implementation steps of the method for accelerating database network load performance described in the foregoing embodiments, taking as an example a computer device that includes the user-mode network protocol stack, the database network architecture, the database multi-thread architecture, and the user-mode network configuration module, where the database network architecture includes the communication pool and the data resource pool, and the user-mode network protocol stack includes the user-mode process and the network protocol stack component. For details, refer to FIG. 10, which is a core implementation flowchart of the method for accelerating database network load performance provided in an embodiment of the present application. The core steps are as follows:
Step 1: During the installation and startup phase of the database on the computer device, the user-mode network configuration module is first responsible for enabling the user-mode network configuration. A daemon process is created to automate the deployment of the user-mode network protocol stack, including DPDK takeover, driver loading, configuration of memory huge pages, and starting the user-mode process, thereby achieving high availability of the user-mode network.
Step 2: After the database completes the configuration of the user-mode network protocol stack through the user-mode network configuration module, it creates and initializes the communication pool and the data resource pool in the database network architecture, and initializes the communication control transceiver interface required by the upper-layer service applications (that is, the database service threads in the database multi-thread architecture).
Step 3: The database starts the back-end listening process and background service threads required by its own services (both belong to the database service threads), implements communication event listening under high concurrency based on multiplexed I/O, and begins data interaction by calling the communication control transceiver interface provided by the communication pool.
Step 4: The upper-layer service calls the control interface of the communication pool. Specifically, each database network thread in the communication pool is responsible for packet control processing and packet transceiving processing, and the database network threads store the data of the user-mode network protocol stack in the data resource pool based on a polling mode.
Step 5: The data resource pool is responsible for packet aggregation and batch transceiving of the data of the network protocol stack, so as to implement dynamic flow control and capacity scaling.
Step 6: When an upper-layer database service thread senses a communication event, the database service thread reads data from the data resource pool through a simplex blocking receive interface, or writes data to the data resource pool through a simplex asynchronous send interface (that is, writes data), thereby completing the entire data interaction flow between the service layer and the communication layer. An illustrative sketch of these two interfaces is given below.
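By way of illustration only, the following C sketch shows one possible shape for the simplex blocking receive interface and the simplex asynchronous send interface mentioned in step 6; the function names, the shared-buffer layout, and the locking scheme are assumptions of this sketch and are not limiting.
/* Illustrative interface shapes for step 6: blocking receive, asynchronous send. */
#include <pthread.h>
#include <string.h>

struct shared_buf {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    char   data[4096];
    size_t len;
};

/* Simplex blocking receive: the service thread sleeps until data is available. */
size_t comm_block_recv(struct shared_buf *b, void *out, size_t cap) {
    pthread_mutex_lock(&b->lock);
    while (b->len == 0)
        pthread_cond_wait(&b->nonempty, &b->lock);
    size_t n = b->len < cap ? b->len : cap;
    memcpy(out, b->data, n);
    b->len = 0;
    pthread_mutex_unlock(&b->lock);
    return n;
}

/* Simplex asynchronous send: append and return; a network thread flushes later. */
int comm_async_send(struct shared_buf *b, const void *in, size_t len) {
    pthread_mutex_lock(&b->lock);
    size_t n = len < sizeof b->data ? len : sizeof b->data;
    memcpy(b->data, in, n);
    b->len = n;
    pthread_cond_signal(&b->nonempty);
    pthread_mutex_unlock(&b->lock);
    return 0;                      /* the caller does not wait for transmission */
}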
The acceleration framework and acceleration method described above are explained below with a specific example. For details, refer to FIG. 11, which is an implementation flowchart of a method for accelerating database network load performance based on the acceleration framework provided above. The main implementation steps of the acceleration framework are as follows:
step1: In the service calls of the database, the comm_XXX interfaces replace the original call interfaces; for example, comm_recv replaces recv, comm_send replaces send, and comm_socket replaces socket. Externally, only the change at the interface call layer is perceived.
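As an aid to understanding step 1, the following C sketch shows thin comm_XXX wrappers whose bodies simply fall through to the ordinary kernel socket calls; in the framework of this application the bodies would instead route through the communication pool and the user-mode network protocol stack, so the bodies shown here are placeholders rather than the actual implementation.
/* Thin comm_XXX wrappers standing in for the substituted call layer.
 * Bodies are placeholders; the real framework routes these through the
 * communication pool and the user-mode stack. */
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

int comm_socket(int domain, int type, int protocol) {
    /* real version: allocate a proxied fd handled by a network thread */
    return socket(domain, type, protocol);
}

ssize_t comm_recv(int fd, void *buf, size_t len, int flags) {
    /* real version: copy out of the per-fd recv buffer filled by the proxy */
    return recv(fd, buf, len, flags);
}

ssize_t comm_send(int fd, const void *buf, size_t len, int flags) {
    /* real version: append to the per-fd send buffer; the proxy flushes later */
    return send(fd, buf, len, flags);
}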
step2: Establishing a socket request: comm_proxy_socket uses PORTREUSE to create a server fd for each network proxy thread, implementing a logical mapping of the user-mode network protocol stack fd. For the server fd listening established in this embodiment of the present application, to cope with the restriction that an fd cannot cross threads and to enable all network threads to listen on this fd and thereby create new connections, this application uses the REUSEPORT mechanism so that every network thread entity performs listen/bind on the address of this server fd.
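The per-thread listening of step 2 corresponds to the standard SO_REUSEPORT pattern on Linux, sketched below; the port number, thread count, and the relationship to comm_proxy_socket are illustrative assumptions of this sketch.
/* SO_REUSEPORT sketch: every network proxy thread binds and listens on the
 * same address, so any of them can accept new connections. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define NUM_PROXY_THREADS 4      /* arbitrary for the sketch */
#define LISTEN_PORT       5432   /* arbitrary */

static void *proxy_thread(void *arg) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int one = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof one);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(LISTEN_PORT);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        listen(fd, 128) == 0)
        fprintf(stderr, "proxy %ld listening on its own server fd %d\n",
                (long)(intptr_t)arg, fd);
    /* an accept() loop would follow; each accepted fd stays on this thread */
    close(fd);
    return NULL;
}

int main(void) {
    pthread_t t[NUM_PROXY_THREADS];
    for (long i = 0; i < NUM_PROXY_THREADS; i++)
        pthread_create(&t[i], NULL, proxy_thread, (void *)(intptr_t)i);
    for (int i = 0; i < NUM_PROXY_THREADS; i++)
        pthread_join(t[i], NULL);
    return 0;
}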
step3: fd broadcast: broadcast distributes the server fd to every network proxy thread.
step4: Event-driven model based on multiplexed I/O: the epoll multiplexed I/O model is used to implement event processing of protocol stack data.
step5: Control-packet processing of fds by the database network threads: the socket communication control transceiver interfaces such as socket/accept/poll/epoll_wait/epoll_ctl are implemented here. The fds of all service sessions are created and modified here, so that fds are handled without crossing threads.
step6: Data receiving: a simplex receive mode is implemented. All data fds are added to the epoll fd of the network proxy thread, data requests of the user-mode network protocol stack are received by polling, and the data from the user-mode network protocol stack is put into the recv buffer.
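Step 4 (the epoll-based event-driven model) and step 6 together amount to an epoll-driven receive loop on the network proxy thread that drains readable fds into per-fd recv buffers, as sketched below. The sketch uses the kernel epoll API for clarity; in the framework of this application the events come from the user-mode protocol stack, so this is an analogy rather than the actual code path.
/* epoll-driven simplex receive sketch: readable fds are drained into
 * per-fd recv buffers owned by the proxy thread. */
#include <stdio.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

#define MAX_FDS    1024
#define MAX_EVENTS 64

struct recv_buf { char data[8192]; size_t used; };
static struct recv_buf rbuf[MAX_FDS];

/* Registering a data fd with the proxy thread's epoll instance. */
int proxy_add_fd(int epfd, int fd) {
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

void proxy_receive_loop(int epfd) {
    struct epoll_event events[MAX_EVENTS];
    for (;;) {
        int n = epoll_wait(epfd, events, MAX_EVENTS, 1000);   /* polling cadence */
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            if (fd < 0 || fd >= MAX_FDS) continue;
            struct recv_buf *b = &rbuf[fd];
            ssize_t got = read(fd, b->data + b->used,
                               sizeof b->data - b->used);
            if (got > 0) b->used += (size_t)got;   /* a service thread reads later */
        }
    }
}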
step7: Data sending: a simplex send mode is implemented. When a service session needs to send data, the corresponding data is appended to the send buffer of the corresponding fd, the network proxy thread uniformly processes the data of all fds it listens on, and the packets are finally assembled and sent on demand.
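Symmetrically, step 7's simplex send mode can be pictured as append-then-flush: sessions append to a per-fd send buffer and the proxy thread later assembles and writes whatever has accumulated. The buffer sizes and the writev-based flush below are illustrative choices only.
/* Simplex send sketch: sessions append; the proxy thread flushes in batches. */
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

#define MAX_FDS 1024

struct send_buf { char data[8192]; size_t used; };
static struct send_buf sbuf[MAX_FDS];

/* Called from a service session: never blocks, just queues the bytes. */
int session_append(int fd, const void *data, size_t len) {
    struct send_buf *b = &sbuf[fd];
    if (b->used + len > sizeof b->data) return -1;   /* would overflow: defer */
    memcpy(b->data + b->used, data, len);
    b->used += len;
    return 0;
}

/* Called from the proxy thread: assemble and send what has accumulated. */
void proxy_flush(int fd) {
    struct send_buf *b = &sbuf[fd];
    if (b->used == 0) return;
    struct iovec iov = { .iov_base = b->data, .iov_len = b->used };
    ssize_t sent = writev(fd, &iov, 1);              /* batched write */
    if (sent > 0) {
        memmove(b->data, b->data + sent, b->used - (size_t)sent);
        b->used -= (size_t)sent;
    }
}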
In the foregoing embodiments of the present application, compared with the prior-art solutions, the acceleration framework and acceleration method provided in the embodiments of the present application replace the kernel-mode network protocol stack with the user-mode network protocol stack, thereby bypassing the operating system and improving system performance. This pure-software technique does not rely on new types of network devices and offers good controllability. In addition, the communication resource pooling technique implements a communication pool in which fds do not cross threads, batch message processing, and ring-buffer data reads/writes, effectively reducing the performance loss caused by switching between database service threads and database network threads.
In summary, the acceleration framework for database network load performance provided in the embodiments of the present application specifically includes the user-mode network protocol stack, the database network architecture, and the database multi-thread architecture, and in some implementations may additionally include the user-mode network configuration module, which is configured to automatically configure the user-mode network protocol stack by creating a daemon process. In the case where the database on the computer device reads data, the network card device receives the initial data sent by the peer device and sends the initial data to the user-mode process of the acceleration framework; the user-mode process further puts the initial data into the memory shared with the network protocol stack component; the network protocol stack component parses the initial data into a format that the server can recognize, and the first data obtained after parsing is also put into the shared memory. A database network thread in the communication pool then fetches the first data from the shared memory in a polling manner (other manners are possible and not limited here), puts it into the data resource pool, and notifies the corresponding database service thread in the database multi-thread architecture to read the first data; the corresponding database service thread reads the first data and uses it to execute the first task corresponding to the first data; after the first task is executed, the communication pool may or may not be notified, which is not limited in this application. In the case where the database on the computer device writes data, the service thread in the database multi-thread architecture directly and actively puts the data obtained after executing the task (that is, the second data) into the data resource pool of the database network architecture, and a database network thread in the communication pool puts the second data of the data resource pool into the memory shared by the user-mode process and the network protocol stack component, so that the network card device can read the second data from the shared memory.
On the basis of the embodiments corresponding to FIG. 9 to FIG. 11, to better implement the foregoing solutions of the embodiments of the present application, related devices for implementing the foregoing solutions are further provided below. Refer to FIG. 12, which is a schematic structural diagram of a computer device provided in an embodiment of the present application. The computer device may specifically include a parsing module 1201, an obtaining module 1202, and a read-write module 1203. The parsing module 1201 is configured to obtain initial data from the network card device through the user-mode network protocol stack and parse the initial data through the TCP/IP protocol stack to obtain first data. The obtaining module 1202 is configured to obtain the first data from the user-mode network protocol stack through at least one database network thread, and instruct the database multi-thread architecture to read the first data from the database network architecture, where the database network thread belongs to the database network architecture and the database multi-thread architecture includes at least one database service thread. The read-write module 1203 is configured to read the first data through the database multi-thread architecture via the communication control transceiver interface between the database multi-thread architecture and the database network architecture, and execute, according to the first data, a first task in the database corresponding to the first data.
In a possible design, the user-mode network protocol stack is configured by a user-mode network configuration module deployed on the computer device by creating a daemon process.
In a possible design, the daemon process performs at least one of the following configuration operations: setting up the Data Plane Development Kit (DPDK) user-mode driver, setting up huge-page memory, setting up scheduled tasks, setting up the kernel virtual NIC (KNI), and setting the control permissions of user-mode components.
In a possible design, the read-write module 1203 is further configured to: send second data to the database network architecture through the database multi-thread architecture via the communication control transceiver interface, where the second data is data obtained after the at least one service thread executes a second service in the database; and send the second data to the user-mode network protocol stack through the at least one database network thread.
In a possible design, the user-mode network protocol stack includes a user-mode process and a network protocol stack component, and the user-mode process and the network protocol stack component share memory. The parsing module 1201 is specifically configured to: receive, through the user-mode process, the initial data sent by the network card device and store it in the memory; and parse the initial data in the memory based on the TCP/IP protocol stack through the network protocol stack component to obtain the first data, where the first data is stored in the memory.
In a possible design, the user-mode network protocol stack includes a user-mode process and a network protocol stack component, and the user-mode process and the network protocol stack component share memory. The read-write module 1203 is specifically configured to send the second data to the network protocol stack component through the at least one database network thread, and the parsing module 1201 is further specifically configured to store the second data in the memory through the network protocol stack component.
In a possible design, the database network architecture further includes a data-sharing buffer, and the obtaining module 1202 is specifically configured to: after obtaining the first data from the user-mode network protocol stack through the at least one database network thread, store the first data in the data-sharing buffer through the at least one database network thread.
In a possible design, the database network architecture further includes a data-sharing buffer, and the obtaining module 1202 is specifically configured to store the second data in the data-sharing buffer through the at least one database network thread.
It should be noted that the information interaction and execution processes among the modules/units of the computer device 1200 described in the embodiment corresponding to FIG. 12 are based on the same concept as the method embodiments corresponding to FIG. 9 to FIG. 11 of this application. For specific content, refer to the operation processes and descriptions in the foregoing method embodiments of this application, which are not repeated here.
A computer device provided in an embodiment of the present application is described next. Refer to FIG. 13, which is a schematic structural diagram of a computer device provided in an embodiment of the present application. The modules described in the embodiment corresponding to FIG. 12 may be deployed on the computer device 1300 to implement the functions of the computer device 1200 in the embodiment corresponding to FIG. 12. The computer device 1300 is implemented by one or more servers and may vary considerably depending on configuration or performance. It may include one or more central processing units (CPUs) 1322 (for example, one or more processors), a memory 1332, and one or more storage media 1330 (for example, one or more mass storage devices) storing application programs 1342 or data 1344. The memory 1332 and the storage medium 1330 may be transient storage or persistent storage. The program stored in the storage medium 1330 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the computer device 1300. Further, the central processing unit 1322 may be configured to communicate with the storage medium 1330 and execute, on the computer device 1300, the series of instruction operations in the storage medium 1330.
The computer device 1300 may further include one or more power supplies 1326, one or more wired or wireless network interfaces 1350, one or more input/output interfaces 1358, and/or one or more operating systems 1341, such as Windows Server™, Mac OS X™, Unix™, Linux™, and FreeBSD™.
In this embodiment of the present application, the computer device 1300 may be configured to perform the steps performed by the computer device in the embodiments corresponding to FIG. 9 to FIG. 11. For example, the central processing unit 1322 may be configured to: when the peer device sends data (which may be called initial data) to the computer device through the network card device, receive the initial data sent by the peer device from the network card device through the user-mode network protocol stack, and further parse the initial data through the TCP/IP protocol stack within the user-mode network protocol stack to obtain the first data. After the user-mode network protocol stack receives the initial data and parses it to obtain the first data, a network thread in the communication pool (the communication pool consists of at least one database network thread, and each database network thread in the communication pool is responsible for packet control processing and packet transceiving processing) obtains the first data from the user-mode network protocol stack; for example, a database network thread in the communication pool may obtain the first data from the user-mode network protocol stack based on a polling mode (other modes, such as periodic checking or wake-up checking, are also possible), and further instruct a database service thread in the database multi-thread architecture (which includes at least one database service thread) to read the first data from the database network architecture, where the communication pool belongs to the database network architecture, that is, the database network threads belong to the database network architecture. Finally, the first data is read through the database multi-thread architecture via the communication control transceiver interface between the database multi-thread architecture and the database network architecture, and the first task corresponding to the first data in the database is executed according to the first data.
The central processing unit 1322 is configured to perform any one of the steps performed by the computer device in the embodiments corresponding to FIG. 9 to FIG. 11. For specific content, refer to the descriptions in the foregoing method embodiments of this application, which are not repeated here.
An embodiment of the present application further provides a computer-readable storage medium. The computer-readable storage medium stores a program for signal processing, and when the program runs on a computer, the computer is caused to perform the steps performed by the computer device as described in the foregoing embodiments.
It should also be noted that the apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments. In addition, in the accompanying drawings of the apparatus embodiments provided in this application, the connection relationships between modules indicate that they have communication connections, which may be specifically implemented as one or more communication buses or signal lines.
Through the description of the foregoing implementations, a person skilled in the art can clearly understand that this application can be implemented by software plus the necessary general-purpose hardware, and of course can also be implemented by dedicated hardware including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components, and the like. In general, any function completed by a computer program can be easily implemented by corresponding hardware, and the specific hardware structures used to implement the same function can also be diverse, for example, analog circuits, digital circuits, or dedicated circuits. However, for this application, a software program implementation is in most cases the better implementation. Based on such an understanding, the technical solutions of this application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a training device, a network device, or the like) to perform the methods described in the embodiments of this application.
In the foregoing embodiments, the implementation may be wholly or partly realized by software, hardware, firmware, or any combination thereof. When software is used for implementation, the implementation may be wholly or partly in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of the present application are wholly or partly generated. The computer may be a general-purpose computer, a dedicated computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from a website, computer, training device, or data center to another website, computer, training device, or data center in a wired (for example, coaxial cable, optical fiber, or digital subscriber line) or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any usable medium that a computer can store, or a data storage device, such as a training device or a data center, integrating one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, a solid-state drive (SSD)), among others.

Claims (22)

  1. An acceleration framework for database network load performance, wherein the framework is deployed on a computer device, an operating system and a database are deployed on the computer device, and the framework comprises:
    a user-mode network protocol stack, a database network architecture, and a database multi-thread architecture, wherein the database network architecture comprises at least one database network thread, the database multi-thread architecture comprises at least one database service thread, and the database multi-thread architecture and the database network architecture are connected through a communication control transceiver interface;
    the user-mode network protocol stack is configured to receive initial data sent by a network card device and parse the initial data through a TCP/IP protocol stack to obtain first data;
    the at least one database network thread is configured to obtain the first data and instruct the database multi-thread architecture to read the first data from the database network architecture; and
    the database multi-thread architecture is configured to read the first data through the communication control transceiver interface, so as to execute a first service in the database corresponding to the first data.
  2. The framework according to claim 1, wherein the framework further comprises:
    a user-mode network configuration module, configured to configure the user-mode network protocol stack by creating a daemon process.
  3. The framework according to claim 2, wherein the user-mode network configuration module is specifically configured to perform at least one of the following configuration operations:
    setting up a Data Plane Development Kit (DPDK) user-mode driver, setting up huge-page memory, setting up scheduled tasks, setting up a kernel virtual NIC (KNI), and setting control permissions of user-mode components.
  4. The framework according to any one of claims 1 to 3, wherein the database multi-thread architecture is further configured to:
    send second data to the database network architecture through the communication control transceiver interface, wherein the second data is data obtained after the at least one service thread executes a second service in the database; and
    the at least one database network thread is further configured to send the second data to the user-mode network protocol stack.
  5. The framework according to any one of claims 1 to 4, wherein the user-mode network protocol stack comprises:
    a user-mode process and a network protocol stack component, wherein the user-mode process and the network protocol stack component share a memory; and
    the user-mode network protocol stack is specifically configured to:
    receive, through the user-mode process, the initial data sent by the network card device and store the initial data in the memory; and
    parse, through the network protocol stack component, the initial data in the memory based on the TCP/IP protocol stack to obtain the first data, wherein the first data is stored in the memory.
  6. The framework according to claim 4, wherein the user-mode network protocol stack comprises:
    a user-mode process and a network protocol stack component, wherein the user-mode process and the network protocol stack component share a memory;
    the at least one database network thread is specifically configured to send the second data to the network protocol stack component; and
    the user-mode network protocol stack is specifically configured to store the second data in the memory through the network protocol stack component.
  7. The framework according to any one of claims 1 to 6, wherein the database network architecture further comprises:
    a data-sharing buffer, configured to store the first data from the user-mode network protocol stack.
  8. The framework according to claim 4 or 6, wherein the database network architecture further comprises:
    a data-sharing buffer, configured to store the second data from the database multi-thread architecture.
  9. The framework according to any one of claims 1 to 8, wherein the at least one database network thread is specifically configured to:
    obtain the first data and store the first data in the data-sharing buffer; and
    instruct the database multi-thread architecture to read the first data from the data-sharing buffer.
  10. A method for accelerating database network load performance, comprising:
    obtaining, by a computer device, initial data from a network card device through a user-mode network protocol stack, and parsing the initial data through a TCP/IP protocol stack to obtain first data;
    obtaining, by the computer device, the first data from the user-mode network protocol stack through at least one database network thread, and instructing a database multi-thread architecture to read the first data from a database network architecture, wherein the database network thread belongs to the database network architecture, and the database multi-thread architecture comprises at least one database service thread; and
    reading, by the computer device through the database multi-thread architecture, the first data via a communication control transceiver interface between the database multi-thread architecture and the database network architecture, and executing, according to the first data, a first task in a database corresponding to the first data.
  11. The method according to claim 10, wherein the user-mode network protocol stack is configured by a user-mode network configuration module deployed on the computer device by creating a daemon process.
  12. The method according to claim 11, wherein the daemon process performs at least one of the following configuration operations:
    setting up a Data Plane Development Kit (DPDK) user-mode driver, setting up huge-page memory, setting up scheduled tasks, setting up a kernel virtual NIC (KNI), and setting control permissions of user-mode components.
  13. The method according to any one of claims 10 to 12, further comprising:
    sending, by the computer device through the database multi-thread architecture, second data to the database network architecture via the communication control transceiver interface, wherein the second data is data obtained after the at least one service thread executes a second service in the database; and
    sending, by the computer device, the second data to the user-mode network protocol stack through the at least one database network thread.
  14. The method according to any one of claims 10 to 13, wherein the user-mode network protocol stack comprises a user-mode process and a network protocol stack component, the user-mode process and the network protocol stack component share a memory, and the obtaining, by the computer device, the initial data from the network card device through the user-mode network protocol stack and parsing the initial data through the TCP/IP protocol stack to obtain the first data comprises:
    receiving, by the computer device through the user-mode process, the initial data sent by the network card device and storing the initial data in the memory; and
    parsing, by the computer device through the network protocol stack component, the initial data in the memory based on the TCP/IP protocol stack to obtain the first data, wherein the first data is stored in the memory.
  15. The method according to claim 13, wherein the user-mode network protocol stack comprises a user-mode process and a network protocol stack component, the user-mode process and the network protocol stack component share a memory, and the sending, by the computer device, the second data to the user-mode network protocol stack through the at least one database network thread comprises:
    sending, by the computer device, the second data to the network protocol stack component through the at least one database network thread; and
    storing, by the computer device, the second data in the memory through the network protocol stack component.
  16. The method according to any one of claims 10 to 15, wherein the database network architecture further comprises a data-sharing buffer, and after the computer device obtains the first data from the user-mode network protocol stack through the at least one database network thread, the method further comprises:
    storing, by the computer device, the first data in the data-sharing buffer through the at least one database network thread.
  17. The method according to claim 13 or 15, wherein the database network architecture further comprises a data-sharing buffer, and after the computer device sends the second data to the database network architecture through the database multi-thread architecture via the communication control transceiver interface, the method further comprises:
    storing, by the computer device, the second data in the data-sharing buffer through the at least one database network thread.
  18. A computer device, wherein the device has a function of implementing the method according to any one of claims 10 to 17, and the function is implemented by hardware or by hardware executing corresponding software, the hardware or the software comprising one or more modules corresponding to the function.
  19. A computer device, comprising a processor and a memory, wherein the processor is coupled to the memory;
    the memory is configured to store a program; and
    the processor is configured to execute the program in the memory, so that the computer device performs the method according to any one of claims 10 to 17.
  20. A computer-readable storage medium, comprising a program which, when run on a computer, causes the computer to perform the method according to any one of claims 10 to 17.
  21. A computer program product comprising instructions which, when run on a computer, cause the computer to perform the method according to any one of claims 10 to 17.
  22. A chip, comprising a processor and a data interface, wherein the processor reads, through the data interface, instructions stored in a memory to perform the method according to any one of claims 10 to 17.
PCT/CN2022/121232 2021-09-27 2022-09-26 Acceleration framework and acceleration method for database network load performance, and device WO2023046141A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111136877.XA CN115878301A (en) 2021-09-27 2021-09-27 Acceleration framework, acceleration method and equipment for database network load performance
CN202111136877.X 2021-09-27

Publications (1)

Publication Number Publication Date
WO2023046141A1 true WO2023046141A1 (en) 2023-03-30

Family

ID=85720121

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/121232 WO2023046141A1 (en) 2021-09-27 2022-09-26 Acceleration framework and acceleration method for database network load performance, and device

Country Status (2)

Country Link
CN (1) CN115878301A (en)
WO (1) WO2023046141A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106897278A (en) * 2015-12-17 2017-06-27 阿里巴巴集团控股有限公司 For the data read-write processing method and equipment of key value database
US20190075170A1 (en) * 2017-09-06 2019-03-07 Oracle International Corporation System and method for high availability and load balancing in a database environment
CN110602154A (en) * 2018-06-13 2019-12-20 网宿科技股份有限公司 WEB server and method for processing data message thereof
CN113296974A (en) * 2020-08-31 2021-08-24 阿里巴巴集团控股有限公司 Database access method and device, electronic equipment and readable storage medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116599917A (en) * 2023-05-31 2023-08-15 中科驭数(北京)科技有限公司 Network port determining method, device, equipment and storage medium
CN116599917B (en) * 2023-05-31 2024-03-01 中科驭数(北京)科技有限公司 Network port determining method, device, equipment and storage medium
CN116781650A (en) * 2023-07-11 2023-09-19 中科驭数(北京)科技有限公司 Data processing method and system
CN116781650B (en) * 2023-07-11 2024-03-19 中科驭数(北京)科技有限公司 Data processing method and system
CN117076542A (en) * 2023-08-29 2023-11-17 中国中金财富证券有限公司 Data processing method and related device
CN117076542B (en) * 2023-08-29 2024-06-07 中国中金财富证券有限公司 Data processing method and related device

Also Published As

Publication number Publication date
CN115878301A (en) 2023-03-31


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22872185

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE