CN116737403A - Data processing method, device, electronic equipment and storage medium - Google Patents

Data processing method, device, electronic equipment and storage medium

Info

Publication number
CN116737403A
CN116737403A (application number CN202210197851.4A)
Authority
CN
China
Prior art keywords
virtual
simulation
application program
host
initialized
Prior art date
Legal status
Pending
Application number
CN202210197851.4A
Other languages
Chinese (zh)
Inventor
苏金钊
李兆耕
曹颖
程钢
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202210197851.4A priority Critical patent/CN116737403A/en
Publication of CN116737403A publication Critical patent/CN116737403A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/544 Buffers; Shared memory; Pipes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45583 Memory management, e.g. access or allocation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer And Data Communications (AREA)

Abstract

The disclosure provides a data processing method, a data processing apparatus, an electronic device and a storage medium, and relates to the field of computers, in particular to the field of communication applications. The specific implementation scheme is as follows: determining at least a first virtual host and a second virtual host on a physical host; determining a first simulation device associated with each first virtual host, and determining a second simulation device associated with the second virtual host; sending a resource request initiated by the first simulation device to the second simulation device, where the resource request is used to request a virtual resource for an application program corresponding to the first simulation device; and returning the virtual resource created by the second simulation device in response to the resource request to the application program, so that the application program can perform data transmission.

Description

Data processing method, device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technology, and in particular, to a data processing method, apparatus, electronic device, and storage medium in the field of communication applications.
Background
Currently, applications typically use remote direct memory access (Remote Direct Memory Access, abbreviated as RDMA) to communicate, but RDMA must rely on a special network interface card (Network Interface Card, abbreviated as NIC).
Disclosure of Invention
The present disclosure provides a method, apparatus, electronic device, and storage medium for data processing.
According to an aspect of the present disclosure, a data processing method is provided. The method comprises the following steps: determining at least a first virtual host and a second virtual host on a physical host; determining a first simulation device associated with each first virtual host, and determining a second simulation device associated with the second virtual host; sending a resource request initiated by the first simulation device to the second simulation device, where the resource request is used to request a virtual resource for an application program corresponding to the first simulation device; and returning the virtual resource created by the second simulation device in response to the resource request to the application program, so that the application program can perform data transmission.
According to another aspect of the present disclosure, there is also provided a data processing apparatus. The apparatus comprises: a first determining unit, configured to determine at least a first virtual host and a second virtual host on a physical host; a second determining unit, configured to determine a first simulation device associated with each first virtual host and a second simulation device associated with the second virtual host; a sending unit, configured to send a resource request initiated by the first simulation device to the second simulation device, where the resource request is used to request a virtual resource for an application program corresponding to the first simulation device; and a returning unit, configured to return the virtual resource created by the second simulation device in response to the resource request to the application program, so that the application program can perform data transmission.
According to another aspect of the present disclosure, an electronic device is also provided. The electronic device may include: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data processing methods of the embodiments of the present disclosure.
According to another aspect of the present disclosure, there is also provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the data processing method of the embodiments of the present disclosure.
According to another aspect of the present disclosure, there is also provided a computer program product, which may comprise a computer program which, when executed by a processor, implements the data processing method of the embodiments of the present disclosure.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a flow chart of a data processing method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of an elastically scalable network interface according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of an elastic RDMA according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a data processing device according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of an application scenario of a data processing method according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a data processing apparatus according to an embodiment of the present disclosure;
FIG. 7 is a block diagram of an electronic device for a data processing method according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a flow chart of a data processing method according to an embodiment of the present disclosure. As shown in fig. 1, the method may include the steps of:
Step S102, at least a first virtual host and a second virtual host are determined on a physical host.
In the technical solution provided in the above step S102 of the present disclosure, the physical host creates the first virtual host and the second virtual host based on a virtual machine manager (Qemu), where at least the first virtual host and the second virtual host are disposed on the same physical host; the first virtual host may be denoted VM0 and the second virtual host VM1.
Step S104, a first emulated device associated with each first virtual host is determined, and a second emulated device associated with a second virtual host is determined.
In the technical solution provided in step S104 of the present disclosure, the physical host creates a plurality of virtual machines and simulation devices based on the virtual machine manager, at least the first virtual host and the second virtual host are disposed on the same physical host, and a first simulation device associated with each first virtual host and a second simulation device associated with the second virtual host are determined. The first simulation device, which may be denoted ep0, may be a standard PCI Express (PCIe) device emulated by the virtual machine manager and is used to present a remote direct memory access (Remote Direct Memory Access, abbreviated as RDMA) device to the user, so that it looks identical to a real RDMA device; the second simulation device may be denoted ep1.
Optionally, the physical host (Host) creates the first virtual host (VM0) and the second virtual host (VM1) based on the virtual machine manager, associates the first simulation device (ep0) created by the virtual machine manager with the first virtual host, and associates the second simulation device (ep1) with the second virtual host.
It should be noted that the first virtual host and the second virtual host are disposed on the same physical host, so that efficient communication can be implemented through shared memory.
Step S106, a resource request initiated by the first simulation device is sent to the second simulation device, where the resource request is used to request a virtual resource for an application program corresponding to the first simulation device.
In the technical solution provided in the above step S106 of the present disclosure, the first simulation device is a device presented to the user that looks the same as a real RDMA device. After receiving a request from the user, the application program corresponding to the first simulation device initiates a resource request to the second simulation device in order to obtain the virtual resource required by that application program, where the resource request may be an RDMA resource request, the application program may be a virtual machine (VM) application side, and the virtual resource may be an RDMA resource.
Optionally, the first virtual host and the second virtual host (VM0 and VM1) are created on the physical host based on Qemu, and the first simulation device (ep0) and the second simulation device (ep1) created by Qemu are bound to the first virtual host and the second virtual host respectively, where the first simulation device is an RDMA device presented to the user, so that to the user it looks identical to a real RDMA device, and the second simulation device receives the virtual resource request sent by the application program corresponding to the first simulation device.
Step S108, the virtual resource created by the second simulation device in response to the resource request is returned to the application program, so that the application program performs data transmission.
In the technical scheme provided in the step S108 of the present disclosure, the first simulation device initiates a resource request and sends the resource request to the second simulation device, and the second simulation device creates a virtual resource in response to the resource request and returns the virtual resource to the application program.
Through the above steps S102 to S108, at least a first virtual host and a second virtual host are determined on the physical host; a first simulation device associated with each first virtual host and a second simulation device associated with the second virtual host are determined; a resource request initiated by the first simulation device is sent to the second simulation device, where the resource request is used to request a virtual resource for an application program corresponding to the first simulation device; and the virtual resource created by the second simulation device in response to the resource request is returned to the application program so that the application program can perform data transmission. In other words, the RDMA function is realized by simulating the behavior of an RDMA physical network card with simulation devices: the second simulation device receives the request of the first simulation device, simulates the behavior of RDMA hardware, and creates resources for the application, so that the two parties cooperate to complete the functions of a real RDMA device. The client therefore gets a use experience consistent with a real RDMA network card, the technical effect of improving the data communication efficiency of the application is achieved, and the technical problem of low data communication efficiency of the application is solved.
The above-described method of this embodiment is described in further detail below.
As an optional implementation, step S104, determining the first simulation device associated with each first virtual host and determining the second simulation device associated with the second virtual host, includes: emulating the first simulation device and the second simulation device based on a Qemu instance of the virtual machine manager.
In this embodiment, the virtual machine manager Qemu instance creates the first simulation device and the second simulation device, associates the first simulation device (ep0) created by the virtual machine manager with the first virtual host, and associates the second simulation device (ep1) with the second virtual host.
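The patent does not publish the source of these device models, but a minimal sketch of how such an emulated PCIe endpoint is commonly registered in Qemu may help make the idea concrete. The type name, vendor/device IDs and the 4 KiB BAR size below are illustrative assumptions, and the exact headers and macros depend on the Qemu version being built against.

/*
 * Minimal sketch of a Qemu-emulated PCI endpoint in the spirit of ep0/ep1.
 * Names, IDs and sizes are placeholders, not values from the patent.
 */
#include "qemu/osdep.h"
#include "hw/pci/pci.h"

#define TYPE_ERDMA_EP "erdma-ep"

typedef struct ErdmaEpState {
    PCIDevice parent_obj;
    MemoryRegion bar0;        /* register (BAR0) space read/written by the CPU */
    uint64_t regs[512];       /* 4 KiB of 8-byte registers */
} ErdmaEpState;

#define ERDMA_EP(obj) OBJECT_CHECK(ErdmaEpState, (obj), TYPE_ERDMA_EP)

static uint64_t ep_bar_read(void *opaque, hwaddr addr, unsigned size)
{
    ErdmaEpState *s = opaque;
    return s->regs[addr >> 3];            /* e.g. mailbox/doorbell registers */
}

static void ep_bar_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
{
    ErdmaEpState *s = opaque;
    s->regs[addr >> 3] = val;             /* a doorbell write would be handled here */
}

static const MemoryRegionOps ep_bar_ops = {
    .read = ep_bar_read,
    .write = ep_bar_write,
    .endianness = DEVICE_LITTLE_ENDIAN,
};

static void ep_realize(PCIDevice *pdev, Error **errp)
{
    ErdmaEpState *s = ERDMA_EP(pdev);
    memory_region_init_io(&s->bar0, OBJECT(pdev), &ep_bar_ops, s,
                          "erdma-ep-bar0", 4096);
    pci_register_bar(pdev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->bar0);
}

static void ep_class_init(ObjectClass *klass, void *data)
{
    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
    k->realize   = ep_realize;
    k->vendor_id = 0x1234;                /* placeholder vendor ID */
    k->device_id = 0x0001;                /* placeholder device ID */
    k->class_id  = PCI_CLASS_NETWORK_OTHER;
}

static const TypeInfo erdma_ep_info = {
    .name          = TYPE_ERDMA_EP,
    .parent        = TYPE_PCI_DEVICE,
    .instance_size = sizeof(ErdmaEpState),
    .class_init    = ep_class_init,
    .interfaces    = (InterfaceInfo[]) { { INTERFACE_CONVENTIONAL_PCI_DEVICE }, { } },
};

static void erdma_ep_register_types(void)
{
    type_register_static(&erdma_ep_info);
}

type_init(erdma_ep_register_types)

Once registered, such a device would appear in the guest as an ordinary PCIe function whose BAR accesses land in ep_bar_read()/ep_bar_write(), which is the hook the emulation uses to intercept mailbox and doorbell writes.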
As an alternative embodiment, the first simulation device is triggered by the application program to initiate the resource request.
In this embodiment, the first simulation device is the RDMA device presented to the user, and the user side triggers the application program to initiate the resource request.
As an optional implementation, the first simulation device is initialized to obtain a third simulation device, and the second simulation device is initialized to obtain a fourth simulation device, where the registers of the third simulation device and the fourth simulation device are read and written by the central processing unit.
In this embodiment, the virtual machine manager creates the first simulation device and the second simulation device and initializes their register spaces: the first simulation device is initialized to obtain the third simulation device, and the second simulation device is initialized to obtain the fourth simulation device. After the initialization is completed, the central processor can determine the sizes and definitions of the register spaces of the third simulation device and the fourth simulation device, and can therefore read and write their registers.
Optionally, the register spaces of the first simulation device and the second simulation device are initialized in order to present to the VM something resembling a real physical network card. When the VM system starts, all simulation devices are enumerated and each device is allocated a bus-domain address; however, because a Root Complex (RC) exists in the system, the bus-domain address allocated to each device cannot be accessed directly by an application program or by the central processing unit, so the bus domain is isolated from the memory domain accessible to the central processing unit.
Optionally, when the register spaces of the first simulation device and the second simulation device are initialized, the mapping between the bus domain and the memory domain of the simulation device is completed through memory mapping (mmap) during the initialization process.
Optionally, after the initialization of the devices is completed, the central processor knows the sizes and definitions of the register spaces of the first simulation device and the second simulation device, so that the central processor can read and write the registers of the first simulation device and the second simulation device, and the first simulation device and the second simulation device can also access the virtual addresses designated by the central processor.
Optionally, the central processing unit accesses the first simulation device and the second simulation device by translating a memory-domain address into a bus-domain address and then issuing the access to the device.
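To illustrate what mapping a BAR so the CPU can touch device registers with ordinary loads and stores looks like in practice, the following user-space sketch maps a PCI BAR through the sysfs resource file. In the patent the mapping is performed during device initialization rather than by an application, and the device address and register meaning below are assumptions.

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical bus/device/function of the emulated ep0 device inside the VM. */
    const char *res = "/sys/bus/pci/devices/0000:00:05.0/resource0";
    int fd = open(res, O_RDWR | O_SYNC);
    if (fd < 0) { perror("open"); return 1; }

    /* Map 4 KiB of BAR0 so device registers become ordinary memory accesses. */
    volatile uint32_t *bar = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                                  MAP_SHARED, fd, 0);
    if (bar == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    uint32_t id = bar[0];        /* read a (hypothetical) identification register */
    printf("reg0 = 0x%08x\n", id);

    munmap((void *)bar, 4096);
    close(fd);
    return 0;
}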
As an alternative embodiment, the third simulation device and the fourth simulation device are used for performing an access operation on an address indicated by the central processor.
In this embodiment, the register spaces of the first simulation device and the second simulation device are initialized, and the mapping between the bus domain and the memory domain of the simulation device is completed through memory mapping during the initialization.
Optionally, after the initialization of the devices is completed, the third simulation device and the fourth simulation device are obtained, and the central processor can determine the sizes and definitions of their register spaces, so that the first simulation device and the second simulation device can also access the virtual addresses designated by the central processor.
Optionally, the first simulation device and the second simulation device access the virtual address designated by the central processing unit by going from the bus-domain address to the memory-domain address, then to the synchronous dynamic random access memory (DDR), and finally to the controller that reads and writes the designated memory.
As an alternative embodiment, an access operation command is synchronized from the initialized first simulation device to the initialized second simulation device, where the access operation command is used to perform an access operation on the address.
In this embodiment, the initialized first simulation device initiates an access operation command for the configured register address, performs the access operation on the address, and synchronizes the access operation command from the initialized first simulation device to the initialized second simulation device through shared memory.
Optionally, the first simulation device and the second simulation device each correspond to a virtual machine manager process on the physical host. Because the first simulation device and the second simulation device are located on the same physical host, they can communicate efficiently through shared memory: during a read or write operation, the first simulation device synchronizes the object to be read or written (a mailbox cmd or a data packet payload) with the shared memory, and the second simulation device performs the corresponding write or read from the shared memory, thereby synchronizing the access operation command from the initialized first simulation device to the initialized second simulation device.
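A minimal sketch of this shared-memory hand-off, assuming POSIX shared memory and a named semaphore for signaling, is shown below. The object name "/erdma-chan", the semaphore name "/erdma-req" and the fixed 4 KiB payload area are illustrative assumptions; the patent only states that shared memory is used.

#include <fcntl.h>
#include <semaphore.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

struct shm_channel {
    uint32_t len;           /* number of valid payload bytes */
    uint8_t  payload[4096]; /* mailbox cmd or packet payload */
};

/* ep0 side: publish an object so the ep1 process can consume it. */
int ep0_push(const void *cmd, uint32_t len)
{
    if (len > sizeof(((struct shm_channel *)0)->payload))
        return -1;

    int fd = shm_open("/erdma-chan", O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;
    ftruncate(fd, sizeof(struct shm_channel));
    struct shm_channel *ch = mmap(NULL, sizeof(*ch),
                                  PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ch == MAP_FAILED)
        return -1;

    /* Copy the object to be read by ep1 and signal the other process. */
    memcpy(ch->payload, cmd, len);
    ch->len = len;

    sem_t *req = sem_open("/erdma-req", O_CREAT, 0600, 0);
    sem_post(req);          /* the ep1 side sem_wait()s on this */
    return 0;
}

On Linux this would be linked with -lrt and -pthread; the consuming side is sketched later, in the section on communication between virtual machine manager instances.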
As an alternative embodiment, transmitting the notification message acquired from the initialized second simulation device to the initialized first simulation device includes: transmitting the notification message acquired from the initialized second simulation device to the initialized first simulation device in parallel based on multithreading.
In this embodiment, the notification message acquired from the initialized second simulation device is transmitted in parallel to the initialized first simulation device based on multithreading, where the notification message may be doorbell information.
Optionally, a plurality of first-in first-out (First In First Out, abbreviated as FIFO) queues of notification messages may be maintained on the RDMA engine side, and the register addresses of these queues may be set when the second simulation device is initialized. In order to accelerate data processing, the notification messages acquired from the initialized second simulation device may be processed in parallel in a multithreaded manner and transmitted in parallel to the initialized first simulation device.
Optionally, a notification message arrives at a notification message arbiter, and the arbiter decides which FIFO queue it enters based on the notification message type; the arbiter hashes the notification message to a particular processing core based on the information carried in the notification message; the arbiter writes the notification message into the corresponding first-in first-out queue for the notification message thread to process; after the RDMA engine takes the notification message out of the first-in first-out queue, it modifies the value of the first-in first-out queue register to inform the hardware how far the notification messages have been processed; and the arbiter maintains pointers to the first-in first-out queues so that the second simulation device can obtain the notification messages.
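The dispatch scheme above can be sketched in C with ordinary threads standing in for the per-core processing. The structure layout, the number of workers and hashing on a queue-pair number are assumptions made for illustration; only the overall shape (an arbiter hashes a doorbell into one of several FIFOs, each drained by its own thread, with producer/consumer indices tracking progress) follows the description.

#include <pthread.h>
#include <stdint.h>

#define DB_WORKERS 4
#define FIFO_DEPTH 256

struct doorbell {
    uint32_t qpn;      /* information carried in the DB, used for hashing */
    uint32_t type;     /* e.g. send vs. receive doorbell */
    uint32_t index;    /* producer index written by the driver */
};

struct db_fifo {
    struct doorbell q[FIFO_DEPTH];
    uint32_t pi, ci;                /* producer / consumer index */
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
};

static struct db_fifo fifos[DB_WORKERS];

void db_fifos_init(void)
{
    for (int i = 0; i < DB_WORKERS; i++) {
        pthread_mutex_init(&fifos[i].lock, NULL);
        pthread_cond_init(&fifos[i].nonempty, NULL);
        fifos[i].pi = fifos[i].ci = 0;
    }
}

/* Arbiter: pick a FIFO from the DB contents and enqueue the record. */
void db_arbiter_push(const struct doorbell *db)
{
    struct db_fifo *f = &fifos[db->qpn % DB_WORKERS];
    pthread_mutex_lock(&f->lock);
    if (f->pi - f->ci < FIFO_DEPTH) {          /* drop or back-pressure if full */
        f->q[f->pi % FIFO_DEPTH] = *db;
        f->pi++;
        pthread_cond_signal(&f->nonempty);
    }
    pthread_mutex_unlock(&f->lock);
}

/* One worker thread per FIFO: the RDMA engine pops DBs and processes them. */
void *db_worker(void *arg)
{
    struct db_fifo *f = arg;
    for (;;) {
        pthread_mutex_lock(&f->lock);
        while (f->pi == f->ci)
            pthread_cond_wait(&f->nonempty, &f->lock);
        struct doorbell db = f->q[f->ci % FIFO_DEPTH];
        f->ci++;                               /* records how far processing has got */
        pthread_mutex_unlock(&f->lock);
        /* ... handle the work that this doorbell points at ... */
        (void)db;
    }
    return NULL;
}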
As an alternative embodiment, connection information of the application program is established based on the virtual resource, where the connection information is used to enable the application program to establish a communication connection.
In this embodiment, the connection information of the application program is established based on the virtual resource, and the application program establishes a communication connection through veth0, where veth0 may be a virtualized interface device and is used to establish a control path for exchanging RDMA connection information.
As an alternative embodiment, a buffer is established in the second simulation device, and data received or transmitted by the application program through the connection information is cached in the buffer.
In this embodiment, a buffer is established in the second simulation device, and data received or transmitted by the application program through the connection information is cached in the buffer to complete data transmission and reception, where the data transmission and reception are implemented based on a virtual function (VF).
Optionally, the VF is used to exchange data with the remote end; the RDMA engine simulates the behavior of an RDMA network card and hands the data packet to the user-mode driver of the VF, and the VF driver, based on the physical network card, caches the data received or transmitted by the application program through the connection information in the buffer.
As an alternative embodiment, the virtual resource is a remote direct memory access (RDMA) resource.
In this embodiment, the second simulation device prepares the send and receive buffers; after the data has been received and sent, the connection is terminated and the RDMA resources are released. The virtual resource is a remote direct memory access (RDMA) resource.
In this embodiment, by determining at least a first virtual host and a second virtual host on a physical host, determining a first simulation device associated with each first virtual host and a second simulation device associated with the second virtual host, sending a resource request initiated by the first simulation device to the second simulation device (where the resource request is used to request a virtual resource for an application program corresponding to the first simulation device), and returning the virtual resource created by the second simulation device in response to the resource request to the application program so that the application program can perform data transmission, the ability to use RDMA technology on a non-RDMA network card is realized, the technical effect of improving the data communication efficiency of the application is achieved, and the technical problem of low data communication efficiency of the application is solved.
The foregoing technical solutions of the embodiments of the present disclosure are further described by way of example with reference to the preferred embodiments.
Currently, data center network bandwidth is gradually evolving from 10 Gbps to 25/50 Gbps and even 100 Gbps. As Moore's law slows, the CPU can no longer fill such bandwidth for the business: when forwarding data packets, the TCP/IP protocol stack requires the CPU to participate in multiple data copies and the kernel to process the protocol stack, so the CPU cannot really be used for the processing logic of the business, and packet processing efficiency is relatively low. More and more applications therefore tend to replace TCP/IP with RDMA because of its advantages of zero copy and bypassing the kernel and the CPU; however, RDMA needs to rely on special network cards, such as an RDMA NIC (RNIC), to offload the protocol stack to network card hardware.
In the related art, an elastically scalable network interface (Elastic Fabric Adapter, abbreviated as EFA) for instances has been proposed. FIG. 2 is a schematic diagram of an elastically scalable network interface according to an embodiment of the present disclosure. As shown in FIG. 2, based on this network interface a user's application program can communicate efficiently between large-scale instances: the network card adds RDMA protocol support on top of an elastic network adapter (Elastic Network Adapter, abbreviated as ENA), thereby implementing the Scalable Reliable Datagram (SRD) transport type. This transport type supports a many-to-many network transmission model, multipath load balancing and out-of-order delivery, thereby guaranteeing lower tail latency, but it depends heavily on its special EFA smart network card.
In the related art, another approach, elastic RDMA (Elastic RDMA), has also been proposed. FIG. 3 is a schematic diagram of elastic RDMA according to an embodiment of the present disclosure. As shown in FIG. 3, this approach implements shared memory communications over RDMA (Shared Memory Communications over RDMA, abbreviated as SMC-R) in the kernel, and transparent replacement of TCP by SMC-R can be achieved at the granularity of a network namespace or a single application, so that an application can enjoy the performance improvement brought by RDMA without any modification, while auto-negotiation and safe fallback are supported. However, although it is transparent to the service, the bottom layer still depends on an RNIC or a special smart network card (such as a MoC card), so it cannot be deployed on a large scale.
In summary, the related art has the technical problem that high-performance communication cannot be implemented on a common network card. To solve this problem, the disclosure proposes a device and a system for high-performance communication that do not depend on a special network card. The system is based on a virtual machine monitor (Hypervisor) and an open-source cloud computing management platform project (OpenStack) framework that are commonly used in the industry, where the virtual machine monitor is responsible for creating and managing each virtual machine sold to a client, and the orchestrator and network components in the cloud computing management platform framework are responsible for managing the mapping relationships between network devices and virtual machines.
FIG. 4 is a schematic diagram of a data processing device according to an embodiment of the present disclosure. As shown in FIG. 4, two virtual machines VM0 and VM1 are created on a physical host based on the virtual machine manager, and the simulation devices ep0 and ep1 created by the virtual machine manager are bound to VM0 and VM1 respectively. ep0 is the RDMA device presented to the user, so that to the user it looks no different from a real RDMA device; ep1 receives the requests of ep0 and simulates the behavior of RDMA hardware, and the two parties cooperate to complete the functions of a real RDMA device.
Optionally, VM0 is a direct participant in the RDMA application and VM1 is invisible to the user. Both veth0 of VM0 and the VF of VM1 are virtualized or single-root I/O virtualization (SR-IOV) based interface devices, where veth0 is used to establish a control path for exchanging RDMA connection information and the VF is used to exchange data with the remote end. The Driver is the RDMA kernel-mode driver of the ep0 device; the RDMA engine (Stack) simulates the behavior of an RDMA network card and hands the data packets to be transmitted to the user-mode driver of the VF, and the VF driver completes packet transmission and reception based on the physical network card.
To achieve the above functions, the following key problems need to be solved: emulation of the high-speed bus (PCIe) device and initialization of its Base Address Register (BAR) space; reading and writing of the mailbox register (Mailbox); reading and writing of notification messages (Doorbell, abbreviated as DB); direct memory access (Direct Memory Access, abbreviated as DMA) reads and writes; communication between virtual machine manager (Qemu) instances; and transmission and reception of data packets.
The emulation of the high-speed bus device of this embodiment and the initialization of its Base Address Register (BAR) space are described below.
Through the emulation of the high-speed bus device and the initialization of the base address register space, the mapping between the bus domain and the memory domain is established, so that the CPU can access the device and the VM is presented with something resembling a real physical network card.
Optionally, standard PCIe devices may be emulated by Qemu; in accordance with the PCIe specification, each device requires a specific vendor ID and device ID and implements its own register (BAR) space and the operations on it.
Optionally, when the VM system starts, the kernel of the VM enumerates all PCIe devices and assigns a PCI bus-domain address to each device. However, because of the Root Complex (RC) in the system, which is the interface between the CPU and the PCIe bus, these addresses cannot be accessed directly by an application program or by the CPU, so the PCI bus domain is isolated from the memory domain accessible to the CPU.
Optionally, the mapping between the PCI bus domain and the memory domain may be completed through memory mapping when the PCIe device is initialized, so that the central processor knows the BAR space size and definition of the device and can read and write the BAR space registers of the device, and the device can also perform direct memory operations on the virtual addresses specified by the central processor.
Optionally, the central processing unit accesses the device by translating the memory-domain address into a bus-domain address and then issuing the access to the device.
Optionally, the device reads and writes the designated memory by going from the bus-domain address to the memory-domain address, then to the synchronous dynamic random access memory (DDR), and finally to the controller, thereby achieving DMA access to memory by the device.
The reading and writing of the mailbox register (Mailbox) is described below.
The mailbox is used to transmit control-plane messages between the driver side and the stack side, such as creating a queue pair (Create QP) or querying global ID information (query gid); it is a piece of memory space located in VM0.
Optionally, the driver and the stack configure the registers of ep0 and ep1, respectively, at device initialization, and inform the devices of the memory addresses of the mailbox; the driver side issues an RDMA configuration operation to the kernel-mode driver; the kernel-mode driver converts the configuration operation into a corresponding command instruction (cmd), writes the command payload into the DDR of the mailbox, and sets the owner bit field in the mailbox data structure to 1; the kernel-mode driver then issues a mailbox doorbell message to tell ep0 to read the mailbox; ep0 initiates a DMA read of the configured mailbox register address and synchronizes the content to ep1 through the shared memory.
Optionally, the owner bit field in the mailbox data structure may be used to indicate how far the mailbox cmd has been processed: setting it to 1 initially indicates that the driver side has just issued the mailbox cmd, and the Stack side sets the owner bit to 0 after it finishes processing the cmd specified by the mailbox; when the mailbox cmd arrives back at the driver side and the driver sees that the owner bit is already 0, this indicates that the cmd has been processed successfully.
Optionally, ep1 reads the command instruction from the shared memory, writes its content directly into the DDR specified by the mailbox register of ep1, and sends an interrupt to notify the stack side to process it; after the stack side finishes processing the cmd, it sends a doorbell to notify ep1 that the cmd processing is finished and modifies the owner bit to 0; ep1 transmits the doorbell information to ep0 through the shared memory, and ep0 notifies the driver side through an interrupt; the kernel-mode driver sees that the owner bit is 0 and considers the cmd to have been executed.
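A small sketch of the owner-bit handshake may be helpful. The command layout and field names below are not taken from the patent; they are assumptions used to show how the driver marks a command as issued and how it later recognizes completion.

#include <stdint.h>

struct mailbox_cmd {
    uint16_t opcode;        /* e.g. CREATE_QP, QUERY_GID (assumed encoding) */
    uint16_t owner;         /* 1 = issued by driver, 0 = completed by stack */
    uint32_t payload_len;
    uint8_t  payload[248];
};

/* Driver side: write the command into the mailbox memory and ring the
 * mailbox doorbell so ep0 starts the DMA read. */
void driver_issue(volatile struct mailbox_cmd *mbox,
                  const struct mailbox_cmd *cmd,
                  volatile uint32_t *mbox_doorbell)
{
    *mbox = *cmd;
    mbox->owner = 1;        /* command has just been issued */
    *mbox_doorbell = 1;     /* tells ep0 the mailbox is ready to be read */
}

/* Driver side: after the completion interrupt, owner == 0 means the stack
 * side has finished processing the command. */
int driver_poll_done(volatile const struct mailbox_cmd *mbox)
{
    return mbox->owner == 0;
}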
The following describes the reading and writing of notification messages.
The other party can be notified by means of a doorbell (DB). A plurality of DB first-in first-out queues are maintained on the Stack side, and the register addresses of these queues are set when the ep1 device is initialized; to accelerate processing, the DBs are usually processed in parallel in a multithreaded manner.
Optionally, a DB may be issued by the physical host driver side when it posts a work queue element (Work Queue Element, abbreviated as WQE), or by the stack side when it needs to; either way, the following operations may be included.
Optionally, the DB arrives at the DB arbiter, which decides which first-in first-out queue it enters based on the DB send/receive type; the DB arbiter hashes the DB to a particular DB processing core based on the information carried in the DB; the DB arbiter writes the DB information into the corresponding first-in first-out queue for the DB thread to process; after the Stack takes the DB out of the FIFO queue, it modifies the value of the FIFO queue register to inform the hardware how far the DBs have been processed; and the DB arbiter determines whether a queue is full based on the pointers it maintains for the first-in first-out queues, such as the PI pointer (Producer Index) and the CI pointer (Consumer Index).
The following describes direct memory access (Direct Memory Access, abbreviated as DMA) reads and writes.
A DMA operation can only be initiated on the ep1 side, and it reads from or writes to the DDR of VM0. When the ep1 device is initialized, the Stack side writes information such as the register addresses and lengths of the SQ (Send Queue) and CQ (Completion Queue) into the BAR space of the device. The flow may include the following:
The Stack side initiates a DMA operation; ep1 takes the DMA task out of the task queue and determines from the task type whether it is a read or a write. For a write operation, the data in the Stack-side buffer (Mbuffer) is moved into the shared memory; for a read operation, the data read from the DDR of VM0 on the driver side is filled into the shared memory and then synchronized into the Stack-side buffer.
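The read/write branch described above might look roughly as follows. The task structure and the guest_copy_to()/guest_copy_from() helpers are hypothetical; they stand in for whatever mechanism actually moves data between the shared memory and VM0's DDR.

#include <stdint.h>
#include <string.h>

enum dma_dir { DMA_READ, DMA_WRITE };

struct dma_task {
    enum dma_dir dir;
    uint64_t guest_addr;   /* address inside VM0's DDR */
    uint32_t len;
    void    *stack_buf;    /* Stack-side Mbuffer */
};

/* Hypothetical helpers that move data between shared memory and VM0's DDR. */
void guest_copy_to(uint64_t guest_addr, const void *src, uint32_t len);
void guest_copy_from(void *dst, uint64_t guest_addr, uint32_t len);

void ep1_handle_dma(struct dma_task *t, uint8_t *shm)
{
    if (t->dir == DMA_WRITE) {
        /* Write: move the Stack-side buffer into shared memory, from where it
         * is copied into VM0's DDR. */
        memcpy(shm, t->stack_buf, t->len);
        guest_copy_to(t->guest_addr, shm, t->len);
    } else {
        /* Read: data read out of VM0's DDR is placed into shared memory and
         * then synchronized into the Stack-side buffer. */
        guest_copy_from(shm, t->guest_addr, t->len);
        memcpy(t->stack_buf, shm, t->len);
    }
}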
Optionally, ep1 determines whether a CQE is to be returned to the Stack side based on the cqe_enable field of the element in the SQ, and if needed assembles the completion queue element (Completion Queue Element, abbreviated as CQE) and writes it into the DDR specified by the register.
Optionally, ep1 updates the value of the completion queue producer index (CQ PI) register; the Stack side polls the owner bit of the completion queue element, determines whether the completion queue element is valid based on the owner bit, and determines whether all processing is complete based on the PI pointer and the CI pointer of the CQ.
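A compact sketch of that polling loop, with assumed field names, is given below; the only points it tries to capture are the owner-bit validity check and the PI/CI comparison.

#include <stdint.h>

struct cqe {
    uint32_t wqe_index;
    uint8_t  status;
    uint8_t  owner;        /* set by ep1 when the CQE has been written */
};

struct cq {
    volatile struct cqe *ring;
    uint32_t depth;
    uint32_t pi;           /* written by ep1 through the CQ PI register */
    uint32_t ci;           /* advanced by the Stack as it consumes CQEs */
};

/* Returns 1 and fills *out when a valid CQE is available, 0 otherwise. */
int cq_poll_one(struct cq *cq, struct cqe *out)
{
    if (cq->ci == cq->pi)                      /* all work processed */
        return 0;
    volatile struct cqe *e = &cq->ring[cq->ci % cq->depth];
    if (!e->owner)                             /* entry not yet valid */
        return 0;
    *out = *e;                                 /* copy out, then release */
    cq->ci++;
    return 1;
}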
Communication between virtual machine manager instances is described below.
Each Qemu instance corresponds to one Qemu process on the physical machine, and only one simulation device is emulated in each Qemu instance. Because the two Qemu instances are located on the same host, they can communicate efficiently through shared memory: during DMA, the simulation device in one Qemu instance synchronizes the object to be read or written (a mailbox cmd or a data packet payload) with the shared memory, and the simulation device at the other end performs the corresponding write or read from the shared memory; the two Qemu processes are synchronized based on semaphores.
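For completeness, the consuming side of the shared-memory hand-off sketched earlier could look like the following; the object and semaphore names mirror the earlier (assumed) "/erdma-chan" and "/erdma-req" names, and a second semaphore is assumed for the completion signal.

#include <fcntl.h>
#include <semaphore.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

void ep1_consume(void)
{
    int fd = shm_open("/erdma-chan", O_RDWR, 0600);
    struct shm_channel { uint32_t len; uint8_t payload[4096]; } *ch =
        mmap(NULL, sizeof(*ch), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    sem_t *req  = sem_open("/erdma-req",  O_CREAT, 0600, 0);
    sem_t *done = sem_open("/erdma-done", O_CREAT, 0600, 0);

    for (;;) {
        sem_wait(req);                        /* ep0 posted a mailbox cmd / payload */
        uint8_t local[4096];
        if (ch->len <= sizeof(local))
            memcpy(local, ch->payload, ch->len);
        /* ... hand the object to the RDMA engine (Stack) for processing ... */
        sem_post(done);                       /* tell ep0 the object was consumed */
    }
}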
The following describes the transmission and reception of data packets.
For packet transmission, the Stack side initiates a DMA read of the data, completes encapsulation of the packet header and splicing of the payload part, and sends the packet to the remote end through the VF port using the existing Data Plane Development Kit (DPDK) framework. For packet reception, a DPDK poll-mode driver (Polling Mode Driver, abbreviated as PMD) polls for packets on the bound CPU core; after the header is stripped it is delivered to the Stack protocol stack for processing, while the payload part is DMA-written into the DDR of the host side.
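Since the paragraph above names DPDK's poll-mode driver explicitly, the send/receive path can be sketched with the standard burst APIs. Port and queue numbers, the burst size and the mempool/device setup are omitted or assumed; only the rx/tx burst calls reflect the described flow.

#include <rte_ethdev.h>
#include <rte_mbuf.h>

#define BURST 32

/* Transmit: the Stack has already assembled header + payload into mbufs. */
void vf_tx(uint16_t port, struct rte_mbuf **pkts, uint16_t n)
{
    uint16_t sent = rte_eth_tx_burst(port, 0 /* queue */, pkts, n);
    for (uint16_t i = sent; i < n; i++)
        rte_pktmbuf_free(pkts[i]);     /* drop what the NIC did not accept */
}

/* Receive: poll on the bound CPU core, strip the header, pass the rest up. */
void vf_rx_loop(uint16_t port)
{
    struct rte_mbuf *pkts[BURST];
    for (;;) {
        uint16_t n = rte_eth_rx_burst(port, 0 /* queue */, pkts, BURST);
        for (uint16_t i = 0; i < n; i++) {
            void *hdr = rte_pktmbuf_mtod(pkts[i], void *);
            /* ... unpack the header, hand it to the Stack, DMA the payload
             *     into the host-side DDR ... */
            (void)hdr;
            rte_pktmbuf_free(pkts[i]);
        }
    }
}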
The disclosure can be applied to one-to-one application scenarios: as shown in FIG. 4, each VM application uses its own dedicated RDMA emulation device ep0 (ep1 is transparent to VM0) to communicate with its peer. It can also be applied to many-to-one scenarios: FIG. 5 is a schematic diagram of an application scenario of a data processing method according to an embodiment of the present disclosure. As shown in FIG. 5, multiple VM application sides are unaware of each other, but the same emulation device ep1 simulates the RDMA behavior for all of them; during communication, ep1 creates mutually isolated RDMA resources for each application, similar to creating multiple queue pair (QP) connections on the same physical RDMA network card.
The overall flow is as follows: the virtual machine manager creates the simulation devices ep0 and ep1 and completes the initialization of their base address register spaces; the application initiates an RDMA resource initialization request through ep0, which is synchronized to ep1; ep1 simulates the hardware behavior, creates the RDMA resources and returns them to the application; the application exchanges connection information over a TCP/IP control path; ep1 prepares the send and receive buffers, after which data is received and transmitted; finally the connection is terminated and the RDMA resources are released. Control-path access goes through veth0 and data transmission goes through the VF, so no RDMA network card is required, yet service performance close to RDMA, and greatly improved compared with TCP/IP, can be provided. This realizes the ability to use RDMA technology on a non-RDMA network card, achieves the technical effect of improving the data communication efficiency of the application, and solves the technical problem of low data communication efficiency of the application.
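From the application's point of view, ep0 is supposed to be indistinguishable from a real RDMA device, so the setup half of the flow above would plausibly be written against the standard libibverbs API. The patent does not name the user-space API, so the following is only an assumed illustration, with error handling and the connection-information exchange over the veth0 control path omitted.

#include <infiniband/verbs.h>

int app_setup(void)
{
    int num;
    struct ibv_device **devs = ibv_get_device_list(&num);
    struct ibv_context *ctx = ibv_open_device(devs[0]);   /* the emulated ep0 */
    struct ibv_pd *pd = ibv_alloc_pd(ctx);

    struct ibv_cq *cq = ibv_create_cq(ctx, 256, NULL, NULL, 0);

    struct ibv_qp_init_attr attr = {
        .send_cq = cq,
        .recv_cq = cq,
        .cap     = { .max_send_wr = 128, .max_recv_wr = 128,
                     .max_send_sge = 1,  .max_recv_sge = 1 },
        .qp_type = IBV_QPT_RC,
    };
    struct ibv_qp *qp = ibv_create_qp(pd, &attr);   /* resource actually created by ep1 */

    /* Connection information (QP number, GID, ...) would then be exchanged
     * with the peer over the TCP/IP control path before data transfer. */
    (void)qp;
    return 0;
}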
The embodiment of the disclosure also provides a data processing device for executing the data processing method of the embodiment shown in fig. 1.
Fig. 6 is a schematic diagram of a data processing apparatus according to an embodiment of the present disclosure. As shown in fig. 6, the data processing apparatus 60 may include: a first determination unit 61, a second determination unit 62, a transmission unit 63, and a return unit 64.
A first determining unit 61, configured to determine at least a first virtual host and a second virtual host on a physical host;
the second determining unit 62 is configured to determine a first emulated device associated with each first virtual host, and determine a second emulated device associated with a second virtual host.
And a sending unit 63, configured to send a resource request initiated by the first simulation device to the second simulation device, where the resource request is used to request a virtual resource for an application program corresponding to the first simulation device.
And a return unit 64, configured to return the virtual resource created by the second simulation device in response to the resource request to the application program, so that the application program performs data transmission.
Alternatively, the second determining unit 62 includes: the first simulation module is used for simulating the first simulation equipment and the second simulation equipment based on the Qemu instance of the virtual machine manager.
Optionally, in the apparatus, the first simulation device initiates the resource request triggered by the application program.
Optionally, the apparatus further comprises: a processing unit, configured to initialize the first simulation device to obtain a third simulation device and initialize the second simulation device to obtain a fourth simulation device, where the registers of the third simulation device and the fourth simulation device are read and written by the central processing unit.
Optionally, in the apparatus, the third simulation device and the fourth simulation device are used to perform an access operation on the address indicated by the central processing unit.
Optionally, the first processing module includes: and the first synchronization sub-module is used for synchronizing an access operation command from the initialized first simulation device to the initialized second simulation device, wherein the access operation command is used for performing access operation on the address.
Optionally, the first processing module includes: the first transmission sub-module is used for transmitting the notification message acquired from the initialized second simulation device to the initialized first simulation device, wherein the notification message is used for indicating that the access operation to the address is completed.
Optionally, the first processing module includes: the first transmission sub-module is used for transmitting the notification message acquired from the initialized second simulation device to the initialized first simulation device in parallel based on multiple threads.
Optionally, the apparatus further comprises: an establishing unit, configured to establish connection information of the application program based on the virtual resource, where the connection information is used to enable the application program to establish a communication connection.
Optionally, the establishing unit includes: the second processing module is used for establishing a buffer area in the second simulation equipment; and caching data received or transmitted by the application program through the connection information in the buffer area.
In the apparatus of the embodiment of the disclosure, at least a first virtual host and a second virtual host are determined on a physical host by the first determining unit; a first simulation device associated with each first virtual host and a second simulation device associated with the second virtual host are determined by the second determining unit; a resource request initiated by the first simulation device is sent to the second simulation device by the sending unit, where the resource request is used to request a virtual resource for an application program corresponding to the first simulation device; and the virtual resource created by the second simulation device in response to the resource request is returned to the application program by the returning unit, so that the application program can perform data transmission. This realizes the ability to use RDMA technology on a non-RDMA network card, achieves the technical effect of improving the data communication efficiency of the application, and solves the technical problem of low data communication efficiency of the application.
In the technical solutions of the present disclosure, the collection, storage, and use of the user personal information involved all comply with the relevant laws and regulations and do not violate public order and good morals.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Embodiments of the present disclosure provide an electronic device that may include: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the data processing methods of the embodiments of the present disclosure.
Optionally, the electronic device may further include a transmission device and an input/output device, where the transmission device is connected to the processor, and the input/output device is connected to the processor.
According to an embodiment of the present disclosure, the present disclosure also provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the data processing method of the embodiments of the present disclosure.
Optionally, in the present embodiment, the above non-transitory storage medium may be configured to store a computer program for performing the following steps:
S1, determining at least a first virtual host and a second virtual host on a physical host;
S2, determining a first simulation device associated with each first virtual host, and determining a second simulation device associated with the second virtual host;
S3, sending a resource request initiated by the first simulation device to the second simulation device, where the resource request is used to request a virtual resource for an application program corresponding to the first simulation device;
S4, returning the virtual resource created by the second simulation device in response to the resource request to the application program, so that the application program can perform data transmission.
Alternatively, in the present embodiment, the non-transitory computer readable storage medium described above may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to an embodiment of the present disclosure, the present disclosure also provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of:
S1, determining at least a first virtual host and a second virtual host on a physical host;
S2, determining a first simulation device associated with each first virtual host, and determining a second simulation device associated with the second virtual host;
S3, sending a resource request initiated by the first simulation device to the second simulation device, where the resource request is used to request a virtual resource for an application program corresponding to the first simulation device;
S4, returning the virtual resource created by the second simulation device in response to the resource request to the application program, so that the application program can perform data transmission.
Fig. 7 is a block diagram of an electronic device of a method of data processing according to an embodiment of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in FIG. 7, the apparatus 700 includes a computing unit 701 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 may also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
Various components in device 700 are connected to I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, etc.; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, an optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 701 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 701 performs the respective methods and processes described above, for example the data processing method. For example, in some embodiments, the data processing method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the data processing method described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the data processing method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs, the one or more computer programs may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that steps may be reordered, added, or deleted in the various forms of flow shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure can be achieved; no limitation is imposed herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (15)

1. A data processing method, comprising:
determining at least a first virtual host and a second virtual host on a physical host;
determining a first simulation device associated with each first virtual host, and determining a second simulation device associated with the second virtual host;
transmitting a resource request initiated by the first simulation device to the second simulation device, wherein the resource request is used for requesting a virtual resource for an application program corresponding to the first simulation device;
and returning, to the application program, the virtual resource created by the second simulation device in response to the resource request, so that the application program can perform data transmission.
2. The method of claim 1, wherein determining a first simulation device associated with each first virtual host and determining a second simulation device associated with the second virtual host comprises:
simulating the first simulation device and the second simulation device based on a QEMU instance of the virtual machine manager.
3. The method of claim 1, wherein the first simulation device initiates the resource request upon being triggered by the application program.
4. The method of claim 1, further comprising:
initializing the first simulation device to obtain a third simulation device, and initializing the second simulation device to obtain a fourth simulation device, wherein a register of the third simulation device and a register of the fourth simulation device are subjected to read and write operations by a central processing unit.
5. The method of claim 4, wherein the third simulation device and the fourth simulation device are configured to perform access operations on addresses indicated by the central processing unit.
6. The method of claim 4, further comprising:
synchronizing an access operation command from the initialized first simulation device to the initialized second simulation device, wherein the access operation command is used for performing an access operation on the address.
7. The method of claim 6, further comprising:
transmitting a notification message acquired from the initialized second simulation device to the initialized first simulation device, wherein the notification message is used for indicating that the access operation on the address is completed.
8. The method of claim 7, wherein transmitting the notification message acquired from the initialized second simulation device to the initialized first simulation device comprises:
transmitting, in parallel based on multithreading, the notification message acquired from the initialized second simulation device to the initialized first simulation device.
9. The method of claim 1, further comprising:
establishing connection information of the application program based on the virtual resource, wherein the connection information is used for enabling the application program to establish a communication connection.
10. The method of claim 9, further comprising:
establishing a buffer area in the second simulation device;
and caching, in the buffer area, data received or transmitted by the application program through the connection information.
11. The method of any one of claims 1-10, wherein the virtual resource is a remote direct memory access (RDMA) resource.
12. A data processing apparatus comprising:
a first determining unit, configured to determine at least a first virtual host and a second virtual host on a physical host;
a second determining unit, configured to determine a first simulation device associated with each first virtual host, and determine a second simulation device associated with the second virtual host;
a sending unit, configured to send a resource request initiated by the first simulation device to the second simulation device, wherein the resource request is used to request a virtual resource for an application program corresponding to the first simulation device;
and a returning unit, configured to return, to the application program, the virtual resource created by the second simulation device in response to the resource request, so that the application program can perform data transmission.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-11.
14. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-11.
15. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-11.
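Illustrative sketch (not part of the claims): the plain-C program below models, under simplifying assumptions, the notification path of claims 7 and 8 (a notification message forwarded on a separate thread from the initialized second simulation device to the initialized first simulation device) together with the buffer area of claim 10 that caches data for the application program. All identifiers (backend_device, cache_data, notify_thread, and so on) are hypothetical and chosen only for illustration.

    /* Illustrative sketch only; identifiers are hypothetical, not from the disclosure. */
    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>

    #define BUF_SLOTS 4
    #define SLOT_SIZE 64

    /* Buffer area established in the second simulation device (cf. claim 10). */
    typedef struct {
        char slots[BUF_SLOTS][SLOT_SIZE];
        int  used;
    } buffer_area;

    /* Initialized second simulation device. */
    typedef struct {
        buffer_area     buf;
        int             access_done;   /* set when the access operation on the address completes */
        pthread_mutex_t lock;
        pthread_cond_t  cond;
    } backend_device;

    /* Cache data received or transmitted by the application program in the buffer area. */
    static void cache_data(backend_device *dev, const char *payload) {
        pthread_mutex_lock(&dev->lock);
        if (dev->buf.used < BUF_SLOTS) {
            snprintf(dev->buf.slots[dev->buf.used], SLOT_SIZE, "%s", payload);
            dev->buf.used++;
        }
        pthread_mutex_unlock(&dev->lock);
    }

    /* Worker thread: once the access operation completes, transmit the notification
     * message toward the initialized first simulation device (here, just print it). */
    static void *notify_thread(void *arg) {
        backend_device *dev = (backend_device *)arg;
        pthread_mutex_lock(&dev->lock);
        while (!dev->access_done)
            pthread_cond_wait(&dev->cond, &dev->lock);
        pthread_mutex_unlock(&dev->lock);
        printf("notification: access operation on the address is completed\n");
        return NULL;
    }

    int main(void) {
        backend_device dev;
        pthread_t tid;

        memset(&dev, 0, sizeof(dev));
        pthread_mutex_init(&dev.lock, NULL);
        pthread_cond_init(&dev.cond, NULL);
        pthread_create(&tid, NULL, notify_thread, &dev);

        cache_data(&dev, "app payload #1");   /* data cached in the buffer area */

        pthread_mutex_lock(&dev.lock);        /* mark the access operation as completed */
        dev.access_done = 1;
        pthread_cond_signal(&dev.cond);
        pthread_mutex_unlock(&dev.lock);

        pthread_join(tid, NULL);
        printf("slots used in buffer area: %d\n", dev.buf.used);

        pthread_cond_destroy(&dev.cond);
        pthread_mutex_destroy(&dev.lock);
        return 0;
    }

Build with a POSIX toolchain, e.g. cc -pthread sketch.c. The condition variable here merely stands in for whatever notification mechanism a real emulated-device implementation would use; the claims do not prescribe one.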
CN202210197851.4A 2022-03-01 2022-03-01 Data processing method, device, electronic equipment and storage medium Pending CN116737403A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210197851.4A CN116737403A (en) 2022-03-01 2022-03-01 Data processing method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210197851.4A CN116737403A (en) 2022-03-01 2022-03-01 Data processing method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116737403A true CN116737403A (en) 2023-09-12

Family

ID=87903139

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210197851.4A Pending CN116737403A (en) 2022-03-01 2022-03-01 Data processing method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116737403A (en)

Similar Documents

Publication Publication Date Title
US10095645B2 (en) Presenting multiple endpoints from an enhanced PCI express endpoint device
US9529773B2 (en) Systems and methods for enabling access to extensible remote storage over a network as local storage via a logical storage controller
US9996484B1 (en) Hardware acceleration for software emulation of PCI express compliant devices
US10908895B2 (en) State-preserving upgrade of an intelligent server adapter
EP4160424A2 (en) Zero-copy processing
US9396101B2 (en) Shared physical memory protocol
WO2019195003A1 (en) Virtual rdma switching for containerized applications
US10901725B2 (en) Upgrade of port firmware and driver software for a target device
US10942729B2 (en) Upgrade of firmware in an interface hardware of a device in association with the upgrade of driver software for the device
US11693804B2 (en) Cross bus memory mapping
US20180219797A1 (en) Technologies for pooling accelerator over fabric
CN109983438B (en) Use of Direct Memory Access (DMA) re-establishment mapping to accelerate paravirtualized network interfaces
US10621124B2 (en) Method, device and computer program product for enabling SR-IOV functions in endpoint device
US10452570B1 (en) Presenting physical devices to virtual computers through bus controllers emulated on PCI express endpoints
WO2022143714A1 (en) Server system, and virtual machine creation method and apparatus
CN108965148A (en) A kind of processor and message processing method
US20220050795A1 (en) Data processing method, apparatus, and device
US20140280709A1 (en) Flow director-based low latency networking
CN112799840A (en) Method, device, equipment and storage medium for transmitting data
CN116886751A (en) High-speed communication method and device of heterogeneous equipment and heterogeneous communication system
CN112905304A (en) Communication method and device between virtual machines, physical host and medium
US10673983B2 (en) Processing a unit of work
CN116737403A (en) Data processing method, device, electronic equipment and storage medium
CN116743587B (en) Virtual network interface implementation method and device based on heterogeneous computing accelerator card
CN117520215A (en) Page missing processing method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination