CN116389280A - Network topology acquisition method, device, equipment and medium - Google Patents

Publication number
CN116389280A
Authority
CN
China
Prior art keywords
node
configuration protocol
host configuration
dynamic host
network topology
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310573372.2A
Other languages
Chinese (zh)
Inventor
肖麟阁
阚宏伟
郝锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Inspur Smart Computing Technology Co Ltd
Original Assignee
Guangdong Inspur Smart Computing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/12 Discovery or management of network topologies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/50 Address allocation
    • H04L61/5007 Internet protocol [IP] addresses
    • H04L61/5014 Internet protocol [IP] addresses using dynamic host configuration protocol [DHCP] or bootstrap protocol [BOOTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L69/164 Adaptation or special uses of UDP protocol
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/26 Special purpose or proprietary protocols or architectures

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computing Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a network topology acquisition method, device, equipment and medium, and relates to the field of distributed heterogeneous computing systems. A dynamic host configuration protocol client integrated in the network communication module dedicated to each graphics processor node acquires the internet protocol address of the corresponding graphics processor node from a plurality of internet protocol addresses pre-stored in the host node, and transmits information containing the acquired internet protocol address to the host node, so that each graphics processor node is automatically registered in the network of the distributed computing system. The host node establishes a node table according to the information transmitted by each dynamic host configuration protocol client and sends the node table to each graphics processor node, so that each graphics processor node can determine the network topology structure in the distributed computing system according to the node table. A graphics processor can thus be registered on the network without depending on the central processor connected to it, which improves the flexibility and expandability of the network topology.

Description

Network topology acquisition method, device, equipment and medium
Technical Field
The present invention relates to the field of distributed heterogeneous computing systems, and in particular, to a method, an apparatus, a device, and a medium for obtaining a network topology.
Background
In a conventional distributed heterogeneous computing system, there are a plurality of heterogeneous computing devices, such as a central processing unit (Central Processing Unit, CPU), a graphics processor (Graphics Processing Unit, GPU), a Field programmable gate array (Field-Programmable Gate Array, FPGA), and the like, in a single node, and these devices typically take the CPU in the node as a center, and complete the computation and communication processes.
The GPU, as a heterogeneous computing device without a conventional operating system and software stack, cannot, as the host CPU does, actively access the network interface controller (Network Interface Controller, NIC) across chips and register its own node information on the network through the NIC. In the related art, some heterogeneous computing engines within a node have no dedicated communication module, so the GPU and the CPU on the node must share one NIC; either the CPU is responsible for registering all heterogeneous computing engines in the node on the network topology, or each engine registers its own information on the network topology through vendor-provided code executed by the CPU. Both approaches limit which party a heterogeneous computing engine belongs to in the network topology and how its registration can be arranged. In other designs, in order to accelerate the computation of various neural networks, a dedicated network communication module is provided inside each heterogeneous computing engine, so that in most cases data need not be transmitted across chips to the NIC of the CPU for external communication. Within a node, however, the heterogeneous computing engine is still connected to the CPU through the high-speed serial computer expansion bus standard (Peripheral Component Interconnect Express, PCIe), and the CPU remains responsible for registering all heterogeneous computing engines within the node on the network topology. For a distributed heterogeneous computing system based on computing network convergence, in which each computing engine communicates independently, making the CPU responsible for registering all heterogeneous computing engines within the node severely limits the flexibility and scalability of the network topology.
Therefore, providing a network topology acquisition method to realize the self-adaptive automatic networking of the GPU nodes is a technical problem that needs to be solved by the person skilled in the art.
Disclosure of Invention
The invention aims to provide a network topology acquisition method, device, equipment and medium, which are used for realizing self-adaptive automatic networking of GPU nodes.
In order to solve the above technical problems, the present invention provides a network topology acquisition method applied to a host node in a distributed computing system based on computing network integration, where each node in the distributed computing system is the host node or a graphics processor node, the host node includes a dynamic host configuration protocol server, and a dynamic host configuration protocol client is integrated in a network communication module dedicated to the graphics processor node, and the method includes:
under the condition that dynamic host configuration protocol broadcast messages sent by the dynamic host configuration protocol clients are received, distributing internet protocol addresses to the dynamic host configuration protocol clients; wherein, the dynamic host configuration protocol server is pre-stored with a plurality of internet protocol addresses;
Receiving information sent by each dynamic host configuration protocol client; wherein, the information at least comprises the internet protocol address and unique codes corresponding to the graphic processor nodes;
establishing a node table according to the information sent by each dynamic host configuration protocol client;
and respectively sending the node tables to each graphic processor node so that each graphic processor node can determine the network topology structure in the distributed computing system according to the node tables.
In one aspect, the receiving the information sent by each dynamic host configuration protocol client includes:
receiving the information encapsulated by each dynamic host configuration protocol client through a protocol based on a user datagram protocol; the user datagram protocol-based protocol is a protocol set in the data content of the user datagram protocol, and at least comprises an initial source unique code, a target source unique code, a data transmission length and a checksum;
correspondingly, the establishing a node table according to the information sent by each dynamic host configuration protocol client includes:
Analyzing the information encapsulated by each dynamic host configuration protocol client through the user datagram protocol-based protocol and acquiring the analyzed information;
and establishing the node table according to the analyzed information.
In another aspect, the creating a node table according to the information sent by each dynamic host configuration protocol client includes:
determining state information of the graphics processor nodes corresponding to the dynamic host configuration protocol clients according to the information sent by the dynamic host configuration protocol clients;
and establishing the node table according to the information sent by each dynamic host configuration protocol client and the state information of each graphic processor node.
In another aspect, the distributed computing system includes a plurality of host nodes; before said assigning an internet protocol address to each of said dynamic host configuration protocol clients, further comprising:
selecting a target host node from the plurality of host nodes so as to facilitate executing the network topology acquisition method in the target host node; wherein the target host node is the host node whose internet protocol address remains unchanged;
acquiring the internet protocol address corresponding to the target host node;
transmitting an internet protocol address corresponding to the target host node to a common host node; wherein the common host node is the remaining host nodes except the target host node in the host nodes.
In another aspect, after said sending the node table to each of the graphics processor nodes, further comprising:
under the condition that a dynamic host configuration protocol broadcast message sent by a new dynamic host configuration protocol client is received, the Internet protocol address is distributed to the new dynamic host configuration protocol client;
receiving information sent by a new dynamic host configuration protocol client;
updating each node table in the target graphics processor node into a node table containing new graphics processor node information corresponding to the new dynamic host configuration protocol client according to the information sent by the new dynamic host configuration protocol client; wherein the target graphics processor node is either all of the graphics processor nodes or a subset of the graphics processor nodes.
In another aspect, updating each node table in the target graphics processor node to a node table containing new graphics processor node information corresponding to the new dynamic host configuration protocol client according to the information sent by the new dynamic host configuration protocol client includes:
from the moment the information sent by the new dynamic host configuration protocol client is first received, if the information sent by the new dynamic host configuration protocol client is received a plurality of times within a first preset time period, acquiring the information sent by the new dynamic host configuration protocol client that is received within a second preset time period starting from the moment when the first preset time period ends;
and updating each node table in the target graphics processor node into a node table containing the new graphics processor node information corresponding to the new dynamic host configuration protocol client according to the information sent by the new dynamic host configuration protocol client in the first preset time period and the information sent by the new dynamic host configuration protocol client in the second preset time period.
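The two-period collection described above can be read as a batching rule: registrations that arrive close together are gathered and pushed to the graphics processor nodes as one node-table update. The following is a minimal sketch of that reading, not the claimed implementation. The function name `batch_registrations`, the event tuple layout and the use of plain timestamps in seconds are all assumptions made for illustration.

```python
def batch_registrations(events, first_window, second_window):
    """Group registration events into batches (hypothetical helper).

    events: list of (timestamp, gpu_id, ip), sorted by timestamp.
    Each returned batch would be pushed as a single node-table update.
    """
    batches = []
    i = 0
    while i < len(events):
        # The first preset period starts when a registration is first received.
        deadline = events[i][0] + first_window
        j = i
        while j < len(events) and events[j][0] < deadline:
            j += 1
        if j - i > 1:
            # Several registrations arrived in the first period, so keep
            # collecting for a second period that starts when the first ends.
            deadline += second_window
            while j < len(events) and events[j][0] < deadline:
                j += 1
        batches.append([(gpu_id, ip) for _, gpu_id, ip in events[i:j]])
        i = j
    return batches
```

With a first and second period of one second each, registrations at 0.0 s, 0.2 s and 1.3 s fall into one batch, while a registration at 5.0 s forms its own batch.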
In another aspect, the method further comprises:
acquiring the data packet which is transmitted by the common host node and is encapsulated by the protocol based on the user datagram protocol; the data packet at least comprises the number of the graphic processor nodes to be requested and the information of the common host node;
determining the data packet to be returned to the common host node according to the parsed content of the data packet and the state information of the graphics processor nodes; wherein the data packet to be returned includes at least the internet protocol address of the graphics processor node and the unique code corresponding to the internet protocol address;
and sending the data packet to be returned to the common host node.
In another aspect, after the sending the data packet to be returned to the common host node, the method further includes:
and updating the node table in the target host node according to the data packet to be returned and acquiring the updated node table.
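The request handling described above (a common host node asks for a number of graphics processor nodes, the target host node answers with ID and IP pairs and then updates its node table) can be sketched as follows. This is only an illustration under stated assumptions: the function `handle_gpu_request`, the dictionary layout of the node table and the state names "idle" and "busy" are hypothetical and do not come from the patent text.

```python
def handle_gpu_request(node_table, requested):
    """Grant up to `requested` idle GPU nodes to a requesting host.

    node_table: dict mapping gpu_id -> {"ip": str, "state": str}
    (assumed layout). Returns the (gpu_id, ip) pairs to send back,
    and marks the granted nodes as busy in the node table.
    """
    granted = []
    for gpu_id in sorted(node_table):
        if len(granted) == requested:
            break
        entry = node_table[gpu_id]
        if entry["state"] == "idle":
            # Updating the state here corresponds to updating the node
            # table in the target host node after the reply is built.
            entry["state"] = "busy"
            granted.append((gpu_id, entry["ip"]))
    return granted
```

If fewer idle nodes exist than were requested, this sketch simply returns the ones available; how the real system handles a shortfall is not stated in the source.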
In another aspect, the invention also provides a network topology acquisition method applied to each graphics processor node in a distributed computing system based on computing network integration, where each node in the distributed computing system is a host node or a graphics processor node, the host node includes a dynamic host configuration protocol server, and a dynamic host configuration protocol client is integrated in the network communication module dedicated to the graphics processor node, the method including:
Transmitting a dynamic host configuration protocol broadcast message to the host node;
acquiring an internet protocol address distributed by the host node; wherein, the dynamic host configuration protocol server is pre-stored with a plurality of internet protocol addresses;
transmitting information to the host node through the dynamic host configuration protocol client; wherein, the information at least comprises the internet protocol address and unique codes corresponding to the graphic processor nodes;
acquiring a node table sent by the host node; the node table is established for the host node according to the information sent by each dynamic host configuration protocol client;
and determining the network topology structure in the distributed computing system according to the node table.
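The node-side steps above can be walked through with a small in-memory model. This is a sketch only: `FakeHost` and `join_network` are hypothetical names, no real DHCP or network traffic is involved, and "determining the topology" is reduced here to deriving the set of peer nodes from the received node table.

```python
class FakeHost:
    """In-memory stand-in for the host node (hypothetical, no networking)."""

    def __init__(self, pool):
        self.pool = list(pool)   # pre-stored IP addresses
        self.table = {}          # node table: gpu_id -> ip

    def on_discover(self):
        # Answer a broadcast message with the next free address.
        return self.pool.pop(0)

    def on_register(self, gpu_id, ip):
        # Record the node and return a copy of the current node table.
        self.table[gpu_id] = ip
        return dict(self.table)


def join_network(host, gpu_id):
    ip = host.on_discover()                # broadcast, then receive an address
    table = host.on_register(gpu_id, ip)   # send ID and IP, receive node table
    # Derive the peers this node can reach from the node table.
    peers = {i: a for i, a in table.items() if i != gpu_id}
    return ip, peers
```

The first node to join sees no peers; each later node sees every node registered before it. Pushing table updates to already-registered nodes is left out of this sketch.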
In another aspect, the present invention further provides a network topology acquiring device, which is applied to a host node in a distributed computing system based on computing network convergence, where each node in the distributed computing system is the host node or a graphics processor node, the host node includes a dynamic host configuration protocol server, and a dynamic host configuration protocol client is integrated in a network communication module exclusive to the graphics processor node, and the network topology acquiring device includes:
The distribution module is used for distributing the internet protocol address to each dynamic host configuration protocol client under the condition that the dynamic host configuration protocol broadcast message sent by each dynamic host configuration protocol client is received; wherein, the dynamic host configuration protocol server is pre-stored with a plurality of internet protocol addresses;
the receiving module is used for receiving the information sent by each dynamic host configuration protocol client; wherein, the information at least comprises the internet protocol address and unique codes corresponding to the graphic processor nodes;
the establishing module is used for establishing a node table according to the information sent by each dynamic host configuration protocol client;
and the sending module is used for respectively sending the node tables to the graphic processor nodes so that the graphic processor nodes can determine the network topology structure in the distributed computing system according to the node tables.
In another aspect, the invention also provides network topology acquisition equipment, including:
a memory for storing a computer program;
and the processor is used for realizing the steps of the network topology acquisition method when executing the computer program.
In another aspect, the present invention further provides a computer readable storage medium, where a computer program is stored, where the computer program when executed by a processor implements the steps of the network topology acquisition method described above.
The invention provides a network topology acquisition method applied to a host node in a distributed computing system based on computing network integration. In the method, a dynamic host configuration protocol client integrated in the network communication module dedicated to each graphics processor node acquires the internet protocol address of the corresponding graphics processor node from a plurality of internet protocol addresses pre-stored in the host node, and transmits information containing the acquired internet protocol address to the host node, thereby realizing automatic registration of each graphics processor node in the network of the distributed computing system. The host node establishes a node table according to the information transmitted by each dynamic host configuration protocol client and sends it to each graphics processor node, so that each graphics processor node can determine the network topology in the distributed computing system according to the node table. A graphics processor can therefore be registered on the network without depending on the central processor connected to it, which improves the flexibility and expandability of the network topology. Secondly, each node in the distributed computing system is either a host node or a graphics processor node, so that the graphics processor is decoupled from the host to which it belongs; each graphics processor node has a dedicated network communication module, which eliminates the limitation on network topology caused by physical binding to a host CPU over PCIe. In addition, because each graphics processor node is uniquely identified, the internet protocol address of any graphics processor node can be quickly found in the node table according to its unique code, which improves the self-networking efficiency of the graphics processor nodes.
In addition, the invention also provides a network topology acquisition device, a network topology acquisition equipment and a computer readable storage medium, which have the same or corresponding technical characteristics as the above-mentioned network topology acquisition method, and have the same effects.
Drawings
For a clearer description of embodiments of the present invention, the drawings that are required to be used in the embodiments will be briefly described, it being apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to the drawings without inventive effort for those skilled in the art.
FIG. 1 is a schematic diagram of a distributed computing system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a background service program of a host node according to an embodiment of the present invention;
FIG. 3 is a schematic hardware structure of a GPU node according to an embodiment of the present invention;
FIG. 4 is a flowchart of a network topology acquisition method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the location and components of an RLTL protocol in an overall network protocol stack according to an embodiment of the invention;
FIG. 6 is a schematic diagram of a network topology discovery process of a communication-independent distributed heterogeneous computing system according to an embodiment of the present invention;
FIG. 7 is a block diagram of a network topology acquiring apparatus according to an embodiment of the present invention;
FIG. 8 is a block diagram of a network topology acquiring apparatus according to another embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without making any inventive effort are within the scope of the present invention.
The core of the invention is to provide a network topology acquisition method, device, equipment and medium for realizing adaptive automatic networking of GPU nodes. FIG. 1 is a schematic diagram of a distributed computing system according to an embodiment of the present invention. As shown in FIG. 1, the distributed computing system includes a target host node 1, a common host node 2 and a graphics processor node 3. The target host node 1 and the common host node 2 are both CPU nodes; the target host node 1 is the node running the background service program, and the common host node 2 is a node that makes data requests to the target host node 1.
FIG. 2 is a schematic diagram of the background service program of a host node provided in an embodiment of the present invention. In the target host node 1, external data is parsed and encapsulated in a management library through a conventional network protocol stack; a dynamic host configuration protocol (Dynamic Host Configuration Protocol, DHCP) server in network topology management allocates an internet protocol (Internet Protocol, IP) address to each graphics processor node so that the GPU nodes are automatically registered on the network; and network topology discovery is realized once the registration information of all GPU nodes has been obtained.
FIG. 3 is a schematic hardware structure of a GPU node according to an embodiment of the present invention, in which an irma module that provides network function support for the GPU is built on top of a conventional Ethernet protocol stack and provides a function similar to remote direct memory access (Remote Direct Memory Access, RDMA). In the GPU node, the parsing, encapsulation, verification and so on of data packets are implemented by a dynamic host configuration protocol client (abbreviated as DHCP client) and a custom protocol engine for a protocol based on the user datagram protocol. It should be noted that in this embodiment the protocol based on the user datagram protocol is referred to as the reliable lightweight transport protocol (Reliable Lightweight Transport Protocol, RLTL). A DHCP client that starts automatically at power-on is integrated in the network communication module dedicated to each GPU and cooperates with the background server program on the upper-layer host, the RLTL parsing engine at the graphics processor node, the node table and so on. As a result, each GPU computing engine can communicate completely independently and autonomously in the network and can be networked freely as a single GPU computing engine, which eliminates the limitation on network topology caused by physical binding to the CPU host over PCIe and greatly improves the flexibility and expandability of the distributed computing system based on computing network convergence.
In order to better understand the aspects of the present invention, the present invention will be described in further detail with reference to the accompanying drawings and detailed description. FIG. 4 is a flowchart of a network topology acquisition method provided in an embodiment of the present invention, which is applied to a host node in a distributed computing system based on computing network convergence, where each node in the distributed computing system is a host node or a graphics processor node, the host node includes a dynamic host configuration protocol server, and a dynamic host configuration protocol client is integrated in a network communication module dedicated to the graphics processor node, and the method includes:
S10: allocating an internet protocol address to each dynamic host configuration protocol client when the dynamic host configuration protocol broadcast message sent by that dynamic host configuration protocol client is received.
Wherein, the dynamic host configuration protocol server stores a plurality of internet protocol addresses in advance.
The DHCP client in each GPU node automatically sends a DHCP broadcast message when the GPU node is powered on, and each GPU node obtains its IP address in the network with the assistance of the DHCP server in the background service program of the host node. It should be noted that the IP address allocated by the DHCP server to a GPU node is governed by its lease, so the IP address of a given GPU node may change or remain unchanged over time. It is only necessary to ensure that the IP addresses held by different GPU nodes at the same time are different.
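The allocation rule in this step (a pre-stored pool, lease-dependent addresses, and no two GPU nodes holding the same address at the same time) can be modelled in a few lines. This is a simplified sketch and not a DHCP implementation: the class `AddressPool`, its method names and the use of a string client key are assumptions, and the DHCP message exchange itself is omitted.

```python
import ipaddress


class AddressPool:
    """Hypothetical model of the pre-stored IP pool held by the host node."""

    def __init__(self, first_ip, count):
        start = ipaddress.IPv4Address(first_ip)
        self.free = [start + i for i in range(count)]
        self.leases = {}  # client key (for example a MAC address) -> address

    def allocate(self, client_key):
        # A client that still holds a lease keeps its current address.
        if client_key in self.leases:
            return str(self.leases[client_key])
        if not self.free:
            raise RuntimeError("address pool exhausted")
        addr = self.free.pop(0)
        self.leases[client_key] = addr
        return str(addr)

    def release(self, client_key):
        # When a lease ends, the address returns to the pool, so the same
        # node may later receive a different address.
        addr = self.leases.pop(client_key, None)
        if addr is not None:
            self.free.append(addr)
```

Because an address is removed from `free` while it is leased, two clients can never hold the same address at the same time, which is the only invariant the text requires.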
S11: receiving information sent by each dynamic host configuration protocol client; wherein the information at least comprises an internet protocol address and a unique code of each graphic processor node.
After each GPU node obtains its IP address in the network, it automatically sends information containing that IP address to the background service program, and the host receives the information sent by each DHCP client. In addition to the IP address of the current GPU node, the information contains the unique code (identifier, ID) of the current GPU node.
S12: establishing a node table according to the information sent by each dynamic host configuration protocol client.
S13: sending the node table to each graphics processor node, so that each graphics processor node can determine the network topology structure in the distributed computing system according to the node table.
After the information sent by each dynamic host configuration protocol client is obtained, it is stored in the background service program, and a node table, such as an ID-to-IP mapping, is established according to the information sent by each DHCP client. In addition, so that the state of each GPU node can be read directly from the node table and resource requests, allocation and the like can be made on that basis, in implementation the node table may include the state of each GPU node in addition to the ID-to-IP mapping.
Specifically, the node table is built according to the information sent by each dynamic host configuration protocol client side, which comprises the following steps:
determining state information of the graphics processor nodes corresponding to each dynamic host configuration protocol client according to the information sent by each dynamic host configuration protocol client;
and establishing a node table according to the information sent by each dynamic host configuration protocol client and the state information of each graphic processor node.
Table 1 is a GPU node table provided in an embodiment of the present invention, as shown in Table 1, where the GPU node table includes IDs of all GPUs, corresponding IPs, and corresponding GPU states. In practice, the node table may include other information of the GPU in addition to the ID, IP, GPU state information of the GPU, which is not limited thereto.
Table 1: GPU node table
GPU ID      GPU IP      GPU state
GPU ID1     GPU IP1     GPU state
GPU ID2     GPU IP2     GPU state
GPU IDN     GPU IPN     GPU state
After the node table is built, it is sent to each graphics processor node. If the IP address or other information corresponding to the ID of a GPU node in the node table changes within a period of time, the host node sends the change to all other GPU nodes so that their node tables are updated. When a GPU node needs to communicate with another node, it can then quickly find the corresponding IP address according to the ID of that node.
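The node table of Table 1 and its update rule can be illustrated with a small data structure. This is a sketch under assumptions: the names `NodeEntry` and `NodeTable`, the integer IDs and the state value "idle" are hypothetical. The `register` method reports whether anything changed, which is the condition under which the host node would push an update to the other GPU nodes.

```python
from dataclasses import dataclass


@dataclass
class NodeEntry:
    gpu_id: int
    ip: str
    state: str = "idle"  # state names are assumptions


class NodeTable:
    def __init__(self):
        self.entries = {}  # gpu_id -> NodeEntry

    def register(self, gpu_id, ip, state="idle"):
        """Insert or update an entry; return True if the table changed."""
        new = NodeEntry(gpu_id, ip, state)
        if self.entries.get(gpu_id) == new:
            return False
        self.entries[gpu_id] = new
        return True

    def ip_of(self, gpu_id):
        # Lookup by unique ID, as a GPU node does before contacting a peer.
        return self.entries[gpu_id].ip

    def snapshot(self):
        # Rows in the column order of Table 1: ID, IP, state.
        rows = sorted(self.entries.values(), key=lambda e: e.gpu_id)
        return [(e.gpu_id, e.ip, e.state) for e in rows]
```

Keying the table by the fixed unique ID rather than by IP address means that a lease change only rewrites one field of one row, and lookups by ID stay constant-time.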
The network topology acquisition method provided by the embodiment of the invention is applied to a host node in a distributed computing system based on computing network integration. In the method, a dynamic host configuration protocol client integrated in the network communication module dedicated to each graphics processor node acquires the internet protocol address of the corresponding graphics processor node from a plurality of internet protocol addresses pre-stored in the host node, and transmits information containing the acquired internet protocol address to the host node, thereby realizing automatic registration of each graphics processor node in the network of the distributed computing system. The host node establishes a node table according to the information transmitted by each dynamic host configuration protocol client and sends it to each graphics processor node, so that each graphics processor node can determine the network topology in the distributed computing system according to the node table. A graphics processor can therefore be registered on the network without depending on the central processor connected to it, which improves the flexibility and expandability of the network topology. Secondly, each node in the distributed computing system provided by the embodiment is either a host node or a graphics processor node, so that the graphics processor is decoupled from the host to which it belongs; each graphics processor node has a dedicated network communication module, which eliminates the limitation on network topology caused by physical binding to a host CPU over PCIe. In addition, because each graphics processor node is uniquely identified, the internet protocol address of any graphics processor node can be quickly found in the node table according to its unique code, which improves the self-networking efficiency of the graphics processor nodes.
In order to establish a reliable link, the conventional approach is to use the transmission control protocol (Transmission Control Protocol, TCP). However, due to the complexity of the TCP protocol, implementing its functionality in hardware consumes a significant amount of FPGA resources, and the delay on the data path is higher than with the user datagram protocol (User Datagram Protocol, UDP). Therefore, this embodiment uses a protocol based on the user datagram protocol, namely the RLTL protocol described above, to improve the transmission efficiency of the iRDMA module under the UDP network.
Receiving the information sent by each dynamic host configuration protocol client comprises the following steps:
receiving the information encapsulated by each dynamic host configuration protocol client through a protocol based on the User Datagram Protocol (UDP); the user datagram protocol-based protocol is a protocol set in the data content of the user datagram protocol, and at least comprises an initial source unique code, a target source unique code, a data transmission length and a checksum;
correspondingly, establishing the node table according to the information sent by each dynamic host configuration protocol client comprises:
parsing the information encapsulated by each dynamic host configuration protocol client through the user datagram protocol-based protocol and acquiring the parsed information;
and establishing the node table according to the parsed information.
The RLTL protocol provides the functions necessary to achieve reliable transmission, including timeout/out-of-order retransmission, flow control, congestion management, etc. Fig. 5 is a schematic diagram of the location and components of the RLTL protocol in the entire network protocol stack according to an embodiment of the present invention. As shown in fig. 5, the RLTL protocol is located in the data content (payload) of the UDP protocol, i.e., it is an application layer protocol. It includes fields such as frame type (frame type), frame number (frame number), flags (flag), source ID (src ID), destination ID (dest addr), transfer length (transfer length), and cyclic redundancy check (Cyclic Redundancy Check, CRC) checksum (checksum). The frame type distinguishes the main function of the frame, such as a DHCP request of a GPU node, data movement between nodes, instruction control between nodes, or a data response; the source ID and the destination ID represent the unique identities of the sender node and the receiver node of the frame (the ID information of a node is fixed); when the frame performs data movement, the source address and destination address represent the source address and destination address of the data; the CRC checksum provides error checking.
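As an illustration of how such a frame might be carried inside a UDP payload, the following sketch packs and parses a header with the fields named above. The field widths, their order, the numeric frame-type values, and the use of CRC-32 are all assumptions made for the example; the embodiment names the fields but does not specify their layout.

```python
import struct
import zlib

# Hypothetical layout (network byte order), not taken from the patent:
# frame_type (1 byte), flags (1 byte), frame_number (2 bytes),
# src_id (4 bytes), dest_id (4 bytes), transfer_length (4 bytes).
HEADER = struct.Struct("!BBHIII")
CRC = struct.Struct("!I")

# Hypothetical frame-type codes.
FRAME_DHCP_REQUEST, FRAME_DATA_MOVE, FRAME_CONTROL, FRAME_RESPONSE, FRAME_FINAL = range(1, 6)


def pack_frame(frame_type, frame_number, flags, src_id, dest_id, payload=b""):
    """Build header + payload and append a CRC-32 computed over both."""
    header = HEADER.pack(frame_type, flags, frame_number, src_id, dest_id, len(payload))
    body = header + payload
    return body + CRC.pack(zlib.crc32(body))


def parse_frame(data):
    """Validate the CRC and length, then return the fields as a dict."""
    if len(data) < HEADER.size + CRC.size:
        raise ValueError("frame too short")
    body = data[:-CRC.size]
    (crc,) = CRC.unpack(data[-CRC.size:])
    if zlib.crc32(body) != crc:
        raise ValueError("CRC mismatch")
    frame_type, flags, frame_number, src_id, dest_id, length = HEADER.unpack_from(body)
    payload = body[HEADER.size:]
    if len(payload) != length:
        raise ValueError("transfer length does not match payload")
    return {
        "frame_type": frame_type,
        "flags": flags,
        "frame_number": frame_number,
        "src_id": src_id,
        "dest_id": dest_id,
        "payload": payload,
    }
```

A receiver would call `parse_frame` on the bytes of each UDP datagram and dispatch on `frame_type`; retransmission, flow control and congestion management are outside the scope of this sketch.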
After each graphics processor node obtains its IP address in the network, it finally sends a FINAL message encapsulated with the RLTL protocol to the background service program. The RLTL header of the message contains information such as the unique ID of the node, and the background service program stores this information together with the IP address from the IP header to complete automatic registration in the network.
In the whole network, a server is used as an operating host of a background service program. The background service program can automatically receive DHCP registration information and FINAL information of all GPU nodes in the network, analyze the information, and maintain and update a node table in real time according to the information, wherein the node table comprises ID information and corresponding IP addresses of each GPU node in the network.
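The registration step performed by the background service program can be sketched as follows. The sketch assumes that the RLTL frame has already been parsed, that the sender address is an `(ip, port)` tuple of the kind returned by `socket.recvfrom`, and that FINAL messages carry a hypothetical frame-type code `FRAME_FINAL`; none of these names or values are specified in the embodiment.

```python
FRAME_FINAL = 5  # hypothetical frame-type code for a FINAL message


def handle_datagram(node_table, frame_type, src_id, sender_addr):
    """Register the sender when the frame is a FINAL message.

    node_table  -- dict mapping node ID to IP address, updated in place
    frame_type  -- frame type taken from the parsed RLTL header
    src_id      -- source ID taken from the parsed RLTL header
    sender_addr -- (ip, port) tuple of the datagram's sender
    Returns True if the node table changed and needs to be synchronized.
    """
    if frame_type != FRAME_FINAL:
        return False
    ip, _port = sender_addr
    changed = node_table.get(src_id) != ip
    node_table[src_id] = ip
    return changed


table = {}
first = handle_datagram(table, FRAME_FINAL, 0x1001, ("10.0.0.11", 4791))
repeat = handle_datagram(table, FRAME_FINAL, 0x1001, ("10.0.0.11", 4791))
moved = handle_datagram(table, FRAME_FINAL, 0x1001, ("10.0.0.42", 4791))
```

Returning a change flag lets the caller decide when a synchronization message to the GPU nodes is actually needed.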
In the method provided by the embodiment, the transmission efficiency of the iRDMA module under the UDP network is improved and the reliable transmission of the data is realized through the RLTL protocol.
In implementations, a distributed computing system includes a plurality of host nodes. In the entire network, there is generally one server as an operating host of the background service program, and therefore, in this embodiment, the host operating the background service program is referred to as a target host node, and the remaining host nodes are referred to as normal host nodes. Before the internet protocol address is allocated to each dynamic host configuration protocol client, the method further comprises:
selecting a target host node from the plurality of host nodes, so that the steps of the network topology acquisition method are executed in the target host node; wherein the target host node is a host node whose internet protocol address remains unchanged;
acquiring the internet protocol address corresponding to the target host node;
sending the internet protocol address corresponding to the target host node to the common host nodes; the common host nodes are the remaining host nodes other than the target host node among the host nodes.
Fig. 6 is a schematic diagram of a network topology discovery operation process of a communication independent distributed heterogeneous computing system according to an embodiment of the present invention, and as shown in fig. 6, the network topology discovery operation process includes a preparation stage (a first stage), a power-up stage (a second stage), and an operation stage (a third stage).
For the node running the background service program, namely the target host node, in the preparation stage the background service program serving as the DHCP server needs to be started in advance on a chosen host node, whose IP address generally does not change; in addition, through a one-time configuration, the IP address of the host node running the background service program is made known to the other host nodes, so that they can later request the address information of the GPU nodes in the network from the background service program.
In the power-on stage (second stage), the DHCP clients in all GPU nodes automatically acquire the IP address of their node and register on the network through the DHCP protocol and the FINAL message (packed and encapsulated by the RLTL Protocol Engine). Besides the conventional DHCP protocol, this operation mainly depends on the frame type field and the source ID field of the RLTL protocol encapsulated in the FINAL message: the frame type field indicates that the message is a FINAL message, and the source ID field gives the unique ID of the GPU node sending the message. Meanwhile, the background service program completes the handshake of the DHCP protocol by means of a conventional network protocol stack (Network Protocol Stack). Because the RLTL protocol is a user-defined protocol at the user layer, the background service program can call a management library capable of parsing and encapsulating the RLTL protocol to process the received messages and learn the intention of the sender; from the FINAL messages it obtains the ID-IP address pairs of all GPU nodes in the network. The background service program then maintains state information for each GPU node, for example that the GPU node is available, busy, or unavailable, and network topology discovery is finally completed.
After the power-up phase, a working phase (third phase) is entered, which has different tasks for different nodes, so after the node tables are sent to the graphics processor nodes respectively, the method further comprises:
under the condition that a dynamic host configuration protocol broadcast message sent by a new dynamic host configuration protocol client is received, an internet protocol address is distributed to the new dynamic host configuration protocol client;
receiving information sent by a new dynamic host configuration protocol client;
updating each node table in the target graphic processor node into a node table containing new graphic processor node information corresponding to the new dynamic host configuration protocol client according to the information sent by the new dynamic host configuration protocol client; wherein the target graphics processor node is all graphics processor nodes or some of all graphics processor nodes.
a) For the host node running the background service program, the node monitors the network in real time during the working stage in order to add information about newly added GPU nodes (i.e., some GPU nodes may join the network after the whole network has completed initialization processes such as power-up).
In an implementation, to avoid network congestion, updating each node table in the target graphics processor node to a node table containing new graphics processor node information corresponding to the new dynamic host configuration protocol client according to information sent by the new dynamic host configuration protocol client includes:
starting from the moment the information sent by a new dynamic host configuration protocol client is first received, if such information is received a plurality of times within a first preset time period, acquiring the information sent by new dynamic host configuration protocol clients that is received within a second preset time period starting from the end of the first preset time period;
and updating each node table in the target graphic processor node into a node table containing new graphic processor node information corresponding to the new dynamic host configuration protocol client according to the information sent by the new dynamic host configuration protocol client in the first preset time period and the information sent by the new dynamic host configuration protocol client in the second preset time period.
The first preset duration and the second preset duration are not limited. In this embodiment, when the node table changes multiple times within a short time, it is not synchronized after each change; instead, the total change over a period of time is synchronized once, which avoids network congestion.
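The two-window batching described above can be sketched as a pure function over timestamped registrations. The function name `coalesce`, the tuple layout, and the choice to start a fresh first window after each batch are assumptions made for illustration; the embodiment does not fix these details.

```python
def coalesce(events, first_window, second_window):
    """Group registrations into batches that are each synchronized once.

    events -- list of (timestamp, node_id, ip) tuples sorted by timestamp
    A batch starts at the first unprocessed event. Everything arriving
    within `first_window` of that start joins the batch. If more than one
    event arrived in that window, the batch also absorbs everything that
    arrives within `second_window` after the first window ends.
    Returns a list of dicts mapping node ID to IP address, one per batch.
    """
    batches = []
    i = 0
    while i < len(events):
        first_end = events[i][0] + first_window
        batch = {}
        count = 0
        while i < len(events) and events[i][0] < first_end:
            _, node_id, ip = events[i]
            batch[node_id] = ip
            count += 1
            i += 1
        if count > 1:  # burst detected: keep collecting for the second window
            second_end = first_end + second_window
            while i < len(events) and events[i][0] < second_end:
                _, node_id, ip = events[i]
                batch[node_id] = ip
                i += 1
        batches.append(batch)
    return batches


# Three nodes join in a burst, a fourth joins much later (made-up values).
sample = [
    (0.0, 1, "10.0.0.11"),
    (0.5, 2, "10.0.0.12"),
    (1.5, 3, "10.0.0.13"),
    (10.0, 4, "10.0.0.14"),
]
batches = coalesce(sample, first_window=1.0, second_window=2.0)
```

With these sample values the burst of three registrations is sent as one update and the late joiner as another, so the GPU nodes receive two synchronization messages instead of four.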
b) For GPU nodes, in the working phase, if a GPU node receives a synchronization update message (identified by the frame type field of the RLTL protocol) from the background service program, the change information parsed by the RLTL Protocol Engine is saved in the node table inside the node.
In an implementation, the method further comprises:
acquiring a data packet which is transmitted by a common host node and is based on protocol encapsulation of a user datagram protocol; the data packet at least comprises the number of the graphic processor nodes to be requested and the information of the common host nodes;
determining the data packet to be returned to the common host node according to the content of the parsed data packet and the state information of the graphics processor nodes; the data packet to be returned at least comprises the internet protocol address of each graphics processor node and the unique code corresponding to the internet protocol address;
and sending the data packet to be returned to the common host node.
After sending the data packet to be returned to the common host node, the method further comprises:
and updating the node table in the target host node according to the data packet to be returned and acquiring the updated node table.
The tasks of the target host node in the working phase are described in a) above, and the tasks of the GPU nodes in b). In this embodiment, the task performed by a common host node in the working phase is as follows:
c) For a common host node, when a user of the host node needs to apply for a certain number of GPU node resources for computation, the node sends an RLTL-encapsulated packet to the IP address of the background service program, which was configured in the preparation stage. The frame type field of the RLTL protocol indicates the intention of the common host node, namely to request a certain number (assumed to be N) of GPU node resources, and the source IP address in the IP header tells the background service program where the request came from.
The background service program returns an RLTL-encapsulated packet to the requesting host. The payload field of the RLTL protocol in this packet stores the information of N GPU nodes in a certain format, including the IDs of the N devices and the IP addresses corresponding to those IDs. The background service program also updates the state information of the corresponding GPU nodes stored in its own node, marking them as in use together with the occupying party, to prevent other hosts from applying for them.
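The request handling in c) can be sketched as a small allocator over the per-node state information. The state names, the `GpuNode` record, and the choice to return an empty list when fewer than N nodes are free are assumptions; the embodiment does not say how an unsatisfiable request is answered.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical state labels mirroring "available", "busy" and "unavailable".
AVAILABLE, BUSY, UNAVAILABLE = "available", "busy", "unavailable"


@dataclass
class GpuNode:
    node_id: int
    ip: str
    state: str = AVAILABLE
    owner: Optional[str] = None  # IP address of the host occupying the node


def allocate(nodes, count, requester_ip):
    """Reserve `count` available GPU nodes for the requesting host.

    nodes -- dict mapping node ID to GpuNode, updated in place
    Returns a list of (node_id, ip) pairs for the reply payload, or an
    empty list if fewer than `count` nodes are available (assumed policy).
    """
    free = [node for node in nodes.values() if node.state == AVAILABLE]
    if len(free) < count:
        return []
    chosen = free[:count]
    for node in chosen:
        node.state = BUSY
        node.owner = requester_ip
    return [(node.node_id, node.ip) for node in chosen]


pool = {
    1: GpuNode(1, "10.0.0.11"),
    2: GpuNode(2, "10.0.0.12"),
    3: GpuNode(3, "10.0.0.13"),
}
granted = allocate(pool, 2, "10.0.1.5")
refused = allocate(pool, 2, "10.0.1.6")
```

Recording the requester's source IP address as the owner corresponds to the statement that the source IP address identifies the request source.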
In the method provided by the embodiment of the invention, the functions of encapsulating and parsing the custom RLTL protocol (located at the user layer) are integrated into the background service program of the upper host and into the GPU hardware, so that reliable transmission over the UDP protocol is realized and the method is highly compatible with deployments of existing mainstream network protocol stacks;
by using the ID field of the custom RLTL protocol, each GPU node within the network is given a unique identity, which corresponds one-to-one to the IP address allocated to that GPU node by the DHCP server of the upper host, enabling quick lookup of the IP address;
by integrating a power-on self-starting DHCP client in the network communication module dedicated to the GPU, in cooperation with the background service program of the upper host, the RLTL parsing engine of the GPU, and the node table, each GPU computing engine gains the capability of fully independent, autonomous communication in the network, can freely be networked as a single GPU computing engine, and is freed from the network topology limitation caused by the physical binding of PCIe to the CPU host.
The above describes a network topology acquisition method applied to a host node, and this embodiment also provides a network topology acquisition method applied to each graphics processor node. The network topology acquisition method provided in this embodiment is applied to each graphics processor node in a distributed computing system based on computing network convergence, where each node in the distributed computing system is a host node or a graphics processor node, the host node includes a dynamic host configuration protocol server, and a dynamic host configuration protocol client is integrated in a network communication module dedicated to the graphics processor node, and the method includes:
Transmitting a dynamic host configuration protocol broadcast message to a host node;
acquiring an Internet protocol address distributed by a host node; wherein, a plurality of internet protocol addresses are prestored in the dynamic host configuration protocol server;
transmitting information to a host node through a dynamic host configuration protocol client; the information at least comprises an internet protocol address and unique codes corresponding to the graphic processor nodes;
acquiring a node table sent by a host node; the node table is established for the host node according to the information sent by each dynamic host configuration protocol client;
a network topology in the distributed computing system is determined from the node table.
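The GPU-node-side steps above can be summarized in a small state-holding sketch. It models only the bookkeeping, not the DHCP exchange or the network I/O; the class name `GpuNodeClient`, its method names, and the interpretation of "determining the network topology" as listing the reachable peers are assumptions made for illustration.

```python
class GpuNodeClient:
    """Bookkeeping for one GPU node during registration (illustrative only)."""

    def __init__(self, node_id):
        self.node_id = node_id  # fixed unique code of this node
        self.ip = None          # filled in once the host assigns an address
        self.node_table = {}    # node ID -> IP address, as sent by the host

    def on_address_assigned(self, ip):
        """Store the IP address allocated by the host's DHCP server."""
        self.ip = ip

    def final_message(self):
        """Return the ID-IP pair to report to the host after address assignment."""
        if self.ip is None:
            raise RuntimeError("no IP address assigned yet")
        return {"src_id": self.node_id, "ip": self.ip}

    def on_node_table(self, table):
        """Keep a private copy of the node table sent by the host."""
        self.node_table = dict(table)

    def peers(self):
        """All other registered nodes, i.e. this node's view of the topology."""
        return {nid: ip for nid, ip in self.node_table.items() if nid != self.node_id}


client = GpuNodeClient(0x1001)
client.on_address_assigned("10.0.0.11")
report = client.final_message()
client.on_node_table({0x1001: "10.0.0.11", 0x1002: "10.0.0.12"})
```

After `on_node_table`, calling `client.peers()` yields the other GPU nodes the client can address directly by IP.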
The network topology acquisition method applied to each graphics processor node provided in this embodiment has the same or corresponding technical features as the network topology acquisition method applied to the host node described above, and the network topology acquisition method applied to the host node has been described in detail above, so that embodiments of the network topology acquisition method applied to each graphics processor node will not be repeated, and have the same advantages as the above-mentioned network topology acquisition method applied to the host node.
In the above embodiments, the detailed description is given to the network topology acquisition method, and the invention further provides a network topology acquisition device and a corresponding embodiment of the network topology acquisition equipment. It should be noted that the present invention describes an embodiment of the device portion from two angles, one based on the angle of the functional module and the other based on the angle of the hardware.
The embodiment provides a network topology acquisition device, which is applied to host nodes in a distributed computing system based on computing network integration, wherein each node in the distributed computing system is a host node or a graphic processor node, the host node comprises a dynamic host configuration protocol server, and a dynamic host configuration protocol client is integrated in a network communication module exclusive to the graphic processor node. Fig. 7 is a block diagram of a network topology acquiring apparatus according to an embodiment of the present invention. The embodiment is based on the angle of the functional module, and comprises:
an allocation module 10, configured to allocate an internet protocol address to each dynamic host configuration protocol client when receiving a dynamic host configuration protocol broadcast message sent by each dynamic host configuration protocol client; wherein, a plurality of internet protocol addresses are prestored in the dynamic host configuration protocol server;
A receiving module 11, configured to receive information sent by each dynamic host configuration protocol client; the information at least comprises an internet protocol address and unique codes corresponding to the graphic processor nodes;
the building module 12 is configured to build a node table according to the information sent by each dynamic host configuration protocol client;
and the sending module 13 is used for respectively sending the node tables to the graphics processor nodes so that the graphics processor nodes can determine the network topology structure in the distributed computing system according to the node tables.
Since the embodiments of the apparatus portion and the embodiments of the method portion correspond to each other, the embodiments of the apparatus portion are described with reference to the embodiments of the method portion, which are not repeated herein, and the effects are the same as above.
The receiving module 11 includes:
the first receiving module is used for receiving the information packaged by each dynamic host configuration protocol client through the protocol based on the user datagram protocol; the user datagram protocol-based protocol is a protocol set in the data content of the user datagram protocol, and at least comprises an initial source unique code, a target source unique code, a data transmission length and a checksum;
The setup module 12 includes:
the analysis and acquisition module is used for analyzing the information packaged by each dynamic host configuration protocol client through the protocol based on the user datagram protocol and acquiring the analyzed information;
the first establishing module is used for establishing a node table according to the analyzed information.
The setup module 12 includes:
the first determining module is used for determining the state information of the graphics processor node corresponding to each dynamic host configuration protocol client according to the information sent by each dynamic host configuration protocol client;
and the second building module is used for building a node table according to the information sent by each dynamic host configuration protocol client and the state information of each graphic processor node.
The distributed computing system comprises a plurality of host nodes; further comprises:
a selecting module, configured to select a target host node from the plurality of host nodes, so that the steps of the network topology acquisition method are executed in the target host node; wherein the target host node is a host node whose internet protocol address remains unchanged;
the first acquisition module is used for acquiring the internet protocol address corresponding to the target host node;
the first sending module is used for sending the internet protocol address corresponding to the target host node to the common host node; the common host nodes are the rest host nodes except the target host node in the host nodes.
Further comprises:
the distribution module is used for distributing the Internet protocol address to the new dynamic host configuration protocol client under the condition of receiving the dynamic host configuration protocol broadcast message sent by the new dynamic host configuration protocol client;
the second receiving module is used for receiving the information sent by the new dynamic host configuration protocol client;
the updating module is used for updating each node table in the target graphic processor node into a node table containing new graphic processor node information corresponding to the new dynamic host configuration protocol client according to the information sent by the new dynamic host configuration protocol client; wherein the target graphics processor node is all graphics processor nodes or some of all graphics processor nodes.
The updating module comprises:
the second obtaining module is used for, starting from the moment the information sent by a new dynamic host configuration protocol client is first received, if such information is received a plurality of times within the first preset time period, obtaining the information sent by new dynamic host configuration protocol clients that is received within the second preset time period starting from the end of the first preset time period;
And the first updating module is used for updating each node table in the target graphic processor node into a node table containing new graphic processor node information corresponding to the new dynamic host configuration protocol client according to the information sent by the new dynamic host configuration protocol client in the first preset time period and the information sent by the new dynamic host configuration protocol client in the second preset time period.
Further comprises:
a third obtaining module, configured to obtain a packet based on a protocol encapsulation of a user datagram protocol sent by a common host node; the data packet at least comprises the number of the graphic processor nodes to be requested and the information of the common host nodes;
the second determining module is used for determining the data packet to be returned to the common host node according to the content of the parsed data packet and the state information of the graphics processor nodes; the data packet to be returned at least comprises the internet protocol address of each graphics processor node and the unique code corresponding to the internet protocol address;
and the second sending module is used for sending the data packet to be returned to the common host node.
Further comprises:
and the second updating module is used for updating the node table in the target host node according to the data packet to be returned and acquiring the updated node table.
The embodiment also provides a network topology acquisition device, which is applied to each graphic processor node in a distributed computing system based on computing network integration, wherein each node in the distributed computing system is a host node or a graphic processor node, the host node comprises a dynamic host configuration protocol server, and a dynamic host configuration protocol client is integrated in a network communication module exclusive to the graphic processor node, and the device comprises:
a third sending module, configured to send a dynamic host configuration protocol broadcast message to a host node;
a fourth obtaining module, configured to obtain an internet protocol address allocated by the host node; wherein, a plurality of internet protocol addresses are prestored in the dynamic host configuration protocol server;
a fourth sending module, configured to send information to the host node through the dynamic host configuration protocol client; the information at least comprises an internet protocol address and unique codes corresponding to the graphic processor nodes;
a fifth obtaining module, configured to obtain a node table sent by the host node; the node table is established for the host node according to the information sent by each dynamic host configuration protocol client;
And the third determining module is used for determining the network topology structure in the distributed computing system according to the node table.
The network topology acquisition device provided in this embodiment has the same or corresponding technical features as the network topology acquisition method described above, and the effects are the same as above.
Fig. 8 is a block diagram of a network topology acquiring apparatus according to another embodiment of the present invention. The present embodiment is based on a hardware angle, and as shown in fig. 8, the network topology acquisition apparatus includes:
a memory 20 for storing a computer program;
a processor 21 for implementing the steps of the network topology acquisition method as mentioned in the above embodiments when executing a computer program.
Processor 21 may include one or more processing cores, such as a 4-core processor, an 8-core processor, etc. The processor 21 may be implemented in hardware in at least one of a digital signal processor (Digital Signal Processor, DSP), FPGA, programmable logic array (Programmable Logic Array, PLA). The processor 21 may also include a main processor, which is a processor for processing data in an awake state, also called CPU, and a coprocessor; a coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 21 may be integrated with a GPU for taking care of rendering and drawing of the content that the display screen is required to display. In some embodiments, the processor 21 may also include an artificial intelligence (Artificial Intelligence, AI) processor for processing computing operations related to machine learning.
Memory 20 may include one or more computer-readable storage media, which may be non-transitory. Memory 20 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In this embodiment, the memory 20 is at least used for storing a computer program 201, where the computer program, when loaded and executed by the processor 21, is capable of implementing the relevant steps of the network topology acquisition method disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 20 may further include an operating system 202, data 203, and the like, where the storage manner may be transient storage or permanent storage. The operating system 202 may include Windows, Unix, Linux, among others. The data 203 may include, but is not limited to, data related to the above-mentioned network topology acquisition method, and the like.
In some embodiments, the network topology acquisition device can further include a display 22, an input/output interface 23, a communication interface 24, a power supply 25, and a communication bus 26.
Those skilled in the art will appreciate that the architecture shown in fig. 8 is not limiting of the network topology acquisition device and may include more or fewer components than shown.
The network topology acquisition device provided by the embodiment of the invention comprises a memory and a processor; when executing the program stored in the memory, the processor can implement the network topology acquisition method described above, with the same effects.
Finally, the invention also provides a corresponding embodiment of the computer readable storage medium. The computer-readable storage medium stores a computer program that, when executed by a processor, performs the steps described in the above-described method embodiments (the method may be a method corresponding to a host node side, a method corresponding to a graphics processor node side, or a method corresponding to a host node side and a graphics processor node side).
It will be appreciated that the methods of the above embodiments, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored on a computer readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in whole or in part, may be embodied in the form of a software product stored in a storage medium for performing all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
The computer readable storage medium provided by the invention stores a program implementing the network topology acquisition method described above, and the effects are the same as above.
The method, the device, the equipment and the medium for acquiring the network topology structure provided by the invention are described in detail above. In the description, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and identical or similar parts among the embodiments may be referred to one another. Since the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively brief, and the relevant points can be found in the description of the method section. It should be noted that it will be apparent to those skilled in the art that various modifications and adaptations of the invention can be made without departing from the principles of the invention, and these modifications and adaptations are intended to be within the scope of the invention as defined in the following claims.
It should also be noted that in this specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

Claims (12)

1. A network topology acquisition method, characterized in that it is applied to a host node in a distributed computing system based on computing-network integration, wherein each node in the distributed computing system is either the host node or a graphics processor node, the host node contains a dynamic host configuration protocol server, and a dynamic host configuration protocol client is integrated in a network communication module dedicated to the graphics processor node, the method comprising:
upon receiving a dynamic host configuration protocol broadcast message sent by each dynamic host configuration protocol client, allocating an internet protocol address to each dynamic host configuration protocol client; wherein a plurality of internet protocol addresses are pre-stored in the dynamic host configuration protocol server;
receiving information sent by each dynamic host configuration protocol client; wherein the information comprises at least the internet protocol address and a unique code corresponding to each graphics processor node;
establishing a node table according to the information sent by each dynamic host configuration protocol client;
and sending the node table to each graphics processor node, so that each graphics processor node determines the network topology of the distributed computing system according to the node table.
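As an illustration only, the host-side steps of claim 1 can be sketched in Python as follows. The class name `HostNode`, its method names, the use of the unique code as the client identifier, and the `10.0.0.0/29` address pool are all assumptions made for this sketch; the claim does not prescribe any data structure or address range, and real DHCP message handling is omitted.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class HostNode:
    """Hypothetical host-side model of the steps in claim 1."""
    pool: list                                       # pre-stored IP addresses
    leases: dict = field(default_factory=dict)       # client id -> assigned IP
    node_table: dict = field(default_factory=dict)   # unique code -> IP (as text)

    def on_dhcp_broadcast(self, client_id: str):
        """Assign one pre-stored address per client; repeat broadcasts reuse it."""
        if client_id not in self.leases:
            if not self.pool:
                raise RuntimeError("address pool exhausted")
            self.leases[client_id] = self.pool.pop(0)
        return self.leases[client_id]

    def on_client_info(self, unique_code: str, ip) -> None:
        """Record the (unique code, IP) pair reported by a client."""
        self.node_table[unique_code] = str(ip)

    def tables_to_send(self) -> dict:
        """Return one independent copy of the node table per GPU node."""
        return {code: dict(self.node_table) for code in self.node_table}


# Assumed example pool and client identifiers, for illustration only.
host = HostNode(pool=list(ipaddress.ip_network("10.0.0.0/29").hosts()))
for code in ("gpu-a", "gpu-b"):
    ip = host.on_dhcp_broadcast(code)   # address assignment
    host.on_client_info(code, ip)       # collect info and build the table
print(host.tables_to_send()["gpu-a"])
```

In this sketch every graphics processor node receives the same table, which is what lets each node see all of its peers.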
2. The network topology acquisition method of claim 1, wherein said receiving information sent by each dynamic host configuration protocol client comprises:
receiving the information encapsulated by each dynamic host configuration protocol client using a protocol based on the user datagram protocol; wherein the protocol based on the user datagram protocol is a protocol carried in the data content of the user datagram protocol and comprises at least an initial source unique code, a target source unique code, a data transmission length and a checksum;
correspondingly, said establishing a node table according to the information sent by each dynamic host configuration protocol client comprises:
parsing the information encapsulated by each dynamic host configuration protocol client using the protocol based on the user datagram protocol, and obtaining the parsed information;
and establishing the node table according to the parsed information.
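The encapsulation and parsing described in claim 2 can be illustrated with a small framing sketch. The claim names the four header fields but not their widths, order, or checksum algorithm, so the layout below (two 32-bit unique codes, a 16-bit length, a 16-bit additive checksum over the payload, network byte order) is purely an assumption, as are the function names.

```python
import struct

# Assumed header layout: source code, target code, payload length, checksum.
HEADER = struct.Struct("!IIHH")


def checksum16(data: bytes) -> int:
    """Assumed checksum: byte sum of the payload truncated to 16 bits."""
    return sum(data) & 0xFFFF


def encapsulate(src_code: int, dst_code: int, payload: bytes) -> bytes:
    """Build the bytes that would be placed in the UDP data field."""
    return HEADER.pack(src_code, dst_code, len(payload), checksum16(payload)) + payload


def parse(datagram: bytes):
    """Validate length and checksum, then return (src_code, dst_code, payload)."""
    if len(datagram) < HEADER.size:
        raise ValueError("datagram shorter than header")
    src_code, dst_code, length, check = HEADER.unpack_from(datagram)
    payload = datagram[HEADER.size:HEADER.size + length]
    if len(payload) != length or checksum16(payload) != check:
        raise ValueError("length or checksum mismatch")
    return src_code, dst_code, payload


message = encapsulate(0x0A, 0x01, b"10.0.0.1")
print(parse(message))
```

Carrying the length and checksum in the payload lets the host reject truncated or corrupted reports before they reach the node table.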
3. The network topology acquisition method of claim 2, wherein said establishing a node table according to the information sent by each dynamic host configuration protocol client comprises:
determining state information of the graphics processor node corresponding to each dynamic host configuration protocol client according to the information sent by each dynamic host configuration protocol client;
and establishing the node table according to the information sent by each dynamic host configuration protocol client and the state information of each graphics processor node.
4. The network topology acquisition method of any one of claims 1 to 3, wherein the distributed computing system comprises a plurality of host nodes; and before said allocating an internet protocol address to each dynamic host configuration protocol client, the method further comprises:
selecting a target host node from the plurality of host nodes, so that the network topology acquisition method is executed on the target host node; wherein the target host node is a host node whose internet protocol address remains unchanged;
acquiring the internet protocol address corresponding to the target host node;
and transmitting the internet protocol address corresponding to the target host node to the common host nodes; wherein the common host nodes are the host nodes other than the target host node among the plurality of host nodes.
5. The network topology acquisition method of claim 4, further comprising, after said sending the node table to each graphics processor node:
upon receiving a dynamic host configuration protocol broadcast message sent by a new dynamic host configuration protocol client, allocating an internet protocol address to the new dynamic host configuration protocol client;
receiving information sent by the new dynamic host configuration protocol client;
and updating, according to the information sent by the new dynamic host configuration protocol client, the node table in each target graphics processor node to a node table containing the information of the new graphics processor node corresponding to the new dynamic host configuration protocol client; wherein the target graphics processor nodes are all of the graphics processor nodes or a subset of the graphics processor nodes.
6. The network topology acquisition method of claim 5, wherein said updating, according to the information sent by the new dynamic host configuration protocol client, the node table in each target graphics processor node to a node table containing the information of the new graphics processor node corresponding to the new dynamic host configuration protocol client comprises:
timing from the moment the information sent by the new dynamic host configuration protocol client is first received, and, if information sent by new dynamic host configuration protocol clients is received multiple times within a first preset time period, acquiring the information sent by new dynamic host configuration protocol clients that is received within a second preset time period starting from the end of the first preset time period;
and updating the node table in each target graphics processor node to a node table containing the information of the new graphics processor nodes corresponding to the new dynamic host configuration protocol clients, according to the information received within the first preset time period and the information received within the second preset time period.
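The two-window batching of claim 6 can be sketched as a pure function over timestamped reports. The function name `collect_batch`, the `(timestamp, info)` event shape, and the half-open window boundaries are assumptions for this sketch; the claim does not say how boundary instants are treated, nor what "multiple times" means beyond more than once.

```python
def collect_batch(events, first_period, second_period):
    """Select the reports that go into a single node-table update.

    events: list of (timestamp, info) tuples sorted by timestamp, where the
    first entry is the first report received from a new client.
    """
    if not events:
        return []
    start = events[0][0]
    first_end = start + first_period
    second_end = first_end + second_period

    # Reports that arrive during the first preset time period.
    first = [info for ts, info in events if start <= ts < first_end]
    if len(first) < 2:
        # Only one report: no burst of joins, so update immediately.
        return first

    # Several reports suggest a burst, so keep listening for the second
    # preset time period and fold those reports into the same update.
    second = [info for ts, info in events if first_end <= ts < second_end]
    return first + second


events = [(0.0, "gpu-a"), (0.4, "gpu-b"), (1.5, "gpu-c"), (3.5, "gpu-d")]
print(collect_batch(events, first_period=1.0, second_period=2.0))
```

Batching this way means that a group of nodes joining together triggers one node-table update rather than one update per node.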
7. The network topology acquisition method of claim 4, further comprising:
acquiring a data packet that is transmitted by the common host node and encapsulated using the protocol based on the user datagram protocol; wherein the data packet comprises at least the number of graphics processor nodes requested and information about the common host node;
determining a data packet to be returned to the common host node according to the parsed content of the data packet and the state information of the graphics processor nodes; wherein the data packet to be returned comprises at least the internet protocol address of each graphics processor node and the unique code corresponding to that internet protocol address;
and sending the data packet to be returned to the common host node.
8. The network topology acquisition method of claim 7, further comprising, after said sending the data packet to be returned to the common host node:
updating the node table in the target host node according to the data packet to be returned, and obtaining the updated node table.
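A minimal sketch of the request handling in claims 7 and 8 follows. The state values `"idle"` and `"busy"`, the `owner` field, the table layout, and the choice to grant fewer nodes than requested when not enough are idle are all assumptions; the claims only state that the reply is derived from the request and the state information, and that the node table is updated afterwards.

```python
def grant_gpu_nodes(node_table: dict, requester: str, count: int) -> list:
    """Pick up to `count` idle GPU nodes for a common host node.

    node_table maps unique code -> {"ip": str, "state": str}. Granted
    entries are marked busy in place, which models the node-table update
    of claim 8.
    """
    granted = []
    for code, entry in node_table.items():
        if len(granted) == count:
            break
        if entry["state"] == "idle":
            entry["state"] = "busy"      # assumed state transition
            entry["owner"] = requester   # hypothetical bookkeeping field
            granted.append({"unique_code": code, "ip": entry["ip"]})
    return granted


table = {
    "gpu-a": {"ip": "10.0.0.1", "state": "busy"},
    "gpu-b": {"ip": "10.0.0.2", "state": "idle"},
    "gpu-c": {"ip": "10.0.0.3", "state": "idle"},
}
reply = grant_gpu_nodes(table, requester="host-2", count=1)
print(reply, table["gpu-b"]["state"])
```

The returned list corresponds to the payload of the data packet to be returned: one internet protocol address and its unique code per granted node.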
9. A network topology acquisition method, characterized in that it is applied to each graphics processor node in a distributed computing system based on computing-network integration, wherein each node in the distributed computing system is either a host node or the graphics processor node, the host node contains a dynamic host configuration protocol server, and a dynamic host configuration protocol client is integrated in a network communication module dedicated to the graphics processor node, the method comprising:
transmitting a dynamic host configuration protocol broadcast message to the host node;
acquiring an internet protocol address allocated by the host node; wherein a plurality of internet protocol addresses are pre-stored in the dynamic host configuration protocol server;
transmitting information to the host node through the dynamic host configuration protocol client; wherein the information comprises at least the internet protocol address and a unique code corresponding to each graphics processor node;
acquiring a node table sent by the host node; wherein the node table is established by the host node according to the information sent by each dynamic host configuration protocol client;
and determining the network topology of the distributed computing system according to the node table.
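On the graphics processor node side, the last step of claim 9 can be pictured as deriving a view of the peers from the received node table. The claim does not spell out what "determining the network topology" produces, so this sketch simply assumes a flat view in which every other table entry is a directly addressable peer; the function name and table layout are hypothetical.

```python
def peers_from_node_table(node_table: dict, own_code: str) -> dict:
    """Return unique code -> IP for every node in the table except this one."""
    if own_code not in node_table:
        raise KeyError(f"{own_code} is missing from the node table")
    return {code: ip for code, ip in node_table.items() if code != own_code}


received = {"gpu-a": "10.0.0.1", "gpu-b": "10.0.0.2", "gpu-c": "10.0.0.3"}
print(peers_from_node_table(received, own_code="gpu-b"))
```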
10. A network topology acquisition device, characterized in that it is applied to a host node in a distributed computing system based on computing-network integration, wherein each node in the distributed computing system is either the host node or a graphics processor node, the host node contains a dynamic host configuration protocol server, and a dynamic host configuration protocol client is integrated in a network communication module dedicated to the graphics processor node, the device comprising:
an allocation module, configured to allocate an internet protocol address to each dynamic host configuration protocol client upon receiving a dynamic host configuration protocol broadcast message sent by each dynamic host configuration protocol client; wherein a plurality of internet protocol addresses are pre-stored in the dynamic host configuration protocol server;
a receiving module, configured to receive information sent by each dynamic host configuration protocol client; wherein the information comprises at least the internet protocol address and a unique code corresponding to each graphics processor node;
an establishing module, configured to establish a node table according to the information sent by each dynamic host configuration protocol client;
and a sending module, configured to send the node table to each graphics processor node, so that each graphics processor node determines the network topology of the distributed computing system according to the node table.
11. A network topology acquisition device, comprising:
a memory for storing a computer program;
a processor, configured to implement the steps of the network topology acquisition method according to any one of claims 1 to 9 when executing the computer program.
12. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the network topology acquisition method of any of claims 1 to 9.
CN202310573372.2A 2023-05-19 2023-05-19 Network topology acquisition method, device, equipment and medium Pending CN116389280A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310573372.2A CN116389280A (en) 2023-05-19 2023-05-19 Network topology acquisition method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310573372.2A CN116389280A (en) 2023-05-19 2023-05-19 Network topology acquisition method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN116389280A true CN116389280A (en) 2023-07-04

Family

ID=86964221

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310573372.2A Pending CN116389280A (en) 2023-05-19 2023-05-19 Network topology acquisition method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN116389280A (en)

Similar Documents

Publication Publication Date Title
CN107070691B (en) Cross-host communication method and system of Docker container
CN111542064B (en) Container arrangement management system and arrangement method for wireless access network
CN110113441B (en) Computer equipment, system and method for realizing load balance
CN114024880B (en) Network target range probe acquisition method and system based on proxy IP and flow table
CN105993161B (en) Element, method, system and computer readable storage device for resolving an address
EP1892929A1 (en) A method, an apparatus and a system for message transmission
CN106789606B (en) Network communication system, management method and communication method thereof
CN112631788B (en) Data transmission method and data transmission server
CN107147580B (en) Tunnel establishment method and communication system
JP2017503405A (en) Method, switch and controller for processing address resolution protocol messages
CN114070723A (en) Virtual network configuration method and system of bare metal server and intelligent network card
CN112769959B (en) Session synchronization method, device, first node, second node, system and medium
US20230091501A1 (en) Port status configuration method, apparatus, and system, and storage medium
CN110636149B (en) Remote access method, device, router and storage medium
WO2017219777A1 (en) Packet processing method and device
CN116389280A (en) Network topology acquisition method, device, equipment and medium
CN108353017B (en) Computing system and method for operating multiple gateways on a multi-gateway virtual machine
CN112511440B (en) Message forwarding method, system, storage medium and electronic equipment
CN112953858A (en) Message transmission method in virtual network, electronic device and storage medium
CN113489775A (en) VPP-based seven-layer load balancing server and load balancing method
CN107968846A (en) Networking processing method and processing device
CN107295113B (en) Network configuration method, switch and server
CN116132435B (en) Double-stack cross-node communication method and system of container cloud platform
US12052173B2 (en) Executing workloads across multiple cloud service providers
CN117014636B (en) Data stream scheduling method of audio and video network, storage medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination