CN112187523A - Network high-availability implementation method and super-convergence system - Google Patents

Publication number
CN112187523A
Authority
CN
China
Prior art keywords
flow table
local flow
server
network
server nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010946006.3A
Other languages
Chinese (zh)
Inventor
黄茂峰
杨帅麒
雷准富
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huayun Data Holding Group Co Ltd
Original Assignee
Huayun Data Holding Group Co Ltd

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/54 Organization of routing tables
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes

Abstract

The invention provides a network high-availability implementation method and a super-convergence system. The method comprises the following steps: configuring local flow tables for the virtual network switches of at least two server nodes; performing instant synchronization of the local flow tables between the server nodes, so that each local flow table is copied to the peer server node to form a peer flow table; and, in a server node that has not failed, integrating the peer flow table received from the failed peer node with the existing local flow table of that server node to form a new local flow table. The disclosed method and super-convergence system achieve rapid virtual network recovery for all server nodes in the whole super-convergence system and significantly reduce the system resources consumed during virtual network recovery.

Description

Network high-availability implementation method and super-convergence system
Technical Field
The invention relates to the technical field of cloud computing, and in particular to a network high-availability implementation method and a super-convergence system.
Background
A super-convergence system is based on a Hyper-Converged Infrastructure (HCI). In such a system, a single set of unit devices provides not only resources and technologies such as computation, network, storage and server virtualization, but also elements such as backup software, snapshot technology, data de-duplication and online data compression. Multiple sets of unit devices can be aggregated over the network to achieve modular, seamless horizontal expansion (scale-out) and form a unified resource pool. A super-convergence all-in-one machine usually contains at least three physical hosts (i.e., "super-convergence nodes", also called "super-fusion nodes" or simply "nodes"), and control nodes, storage nodes, network nodes and computing nodes are defined on these hosts.
A super-convergence system generally comprises a plurality of super-convergence nodes (namely server nodes) and uses a Software Defined Network (SDN) to implement virtual network switches, distributed virtual routing, interconnection among virtual machines and between virtual machines and external networks, and efficient data exchange. A cloud management platform manages the network switches, floating IPs and distributed virtual routers in a unified way, including operations such as creation, deletion, query and modification. When one super-convergence node suffers a network fault, the virtualized network must be restored immediately to achieve high availability of the network.
The applicant has found that the Chinese patent with publication number CN107257295A discloses a scheduling method for a distributed-architecture software-defined network controller. That prior art concerns network information synchronization between computing nodes of a cloud computing system: the software-defined network controller (SDN controller) in a computing node updates a local topology information table according to the port information of the virtual switches that connect or disconnect, and synchronizes the updated topology information to the SDN controllers of the other computing nodes. The applicant points out that this prior art addresses synchronization between computing nodes of a cloud computing system caused by changes in network information, and is therefore not suitable for a super-convergence system. Moreover, the prior art neither solves the problem of how to recover from a network fault when a super-convergence node in the super-convergence system fails nor provides a corresponding solution.
Therefore, the prior-art methods for implementing network high availability in a super-convergence system, and the super-convergence system itself, need to be improved to solve the above problems.
Disclosure of Invention
The invention aims to disclose a network high-availability realization method and a super-fusion system, which are used for realizing the rapid recovery of a virtualized network when a certain server node in the super-fusion system fails so as to realize the high availability of the network and reduce system resources consumed in the process of recovering the virtualized network.
In order to achieve the first object, the present invention provides a method for implementing high availability of a network, comprising the following steps:
S1, configuring local flow tables for the virtual network switches of at least two server nodes;
S2, performing instant synchronization of the local flow tables between the server nodes, so that each local flow table is copied to the peer server node to form a peer flow table;
S3, in a server node that has not failed, integrating the peer flow table received from the failed peer node with the existing local flow table of that server node to form a new local flow table.
As a further improvement of the present invention, in step S3, the operation of integrating, in the non-failed server node, the peer flow table received from the failed peer node with the existing local flow table of the non-failed server node is specifically: modifying and ordering the peer flow table and the local flow table.
As a further improvement of the present invention, step S2 performs the instant synchronization of the local flow tables pre-configured in the server nodes on the premise that neither of the two server nodes that copy their local flow tables to each other has failed.
As a further improvement of the present invention, the new local flow table contains all local flow tables of all server nodes, and the fault includes the server node going down, losing power, or suffering a network interruption.
As a further improvement of the invention, the network high availability implementation method is applied to a super-convergence system consisting of three or more server nodes.
As a further improvement of the present invention, step S2 specifically includes: selecting one non-failed server node from the plurality of server nodes, and performing instant synchronization of the local flow tables in the other, unselected server nodes, so that these local flow tables are copied one by one to the selected server node to form peer flow tables.
As a further improvement of the present invention, step S2 further includes: integrating the peer flow tables that were copied one by one to the selected server node with the existing local flow table of the non-failed server node to form a new local flow table, and synchronously copying the new local flow table to the other server nodes.
As a further improvement of the present invention, after step S3 is completed, the method further includes: migrating the virtual machines in the failed server node, according to the new local flow table, to the server node in which the new local flow table is formed.
As a further improvement of the invention, the server nodes are jointly connected to the switch through their respectively configured network cards.
Based on the same inventive concept, the application also discloses a super-convergence system, which comprises: two or more server nodes and a switch, wherein each server node is configured with a network card connected to the switch;
the super-convergence system runs the network high-availability implementation method according to any one of the above inventions.
Compared with the prior art, the invention has the beneficial effects that:
In the application, while none of the server nodes has failed, the local flow tables configured in the server nodes are synchronized to the peer server node; after a failure, they are integrated with the local flow table in the selected non-failed server node to form a new local flow table. The new local flow table is used to quickly restore the virtual network of all server nodes in the whole super-convergence system, and the network service of the virtual machines migrated, according to the new local flow table, to the server node that forms it is unaffected. Finally, since only the local flow table needs to be synchronized instantly to the peer server node to form the peer flow table, the system resources consumed in recovering the virtualized network are significantly reduced.
Drawings
FIG. 1 is an overall flow chart of a network high availability implementation method of the present invention;
FIG. 2 is a schematic diagram illustrating instant synchronization of local flow tables between server nodes when neither of the two server nodes configured in the super-convergence system has failed;
FIG. 3 is a schematic diagram illustrating that, when one of the two server nodes configured in the super-convergence system fails, the peer flow table that has been instantly copied to the peer node is integrated with the local flow table in the non-failed server node to form a new local flow table;
FIG. 4 is a schematic diagram of a variant of the network high availability implementation method in a super-convergence system including three server nodes, where server node 1 is the pre-selected non-failed server node and server nodes 1 to 3 are each configured with a local flow table;
FIG. 5 is a schematic diagram illustrating that, when server node 2 in FIG. 4 fails, the local flow tables in the other, unselected server nodes (i.e., server node 2 and server node 3) are instantly synchronized, so that they are copied one by one to the selected server node (server node 1) to form peer flow tables;
FIG. 6 is a schematic diagram of integrating the local flow table in server node 1 with the peer flow tables to form a new local flow table;
FIG. 7 is a schematic diagram of synchronously copying the new local flow table formed in server node 1 to the other server nodes and restarting the virtual machines deployed in server nodes 2 and 3 during network recovery.
Detailed Description
The present invention is described in detail with reference to the embodiments shown in the drawings, but it should be understood that these embodiments are not intended to limit the present invention, and those skilled in the art should understand that functional, methodological, or structural equivalents or substitutions made by these embodiments are within the scope of the present invention.
Before the embodiments of the present application are described in detail, it is necessary to describe and define the technical terms used in them.
The term "Flow Table": the flow table is OpenFlow's abstraction of the data forwarding function of a network device. In a conventional network device, data forwarding by switches and routers relies on the layer-2 MAC address forwarding table or the layer-3 IP routing table stored in the device. The flow table used in an OpenFlow switch serves the same purpose, but its entries integrate the network configuration information of every layer of the network, so richer rules can be used for data forwarding. Here the flow table is embodied as the data entries a virtual network switch needs when forwarding data in the virtual network. Referring to FIGS. 2 to 7, in the embodiments of the present application a flow table is abbreviated as "Flow".
The term "vSwitch": a virtual network switch; "vSwitch" is merely a subordinate, or specific, concept of the virtual network switch.
The term "VM": a virtual machine, i.e., a fully functional computer system in a virtual state that serves as the service entity responding to access requests initiated by users and provides cloud computing services to users.
The term "Local Flows": the local flow table, i.e., the routing and forwarding rules that the virtual network switch needs in order to provide network services for the virtual machines of the local node.
The term "Peer Flows": the peer flow table, i.e., the copy formed when the local flow table of the local node is instantly synchronized to the peer node.
The terms "local node" and "peer node" denote nodes that play the same role in a computer cluster or a super-convergence system; each can generally be understood as a physical node, i.e., a server node consisting of a physical server. The term "node" has the same technical meaning as the term "server node" or "super-fusion node". Referring to FIG. 2, when server node A and server node B are both operating normally, the local flow table in server node A containing Flow01 and Flow02 is instantly synchronized to server node B to form a peer flow table containing Flow01 and Flow02. In this instant copying process, server node A is the local node and server node B is the peer node. Conversely, the local flow table in server node B containing Flow03 and Flow04 is instantly synchronized to server node A to form a peer flow table containing Flow03 and Flow04; in this process, server node B is the local node and server node A is the peer node. "Local node" and "peer node" are thus relative terms: the configurations of the two nodes may or may not be identical, and the roles are determined only by the logical direction in which the instant copy of the local flow table is initiated.
Embodiment 1:
Referring to FIGS. 1 to 3, a specific embodiment of a network high availability implementation method of the present invention is shown.
The local flow table and the peer flow table each consist of basic fields, condition fields and action fields. vSwitch23 and vSwitch33 forward data packets based on the local flow table and the peer flow table, or on the integrated new local flow table. The basic fields include: the effective time duration_sec, the table to which the entry belongs table_id, the priority priority, the number of processed data packets n_packets, the idle timeout idle_timeout, and so on. The condition fields include: the input port number in_port, the output port number out_port, the source and destination MAC addresses dl_src/dl_dst, the source and destination IP addresses nw_src/nw_dst, the data packet type dl_type, the network-layer protocol type nw_proto, and so on. These fields may be combined arbitrarily, subject to one restriction: when a lower-layer field in the network hierarchy is not given a definite value, a higher-layer field must not be given a definite value. That is, a flow rule may give a lower-layer field a definite value while leaving a higher-layer field as a wildcard (an unspecified field matches any value), but it may not give a higher-layer field a definite value while leaving the lower-layer field as a wildcard; otherwise all flow rules in ovs-vswitchd are lost and the network cannot be connected. The action fields include: normal forwarding normal, output to a given switch port output:port, dropping drop, changing the source and destination MAC addresses mod_dl_src/mod_dl_dst, and so on. One flow rule may have several actions, which are executed in the specified order.
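The three field groups and the layering restriction can be illustrated with a minimal Python sketch. The class name FlowEntry and the function violates_layer_rule are hypothetical and not part of the application. The check encodes only one simplified reading of the restriction, namely that a network-layer field (nw_src, nw_dst, nw_proto) may be specified only when the link-layer field dl_type is also specified.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FlowEntry:
    """Hypothetical model of one flow; None in a condition field means wildcard."""
    # Basic fields
    table_id: int = 0
    priority: int = 0
    idle_timeout: int = 0
    # Condition (match) fields
    in_port: Optional[int] = None
    dl_src: Optional[str] = None
    dl_dst: Optional[str] = None
    dl_type: Optional[int] = None
    nw_src: Optional[str] = None
    nw_dst: Optional[str] = None
    nw_proto: Optional[int] = None
    # Action fields, executed in the listed order
    actions: List[str] = field(default_factory=list)


def violates_layer_rule(flow: FlowEntry) -> bool:
    """Simplified assumption: a network-layer match needs dl_type to be set."""
    upper_layer = (flow.nw_src, flow.nw_dst, flow.nw_proto)
    upper_specified = any(value is not None for value in upper_layer)
    return upper_specified and flow.dl_type is None


# Lower-layer field specified, higher-layer field specified: allowed.
ok_flow = FlowEntry(dl_type=0x0800, nw_src="10.0.0.1", actions=["normal"])
# Higher-layer field specified while the lower-layer field is a wildcard: rejected.
bad_flow = FlowEntry(nw_src="10.0.0.1", actions=["drop"])
```

A validation step of this kind could run before a flow is written to the virtual network switch, so that a malformed rule is rejected early.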
Referring to FIG. 1, the network high availability implementation method disclosed in this embodiment includes: configuring local flow tables for the virtual network switches of at least two server nodes; performing instant synchronization of the local flow tables between the server nodes, so that each local flow table is copied to the peer server node to form a peer flow table; and, in a server node that has not failed, integrating the peer flow table received from the failed peer node with the existing local flow table of that server node to form a new local flow table. The method can be applied to a super-convergence system and comprises the following steps.
First, step S1 is executed to configure local flow tables for the virtual network switches of at least two server nodes.
As shown in FIG. 2, the super-convergence system includes a server node 1 and a server node 2. Server node 1 is configured with a network card 20 connected to the switch 10, and server node 2 is configured with a network card 30 connected to the switch 10. Server node 1 is provided with a virtual network switch 23 (vSwitch) connected to the network card 20, and the virtual network switch 23 connects VM21 and VM22. Server node 2 is provided with a virtual network switch 33 (vSwitch) connected to the network card 30, and the virtual network switch 33 connects VM31 and VM32. vSwitch23 provides virtualized network services for VM21 and VM22 in server node 1, and vSwitch33 provides virtualized network services for VM31 and VM32 in server node 2. Server node 1 is configured with a local flow table (Local Flows) including Flow01 and Flow02, where Flow01 corresponds to VM21 and Flow02 corresponds to VM22; server node 2 is configured with a local flow table (Local Flows) including Flow03 and Flow04, where Flow03 corresponds to VM31 and Flow04 corresponds to VM32. The server nodes are jointly connected to the switch 10 through their respective network cards 20 and 30. The switch 10 is a layer-2 physical switch.
Then, step S2 is executed: the local flow tables are instantly synchronized (sync) between the server nodes, so that each local flow table is copied to the peer server node to form a peer flow table. Step S2 performs the instant synchronization of the local flow tables pre-configured in the server nodes on the premise that neither of the two server nodes that copy their local flow tables to each other has failed. Generally, step S2 is executed once the super-convergence system has been deployed and starts running, so that the network remains highly available when a server node subsequently fails.
Referring to FIG. 2, the synchronization is performed while both server node 1 and server node 2 are normal (OK). The Local Flows in server node 1 are instantly synchronized to server node 2 to form a peer flow table (Peer Flows) containing Flow01 and Flow02. Similarly, the Local Flows in server node 2 are instantly synchronized to server node 1 to form a peer flow table (Peer Flows) containing Flow03 and Flow04. Flow01 to Flow04 are all transferred via the network cards and the switch 10 described above. In this embodiment, while both servers are normal (OK), the local flow table configured in each server node only needs to be instantly and synchronously copied to the peer node to form the peer flow table (Peer Flows). The instant copying of the local flow table (Local Flows) is a transfer of text-level data, so the system resources consumed by the instant synchronization are very small and can essentially be ignored.
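The bidirectional synchronization of step S2 can be sketched in Python as follows. The class ServerNode and the function sync_local_flows are hypothetical illustrations; flows are reduced to their names, and the transfer over the network cards and the switch is replaced by an in-memory copy.

```python
import copy


class ServerNode:
    """Hypothetical in-memory model of a server node and its flow tables."""

    def __init__(self, name, local_flows):
        self.name = name
        self.healthy = True                  # OK / NG state of the node
        self.local_flows = list(local_flows)
        self.peer_flows = []                 # filled by synchronization


def sync_local_flows(src, dst):
    """Copy src's local flow table to dst, where it becomes the peer flow table."""
    # Step S2 presupposes that neither of the two nodes has failed.
    if not (src.healthy and dst.healthy):
        raise RuntimeError("instant synchronization requires two healthy nodes")
    dst.peer_flows = copy.deepcopy(src.local_flows)


node1 = ServerNode("server node 1", ["Flow01", "Flow02"])
node2 = ServerNode("server node 2", ["Flow03", "Flow04"])

# Both directions, as in FIG. 2.
sync_local_flows(node1, node2)
sync_local_flows(node2, node1)
```

The deep copy stands in for the text-level data transfer described above: only the flow definitions move between nodes, not the virtual machines themselves.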
Then, step S3 is executed: in the server node that has not failed, the peer flow table received from the failed peer node is integrated with the existing local flow table of that server node to form a new local flow table. The new local flow table includes four flows: Flow01, Flow02, Flow03 and Flow04.
Referring to fig. 3, in the present embodiment, if the server node 1 fails (NG), the VM21 and the VM22 are not available. At this time, the server node 2 operates normally. The server node 2 integrates the peer Flow table synchronized instantly to the local node (i.e., the server node 2) and containing Flow01 and Flow02 with the local Flow table in the server node 2 and containing Flow03 and Flow04 to form a new local Flow table. The new local flow table contains all the local flow tables configured in all the server nodes in the entire super-converged system before the execution of step S2.
Meanwhile, in step S3, the operation of integrating, in the non-failed server node (server node 2), the peer flow table (Peer Flows) received from the failed peer node (server node 1) with the existing local flow table (Local Flows) of the non-failed server node is specifically: modifying and ordering the peer flow table and the local flow table.
The integration of Local Flows and Peer Flows in the non-failed server node (i.e., server node 2) proceeds as follows: each flow in Peer Flows is localized in turn, in the order in which it appears in Peer Flows, and the Peer Flows that server node 1 instantly synchronized to server node 2 are then inserted into the Local Flows of server node 2 according to the insertion logic described below.
Localization means modifying several condition fields in each peer flow. These condition fields originally hold network information of server node 1 and are modified to hold network information of server node 2; for example, the input port number and the output port number in the condition fields are changed to the corresponding port numbers on the virtual network switch of server node 2 (vSwitch33).
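A minimal Python sketch of the localization step is given below. The function localize and the mapping port_map are hypothetical: the application does not state how the corresponding port numbers on the surviving node are determined, so the sketch simply assumes a lookup table from port numbers on the failed node's virtual network switch to port numbers on the surviving node's virtual network switch.

```python
def localize(flow, port_map):
    """Return a copy of a peer flow whose port fields refer to the local vSwitch.

    flow     : dict holding the flow's fields (None means wildcard)
    port_map : assumed mapping {port on failed node: port on surviving node}
    """
    localized = dict(flow)  # leave the synchronized peer flow untouched
    for key in ("in_port", "out_port"):
        if localized.get(key) is not None:
            # A port without a mapping raises KeyError rather than being guessed.
            localized[key] = port_map[localized[key]]
    return localized


# Flow01 as it was configured on server node 1 (illustrative values only).
peer_flow01 = {"name": "Flow01", "table_id": 0, "priority": 100,
               "in_port": 1, "out_port": None, "actions": ["normal"]}

# Assumed port correspondence between the two virtual network switches.
example_port_map = {1: 5, 2: 6}

local_flow01 = localize(peer_flow01, example_port_map)
```

Only the port fields are rewritten here; other condition fields that carry node-specific network information would be handled the same way under the same assumption.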
The insertion logic is as follows: when a peer flow that originally resided in server node 1 (assume that its table_id is table_x and its priority is pri_x, where x is a parameter value) is to be inserted into the Local Flows of server node 2, it is placed at the correct position in the Local Flows through the following steps.
(1) If the Local Flows of server node 2 have no table_x entry, the entry is newly created in the Local Flows, and the content of the table_x entry in the Peer Flows is copied into it.
(2) If a table_x entry exists in the Local Flows of server node 2 but contains no flow of priority pri_x, the peer flow is inserted directly at any position in the table_x entry.
(3) If a table_x entry exists in the Local Flows of server node 2 and contains flows of priority pri_x, the peer flow is inserted after all flows of priority pri_x in the table_x entry.
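The three insertion cases can be sketched in Python as follows. The data layout (a dict from table_id to an ordered list of flows) and the function names insert_peer_flow and integrate are assumptions made for illustration. For case (2), where any position is allowed, the sketch appends at the end of the table.

```python
def insert_peer_flow(local_tables, flow):
    """Insert one localized peer flow into the local flow table.

    local_tables : dict {table_id: ordered list of flow dicts}
    flow         : dict with at least "table_id" and "priority"
    """
    # Case (1): create the table if the local flow table does not have it yet.
    table = local_tables.setdefault(flow["table_id"], [])
    same_priority = [i for i, f in enumerate(table)
                     if f["priority"] == flow["priority"]]
    if same_priority:
        # Case (3): place the flow after all existing flows of the same priority.
        table.insert(same_priority[-1] + 1, flow)
    else:
        # Case (2): any position is acceptable; appending is the simplest choice.
        table.append(flow)


def integrate(local_tables, peer_flows):
    """Insert the peer flows one by one, in their original order."""
    for flow in peer_flows:
        insert_peer_flow(local_tables, flow)
    return local_tables


local_flows = {0: [{"name": "Flow03", "table_id": 0, "priority": 100},
                   {"name": "Flow04", "table_id": 0, "priority": 50}]}
peer_flows = [{"name": "Flow01", "table_id": 0, "priority": 100},
              {"name": "Flow02", "table_id": 1, "priority": 10}]

new_local_flows = integrate(local_flows, peer_flows)
```

In this example Flow01 shares priority 100 with Flow03 and is therefore placed directly after it, while Flow02 creates a new table 1.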
In this embodiment, the peer flow table received from the peer node is integrated with the existing local flow table of the non-failed server node to form a new local flow table, so that the virtual machines migrated from the peer node (i.e., VM21 and VM22, originally deployed in server node 1) can use the virtualized network normally; at the same time, normal use of the virtual network by the virtual machines already deployed on server node 2 (VM31, VM32) is not affected.
Referring to FIG. 3, the new local flow table shown contains all local flow tables (Flow01, Flow02, Flow03 and Flow04) of all server nodes. In this embodiment, the fault of server node 1 includes the server node going down, losing power, or suffering a network interruption. Once the Peer Flows in server node 2 have been integrated with the existing local flow table, the Peer Flows disappear. In this embodiment, network restoration is performed for all server nodes based on the new local flow table (Local Flows in FIG. 3).
Preferably, after step S3 is completed, the method further includes: migrating the virtual machines in the failed server node 1 (i.e., VM21 and VM22), according to the new local flow table, to server node 2, in which the new local flow table is formed. Once the failure of server node 1 has been repaired, Flow01, Flow02, VM21 and VM22 in server node 2 are restored to server node 1 along the restoration path shown by the dotted arrow in FIG. 3. VM21 and VM22, originally deployed in server node 1, are thereby migrated back into server node 1 via network card 30, switch 10 and network card 20 when server node 1 recovers.
In this embodiment, even if a failure of server node 1 makes VM21 and VM22 unavailable, the flows Flow01 and Flow02 are available in the normally operating server node 2, and VM21 and VM22 are migrated to server node 2 and connected to vSwitch33; the network service of the virtual machines (VM21 and VM22) migrated to server node 2, which forms the new local flow table, is therefore not affected.
It should be noted that the network high availability method disclosed in this embodiment may be applied not only to a super-convergence system but also to a data center (IDC), a cluster server, and a cloud computing platform based on a distributed storage architecture. In this embodiment, whether server node 1 suffers an unrecoverable or a recoverable failure, its local flow table has already been instantly synchronized to the peer node, which ensures that the local flow table of server node 1 is not lost or damaged and ensures the high availability of the virtual network service formed by the whole super-convergence system. Meanwhile, by integrating the peer flow table (Peer Flows) held in the non-failed server node into a new local flow table, the virtual network services of the virtual machines migrated to the non-failed server node (VM21, VM22) are not affected, and the virtual network services of VM31 and VM32, created in the non-failed server node (i.e., server node 2), suffer no network impact.
Embodiment 2:
Another embodiment of a network high availability implementation method of the present invention is disclosed with reference to FIGS. 4 to 7.
Compared with the first embodiment, the main difference is that, in this embodiment, the network high availability implementation method is applied to a super-convergence system including three or more server nodes. In this embodiment, three server nodes 1 to 3 are taken as an example.
Specifically, the server node 1 configures a local Flow table including Flow01 and Flow02, and VM21 and VM22 corresponding thereto, and a virtual network switch (not shown) connects VM21 and VM22 and accesses the switch 10 through the network card 20. The server node 2 configures a local Flow table including Flow03 and Flow04, and VM31 and VM32 corresponding thereto, and a virtual network switch (not shown) connects VM31 and VM32 and accesses the switch 10 through the network card 30. The server node 3 configures a local Flow table including Flow05 and Flow06, and VM51 and VM52 corresponding thereto, and a virtual network switch (not shown) connects VM51 and VM52 and accesses the switch 10 through the network card 50.
Meanwhile, step S2 specifically includes: selecting one non-failed server node from the plurality of server nodes. The selection is random; any server node qualifies as long as it is operating normally. In this embodiment, for example, server node 1 is selected as the non-failed server node. The local flow tables in the other, unselected server nodes are then instantly synchronized, so that they are copied one by one to the selected server node to form peer flow tables. At this time, the Local Flows containing Flow03 and Flow04 in server node 2 and the Local Flows containing Flow05 and Flow06 in server node 3 are all instantly synchronized to server node 1 to form Peer Flows, as shown in FIG. 4.
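The selection of one non-failed server node and the one-way synchronization towards it can be sketched in Python as follows. The functions select_healthy_node and sync_to_selected and the dict-based node model are hypothetical; keying the peer flow tables by the name of the node they came from is an assumption made so that the flows of a later-failing node can be identified.

```python
import copy
import random


def select_healthy_node(nodes):
    """Pick one normally operating node at random, as the embodiment allows."""
    healthy = [node for node in nodes if node["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy server node available")
    return random.choice(healthy)


def sync_to_selected(nodes, selected):
    """Copy each unselected node's local flow table to the selected node."""
    for node in nodes:
        if node is selected:
            continue
        # One-way copy: nothing is written back to the unselected nodes here.
        selected["peer_flows"][node["name"]] = copy.deepcopy(node["local_flows"])


cluster = [
    {"name": "server node 1", "healthy": True,
     "local_flows": ["Flow01", "Flow02"], "peer_flows": {}},
    {"name": "server node 2", "healthy": True,
     "local_flows": ["Flow03", "Flow04"], "peer_flows": {}},
    {"name": "server node 3", "healthy": True,
     "local_flows": ["Flow05", "Flow06"], "peer_flows": {}},
]

# The embodiment picks server node 1; a random pick would be equally valid.
selected_node = cluster[0]
sync_to_selected(cluster, selected_node)
```

After the call, server node 1 holds the flows of server nodes 2 and 3 as peer flow tables, while the unselected nodes hold no peer flow tables at all.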
After the step S3 is completed, the method further includes: and migrating the virtual machine in the server node with the fault to the server node where the new local flow table is formed according to the new local flow table.
Referring to FIG. 5, if server node 2 fails (NG), its VM31 and VM32 become unavailable. The peer flow tables that were copied one by one to the selected server node are then integrated with the existing local flow table of the non-failed server node to form a new local flow table, and the new local flow table is synchronously copied to the other server nodes. The new local flow table includes Flow01 to Flow06.
This is shown in FIG. 6. In this embodiment, the synchronization that copies the local flow tables to the peer node before any server node fails does not need to be bidirectional, which further reduces the system resources consumed by instant synchronization. Furthermore, the new local flow table formed by integration (Local Flows in FIG. 6) only needs to undergo one further instant synchronization operation. This is particularly useful for a super-convergence cluster, computer cluster or data center containing a large number of server nodes, because it reduces the number of synchronization verification operations that must be performed among the server nodes and makes instant synchronization among them simpler and more convenient. Server node 1 itself may fail, but this possibility can be reduced by means such as an Uninterruptible Power Supply (UPS), highly redundant data backup and non-volatile memory, so that the selected non-failed server node does not fail before one or more of the other server nodes (e.g., server node 2 and/or server node 3 in FIG. 5). Flow03, Flow04, VM31 and VM32, which were migrated into the new local flow table and the virtual machines of server node 1, can be migrated and restored to server node 2 once its failure has been repaired, in the restoration direction shown by the dashed arrow in FIG. 7; meanwhile, Flow05, Flow06, VM51 and VM52 are restored to the normal server node 3, overwriting the virtual machines and the local flow table in server node 3.
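The saving from one-way synchronization can be made concrete with a small count of flow-table copies, sketched in Python below. The function names are hypothetical. Generalizing the pairwise bidirectional scheme of the first embodiment to n server nodes is an assumption made only for comparison; the application describes that scheme for two nodes.

```python
def pairwise_bidirectional_copies(n):
    """Copies needed if every node copies its local flow table to every other node."""
    # Each ordered pair (source, destination) of distinct nodes is one copy.
    return n * (n - 1)


def one_way_copies(n):
    """Copies needed if every unselected node copies only to the selected node."""
    return n - 1


def one_way_with_replication(n):
    """One-way copies plus copying the new local flow table back to the others."""
    return 2 * (n - 1)


# Number of copies for a few cluster sizes: (pairwise, one-way, one-way + replication).
copy_counts = {n: (pairwise_bidirectional_copies(n),
                   one_way_copies(n),
                   one_way_with_replication(n))
               for n in (2, 3, 10, 100)}
```

Under these assumptions the pairwise scheme grows quadratically with the number of server nodes, while the one-way scheme grows linearly, which is consistent with the reduction described for clusters with many server nodes.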
It should be noted that, in this embodiment, the network high availability implementation method further includes: determining the server nodes that have not failed, and migrating the virtual machines in the failed server nodes, according to the new local flow table, to the server node where the new local flow table is formed. For example, in a super-convergence system with three server nodes, if only the server node 2 fails, only the Flow03, the Flow04, the VM31, and the VM32 are migrated, and they are restored to the server node 2 once its failure has been repaired.
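The migration according to the new local flow table can be sketched as follows. This is an illustration under stated assumptions only: the function `migrate_failed_vms`, the node names, and in particular the one-to-one mapping from each virtual machine to a single flow entry (`vm_flows`) are hypothetical and are not defined in this embodiment.

```python
def migrate_failed_vms(placement, vm_flows, new_local_table, failed_nodes, target):
    """Move VMs of failed nodes to `target` when the target's new local
    flow table already holds the flow entry that serves them.

    placement: VM name -> node name (updated in place)
    vm_flows:  VM name -> flow entry name (hypothetical mapping)
    """
    moved = []
    for vm, node in placement.items():
        if node in failed_nodes and vm_flows[vm] in new_local_table:
            placement[vm] = target
            moved.append(vm)
    return moved


placement = {
    "VM21": "node1", "VM22": "node1",
    "VM31": "node2", "VM32": "node2",
    "VM51": "node3", "VM52": "node3",
}
vm_flows = {
    "VM21": "Flow01", "VM22": "Flow02",
    "VM31": "Flow03", "VM32": "Flow04",
    "VM51": "Flow05", "VM52": "Flow06",
}
new_local_table = ["Flow01", "Flow02", "Flow03", "Flow04", "Flow05", "Flow06"]

# Only server node 2 has failed, so only its VMs move to server node 1.
moved = migrate_failed_vms(placement, vm_flows, new_local_table, {"node2"}, "node1")
print(moved)
```

Virtual machines on nodes that have not failed keep their placement, which matches the three-node example above.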
Preferably, as shown in fig. 7, in this embodiment the local flow tables may be instantly synchronized to form a new local flow table, and after the failed server node 2 has recovered, the local flow table in the server node 1, which contains the flow tables of all server nodes (Flow01 to Flow06), is synchronized to the opposite-end nodes (i.e., the server node 2 and the server node 3). This improves uplink communication between the virtual machines configured in each server node of the whole super-convergence system and the external network 40, and improves the high availability and disaster-tolerant backup capability of the virtual network service.
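The recovery flow of fig. 7 can be sketched in the same simplified style. The functions `fail_back` and `sync_full_table`, the record of each virtual machine's original node (`vm_home`), and the node names are assumptions made for illustration; the embodiment does not prescribe how this bookkeeping is kept.

```python
def fail_back(placement, vm_home, repaired):
    """Return VMs to `repaired` if it was their original node."""
    restored = []
    for vm, home in vm_home.items():
        if home == repaired and placement[vm] != repaired:
            placement[vm] = repaired
            restored.append(vm)
    return restored


def sync_full_table(full_table, peers):
    """Give each opposite-end node its own copy of the full flow table."""
    return {peer: list(full_table) for peer in peers}


# Original node of each VM (hypothetical bookkeeping).
vm_home = {"VM31": "node2", "VM32": "node2", "VM51": "node3", "VM52": "node3"}
# Placement while server node 2 is down: its VMs run on server node 1.
placement = {"VM31": "node1", "VM32": "node1", "VM51": "node3", "VM52": "node3"}
full_table = [f"Flow{i:02d}" for i in range(1, 7)]  # Flow01 to Flow06

restored = fail_back(placement, vm_home, "node2")
peer_tables = sync_full_table(full_table, ["node2", "node3"])
print(restored, peer_tables["node2"])
```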
For the parts of the technical solution of the network high availability implementation method disclosed in this embodiment that are the same as in the first embodiment, please refer to the first embodiment; details are not repeated here.
Example three:
Based on the network high availability implementation method disclosed in the first embodiment and/or the second embodiment, the present embodiment further discloses a super-convergence system, which includes two or more server nodes, each configured with a network card connected to a switch. The super-convergence system runs the network high availability implementation method described in the first embodiment and/or the second embodiment. In the drawings of the present specification, "network card" means "physical network card".
In the present embodiment, the super-convergence system preferably includes only the server node 1 and the server node 2, for better cost-effectiveness. Each server node may use a computer system based on the x86 architecture as its physical-layer device. The server nodes form software-defined storage (SDS) and distributed storage (Ceph), and a cloud management platform uniformly manages the plurality of server nodes. The cloud management platform may be based on the OpenStack architecture or a VMware architecture and, using an open-source solution, provides users with efficient, elastic, scalable, and portable applications as well as IaaS resource management, service management, and operation management.
Referring to fig. 2, in the present embodiment, the super-convergence system includes a server node 1 and a server node 2. The server node 1 is configured with a network card 20 connected to the switch 10, and the server node 2 is configured with a network card 30 connected to the switch 10. The server node 1 is provided with a virtual network switch 23 (vSwitch) connected to the network card 20, and the virtual network switch 23 connects the VM21 and the VM22. The server node 2 is provided with a virtual network switch 33 (vSwitch) connected to the network card 30, and the virtual network switch 33 connects the VM31 and the VM32.
vSwitch (a more specific concept under the virtual network switch) is widely used in Internet services based on IaaS (Infrastructure as a Service). A virtual switch running on the virtualization platform provides the VMs (virtual machines) on the local server node with Layer 2 network access and part of the Layer 3 network functionality. The VMs are connected to the external network 40 through the vSwitch, which communicates with the external network 40 in the uplink through the network card on the server node (e.g., the network card 20). A user accesses the super-convergence system through the external network 40 by an HTTP request, and the virtual machines configured in the super-convergence system respond to the request initiated by the user.
It should be particularly noted that, in this embodiment, the virtual network switch may be Open vSwitch (OvS), which is based on an open-source architecture, or a commercial product such as VMware's VSS (vSphere Standard Switch) and VDS (vSphere Distributed Switch), Cisco's Nexus 1000V, or Microsoft's Hyper-V virtual switch.
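The uplink path described for fig. 2 can be written down as a small lookup, shown here as a sketch only. The dictionary layout and the function `uplink_path` are hypothetical, and placing the switch 10 between the network card and the external network 40 is an assumption drawn from the statement that each network card is connected to the switch 10.

```python
# Topology of fig. 2 as described in this embodiment (layout is hypothetical).
TOPOLOGY = {
    "server node 1": {
        "network_card": "network card 20",
        "vswitch": "virtual network switch 23",
        "vms": ["VM21", "VM22"],
    },
    "server node 2": {
        "network_card": "network card 30",
        "vswitch": "virtual network switch 33",
        "vms": ["VM31", "VM32"],
    },
}


def uplink_path(vm: str) -> list[str]:
    """Return the hops from a VM to the external network 40."""
    for cfg in TOPOLOGY.values():
        if vm in cfg["vms"]:
            # Assumption: the physical switch 10 sits between the
            # network card and the external network 40.
            return [vm, cfg["vswitch"], cfg["network_card"],
                    "switch 10", "external network 40"]
    raise KeyError(f"unknown VM: {vm}")


print(" -> ".join(uplink_path("VM31")))
```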
The various illustrative logical blocks, or elements, described in connection with the embodiments disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor, an Application Specific Integrated Circuit (ASIC), a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a digital signal processor and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a digital signal processor core, or any other similar configuration.
For the parts of the technical solution of the super-convergence system disclosed in this embodiment that are the same as in the first embodiment and/or the second embodiment, please refer to those embodiments; detailed descriptions are omitted here.
The detailed description above is only a specific description of feasible embodiments of the present invention and is not intended to limit the scope of protection of the present invention; equivalent embodiments or modifications made without departing from the technical spirit of the present invention shall be included in the scope of protection of the present invention.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although this specification is described in terms of embodiments, not every embodiment contains only a single independent technical solution. The specification is written this way only for clarity; those skilled in the art should take the specification as a whole, and the technical solutions in the embodiments may be combined as appropriate to form other embodiments understandable to those skilled in the art.

Claims (10)

1. A network high availability implementation method, characterized by comprising the following steps:
S1, configuring local flow tables for the virtual network switches of at least two server nodes;
S2, instantly synchronizing the local flow tables among the server nodes, so that each local flow table is copied to the opposite-end server node to form an opposite-end flow table;
S3, in a server node that has not failed, integrating the opposite-end flow table imported from the failed opposite-end node with the existing local flow table of the non-failed server node to form a new local flow table.
2. The network high availability implementation method according to claim 1, wherein the integration performed in step S3 on the opposite-end flow table imported from the failed opposite-end node and the existing local flow table of the non-failed server node is specifically: modifying and sorting the opposite-end flow table and the local flow table.
3. The network high availability implementation method according to claim 1, wherein step S2 is performed on the premise that neither of the two server nodes copying local flow tables to each other has failed.
4. The network high availability implementation method according to claim 1, wherein the new local flow table contains all local flow tables of all server nodes, and the failure comprises server node downtime, power failure, or network interruption.
5. The network high availability implementation method according to any one of claims 1 to 4, wherein the network high availability implementation method is applied to a super-convergence system comprising three or more server nodes.
6. The network high availability implementation method according to claim 5, wherein step S2 specifically includes: selecting, from the plurality of server nodes, one server node that has not failed, and instantly synchronizing the local flow tables in the other, unselected server nodes, so that the local flow tables in the unselected server nodes are copied one by one to the selected server node to form opposite-end flow tables.
7. The network high availability implementation method according to claim 6, wherein step S2 further includes: integrating the opposite-end flow tables copied one by one to the selected server node with the existing local flow table of the non-failed server node to form a new local flow table, and synchronously copying the new local flow table to the other server nodes.
8. The network high availability implementation method according to claim 7, wherein after step S3 is completed, the method further includes: migrating, according to the new local flow table, the virtual machines in the failed server node to the server node where the new local flow table is formed.
9. The network high availability implementation method according to claim 5, wherein the server nodes are jointly connected to the switch through their respective network cards.
10. A super-convergence system, comprising: two or more server nodes, each configured with a network card connected to a switch;
characterized in that the super-convergence system runs the network high availability implementation method according to any one of claims 1 to 9.
CN202010946006.3A 2020-09-10 2020-09-10 Network high-availability implementation method and super-convergence system Pending CN112187523A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010946006.3A CN112187523A (en) 2020-09-10 2020-09-10 Network high-availability implementation method and super-convergence system

Publications (1)

Publication Number Publication Date
CN112187523A true CN112187523A (en) 2021-01-05

Family

ID=73921754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010946006.3A Pending CN112187523A (en) 2020-09-10 2020-09-10 Network high-availability implementation method and super-convergence system

Country Status (1)

Country Link
CN (1) CN112187523A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210105