CN115955434A - ECMP group failure recovery method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN115955434A
Authority
CN
China
Prior art keywords
next hop
ecmp group
neighbor
failed
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310239663.8A
Other languages
Chinese (zh)
Other versions
CN115955434B (en
Inventor
郭巍松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202310239663.8A priority Critical patent/CN115955434B/en
Publication of CN115955434A publication Critical patent/CN115955434A/en
Application granted granted Critical
Publication of CN115955434B publication Critical patent/CN115955434B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Computer And Data Communications (AREA)

Abstract

Embodiments of the invention provide a method and a device for recovering from ECMP group failure, an electronic device, and a storage medium, relating to the technical field of the Internet. The method comprises: when a next hop in the ECMP group fails, sending a neighbor request to the failed next hop according to a preset period; when the failed next hop returns a neighbor response to the neighbor request, re-acquiring the next hop information corresponding to the failed next hop according to the neighbor response and obtaining the neighbor information for that next hop information; and recovering the failed next hop according to the neighbor information so as to recover the ECMP group. Because the neighbor request is sent to the failed next hop and the next hop information is re-acquired from the returned neighbor response to obtain the neighbor information, the failed next hop can be recovered more quickly, its recovery can be discovered promptly so that load balancing is restored, and data flows on the non-failed next hops are affected little or not at all during recovery.

Description

ECMP group failure recovery method and device, electronic equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of internet, in particular to a recovery method of ECMP group failure, a recovery device of ECMP group failure, electronic equipment and a computer readable storage medium.
Background
In both conventional networks and data center networks, ECMP (Equal-Cost Multi-Path routing) is widely used to achieve higher reliability and load sharing. Generally, such a route may be calculated by a routing protocol (such as BGP (Border Gateway Protocol) or OSPF (Open Shortest Path First)) or obtained through static route configuration. A data flow is processed by a HASH algorithm according to key fields of its packets, and is allocated to a particular member of the ECMP group for transmission according to the HASH value. When a related link or device fails, the route may change to another ECMP group with fewer members, and each data flow passing through the route may be redistributed (although a small amount of traffic may still be distributed to the original group). For stateful nodes on the path (such as firewalls), almost all entries need to be rebuilt, and traffic on other paths that should not be affected is also severely disturbed. To alleviate this, some devices support an elastic HASH technique: when a single path becomes abnormal, only the data on that path is distributed evenly to the other paths (some implementations direct it to a specific path). However, when the path recovers, flows on the other paths can only be placed onto the recovered path at random according to their HASH values, so flows that were stable still fluctuate.
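The disturbance described above can be illustrated with a minimal sketch. It assumes plain "hash modulo group size" member selection, which is only one possible scheme; the function name `pick_member`, the flow key format, and the use of CRC32 as the HASH algorithm are all hypothetical choices made for this example.

```python
import zlib

def pick_member(flow_key: str, members: list) -> str:
    """Map a flow to an ECMP member by hashing its key fields modulo the group size."""
    return members[zlib.crc32(flow_key.encode()) % len(members)]

# Hypothetical flows, identified here by a string built from their key fields.
flows = [f"10.0.0.{i}:5000->192.0.2.1:80" for i in range(1000)]

before = ["nh1", "nh2", "nh3", "nh4"]
after = ["nh1", "nh2", "nh3"]  # nh4 has failed and the group shrinks

# Flows that were NOT on the failed member, and how many of them moved anyway.
on_healthy = [f for f in flows if pick_member(f, before) != "nh4"]
moved = sum(1 for f in on_healthy if pick_member(f, before) != pick_member(f, after))

print(f"{moved} of {len(on_healthy)} flows on healthy members were reassigned")
```

Under this assumption, shrinking the group changes the modulus, so many flows that never used the failed member are reassigned as well, which is the effect on stateful nodes that the background describes.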
In actual deployment, several problems arise. First, a link abnormality or similar fault causes every route that references the relevant ECMP group to be changed and updated again. Second, in actual operation, a connectivity change of some ECMP members directly causes route changes and thus heavy route update activity, or causes partial packet forwarding loss because the change cannot be discovered by the system. Third, when the neighbor information changes (for example, when an ARP (Address Resolution Protocol) entry ages out), even if the switch chip sends the packets allocated to the failed path to the CPU (Central Processing Unit), the difference between the HASH algorithms of the upper and lower layers means there is no guarantee that the traffic will trigger a neighbor request on the relevant path, and therefore no guarantee that the current path will be repaired quickly. In addition, with the elastic HASH algorithm, when a single path fails, the traffic of that path can be distributed evenly to the other paths, but when the path recovers, a large amount of traffic that should not have been affected is redistributed to the new path.
Disclosure of Invention
Embodiments of the present invention provide a method, an apparatus, an electronic device, and a computer-readable storage medium for recovering an ECMP group failure, so as to solve or partially solve a problem that a large amount of originally unaffected flows are affected and redistributed to a new path due to a change and a re-update of a route caused by a path anomaly.
An embodiment of the invention discloses a method for recovering from ECMP group failure, wherein the ECMP group consists of a plurality of next hops, each next hop includes next hop information, and the method comprises:
when a next hop in the ECMP group fails, sending a neighbor request to the failed next hop according to a preset period;
when the failed next hop returns a neighbor response to the neighbor request, re-acquiring the next hop information corresponding to the failed next hop according to the neighbor response and obtaining the neighbor information for the next hop information, wherein the neighbor response includes the neighbor information corresponding to the failed next hop; and
recovering the failed next hop according to the neighbor information so as to recover the ECMP group.
Optionally, the method further comprises:
when a next hop in the ECMP group fails, recording the next hop information corresponding to the failed next hop in a state database.
Optionally, after recovering the failed next hop according to the neighbor information so as to recover the ECMP group, the method further comprises:
deleting the next hop information corresponding to the failed next hop from the state database.
Optionally, the next hop information includes the IP address of the next hop and the outgoing interface of the next hop.
Optionally, the neighbor information is the correspondence between the next hop and the MAC address of the next hop.
Optionally, recovering the failed next hop according to the neighbor information so as to recover the ECMP group comprises:
recovering the failed next hop according to the correspondence between the failed next hop and its MAC address, so as to recover the ECMP group.
Optionally, the method further comprises:
when a next hop in the ECMP group fails, marking that next hop as a failed next hop.
Optionally, the method further comprises:
deleting the failed next hop from the ECMP group when the next hop is marked as a failed next hop.
Optionally, deleting the failed next hop from the ECMP group when the next hop is marked as a failed next hop comprises:
deleting the failed next hop from the ECMP group when the outgoing interface of the failed next hop is in the DOWN state.
Optionally, deleting the failed next hop from the ECMP group when the next hop is marked as a failed next hop comprises:
deleting the failed next hop from the ECMP group when the neighbor information becomes invalid.
Optionally, the ECMP group is configured with a bidirectional forwarding detection protocol, and deleting the failed next hop from the ECMP group when the next hop is marked as a failed next hop comprises:
deleting the failed next hop from the ECMP group when the bidirectional forwarding detection protocol detects that the connection to the next hop is interrupted.
Optionally, the next hops in the ECMP group comprise a plurality of HASH buckets, the HASH buckets correspond to HASH values, and the method further comprises:
when the ECMP group is recovered, acquiring the affinity and the updated surrogate value corresponding to each HASH bucket;
obtaining a priority value for the HASH bucket according to the affinity and the updated surrogate value; and
recovering the failed next hop according to the priority value and the number of HASH buckets corresponding to each next hop in the ECMP group, so as to allocate HASH buckets to the recovered next hop.
Optionally, the priority value serves as the basis for allocating HASH buckets to the recovered next hop.
Optionally, the method further comprises:
allocating the HASH bucket to a next hop in the ECMP group in response to an allocation instruction for the HASH bucket.
Optionally, acquiring the affinity and the updated surrogate value corresponding to the HASH bucket comprises:
recording the affinity between the HASH bucket and its corresponding next hop when the HASH bucket is initially allocated to a next hop in the ECMP group.
Optionally, the updated surrogate value is attribute information of the HASH bucket.
Optionally, the method further comprises:
when a next hop in the ECMP group fails and/or a failed next hop recovers, updating the updated surrogate value of the HASH buckets corresponding to the next hops of the current ECMP group.
Optionally, obtaining the priority value of the HASH bucket according to the affinity and the updated surrogate value comprises:
obtaining the priority value of the HASH bucket in response to a weighting instruction for the affinity and the updated surrogate value.
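The bucket-priority idea can be sketched as follows. This section does not give the weighting formula, so everything concrete here is an assumption made for illustration: the linear weighting, the weights `w_aff` and `w_upd`, modelling affinity as "the bucket was initially allocated to the recovered next hop", modelling the updated surrogate value as a reassignment counter, and the fair-share rule `len(buckets) // len(members)`. The names `Bucket`, `priority`, and `rebalance_on_recovery` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Bucket:
    owner: str             # next hop currently serving this bucket
    initial_owner: str     # next hop recorded when the bucket was first allocated
    update_value: int = 0  # assumed counter, bumped on every reassignment

def priority(b: Bucket, recovered: str, w_aff: float = 10.0, w_upd: float = 1.0) -> float:
    """Assumed linear weighting of affinity and the update value."""
    affinity = 1.0 if b.initial_owner == recovered else 0.0
    return w_aff * affinity + w_upd * b.update_value

def rebalance_on_recovery(buckets: list, recovered: str, members: list) -> int:
    """Give the recovered next hop its fair share, taking high-priority buckets first."""
    fair = len(buckets) // len(members)
    owned = Counter(b.owner for b in buckets)
    moved = 0
    for b in sorted(buckets, key=lambda x: priority(x, recovered), reverse=True):
        if moved >= fair:
            break
        # Only take buckets from next hops that hold more than their fair share.
        if b.owner != recovered and owned[b.owner] > fair:
            owned[b.owner] -= 1
            b.owner = recovered
            b.update_value += 1
            moved += 1
    return moved
```

In this sketch, buckets that originally belonged to the recovered next hop rank highest, so they are the ones handed back, and buckets that never left their original next hop are left alone.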
An embodiment of the invention also discloses a device for recovering from ECMP group failure, wherein the ECMP group consists of a plurality of next hops, each next hop includes next hop information, and the device comprises:
a neighbor request sending module, configured to send a neighbor request to a failed next hop according to a preset period when a next hop in the ECMP group fails;
a neighbor information obtaining module, configured to, when the failed next hop returns a neighbor response to the neighbor request, re-acquire the next hop information corresponding to the failed next hop according to the neighbor response and obtain the neighbor information for the next hop information, wherein the neighbor response includes the neighbor information corresponding to the failed next hop; and
an ECMP group recovery module, configured to recover the failed next hop according to the neighbor information so as to recover the ECMP group.
Optionally, the device further comprises:
an information recording module, configured to record the next hop information corresponding to the failed next hop in a state database when a next hop in the ECMP group fails.
Optionally, the device further comprises:
an information deleting module, configured to delete the next hop information corresponding to the failed next hop from the state database.
Optionally, the ECMP group recovery module is specifically configured to:
recover the failed next hop according to the correspondence between the failed next hop and its MAC address, so as to recover the ECMP group.
Optionally, the device further comprises:
a next hop marking module, configured to mark a next hop as a failed next hop when that next hop in the ECMP group fails.
Optionally, the device further comprises:
a next hop deleting module, configured to delete the failed next hop from the ECMP group when the next hop is marked as a failed next hop.
Optionally, the next hop deleting module is specifically configured to:
delete the failed next hop from the ECMP group when the outgoing interface of the failed next hop is in the DOWN state.
Optionally, the next hop deleting module is specifically configured to:
delete the failed next hop from the ECMP group when the neighbor information becomes invalid.
Optionally, the ECMP group is configured with a bidirectional forwarding detection protocol, and the next hop deleting module is specifically configured to:
delete the failed next hop from the ECMP group when the bidirectional forwarding detection protocol detects that the connection to the next hop is interrupted.
Optionally, the next hops in the ECMP group comprise a plurality of HASH buckets, the HASH buckets correspond to HASH values, and the device further comprises:
a data acquisition module, configured to acquire the affinity and the updated surrogate value corresponding to each HASH bucket when the ECMP group is recovered;
a priority value obtaining module, configured to obtain a priority value for the HASH bucket according to the affinity and the updated surrogate value; and
a next hop recovery module, configured to recover the failed next hop according to the priority value and the number of HASH buckets corresponding to each next hop in the ECMP group, so as to allocate HASH buckets to the recovered next hop.
Optionally, the device further comprises:
a HASH bucket allocation module, configured to allocate the HASH bucket to a next hop in the ECMP group in response to an allocation instruction for the HASH bucket.
Optionally, the data acquisition module is specifically configured to:
record the affinity between the HASH bucket and its corresponding next hop when the HASH bucket is initially allocated to a next hop in the ECMP group.
Optionally, the device further comprises:
an updated surrogate value updating module, configured to update the updated surrogate value of the HASH buckets corresponding to the next hops of the current ECMP group when a next hop in the ECMP group fails and/or a failed next hop recovers.
Optionally, the priority value obtaining module is specifically configured to:
obtain the priority value of the HASH bucket in response to a weighting instruction for the affinity and the updated surrogate value.
An embodiment of the invention also discloses an electronic device comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
the memory is used for storing a computer program;
the processor is configured to implement the method according to the embodiment of the present invention when executing the program stored in the memory.
An embodiment of the invention also discloses a computer-readable storage medium having instructions stored thereon which, when executed by one or more processors, cause the processors to perform the method according to the embodiment of the invention.
The embodiment of the invention has the following advantages:
In the embodiment of the invention, the ECMP group consists of a plurality of next hops, and each next hop includes next hop information. When a next hop in the ECMP group fails, a neighbor request is sent to the failed next hop according to a preset period. When the failed next hop returns a neighbor response to the neighbor request, the next hop information corresponding to the failed next hop is re-acquired according to the neighbor response and the neighbor information for the next hop information is obtained, wherein the neighbor response contains the neighbor information corresponding to the failed next hop. The failed next hop is then recovered according to the neighbor information so as to recover the ECMP group. Because the neighbor request is sent to the failed next hop and the next hop information is re-acquired from the returned neighbor response to obtain the neighbor information, the failed next hop can be recovered more quickly, its recovery can be discovered promptly so that load balancing is restored, and data flows on the non-failed next hops are affected little or not at all during recovery.
Drawings
FIG. 1 is a flowchart illustrating steps of a method for recovering from ECMP group failure according to an embodiment of the present invention;
FIG. 2 is a schematic configuration diagram of an ECMP group according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of a priority algorithm provided in an embodiment of the present invention;
FIG. 4 is a block diagram of a device for recovering from ECMP group failure according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a computer-readable storage medium provided in an embodiment of the present invention;
FIG. 6 is a schematic diagram of a hardware structure of an electronic device implementing various embodiments of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
As an example, ECMP is widely used in both legacy networks and data center networks to achieve higher reliability and load sharing. Generally, such a route can be calculated by a routing protocol such as BGP or OSPF, or obtained through static route configuration. A data flow is processed by a HASH algorithm according to key fields of its packets, and is distributed to a member of the ECMP group for transmission according to the HASH value. When a related link or device fails, the route may change to another ECMP group with fewer members, and each data flow passing through the route may be redistributed (although a small amount of traffic may still be distributed to the original group). For stateful nodes on the path (such as firewalls), almost all entries need to be rebuilt, and traffic on other paths that should not be affected is also severely disturbed. To alleviate this, some devices support an elastic HASH technique: when a single path becomes abnormal, only the data on that path is distributed evenly to the other paths (some implementations direct it to a specific path). However, when the path recovers, flows on the other paths can only be placed onto the recovered path at random, so flows that were stable still fluctuate.
Specifically, several problems arise in actual deployment. First, a link abnormality or similar fault causes every route that references the relevant ECMP group to be changed and updated again. Second, in actual operation, a connectivity change of some ECMP members directly causes route changes and thus heavy route update activity, or causes partial packet forwarding loss because the change cannot be discovered by the system. Third, when the neighbor information changes, even if the switch chip sends the packets allocated to the failed path to the CPU, the difference between the upper-layer and lower-layer HASH algorithms means there is no guarantee that the traffic will trigger a neighbor request on the relevant path, and therefore no guarantee that the current path will be repaired quickly. In addition, with the elastic HASH algorithm, when a single path fails, the traffic of that path can be distributed evenly to the other paths, but when the path recovers, a large number of flows that should not have been affected are redistributed to the new path.
One of the core inventive points of the invention is as follows. The ECMP group is composed of a plurality of next hops, and each next hop includes next hop information. When a next hop in the ECMP group fails, a neighbor request is sent to the failed next hop according to a preset period. When the failed next hop returns a neighbor response to the neighbor request, the next hop information corresponding to the failed next hop is re-acquired according to the neighbor response and the neighbor information for the next hop information is obtained, wherein the neighbor response contains the neighbor information corresponding to the failed next hop. The failed next hop is then recovered according to the neighbor information so as to recover the ECMP group. Because the neighbor request is sent to the failed next hop and the next hop information is re-acquired from the returned neighbor response to obtain the neighbor information, the failed next hop can be recovered more quickly, its recovery can be discovered promptly so that load balancing is restored, and data flows on the non-failed next hops are affected little or not at all during recovery.
Referring to fig. 1, a flowchart illustrating steps of a method for recovering an ECMP group from a failure according to an embodiment of the present invention is shown, where the ECMP group is composed of multiple next hops, and the next hops include next hop information, and the method specifically includes the following steps:
Step 101: when a next hop in the ECMP group fails, sending a neighbor request to the failed next hop according to a preset period.
ECMP is equal-cost multi-path routing, which applies in a network environment where a plurality of different links can reach the same destination address. The ECMP group may be composed of a plurality of next hops, and each next hop includes next hop information.
Optionally, the next hop information may include the IP (Internet Protocol) address of the next hop and the outgoing interface of the next hop.
In the embodiment of the present invention, the preset period is a manually configured value, and a person skilled in the art can adjust it according to the actual situation; the embodiment of the present invention does not limit it.
The neighbor request is an information request sent to the failed next hop in order to obtain the next hop information and, from it, the neighbor information, so that the failed next hop can be recovered according to the neighbor information.
In a specific implementation, the ECMP group is composed of a plurality of next hops, each of which includes next hop information; when a next hop of the ECMP group fails, the neighbor request may be sent to the failed next hop according to the preset period.
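Step 101 can be sketched as a simple periodic loop. This is a minimal illustration under stated assumptions, not the patented implementation: `NextHop`, `probe_failed_next_hops`, and `run_prober` are hypothetical names, the actual transmission of the neighbor request (for example an ARP request) is abstracted behind the caller-supplied `send_request` callback, and the loop runs for a fixed number of rounds only so that the example terminates.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass(frozen=True)
class NextHop:
    ip: str      # IP address of the next hop
    out_if: str  # outgoing interface of the next hop

def probe_failed_next_hops(failed: Iterable[NextHop],
                           send_request: Callable[[NextHop], None]) -> int:
    """Send one neighbor request to every currently failed next hop."""
    count = 0
    for nh in failed:
        send_request(nh)
        count += 1
    return count

def run_prober(get_failed: Callable[[], Iterable[NextHop]],
               send_request: Callable[[NextHop], None],
               period_s: float,
               rounds: int,
               sleep: Callable[[float], None] = time.sleep) -> None:
    """Probe the failed next hops once per preset period, for a fixed number of rounds."""
    for _ in range(rounds):
        probe_failed_next_hops(get_failed(), send_request)
        sleep(period_s)
```

Passing `get_failed` as a callable lets each round see the current set of failed next hops, so a next hop that has recovered in the meantime is no longer probed.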
Step 102: when the failed next hop returns a neighbor response to the neighbor request, re-acquiring the next hop information corresponding to the failed next hop according to the neighbor response and obtaining the neighbor information for the next hop information, wherein the neighbor response includes the neighbor information corresponding to the failed next hop.
The neighbor request is an information request sent to the failed next hop in order to obtain the next hop information and, from it, the neighbor information.
The neighbor response is the reply to the neighbor request and can contain the neighbor information corresponding to the failed next hop. The neighbor information is the correspondence between the next hop and the MAC (Media Access Control) address of the next hop. The next hop information in the ECMP group generally refers to the IP address of the next hop and its outgoing interface, and packets can be sent over Ethernet toward that IP address and outgoing interface only if the corresponding MAC address is known. In other words, the neighbor information is the correspondence between the IP address and outgoing interface of the next hop on the one hand and the MAC address of the next hop on the other, and a next hop takes effect only once its neighbor information is available.
In a specific implementation, the ECMP group consists of a plurality of next hops, each of which includes next hop information. When a next hop of the ECMP group fails, a neighbor request can be sent to the failed next hop according to a preset period. When the failed next hop returns a neighbor response to the neighbor request, the next hop information corresponding to the failed next hop is re-acquired according to the neighbor response and the neighbor information for the next hop information is obtained. The neighbor response contains the neighbor information corresponding to the failed next hop, which provides the data needed to recover the failed next hop.
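The mapping described in step 102 can be sketched as a small lookup keyed by the next hop information. The names `NeighborResponse`, `NextHopKey`, and `handle_neighbor_response` are hypothetical, and the response is modelled as an already parsed record; how the reply is actually received and decoded is outside this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple

# Next hop information: (IP address of the next hop, outgoing interface).
NextHopKey = Tuple[str, str]

@dataclass(frozen=True)
class NeighborResponse:
    ip: str
    out_if: str
    mac: str  # MAC address reported for this next hop

def handle_neighbor_response(resp: NeighborResponse,
                             failed: Set[NextHopKey],
                             neighbor_table: Dict[NextHopKey, str]) -> Optional[NextHopKey]:
    """Record neighbor information (next hop info -> MAC) for a failed next hop.

    Returns the next hop key if the response matches a failed next hop,
    otherwise None (the response is ignored).
    """
    key = (resp.ip, resp.out_if)
    if key not in failed:
        return None
    neighbor_table[key] = resp.mac
    return key
```

A returned key tells the caller which failed next hop now has valid neighbor information and can be handed to the recovery step.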
Step 103: recovering the failed next hop according to the neighbor information so as to recover the ECMP group.
Optionally, when a next hop in the ECMP group fails, the next hop information corresponding to the failed next hop is recorded in a state database; and after the failed next hop has been recovered according to the neighbor information so as to recover the ECMP group, the next hop information corresponding to the failed next hop is deleted from the state database.
Illustratively, when a next hop in the ECMP group fails, the next hop information corresponding to the failed next hop is recorded in a state database, and an independent process monitors the state database. The independent process sends neighbor requests to the failed next hop according to the next hop information in the state database, so that the failed next hop returns to a valid state as soon as possible. After the ECMP group recovers, the next hop information corresponding to the failed next hop is deleted from the state database. It can be understood that neighbor requests are sent periodically to a failed next hop (member) of the ECMP group so that it recovers as soon as possible, whereas a normal next hop does not need neighbor requests to be sent periodically or intensively; this is why the next hop information is deleted from the state database once the next hop has recovered.
In a specific implementation, the ECMP group is composed of a plurality of next hops, each of which includes next hop information. When a next hop of the ECMP group fails, a neighbor request may be sent to the failed next hop according to a preset period; at the same time, the next hop information corresponding to the failed next hop is recorded in the state database and the state database is monitored. The next hop information corresponding to the failed next hop can then be re-acquired, and the neighbor information for it obtained, according to the neighbor response. The failed next hop is recovered according to the correspondence between the failed next hop and its MAC address so as to recover the ECMP group, which makes up for the unreliability of traffic-triggered recovery. After the ECMP group is recovered, the next hop information corresponding to the failed next hop is deleted from the state database.
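The state database lifecycle can be sketched as follows. This is a minimal in-memory illustration under stated assumptions: the class `EcmpGroup` and its methods are hypothetical, the state database is modelled as a plain set rather than a real database, and the failed next hop is assumed to be removed from the active members while it is being probed.

```python
from typing import Dict, List, Set, Tuple

# Next hop information: (IP address of the next hop, outgoing interface).
NextHopKey = Tuple[str, str]

class EcmpGroup:
    def __init__(self, members: Dict[NextHopKey, str]):
        self.members: Dict[NextHopKey, str] = dict(members)  # usable next hops -> MAC
        self.state_db: Set[NextHopKey] = set()               # failed next hops to probe

    def on_failure(self, key: NextHopKey) -> None:
        """Record a failed next hop in the state database."""
        if self.members.pop(key, None) is not None:
            self.state_db.add(key)

    def pending_probes(self) -> List[NextHopKey]:
        """What the independent probing process would read from the state database."""
        return sorted(self.state_db)

    def on_neighbor_response(self, key: NextHopKey, mac: str) -> bool:
        """Recover the next hop from its neighbor information, then clear its DB entry."""
        if key not in self.state_db:
            return False
        self.members[key] = mac
        self.state_db.discard(key)  # recovered next hops are no longer probed
        return True
```

Deleting the entry on recovery keeps the probing process focused on failed next hops only, matching the point that normal next hops do not need periodic neighbor requests.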
Optionally, recovering the failed next hop according to the neighbor information so as to recover the ECMP group comprises:
recovering the failed next hop according to the correspondence between the failed next hop and its MAC address, so as to recover the ECMP group.
In a specific implementation, the neighbor information is the correspondence between the IP address and outgoing interface of a next hop and the MAC address of that next hop. When a failed next hop needs to be recovered in order to recover the ECMP group, it can be recovered according to the correspondence between its IP address and outgoing interface and its MAC address.
In the embodiment of the invention, an ECMP group consists of a plurality of next hops, and each next hop includes next hop information. When a next hop of the ECMP group fails, a neighbor request is sent to the failed next hop according to a preset period. When the failed next hop returns a neighbor response to the neighbor request, the next hop information corresponding to the failed next hop is re-acquired according to the neighbor response and the neighbor information for the next hop information is obtained, wherein the neighbor response contains the neighbor information corresponding to the failed next hop. The failed next hop is then recovered according to the neighbor information so as to recover the ECMP group. Because the neighbor request is sent to the failed next hop and the next hop information is re-acquired from the returned neighbor response to obtain the neighbor information, the failed next hop can be recovered more quickly, its recovery can be discovered promptly so that load balancing is restored, and data flows on the non-failed next hops are affected little or not at all during recovery.
In an alternative embodiment, when a next hop is marked as a failed next hop, the failed next hop is deleted from the ECMP group.
Optionally, when a next hop in the ECMP group fails, the next hop needs to be marked as a failed next hop. Specifically, the next hops in the ECMP group comprise a plurality of HASH buckets. A HASH bucket can be understood as follows: the HASH value is reduced by a modulo operation, and the result falls into a HASH bucket; together, the HASH buckets may correspond to all the next hops in the ECMP group. When a next hop in the ECMP group fails, it is first marked as an unreachable next hop, and the HASH buckets previously allocated to the unreachable next hop are distributed to the other reachable next hops.
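The bucket behaviour on failure can be sketched as follows. The initial round-robin layout, the round-robin redistribution of orphaned buckets, and the names `build_buckets`, `bucket_for`, and `fail_next_hop` are all assumptions made for illustration; the text only states that the buckets of the unreachable next hop are distributed to the other reachable next hops.

```python
import zlib
from typing import List

def build_buckets(next_hops: List[str], n_buckets: int) -> List[str]:
    """Assumed initial layout: buckets are dealt to next hops in round-robin order."""
    return [next_hops[i % len(next_hops)] for i in range(n_buckets)]

def bucket_for(flow_key: str, n_buckets: int) -> int:
    """The HASH value taken modulo the bucket count selects the bucket (CRC32 assumed)."""
    return zlib.crc32(flow_key.encode()) % n_buckets

def fail_next_hop(buckets: List[str], failed: str, reachable: List[str]) -> List[str]:
    """Reassign only the failed next hop's buckets; every other bucket is untouched."""
    out = list(buckets)
    j = 0
    for i, owner in enumerate(out):
        if owner == failed:
            out[i] = reachable[j % len(reachable)]
            j += 1
    return out
```

Because the bucket count stays fixed, the modulus never changes, so flows that hash to buckets owned by healthy next hops keep their path when another next hop fails.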
Optionally, when the state of the outgoing interface of the next hop that fails is the outgoing interface DOWN, the next hop that fails in the ECMP group is deleted, that is, when the state of the outgoing interface of the next hop is the outgoing interface DOWN, it indicates that the link of the next hop is already abnormal, and at this time, the next hop that fails in the ECMP group needs to be deleted.
In addition, when the neighbor information fails, the failed next hop in the ECMP group is deleted.
Neighbor information failure is one kind of change in the neighbor information. Specifically, when the neighbor information fails (for example, when the state of the neighbor information changes to INCOMPLETE or FAIL), the neighbor information no longer has a legal MAC address, and at this time the failed next hop corresponding to the neighbor information may be deleted accordingly.
It should be noted that a change in the neighbor information may also lead to an update of the next hop rather than a deletion. Specifically, when the MAC address in the neighbor information changes to another MAC address, it is not necessary to delete the next hop corresponding to the neighbor information from the ECMP group; the corresponding next hop in the ECMP group can be updated directly.
Neighbor information failure may be caused, for example, by the aging of an Address Resolution Protocol (ARP) entry. It can be understood that, in practical applications, neighbor information may fail for many reasons; the example given here is kept simple for convenience of description, and the other cases are not described in detail in the embodiment of the present invention.
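The two outcomes of a neighbor change (delete the member, or update it in place) can be illustrated with a short sketch. The `Neighbor` class, the `on_neighbor_change` function, and the dictionary mapping next hop IP addresses to MAC addresses are hypothetical names introduced only for this example; the state names follow the text above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Neighbor:
    ip: str
    mac: Optional[str]  # None when no legal MAC address is known
    state: str          # e.g. "REACHABLE", "INCOMPLETE", "FAIL"


def on_neighbor_change(members: Dict[str, str], neighbor: Neighbor) -> str:
    """Apply a neighbor change to the ECMP members (ip -> mac)."""
    if neighbor.ip not in members:
        return "ignored"
    if neighbor.state in ("INCOMPLETE", "FAIL") or neighbor.mac is None:
        # Neighbor information has failed: no legal MAC, delete the member.
        del members[neighbor.ip]
        return "deleted"
    if members[neighbor.ip] != neighbor.mac:
        # Only the MAC changed: update the member in place, no deletion.
        members[neighbor.ip] = neighbor.mac
        return "updated"
    return "unchanged"
```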
Optionally, the ECMP group is provided with a bidirectional forwarding detection protocol, and when the bidirectional forwarding detection protocol detects that the connection to the next hop is interrupted, the failed next hop in the ECMP group is deleted.
Bidirectional Forwarding Detection (BFD) is a network protocol used to detect faults between two forwarding points.
It can be understood that, when the state of the outgoing interface of a next hop in the ECMP group is DOWN, the neighbor information has failed, or the connection to the next hop is detected to be interrupted, the failed next hop in the ECMP group may be deleted; that is, the next hops (members) of the current ECMP group are updated directly. As a result, the routes referring to the ECMP group do not need to be re-issued: traffic is steered to the other reachable paths (next hops) without a full update of the routing table entries.
In a specific implementation, when a path failure occurs (for example, the outgoing interface of a next hop in the ECMP group is DOWN, the neighbor information has aged or is no longer reachable, or a protocol such as the bidirectional forwarding detection protocol finds that the connection to the next hop is interrupted), the routes are not updated. Instead, the members of the current ECMP group are updated directly, that is, the failed next hop is deleted from the ECMP group. The routes referring to the ECMP group therefore do not need to be re-issued, traffic is steered to the other reachable paths (next hops), and the routing table entries do not need to be fully updated. In this way the failed path can be discovered and recovered more quickly and load balancing can be re-established, while data flows on the non-failed paths are affected little or not at all during recovery. In addition, after the failed path is recovered, data on the other links is steered onto the recovered path.
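Why the routes need no re-issue can be seen in a small sketch in which routes hold a reference to a shared group object and only the group's member list is edited. The class `EcmpGroup`, the event names in `FAILURE_EVENTS`, and the prefixes used in the usage note are hypothetical and chosen purely for illustration.

```python
# Assumed names for the three failure triggers described in the text.
FAILURE_EVENTS = {"INTERFACE_DOWN", "NEIGHBOR_FAILED", "BFD_DOWN"}


class EcmpGroup:
    """An ECMP group that routes refer to by reference."""

    def __init__(self, members):
        self.members = list(members)

    def on_event(self, next_hop, event):
        """Delete the failed member in place; return True if the group changed."""
        if event in FAILURE_EVENTS and next_hop in self.members:
            self.members.remove(next_hop)
            return True
        return False
```

For instance, if `routes = {"192.168.0.0/16": group, "172.16.0.0/12": group}`, then calling `group.on_event("10.0.0.2", "BFD_DOWN")` changes what both routes forward to without touching either entry of `routes`.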
It can be understood that the failed next hop can be recovered, and the failed ECMP group thereby recovered, in the following cases. First, the state of the outgoing interface of the failed next hop becomes UP. Second, the neighbor information for the failed next hop is obtained again, that is, the correspondence between the IP address of the failed next hop, its outgoing interface, and its MAC address is re-established; for example, when an ARP request receives an ARP response from the owner of the IP address, the neighbor information can be considered re-obtained. In addition, a protocol such as the bidirectional forwarding detection protocol can detect the failure or recovery of the connection to a next hop, so as to determine whether a next hop in the ECMP group has failed or is in effect. After the failed next hop is recovered to recover the ECMP group, data on the other links may be steered onto the recovered path (next hop).
Referring to fig. 2, a schematic configuration diagram of an ECMP group provided in the embodiment of the present invention is shown. As shown in fig. 2, the ECMP group can determine the working condition of a next hop from changes in the neighbor information, changes in the link state, and changes in the bidirectional forwarding detection protocol state. When a path fails, the routes are not updated; instead, the members of the current ECMP group (its next hops) are updated directly, so that the routes referring to the ECMP group do not need to be re-issued, traffic is steered to the other reachable paths, and the routing table entries do not need to be fully updated. For a currently failed path, the information of the failed next hop can be recorded in a state database that is monitored by an independent process. For next hop information that is temporarily unreachable, a neighbor request is sent actively and periodically to re-obtain the neighbor information and recover the ECMP group, which overcomes the unreliability of relying on traffic to trigger recovery of the ECMP group. After the failed path is recovered, data on the other links is steered onto the recovered path, and the information of the failed next hop is deleted from the state database.
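The monitoring process described above can be sketched as a simple polling loop. This is only an illustration under stated assumptions: the state database is modelled as a dictionary, and `send_neighbor_request` and `restore` are hypothetical callables supplied by the caller (the former returns a MAC address when a neighbor response arrives and `None` otherwise). No real ARP or ND traffic is generated here.

```python
import time


def probe_failed_next_hops(state_db, send_neighbor_request, restore):
    """Send one neighbor request per failed next hop recorded in state_db.

    state_db maps a failed next hop IP to its recorded information
    (here, an outgoing interface name). Recovered entries are removed.
    """
    recovered = []
    for ip in list(state_db):  # copy the keys: entries are deleted in the loop
        mac = send_neighbor_request(ip)
        if mac is not None:
            restore(ip, mac)   # re-add the next hop to the ECMP group
            del state_db[ip]   # it is no longer a failed next hop
            recovered.append(ip)
    return recovered


def run_monitor(state_db, send_neighbor_request, restore,
                period_s, rounds, sleep=time.sleep):
    """Probe every `period_s` seconds for a fixed number of rounds."""
    for _ in range(rounds):
        probe_failed_next_hops(state_db, send_neighbor_request, restore)
        sleep(period_s)
```

Because the probe is driven by a timer rather than by forwarded traffic, a next hop that becomes reachable again is picked up within one period even if no flow happens to hash onto it.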
Meanwhile, when a path failure occurs (for example, the outgoing interface of a next hop in the ECMP group is DOWN, the neighbor information has aged or is no longer reachable, or a protocol such as the bidirectional forwarding detection protocol finds that the connection to the next hop is interrupted), the routes are not updated. Instead, the members of the current ECMP group are updated directly, so that the routes referring to the ECMP group do not need to be re-issued, traffic is steered to the other reachable paths (next hops), and the routing table entries do not need to be fully updated.
In an alternative embodiment, the next hop in the ECMP group comprises a plurality of HASH buckets, the HASH buckets corresponding to HASH values, and the method further comprises:
when the ECMP group is recovered, acquiring the affinity and the update epoch value corresponding to the HASH bucket;
obtaining a priority value of the HASH bucket according to the affinity and the update epoch value;
and recovering the failed next hop according to the priority value and the number of HASH buckets corresponding to each next hop in the ECMP group, so as to allocate HASH buckets to the recovered next hop.
A HASH bucket can be understood as the position into which a HASH value falls after a modulo operation; the HASH value itself is typically a 32-bit unsigned integer computed by a hashing algorithm.
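As a concrete illustration of a 32-bit HASH value reduced modulo the number of buckets, the sketch below uses CRC-32 from the Python standard library. The choice of CRC-32 and the flow-key format are assumptions for the example; the text does not specify which hashing algorithm is used.

```python
import zlib


def bucket_index(flow_key: bytes, num_buckets: int) -> int:
    """Map a flow key to a HASH bucket index.

    zlib.crc32 returns an unsigned 32-bit integer, which is then
    reduced modulo the number of buckets.
    """
    hash_value = zlib.crc32(flow_key)
    return hash_value % num_buckets
```

The same flow key always lands in the same bucket, so a flow changes its next hop only when the owner of its bucket changes.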
The affinity and the update epoch value are the position affinity and the update epoch value of the state in which the ECMP group first reaches its maximum number of group members.
Optionally, in response to an allocation instruction for the HASH bucket, the HASH bucket may be allocated to a next hop in the ECMP group, wherein the affinity between the HASH bucket and its corresponding next hop is recorded when the HASH bucket is initially allocated to the next hop in the ECMP group.
It should be noted that the initial allocation may be understood as the process of issuing the HASH buckets to all next hops of the ECMP group for the first time. Specifically, during the initial issuing, the HASH buckets are issued to each next hop regardless of whether the next hop is reachable; after the initial issuing of the HASH buckets is completed, any unreachable next hop is deleted.
In a specific implementation, the affinity between a HASH bucket and its corresponding next hop can be recorded when the HASH bucket is initially allocated to the next hop in the ECMP group. In the present example, the affinity of the HASH buckets initially allocated to the failed next hop is recorded as 1, and the affinity of the remaining positions is recorded as 0. It should be noted that, in practical applications, each HASH bucket records which next hop it has the greater affinity with; in the embodiment of the present invention, the initial allocation is used as the basis for affinity, with the initially allocated next hop given 1 and the rest 0. It can be understood that a person skilled in the art may choose the values and the basis of the affinity according to the actual situation, and the embodiment of the present invention is not limited thereto.
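A minimal sketch of the initial allocation and the affinity record might look as follows. The round-robin placement, the function name `initial_allocation`, and the per-bucket dictionary of affinities are assumptions for illustration; the text only states that affinity is 1 for the initially allocated next hop and 0 otherwise, and that unreachable next hops are deleted after the initial issuing.

```python
def initial_allocation(next_hops, reachable, num_buckets):
    """Issue buckets to all next hops, reachable or not, and record affinity.

    Returns (owners, affinity, to_delete):
      owners[i]       -- next hop initially owning bucket i (round-robin, assumed)
      affinity[i][nh] -- 1 if nh was the initial owner of bucket i, else 0
      to_delete       -- unreachable next hops to delete after the initial issuing
    """
    owners = [next_hops[i % len(next_hops)] for i in range(num_buckets)]
    affinity = [{nh: int(nh == owner) for nh in next_hops} for owner in owners]
    to_delete = [nh for nh in next_hops if nh not in reachable]
    return owners, affinity, to_delete
```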
Optionally, the update epoch value is attribute information of a HASH bucket. When a next hop in the ECMP group fails and/or the failed next hop recovers, the update epoch value of the HASH buckets corresponding to the next hops of the current ECMP group may be updated. In an embodiment of the present invention, the update epoch value of every updated HASH bucket is incremented when the update is caused by a next hop failure and decremented when the update is caused by a next hop recovery. It can be understood that a person skilled in the art may adjust how the update epoch value is calculated when it changes because of a next hop failure and/or recovery, and the embodiment of the present invention is not limited thereto.
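The increment and decrement rule can be written down directly. The function name `update_epochs` and the event strings are illustrative assumptions; only the buckets whose owner actually changed are touched.

```python
def update_epochs(epochs, changed_buckets, event):
    """Adjust the update epoch value of every bucket that changed owner.

    event -- "failure"  : the change was caused by a next hop failing (+1)
             "recovery" : the change was caused by a next hop recovering (-1)
    """
    if event not in ("failure", "recovery"):
        raise ValueError(f"unknown event: {event!r}")
    delta = 1 if event == "failure" else -1
    for i in changed_buckets:
        epochs[i] += delta
    return epochs
```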
Optionally, the priority value is used as the basis for allocating HASH buckets to the recovered next hop. The priority value of a HASH bucket is obtained in response to a weighting instruction for the affinity and the update epoch value; in other words, the priority value is obtained by a weighted calculation over the affinity and the update epoch value.
In a specific implementation, the next hops in the ECMP group are associated with a plurality of HASH buckets, and the HASH buckets correspond to HASH values. When the ECMP group is recovered, the affinity and the update epoch value corresponding to each HASH bucket are obtained, the priority value of the HASH bucket is derived from them, and the failed next hop is then recovered according to the priority values and the number of HASH buckets corresponding to each next hop in the ECMP group, so that HASH buckets are allocated to the recovered next hop. In this way the failed next hop can be recovered more quickly and load balancing can be re-established rapidly, while data flows on the non-failed next hops are affected little or not at all during recovery. In addition, once the failed path is recovered, data on the other links is steered onto the recovered path.
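One way to read the weighted calculation is as a linear combination of the two inputs. In the sketch below the default weights (2 for affinity, 1 for the update epoch value) are taken from the worked example given later in the description; the function names and the tie-break by bucket index are assumptions.

```python
def bucket_priority(affinity, epoch, affinity_weight=2, epoch_weight=1):
    """Weighted priority of one HASH bucket for a recovering next hop."""
    return affinity_weight * affinity + epoch_weight * epoch


def rank_buckets(affinities, epochs):
    """Bucket indices ordered from highest to lowest priority.

    Ties are broken by the lower bucket index (an arbitrary, assumed rule).
    """
    return sorted(
        range(len(epochs)),
        key=lambda i: (-bucket_priority(affinities[i], epochs[i]), i),
    )
```

Buckets that originally belonged to the recovering next hop and that have been moved most often rank first, so they are the first to migrate back.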
In order to make the technical solutions of the embodiments of the present invention better understood by those skilled in the art, the following is an exemplary description by way of an example:
referring to fig. 3, a schematic flow chart of a priority algorithm provided in the embodiment of the present invention is shown;
In order to ensure, as far as possible, that data flows on the non-failed paths are not affected during path switching, the system uses the priority value (priority) of each position as the basis for migrating it to a new position when a path recovers; that is, the priority value is the basis for allocating HASH buckets to the recovered next hop. The priority value is obtained by weighting the position affinity (affinity) and the update epoch value (epoch) of the state in which the maximum number of group members is first reached, and HASH buckets with a higher priority are migrated to the new path first. For simplicity of description, the present embodiment is illustrated with 16 HASH values (modulo HASH values) distributed over an ECMP group consisting of 4 next hops.
As shown in fig. 3, when the data streams are first delivered, a next hop is placed at the position associated with each HASH value, giving the initial map (the first map in the first row in fig. 3). When next hop 3 fails, the HASH buckets whose original next hop was 3 are allocated in turn to 1, 2 and 4, and the update epoch value (epoch) of each changed position is incremented by one (the second map in the second row in fig. 3), giving the second-step map (the second map in the first row in fig. 3). When next hop 2 then fails as well, since next hop 1 received the extra bucket in the previous round, allocation this time starts from 4; the buckets are allocated to 4 and 1 in turn so that 4 and 1 end up with the same number of buckets, and the update epoch value (epoch) of each changed position is incremented by one again, giving the third-step map (the third map in the first row in fig. 3).
When next hop 3 recovers, the priority of each position is calculated by weighting the affinity (which records whether the position initially belonged to 3) with the epoch value. In this example, half of the number of next hops is taken as the affinity weight, that is, the affinity weight is 2 and the update epoch value weight is 1. The positions with the highest priority in each value range are allocated to 3, and the update epoch value of each of these positions is changed by one, giving the fourth-step map. Similarly, when next hop 2 also recovers, the map is finally restored to the fifth-step map; the principle is the same as for the recovery of next hop 3 described above and is not repeated here.
It should be noted that the return to the original state shown in this figure is not necessarily achievable in every case. It can be understood that many different states are possible after recovery in practical applications, and the state shown in the above embodiment of the present invention is only one coincidental case.
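The walk-through above (3 fails, 2 fails, 3 recovers) can be replayed in a few lines. This is a sketch under several assumptions that are not stated in the text: the initial map is taken to be a round-robin `[1, 2, 3, 4] * 4` because fig. 3 is not reproduced here, freed buckets go to the least-loaded reachable next hop, the recovered next hop receives a share of `total // active` buckets, and ties in priority are broken by bucket index. The resulting map therefore need not match fig. 3 position for position.

```python
def fail(owners, epochs, failed, reachable):
    """Move the failed next hop's buckets to the least-loaded reachable next hops."""
    counts = {nh: owners.count(nh) for nh in reachable}
    for i, owner in enumerate(owners):
        if owner == failed:
            target = min(reachable, key=lambda nh: counts[nh])
            owners[i] = target
            counts[target] += 1
            epochs[i] += 1  # epoch is incremented for every changed position


def recover(owners, epochs, initial, recovered, active_count):
    """Give the recovered next hop the highest-priority buckets."""
    quota = len(owners) // active_count  # assumed share for the recovered next hop

    def priority(i):
        affinity = 1 if initial[i] == recovered else 0
        return 2 * affinity + 1 * epochs[i]  # weights 2 and 1 as in the example

    ranked = sorted(range(len(owners)), key=lambda i: (-priority(i), i))
    for i in ranked[:quota]:
        owners[i] = recovered
        epochs[i] -= 1  # epoch is decremented for every position changed by recovery


def run_example():
    initial = [1, 2, 3, 4] * 4  # assumed initial map for 16 buckets, 4 next hops
    owners, epochs = list(initial), [0] * 16
    fail(owners, epochs, failed=3, reachable=[1, 2, 4])
    fail(owners, epochs, failed=2, reachable=[1, 4])
    recover(owners, epochs, initial, recovered=3, active_count=3)
    return initial, owners, epochs
```

In this replay every bucket that initially belonged to next hop 3 returns to it, and the buckets that initially belonged to next hops 1 and 4 never change owner, which is the behaviour the priority scheme is meant to encourage.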
In addition, when the next hop of a path fails, the next hop information is recorded in a database, and an independent process sends periodic requests according to the content of the database so that the failed next hop is regained as soon as possible. On the basis that the distribution of data flows on the non-failed paths is not affected when a path fails, the affinity value and the update epoch value of each position (HASH bucket) are calculated when the path recovers, and the value obtained by weighting the two is used as the priority value of that position. The priority value then serves as the basis for migrating to a new position, that is, the basis for allocating HASH buckets to the recovered next hop, so that the impact on the existing data flows on the non-failed paths is reduced as far as possible.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the illustrated order of acts, as some steps may occur in other orders or concurrently in accordance with the embodiments of the present invention. Further, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no particular act is required to implement the invention.
Referring to fig. 4, a block diagram of a structure of a recovery apparatus for an ECMP group failure provided in the embodiment of the present invention is shown, where the ECMP group is composed of multiple next hops, and the next hops include next hop information, and specifically include the following modules:
a neighbor request sending module 401, configured to send a neighbor request to a next hop that fails according to a preset period when the next hop in the ECMP group fails;
a neighbor information obtaining module 402, configured to, when the failed next hop returns a neighbor response to the neighbor request, obtain, according to the neighbor response, next hop information corresponding to the failed next hop again and obtain neighbor information for the next hop information; wherein the neighbor response includes neighbor information corresponding to the failed next hop;
an ECMP group restoring module 403, configured to restore the failed next hop according to the neighbor information to restore the ECMP group.
In an alternative embodiment, the apparatus further comprises:
an information entry module, configured to record the next hop information corresponding to the failed next hop in a state database when a next hop in the ECMP group fails.
In an alternative embodiment, the apparatus further comprises:
an information deleting module, configured to delete the next hop information corresponding to the failed next hop from the state database.
In an optional embodiment, the ECMP group restoring module is specifically configured to:
recover the failed next hop, so as to recover the ECMP group, according to the correspondence between the failed next hop and the MAC address of the failed next hop.
In an alternative embodiment, the apparatus further comprises:
a next hop marking module, configured to mark a next hop as a failed next hop when the next hop in the ECMP group fails.
In an alternative embodiment, the apparatus further comprises:
a next hop deletion module, configured to delete the failed next hop from the ECMP group when the next hop is marked as a failed next hop.
In an optional embodiment, the next hop deletion module is specifically configured to:
delete the failed next hop from the ECMP group when the state of the outgoing interface of the failed next hop is DOWN.
In an optional embodiment, the next hop deletion module is specifically configured to:
delete the failed next hop from the ECMP group when the neighbor information fails.
In an optional embodiment, the ECMP group is provided with a bidirectional forwarding detection protocol, and the next hop deletion module is specifically configured to:
delete the failed next hop from the ECMP group when the bidirectional forwarding detection protocol detects that the connection to the next hop is interrupted.
In an alternative embodiment, the next hop in the ECMP group includes a plurality of HASH buckets, the HASH buckets corresponding to HASH values, and the apparatus further comprises:
a data obtaining module, configured to obtain the affinity and the update epoch value corresponding to the HASH bucket when the ECMP group is recovered;
a priority value obtaining module, configured to obtain a priority value of the HASH bucket according to the affinity and the update epoch value;
and a next hop recovery module, configured to recover the failed next hop according to the priority value and the number of HASH buckets corresponding to each next hop in the ECMP group, so as to allocate HASH buckets to the recovered next hop.
In an alternative embodiment, the apparatus further comprises:
a HASH bucket allocation module for allocating the HASH bucket to a next hop in the ECMP group in response to an allocation instruction of the HASH bucket.
In an optional embodiment, the data obtaining module is specifically configured to:
record the affinity between the HASH bucket and its corresponding next hop when the HASH bucket is initially allocated to the next hop in the ECMP group.
In an alternative embodiment, the apparatus further comprises:
an update epoch value updating module, configured to update the update epoch value of the HASH buckets corresponding to the next hops of the current ECMP group when a next hop in the ECMP group fails and/or the failed next hop is recovered.
In an optional embodiment, the priority value obtaining module is specifically configured to:
obtain the priority value of the HASH bucket in response to a weighting instruction for the affinity and the update epoch value.
Since the apparatus embodiment is substantially similar to the method embodiment, it is described relatively briefly; for relevant details, reference may be made to the corresponding description of the method embodiment.
In addition, an embodiment of the present invention further provides an electronic device, including a processor, a memory, and a computer program stored in the memory and capable of running on the processor. When executed by the processor, the computer program implements each process of the above ECMP group failure recovery method embodiment and can achieve the same technical effect; to avoid repetition, details are not described here again.
FIG. 5 is a schematic structural diagram of a computer-readable storage medium provided in an embodiment of the present invention;
the embodiment of the present invention further provides a computer-readable storage medium 501, where a computer program is stored on the computer-readable storage medium 501, and when the computer program is executed by a processor, the computer program implements each process of the above-mentioned ECMP group failure recovery method embodiment, and can achieve the same technical effect, and is not described here again to avoid repetition. The computer-readable storage medium 501 is, for example, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.
Fig. 6 is a schematic diagram of a hardware structure of an electronic device implementing various embodiments of the present invention.
The electronic device 600 includes, but is not limited to: a radio frequency unit 601, a network module 602, an audio output unit 603, an input unit 604, a sensor 605, a display unit 606, a user input unit 607, an interface unit 608, a memory 609, a processor 610, and a power supply 611. Those skilled in the art will appreciate that the electronic device configuration shown in fig. 6 does not constitute a limitation of electronic devices, which may include more or fewer components than shown, combine some components, or use a different arrangement of components. In the embodiment of the present invention, the electronic device includes, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted terminal, a wearable device, a pedometer, and the like.
It should be understood that, in the embodiment of the present invention, the radio frequency unit 601 may be used to receive and transmit signals during message transmission or a call. Specifically, it receives downlink data from a base station and passes the data to the processor 610 for processing; in addition, it transmits uplink data to the base station. Generally, the radio frequency unit 601 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 601 may also communicate with a network and other devices through a wireless communication system.
The electronic device provides wireless broadband internet access to the user via the network module 602, such as assisting the user in sending and receiving e-mails, browsing web pages, and accessing streaming media.
The audio output unit 603 may convert audio data received by the radio frequency unit 601 or the network module 602 or stored in the memory 609 into an audio signal and output as sound. Also, the audio output unit 603 can provide audio output related to a specific function performed by the electronic apparatus 600 (e.g., a call signal reception sound, a message reception sound, etc.). The audio output unit 603 includes a speaker, a buzzer, a receiver, and the like.
The input unit 604 is used to receive audio or video signals. The input unit 604 may include a Graphics Processing Unit (GPU) 6041 and a microphone 6042. The graphics processor 6041 processes image data of still pictures or video obtained by an image capturing apparatus (such as a camera) in a video capture mode or an image capture mode. The processed image frames may be displayed on the display unit 606. The image frames processed by the graphics processor 6041 may be stored in the memory 609 (or another storage medium) or transmitted via the radio frequency unit 601 or the network module 602. The microphone 6042 can receive sound and process it into audio data. In the phone call mode, the processed audio data may be converted into a format that can be transmitted to a mobile communication base station via the radio frequency unit 601 and then output.
The electronic device 600 also includes at least one sensor 605, such as a light sensor, motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor that can adjust the brightness of the display panel 6061 according to the brightness of ambient light, and a proximity sensor that can turn off the display panel 6061 and/or the backlight when the electronic apparatus 600 is moved to the ear. As one type of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in each direction (generally three axes), detect the magnitude and direction of gravity when stationary, and can be used to identify the posture of an electronic device (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), and vibration identification related functions (such as pedometer, tapping); the sensors 605 may also include fingerprint sensors, pressure sensors, iris sensors, molecular sensors, gyroscopes, barometers, hygrometers, thermometers, infrared sensors, etc., which are not described in detail herein.
The display unit 606 is used to display information input by the user or information provided to the user. The Display unit 606 may include a Display panel 6061, and the Display panel 6061 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.
The user input unit 607 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device. Specifically, the user input unit 607 includes a touch panel 6071 and other input devices 6072. Touch panel 6071, also referred to as a touch screen, may collect touch operations by a user on or near it (e.g., operations by a user on or near touch panel 6071 using a finger, stylus, or any other suitable object or attachment). The touch panel 6071 may include two parts of a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 610, receives a command from the processor 610, and executes the command. In addition, the touch panel 6071 can be implemented by various types such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. The user input unit 607 may include other input devices 6072 in addition to the touch panel 6071. Specifically, the other input devices 6072 may include, but are not limited to, a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a track ball, a mouse, and a joystick, which are not described herein again.
Further, the touch panel 6071 can be overlaid on the display panel 6061, and when the touch panel 6071 detects a touch operation thereon or nearby, the touch operation can be transmitted to the processor 610 to determine the type of the touch event, and then the processor 610 can provide a corresponding visual output on the display panel 6061 according to the type of the touch event. Although in fig. 6, the touch panel 6071 and the display panel 6061 are two independent components to implement the input and output functions of the electronic device, in some embodiments, the touch panel 6071 and the display panel 6061 may be integrated to implement the input and output functions of the electronic device, which is not limited herein.
The interface unit 608 is an interface for connecting an external device to the electronic apparatus 600. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 608 may be used to receive input (e.g., data information, power, etc.) from external devices and transmit the received input to one or more elements within the electronic device 600 or may be used to transmit data between the electronic device 600 and external devices.
The memory 609 may be used to store software programs as well as various data. The memory 609 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, a phonebook, etc.) created according to the use of the electronic device, etc. Further, the memory 609 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device.
The processor 610 is a control center of the electronic device, connects various parts of the whole electronic device by using various interfaces and lines, performs various functions of the electronic device and processes data by running or executing software programs and/or modules stored in the memory 609, and calling data stored in the memory 609, thereby performing overall monitoring of the electronic device. Processor 610 may include one or more processing units; preferably, the processor 610 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 610.
The electronic device 600 may further include a power supply 611 (e.g., a battery) for supplying power to the various components, and preferably, the power supply 611 may be logically connected to the processor 610 via a power management system, such that the power management system may be used to manage charging, discharging, and power consumption.
In addition, the electronic device 600 includes some functional modules that are not shown, and are not described in detail herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of another like element in a process, method, article, or apparatus that comprises the element.
Through the description of the foregoing embodiments, it is clear to those skilled in the art that the method of the foregoing embodiments may be implemented by software plus a necessary general hardware platform, and certainly may also be implemented by hardware, but in many cases, the former is a better implementation. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative: the division of the units is only one logical division, and other divisions are possible in practice; a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention or a part thereof which substantially contributes to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U disk, a removable hard disk, a ROM, a RAM, a magnetic disk or an optical disk, and various media capable of storing program codes.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (21)

1. A method for recovering a failure of an ECMP group, wherein the ECMP group comprises a plurality of next hops, and wherein the next hops include next hop information, the method comprising:
when the next hop in the ECMP group fails, sending a neighbor request to the failed next hop according to a preset period;
when the failed next hop returns a neighbor response to the neighbor request, re-acquiring, according to the neighbor response, the next hop information corresponding to the failed next hop and obtaining neighbor information for the next hop information; wherein the neighbor response includes the neighbor information corresponding to the failed next hop;
and recovering the failed next hop according to the neighbor information, so as to recover the ECMP group.
2. The method of claim 1, further comprising:
and when the next hop in the ECMP group fails, inputting the next hop information corresponding to the failed next hop into a state database.
3. The method of claim 2, wherein after recovering the failed next hop according to the neighbor information to recover the ECMP group, the method further comprises:
and deleting the next hop information corresponding to the failed next hop from the state database.
4. The method of claim 1, wherein the next hop information comprises an IP address corresponding to the next hop and an egress interface of the next hop.
5. The method of claim 4, wherein the neighbor information is a correspondence between the next hop and the MAC address of the next hop.
6. The method of claim 5, wherein recovering the failed next hop according to the neighbor information to recover the ECMP group comprises:
and recovering the failed next hop according to the correspondence between the failed next hop and the MAC address of the failed next hop, so as to recover the ECMP group.
7. The method of claim 1, further comprising:
and when the next hop in the ECMP group fails, marking the next hop as the failed next hop.
8. The method of claim 7, further comprising:
and deleting the failed next hop from the ECMP group when the next hop is marked as the failed next hop.
9. The method of claim 8, wherein deleting the failed next hop from the ECMP group when the next hop is marked as the failed next hop comprises:
and deleting the failed next hop from the ECMP group when the state of the egress interface of the failed next hop is DOWN.
10. The method of claim 8, wherein deleting the failed next hop from the ECMP group when the next hop is marked as the failed next hop comprises:
and deleting the failed next hop from the ECMP group when the neighbor information becomes invalid.
11. The method of claim 8, wherein the ECMP group is configured with a bidirectional forwarding detection protocol, and wherein deleting the failed next hop from the ECMP group when the next hop is marked as the failed next hop comprises:
and deleting the failed next hop from the ECMP group when the bidirectional forwarding detection protocol detects that the connection to the next hop is interrupted.
12. The method of claim 1, wherein the next hop in the ECMP group comprises a plurality of HASH buckets, and wherein the HASH buckets have HASH values associated therewith, the method further comprising:
when the ECMP group is recovered, acquiring the affinity and the updated surrogate value corresponding to the HASH bucket;
obtaining a numerical value of the priority of the HASH bucket according to the affinity and the updated surrogate value;
and recovering the failed next hop according to the numerical value of the priority and the number of HASH buckets corresponding to each next hop in the ECMP group so as to allocate the HASH buckets to the recovered next hop.
13. The method of claim 12, wherein the priority value is used as a basis for the HASH bucket to be allocated to the recovered next hop.
14. The method of claim 12, further comprising:
allocating the HASH bucket to a next hop in the ECMP group in response to an allocation instruction of the HASH bucket.
15. The method of claim 14, wherein obtaining the affinity and the updated surrogate value corresponding to the HASH bucket comprises:
recording the affinity between the HASH bucket and its corresponding next hop when the HASH bucket is initially assigned to the next hop in the ECMP group.
16. The method of claim 12, wherein the updated surrogate value is attribute information of the HASH bucket.
17. The method of claim 16, further comprising:
and when the next hop in the ECMP group fails and/or the failed next hop recovers, updating the updated surrogate value of the HASH bucket corresponding to the next hop of the current ECMP group.
18. The method of claim 12, wherein obtaining the numerical value of the priority of the HASH bucket according to the affinity and the updated surrogate value comprises:
obtaining the numerical value of the priority of the HASH bucket in response to a weighting instruction for the affinity and the updated surrogate value.
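As an illustration only, the HASH bucket reallocation of claims 12 to 18 might be sketched as follows. The claims do not fix the weighting formula, so everything here is an assumption: the `Bucket` fields, the weights `w_aff` and `w_upd`, the use of a reassignment counter as the updated surrogate value, and the fair-share rule (total buckets divided by the number of next hops) are hypothetical names and choices, not the patented implementation.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Bucket:
    index: int
    owner: str        # next hop currently serving this bucket
    original: str     # next hop the bucket was first assigned to (source of affinity)
    updates: int = 0  # assumed stand-in for the updated surrogate value


def priority(bucket: Bucket, recovered: str,
             w_aff: float = 10.0, w_upd: float = 1.0) -> float:
    """Hypothetical weighting of affinity and the update counter."""
    affinity = 1.0 if bucket.original == recovered else 0.0
    return w_aff * affinity + w_upd * bucket.updates


def restore(buckets: List[Bucket], recovered: str, hops: List[str]) -> List[int]:
    """Give the recovered next hop its fair share, highest priority first."""
    share = len(buckets) // len(hops)
    counts = Counter(b.owner for b in buckets)
    moved: List[int] = []
    for b in sorted(buckets, key=lambda x: priority(x, recovered), reverse=True):
        if len(moved) == share:
            break
        if counts[b.owner] > share:  # only take from next hops above their share
            counts[b.owner] -= 1
            b.owner = recovered
            b.updates += 1           # the bucket changed hands again
            moved.append(b.index)
    return moved


# Six buckets over next hops A, B, C; B failed earlier and its two buckets
# (2 and 3) were moved to A and C, which bumped their update counters.
buckets = [
    Bucket(0, "A", "A"), Bucket(1, "A", "A"),
    Bucket(2, "A", "B", updates=1), Bucket(3, "C", "B", updates=1),
    Bucket(4, "C", "C"), Bucket(5, "C", "C"),
]
moved = restore(buckets, recovered="B", hops=["A", "B", "C"])
```

In this sketch only the two displaced buckets return to B, so flows hashed to the buckets that stayed on A and C are not disturbed, which is the property the abstract attributes to the recovery process.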
19. An apparatus for recovering an ECMP group failure, wherein the ECMP group comprises a plurality of next hops, and the next hops include next hop information, the apparatus comprising:
a neighbor request sending module, configured to send a neighbor request to a failed next hop according to a preset period when the next hop in the ECMP group fails;
a neighbor information obtaining module, configured to re-obtain, according to the neighbor response, next hop information corresponding to the failed next hop and obtain neighbor information for the next hop information, when the failed next hop returns a neighbor response for the neighbor request; wherein the neighbor response includes neighbor information corresponding to the failed next hop;
and the ECMP group recovery module is used for recovering the failed next hop according to the neighbor information so as to recover the ECMP group.
20. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other via the communication bus;
the memory is used for storing a computer program;
the processor is configured to implement the method of any of claims 1-18 when executing the program stored in the memory.
21. A computer-readable storage medium having instructions stored thereon, which when executed by one or more processors, cause the processors to perform the method of any one of claims 1-18.
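For readers who prefer code, the recovery flow recited in claims 1 to 3 (and mirrored by the three modules of claim 19) can be sketched roughly as below. This is a minimal sketch under stated assumptions: `EcmpGroup`, `mark_failed`, `probe_once`, and the `send_request` callback are hypothetical names, the callback stands in for a real neighbor request such as an ARP or Neighbor Discovery probe, and a timer is assumed to call `probe_once` once per preset period.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set, Tuple

# (IP address, egress interface), following the next hop information of claim 4
NextHop = Tuple[str, str]


@dataclass
class EcmpGroup:
    members: Set[NextHop] = field(default_factory=set)
    neighbors: Dict[NextHop, str] = field(default_factory=dict)  # next hop -> MAC
    state_db: Set[NextHop] = field(default_factory=set)          # failed next hops

    def mark_failed(self, hop: NextHop) -> None:
        """Drop a failed next hop from the group and record it for probing."""
        self.members.discard(hop)
        self.neighbors.pop(hop, None)
        self.state_db.add(hop)

    def probe_once(self, send_request: Callable[[NextHop], Optional[str]]) -> None:
        """One probing period: send a neighbor request to each failed next hop.

        send_request returns the MAC address carried in the neighbor response,
        or None when no response arrived in this period.
        """
        for hop in list(self.state_db):
            mac = send_request(hop)
            if mac is None:
                continue                # still failed, retry next period
            self.neighbors[hop] = mac   # neighbor information from the response
            self.members.add(hop)       # the next hop rejoins the ECMP group
            self.state_db.discard(hop)  # remove its entry from the state database


hop_a: NextHop = ("10.0.0.1", "eth0")
hop_b: NextHop = ("10.0.0.2", "eth1")
group = EcmpGroup(members={hop_a, hop_b})
group.mark_failed(hop_b)
group.probe_once(lambda hop: None)                 # no response yet
group.probe_once(lambda hop: "aa:bb:cc:dd:ee:ff")  # response received
```

Only entries in the state database are probed, so next hops that never failed are left untouched while the failed one is being restored.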
CN202310239663.8A 2023-03-14 2023-03-14 ECMP group failure recovery method and device, electronic equipment and storage medium Active CN115955434B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310239663.8A CN115955434B (en) 2023-03-14 2023-03-14 ECMP group failure recovery method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115955434A true CN115955434A (en) 2023-04-11
CN115955434B CN115955434B (en) 2023-05-30

Family

ID=85907016

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310239663.8A Active CN115955434B (en) 2023-03-14 2023-03-14 ECMP group failure recovery method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115955434B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105721321A (en) * 2014-12-02 2016-06-29 中兴通讯股份有限公司 Equal-cost multi-path outbound interface updating method and equal-cost multi-path outbound interface updating device
CN115514702A (en) * 2022-09-16 2022-12-23 苏州盛科科技有限公司 Method and device for quickly switching link, electronic equipment and storage medium
WO2023273937A1 (en) * 2021-06-29 2023-01-05 中兴通讯股份有限公司 Equal cost multi-path routing management method, switch, switch system, and storage medium

Also Published As

Publication number Publication date
CN115955434B (en) 2023-05-30

Similar Documents

Publication Publication Date Title
CN111143005B (en) Application sharing method, electronic equipment and computer readable storage medium
CN108509299B (en) Message processing method, device and computer readable storage medium
CN110474841B (en) Service request routing processing method and terminal equipment
CN107979461A (en) Secret key method for retrieving, device, terminal, key escrow server and computer-readable recording medium
CN115617278B (en) Path device selection method and device, electronic device and readable storage medium
CN112422711B (en) Resource allocation method and device, electronic equipment and storage medium
CN108282405B (en) Application program interface cache management method, application server and storage medium
CN101404620A (en) Method for creating routing list item and switching equipment
CN109831359B (en) Method for detecting connection state of data network and terminal equipment thereof
CN109254972B (en) Offline command word bank updating method, terminal and computer readable storage medium
CN111444237A (en) Server system, data transmission method and electronic equipment
CN113950125A (en) Method, device and storage medium for network acceleration in application program
CN109284110B (en) Terminal application replacement method, terminal and computer readable storage medium
CN107317828A (en) Document down loading method and device
CN115955434B (en) ECMP group failure recovery method and device, electronic equipment and storage medium
CN110213069B (en) Data forwarding method and device, disaster recovery system and storage medium
CN112395106A (en) Process management method, mobile terminal, and computer-readable storage medium
CN109818967B (en) Notification method, server, mobile terminal and computer readable storage medium
CN111083009A (en) Packet capturing method and device and mobile terminal
CN115695309A (en) Access control list rule configuration method and device, electronic equipment and storage medium
CN115834460A (en) Calculation force resource allocation method and device, electronic equipment and readable storage medium
CN109918340B (en) File processing method and terminal equipment
CN109710125B (en) Application control method, terminal and computer readable storage medium
CN115987890B (en) Method, device, electronic equipment and storage medium for cross-cluster access to virtual IP address
CN113032361B (en) Database configuration changing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant