CN106878382B - Method and device for dynamically changing cluster scale in distributed arbitration cluster - Google Patents

Info

Publication number
CN106878382B
Authority
CN
China
Prior art keywords
node
cluster
nodes
master
instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201611248514.4A
Other languages
Chinese (zh)
Other versions
CN106878382A (en
Inventor
章立刚
卢忠亚
谢江帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201611248514.4A priority Critical patent/CN106878382B/en
Publication of CN106878382A publication Critical patent/CN106878382A/en
Application granted granted Critical
Publication of CN106878382B publication Critical patent/CN106878382B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/104Peer-to-peer [P2P] networks
    • H04L67/1044Group management mechanisms 
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0654Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0659Management of faults, events, alarms or notifications using network fault recovery by isolating or reconfiguring faulty entities
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/104Peer-to-peer [P2P] networks
    • H04L67/1044Group management mechanisms 
    • H04L67/1048Departure or maintenance mechanisms
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/104Peer-to-peer [P2P] networks
    • H04L67/1044Group management mechanisms 
    • H04L67/1051Group master selection mechanisms
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/104Peer-to-peer [P2P] networks
    • H04L67/1061Peer-to-peer [P2P] networks using node-based peer discovery mechanisms
    • H04L67/1068Discovery involving direct consultation or announcement among potential requesting and potential source peers
    • H04L67/107Discovery involving direct consultation or announcement among potential requesting and potential source peers with limitation or expansion of the discovery scope
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/28Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/40Bus networks
    • H04L12/407Bus networks with decentralised control
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00Data switching networks
    • H04L12/28Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/40Bus networks
    • H04L12/407Bus networks with decentralised control
    • H04L12/413Bus networks with decentralised control with random access, e.g. carrier-sense multiple-access with collision detection [CSMA-CD]
    • H04L12/4135Bus networks with decentralised control with random access, e.g. carrier-sense multiple-access with collision detection [CSMA-CD] using bit-wise arbitration

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Multi Processors (AREA)
  • Hardware Redundancy (AREA)

Abstract

This application relates to the field of computer technologies and discloses a method and a device for dynamically changing the cluster size in a distributed arbitration cluster, which address the dual-master, multi-master, and no-master problems that can arise while the size of a distributed arbitration cluster is being changed dynamically. The method comprises the following steps: a master node receives an instruction to change the cluster size and determines the M nodes that the distributed arbitration cluster includes after the change, where M is a positive integer and the M nodes comprise the master node and (M-1) slave nodes; the master node forwards the instruction to the (M-1) slave nodes and receives the confirmation responses returned by the (M-1) slave nodes, where a confirmation response indicates agreement to change the cluster size; and if the master node receives heartbeat information from L slave nodes within one heartbeat period, it performs the operation of changing the cluster size, where $L \geq \lceil M/2 \rceil - 1$ and L is a positive integer.

Description

Method and device for dynamically changing cluster scale in distributed arbitration cluster
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for dynamically changing a cluster size in a distributed arbitration cluster.
Background
In a multiprocessing system, several devices or modules may request the right to use the bus at the same time. To avoid bus conflicts, a bus arbitration mechanism must control and manage the requesters in the system that need to occupy the bus: when several requesters issue bus requests simultaneously, an arbitration method decides which requester obtains the right to use the bus. Arbitration methods can be divided into centralized arbitration and distributed arbitration. Centralized arbitration uses a single arbiter to arbitrate among the requests of multiple processors. In distributed arbitration, each arbitration node has its own arbiter. An arbitration node may be referred to simply as a node.
The system in which distributed arbitration is located may be referred to as a distributed system, which comprises a number of nodes forming an arbitration cluster. The arbitration cluster may integrate the resources to provide service to the outside as a whole. At most only one master node is allowed in the arbitration cluster. Currently, there are various arbitration algorithms to select master nodes, such as Fast Leader election algorithm, Raft election algorithm, etc.
During the operation of a distributed system, the user may need to change the size of the arbitration cluster dynamically, that is, to add or delete nodes. Currently, arbitration clusters in distributed systems generally either do not support size changes, or support them only with restrictions, or suffer from "dual master", "multi-master", or "no master" problems during the size change, where "dual master" means that there are two master nodes, "multi-master" means that there are more than two master nodes, and "no master" means that there is no master node.
Disclosure of Invention
The embodiment of the application provides a method and a device for dynamically changing the cluster size in a distributed arbitration cluster, which are used for solving the problems of double masters, multiple masters or no master in the process of dynamically changing the distributed arbitration cluster size.
The embodiment of the application provides the following specific technical scheme:
In a first aspect, a method for dynamically changing the cluster size in a distributed arbitration cluster is provided. The distributed arbitration cluster includes a master node. After all nodes in the cluster agree to the change operation, the master node determines whether it can maintain its identity as the master node of the distributed arbitration cluster; the condition for maintaining the master is that half of the nodes, counting the master node itself, maintain the master-slave signal connection. After determining that it can maintain its master identity, the master node persists the operation of changing the cluster size. In this way, the role of the master node remains unchanged while the cluster size is changed. The master-maintenance condition is set independently of, and kept separate from, the master-promotion condition, which improves the availability of the system and ensures that the master is not lost during capacity expansion.
In one possible design, the master node receives an instruction to change the cluster size and determines the M nodes that the distributed arbitration cluster includes after the cluster size is changed, where M is a positive integer and the M nodes comprise the master node and (M-1) slave nodes. The master node forwards the instruction to the (M-1) slave nodes and receives the confirmation responses returned by the (M-1) slave nodes, where a confirmation response indicates agreement to change the cluster size. If the master node receives heartbeat information from L slave nodes within one heartbeat period, it performs the operation of changing the cluster size, where $L \geq \lceil M/2 \rceil - 1$ and L is a positive integer. In this way, application scenarios such as single-node capacity expansion, multi-node capacity expansion, and capacity reduction can all be supported, and neither a no-master nor a multi-master situation occurs in any of these scenarios.
In one possible design, the master node receiving the instruction to change the cluster size may be implemented by: the master node receives an instruction of adding a first node, or the master node receives an instruction of deleting a second node, wherein the second node is a node different from the master node, and the distributed arbitration cluster does not include the first node and includes the second node.
In one possible design, every node has a permission attribute that indicates whether the node has the right to be elected; the permission attribute can be adjusted dynamically. Before the operation of changing the cluster size is executed, the permission attribute of the first node is set to not having the right to be elected. This avoids a no-master or multi-master situation after the first node is added.
In one possible design, before the master node receives an instruction to delete a second node, the communication connection between the second node and the M nodes is disconnected; or, before the master node receives the instruction to delete the second node, the permission attribute of the second node is set to not having the right to be elected. This prevents the distributed system from failing when the second node to be deleted is the master node.
In one possible design, the master node holds configuration information that includes a master-promotion condition and a master-maintenance condition. The master-promotion condition is: when the distributed arbitration cluster includes N nodes, K nodes agree that the node is promoted to master, where $K \geq \lfloor N/2 \rfloor + 1$ and K and N are positive integers. The master-maintenance condition is: when the distributed arbitration cluster includes N nodes, the master node receives heartbeat information from P slave nodes within one heartbeat period, where $P \geq \lceil N/2 \rceil - 1$ and P and N are positive integers. Separating the master-promotion condition from the master-maintenance condition improves the availability of the system and ensures that the master is not lost during capacity expansion.
In a second aspect, an apparatus for dynamically changing the cluster size in a distributed arbitration cluster is provided. The apparatus has the function of implementing the behavior of the master node in the first aspect or in any one of the possible designs of the first aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above function.
In one possible design, the apparatus includes a transceiver, a memory, and a processor, where the memory is configured to store a set of programs and the processor is configured to invoke the programs stored in the memory to perform the method described in the first aspect or in any one of the possible designs of the first aspect.
In a third aspect, a distributed arbitration cluster is provided, which includes the master node described in the first aspect or in any one of the possible designs of the first aspect.
In a fourth aspect, a computer storage medium is provided for storing computer software instructions used by the master node in the above aspects, the instructions comprising a program designed to perform the above aspects.
Drawings
FIG. 1 is a block diagram of a distributed arbitration cluster according to an embodiment of the present application;
FIG. 2 is a schematic flow chart illustrating a method for dynamically changing cluster size in a distributed arbitration cluster according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of expanding a single node to two nodes according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of multi-node capacity expansion according to an embodiment of the present application;
FIG. 5 is a first schematic flowchart of capacity reduction according to an embodiment of the present application;
FIG. 6 is a second schematic flowchart of capacity reduction according to an embodiment of the present application;
FIG. 7 is a first schematic structural diagram of an apparatus for dynamically changing the cluster size in a distributed arbitration cluster according to an embodiment of the present application;
FIG. 8 is a second schematic structural diagram of an apparatus for dynamically changing the cluster size in a distributed arbitration cluster according to an embodiment of the present application.
Detailed Description
The embodiments of the present application will be described in detail below with reference to the accompanying drawings.
The embodiments of the present application may be applied to a distributed system, including but not limited to a distributed arbitration cluster; in the following description, the distributed arbitration cluster may be referred to simply as the cluster. In a distributed system, a user may need to change the size of the arbitration cluster dynamically. In the embodiments of the present application, after all nodes in the cluster agree to the change operation, the master node determines whether it can maintain its own master identity; the condition for maintaining the master is that half of the nodes, counting the master node itself, maintain the master-slave signal connection. After determining that it can maintain its master identity, the master node persists the operation of changing the cluster size. In this way, the role of the master node remains unchanged while the cluster size is changed. The master-maintenance condition is set independently of, and kept separate from, the master-promotion condition, which improves the availability of the system and ensures that the master is not lost during capacity expansion.
As shown in fig. 1, the size of the distributed arbitration cluster 100 in the embodiment of the present application may include one node, or include two or more nodes. Distributed arbitration cluster 100 includes a master node 101. If the cluster size is greater than or equal to two nodes, the distributed arbitration cluster 100 includes at least one slave node 102 in addition to one master node 101. The distributed arbitration cluster 100 may integrate resources to provide services externally as a whole. Only one master node 101 at most is allowed in the distributed arbitration cluster 100.
The master node 101 is elected by the nodes. For example, the arbitration algorithm used to select the master node may be the FastLeader election algorithm or the Raft election algorithm; this is not limited in this application.
In the embodiments of the present application, each node in the distributed arbitration cluster 100 is configured with several local attributes: an identity (ID), which is unique, that is, each node corresponds to one unique identity; a communication address, such as an Internet Protocol (IP) address, used to communicate with other nodes; and a permission attribute, which indicates whether the node has the right to be elected. When the master node is elected, votes can be cast for a node that has the right to be elected but not for a node that lacks it. Setting the permission attribute therefore controls whether a node can be promoted to master, and can also be used to demote a node from master node to slave node. The permission attribute can be adjusted dynamically.
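As an illustration only, the three local attributes could be modeled as a small record. The class name `NodeAttributes`, the field names, and the helper `candidates` are hypothetical and do not come from the application; the sketch merely shows how a dynamically adjustable permission attribute would remove a node from, or return it to, the set of election candidates.

```python
from dataclasses import dataclass


@dataclass
class NodeAttributes:
    """Hypothetical container for the local attributes of one node."""
    node_id: str            # unique identity of the node
    address: str            # communication address, e.g. an IP address
    electable: bool = True  # permission attribute; may be changed at run time


def candidates(nodes):
    """Return the IDs of the nodes that may receive votes in a master election."""
    return [n.node_id for n in nodes if n.electable]


nodes = [
    NodeAttributes("node-1", "10.0.0.1"),
    NodeAttributes("node-2", "10.0.0.2", electable=False),  # e.g. a node being added
]
print(candidates(nodes))

# Dynamic adjustment: once the change is persisted, the new node becomes electable.
nodes[1].electable = True
print(candidates(nodes))
```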
The master node 101 also holds configuration information, which includes a master-promotion condition and a master-maintenance condition;
the master-promotion condition is: when the distributed arbitration cluster includes N nodes, K nodes agree that the node is promoted to master, where $K \geq \lfloor N/2 \rfloor + 1$ and K and N are positive integers;
the master-maintenance condition is: when the distributed arbitration cluster includes N nodes, the master node receives heartbeat information from P slave nodes within one heartbeat period, where $P \geq \lceil N/2 \rceil - 1$ and P and N are positive integers.
Optionally, each node includes the configuration information.
In summary, a node can be promoted to master only if more than half of the nodes in the cluster, counting the node itself, vote for it; during operation, the master can remain master as long as the number of slave nodes whose heartbeats are lost, not counting the master itself, does not exceed half of the cluster size.
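The two conditions can be sketched as integer thresholds. This is one reading of the summary above, not the formulas from the original figures: promotion is taken as "more than N/2 votes, own vote included", and maintenance as "the master plus the slaves it hears from make up at least N/2 nodes". The function names are hypothetical.

```python
def promotion_threshold(n: int) -> int:
    """Smallest K with K > n/2, where the votes include the candidate's own."""
    return n // 2 + 1


def maintenance_threshold(n: int) -> int:
    """Smallest P with P + 1 >= n/2, where P counts slave heartbeats only."""
    return (n + 1) // 2 - 1


def can_promote(votes: int, n: int) -> bool:
    return votes >= promotion_threshold(n)


def can_maintain(slave_heartbeats: int, n: int) -> bool:
    return slave_heartbeats >= maintenance_threshold(n)


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, promotion_threshold(n), maintenance_threshold(n))
```

Under this reading the maintenance threshold is always lower than the promotion threshold, which is what allows a sitting master to survive a size change that would not by itself elect a new master.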
In practical applications, the master node 101 and the slave nodes 102 confirm the master-slave relationship by exchanging heartbeat information. The master node 101 periodically sends heartbeat information to all slave nodes 102 according to the heartbeat period, and each slave node 102 periodically replies with heartbeat information to the master node 101; the slave nodes 102 do not send heartbeat information to one another. From the heartbeat information, the master node 101 can determine whether a slave node 102 still maintains the master-slave relationship with it.
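A minimal sketch of the bookkeeping this implies on the master side follows. The class `HeartbeatTracker` and its methods are hypothetical; the application does not specify how heartbeat replies are recorded, and timers and network I/O are deliberately left out.

```python
class HeartbeatTracker:
    """Counts which known slaves replied during the current heartbeat period."""

    def __init__(self, slave_ids):
        self.slave_ids = set(slave_ids)
        self.seen = set()

    def record(self, slave_id):
        # Replies from unknown nodes are ignored; repeated replies count once.
        if slave_id in self.slave_ids:
            self.seen.add(slave_id)

    def end_period(self):
        """Return the number of distinct slaves heard from and start a new period."""
        count = len(self.seen)
        self.seen = set()
        return count


tracker = HeartbeatTracker({"node-2", "node-3", "node-4"})
tracker.record("node-2")
tracker.record("node-2")   # duplicate within the same period
tracker.record("node-9")   # not a member of this cluster
print(tracker.end_period())
```

The count returned by `end_period` is the quantity that the master-maintenance condition compares against its threshold.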
Based on the architecture of the distributed arbitration cluster shown in fig. 1, as shown in fig. 2, a flow of a method for dynamically changing a cluster size in the distributed arbitration cluster provided in the embodiment of the present application is as follows.
Step 201: and the master node receives the instruction for changing the cluster size and determines M nodes included in the distributed arbitration cluster after the cluster size is changed.
M is a positive integer, and M nodes comprise a master node and (M-1) slave nodes.
The instruction for changing the cluster size received by the master node may be an instruction for capacity expansion or an instruction for capacity reduction. Capacity expansion refers to adding nodes in an original cluster, and capacity reduction refers to deleting nodes in the cluster. The instruction intelligently adds or deletes a node in one execution.
For example, the master node receives an instruction to add a first node, or the master node receives an instruction to delete a second node, where the second node is a node different from the master node;
wherein the distributed arbitration cluster does not include the first node and includes the second node. That is, the first node is a node outside the cluster and the second node is a node inside the cluster.
If the instruction is an instruction for increasing the first node, M is the cluster scale after capacity expansion, and if the instruction is an instruction for deleting the second node, M is the cluster scale after capacity expansion, that is, the number of nodes in the cluster after deletion of the second node.
Step 202: the master node forwards the instruction to the (M-1) slave nodes and receives acknowledgement responses returned by the (M-1) slave nodes.
Wherein the confirmation response is used to characterize an agreement to change the cluster size.
Specifically, on receiving the instruction, the master node needs to ask each node whether it agrees to the operation of changing the cluster size. After receiving confirmation responses from all the nodes, the master node determines that all the nodes agree to the operation. If, after forwarding the instruction, the master node does not receive a confirmation response from one of the nodes, that node has either left the cluster or does not agree to the operation; the master node then performs no subsequent operation, which ensures the safety of the distributed system.
Step 203: if the master node receives heartbeat information from L slave nodes within one heartbeat period, it performs the operation of changing the cluster size, where $L \geq \lceil M/2 \rceil - 1$ and L is a positive integer.
After receiving the confirmation response from each node, the master node needs to make sure that it can satisfy the master-maintenance condition. Once it determines that the condition is satisfied, it performs the operation of changing the cluster size, for example, writing the information about the added node to disk or writing the information about the deleted node to disk.
Specifically, if the instruction is an instruction to add the first node, the permission attribute of the first node is set to not having the right to be elected before the addition of the first node is written to disk, and is set to having the right to be elected after the addition is written to disk.
In this way, the no-master problem is avoided when a single-node cluster is expanded to two nodes. For example, if the original cluster contains only one node, that node is the master node. The master node receives an instruction to add the first node and determines that M is 2. Because the permission attribute of the first node is set to not having the right to be elected, the master node does not change after the first node joins the cluster. Conversely, if the permission attribute of the first node were not set this way, the cluster could end up with no master after the first node joins.
If the instruction is an instruction to delete the second node, then before the master node receives the instruction, the communication connection between the second node and the M nodes is disconnected, for example, the second node is network-isolated from the M nodes or the second node is powered off;
or, before the master node receives the instruction to delete the second node, the permission attribute of the second node is set to not having the right to be elected.
Therefore, if the second node is the master node of the original cluster, then to prevent a no-master problem after the second node is deleted, the communication connection between the second node and the M nodes is first disconnected; the M nodes then elect a new master node; the new master node receives the instruction to delete the second node and sends it to the (M-1) slave nodes and the second node; and only the confirmation responses returned by the (M-1) slave nodes need to be received, without any feedback from the second node.
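Steps 201 to 203 can be summarized as a single decision made by the master. The following is a minimal sketch under stated assumptions: the function name and parameters are hypothetical, messages are reduced to sets of node IDs, and the heartbeat threshold uses the reading "master plus reporting slaves make up at least half of the M nodes".

```python
def may_persist_change(new_members, master_id, acks, heartbeats_in_period):
    """Return True if the master may write the size change to disk.

    new_members          -- IDs of the M nodes after the change (master included)
    acks                 -- IDs of slaves that returned a confirmation response
    heartbeats_in_period -- IDs of slaves heard from within one heartbeat period
    """
    slaves = set(new_members) - {master_id}

    # Step 202: every one of the (M-1) slaves must agree to the change.
    if not slaves <= set(acks):
        return False

    # Step 203: the master-maintenance condition, evaluated against the new size M.
    m = len(set(new_members))
    needed = (m + 1) // 2 - 1
    heard = len(slaves & set(heartbeats_in_period))
    return heard >= needed


# Expanding a 4-node cluster to 5 nodes; node 1 is the master.
print(may_persist_change([1, 2, 3, 4, 5], 1, acks={2, 3, 4, 5},
                         heartbeats_in_period={2, 3}))
```

Note the asymmetry in the sketch: agreement must be unanimous among the (M-1) slaves, while the heartbeat check only needs a subset of them.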
The embodiments of the present application will be described in further detail below with reference to specific application scenarios.
Scenario one: expanding a single node to two nodes. The original node is denoted node 1 and the added node is denoted node 2.
As shown in FIG. 3, the single-node capacity expansion flow is as follows:
Step 301: node 1 receives a capacity expansion instruction to add node 2.
Step 302: node 1 sends a capacity expansion request to node 2.
The capacity expansion request carries the node list of the current cluster and the local attributes of each node in the node list.
Step 303: node 2 returns a confirmation response to node 1.
The confirmation response indicates agreement to join the cluster.
Step 304: node 1 determines that the master-maintenance condition is met and writes the transaction of adding node 2 to the cluster to disk.
Step 305: node 1 instructs node 2 to write the transaction of adding node 2 to the cluster to disk.
The configuration information of node 1 includes the master-promotion condition and the master-maintenance condition; the specific conditions are the same as those described in the above embodiment and are not repeated here. Before node 2 joins the cluster, its permission attribute is configured as not having the right to be elected, so that a no-master situation does not occur when it joins the cluster.
Scenario two: multi-node capacity expansion. The original cluster has n nodes, denoted node 1 to node n, where n ≥ 2 and n is a positive integer; node 1 is the master node, and the added node is denoted node (n+1).
As shown in FIG. 4, the multi-node capacity expansion flow is as follows:
Step 401: node 1 receives a capacity expansion instruction to add node (n+1).
Step 402: node 1 sends capacity expansion requests to nodes 2 to n and to node (n+1).
The capacity expansion request sent to nodes 2 to n carries the local attributes of node (n+1); the capacity expansion request sent to node (n+1) carries the node list of the current cluster and the local attributes of nodes 1 to n.
Step 403: node 1 receives the confirmation responses returned by nodes 2 to n and by node (n+1).
Step 404: within one heartbeat period, node 1 determines that the master-maintenance condition is met, that is, node 1 receives heartbeat information from no fewer than $\lceil (n+1)/2 \rceil - 1$ slave nodes, and then writes the transaction of adding node (n+1) to the cluster to disk.
For example, if n is 4, the cluster size after expansion is n + 1 = 5 and $\lceil 5/2 \rceil - 1 = 2$; if node 1 receives heartbeat information from no fewer than 2 slave nodes within one heartbeat period, the master-maintenance condition is met.
Step 405: node 1 instructs nodes 2 to n and node (n+1) to write the transaction of adding node (n+1) to the cluster to disk.
The configuration information of node 1 includes the master-promotion condition and the master-maintenance condition; the specific conditions are the same as those described in the above embodiment and are not repeated here. Before node (n+1) joins the cluster, its permission attribute is configured as not having the right to be elected; after step 405, the permission attribute of node (n+1) is configured as having the right to be elected.
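The two kinds of capacity expansion request described in step 402 differ in what they carry. A minimal sketch of building them follows; the function name, the dictionary keys, and the attribute layout are all hypothetical, since the application does not define a message format.

```python
def build_expansion_requests(master_id, members, new_id, new_attrs):
    """Build one request per recipient for adding a single node.

    members   -- dict mapping node ID -> local attributes of the current cluster
    new_id    -- ID of the node being added
    new_attrs -- local attributes of the node being added
    """
    requests = {}
    for node_id in members:
        if node_id != master_id:
            # Existing slaves only need the local attributes of the newcomer.
            requests[node_id] = {"added": {new_id: new_attrs}}
    # The newcomer needs the node list of the current cluster and the
    # local attributes of every node in it.
    requests[new_id] = {
        "node_list": sorted(members),
        "attributes": dict(members),
    }
    return requests


members = {
    1: {"address": "10.0.0.1", "electable": True},
    2: {"address": "10.0.0.2", "electable": True},
    3: {"address": "10.0.0.3", "electable": True},
}
# The new node joins without the right to be elected.
reqs = build_expansion_requests(1, members, 4, {"address": "10.0.0.4", "electable": False})
print(sorted(reqs))
```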
Because of the master-maintenance condition, the cluster size can be changed dynamically in a variety of situations. For example, when the size of the expanded cluster is even and a network partition splits the cluster into two parts containing the same number of nodes, the master node can still maintain its master identity.
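The even-split case can be checked numerically. The sketch below assumes the thresholds read from the description (promotion needs more than half of the cluster's votes; maintenance needs the master plus the slaves it hears from to make up at least half of the cluster); the function name is hypothetical.

```python
def partition_outcome(cluster_size, master_side):
    """Evaluate both sides of a two-way network split.

    cluster_size -- total number of nodes in the cluster
    master_side  -- number of nodes on the master's side, master included
    Returns (master_stays, rival_possible).
    """
    other_side = cluster_size - master_side
    maintain_needed = (cluster_size + 1) // 2 - 1   # slave heartbeats required
    promote_needed = cluster_size // 2 + 1          # votes required, own vote included
    master_stays = (master_side - 1) >= maintain_needed
    rival_possible = other_side >= promote_needed
    return master_stays, rival_possible


# A 4-node cluster split 2/2: the master keeps its role, and the other half
# cannot gather the 3 votes needed to promote a second master.
print(partition_outcome(4, 2))
```

Under these assumed thresholds, exactly one side of any two-way split qualifies: either the current master keeps its role, or the other side is large enough to elect a new one, but never both.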
Scenario three: capacity reduction. The original cluster has n nodes, denoted node 1 to node n, where n ≥ 2 and n is a positive integer; node 1 is the master node, and the node to be deleted is node n, which is a slave node.
As shown in FIG. 5, the capacity reduction flow is as follows:
Step 501: node 1 receives a capacity reduction instruction to delete node n.
Step 502: node 1 sends a capacity reduction request to each of nodes 2 to n.
The capacity reduction request carries the local attributes of node n, the node to be deleted.
Step 503: node 1 receives the confirmation responses returned by nodes 2 to (n-1).
Step 504: within one heartbeat period, node 1 determines that the master-maintenance condition is met, that is, it receives heartbeat information from no fewer than $\lceil (n-1)/2 \rceil - 1$ slave nodes, and then writes the transaction of deleting node n from the cluster to disk.
Step 505: node 1 instructs nodes 2 to (n-1) to write the transaction of deleting node n from the cluster to disk.
Scenario four: capacity reduction. The original cluster has n nodes, denoted node 1 … node n, where n ≥ 2 and n is a positive integer; node 2 is the master node of the original cluster, and the node to be deleted is node 2.
Once it is determined that node 2 is to be deleted, node 2 is isolated from the cluster at the network level or powered off, or the permission attribute of node 2 is configured as not having permission to be elected.
This triggers the cluster to re-elect a master; for example, node 1 is elected as the new master node.
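The effect of the permission attribute on re-election can be sketched as follows. The `Member` class, the `elect_master` function, the lowest-id tie-break, and the rule that every reachable node may vote are assumptions made for illustration; the number of required votes is passed in as a parameter rather than derived from the promotion formula, which is not reproduced in this text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Member:
    node_id: int
    electable: bool = True   # the permission attribute, adjustable at run time
    reachable: bool = True   # False if isolated from the network or powered off


def elect_master(members: List[Member], votes_needed: int) -> Optional[int]:
    """Return the id of the new master, or None if no election can succeed."""
    voters = [m for m in members if m.reachable]
    candidates = [m for m in voters if m.electable]
    if not candidates or len(voters) < votes_needed:
        return None
    # Hypothetical tie-break: the candidate with the lowest node id wins.
    return min(c.node_id for c in candidates)


cluster = [Member(1), Member(2), Member(3)]
cluster[1].electable = False  # node 2, the old master, is about to be deleted
new_master = elect_master(cluster, votes_needed=2)
```

Because node 2 can no longer be elected, the election settles on another node (node 1 in this example), after which node 1 can run the capacity reduction as the master.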
As shown in fig. 6, the process of capacity reduction specifically includes:
Step 601, node 1 receives a capacity reduction instruction to delete node 2.
Step 602, node 1 sends a capacity reduction request to each of nodes 3 to n.
The capacity reduction request carries the local attribute of the node to be deleted, node 2.
Step 603, node 1 receives the confirmation responses returned by each of nodes 3 to n.
Step 604, if, within one heartbeat cycle, node 1 determines that the master-maintenance condition is met, that is, it has received heartbeat information from no fewer than
Figure BDA0001197596120000101
slave nodes, node 1 writes the transaction that deletes node 2 from the cluster to disk.
Step 605, node 1 instructs nodes 3 to n to write the transaction that deletes node 2 from the cluster to disk.
Based on the same concept as the method shown in fig. 2, and as shown in fig. 7, an embodiment of the present application further provides an apparatus 700 for dynamically changing the cluster size in a distributed arbitration cluster. The apparatus 700 is a master node in the distributed arbitration cluster and includes:
a receiving unit 701, configured to receive an instruction for changing the cluster size, and determine the M nodes included in the distributed arbitration cluster after the cluster size is changed, where M is a positive integer, and the M nodes include the master node and (M-1) slave nodes;
a sending unit 702, configured to forward the instruction to the (M-1) slave nodes, and receive acknowledgement responses returned by the (M-1) slave nodes, where the acknowledgement responses indicate agreement to change the cluster size;
a processing unit 703, configured to perform the operation of changing the cluster size if it is determined that the receiving unit has received heartbeat information sent by L slave nodes within one heartbeat cycle, where L is a positive integer that is no less than the threshold of the master-maintenance condition.
Optionally, the receiving unit 701 is configured to:
receive an instruction to add a first node, or receive an instruction to delete a second node, where the second node is a node different from the master node;
where, before the cluster size is changed, the distributed arbitration cluster does not include the first node and includes the second node.
Optionally, each node has a permission attribute that indicates whether the node has permission to be elected, and the permission attribute can be adjusted dynamically;
before the operation of changing the cluster size is executed, the permission attribute of the first node indicates that the first node does not have permission to be elected.
Optionally, before the master node receives an instruction to delete the second node, the communication connection between the second node and the M nodes is disconnected; or,
before the master node receives an instruction to delete the second node, the permission attribute of the second node is set to not having permission to be elected.
Optionally, the apparatus further includes configuration information, where the configuration information includes a master-promotion condition and a master-maintenance condition;
the master-promotion condition is: when the distributed arbitration cluster includes N nodes, K nodes agree to the promotion of the master node, where [formula image missing], and K and N are positive integers;
the master-maintenance condition is: when the distributed arbitration cluster includes N nodes, the master node receives heartbeat information from P slave nodes within one heartbeat period, where
Figure BDA0001197596120000112
P and N are positive integers.
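The two conditions held in the configuration information can be sketched as a small configuration object. The thresholds below are assumptions, not the patent's formulas, which appear only as images: promotion is assumed to need a strict majority of floor(N/2) + 1 agreeing nodes, and maintenance is assumed to need heartbeats from ceil(N/2) - 1 slave nodes. Whether K counts the candidate itself is also not stated here. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ArbitrationConfig:
    promote_threshold: Callable[[int], int]   # N -> minimum agreeing nodes K
    maintain_threshold: Callable[[int], int]  # N -> minimum slave heartbeats P

    def may_promote(self, n: int, agreeing_nodes: int) -> bool:
        return agreeing_nodes >= self.promote_threshold(n)

    def may_maintain(self, n: int, slave_heartbeats: int) -> bool:
        return slave_heartbeats >= self.maintain_threshold(n)


# ASSUMED thresholds, chosen only to illustrate the asymmetry between the conditions.
ASSUMED_CONFIG = ArbitrationConfig(
    promote_threshold=lambda n: n // 2 + 1,
    maintain_threshold=lambda n: (n + 1) // 2 - 1,
)

# 4-node cluster split 2/2 by the network:
old_master_stays = ASSUMED_CONFIG.may_maintain(4, slave_heartbeats=1)
other_half_promotes = ASSUMED_CONFIG.may_promote(4, agreeing_nodes=2)
```

With these assumed thresholds, the existing master keeps its role in a 2/2 split while the other half cannot promote a second master, so the cluster never has two masters at once.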
Based on the same concept as the method shown in fig. 2, and as shown in fig. 8, an embodiment of the present application further provides an apparatus 800 for dynamically changing the cluster size in a distributed arbitration cluster. The apparatus 800 includes a transceiver 801, a processor 802, a memory 803, and a bus 804; the transceiver 801, the processor 802, and the memory 803 are all connected to the bus 804. The memory 803 stores a set of programs, and the processor 802 is configured to call the programs stored in the memory 803 to perform the following operations:
receiving, through the transceiver 801, an instruction for changing the cluster size, and determining the M nodes included in the distributed arbitration cluster after the cluster size is changed, where M is a positive integer, and the M nodes include the master node and (M-1) slave nodes;
forwarding the instruction to the (M-1) slave nodes through the transceiver 801, and receiving acknowledgement responses returned by the (M-1) slave nodes, where the acknowledgement responses indicate agreement to change the cluster size;
if it is determined that heartbeat information sent by L slave nodes is received through the transceiver 801 within one heartbeat period, executing the operation of changing the cluster size, where
Figure BDA0001197596120000113
L is a positive integer.
Optionally, the transceiver 801 is configured to:
receive an instruction to add a first node, or receive an instruction to delete a second node, where the second node is a node different from the master node;
where, before the cluster size is changed, the distributed arbitration cluster does not include the first node and includes the second node.
Optionally, each node has a permission attribute that indicates whether the node has permission to be elected, and the permission attribute can be adjusted dynamically;
before the operation of changing the cluster size is executed, the permission attribute of the first node indicates that the first node does not have permission to be elected.
Optionally, before the master node receives an instruction to delete the second node, the communication connection between the second node and the M nodes is disconnected; or,
before the master node receives an instruction to delete the second node, the permission attribute of the second node is set to not having permission to be elected.
Optionally, the apparatus further includes configuration information, where the configuration information includes a master-promotion condition and a master-maintenance condition;
the master-promotion condition is: when the distributed arbitration cluster includes N nodes, K nodes agree to the promotion of the master node, where
Figure BDA0001197596120000121
K and N are positive integers;
the master-maintenance condition is: when the distributed arbitration cluster includes N nodes, the master node receives heartbeat information from P slave nodes within one heartbeat period, where [formula image missing], and P and N are positive integers.
The processor 802 may be a Central Processing Unit (CPU), a Network Processor (NP), or a combination of a CPU and an NP.
The processor 802 may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof.
The memory 803 may include a volatile memory, such as a random-access memory (RAM); the memory 803 may also include a non-volatile memory, such as a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 803 may also include a combination of the above kinds of memory.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various changes and modifications may be made in the embodiments of the present application without departing from the spirit and scope of the embodiments of the present application. Thus, if such modifications and variations of the embodiments of the present application fall within the scope of the claims of the present application and their equivalents, the present application is also intended to encompass such modifications and variations.

Claims (10)

1. A method for dynamically changing cluster size in a distributed arbitration cluster, wherein the distributed arbitration cluster includes a master node, the method comprising:
the master node receives an instruction for changing the cluster size, and determines the M nodes included in the distributed arbitration cluster after the cluster size is changed, wherein M is a positive integer, and the M nodes include the master node and (M-1) slave nodes;
the master node forwards the instruction to the (M-1) slave nodes and receives confirmation responses returned by the (M-1) slave nodes, wherein the confirmation responses indicate agreement to change the cluster size;
if the master node receives heartbeat information sent by L slave nodes within one heartbeat period, the master node executes the operation of changing the cluster size, wherein
Figure FDA0002155339160000011
L is a positive integer.
2. The method of claim 1, wherein the master node receiving the instruction for changing the cluster size comprises:
the master node receives an instruction to add a first node, or the master node receives an instruction to delete a second node, wherein the second node is a node different from the master node;
wherein, before the cluster size is changed, the distributed arbitration cluster does not include the first node and includes the second node.
3. The method of claim 2, wherein each node has a permission attribute that indicates whether the node has permission to be elected, and the permission attribute can be adjusted dynamically;
wherein, before the operation of changing the cluster size is executed, the permission attribute of the first node indicates that the first node does not have permission to be elected.
4. The method of claim 3, wherein, before the master node receives the instruction to delete the second node, the communication connection between the second node and the M nodes is disconnected; or,
before the master node receives the instruction to delete the second node, the permission attribute of the second node is set to not having permission to be elected.
5. The method of any of claims 1 to 4, wherein the master node contains configuration information, the configuration information comprising a master-promotion condition and a master-maintenance condition;
the master-promotion condition is: when the distributed arbitration cluster includes N nodes, K nodes agree to the promotion of the master node, wherein
Figure FDA0002155339160000021
K and N are positive integers;
the master-maintenance condition is: when the distributed arbitration cluster includes N nodes, the master node receives heartbeat information from P slave nodes within one heartbeat period, wherein [formula image missing], and P and N are positive integers.
6. An apparatus for dynamically changing cluster size in a distributed arbitration cluster, the apparatus being a master node in the distributed arbitration cluster, the apparatus comprising:
a receiving unit, configured to receive an instruction for changing the cluster size, and determine the M nodes included in the distributed arbitration cluster after the cluster size is changed, wherein M is a positive integer, and the M nodes include the master node and (M-1) slave nodes;
a sending unit, configured to forward the instruction to the (M-1) slave nodes, and receive acknowledgement responses returned by the (M-1) slave nodes, wherein the acknowledgement responses indicate agreement to change the cluster size;
a processing unit, configured to execute the operation of changing the cluster size if it is determined that the receiving unit has received heartbeat information sent by L slave nodes within one heartbeat cycle, wherein [formula image missing], and L is a positive integer.
7. The apparatus of claim 6, wherein the receiving unit is configured to:
receive an instruction to add a first node, or receive an instruction to delete a second node, wherein the second node is a node different from the master node;
wherein, before the cluster size is changed, the distributed arbitration cluster does not include the first node and includes the second node.
8. The apparatus of claim 7, wherein each node has a permission attribute that indicates whether the node has permission to be elected, and the permission attribute can be adjusted dynamically;
wherein, before the operation of changing the cluster size is executed, the permission attribute of the first node indicates that the first node does not have permission to be elected.
9. The apparatus of claim 8, wherein, before the master node receives the instruction to delete the second node, the communication connection between the second node and the M nodes is disconnected; or,
before the master node receives the instruction to delete the second node, the permission attribute of the second node is set to not having permission to be elected.
10. The apparatus according to any one of claims 6 to 9, wherein the apparatus further comprises configuration information, the configuration information comprising a master-promotion condition and a master-maintenance condition;
the master-promotion condition is: when the distributed arbitration cluster includes N nodes, K nodes agree to the promotion of the master node, wherein
Figure FDA0002155339160000031
K and N are positive integers;
the master-maintenance condition is: when the distributed arbitration cluster includes N nodes, the master node receives heartbeat information from P slave nodes within one heartbeat period, wherein
Figure FDA0002155339160000032
P and N are positive integers.
CN201611248514.4A 2016-12-29 2016-12-29 Method and device for dynamically changing cluster scale in distributed arbitration cluster Expired - Fee Related CN106878382B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611248514.4A CN106878382B (en) 2016-12-29 2016-12-29 Method and device for dynamically changing cluster scale in distributed arbitration cluster

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611248514.4A CN106878382B (en) 2016-12-29 2016-12-29 Method and device for dynamically changing cluster scale in distributed arbitration cluster

Publications (2)

Publication Number Publication Date
CN106878382A CN106878382A (en) 2017-06-20
CN106878382B true CN106878382B (en) 2020-02-14

Family

ID=59164415

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611248514.4A Expired - Fee Related CN106878382B (en) 2016-12-29 2016-12-29 Method and device for dynamically changing cluster scale in distributed arbitration cluster

Country Status (1)

Country Link
CN (1) CN106878382B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103634375A (en) * 2013-11-07 2014-03-12 华为技术有限公司 Method, device and equipment for cluster node expansion
CN104378232A (en) * 2014-11-10 2015-02-25 东软集团股份有限公司 Schizencephaly finding and recovering method and device under main joint and auxiliary joint cluster networking mode

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3693896B2 (en) * 2000-07-28 2005-09-14 三菱電機株式会社 Communication method and communication system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20200214

Termination date: 20201229