CN114598710A

CN114598710A - Method, device, equipment and medium for synchronizing distributed storage cluster data

Info

Publication number: CN114598710A
Application number: CN202210248426.3A
Authority: CN
Inventors: 董俊明
Original assignee: Suzhou Inspur Intelligent Technology Co Ltd
Current assignee: Suzhou Inspur Intelligent Technology Co Ltd
Priority date: 2022-03-14
Filing date: 2022-03-14
Publication date: 2022-06-07

Abstract

The invention provides a method, an apparatus, a device and a readable medium for synchronizing distributed storage cluster data. The method comprises the following steps: electing a preset number of core nodes from the data nodes in the distributed storage cluster, and electing a master node from the core nodes; in response to a data node sending data to the master node, the master node sending the received data to each core node for backup; in response to the number of receipts received by the master node, each sent by a core node to indicate that the core node has successfully written the data, being greater than a preset value, the master node writing the received data into the master node; and in response to the master node completing the data write, the master node sending confirmation information to the data node. With the scheme of the invention, the data of the cluster can be protected, the stability of the cluster can be improved, and product competitiveness can be improved.

Description

Method, device, equipment and medium for synchronizing distributed storage cluster data

Technical Field

The present invention relates to the field of computers, and more particularly, to a method, an apparatus, a device, and a readable medium for data synchronization of distributed storage clusters.

Background

In current distributed storage cluster environments, data is a critical asset of the cluster. During normal operation a cluster generates a large amount of information and data, such as alarm information, node information, network information, operation logs and statistical information, all of which is very important to the cluster. A cluster can also encounter various problems during operation, such as abnormal power failure, node failure, network failure, and cluster expansion or contraction. When the cluster is in an abnormal state, problems such as data loss and data overwriting easily occur, seriously affecting the user's daily business operations.

Disclosure of Invention

In view of this, embodiments of the present invention provide a method, an apparatus, a device, and a readable medium for synchronizing distributed storage cluster data.

In view of the above, an aspect of the embodiments of the present invention provides a method for data synchronization of distributed storage clusters, including the following steps:

selecting a preset number of core nodes from data nodes in the distributed storage cluster, and selecting a main node from the core nodes;

in response to a data node sending data to the master node, the master node sending the received data to each core node for backup;

in response to the number of receipts received by the master node, each sent by a core node to indicate that the core node has successfully written the data, being greater than a preset value, the master node writing the received data into the master node;

and in response to the master node completing the data write, the master node sending confirmation information to the data node.

According to an embodiment of the present invention, electing a preset number of core nodes from the data nodes in the distributed storage cluster includes:

randomly selecting half of the total number of nodes from the data nodes in the distributed storage cluster as candidate nodes;

the candidate nodes sending voting requests to the data nodes so that the data nodes vote for the candidate nodes;

and sorting the candidate nodes by vote count from high to low, and selecting the preset number of top-ranked candidate nodes as core nodes.

According to an embodiment of the present invention, in response to the number of receipts received by the master node, each sent by a core node to indicate that the core node has successfully written the data, being greater than a preset value, the master node writing the received data into the master node includes:

in response to a core node receiving the data, the core node writing the data and, after the data is successfully written, sending a success receipt to the master node;

in response to the master node receiving success receipts from more than half of the core nodes, the master node writing the received data into the master node;

in response to the master node receiving a failure receipt sent by a core node, the master node resending the data to the failed core node.

According to an embodiment of the present invention, further comprising:

each core node in the cluster detecting, by ping, whether the master node is online at intervals of a threshold period;

in response to detecting that the master node is offline two consecutive times, the core node marking the master node as subjectively offline;

in response to the number of core nodes marking the master node as subjectively offline exceeding half of the total number of core nodes, marking the core nodes ranked in the first 50% by last message update time as candidate nodes;

the candidate nodes sending voting requests to the other core nodes so that they vote for the candidate nodes;

and selecting, as the new master node, the first candidate node whose vote count exceeds half of the total number of core nodes, or the candidate node with the highest vote count.

In another aspect of the embodiments of the present invention, there is also provided an apparatus for data synchronization of a distributed storage cluster, where the apparatus includes:

the selection module is configured to select a preset number of core nodes from the data nodes in the distributed storage cluster, and select a main node from the core nodes;

the backup module is configured such that, in response to a data node sending data to the master node, the master node sends the received data to each core node for backup;

the writing module is configured such that, in response to the number of receipts received by the master node, each sent by a core node to indicate that the core node has successfully written the data, being greater than a preset value, the master node writes the received data into the master node;

and the sending module is configured such that, in response to the master node completing the data write, the master node sends confirmation information to the data node.

According to one embodiment of the invention, the selection module is further configured to:

randomly selecting half of the total number of nodes from the data nodes in the distributed storage cluster as candidate nodes;

According to one embodiment of the invention, the write module is further configured to:

in response to the master node receiving a failure receipt sent by the core node, the master node resends the data to the failed core node.

According to an embodiment of the present invention, the system further comprises an election module configured to:

in response to detecting that the master node is offline two consecutive times, the core node marking the master node as subjectively offline;

the candidate nodes sending voting requests to the other core nodes so that they vote for the candidate nodes;

and selecting, as the new master node, the first candidate node whose vote count exceeds half of the total number of core nodes, or the candidate node with the highest vote count.

In another aspect of an embodiment of the present invention, there is also provided a computer apparatus including:

at least one processor; and

a memory storing computer instructions executable on the processor, the instructions when executed by the processor implementing the steps of any of the methods described above.

In another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium storing a computer program, which when executed by a processor implements the steps of any one of the above-mentioned methods.

The invention has the following beneficial technical effects. In the method for synchronizing distributed storage cluster data provided by the embodiments of the invention, a preset number of core nodes are elected from the data nodes in the distributed storage cluster, and a master node is elected from the core nodes; in response to a data node sending data to the master node, the master node sends the received data to each core node for backup; in response to the number of receipts received by the master node, each sent by a core node to indicate that the core node has successfully written the data, being greater than a preset value, the master node writes the received data into the master node; and in response to the master node completing the data write, the master node sends confirmation information to the data node. This technical scheme can protect the data of the cluster, improve the stability of the cluster, and improve product competitiveness.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other embodiments from these drawings without creative effort.

FIG. 1 is a schematic flow chart diagram of a method of distributed storage cluster data synchronization in accordance with one embodiment of the present invention;

FIG. 2 is a schematic diagram of a communication process of nodes in distributed storage cluster data synchronization according to an embodiment of the invention;

FIG. 3 is a schematic diagram of an apparatus for distributed storage cluster data synchronization according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a computer device according to one embodiment of the present invention;

FIG. 5 is a schematic diagram of a computer-readable storage medium according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention are described in further detail with reference to the accompanying drawings.

In view of the foregoing, a first aspect of the embodiments of the present invention provides an embodiment of a method for data synchronization of distributed storage clusters. Fig. 1 shows a schematic flow diagram of the method.

As shown in fig. 1, the method may include the steps of:

S1: electing a preset number of core nodes from the data nodes in the distributed storage cluster, and electing a master node from the core nodes.

The data nodes generally refer to the nodes other than the core nodes (major nodes) and the master node (leader node). Each data node is a production node, and the data it produces generally includes the node's basic information, alarm information, operation logs, statistical information and the like. The core nodes are disaster recovery nodes for the master node; their number is configurable, and they are responsible for electing the master node and for synchronizing the data of the data nodes. The master node is responsible for the data consistency of the cluster, for receiving and sending cluster data, for synchronizing core node data, and for sending data confirmation information. The communication among the three kinds of nodes is shown in fig. 2.

The core nodes are elected by the data nodes as follows. Half of the total number of nodes are randomly selected from the data nodes in the distributed storage cluster as candidate nodes. The number of candidate nodes can be set as required, but it must be larger than the preset number. The candidate nodes then send voting requests to the data nodes so that the data nodes vote for the candidate nodes; the voting can be implemented with an election mechanism from the prior art. After voting is complete, the candidate nodes are sorted by vote count from high to low, and the preset number of top-ranked candidates are selected as core nodes. For example, 10 candidate nodes are sorted by vote count from high to low, the first 5 are selected as core nodes, and the remaining candidates remain data nodes.

The master node is elected by the core nodes, and the specific process is similar to the election of the core nodes.

S2: in response to a data node sending data to the master node, the master node sends the received data to each core node for backup.

After a data node generates data, it sends the data to the master node through message middleware. On receiving the message, the master node changes the message state to submitted but does not yet store the data. The master node then sends the message to all core nodes so that the data carried by the message is backed up.

S3: in response to the number of receipts received by the master node, each sent by a core node to indicate that the core node has successfully written the data, being greater than a preset value, the master node writes the received data into the master node.

After receiving the message, each core node stores it and sends a receipt to the master node. The time for the master node to send a message is set to 500 to 1000 ms; while a request has not been processed, the master node repeatedly sends the update entry to the core nodes, and each core node sends its execution result to the master node after storing the data. When more than half of the core nodes have finished storing the data, the master node stores the data.

S4: in response to the master node completing the data write, the master node sends confirmation information to the data node.

After the master node has successfully stored the data, it returns the result to the data node and sets the message state to finished, which completes one round of data storage and backup.
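Steps S2 to S4 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation of the invention: the class names `CoreNode` and `MasterNode`, the `max_resends` limit, and the `fail_first` failure simulation are all hypothetical, network calls are replaced by direct method calls, the 500 to 1000 ms timing is omitted, and `core_nodes` is assumed to hold the core nodes other than the master.

```python
from enum import Enum


class MessageState(Enum):
    SUBMITTED = "submitted"
    FINISHED = "finished"


class CoreNode:
    def __init__(self, name, fail_first=0):
        self.name = name
        self.store = {}
        self._fail_first = fail_first  # hypothetical: simulate transient write failures

    def write(self, msg_id, payload):
        """Store the data and return a success (True) or failure (False) receipt."""
        if self._fail_first > 0:
            self._fail_first -= 1
            return False
        self.store[msg_id] = payload
        return True


class MasterNode:
    def __init__(self, core_nodes, max_resends=3):
        self.core_nodes = core_nodes    # assumed: core nodes other than the master
        self.max_resends = max_resends  # assumed bound; the source gives no limit
        self.store = {}
        self.state = {}

    def handle(self, msg_id, payload):
        """Return True (confirmation) once a majority of core nodes hold the data."""
        # S2: mark the message as submitted, but do not store it yet.
        self.state[msg_id] = MessageState.SUBMITTED
        successes = 0
        for node in self.core_nodes:
            # Resend to a core node that answered with a failure receipt.
            for _ in range(1 + self.max_resends):
                if node.write(msg_id, payload):
                    successes += 1
                    break
        # S3: the master stores the data only after more than half succeeded.
        if successes * 2 <= len(self.core_nodes):
            return False
        self.store[msg_id] = payload
        # S4: mark the message finished and confirm to the data node.
        self.state[msg_id] = MessageState.FINISHED
        return True
```

In this sketch a message that fails to reach a majority stays in the submitted state and is never stored on the master, which mirrors the ordering described above: core node backup first, master write second, confirmation last.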

By the technical scheme, the data of the cluster can be protected, the stability of the cluster can be improved, and the product competitiveness can be improved.

In a preferred embodiment of the present invention, the electing a preset number of core nodes from the data nodes in the distributed storage cluster includes:

and sorting the candidate nodes by vote count from high to low, and selecting the preset number of top-ranked candidate nodes as core nodes.
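The core node election described above can be sketched in Python as follows. This is a minimal sketch under assumptions: `elect_core_nodes` and the `vote_fn(voter, candidates)` callback are hypothetical names, each data node is assumed to cast exactly one vote, and ties are broken by node name, a detail this embodiment does not specify.

```python
import random
from collections import Counter


def elect_core_nodes(data_nodes, preset_count, vote_fn, rng=None):
    """Return the `preset_count` candidates with the most votes."""
    rng = rng or random.Random()
    # Half of all data nodes become candidates, chosen at random.
    candidate_count = len(data_nodes) // 2
    if candidate_count <= preset_count:
        raise ValueError("candidate count must exceed the preset core-node count")
    candidates = rng.sample(list(data_nodes), candidate_count)

    # Each data node answers the voting request with one candidate.
    tally = Counter({c: 0 for c in candidates})
    for voter in data_nodes:
        choice = vote_fn(voter, candidates)
        if choice in tally:
            tally[choice] += 1

    # Sort by votes from high to low; the name tie-break is an assumption.
    ranked = sorted(candidates, key=lambda c: (-tally[c], c))
    return ranked[:preset_count]
```

For example, with 20 data nodes and a preset number of 5, ten candidates are drawn, all 20 nodes vote, and the five highest-ranked candidates become core nodes while the other five remain data nodes.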

In another embodiment, the core nodes are elected as follows. When the cluster has too few core nodes and the number of core nodes does not exceed 1/3 of the number of cluster nodes, the master node randomly selects several data nodes as candidate nodes: no more than 5 for a large-scale cluster, and no more than 1/3 of the total number of nodes in other cases. The candidate nodes send voting requests to the data nodes. The candidate node with the highest vote count becomes a core node, and the other candidate nodes are notified to abandon the election and revert to ordinary data nodes. If the vote counts tie, the election is held again until a core node is elected. The newly elected core node compares data update logs and automatically pulls the differing data from the other core nodes to reach final data consistency.
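The numeric conditions in this embodiment can be made concrete with a short Python sketch. It reflects one reading of the text and rests on assumptions: the function names are hypothetical, `large_cluster_threshold` is an invented cut-off because the source does not define "large-scale", and `max_rounds` bounds the re-election loop that the source describes as repeating until a core node is elected.

```python
def needs_core_election(core_count, preset_count, total_nodes):
    """True when core nodes are short of the preset number and still make up
    no more than 1/3 of the cluster."""
    return core_count < preset_count and core_count * 3 <= total_nodes


def max_candidates(total_nodes, large_cluster_threshold=100):
    """Upper bound on candidates: 5 for a large cluster, else 1/3 of all nodes.
    `large_cluster_threshold` is a hypothetical cut-off, not from the source."""
    if total_nodes >= large_cluster_threshold:
        return 5
    return total_nodes // 3


def elect_replacement(candidates, run_vote, max_rounds=10):
    """Repeat the vote until exactly one candidate has the highest count.
    `run_vote(candidates)` is a hypothetical callback returning {candidate: votes}."""
    for _ in range(max_rounds):
        tally = run_vote(candidates)
        top = max(tally.values())
        winners = [c for c in candidates if tally.get(c, 0) == top]
        if len(winners) == 1:
            return winners[0]  # the other candidates revert to data nodes
    return None  # assumed fallback when every round ties
```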

In a preferred embodiment of the present invention, in response to that the number of the receipt of the successful data writing by the core node, which is sent by the core node and received by the master node, is greater than a preset value, the writing of the received data into the master node by the master node includes:

In a preferred embodiment of the present invention, the method further comprises:

and selecting, as the new master node, the first candidate node whose vote count exceeds half of the total number of core nodes, or the candidate node with the highest vote count. The core nodes and the master node in the cluster verify whether each other is online by periodically exchanging ping and pong messages. If core node 1 receives a pong message after sending a ping message to the master node, core node 1 updates its last communication time with the master node; if no pong message is received, core node 1 sends the ping message again. If the time since the last communication between the core node and the master node exceeds a specified time, core node 1 marks the master node as subjectively offline. When more than half of the total number of core nodes have marked the master node as subjectively offline, the master node is considered objectively offline and the disaster recovery mechanism of the core nodes is started: the core nodes ranked in the first 50% by last message update time are marked as candidate nodes and initiate a master node election. The candidate nodes send voting requests to the other core nodes; each core node can cast only one vote, and votes are cast in the order in which the voting requests arrive. The new master node is either the first candidate node whose vote count exceeds half of the total number of core nodes (that is, if a candidate exceeds half before voting has finished, the cluster stops voting immediately and promotes that candidate to master node), or the node with the highest vote count after voting ends (that is, if no candidate exceeds half during voting, the candidates are ranked by vote count). If candidates tie on vote count, voting restarts until a master node meeting the requirement is elected.
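The failover logic can be sketched in Python as below. This is a sketch under assumptions rather than the patented implementation: the function names are hypothetical, "first 50% by last message update time" is read as the most recently updated half, and whether `core_total` counts the offline master is left to the caller.

```python
def is_objectively_offline(subjective_marks, core_total):
    """The master is objectively offline once more than half of the core
    nodes have marked it subjectively offline."""
    return subjective_marks * 2 > core_total


def pick_candidates(last_update):
    """last_update maps core node -> last message update time.
    Returns the most recently updated half (assumed ordering), at least one node."""
    ranked = sorted(last_update, key=last_update.get, reverse=True)
    return ranked[: max(1, len(ranked) // 2)]


def elect_master(candidates, core_total, votes):
    """votes: candidate names in arrival order, one vote per core node.
    Returns the new master, or None when voting must restart because of a tie."""
    tally = {c: 0 for c in candidates}
    if not tally:
        return None
    for choice in votes:
        if choice not in tally:
            continue
        tally[choice] += 1
        # Early stop: the first candidate to exceed half wins immediately.
        if tally[choice] * 2 > core_total:
            return choice
    # No candidate exceeded half: fall back to the highest vote count.
    top = max(tally.values())
    winners = [c for c in candidates if tally[c] == top]
    return winners[0] if len(winners) == 1 else None
```

A caller would loop on `elect_master` while it returns `None`, collecting a fresh round of votes each time, which corresponds to restarting the vote on a tie.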

In a preferred embodiment of the present invention, the data nodes periodically pull data from the core nodes, and the core nodes check the consistency of the data nodes' alarm data through a configurable timed task, so that eventual consistency of node data is reached through compensation. With this scheme, the cluster can preserve data integrity and cluster-wide data consistency to the greatest possible extent in the event of node failure or network failure.
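One compensation pass can be sketched in Python as follows. The source does not describe the comparison itself, so this is only an assumed form: `compensate` is a hypothetical function, records are modelled as dictionaries keyed by record ID, and the core node's copy is treated as authoritative.

```python
def compensate(core_records, node_records):
    """Copy every record that is missing or different on the data node from
    the core node's copy. Returns the sorted IDs of the records pulled."""
    pulled = {
        rid: payload
        for rid, payload in core_records.items()
        if rid not in node_records or node_records[rid] != payload
    }
    node_records.update(pulled)
    return sorted(pulled)
```

Running such a pass from a timed task means a data node that missed updates during a node or network failure converges to the core node's data on a later run.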

The technical scheme of the invention has the following advantages:

(1) An election scheme for the master node and the core nodes is designed, under which the master node bears sole responsibility for data consistency.

(2) By designing the data communication among the data nodes, the master node and the core nodes, strong consistency between the master node and the core nodes and eventual consistency of the data on the data nodes are achieved, which protects the data of the cluster and improves the stability of the cluster.

It should be noted that, as will be understood by those skilled in the art, all or part of the processes in the methods of the above embodiments may be implemented by instructing relevant hardware through a computer program, and the above programs may be stored in a computer-readable storage medium, and when executed, the programs may include the processes of the embodiments of the methods as described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like. The embodiments of the computer program may achieve the same or similar effects as any of the above-described method embodiments.

Furthermore, the method disclosed according to an embodiment of the present invention may also be implemented as a computer program executed by a CPU, and the computer program may be stored in a computer-readable storage medium. The computer program, when executed by the CPU, performs the above-described functions defined in the method disclosed in the embodiments of the present invention.

In view of the above object, a second aspect of the embodiments of the present invention provides an apparatus for data synchronization of distributed storage clusters. As shown in fig. 3, the apparatus 200 includes:

the writing module is configured to respond that the number of receipt messages, sent by the core node and received by the main node, for successfully writing the data is larger than a preset value, and the main node writes the received data into the main node;

In a preferred embodiment of the invention, the selection module is further configured to:

In a preferred embodiment of the present invention, the writing module is further configured to:

In a preferred embodiment of the present invention, the system further comprises an election module configured to:

in response to the number of core nodes marking the master node as a subjective offline exceeding half of the total number of core nodes, marking the core nodes of the first 50% of the last message update time as candidate nodes;

In view of the above object, a third aspect of the embodiments of the present invention provides a computer device. Fig. 4 is a schematic diagram of an embodiment of a computer device provided by the present invention. As shown in fig. 4, an embodiment of the present invention includes the following means: at least one processor 21; and a memory 22, the memory 22 storing computer instructions 23 executable on the processor, the instructions when executed by the processor implementing the method of:

in response to the number of receipts received by the master node, each sent by a core node to indicate that the core node has successfully written the data, being greater than a preset value, the master node writing the received data into the master node;

In view of the above object, a fourth aspect of the embodiments of the present invention proposes a computer-readable storage medium. FIG. 5 is a schematic diagram illustrating an embodiment of a computer-readable storage medium provided by the present invention. As shown in fig. 5, the computer readable storage medium 31 stores a computer program 32 which, when executed by a processor, performs the method as described above.

Furthermore, the methods disclosed according to embodiments of the present invention may also be implemented as a computer program executed by a processor, which may be stored in a computer-readable storage medium. When executed by a processor, the computer program performs the above-described functions defined in the methods disclosed in the embodiments of the invention.

Further, the above method steps and system elements may also be implemented using a controller and a computer readable storage medium for storing a computer program for causing the controller to implement the functions of the above steps or elements.

Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as software or hardware depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed embodiments of the present invention.

In one or more exemplary designs, the functions may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage media may be any available media that can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, includes Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk, blu-ray disc where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications could be made herein without departing from the scope of the present disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims in accordance with the disclosed embodiments described herein need not be performed in any particular order. Furthermore, although elements of the disclosed embodiments of the invention may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.

It should be understood that, as used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly supports the exception. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items.

The numbers of the embodiments disclosed in the embodiments of the present invention are merely for description, and do not represent the merits of the embodiments.

It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.

Those of ordinary skill in the art will understand that the discussion of any embodiment above is exemplary only and is not intended to imply that the scope of the disclosure of the embodiments of the invention, including the claims, is limited to these examples. Within the idea of the embodiments of the invention, technical features of the above embodiment or of different embodiments may also be combined, and there are many other variations of the different aspects of the embodiments of the invention described above which are not provided in detail for the sake of brevity. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of the embodiments of the present invention are intended to be included within the scope of the embodiments of the present invention.

Claims

1. A method for data synchronization of distributed storage clusters is characterized by comprising the following steps:

selecting a preset number of core nodes from data nodes in a distributed storage cluster, and selecting a main node from the core nodes;

2. The method of claim 1, wherein electing a preset number of core nodes among the data nodes in the distributed storage cluster comprises:

3. The method of claim 1, wherein, in response to the number of receipts received by the master node, each sent by a core node to indicate that the core node has successfully written the data, being greater than a preset value, the master node writing the received data into the master node comprises:

4. The method of claim 1, further comprising:

5. An apparatus for distributed storage cluster data synchronization, the apparatus comprising:

a selection module configured to elect a preset number of core nodes from data nodes in a distributed storage cluster, and to elect a master node from the core nodes;

a writing module configured such that, in response to the number of receipts received by the master node, each sent by a core node to indicate that the core node has successfully written the data, being greater than a preset value, the master node writes the received data into the master node;

6. The apparatus of claim 5, wherein the selection module is further configured to:

7. The apparatus of claim 5, wherein the write module is further configured to:

8. The apparatus of claim 5, further comprising an election module configured to:

each core node in the cluster detecting, by ping, whether the master node is online at intervals of a threshold period;

9. A computer device, comprising:

at least one processor; and

a memory storing computer instructions executable on the processor, the instructions when executed by the processor implementing the steps of the method of any one of claims 1 to 4.

10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 4.