CN108173971A - MooseFS high-availability method and system based on active-standby switchover - Google Patents

MooseFS high-availability method and system based on active-standby switchover

Publication number
CN108173971A
Authority
CN
China
Prior art keywords
metadata
metadata node
node
standby
main
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810111014.9A
Other languages
Chinese (zh)
Inventor
林炳东
赵旦谱
台宪青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu IoT Research and Development Center
Original Assignee
Jiangsu IoT Research and Development Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu IoT Research and Development Center
Priority to CN201810111014.9A
Publication of CN108173971A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/20 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
    • G06F11/202 Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
    • G06F11/2023 Failover techniques
    • G06F11/2028 Failover techniques eliminating a faulty processor or activating a spare
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0668 Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure

Abstract

The invention discloses a MooseFS high-availability method based on active-standby switchover, comprising: determining, through a primary/standby election, that a first metadata node is the primary metadata node and a second metadata node is the standby metadata node; the primary metadata node exchanging data with multiple data nodes and synchronizing metadata to a metadata synchronization module in real time, while the standby metadata node periodically obtains metadata from the metadata synchronization module; monitoring the working state of the primary metadata node in real time and judging whether that state is abnormal; if the working state of the primary metadata node is abnormal, triggering a new primary/standby election; and, if the standby metadata node wins the election, making the standby metadata node the new primary metadata node. The invention also discloses a MooseFS high-availability system based on active-standby switchover. The method provided by the invention makes MooseFS highly available.

Description

MooseFS high-availability method and system based on active-standby switchover
Technical field
The present invention relates to the technical field of distributed storage, and in particular to a MooseFS high-availability method based on active-standby switchover and a MooseFS high-availability system based on active-standby switchover.
Background
MooseFS is an open-source distributed file system that mainly uses a master/slave architecture. The master side is the metadata node, referred to as the master; the slave side consists of the data nodes, referred to as chunkservers. The metadata node is the core of MooseFS and there is only one; it manages the metadata of the entire file system and client access to the file system. The data nodes store the actual file data and synchronize data replicas among themselves; a system generally has multiple data nodes. All client read and write requests must pass through the metadata node. The master/slave architecture greatly simplifies the design of MooseFS. However, if the sole metadata node fails, the whole system can no longer provide service and no client read or write request receives a response. This is the single point of failure (SPOF) problem of MooseFS. High availability for MooseFS means that service continues to be provided when the metadata node fails. Because MooseFS has only one metadata node and no standby node, no other metadata node can take over when the sole metadata node fails, and a service interruption necessarily follows. Service unavailability can cause an enterprise huge losses.
As shown in Fig. 1, the prior art usually addresses the single point of failure of the MooseFS metadata node by adding a metadata backup node, the metalogger. The metadata node (master) synchronizes metadata to the metalogger. When the master fails or metadata is lost, the metadata files are copied from the node hosting the metalogger and the metadata node is then restarted. Fig. 2 shows how the structure of Fig. 1 operates during fault recovery. As Fig. 2 shows, this scheme has no automatic fault detection mechanism: faults must be found manually, so a fault may go unnoticed for some time after it occurs and the service stays unavailable for too long. In addition, the metalogger only backs up metadata and cannot replace the master, because it cannot respond to client read and write requests. Fault recovery requires restarting the master, and loading the metadata from local disk into memory takes a long time (on the order of minutes), which leaves the service unavailable for a long period. The scheme depends heavily on manual operations and its fault recovery time is long.
Therefore, how to implement automatic fault detection and fault recovery so as to make MooseFS highly available has become a technical problem that those skilled in the art urgently need to solve.
Summary of the invention
The present invention aims to solve at least one of the technical problems in the prior art by providing a MooseFS high-availability method based on active-standby switchover and a MooseFS high-availability system based on active-standby switchover.
As a first aspect of the invention, a MooseFS high-availability method based on active-standby switchover is provided, wherein the MooseFS includes metadata nodes and multiple data nodes, the metadata nodes include a first metadata node and a second metadata node, and the method includes:
determining, through a primary/standby election, that the first metadata node is the primary metadata node and the second metadata node is the standby metadata node;
the primary metadata node exchanging data with the multiple data nodes and synchronizing metadata to a metadata synchronization module in real time, and the standby metadata node periodically obtaining metadata from the metadata synchronization module;
monitoring the working state of the primary metadata node in real time and judging whether the working state of the primary metadata node is abnormal;
if the working state of the primary metadata node is abnormal, triggering a new primary/standby election;
if the standby metadata node wins the election, making the standby metadata node the new primary metadata node.
Preferably, the MooseFS high-availability method based on active-standby switchover further includes:
if the primary metadata node wins the election, returning to the step in which the primary metadata node exchanges data with the multiple data nodes and synchronizes metadata to the metadata synchronization module in real time, and the standby metadata node periodically obtains metadata from the metadata synchronization module.
Preferably, making the standby metadata node the new primary metadata node if the standby metadata node wins the election includes:
if the standby metadata node wins the election and the working state of the former primary metadata node returns to normal, making the standby metadata node the new primary metadata node and the former primary metadata node the new standby metadata node, and returning to the step in which the primary metadata node exchanges data with the multiple data nodes and synchronizes metadata to the metadata synchronization module in real time, and the standby metadata node periodically obtains metadata from the metadata synchronization module.
Preferably, making the standby metadata node the new primary metadata node if the standby metadata node wins the election includes:
if the standby metadata node wins the election and the working state of the former primary metadata node is still abnormal, making the standby metadata node the new primary metadata node, which exchanges data with the multiple data nodes and synchronizes metadata to the metadata synchronization module in real time.
Preferably, the MooseFS high-availability method based on active-standby switchover further includes, before the step of triggering a new primary/standby election if the working state of the primary metadata node is abnormal:
periodically sending heartbeat packets to the primary metadata node and receiving feedback information;
if no feedback information has been received once a first threshold time has elapsed since a heartbeat packet was sent, judging that the working state of the primary metadata node is abnormal.
Preferably, the primary/standby election includes:
the first metadata node and the second metadata node simultaneously competing to create a lock node;
if the first metadata node creates the lock node successfully, the first metadata node wins the election and is confirmed as the primary metadata node;
otherwise, the second metadata node wins the election and is confirmed as the primary metadata node.
Preferably, the primary/standby election further includes, after the step in which the first metadata node creates the lock node successfully, wins the election and is confirmed as the primary metadata node:
the second metadata node setting a watch on the lock node;
if the working state of the primary metadata node is abnormal, the lock node being deleted, and the second metadata node competing to create a new lock node after receiving the watch notification;
if the second metadata node creates the new lock node successfully, the second metadata node winning the election and being determined as the new primary metadata node.
As a second aspect of the invention, a MooseFS high-availability system based on active-standby switchover is provided, wherein the system includes MooseFS, a metadata synchronization module, a fault recovery control module and a ZooKeeper cluster; the MooseFS includes metadata nodes and multiple data nodes; the metadata nodes include a first metadata node and a second metadata node; the metadata synchronization module, the fault recovery control module and the data nodes are each connected to the first metadata node and the second metadata node; the ZooKeeper cluster is connected to the fault recovery control module; and one of the first metadata node and the second metadata node is the primary metadata node while the other is the standby metadata node;
the primary metadata node is used to exchange data with the data nodes and to synchronize metadata to the metadata synchronization module in real time;
the standby metadata node is used to periodically obtain metadata from the metadata synchronization module;
the fault recovery control module is used to monitor the working state of the primary metadata node in real time, to judge whether that working state is abnormal, to trigger a new primary/standby election if it is abnormal, and to control the active-standby switchover according to the election result;
the ZooKeeper cluster is used to carry out the primary/standby election.
Preferably, the metadata synchronization module includes a cluster made up of an odd number of nodes, and the nodes of the cluster synchronize the metadata of the primary metadata node by running the Paxos algorithm.
Preferably, the fault recovery control module monitors the working states of the primary metadata node and the standby metadata node through RPC.
The MooseFS high-availability method based on active-standby switchover provided by the invention uses two metadata nodes, one serving as the primary metadata node and the other as the standby metadata node. When the primary metadata node fails, the fault is detected automatically and the standby metadata node is switched to primary and continues to provide service, which makes MooseFS highly available. The whole fault recovery process requires no manual intervention and the fault recovery time is short.
Brief description of the drawings
The accompanying drawings provide a further understanding of the invention and form part of the specification. Together with the specific embodiments below they serve to explain the invention, but they do not limit it. In the drawings:
Fig. 1 is an architecture diagram of MooseFS in the prior art.
Fig. 2 is a flowchart of the fault recovery process in the prior art.
Fig. 3 is a flowchart of the MooseFS high-availability method based on active-standby switchover provided by the invention.
Fig. 4 is an architecture diagram of the MooseFS high-availability system based on active-standby switchover provided by the invention.
Fig. 5 is a flowchart of the fault recovery process of the MooseFS high-availability method based on active-standby switchover provided by the invention.
Detailed description of the embodiments
Specific embodiments of the invention are described in detail below with reference to the accompanying drawings. It should be understood that the embodiments described here only illustrate and explain the invention and do not limit it.
As a first aspect of the invention, a MooseFS high-availability method based on active-standby switchover is provided, wherein the MooseFS includes metadata nodes and multiple data nodes, and the metadata nodes include a first metadata node and a second metadata node. As shown in Fig. 3, the method includes:
S110, determining, through a primary/standby election, that the first metadata node is the primary metadata node and the second metadata node is the standby metadata node;
S120, the primary metadata node exchanging data with the multiple data nodes and synchronizing metadata to a metadata synchronization module in real time, and the standby metadata node periodically obtaining metadata from the metadata synchronization module;
S130, monitoring the working state of the primary metadata node in real time and judging whether the working state of the primary metadata node is abnormal;
S140, if the working state of the primary metadata node is abnormal, triggering a new primary/standby election;
S150, if the standby metadata node wins the election, making the standby metadata node the new primary metadata node.
The MooseFS high-availability method based on active-standby switchover provided by the invention uses two metadata nodes, one serving as the primary metadata node and the other as the standby metadata node. When the primary metadata node fails, the fault is detected automatically and the standby metadata node is switched to primary and continues to provide service, which makes MooseFS highly available. The whole fault recovery process requires no manual intervention and the fault recovery time is short.
As a specific embodiment, the MooseFS high-availability method based on active-standby switchover further includes:
if the primary metadata node wins the election, returning to the step in which the primary metadata node exchanges data with the multiple data nodes and synchronizes metadata to the metadata synchronization module in real time, and the standby metadata node periodically obtains metadata from the metadata synchronization module.
As another specific embodiment, making the standby metadata node the new primary metadata node if the standby metadata node wins the election includes:
if the standby metadata node wins the election and the working state of the former primary metadata node returns to normal, making the standby metadata node the new primary metadata node and the former primary metadata node the new standby metadata node, and returning to the step in which the primary metadata node exchanges data with the multiple data nodes and synchronizes metadata to the metadata synchronization module in real time, and the standby metadata node periodically obtains metadata from the metadata synchronization module.
As yet another specific embodiment, making the standby metadata node the new primary metadata node if the standby metadata node wins the election includes:
if the standby metadata node wins the election and the working state of the former primary metadata node is still abnormal, making the standby metadata node the new primary metadata node, which exchanges data with the multiple data nodes and synchronizes metadata to the metadata synchronization module in real time.
Specifically, the MooseFS high-availability method based on active-standby switchover further includes, before the step of triggering a new primary/standby election if the working state of the primary metadata node is abnormal:
periodically sending heartbeat packets to the primary metadata node and receiving feedback information;
if no feedback information has been received once a first threshold time has elapsed since a heartbeat packet was sent, judging that the working state of the primary metadata node is abnormal.
It can be understood that heartbeat packets are sent to the primary metadata node at regular intervals, and whether the primary metadata node is working normally is judged by whether feedback for a heartbeat packet is received within the first threshold time.
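The heartbeat rule above can be sketched as follows. This is only an illustration under assumptions: the class name `HeartbeatMonitor`, its fields, and the 3-second value chosen for the first threshold time are hypothetical and do not come from the patent, and timestamps are passed in explicitly so that the logic can be checked without a network.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Tracks heartbeat feedback from the primary master (hypothetical helper)."""

    first_threshold: float        # seconds of silence tolerated (assumed value)
    last_reply_at: float = 0.0    # time the most recent feedback arrived

    def record_reply(self, now: float) -> None:
        # Called whenever the primary master answers a heartbeat packet.
        self.last_reply_at = now

    def is_abnormal(self, now: float) -> bool:
        # Abnormal once the silence exceeds the first threshold time.
        return (now - self.last_reply_at) > self.first_threshold


monitor = HeartbeatMonitor(first_threshold=3.0)
monitor.record_reply(now=10.0)
healthy_check = monitor.is_abnormal(now=12.0)   # 2 s of silence: still normal
failed_check = monitor.is_abnormal(now=14.5)    # 4.5 s of silence: abnormal
```

In a real deployment the replies would arrive over RPC and `now` would come from a monotonic clock; both are left out here to keep the sketch self-contained.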
Specifically, the primary/standby election includes:
the first metadata node and the second metadata node simultaneously competing to create a lock node;
if the first metadata node creates the lock node successfully, the first metadata node wins the election and is confirmed as the primary metadata node;
otherwise, the second metadata node wins the election and is confirmed as the primary metadata node.
More specifically, the primary/standby election further includes, after the step in which the first metadata node creates the lock node successfully, wins the election and is confirmed as the primary metadata node:
the second metadata node setting a watch on the lock node;
if the working state of the primary metadata node is abnormal, the lock node being deleted, and the second metadata node competing to create a new lock node after receiving the watch notification;
if the second metadata node creates the new lock node successfully, the second metadata node winning the election and being determined as the new primary metadata node.
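The election steps above can be illustrated with a small in-memory simulation. The `LockService` class below is a hypothetical stand-in written for this sketch; it is not the ZooKeeper client API, and the node names `master-1` and `master-2` are assumptions. It models only the two properties the election relies on: creating the lock node succeeds for one competitor only, and a watch fires once when the lock node is deleted.

```python
from typing import Callable, Dict, List, Optional


class LockService:
    """In-memory stand-in for the coordination service (not the ZooKeeper API)."""

    def __init__(self) -> None:
        self.owner: Optional[str] = None
        self._watchers: List[Callable[[], None]] = []

    def try_create(self, candidate: str) -> bool:
        # Create-if-absent: only the first caller succeeds.
        if self.owner is None:
            self.owner = candidate
            return True
        return False

    def watch(self, callback: Callable[[], None]) -> None:
        # One-shot watch that fires when the lock node is deleted.
        self._watchers.append(callback)

    def delete(self) -> None:
        self.owner = None
        watchers, self._watchers = self._watchers, []
        for callback in watchers:
            callback()


roles: Dict[str, str] = {}
lock = LockService()


def compete(node: str) -> None:
    # The winner becomes primary; the loser becomes standby and sets a watch.
    if lock.try_create(node):
        roles[node] = "primary"
    else:
        roles[node] = "standby"
        lock.watch(lambda: compete(node))


compete("master-1")
compete("master-2")
after_first_election = dict(roles)

# Simulate a failure of the primary: its lock node is deleted, the watch
# fires, and the standby competes again for the new lock node.
roles["master-1"] = "failed"
lock.delete()
```

After the simulated failure, `roles` maps `master-2` to `primary`, which mirrors the step in which the second metadata node creates the new lock node and is determined as the new primary metadata node.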
As a second aspect of the invention, a MooseFS high-availability system based on active-standby switchover is provided. As shown in Fig. 4, the system includes MooseFS, a metadata synchronization module, a fault recovery control module and a ZooKeeper cluster; the MooseFS includes metadata nodes and multiple data nodes; the metadata nodes include a first metadata node and a second metadata node; the metadata synchronization module, the fault recovery control module and the data nodes are each connected to the first metadata node and the second metadata node; the ZooKeeper cluster is connected to the fault recovery control module; and one of the first metadata node and the second metadata node is the primary metadata node while the other is the standby metadata node;
the primary metadata node is used to exchange data with the data nodes and to synchronize metadata to the metadata synchronization module in real time;
the standby metadata node is used to periodically obtain metadata from the metadata synchronization module;
the fault recovery control module is used to monitor the working state of the primary metadata node in real time, to judge whether that working state is abnormal, to trigger a new primary/standby election if it is abnormal, and to control the active-standby switchover according to the election result;
the ZooKeeper cluster is used to carry out the primary/standby election.
The MooseFS high-availability system based on active-standby switchover provided by the invention uses two metadata nodes, one serving as the primary metadata node and the other as the standby metadata node. When the primary metadata node fails, the fault is detected automatically and the standby metadata node is switched to primary and continues to provide service, which makes MooseFS highly available. The whole fault recovery process requires no manual intervention and the fault recovery time is short.
Specifically, the metadata synchronization module includes a cluster made up of an odd number of nodes, and the nodes of the cluster synchronize the metadata of the primary metadata node by running the Paxos algorithm.
Specifically, the fault recovery control module monitors the working states of the primary metadata node and the standby metadata node through RPC.
So that the above MooseFS high-availability method and system based on active-standby switchover can be understood more clearly, they are described in detail below with reference to Fig. 4 and Fig. 5.
As shown in Fig. 4, the MooseFS high-availability system based on active-standby switchover provided by the invention mainly includes a first metadata node, a second metadata node, multiple data nodes, a metadata synchronization module, a fault recovery control module and a ZooKeeper cluster, where the first metadata node serves as the primary master and the second metadata node serves as the standby master. Only the primary master provides service externally.
The primary master synchronizes metadata to the metadata synchronization module. The standby master periodically obtains metadata from the metadata synchronization module and loads it into memory, keeping its metadata consistent with that of the primary master.
The fault recovery control module monitors the health of both masters, interacts with ZooKeeper to complete the primary/standby election, and finally carries out the active-standby switchover. The primary and standby masters in the invention have identical functions and perform different operations depending on their state: a primary master responds to client read and write requests and synchronizes metadata to the metadata synchronization module, while a standby master pulls metadata updates from the metadata synchronization module into memory.
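The state-dependent behaviour of the two masters can be sketched as below. This is an illustrative model only: the `Master` class, its method names, and the plain dictionary standing in for the metadata synchronization module are assumptions made for the example, and the standby simply copies a full snapshot where a real system might pull incremental updates.

```python
from typing import Dict


class Master:
    """One metadata node; both nodes run this same code (illustrative only)."""

    def __init__(self, name: str, role: str) -> None:
        self.name = name
        self.role = role                    # "primary" or "standby"
        self.memory: Dict[str, str] = {}    # metadata held in memory

    def write(self, path: str, value: str, sync_store: Dict[str, str]) -> None:
        # Only the primary answers client requests and publishes metadata.
        if self.role != "primary":
            raise RuntimeError(f"{self.name} is standby and does not serve clients")
        self.memory[path] = value
        sync_store[path] = value

    def refresh(self, sync_store: Dict[str, str]) -> None:
        # Only the standby pulls metadata from the synchronization module.
        if self.role == "standby":
            self.memory = dict(sync_store)

    def promote(self) -> None:
        # Metadata is already in memory, so no reload from disk is needed.
        self.role = "primary"


sync_store: Dict[str, str] = {}
primary = Master("master-1", "primary")
standby = Master("master-2", "standby")

primary.write("/data/a", "chunk-17", sync_store)
standby.refresh(sync_store)

standby.promote()
standby.write("/data/b", "chunk-18", sync_store)
```

Because `refresh` has already filled `standby.memory`, the promoted node can accept the second write immediately, which is the property the description relies on for short recovery times.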
The metadata synchronization module is a cluster formed by an odd number of nodes (generally 3 or 5), and the Paxos algorithm runs among the nodes of the cluster. The primary master writes metadata to the metadata synchronization module, and a write completes only after more than half of the metadata synchronization nodes have written it successfully. Using the Paxos algorithm guarantees data consistency while greatly improving system availability. The standby master periodically updates its metadata from the metadata synchronization module so that it stays consistent with the primary master, and it loads the metadata into memory. When the standby master is switched to primary master it can therefore provide service directly, without loading metadata from disk into memory, and the fault recovery time is on the order of seconds. Compared with synchronizing metadata directly from the primary master to the standby master, routing it through a metadata synchronization module that runs Paxos greatly improves system availability, because fewer than half of the module's nodes are allowed to be down, and it also improves the safety of the metadata.
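The "more than half" rule for completing a write can be expressed in a few lines. The sketch below covers only that majority-acknowledgement arithmetic and is not an implementation of Paxos; the function names are assumptions made for the example.

```python
from typing import List


def majority(cluster_size: int) -> int:
    """Smallest number of nodes that is more than half of the cluster."""
    return cluster_size // 2 + 1


def write_committed(acks: List[bool]) -> bool:
    """A write completes only when more than half of the nodes acknowledged it."""
    return sum(acks) >= majority(len(acks))


def tolerated_failures(cluster_size: int) -> int:
    """How many nodes may be down while writes can still complete."""
    return cluster_size - majority(cluster_size)


three_node_ok = write_committed([True, True, False])                  # 2 of 3
five_node_fail = write_committed([True, True, False, False, False])   # 2 of 5
```

With these definitions a 3-node cluster tolerates one failed node and a 5-node cluster tolerates two. A 4-node cluster still tolerates only one, which is a common reason for choosing odd cluster sizes.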
The fault recovery control module performs two main functions. First, it monitors the health of the masters, detecting their condition through RPC. Second, it establishes a connection with ZooKeeper, carries out the primary/standby election, and switches the state of each master according to the election result. The fault recovery control module makes fault detection and recovery automatic and avoids manual operations.
The ZooKeeper cluster is used for the primary/standby election, which works by creating a lock node on ZooKeeper: the master that creates the node successfully becomes the primary master, and the one that fails becomes the standby master. ZooKeeper's write consistency guarantees that only one master can successfully create the lock node at any given moment, so there is only one primary master at a time, which avoids the data inconsistency that two simultaneous primary masters would cause. In addition, using ZooKeeper's subscription feature, the standby master registers a watcher on the lock node. When the primary master fails, the lock node is deleted; on receiving the NodeDeleted notification the standby master immediately competes to create the lock node and, if it succeeds, switches to primary master.
It should be noted that a data communication connection exists between the ZooKeeper cluster and both the primary master and the standby master. When the connection between the ZooKeeper cluster and the primary master or the standby master has been broken for longer than a second threshold time, the ZooKeeper cluster actively deletes the lock node corresponding to that master.
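The second-threshold rule can be illustrated with a simulated registry that drops the lock node of a master whose connection has been silent for too long. The `LockRegistry` class, its method names, and the 5-second value used for the second threshold time are assumptions for this sketch; it does not use or reproduce the ZooKeeper API, and timestamps are passed in explicitly.

```python
from typing import Dict, Optional


class LockRegistry:
    """Simulated coordination service that drops the lock of a silent owner."""

    def __init__(self, second_threshold: float) -> None:
        self.second_threshold = second_threshold   # seconds of allowed silence
        self.lock_owner: Optional[str] = None
        self._last_seen: Dict[str, float] = {}

    def touch(self, master: str, now: float) -> None:
        # Any message from a master shows that its connection is alive.
        self._last_seen[master] = now

    def acquire(self, master: str, now: float) -> bool:
        self.touch(master, now)
        if self.lock_owner is None:
            self.lock_owner = master
            return True
        return False

    def expire(self, now: float) -> Optional[str]:
        # Delete the lock node if its owner has been silent for too long.
        owner = self.lock_owner
        if owner is not None and now - self._last_seen[owner] > self.second_threshold:
            self.lock_owner = None
            return owner
        return None


registry = LockRegistry(second_threshold=5.0)
registry.acquire("master-1", now=0.0)
registry.touch("master-1", now=4.0)
early = registry.expire(now=8.0)    # silent for 4 s: lock kept
late = registry.expire(now=9.5)     # silent for 5.5 s: lock deleted
taken = registry.acquire("master-2", now=9.6)
```

Once the lock has been released, the other master can acquire it, which is what allows the switchover to proceed even when the failed master cannot delete its own lock node.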
Introducing ZooKeeper and using its write consistency for the primary/standby election guarantees that the system has only one primary master at any moment, prevents split-brain (two primary masters existing at the same time), and ensures the consistency of the metadata. ZooKeeper's subscription feature also delivers a prompt notification when the lock node is deleted, which makes the election more efficient.
Fig. 5 shows the fault recovery process. When the primary master fails, the fault recovery control module detects the fault quickly and automatically and deletes the lock node on ZooKeeper. The standby master receives the lock node deletion notification, promptly re-runs the primary/standby election through ZooKeeper, becomes the new primary master, and provides service externally. The whole fault recovery process is automated and requires no manual intervention.
Because the standby master loads metadata directly into memory when it synchronizes from the metadata synchronization module, once it is switched to primary master it can provide service directly without loading metadata from local disk, so the fault recovery time is short.
With the MooseFS high-availability system based on active-standby switchover provided by the invention, the active-standby switchover is carried out quickly and automatically when a master fails, without manual intervention, and the fault recovery time is on the order of seconds, which makes MooseFS highly available.
It can be understood that the above embodiments are merely exemplary embodiments used to illustrate the principle of the invention, and the invention is not limited to them. Those of ordinary skill in the art can make various changes and improvements without departing from the spirit and essence of the invention, and such changes and improvements are also regarded as falling within the protection scope of the invention.

Claims (10)

1. A MooseFS high-availability method based on active-standby switchover, characterized in that the MooseFS includes metadata nodes and multiple data nodes, the metadata nodes include a first metadata node and a second metadata node, and the method includes:
determining, through a primary/standby election, that the first metadata node is the primary metadata node and the second metadata node is the standby metadata node;
the primary metadata node exchanging data with the multiple data nodes and synchronizing metadata to a metadata synchronization module in real time, and the standby metadata node periodically obtaining metadata from the metadata synchronization module;
monitoring the working state of the primary metadata node in real time and judging whether the working state of the primary metadata node is abnormal;
if the working state of the primary metadata node is abnormal, triggering a new primary/standby election;
if the standby metadata node wins the election, making the standby metadata node the new primary metadata node.
2. the MooseFS high availability methods according to claim 1 based on active-standby switch, which is characterized in that described to be based on The MooseFS high availability methods of active-standby switch further include:
If the main metadata node is elected successfully, return and perform the main metadata node and multiple back end into line number According to transmission, and metadata is synchronized to the metadata synchronization module in real time, the standby metadata node is periodically from first number Metadata is obtained according to synchronization module.
3. The MooseFS high-availability method based on active-standby switching according to claim 1, characterized in that taking the standby metadata node as the new main metadata node if the standby metadata node wins the election comprises:
if the standby metadata node wins the election and the working state of the former main metadata node returns to normal, taking the standby metadata node as the new main metadata node and the former main metadata node as the new standby metadata node, and returning to the step in which the main metadata node performs data transmission with the plurality of data nodes and synchronizes metadata to the metadata synchronization module in real time, and the standby metadata node periodically obtains metadata from the metadata synchronization module.
4. The MooseFS high-availability method based on active-standby switching according to claim 1, characterized in that taking the standby metadata node as the new main metadata node if the standby metadata node wins the election comprises:
if the standby metadata node wins the election and the working state of the former main metadata node is still abnormal, taking the standby metadata node as the new main metadata node, which performs data transmission with the plurality of data nodes and synchronizes metadata to the metadata synchronization module in real time.
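The three outcomes of a re-election described in claims 2 to 4 can be summarized as a small decision function. This is only a sketch: the function name, the `Roles` type, and the use of `None` to represent "no standby node" are assumptions for illustration and do not appear in the claims.

```python
from typing import NamedTuple, Optional


class Roles(NamedTuple):
    main: str
    standby: Optional[str]


def roles_after_election(winner: str, old_main: str, old_standby: str,
                         old_main_recovered: bool) -> Roles:
    """Return the role assignment after a re-election (hypothetical helper)."""
    if winner == old_main:
        # Claim 2: the main node wins again, so the roles are unchanged.
        return Roles(main=old_main, standby=old_standby)
    if old_main_recovered:
        # Claim 3: the roles are swapped.
        return Roles(main=old_standby, standby=old_main)
    # Claim 4: the new main node serves alone; no standby pulls metadata.
    return Roles(main=old_standby, standby=None)
```

For example, `roles_after_election("mds2", "mds1", "mds2", old_main_recovered=False)` yields a new main node `mds2` with no standby node, which corresponds to claim 4.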
5. The MooseFS high-availability method based on active-standby switching according to any one of claims 1 to 4, characterized in that the MooseFS high-availability method based on active-standby switching further comprises, before the step of controlling the active-standby election to be carried out again if the working state of the main metadata node is abnormal:
periodically sending heartbeat packets to the main metadata node and receiving feedback information;
if no feedback information is received within a first threshold time after the current heartbeat packet is sent, judging that the working state of the main metadata node is abnormal.
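The heartbeat check of claim 5 can be sketched as a small timeout tracker. The class and method names are hypothetical, timestamps are passed in explicitly so that the logic is independent of any clock, and tracking the oldest unanswered heartbeat is an assumption about how overlapping heartbeats would be handled.

```python
from typing import Optional


class HeartbeatMonitor:
    """Hypothetical tracker for the 'first threshold time' rule of claim 5."""

    def __init__(self, first_threshold: float) -> None:
        self.first_threshold = first_threshold      # seconds to wait for feedback
        self.waiting_since: Optional[float] = None  # oldest unanswered heartbeat

    def on_heartbeat_sent(self, now: float) -> None:
        # Keep the send time of the oldest heartbeat that is still unanswered.
        if self.waiting_since is None:
            self.waiting_since = now

    def on_feedback(self) -> None:
        # Feedback received: the main metadata node is responsive.
        self.waiting_since = None

    def is_abnormal(self, now: float) -> bool:
        # Abnormal once a heartbeat has gone unanswered for the threshold time.
        if self.waiting_since is None:
            return False
        return now - self.waiting_since >= self.first_threshold


monitor = HeartbeatMonitor(first_threshold=3.0)
monitor.on_heartbeat_sent(now=0.0)
monitor.on_feedback()                       # answered in time
monitor.on_heartbeat_sent(now=5.0)          # this one is never answered
timed_out = monitor.is_abnormal(now=8.0)
```

In this sketch `timed_out` becomes true at 8.0 seconds, three seconds after the unanswered heartbeat, which is the point at which the re-election of claim 1 would be triggered.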
6. The MooseFS high-availability method based on active-standby switching according to any one of claims 1 to 4, characterized in that the active-standby election comprises:
the first metadata node and the second metadata node simultaneously competing to create a lock node;
if the first metadata node creates the lock node successfully, the first metadata node wins the election and is determined to be the main metadata node;
otherwise, the second metadata node wins the election and is determined to be the main metadata node.
7. The MooseFS high-availability method based on active-standby switching according to claim 6, characterized in that the active-standby election further comprises, after the step in which the first metadata node creates the lock node successfully, wins the election, and is determined to be the main metadata node:
the second metadata node setting a watch on the lock node;
if the working state of the main metadata node is abnormal, the lock node is deleted, and the second metadata node competes to create a new lock node after receiving the notification of the watch;
if the second metadata node creates the new lock node successfully, the second metadata node wins the election and is determined to be the new main metadata node.
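The lock-node election of claims 6 and 7 can be illustrated with a single-process simulation. The `LockService` class below is a hypothetical in-memory model and is not the ZooKeeper client API; the simultaneous competition is modeled as two sequential calls in which the first caller happens to win, and the node names `mds1` and `mds2` are placeholders.

```python
from typing import Callable, List, Optional


class LockService:
    """Hypothetical in-memory model of one lock node; not the ZooKeeper API."""

    def __init__(self) -> None:
        self.owner: Optional[str] = None
        self._watchers: List[Callable[[], None]] = []

    def try_create(self, node_id: str) -> bool:
        # Only one caller can create the lock node while it exists.
        if self.owner is None:
            self.owner = node_id
            return True
        return False

    def watch(self, callback: Callable[[], None]) -> None:
        # One-shot watch that fires when the lock node is deleted.
        self._watchers.append(callback)

    def delete(self) -> None:
        self.owner = None
        watchers, self._watchers = self._watchers, []
        for notify in watchers:
            notify()


lock = LockService()
first_won = lock.try_create("mds1")    # first metadata node creates the lock node
second_won = lock.try_create("mds2")   # second metadata node loses the race

# Claim 7: the loser sets a watch and competes again when notified.
lock.watch(lambda: lock.try_create("mds2"))

lock.delete()  # the main node was judged abnormal and the lock node is deleted
```

After `lock.delete()` the watch callback runs and `lock.owner` is `"mds2"`, which corresponds to the second metadata node being determined as the new main metadata node.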
8. A MooseFS high-availability system based on active-standby switching, characterized in that the MooseFS high-availability system based on active-standby switching comprises a MooseFS, a metadata synchronization module, a fault recovery control module and a ZooKeeper cluster, the MooseFS comprises metadata nodes and a plurality of data nodes, the metadata nodes comprise a first metadata node and a second metadata node, the metadata synchronization module, the fault recovery control module and the data nodes are each connected to the first metadata node and the second metadata node, the ZooKeeper cluster is connected to the fault recovery control module, and one of the first metadata node and the second metadata node is the main metadata node while the other is the standby metadata node;
the main metadata node is configured to perform data transmission with the data nodes and to synchronize metadata to the metadata synchronization module in real time;
the standby metadata node is configured to periodically obtain metadata from the metadata synchronization module;
the fault recovery control module is configured to monitor the working state of the main metadata node in real time, to judge whether the working state of the main metadata node is abnormal, to control the active-standby election to be carried out again if the working state of the main metadata node is abnormal, and to control active-standby switching according to the election result;
the ZooKeeper cluster is configured to carry out the active-standby election.
9. The MooseFS high-availability system based on active-standby switching according to claim 8, characterized in that the metadata synchronization module comprises a cluster composed of an odd number of nodes, and the metadata of the main metadata node is synchronized among the nodes of the cluster by running the Paxos algorithm.
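Claim 9 does not state why the cluster has an odd number of nodes. One common reason in majority-based protocols such as Paxos is that an even-sized cluster tolerates no more failures than the next smaller odd-sized one. The sketch below only computes quorum sizes under that assumption; it is not an implementation of Paxos, and the function names are hypothetical.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """Nodes that may fail while a majority can still be formed."""
    return cluster_size - majority(cluster_size)


# Quorum size and failure tolerance for small clusters.
table = {n: (majority(n), tolerated_failures(n)) for n in range(3, 7)}
```

Here a cluster of 3 and a cluster of 4 both tolerate one failed node, while a cluster of 5 tolerates two, so adding a fourth node increases cost without increasing fault tolerance.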
10. The MooseFS high-availability system based on active-standby switching according to claim 8, characterized in that the fault recovery control module monitors the working state of the main metadata node and the working state of the standby metadata node through RPC.
CN201810111014.9A 2018-02-05 2018-02-05 A kind of MooseFS high availability methods and system based on active-standby switch Pending CN108173971A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810111014.9A CN108173971A (en) 2018-02-05 2018-02-05 A kind of MooseFS high availability methods and system based on active-standby switch

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810111014.9A CN108173971A (en) 2018-02-05 2018-02-05 A kind of MooseFS high availability methods and system based on active-standby switch

Publications (1)

Publication Number Publication Date
CN108173971A true CN108173971A (en) 2018-06-15

Family

ID=62512764

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810111014.9A Pending CN108173971A (en) 2018-02-05 2018-02-05 A kind of MooseFS high availability methods and system based on active-standby switch

Country Status (1)

Country Link
CN (1) CN108173971A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102890716A (en) * 2012-09-29 2013-01-23 南京中兴新软件有限责任公司 Distributed file system and data backup method thereof
CN104486447A (en) * 2014-12-30 2015-04-01 成都因纳伟盛科技股份有限公司 Large platform cluster system based on Big-Cluster
CN205901808U (en) * 2016-08-05 2017-01-18 国家电网公司 Accomplish distributed storage system of first data nodes automatic switch -over

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WEIXIN_30680385: "https://blog.csdn.net/weixin_30680385/article/details/95142822", 《CSDN论坛》 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117322A (en) * 2018-08-28 2019-01-01 郑州云海信息技术有限公司 A kind of control method, system, equipment and the storage medium of server master-slave redundancy
CN110874292A (en) * 2018-08-29 2020-03-10 中车株洲电力机车研究所有限公司 Redundant display system
CN110955382A (en) * 2018-09-26 2020-04-03 华为技术有限公司 Method and device for writing data in distributed system
CN109286529A (en) * 2018-10-31 2019-01-29 武汉烽火信息集成技术有限公司 A kind of method and system for restoring RabbitMQ network partition
CN109286529B (en) * 2018-10-31 2021-08-10 武汉烽火信息集成技术有限公司 Method and system for recovering RabbitMQ network partition
WO2021082868A1 (en) * 2019-10-31 2021-05-06 北京金山云网络技术有限公司 Data managmenet method for distributed storage system, apparatus, and electronic device
US11966305B2 (en) 2019-10-31 2024-04-23 Beijing Kingsoft Cloud Network Technology Co., Ltd. Data processing method for distributed storage system, apparatus, and electronic device
CN111865659A (en) * 2020-06-10 2020-10-30 新华三信息安全技术有限公司 Method and device for switching master controller and slave controller, controller and network equipment
CN111865659B (en) * 2020-06-10 2023-12-29 新华三信息安全技术有限公司 Main and standby controller switching method and device, controller and network equipment

Similar Documents

Publication Publication Date Title
CN108173971A (en) A kind of MooseFS high availability methods and system based on active-standby switch
CN109729129A (en) Configuration modification method, storage cluster and the computer system of storage cluster
US20120197822A1 (en) System and method for using cluster level quorum to prevent split brain scenario in a data grid cluster
AU2014312103A1 (en) Distributed file system using consensus nodes
CN111460039A (en) Relational database processing system, client, server and method
CN105471622A (en) High-availability method and system for main/standby control node switching based on Galera
CN102394914A (en) Cluster brain-split processing method and device
CN110581782A (en) Disaster tolerance data processing method, device and system
CN110971662A (en) Two-node high-availability implementation method and device based on Ceph
CN114124650A (en) Master-slave deployment method of SPTN (shortest Path bridging) network controller
CN103795572A (en) Method for switching master server and slave server and monitoring server
CN110333986B (en) Method for guaranteeing availability of redis cluster
CN107071189B (en) Connection method of communication equipment physical interface
CN115292283A (en) Master-slave high-availability switching method based on disk cabinet
CN110377487A (en) A kind of method and device handling high-availability cluster fissure
CN116185697B (en) Container cluster management method, device and system, electronic equipment and storage medium
CN105323271B (en) Cloud computing system and processing method and device thereof
WO2002001347A2 (en) Method and system for automatic re-assignment of software components of a failed host
JP5285044B2 (en) Cluster system recovery method, server, and program
CN114598593B (en) Message processing method, system, computing device and computer storage medium
CN115878361A (en) Node management method and device for database cluster and electronic equipment
CN114363350A (en) Service management system and method
CN113794765A (en) Gate load balancing method and device based on file transmission
CN115145782A (en) Server switching method, mooseFS system and storage medium
CN115242701B (en) Airport data platform cluster consumption processing method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180615