CN108173971A - MooseFS high-availability method and system based on active-standby switching
- Publication number
- CN108173971A (CN201810111014.9A)
- Authority
- CN
- China
- Prior art keywords
- metadata
- metadata node
- node
- standby
- main
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/202—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
- G06F11/2023—Failover techniques
- G06F11/2028—Failover techniques eliminating a faulty processor or activating a spare
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/06—Management of faults, events, alarms or notifications
- H04L41/0654—Management of faults, events, alarms or notifications using network fault recovery
- H04L41/0668—Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure
Abstract
The invention discloses a MooseFS high-availability method based on active-standby switching, comprising: determining, by active/standby election, that a first metadata node is the main metadata node and a second metadata node is the standby metadata node; the main metadata node exchanging data with multiple data nodes and synchronizing metadata to a metadata synchronization module in real time, while the standby metadata node periodically obtains metadata from the metadata synchronization module; monitoring the working state of the main metadata node in real time and judging whether that state is abnormal; if the working state of the main metadata node is abnormal, triggering a new active/standby election; and if the standby metadata node wins the election, taking the standby metadata node as the new main metadata node. The invention also discloses a MooseFS high-availability system based on active-standby switching. The method provided by the invention achieves high availability for MooseFS.
Description
Technical field
The present invention relates to the technical field of distributed storage, and in particular to a MooseFS high-availability method based on active-standby switching and a MooseFS high-availability system based on active-standby switching.
Background art
MooseFS is an open-source distributed file system that mainly uses a master/slave architecture. The master is the metadata node, referred to as master; the slaves are the data nodes, referred to as chunkservers. The metadata node is the core of MooseFS, and there is only one. It manages the metadata of the entire file system and the clients' access to the file system. Data nodes store the actual file data and synchronize data copies between different nodes; a system generally has multiple data nodes. All client read and write requests must pass through the metadata node. The master/slave architecture of MooseFS greatly simplifies the design. However, if the sole metadata node fails, the whole system can no longer provide service and no client read or write request can be answered. This is the single point of failure (SPOF) problem of MooseFS. High availability for MooseFS means that service can still be provided when the metadata node fails. However, MooseFS has only one metadata node and no standby node. When the sole metadata node fails, no other metadata node can take over its work, so service is necessarily interrupted. Service unavailability may cause an enterprise huge losses.
As shown in Fig. 1, the prior art usually addresses the single point of failure of the MooseFS metadata node by adding a metadata backup node, the metalogger. The metadata node (master) synchronizes metadata to the metalogger. When the master fails or metadata is lost, the metadata files are copied from the node hosting the metalogger and the metadata node is then restarted. Fig. 2 shows how the structure of Fig. 1 behaves during failure recovery. As Fig. 2 shows, this scheme has no automatic failure detection: failures must be found manually, so a failure may go unnoticed for some time and the service stays unavailable for a long time. Moreover, the metalogger in this scheme only backs up metadata and cannot replace the master (it cannot respond to client read and write requests). Failure recovery requires restarting the master, and loading the metadata from local disk into memory takes a long time (on the order of minutes), which prolongs the outage. The scheme depends heavily on manual operation and maintenance, and its failure recovery time is long.
Therefore, how to achieve automatic failure detection and failure recovery so that MooseFS becomes highly available is a technical problem that those skilled in the art urgently need to solve.
Summary of the invention
The present invention aims to solve at least one of the technical problems in the prior art by providing a MooseFS high-availability method based on active-standby switching and a MooseFS high-availability system based on active-standby switching.
As a first aspect of the invention, a MooseFS high-availability method based on active-standby switching is provided, wherein the MooseFS includes metadata nodes and multiple data nodes, the metadata nodes include a first metadata node and a second metadata node, and the method includes:
Determining, by active/standby election, that the first metadata node is the main metadata node and the second metadata node is the standby metadata node;
The main metadata node exchanging data with the multiple data nodes and synchronizing metadata to a metadata synchronization module in real time, while the standby metadata node periodically obtains metadata from the metadata synchronization module;
Monitoring the working state of the main metadata node in real time, and judging whether the working state of the main metadata node is abnormal;
If the working state of the main metadata node is abnormal, triggering a new active/standby election;
If the standby metadata node wins the election, taking the standby metadata node as the new main metadata node.
Preferably, the method further includes:
If the main metadata node wins the election, returning to the step in which the main metadata node exchanges data with the multiple data nodes and synchronizes metadata to the metadata synchronization module in real time, while the standby metadata node periodically obtains metadata from the metadata synchronization module.
Preferably, taking the standby metadata node as the new main metadata node if the standby metadata node wins the election includes:
If the standby metadata node wins the election and the working state of the former main metadata node has returned to normal, taking the standby metadata node as the new main metadata node and the former main metadata node as the new standby metadata node, and returning to the step in which the main metadata node exchanges data with the multiple data nodes and synchronizes metadata to the metadata synchronization module in real time, while the standby metadata node periodically obtains metadata from the metadata synchronization module.
Preferably, taking the standby metadata node as the new main metadata node if the standby metadata node wins the election includes:
If the standby metadata node wins the election and the working state of the former main metadata node is still abnormal, taking the standby metadata node as the new main metadata node, which exchanges data with the multiple data nodes and synchronizes metadata to the metadata synchronization module in real time.
Preferably, the method further includes the following, carried out before the step of triggering a new active/standby election if the working state of the main metadata node is abnormal:
Periodically sending heartbeat packets to the main metadata node and receiving feedback information;
If no feedback information has been received within a first threshold time after the current heartbeat packet was sent, judging that the working state of the main metadata node is abnormal.
Preferably, the active/standby election includes:
The first metadata node and the second metadata node simultaneously competing to create a lock node;
If the first metadata node succeeds in creating the lock node, the first metadata node wins the election and is determined to be the main metadata node;
Otherwise, the second metadata node wins the election and is determined to be the main metadata node.
Preferably, the active/standby election further includes the following, carried out after the step in which the first metadata node succeeds in creating the lock node, wins the election, and is determined to be the main metadata node:
The second metadata node setting a watch on the lock node;
If the working state of the main metadata node is abnormal, the lock node is deleted, and the second metadata node, on receiving the watch notification, competes to create a new lock node;
If the second metadata node succeeds in creating the new lock node, the second metadata node wins the election and is determined to be the new main metadata node.
As a second aspect of the invention, a MooseFS high-availability system based on active-standby switching is provided, wherein the system includes MooseFS, a metadata synchronization module, a failure recovery control module, and a ZooKeeper cluster; the MooseFS includes metadata nodes and multiple data nodes; the metadata nodes include a first metadata node and a second metadata node; the metadata synchronization module, the failure recovery control module, and the data nodes are each connected to the first metadata node and the second metadata node; the ZooKeeper cluster is connected to the failure recovery control module; and one of the first metadata node and the second metadata node is the main metadata node while the other is the standby metadata node;
The main metadata node is used to exchange data with the data nodes and to synchronize metadata to the metadata synchronization module in real time;
The standby metadata node is used to periodically obtain metadata from the metadata synchronization module;
The failure recovery control module is used to monitor the working state of the main metadata node in real time, to judge whether that working state is abnormal, to trigger a new active/standby election if it is abnormal, and to control active-standby switching according to the election result;
The ZooKeeper cluster is used to carry out the active/standby election.
Preferably, the metadata synchronization module includes a cluster consisting of an odd number of nodes, and the cluster runs the Paxos algorithm to synchronize the metadata of the main metadata node.
Preferably, the failure recovery control module monitors the working states of the main metadata node and the standby metadata node via RPC.
The MooseFS high-availability method based on active-standby switching provided by the invention uses two metadata nodes, one serving as the main metadata node and the other as the standby metadata node. When the main metadata node fails, the failure is detected automatically and the standby metadata node is switched to main metadata node and continues to provide service. This achieves high availability for MooseFS; the entire failure recovery process needs no manual intervention, and the failure recovery time is short.
Description of the drawings
The accompanying drawings provide a further understanding of the invention and form part of the specification. Together with the detailed description below, they serve to explain the invention but do not limit it. In the drawings:
Fig. 1 is an architecture diagram of MooseFS in the prior art.
Fig. 2 is a flow chart of the failure recovery process in the prior art.
Fig. 3 is a flow chart of the MooseFS high-availability method based on active-standby switching provided by the invention.
Fig. 4 is an architecture diagram of the MooseFS high-availability system based on active-standby switching provided by the invention.
Fig. 5 is a flow chart of the failure recovery process of the MooseFS high-availability method based on active-standby switching provided by the invention.
Detailed description
Specific embodiments of the invention are described in detail below with reference to the drawings. It should be understood that the embodiments described here serve only to illustrate and explain the invention and are not intended to limit it.
As a first aspect of the invention, a MooseFS high-availability method based on active-standby switching is provided, wherein the MooseFS includes metadata nodes and multiple data nodes, and the metadata nodes include a first metadata node and a second metadata node. As shown in Fig. 3, the method includes:
S110, determining, by active/standby election, that the first metadata node is the main metadata node and the second metadata node is the standby metadata node;
S120, the main metadata node exchanging data with the multiple data nodes and synchronizing metadata to a metadata synchronization module in real time, while the standby metadata node periodically obtains metadata from the metadata synchronization module;
S130, monitoring the working state of the main metadata node in real time, and judging whether the working state of the main metadata node is abnormal;
S140, if the working state of the main metadata node is abnormal, triggering a new active/standby election;
S150, if the standby metadata node wins the election, taking the standby metadata node as the new main metadata node.
The MooseFS high-availability method based on active-standby switching provided by the invention uses two metadata nodes, one serving as the main metadata node and the other as the standby metadata node. When the main metadata node fails, the failure is detected automatically and the standby metadata node is switched to main metadata node and continues to provide service. This achieves high availability for MooseFS; the entire failure recovery process needs no manual intervention, and the failure recovery time is short.
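The control flow of steps S130 to S150 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `MetadataNode`, its `healthy` flag, and `failover_step` are hypothetical names that do not appear in the patent, and the election is reduced to "the healthy standby wins" rather than the lock-node competition described later.

```python
from dataclasses import dataclass


@dataclass
class MetadataNode:
    # Hypothetical model of a metadata node; not part of MooseFS.
    name: str
    role: str = "standby"   # "main" or "standby"
    healthy: bool = True


def failover_step(main, standby):
    """One pass over S130-S150; returns the (main, standby) pair afterwards."""
    if main.healthy:                      # S130: working state is normal
        return main, standby
    # S140: the main node is abnormal, so the election is re-run. In this
    # sketch the election is reduced to "the healthy standby wins".
    if not standby.healthy:
        raise RuntimeError("no healthy metadata node can take over")
    standby.role, main.role = "main", "standby"   # S150: swap roles
    return standby, main
```

Calling `failover_step` repeatedly with the pair it returns mimics the loop back to S120 after each election.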
As a specific embodiment, the method further includes:
If the main metadata node wins the election, returning to the step in which the main metadata node exchanges data with the multiple data nodes and synchronizes metadata to the metadata synchronization module in real time, while the standby metadata node periodically obtains metadata from the metadata synchronization module.
As another specific embodiment, taking the standby metadata node as the new main metadata node if the standby metadata node wins the election includes:
If the standby metadata node wins the election and the working state of the former main metadata node has returned to normal, taking the standby metadata node as the new main metadata node and the former main metadata node as the new standby metadata node, and returning to the step in which the main metadata node exchanges data with the multiple data nodes and synchronizes metadata to the metadata synchronization module in real time, while the standby metadata node periodically obtains metadata from the metadata synchronization module.
As yet another specific embodiment, taking the standby metadata node as the new main metadata node if the standby metadata node wins the election includes:
If the standby metadata node wins the election and the working state of the former main metadata node is still abnormal, taking the standby metadata node as the new main metadata node, which exchanges data with the multiple data nodes and synchronizes metadata to the metadata synchronization module in real time.
Specifically, the method further includes the following, carried out before the step of triggering a new active/standby election if the working state of the main metadata node is abnormal:
Periodically sending heartbeat packets to the main metadata node and receiving feedback information;
If no feedback information has been received within a first threshold time after the current heartbeat packet was sent, judging that the working state of the main metadata node is abnormal.
It will be understood that heartbeat packets are sent to the main metadata node periodically, and whether the main metadata node is working normally is judged by whether feedback for a heartbeat packet is received within the first threshold time.
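The heartbeat rule above amounts to a timeout check on the oldest unanswered heartbeat. The following is a minimal sketch, assuming a hypothetical `HeartbeatMonitor` class and an injectable clock so the logic can be exercised without real waiting; the patent does not specify the transport or the threshold value.

```python
import time


class HeartbeatMonitor:
    """Tracks whether feedback arrives within the first threshold time."""

    def __init__(self, first_threshold_s, clock=time.monotonic):
        self.first_threshold_s = first_threshold_s
        self.clock = clock
        self.pending_since = None   # send time of the oldest unanswered heartbeat

    def on_send(self):
        # Remember only the oldest outstanding heartbeat.
        if self.pending_since is None:
            self.pending_since = self.clock()

    def on_reply(self):
        # Any feedback clears the outstanding heartbeat.
        self.pending_since = None

    def is_abnormal(self):
        # Abnormal once a heartbeat has gone unanswered longer than the threshold.
        if self.pending_since is None:
            return False
        return self.clock() - self.pending_since > self.first_threshold_s
```

A caller would invoke `on_send` on each timer tick, `on_reply` on each feedback message, and poll `is_abnormal` to decide whether to trigger a new election.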
Specifically, the active/standby election includes:
The first metadata node and the second metadata node simultaneously competing to create a lock node;
If the first metadata node succeeds in creating the lock node, the first metadata node wins the election and is determined to be the main metadata node;
Otherwise, the second metadata node wins the election and is determined to be the main metadata node.
More specifically, the active/standby election further includes the following, carried out after the step in which the first metadata node succeeds in creating the lock node, wins the election, and is determined to be the main metadata node:
The second metadata node setting a watch on the lock node;
If the working state of the main metadata node is abnormal, the lock node is deleted, and the second metadata node, on receiving the watch notification, competes to create a new lock node;
If the second metadata node succeeds in creating the new lock node, the second metadata node wins the election and is determined to be the new main metadata node.
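The lock-node election and the watch-triggered re-election can be illustrated with an in-memory stand-in. The sketch below deliberately does not use the ZooKeeper client API: `LockService` and `Candidate` are hypothetical classes that only imitate the two properties the text relies on, namely that a single creator of the lock node succeeds and that a watch fires once when the node is deleted.

```python
class LockService:
    """In-memory stand-in for the coordination service; not the ZooKeeper API."""

    def __init__(self):
        self.owner = None
        self.watchers = []

    def try_create(self, node_name):
        # Create the lock node if absent; only one caller can succeed.
        if self.owner is None:
            self.owner = node_name
            return True
        return False

    def watch(self, callback):
        self.watchers.append(callback)

    def delete(self):
        # Remove the lock node and notify watchers; each watch fires once.
        self.owner = None
        watchers, self.watchers = self.watchers, []
        for callback in watchers:
            callback()


class Candidate:
    def __init__(self, name, locks):
        self.name, self.locks, self.role = name, locks, "standby"

    def compete(self):
        if self.locks.try_create(self.name):
            self.role = "main"
        else:
            self.role = "standby"
            self.locks.watch(self.compete)   # retry when the lock node is deleted
```

In this simulation the failed node's own `role` field is not reset when its lock node is deleted; a real deployment would demote or restart the former main node separately.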
As a second aspect of the invention, a MooseFS high-availability system based on active-standby switching is provided. As shown in Fig. 4, the system includes MooseFS, a metadata synchronization module, a failure recovery control module, and a ZooKeeper cluster; the MooseFS includes metadata nodes and multiple data nodes; the metadata nodes include a first metadata node and a second metadata node; the metadata synchronization module, the failure recovery control module, and the data nodes are each connected to the first metadata node and the second metadata node; the ZooKeeper cluster is connected to the failure recovery control module; and one of the first metadata node and the second metadata node is the main metadata node while the other is the standby metadata node;
The main metadata node is used to exchange data with the data nodes and to synchronize metadata to the metadata synchronization module in real time;
The standby metadata node is used to periodically obtain metadata from the metadata synchronization module;
The failure recovery control module is used to monitor the working state of the main metadata node in real time, to judge whether that working state is abnormal, to trigger a new active/standby election if it is abnormal, and to control active-standby switching according to the election result;
The ZooKeeper cluster is used to carry out the active/standby election.
The MooseFS high-availability system based on active-standby switching provided by the invention uses two metadata nodes, one serving as the main metadata node and the other as the standby metadata node. When the main metadata node fails, the failure is detected automatically and the standby metadata node is switched to main metadata node and continues to provide service. This achieves high availability for MooseFS; the entire failure recovery process needs no manual intervention, and the failure recovery time is short.
Specifically, the metadata synchronization module includes a cluster consisting of an odd number of nodes, and the cluster runs the Paxos algorithm to synchronize the metadata of the main metadata node.
Specifically, the failure recovery control module monitors the working states of the main metadata node and the standby metadata node via RPC.
To make the above method and system clearer, they are described in detail below with reference to Fig. 4 and Fig. 5.
As shown in Fig. 4, the MooseFS high-availability system based on active-standby switching provided by the invention mainly includes a first metadata node, a second metadata node, multiple data nodes, a metadata synchronization module, a failure recovery control module, and a ZooKeeper cluster. The first metadata node serves as the main master and the second metadata node as the standby master. Only the main master provides service externally.
The main master synchronizes metadata to the metadata synchronization module; the standby master periodically obtains metadata from the metadata synchronization module and loads it into memory, keeping its metadata consistent with that of the main master.
The failure recovery control module monitors the health of the two masters, interacts with ZooKeeper to complete the active/standby election, and finally carries out the active-standby switch. The main and standby masters in the invention have identical functionality and perform different operations depending on their state: the main master responds to client read and write requests and synchronizes metadata to the metadata synchronization module; the standby master pulls metadata updates from the metadata synchronization module into memory.
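The idea that both masters run the same code and differ only by state can be sketched as follows. This is an assumption-laden illustration: `Master`, `handle_write`, and `pull` are hypothetical names, and a plain dictionary stands in for the metadata synchronization module.

```python
class Master:
    """Same code for both masters; behaviour depends only on the current role."""

    def __init__(self, sync_store):
        self.sync_store = sync_store   # shared dict standing in for the sync module
        self.role = "standby"
        self.memory = {}               # in-memory metadata

    def handle_write(self, path, meta):
        # Only the main master serves clients and pushes metadata.
        if self.role != "main":
            raise PermissionError("only the main master serves clients")
        self.memory[path] = meta
        self.sync_store[path] = meta

    def pull(self):
        # The standby master refreshes its in-memory copy from the sync module.
        if self.role == "standby":
            self.memory.update(self.sync_store)
```

Because the standby keeps `memory` warm through `pull`, promoting it only requires changing `role`, which is the property the description uses to explain the short recovery time.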
The metadata synchronization module is a cluster of an odd number of nodes (generally 3 or 5) that runs the Paxos algorithm. The main master writes metadata to the metadata synchronization module, and a write completes only after more than half of the metadata synchronization nodes have written it successfully. Using Paxos guarantees data consistency while greatly improving system availability. The standby master periodically updates its metadata from the metadata synchronization module so that it stays consistent with the main master, and loads that metadata into memory. After the standby master is switched to main master it can therefore provide service directly, without loading metadata from disk into memory, and the failure recovery time is on the order of seconds. Compared with synchronizing metadata directly from the main master to the standby master, running Paxos in the metadata synchronization module substantially improves system availability (because fewer than half of the metadata synchronization module's nodes are allowed to go down) and also improves the safety of the metadata.
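The majority rule behind "more than half of the nodes must write successfully" is simple arithmetic. The helper names below are hypothetical and the code shows only the quorum calculation, not the Paxos protocol itself.

```python
def quorum_size(n):
    """Smallest number of nodes that is more than half of n."""
    return n // 2 + 1


def tolerated_failures(n):
    """How many nodes may be down while a majority can still be reached."""
    return n - quorum_size(n)


def write_committed(acks, n):
    """A metadata write completes only once a majority has acknowledged it."""
    return acks >= quorum_size(n)
```

For the cluster sizes mentioned in the text, a 3-node module needs 2 acknowledgements and survives 1 failure, while a 5-node module needs 3 and survives 2.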
The failure recovery control module performs two main functions. First, it monitors the health of the masters, checking their status via RPC. Second, it establishes a connection with ZooKeeper, carries out the active/standby election, and switches master state according to the election result. The failure recovery control module thus detects and recovers from failures automatically, avoiding manual operational intervention.
The ZooKeeper cluster is used for the active/standby election. Each master tries to create a lock node on ZooKeeper: the one that succeeds becomes the main master, and the one that fails becomes the standby master. ZooKeeper's write consistency guarantees that only one master can successfully create the lock node at any given moment, so there is only one main master at a time, which avoids the data inconsistency caused by two main masters existing simultaneously. In addition, using ZooKeeper's subscription feature, the standby master registers a watcher on the lock node. When the main master fails, the lock node is deleted; on receiving the NodeDeleted notification, the standby master immediately competes to create the lock node and, if it succeeds, switches to main master.
It should be noted that data communication connections exist between the ZooKeeper cluster and the main master and standby master. When the connection between the ZooKeeper cluster and the main master or standby master has been broken for longer than a second threshold time, the ZooKeeper cluster actively deletes the lock node corresponding to that main or standby master.
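The timeout-based deletion of the lock node can be modelled as a lock that expires when its owner has not been heard from for longer than the second threshold time. `EphemeralLock` below is a hypothetical, in-memory illustration with explicit timestamps; it is not how ZooKeeper is implemented and does not use its API.

```python
class EphemeralLock:
    """Lock that is released when its owner stays disconnected too long."""

    def __init__(self, second_threshold_s):
        self.second_threshold_s = second_threshold_s
        self.owner = None
        self.last_seen = None

    def acquire(self, owner, now):
        # Only succeeds while no lock node exists.
        if self.owner is None:
            self.owner, self.last_seen = owner, now
            return True
        return False

    def touch(self, owner, now):
        # The owner proves it is still connected.
        if owner == self.owner:
            self.last_seen = now

    def expire(self, now):
        # Delete the lock if its owner has been silent beyond the threshold.
        if self.owner is not None and now - self.last_seen > self.second_threshold_s:
            self.owner = None
            return True
        return False
```

Once `expire` returns `True`, the other master can acquire the lock, which corresponds to the standby winning the next election.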
Introducing ZooKeeper and using its write consistency for the active/standby election guarantees that the system has only one main master at any time, prevents split-brain (two main masters existing simultaneously), and ensures metadata consistency. ZooKeeper's subscription feature also delivers a prompt notification when the lock node is deleted, which makes the active/standby election more efficient.
Fig. 5 shows the failure recovery process. When the main master fails, the failure recovery control module quickly and automatically detects the failure and deletes the lock node on ZooKeeper. The standby master receives the lock-node deletion notification, quickly re-runs the active/standby election through ZooKeeper, becomes the new main master, and provides service externally. The entire failure recovery process is automated and needs no manual intervention.
Because the standby master loads metadata directly into memory as it synchronizes from the metadata synchronization module, it can provide service immediately after being switched to main master without loading metadata from local disk, so the failure recovery time is short.
With the MooseFS high-availability system based on active-standby switching provided by the invention, an active-standby switch is carried out quickly and automatically when the master fails, with no manual intervention and a failure recovery time on the order of seconds, achieving high availability for MooseFS.
It will be understood that the above embodiments are merely exemplary embodiments used to illustrate the principles of the invention, and the invention is not limited to them. Those of ordinary skill in the art may make various changes and modifications without departing from the spirit and essence of the invention, and such changes and modifications also fall within the protection scope of the invention.
Claims (10)
1. A MooseFS high-availability method based on active-standby switching, characterized in that the MooseFS includes metadata nodes and multiple data nodes, the metadata nodes include a first metadata node and a second metadata node, and the method includes:
Determining, by active/standby election, that the first metadata node is the main metadata node and the second metadata node is the standby metadata node;
The main metadata node exchanging data with the multiple data nodes and synchronizing metadata to a metadata synchronization module in real time, while the standby metadata node periodically obtains metadata from the metadata synchronization module;
Monitoring the working state of the main metadata node in real time, and judging whether the working state of the main metadata node is abnormal;
If the working state of the main metadata node is abnormal, triggering a new active/standby election;
If the standby metadata node wins the election, taking the standby metadata node as the new main metadata node.
2. The MooseFS high-availability method based on active-standby switching according to claim 1, characterized in that the method further includes:
If the main metadata node wins the election, returning to the step in which the main metadata node exchanges data with the multiple data nodes and synchronizes metadata to the metadata synchronization module in real time, while the standby metadata node periodically obtains metadata from the metadata synchronization module.
3. The MooseFS high-availability method based on active-standby switching according to claim 1, characterized in that taking the standby metadata node as the new main metadata node if the standby metadata node wins the election includes:
If the standby metadata node wins the election and the working state of the former main metadata node has returned to normal, taking the standby metadata node as the new main metadata node and the former main metadata node as the new standby metadata node, and returning to the step in which the main metadata node exchanges data with the multiple data nodes and synchronizes metadata to the metadata synchronization module in real time, while the standby metadata node periodically obtains metadata from the metadata synchronization module.
4. The MooseFS high-availability method based on active-standby switching according to claim 1, characterized in that the step of taking the standby metadata node as the new main metadata node if the standby metadata node wins the election comprises:
If the standby metadata node wins the election and the working state of the former main metadata node is still abnormal, taking the standby metadata node as the new main metadata node, which exchanges data with the plurality of data nodes and synchronizes metadata to the metadata synchronization module in real time.
5. The MooseFS high-availability method based on active-standby switching according to any one of claims 1 to 4, characterized in that the method further comprises, before the step of controlling the active-standby election to be carried out again if the working state of the main metadata node is abnormal:
Periodically sending a heartbeat packet to the main metadata node and receiving feedback information;
If no feedback information has been received once a first threshold time has elapsed since the most recent heartbeat packet was sent, determining that the working state of the main metadata node is abnormal.
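The timeout rule of claim 5 can be sketched as a small monitor that records when the last heartbeat was sent and whether feedback has arrived. This is a minimal sketch, not the patent's implementation: the `HeartbeatMonitor` class and its method names are hypothetical, and timestamps are passed in explicitly (in seconds) so the behaviour is deterministic.

```python
class HeartbeatMonitor:
    """Hypothetical heartbeat-based failure detector."""

    def __init__(self, first_threshold):
        self.first_threshold = first_threshold  # seconds to wait for feedback
        self.last_sent = None                   # time of the latest heartbeat
        self.acked = True                       # feedback received for it?

    def send(self, now):
        # Record that a heartbeat packet was sent at time `now`.
        self.last_sent = now
        self.acked = False

    def on_feedback(self):
        # Record feedback for the latest heartbeat.
        self.acked = True

    def is_abnormal(self, now):
        # Abnormal only if a heartbeat is outstanding and the first
        # threshold time has elapsed since it was sent.
        if self.last_sent is None or self.acked:
            return False
        return now - self.last_sent >= self.first_threshold
```

A caller would invoke `send` on a timer, call `on_feedback` when a reply arrives, and trigger the re-election once `is_abnormal` returns `True`.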
6. The MooseFS high-availability method based on active-standby switching according to any one of claims 1 to 4, characterized in that the active-standby election comprises:
The first metadata node and the second metadata node simultaneously competing to create a lock node;
If the first metadata node succeeds in creating the lock node, the first metadata node wins the election and is confirmed as the main metadata node;
Otherwise, the second metadata node wins the election and is confirmed as the main metadata node.
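The election in claim 6 relies on a single property: creating the lock node succeeds for exactly one contender. The sketch below imitates that property with an in-memory registry; it is an assumption-laden illustration, not ZooKeeper client code. `LockRegistry`, `run_election`, and the path `/mfs/master-lock` are hypothetical names.

```python
import threading


class LockRegistry:
    """In-memory stand-in for a coordination service (hypothetical)."""

    def __init__(self):
        self._guard = threading.Lock()
        self._owners = {}

    def try_create(self, path, owner):
        """Create the lock node atomically; fail if it already exists."""
        with self._guard:
            if path in self._owners:
                return False
            self._owners[path] = owner
            return True

    def owner(self, path):
        with self._guard:
            return self._owners.get(path)


def run_election(registry, path, contenders):
    """Let all contenders race to create `path`; return the winner's name."""
    threads = [
        threading.Thread(target=registry.try_create, args=(path, name))
        for name in contenders
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return registry.owner(path)
```

Calling `run_election(LockRegistry(), "/mfs/master-lock", ["first", "second"])` returns whichever contender created the node first; which one wins depends on thread scheduling, just as the claim leaves the outcome of the race open.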
7. The MooseFS high-availability method based on active-standby switching according to claim 6, characterized in that the active-standby election further comprises, after the step in which the first metadata node wins the election and is confirmed as the main metadata node if it succeeds in creating the lock node:
The second metadata node setting a watch on the lock node;
If the working state of the main metadata node is abnormal, the lock node is deleted, and the second metadata node, upon receiving the notification from the watch, competes to create a new lock node;
If the second metadata node succeeds in creating the new lock node, the second metadata node wins the election and is determined to be the new main metadata node.
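Claim 7 adds a notification step: the losing node watches the lock node and competes again when it is deleted. A minimal single-process sketch of that sequence follows. The `WatchedLock` class is hypothetical, and the one-shot behaviour of the watch (each registered callback fires once) is an assumption modelled on how ZooKeeper watches are commonly described, not something the claim specifies.

```python
class WatchedLock:
    """Hypothetical lock node that notifies watchers when it is deleted."""

    def __init__(self):
        self.owner = None
        self._watchers = []

    def try_create(self, name):
        # Creation succeeds only if no lock node currently exists.
        if self.owner is not None:
            return False
        self.owner = name
        return True

    def watch(self, callback):
        # Register a one-shot callback for the next deletion.
        self._watchers.append(callback)

    def delete(self):
        # Remove the lock node, then notify and clear the watchers.
        self.owner = None
        watchers, self._watchers = self._watchers, []
        for callback in watchers:
            callback()


lock = WatchedLock()
lock.try_create("first")                      # first node becomes main
lock.watch(lambda: lock.try_create("second")) # second node sets a watch
lock.delete()                                 # main found abnormal: lock removed
# After the notification, lock.owner is "second".
```

The design choice worth noting is that the standby does not poll: it reacts to the deletion event, so the takeover delay is bounded by failure detection rather than by a polling interval.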
8. A MooseFS high-availability system based on active-standby switching, characterized in that the system comprises MooseFS, a metadata synchronization module, a fault recovery control module, and a ZooKeeper cluster; the MooseFS comprises metadata nodes and a plurality of data nodes, the metadata nodes comprising a first metadata node and a second metadata node; the metadata synchronization module, the fault recovery control module, and the data nodes are each connected to the first metadata node and the second metadata node; the ZooKeeper cluster is connected to the fault recovery control module; one of the first metadata node and the second metadata node is the main metadata node, and the other is the standby metadata node;
The main metadata node is configured to exchange data with the data nodes and to synchronize metadata to the metadata synchronization module in real time;
The standby metadata node is configured to periodically obtain the metadata from the metadata synchronization module;
The fault recovery control module is configured to monitor the working state of the main metadata node in real time, to determine whether that working state is abnormal, to control the active-standby election to be carried out again if it is abnormal, and to control the active-standby switch according to the election result;
The ZooKeeper cluster is configured to carry out the active-standby election.
9. The MooseFS high-availability system based on active-standby switching according to claim 8, characterized in that the metadata synchronization module comprises a cluster consisting of an odd number of nodes, and the metadata of the main metadata node is synchronized among the nodes of the cluster by running the Paxos algorithm.
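Claim 9 does not state why the cluster has an odd number of nodes. One common explanation, offered here as background rather than as the applicant's reasoning, is that Paxos-style protocols need a majority of nodes to agree, so adding one node to an odd-sized cluster does not increase the number of failures it can survive. The helper names below are hypothetical.

```python
def majority(n):
    """Smallest number of nodes that forms a majority of n."""
    return n // 2 + 1


def tolerated_failures(n):
    """Nodes that may fail while a majority can still be formed."""
    return n - majority(n)


for size in (3, 4, 5):
    print(size, majority(size), tolerated_failures(size))
```

A 3-node and a 4-node cluster both tolerate one failure, while a 5-node cluster tolerates two, which is why odd sizes are usually preferred.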
10. The MooseFS high-availability system based on active-standby switching according to claim 8, characterized in that the fault recovery control module monitors the working state of the main metadata node and the working state of the standby metadata node via RPC.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810111014.9A CN108173971A (en) | 2018-02-05 | 2018-02-05 | A kind of MooseFS high availability methods and system based on active-standby switch |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108173971A (en) | 2018-06-15 |
Family
ID=62512764
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810111014.9A Pending CN108173971A (en) | 2018-02-05 | 2018-02-05 | A kind of MooseFS high availability methods and system based on active-standby switch |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108173971A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109117322A (en) * | 2018-08-28 | 2019-01-01 | 郑州云海信息技术有限公司 | A kind of control method, system, equipment and the storage medium of server master-slave redundancy |
CN109286529A (en) * | 2018-10-31 | 2019-01-29 | 武汉烽火信息集成技术有限公司 | A kind of method and system for restoring RabbitMQ network partition |
CN110874292A (en) * | 2018-08-29 | 2020-03-10 | 中车株洲电力机车研究所有限公司 | Redundant display system |
CN110955382A (en) * | 2018-09-26 | 2020-04-03 | 华为技术有限公司 | Method and device for writing data in distributed system |
CN111865659A (en) * | 2020-06-10 | 2020-10-30 | 新华三信息安全技术有限公司 | Method and device for switching master controller and slave controller, controller and network equipment |
WO2021082868A1 (en) * | 2019-10-31 | 2021-05-06 | 北京金山云网络技术有限公司 | Data managmenet method for distributed storage system, apparatus, and electronic device |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102890716A (en) * | 2012-09-29 | 2013-01-23 | 南京中兴新软件有限责任公司 | Distributed file system and data backup method thereof |
CN104486447A (en) * | 2014-12-30 | 2015-04-01 | 成都因纳伟盛科技股份有限公司 | Large platform cluster system based on Big-Cluster |
CN205901808U (en) * | 2016-08-05 | 2017-01-18 | 国家电网公司 | Accomplish distributed storage system of first data nodes automatic switch -over |
Non-Patent Citations (1)
Title |
---|
WEIXIN_30680385: "https://blog.csdn.net/weixin_30680385/article/details/95142822", 《CSDN论坛》 * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109117322A (en) * | 2018-08-28 | 2019-01-01 | 郑州云海信息技术有限公司 | A kind of control method, system, equipment and the storage medium of server master-slave redundancy |
CN110874292A (en) * | 2018-08-29 | 2020-03-10 | 中车株洲电力机车研究所有限公司 | Redundant display system |
CN110955382A (en) * | 2018-09-26 | 2020-04-03 | 华为技术有限公司 | Method and device for writing data in distributed system |
CN109286529A (en) * | 2018-10-31 | 2019-01-29 | 武汉烽火信息集成技术有限公司 | A kind of method and system for restoring RabbitMQ network partition |
CN109286529B (en) * | 2018-10-31 | 2021-08-10 | 武汉烽火信息集成技术有限公司 | Method and system for recovering RabbitMQ network partition |
WO2021082868A1 (en) * | 2019-10-31 | 2021-05-06 | 北京金山云网络技术有限公司 | Data managmenet method for distributed storage system, apparatus, and electronic device |
US11966305B2 (en) | 2019-10-31 | 2024-04-23 | Beijing Kingsoft Cloud Network Technology Co., Ltd. | Data processing method for distributed storage system, apparatus, and electronic device |
CN111865659A (en) * | 2020-06-10 | 2020-10-30 | 新华三信息安全技术有限公司 | Method and device for switching master controller and slave controller, controller and network equipment |
CN111865659B (en) * | 2020-06-10 | 2023-12-29 | 新华三信息安全技术有限公司 | Main and standby controller switching method and device, controller and network equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108173971A (en) | A kind of MooseFS high availability methods and system based on active-standby switch | |
CN109729129A (en) | Configuration modification method, storage cluster and the computer system of storage cluster | |
US20120197822A1 (en) | System and method for using cluster level quorum to prevent split brain scenario in a data grid cluster | |
AU2014312103A1 (en) | Distributed file system using consensus nodes | |
CN111460039A (en) | Relational database processing system, client, server and method | |
CN105471622A (en) | High-availability method and system for main/standby control node switching based on Galera | |
CN102394914A (en) | Cluster brain-split processing method and device | |
CN110581782A (en) | Disaster tolerance data processing method, device and system | |
CN110971662A (en) | Two-node high-availability implementation method and device based on Ceph | |
CN114124650A (en) | Master-slave deployment method of SPTN (shortest Path bridging) network controller | |
CN103795572A (en) | Method for switching master server and slave server and monitoring server | |
CN110333986B (en) | Method for guaranteeing availability of redis cluster | |
CN107071189B (en) | Connection method of communication equipment physical interface | |
CN115292283A (en) | Master-slave high-availability switching method based on disk cabinet | |
CN110377487A (en) | A kind of method and device handling high-availability cluster fissure | |
CN116185697B (en) | Container cluster management method, device and system, electronic equipment and storage medium | |
CN105323271B (en) | Cloud computing system and processing method and device thereof | |
WO2002001347A2 (en) | Method and system for automatic re-assignment of software components of a failed host | |
JP5285044B2 (en) | Cluster system recovery method, server, and program | |
CN114598593B (en) | Message processing method, system, computing device and computer storage medium | |
CN115878361A (en) | Node management method and device for database cluster and electronic equipment | |
CN114363350A (en) | Service management system and method | |
CN113794765A (en) | Gate load balancing method and device based on file transmission | |
CN115145782A (en) | Server switching method, mooseFS system and storage medium | |
CN115242701B (en) | Airport data platform cluster consumption processing method, device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180615 |