CN106960060B - Database cluster management method and device - Google Patents

Info

Publication number
CN106960060B
CN106960060B CN201710228209.7A CN201710228209A CN106960060B CN 106960060 B CN106960060 B CN 106960060B CN 201710228209 A CN201710228209 A CN 201710228209A CN 106960060 B CN106960060 B CN 106960060B
Authority
CN
China
Prior art keywords
node
database cluster
cluster
information
started
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710228209.7A
Other languages
Chinese (zh)
Other versions
CN106960060A (en)
Inventor
刘先攀
于晓峰
于芝涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Hisense Media Network Technology Co Ltd
Original Assignee
Qingdao Hisense Media Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Hisense Media Network Technology Co Ltd filed Critical Qingdao Hisense Media Network Technology Co Ltd
Priority to CN201710228209.7A priority Critical patent/CN106960060B/en
Publication of CN106960060A publication Critical patent/CN106960060A/en
Application granted granted Critical
Publication of CN106960060B publication Critical patent/CN106960060B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a management method and device for a database cluster. An agent entity is arranged in each node of the database cluster, each agent entity controls the node on which it is located, and the management method is applied to the agent entities. The method comprises the following steps: when the local node is not started, acquiring node information of the second nodes, the second nodes being all nodes in the database cluster other than the local node; and, if it is determined from the node information of the second nodes that no second node meets a preset condition, starting the local node as the first node of the database cluster. In the management method provided by the embodiments of the disclosure, one agent is deployed in each node and the agent automatically determines and starts the cluster head node according to the information of each node, so no manual participation is needed and cluster start-up efficiency is greatly improved.

Description

Database cluster management method and device
Technical Field
The present disclosure relates to the field of database technologies, and in particular, to a method and an apparatus for managing a database cluster.
Background
Galera Cluster is a multi-master MySQL (relational database) cluster based on synchronous replication. It is simple to use, has no single point of failure and offers high availability, and it can protect data and scale out as the service load grows; it has been called the most advanced open-source database cluster in the world.
The architecture of Galera Cluster is shown in FIG. 1. A client can connect to any node in the cluster, and every node can serve both reads and writes. A write either succeeds on all servers or is rolled back on all of them, which keeps the data on all servers consistent, and all servers are updated synchronously in real time. When a Galera Cluster is started, one node is started as the first node of a new cluster, the other nodes then join the cluster, and each newly joined node selects a node in the cluster from which to synchronize data. Galera Cluster maintains a self-incrementing seqno (sequence number), and every node holds the same seqno. When a node in the cluster stops, its seqno is persisted to the grastate.dat file; the last node to stop also sets a flag in its grastate.dat file. The next time the cluster is started, the last stopped node is started as the first node of the new cluster, and the other nodes then join the cluster.
It should be noted that Galera Cluster itself is not highly automated. For example: when a cluster is first created, one node must be manually configured to start as the cluster head node before the other nodes join; when a node drops out of the cluster, the user is not notified; a node that has dropped out must be added back to the cluster manually; and when the whole cluster goes down, the node with the largest seqno must be found manually and started as the first node of the cluster before the other nodes join. Galera Cluster management therefore requires manual participation and cannot be automated.
Disclosure of Invention
To solve the problem in the related art that management of Galera Cluster requires manual participation and cannot be automated, the present disclosure provides a management method for a database cluster.
The present disclosure provides a management method for a database cluster, wherein an agent entity is arranged in each node of the database cluster, each agent entity controls the node on which it is located, and the management method is applied to the agent entities. The method comprises the following steps:
when the local node is not started, acquiring node information of the second nodes, the second nodes being all nodes in the database cluster other than the local node;
and if it is determined from the node information of the second nodes that no second node meets a preset condition, starting the local node as the first node of the database cluster.
The present disclosure also provides a management apparatus for a database cluster, wherein an agent entity is arranged in each node of the database cluster, each agent entity controls the node on which it is located, and the management apparatus is applied to the agent entities. The apparatus comprises:
an information acquisition module, used for acquiring node information of the second nodes when the local node is not started, the second nodes being all nodes in the database cluster other than the local node;
and a node starting module, used for starting the local node as the first node of the database cluster when it is determined from the node information of the second nodes that no second node meets the preset condition.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
the agent entity is arranged in each node, each agent entity controls the node where the agent entity is located, and the agent entity can automatically determine the cluster head node and start the cluster head node according to the information of each node without manual participation, so that the cluster starting efficiency is greatly improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is an architecture diagram of a Galera Cluster provided by the prior art;
FIG. 2 is a flow diagram illustrating a method for management of a database cluster in accordance with an exemplary embodiment;
FIG. 3 is a deployment diagram illustrating a database cluster in accordance with an exemplary embodiment;
FIG. 4 is a flow chart illustrating a method of managing a database cluster in accordance with another exemplary embodiment;
FIG. 5 is a detailed flow chart diagram illustrating a method of managing a database cluster in accordance with yet another illustrative embodiment;
FIG. 6 is a block diagram illustrating a server in accordance with an exemplary embodiment;
FIG. 7 is a block diagram illustrating a management apparatus of a database cluster in accordance with an exemplary embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
FIG. 2 is a flow chart illustrating a method of managing a database cluster in accordance with an exemplary embodiment. A database cluster is a system in which at least two database servers form a single virtual database logical image that, like a single database system, provides transparent data services to clients. Each server of the database cluster is called a node, and each node (i.e. each server of the database cluster) is provided with an agent entity (agent) that controls the node on which it is located. An agent is a software or hardware entity capable of autonomous activity, and the agent can execute the management method of the database cluster provided by the embodiments of the disclosure.
The management method of the database cluster may be applied to, and executed on, a server 500 with the structure shown in FIG. 6. For example, the database cluster may include a plurality of servers 500 with the structure shown in FIG. 6, with one agent deployed on each server 500; each server 500 is managed by the agent deployed on it. Specifically, the database cluster may be a Galera Cluster. FIG. 3 is a deployment diagram of the database cluster, in which one agent is deployed on each database node. In addition, an LVS (Linux Virtual Server) may be deployed at the upper layer of the database cluster to provide load balancing.
As shown in FIG. 2, the database cluster management method may be performed by the agent entity of a server, and may include the following steps:
Step S210: when the local node is not started, acquiring node information of the second nodes; the second nodes are all nodes in the database cluster other than the local node.
Specifically, the agent deployed on each node controls the starting and stopping of that node's server. For ease of distinction, the 5 database nodes shown in FIG. 3 are referred to as node 1, node 2, ..., node 5. Assume that the local node is node 1 in FIG. 3; the second nodes are then nodes 2, 3, 4 and 5. When the local node (node 1) is not started, the agent deployed on node 1 acquires the node information of the second nodes (nodes 2, 3, 4 and 5). The node information includes the operational state of the second node (started, starting, or not started), its sequence number, its priority, and so on.
Step S230: if it is determined from the node information of the second nodes that no second node meets a preset condition, starting the local node as the first node of the database cluster.
Specifically, when the local node (node 1) is not started, the agent deployed on node 1 determines from the node information of the second nodes (nodes 2, 3, 4 and 5) whether any second node meets the preset condition. For example, when no second node has started or is starting, the local node (node 1) is started as the first node of the database cluster.
Optionally, a second node meeting the preset condition may be any of: a second node in the started state, a second node in the starting phase, a second node whose sequence number is larger than that of the local node, or a second node whose priority is higher than that of the local node. When no second node meets the preset condition, the local node can be regarded as the node with the largest sequence number and the highest priority, so it can be determined to be the first node of the cluster and started as the first node of the database cluster. The other nodes then join the cluster, and each newly joined node selects a node in the cluster from which to synchronize data.
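The preset-condition check described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation of the disclosure: the `NodeInfo` record and its field names are hypothetical, a smaller `priority` number is assumed to mean a higher priority, and priority is treated as a tie-breaker between nodes with equal sequence numbers, which is how the detailed flow of FIG. 5 applies it.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class NodeInfo:
    """Hypothetical record of what an agent knows about one node."""
    name: str
    state: str      # "RUNNING", "STARTING_CLUSTER", "INIT" or "STOP"
    seqno: int      # last persisted sequence number
    priority: int   # assumption: smaller number = higher priority


def meets_preset_condition(local: NodeInfo, other: NodeInfo) -> bool:
    """True if `other` prevents `local` from starting as the first node."""
    if other.state in ("RUNNING", "STARTING_CLUSTER"):
        return True                      # already started, or starting
    if other.seqno > local.seqno:
        return True                      # the other node holds newer data
    # Assumption: priority only breaks ties between equal sequence numbers.
    return other.seqno == local.seqno and other.priority < local.priority


def should_start_as_first_node(local: NodeInfo, others: Iterable[NodeInfo]) -> bool:
    """Step S230: start as first node only if no second node qualifies."""
    return not any(meets_preset_condition(local, o) for o in others)
```

With three stopped nodes holding sequence numbers 10, 9 and 10, only the node with sequence number 10 and the higher priority would decide to start as the first node; the other two would wait and join later.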
It should be noted that the Galera Cluster itself is not automatic, and when the Cluster is created, a node needs to be manually configured, started as a first node, and then other nodes are added into the Cluster; when the whole cluster is down, the first node of the cluster needs to be manually searched for starting, and then other nodes are added, so that the cluster starting efficiency is low. According to the management method for the database cluster, provided by the embodiment of the disclosure, one agent is deployed in each node, and the agent automatically determines and starts the cluster head node according to the information of each node, so that manual participation is not needed, and the cluster starting efficiency is greatly improved.
FIG. 4 is a flowchart illustrating a method of managing a database cluster, according to another exemplary embodiment. As shown in FIG. 4, in addition to step S210 and step S230 of the embodiment corresponding to FIG. 2, the method may further include the following step:
Step S250: if it is determined from the node information of the second nodes that a second node in the started state exists, synchronizing the data of that second node to the local node, and setting the state of the local node to the started state.
Specifically, the address of each node may be obtained from the configuration file of each node, and all database nodes other than the local node (i.e., the second nodes) may then be traversed by address to obtain their node information. When it is determined from this information that a second node in the started state exists, for example a second node whose state is 'RUNNING', the local node is determined not to be the cluster head node. The local node is then added to the cluster, the data of the second node is synchronized to the local node, and the state of the local node is set to the started state, for example 'RUNNING'. If the local node fails to join the cluster, an alarm signal is sent.
It should be noted that, when a certain node goes down, through the steps S210 and S250, the node that goes down can be added to the cluster in time according to the node information of other nodes, and the data of other nodes is synchronized to the node that goes down, so that the node that goes down does not need to be manually started, and the node starting efficiency is improved.
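A minimal Python sketch of step S250 is given below. The function name, the `join_cluster` callback and the in-memory `alarms` list are hypothetical stand-ins for whatever the agent actually uses to join the cluster and raise alarms; trying each started node in turn is also an assumption, since the text does not say which started node is chosen.

```python
from typing import Callable, Dict, List


def join_if_cluster_running(
    other_states: Dict[str, str],
    join_cluster: Callable[[str], bool],
    alarms: List[str],
) -> str:
    """Return the local node's state after one pass of step S250.

    other_states maps a second node's address to its reported state.
    join_cluster(addr) is a hypothetical callback that joins the cluster via
    the node at addr, synchronizes data from it, and returns True on success.
    """
    running = [addr for addr, state in other_states.items() if state == "RUNNING"]
    if not running:
        return "INIT"                 # nothing to join; step S230 applies instead
    for addr in running:
        if join_cluster(addr):
            return "RUNNING"          # data synchronized, local node is started
    # Joining failed for every started node: send an alarm signal.
    alarms.append("failed to join cluster via " + ", ".join(running))
    return "INIT"
```

Passing the join operation in as a callback keeps the decision logic testable without a real database.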
Further, on the basis of the above embodiment, the method may further include the steps of:
after the node is started, monitoring the state information of the node regularly;
and when the state information is abnormal, sending out an alarm signal.
The agent of the local node monitors the node's state information at regular intervals. The monitored states and their expected values are as follows:
wsrep_ready = ON (when the value is ON, the node can accept data from other nodes)
wsrep_connected = ON (when the value is ON, the node is connected to at least one node in the cluster)
wsrep_local_state_comment = Joined, Synced or Donor (the normal states are Joining, Waiting on SST, Joined, Synced and Donor; other states are temporary or abnormal)
wsrep_cluster_status = Primary (the normal state value of a node is Primary)
wsrep_cluster_state_uuid (a state variable that should be the same on every node): the node's value must match that of at least half of the nodes in the cluster.
It should be noted that when a monitored state variable is abnormal, that is, when one of the above conditions is not satisfied (for example, wsrep_ready is OFF or wsrep_connected is OFF), the state information is considered abnormal and an alarm signal needs to be sent.
If the node's state variables wsrep_cluster_size (the number of nodes in the cluster), wsrep_cluster_conf_id (the number of cluster membership changes), or wsrep_cluster_state_uuid (a state variable that is the same on every node) are found to be inconsistent with those of half of the other nodes, the node can be considered abnormal and an alarm signal needs to be sent.
Alternatively, an abnormality is considered to have occurred, and an alarm needs to be sent, if wsrep_local_recv_queue_avg (the average length of the receive queue since the last status query), wsrep_flow_control_paused (the fraction of time since the last status query during which the node was paused by flow control), or wsrep_local_send_queue_avg (the average length of the send queue since the last status query) exceeds the threshold configured in the configuration file. The state information of all nodes in the cluster, including the main state information, can also be viewed from the command line.
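The state check above can be sketched as a small Python function. It assumes the agent has already read the status variables into a dictionary of strings (for example from the output of `SHOW GLOBAL STATUS LIKE 'wsrep_%'`); the function name and the dictionary shape are assumptions, and the values use the spellings that Galera reports, such as `Synced` and `Primary`.

```python
from typing import Dict, List

# Local states treated as healthy, following the list in the text above.
HEALTHY_LOCAL_STATES = {"Joined", "Synced", "Donor"}


def find_state_problems(status: Dict[str, str]) -> List[str]:
    """Return the names of monitored variables whose value is abnormal."""
    problems = []
    if status.get("wsrep_ready") != "ON":
        problems.append("wsrep_ready")
    if status.get("wsrep_connected") != "ON":
        problems.append("wsrep_connected")
    if status.get("wsrep_local_state_comment") not in HEALTHY_LOCAL_STATES:
        problems.append("wsrep_local_state_comment")
    if status.get("wsrep_cluster_status", "").lower() != "primary":
        problems.append("wsrep_cluster_status")
    return problems
```

An empty result means the four conditions hold; any returned name would trigger the alarm signal. A missing variable is treated as abnormal, which is a conservative choice made for this sketch.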
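One way to read the consistency rule is sketched below in Python. The interpretation is an assumption: a variable is flagged when fewer than half of the cluster's nodes, counting the local node, report the same value as the local node. The function name and the dictionary inputs are hypothetical.

```python
from typing import Dict, List, Sequence

CLUSTER_WIDE_VARS = (
    "wsrep_cluster_size",
    "wsrep_cluster_conf_id",
    "wsrep_cluster_state_uuid",
)


def vars_out_of_step(local: Dict[str, str], others: Sequence[Dict[str, str]]) -> List[str]:
    """Return cluster-wide variables on which the local node is in a minority."""
    total = len(others) + 1                      # cluster size including local node
    flagged = []
    for var in CLUSTER_WIDE_VARS:
        agreeing = 1 + sum(1 for o in others if o.get(var) == local.get(var))
        if agreeing * 2 < total:                 # fewer than half of the nodes agree
            flagged.append(var)
    return flagged
```

In a three-node cluster, a node whose wsrep_cluster_state_uuid differs from both peers would be flagged, while a node that agrees with at least one peer would not.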
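The threshold check can be sketched in Python as follows. The limit values are purely illustrative, since the text only says that thresholds come from a configuration file; the function name is hypothetical, and `wsrep_flow_control_paused` is the name Galera uses for the fraction of time a node was paused by flow control.

```python
from typing import Dict, List

# Hypothetical limits; real values would be read from the agent's configuration.
EXAMPLE_THRESHOLDS = {
    "wsrep_local_recv_queue_avg": 0.5,
    "wsrep_flow_control_paused": 0.1,
    "wsrep_local_send_queue_avg": 0.5,
}


def over_threshold(status: Dict[str, str], thresholds: Dict[str, float]) -> List[str]:
    """Return the monitored variables whose value exceeds its configured limit."""
    exceeded = []
    for name, limit in thresholds.items():
        raw = status.get(name)               # status values arrive as strings
        if raw is not None and float(raw) > limit:
            exceeded.append(name)
    return exceeded
```

A non-empty result would be treated as an abnormality and reported through the alarm signal. Variables missing from the status are skipped here rather than flagged, which is a choice made for this sketch.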
If necessary, the LVS node at the upper layer of the database cluster may query for abnormal conditions by checking the following items:
wsrep_ready = ON (when the value is ON, the node can accept data from other nodes)
wsrep_connected = ON (when the value is ON, the node is connected to at least one node in the cluster)
wsrep_local_state_comment = Joined, Synced or Donor (the normal states are Joining, Waiting on SST, Joined, Synced and Donor; other states are temporary or abnormal)
wsrep_cluster_status = Primary (the normal state value of a node is Primary)
When all four conditions are met at the same time, no abnormality is considered to exist; otherwise, an alarm can be raised.
Further, on the basis of any one of the above embodiments, the method may further include the following steps:
after the local node is started, detecting at regular intervals whether the local node is online;
and if the local node is not online, executing again the step of acquiring the node information of the second nodes.
It should be noted that, in the prior art, when a node drops out of the cluster the user is not notified, and the node must be added back to the cluster manually. The agent deployed on each node in the embodiments of the disclosure can detect at regular intervals whether its node is online. For example, the agent may check every 1 second whether the node is online; if the node is not online, steps S210 and S230, or steps S210 and S250, are repeated. When no second node meets the preset condition, the offline node can be started as the cluster head node. When a started second node exists, the offline node can rejoin the cluster and synchronize data from that second node.
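The periodic online check can be sketched as a simple polling loop in Python. The `is_online` and `recover` callbacks are hypothetical: the first stands for whatever probe the agent uses, and the second stands for re-running steps S210 and S230 or S210 and S250. The `max_checks` and `sleep` parameters exist only so the loop can be bounded and exercised without waiting.

```python
import time
from typing import Callable, Optional


def watch_node(
    is_online: Callable[[], bool],
    recover: Callable[[], None],
    interval_s: float = 1.0,
    max_checks: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll the local node and trigger recovery whenever it is offline.

    Returns the number of times recovery was triggered.
    """
    recoveries = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        if not is_online():
            recover()                 # re-acquire second-node info and restart or rejoin
            recoveries += 1
        checks += 1
        sleep(interval_s)             # 1 second in the example from the text
    return recoveries
```

With `max_checks=None` the loop runs for as long as the agent does, which matches the intent of continuous monitoring.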
FIG. 5 is a detailed flowchart illustrating a method of managing a database cluster according to yet another exemplary embodiment. The agent deployed on each node controls the node on which it is located. The node on which the current agent is located is called the local node, and the nodes on which the other agents are located are called other nodes. Once the agents on all nodes have been started, the whole cluster is considered to be started. Taking the start-up of a Galera Cluster as an example,
the specific process is as follows:
1. If the local node is already started, exit.
2. Set the local node to the 'INIT' state.
3. Acquire the address of each node from the configuration file of each node.
4. Traverse the list of cluster nodes (the nodes other than the local node).
5. Acquire the state of each other node; if a state cannot be acquired, poll repeatedly until it is known.
(a) If a node is in the started state, join the local node to the cluster (raising an alarm if joining fails); after joining succeeds, set the local node's state to 'RUNNING' and end the process.
(b) If a node is found to be starting a new cluster (state 'STARTING_CLUSTER'), continue and wait for the next traversal.
(c) If a node is in the initialization ('INIT') or stopped ('STOP') state and its seqno (sequence number) is higher than that of the local node (if the seqno cannot be acquired, poll repeatedly until it is), continue and wait for the next traversal.
(d) If a node is in the initialization ('INIT') or stopped ('STOP') state and its seqno is equal to that of the local node (if the seqno cannot be acquired, poll repeatedly until it is), compare priorities (priority decreases in the order in which the nodes are listed in the configuration file):
if the node's priority is higher than that of the local node, continue and wait for the next traversal;
if the node's priority is not higher, increment the counter count by one.
(e) Check whether count has reached its maximum value (the number of cluster nodes minus one); if so, exit the loop. At this point the local node has been compared with all other nodes.
6. Set the state of the local node to 'STARTING_CLUSTER', then start the local node as the first node of a new cluster (raising an alarm if starting fails, and retrying until it succeeds); after it succeeds, set the state to 'RUNNING' and end the process.
After the node is started, other agents control the nodes where the agents are respectively located to join the cluster, and then the database cluster can be started.
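The start-up procedure of FIG. 5 can be sketched in Python as a small state machine. This is an illustration under assumptions rather than the disclosed implementation: the `Node` record, the callbacks `poll_others`, `join`, `bootstrap` and `wait`, and the `max_rounds` bound are all hypothetical; a smaller `priority` number is assumed to mean higher priority; alarms are omitted; and started or starting nodes are checked before any seqno comparison, which is a simplification of the per-node traversal. Reaching the end of the comparison loop corresponds to the counter reaching the number of cluster nodes minus one.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Node:
    name: str
    state: str      # "INIT", "STOP", "STARTING_CLUSTER" or "RUNNING"
    seqno: int
    priority: int   # assumption: smaller number = earlier in config = higher priority


def traverse_once(local: Node, others: List[Node]) -> str:
    """One pass over steps 4-5; returns 'JOIN', 'WAIT' or 'BOOTSTRAP'."""
    if any(o.state == "RUNNING" for o in others):            # 5(a)
        return "JOIN"
    if any(o.state == "STARTING_CLUSTER" for o in others):   # 5(b)
        return "WAIT"
    for other in others:
        if other.seqno > local.seqno:                        # 5(c)
            return "WAIT"
        if other.seqno == local.seqno and other.priority < local.priority:  # 5(d)
            return "WAIT"
    return "BOOTSTRAP"                                       # 5(e): compared with all


def start_local_node(
    local: Node,
    poll_others: Callable[[], List[Node]],
    join: Callable[[], bool],
    bootstrap: Callable[[], bool],
    wait: Callable[[], None],
    max_rounds: int = 100,
) -> str:
    """Run steps 1-6 for the local node and return its final state."""
    if local.state == "RUNNING":                             # step 1
        return local.state
    local.state = "INIT"                                     # step 2
    for _ in range(max_rounds):
        if local.state != "STARTING_CLUSTER":
            decision = traverse_once(local, poll_others())
            if decision == "JOIN" and join():
                local.state = "RUNNING"
                return local.state
            if decision == "BOOTSTRAP":
                local.state = "STARTING_CLUSTER"             # step 6
        if local.state == "STARTING_CLUSTER" and bootstrap():
            local.state = "RUNNING"
            return local.state
        wait()                                               # then traverse or retry again
    return local.state
```

The text retries without a limit; the sketch bounds the number of rounds so that a caller can detect a node that never manages to start.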
For stopping a cluster, the embodiments of the present disclosure provide two ways: stopping only the agent entity, or stopping the cluster node first and then stopping the agent entity. For Galera Cluster, the first way stops only the cluster's management system Galera_agent; the other stops the cluster node before stopping the management system Galera_agent.
According to the technical scheme provided by the embodiment of the disclosure, the agent can automatically start the database cluster, alarms when a node in the cluster is offline, automatically pulls up the node, and automatically finds the node with the largest seqno and starts the whole cluster when the whole cluster is down, so that the reliability is improved, the automation is improved, and the starting efficiency is improved.
Referring to FIG. 6, FIG. 6 is a schematic diagram of a server structure provided in an embodiment of the present disclosure. The server 500 may vary considerably with configuration or performance, and may include one or more central processing units (CPUs) 522 (e.g., one or more processors), a memory 532, and one or more storage media 530 (e.g., one or more mass storage devices) storing an application program 542 or data 544. The memory 532 and the storage media 530 may be transient or persistent storage. The program stored in the storage media 530 may include one or more modules (not shown), each of which may include a series of instruction operations on the server. Further, the central processing unit 522 may be configured to communicate with the storage media 530 and to execute, on the server 500, the series of instruction operations in the storage media 530. The server 500 may further include one or more power supplies 526, one or more wired or wireless network interfaces 550, one or more input/output interfaces 558, and/or one or more operating systems. The steps performed by the server in the method embodiments described above may be based on the server structure shown in FIG. 6.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The following is an embodiment of the apparatus of the present disclosure, which may be used to execute an embodiment of a method for managing a database cluster executed by an agent entity in the server 500 of the present disclosure. For details not disclosed in the embodiments of the apparatus of the present disclosure, please refer to the embodiments of the management method of the database cluster of the present disclosure.
Fig. 7 is a block diagram illustrating a management apparatus of a database cluster, which may be used in the server 500 of the implementation environment shown in fig. 6, according to an exemplary embodiment, and performs all or part of the steps in the above method embodiments. As shown in fig. 7, the management means of the database cluster includes but is not limited to: an information acquisition module 710 and a node start module 730;
an information acquisition module 710, configured to acquire node information of the second nodes when the local node is not started; the second nodes are all nodes in the database cluster other than the local node;
and a node starting module 730, configured to start the local node as the first node of the database cluster when it is determined from the node information of the second nodes that no second node meets a preset condition.
The implementation process of the functions and actions of each module in the above device is specifically detailed in the implementation process of the corresponding step in the management method of the above database cluster, and is not described herein again.
The information acquisition module 710 may be, for example, one of the physical structures of the wired or wireless network interface 550 shown in fig. 6.
The node starting module 730 may also be a functional module, configured to execute corresponding steps in the database cluster management method. It is understood that these modules may be implemented in hardware, software, or a combination of both. When implemented in hardware, these modules may be implemented as one or more hardware modules, such as one or more application specific integrated circuits. When implemented in software, the modules may be implemented as one or more computer programs executing on one or more processors, such as programs stored in memory 532 for execution by central processor 522 of FIG. 6.
Optionally, the second node meeting the preset condition includes but is not limited to:
a second node in the started state, a second node in the starting phase, a second node whose sequence number is larger than that of the local node, and a second node whose priority is higher than that of the local node.
On the basis of the above device embodiment, the device may further include but is not limited to:
a node adding module, used for synchronizing the data of the second node to the local node and setting the state of the local node to the started state when it is determined from the node information of the second nodes that a second node in the started state exists.
On the basis of the above device embodiment, the device may further include but is not limited to:
and the information monitoring module is used for monitoring the state information of the node at regular time after the node is started, and sending an alarm signal when the state information is abnormal.
On the basis of the above device embodiment, the device may further include but is not limited to:
the online detection module is used for detecting whether the node is online or not at regular time after the node is started; and when not online, the information obtaining module 710 continues to obtain the node information of the second node.
In an exemplary embodiment, the present disclosure also provides another management apparatus for a database cluster, which may be used in the server 500 of the implementation environment shown in fig. 6 to perform all or part of the steps in the method embodiment. The device comprises:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform:
when the local node is not started, acquiring node information of a second node, the second node being any node in the database cluster other than the local node;
and if it is determined from the node information of the second node that no second node meeting the preset condition exists, starting the local node as the first node of the database cluster.
The specific manner in which the processor of the apparatus performs operations in this embodiment has been described in detail in relation to an embodiment of the method for managing a database cluster, and will not be described in detail here.
Optionally, a storage medium is also provided. The storage medium is a computer-readable storage medium, which may be, for example, a transitory or non-transitory computer-readable storage medium including instructions, such as a memory including instructions that are executable by a processor of the apparatus to perform the method for managing the database cluster.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (10)

1. A management method of a database cluster is characterized in that an agent entity is arranged in each node of the database cluster, each agent entity controls the node where the agent entity is respectively located, and the management method of the database cluster is applied to the agent entities; the method comprises the following steps:
when the node is not started, acquiring node information of a second node; the second nodes are all other nodes except the node in the database cluster;
if it is determined from the node information of the second node that no second node meeting the preset condition exists, starting the node as the first node of the database cluster;
and detecting whether each started node in the database cluster is online or not, and adding the offline node into the database cluster again.
2. The method according to claim 1, wherein the second node satisfying the preset condition comprises:
a second node in the started state, a second node in the starting phase, a second node whose sequence number is larger than that of the local node, and a second node whose priority is higher than that of the local node.
3. The method of claim 1, further comprising:
and if judging that the second node in the started state exists according to the node information of the second node, synchronizing the data of the second node to the local node, and setting the state of the local node to be the started state.
4. The method of claim 1, further comprising:
after the node is started, monitoring the state information of the node regularly;
and when the state information is abnormal, sending out an alarm signal.
5. The method of claim 1, further comprising:
after the node is started, regularly detecting whether the node is on line or not;
and if the node is not online, repeating the step of acquiring the node information of the second node.
6. A management device of a database cluster is characterized in that an agent entity is arranged in each node of the database cluster, each agent entity controls the node where the agent entity is located respectively, and the management device of the database cluster is applied to the agent entities; the device comprises:
the information acquisition module is used for acquiring the node information of the second node when the node is not started; the second nodes are all other nodes except the node in the database cluster;
the node starting module is used for starting the node as the first node of the database cluster when it is determined from the node information of the second node that no second node meeting the preset condition exists;
an online detection module for detecting whether each started node in the database cluster is online,
the node starting module is also used for rejoining the offline node to the database cluster.
7. The apparatus of claim 6, wherein the second node satisfying the preset condition comprises:
a second node in the started state, a second node in the starting phase, a second node whose sequence number is larger than that of the local node, and a second node whose priority is higher than that of the local node.
8. The apparatus of claim 6, further comprising:
and the node adding module is used for, when it is determined from the node information of the second node that a second node in the started state exists, synchronizing the data of that second node to the node and setting the state of the node to the started state.
9. The apparatus of claim 6, further comprising:
and the information monitoring module is used for monitoring the state information of the node at regular time after the node is started, and sending an alarm signal when the state information is abnormal.
10. The apparatus of claim 6, further comprising:
the online detection module is used for detecting whether the node is online or not at regular time after the node is started; and when the node is not on line, the information acquisition module continuously acquires the node information of the second node.
CN201710228209.7A 2017-04-10 2017-04-10 Database cluster management method and device Active CN106960060B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710228209.7A CN106960060B (en) 2017-04-10 2017-04-10 Database cluster management method and device

Publications (2)

Publication Number Publication Date
CN106960060A CN106960060A (en) 2017-07-18
CN106960060B true CN106960060B (en) 2020-07-31

Family

ID=59484214

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710228209.7A Active CN106960060B (en) 2017-04-10 2017-04-10 Database cluster management method and device

Country Status (1)

Country Link
CN (1) CN106960060B (en)

Also Published As

Publication number Publication date
CN106960060A (en) 2017-07-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant