CN105357042A

CN105357042A - High-availability cluster system, master node and slave node

Info

Publication number: CN105357042A
Application number: CN201510729575.1A
Authority: CN
Inventors: 李延彬
Original assignee: Inspur Beijing Electronic Information Industry Co Ltd
Current assignee: Inspur Beijing Electronic Information Industry Co Ltd
Priority date: 2015-10-30
Filing date: 2015-10-30
Publication date: 2016-02-24
Anticipated expiration: 2035-10-30
Also published as: CN105357042B

Abstract

The embodiment of the invention provides a high-availability cluster system, a master node and a slave node. The master node comprises a master resource allocation layer, a master information layer and a master resource agent layer; the slave node comprises a slave resource allocation layer, a slave information layer and a slave resource agent layer. The master node and the slave node are each divided into three layers: one layer for information interaction, one layer for allocation management of cluster resources, and one layer for starting and stopping cluster resources. Because the working mechanism of the master node and the slave node is simplified, the nodes are easier to manage and their working principle is easier to understand and learn. Whether the master node or a slave node fails, the failed layer can be identified quickly from the failure symptoms of the node, and troubleshooting can then be confined to that layer, which narrows the troubleshooting range and makes fault diagnosis more convenient.

Description

High-availability cluster system, and master node and slave node thereof

Technical field

The present invention relates to the field of cluster technology, and in particular to a high-availability cluster system and a master node and a slave node thereof.

Background art

With the wide use and continued development of enterprise information systems, users run more and more core applications, and under such distributed multi-application architectures, high-availability cluster systems are increasingly accepted and widely used. In a high-availability cluster, nodes need to exchange information with one another, a node must compile statistics on the resource utilization of each node and allocate different cluster resources to each node, and each node starts or stops the relevant cluster resources once it learns the resource allocation result. The working mechanism of each node is complex, which makes the nodes hard to manage and their working principle hard to understand and learn; when a node fails, it is also inconvenient to diagnose the fault on that node.

Summary of the invention

In view of this, embodiments of the present invention provide a high-availability cluster system and a master node and a slave node thereof, to solve the problems in the prior art that the working mechanism of each node is complex, that the nodes are hard to manage and their working principle hard to understand and learn, and that fault diagnosis is inconvenient when a node fails.

To achieve the above object, embodiments of the present invention provide the following technical solutions:

A master node for a high-availability cluster system, the high-availability cluster system comprising one master node and at least one slave node, the master node comprising: a master resource allocation layer, a master information layer and a master resource agent layer; wherein,

The master resource allocation layer is configured to collect flow information of all nodes in the high-availability cluster, obtain a resource allocation policy according to the flow information, send the resource allocation policy to the master information layer, and execute the resource allocation policy by sending a resource start instruction and/or a resource stop instruction to the master resource agent layer;

The master information layer is configured to receive information sent by all the slave nodes and to send information to all the slave nodes, wherein the information sent by a slave node comprises heartbeat information, and the information sent to all the slave nodes comprises: heartbeat information, configuration information and/or the resource allocation policy;

The master resource agent layer is configured to start the cluster resource corresponding to a resource start instruction after receiving the resource start instruction, and to stop the cluster resource corresponding to a resource stop instruction after receiving the resource stop instruction.

Wherein the master information layer comprises: a master first-in first-out (FIFO) subprocess, a master heartbeat main process, a master write subprocess and a master read subprocess; wherein,

The master FIFO subprocess is configured to receive information sent by a client and to forward the information sent by the client to the master heartbeat main process;

The master read subprocess is configured to receive information sent by a slave node and to forward the information sent by the slave node to the master heartbeat main process;

The master heartbeat main process is configured to receive the information forwarded by the master FIFO subprocess and the master read subprocess, determine the storage location of the received information, and store the received information, or send it to the master write subprocess, or send it to the corresponding client, and to send to the master write subprocess any information that needs to be sent to a slave node;

The master write subprocess is configured to receive the information sent by the master heartbeat main process and to send that information to the corresponding slave node.

Wherein information is transferred between the master FIFO subprocess and the client through a FIFO channel;

Information is transferred by inter-process communication between the master FIFO subprocess and the master heartbeat main process, between the master heartbeat main process and the master write subprocess, between the master heartbeat main process and the master read subprocess, and between the master heartbeat main process and the client;

Information is transferred through a heartbeat communication plug-in between the master write subprocess and the slave nodes, and between the master read subprocess and the slave nodes.

Wherein the master resource allocation layer comprises: a master cluster resource manager, a master local resource manager and a master cluster information base; wherein,

The master cluster resource manager is configured to collect flow information of all nodes in the high-availability cluster, obtain a resource allocation policy according to the flow information, send the resource allocation policy to the master information layer, and, after obtaining the resource allocation policy, send a master call instruction to the master local resource manager to invoke the master local resource manager;

The master local resource manager is configured to start after receiving the master call instruction and, according to the master call instruction, to send a resource start instruction and/or a resource stop instruction to the master resource agent layer;

The master cluster information base is configured to store the configuration information of the master node, wherein the configuration information of the master node is editable information.

A slave node for a high-availability cluster system, the high-availability cluster system comprising one master node and at least one slave node, each slave node comprising: a slave resource allocation layer, a slave information layer and a slave resource agent layer; wherein,

The slave resource allocation layer is configured to execute the resource allocation policy sent by the master node by sending a resource start instruction and/or a resource stop instruction to the slave resource agent layer;

The slave information layer is configured to receive information sent by the information layers of the master node and of the other slave nodes, and to send information to the information layers of the master node and of the other slave nodes, wherein the information sent by the information layers of the master node and of the other slave nodes comprises: heartbeat information, configuration information and/or the resource allocation policy, and the information sent to the information layers of the master node and of the other slave nodes comprises: heartbeat information;

The slave resource agent layer is configured to start the cluster resource corresponding to a resource start instruction after receiving the resource start instruction, and to stop the cluster resource corresponding to a resource stop instruction after receiving the resource stop instruction.

Wherein the slave information layer comprises: a slave FIFO subprocess, a slave heartbeat main process, a slave write subprocess and a slave read subprocess;

The slave FIFO subprocess is configured to receive information sent by a client and to forward the information sent by the client to the slave heartbeat main process;

The slave read subprocess is configured to receive information sent by the information layers of the master node and of the other slave nodes, and to forward that information to the slave heartbeat main process;

The slave heartbeat main process is configured to receive the information forwarded by the slave FIFO subprocess and the slave read subprocess, determine the storage location of the received information, and store the received information, or send it to the slave write subprocess, or send it to the corresponding client, and to send to the slave write subprocess any information that needs to be sent to the information layer of the master node and/or of another slave node;

The slave write subprocess is configured to receive the information sent by the slave heartbeat main process and to send that information to the master node and/or the corresponding slave node.

Wherein information is transferred between the slave FIFO subprocess and the client through a FIFO channel;

Information is transferred by inter-process communication between the slave FIFO subprocess and the slave heartbeat main process, between the slave heartbeat main process and the slave write subprocess, between the slave heartbeat main process and the slave read subprocess, and between the slave heartbeat main process and the client;

Information is transferred through a heartbeat communication plug-in between the slave write subprocess and the master node and other slave nodes, and between the slave read subprocess and the master node and other slave nodes.

Wherein the slave resource allocation layer comprises: a slave cluster resource manager, a slave local resource manager and a slave cluster information base; wherein,

The slave cluster resource manager is configured to execute the resource allocation policy after obtaining the resource allocation policy sent by the master node, and to send a slave call instruction to the slave local resource manager to invoke the slave local resource manager;

The slave local resource manager is configured to start after receiving the slave call instruction and, according to the slave call instruction, to send a resource start instruction and/or a resource stop instruction to the slave resource agent layer;

The slave cluster information base is configured to store the configuration information of the slave node itself, wherein the configuration information is read-only information.

A high-availability cluster system, comprising one master node as described above and at least one slave node as described above.

Wherein the high-availability cluster system further comprises: a standby master node and/or a standby slave node; wherein,

The standby master node comprises: a standby master resource allocation layer, a standby master information layer and a standby master resource agent layer; wherein,

When the master node fails:

The standby master resource allocation layer is configured to collect flow information of all nodes in the high-availability cluster, obtain a resource allocation policy according to the flow information, send the resource allocation policy to the standby master information layer, and execute the resource allocation policy by sending a resource start instruction and/or a resource stop instruction to the standby master resource agent layer;

The standby master information layer is configured to receive information sent by all the slave nodes and to send information to all the slave nodes, wherein the information sent by a slave node comprises heartbeat information, and the information sent to all the slave nodes comprises: heartbeat information, configuration information and/or the resource allocation policy;

The standby master resource agent layer is configured to start the cluster resource corresponding to a resource start instruction after receiving the resource start instruction, and to stop the cluster resource corresponding to a resource stop instruction after receiving the resource stop instruction;

The standby slave node comprises: a standby slave resource allocation layer, a standby slave information layer and a standby slave resource agent layer; wherein,

When a slave node in the high-availability cluster system fails:

The standby slave resource allocation layer is configured to execute the resource allocation policy sent by the master node by sending a resource start instruction and/or a resource stop instruction to the standby slave resource agent layer;

The standby slave information layer is configured to receive information sent by the information layers of the master node and of the other slave nodes, and to send information to the information layers of the master node and of the other slave nodes, wherein the information sent by the information layers of the master node and of the other slave nodes comprises: heartbeat information, configuration information and/or the resource allocation policy, and the information sent to the information layers of the master node and of the other slave nodes comprises: heartbeat information;

The standby slave resource agent layer is configured to start the cluster resource corresponding to a resource start instruction after receiving the resource start instruction, and to stop the cluster resource corresponding to a resource stop instruction after receiving the resource stop instruction.
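The text states that the standby master node takes over when the master node fails, but it does not say how the failure is detected. One plausible reading, given that heartbeat information is exchanged between information layers, is a heartbeat timeout. The following is a minimal sketch under that assumption; the class `StandbyMaster`, its methods and the timeout value are hypothetical and do not come from the patent text.

```python
class StandbyMaster:
    """Hypothetical standby master that takes over when heartbeats stop."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s    # assumed detection window, not from the patent
        self.last_heartbeat = None    # time of the last heartbeat seen from the master
        self.active = False           # True once the standby has taken over

    def on_heartbeat(self, now: float) -> None:
        # Record a heartbeat received from the master node's information layer.
        self.last_heartbeat = now

    def check(self, now: float) -> bool:
        # Take over the master role if the master has been silent for too long.
        # In this sketch the standby stays active once it has taken over.
        if self.last_heartbeat is not None and now - self.last_heartbeat > self.timeout_s:
            self.active = True
        return self.active


standby = StandbyMaster(timeout_s=3.0)
standby.on_heartbeat(now=10.0)
within_window = standby.check(now=12.0)   # 2.0 s of silence: master still considered alive
after_window = standby.check(now=13.5)    # 3.5 s of silence: standby takes over
```

Timestamps are passed in explicitly so the behaviour is deterministic; a real deployment would read a clock and would also need a rule for handing the role back, which the text does not describe.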

Based on the above technical solutions, in the high-availability cluster system and the master node and slave node thereof provided by the embodiments of the present invention, the high-availability cluster system comprises one master node and at least one slave node. The master node is divided into a master resource allocation layer, a master information layer and a master resource agent layer: the master information layer sends and receives information to exchange information with the other nodes; the master resource allocation layer collects the flow information of each node, compiles statistics on the resource utilization of each node, obtains a resource allocation policy that allocates different cluster resources to each node, and executes the resource allocation policy; and after the master resource allocation layer executes the resource allocation policy, the master resource agent layer starts or stops the corresponding cluster resources. The slave node is divided into a slave resource allocation layer, a slave information layer and a slave resource agent layer: the slave information layer sends and receives information to exchange information with the other nodes; the slave resource allocation layer executes the resource allocation policy after obtaining it from the master node; and after the slave resource allocation layer executes the resource allocation policy, the slave resource agent layer starts or stops the corresponding cluster resources. Because the master node and the slave node are each divided into three layers, one for information interaction, one for allocation management of cluster resources, and one for starting and stopping cluster resources, the working mechanism of the master node and the slave node is simplified, and the nodes are easier to manage and their working principle easier to understand and learn. Whether it is the master node or a slave node that fails, the layer in which the fault occurred can be identified quickly from the symptoms the node shows after the fault, and troubleshooting can then be confined to that layer, which narrows the scope of troubleshooting and makes fault diagnosis easier.

Brief description of the drawings

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a structural block diagram of the master node for a high-availability cluster system provided by an embodiment of the present invention;

Fig. 2 is a structural block diagram of the master information layer in the master node for a high-availability cluster system provided by an embodiment of the present invention;

Fig. 3 is a structural block diagram of the master resource allocation layer in the master node for a high-availability cluster system provided by an embodiment of the present invention;

Fig. 4 is a structural block diagram of the slave node for a high-availability cluster system provided by an embodiment of the present invention;

Fig. 5 is a structural block diagram of the slave information layer in the slave node for a high-availability cluster system provided by an embodiment of the present invention;

Fig. 6 is a structural block diagram of the slave resource allocation layer in the slave node for a high-availability cluster system provided by an embodiment of the present invention;

Fig. 7 is a system block diagram of the high-availability cluster system provided by an embodiment of the present invention;

Fig. 8 is another system block diagram of the high-availability cluster system provided by an embodiment of the present invention;

Fig. 9 is a schematic diagram of the information interaction between the master information layer and the slave information layer in the high-availability cluster system provided by an embodiment of the present invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Evidently, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

Fig. 1 is a structural block diagram of the master node for a high-availability cluster system provided by an embodiment of the present invention. The master node is divided into three layers: one layer for information interaction, one layer for allocation management of cluster resources, and one layer for starting and stopping cluster resources. This simplifies the working mechanism of the master node and makes the master node easier to manage and its working principle easier to understand and learn. When the master node fails, the layer of the master node in which the fault occurred can be identified quickly from the symptoms the master node shows after the fault, and troubleshooting can then be confined to that layer, which narrows the scope of troubleshooting and makes fault diagnosis easier. With reference to Fig. 1, the master node for a high-availability cluster system may comprise: a master resource allocation layer 110, a master information layer 120 and a master resource agent layer 130; wherein,

The master resource allocation layer 110 is used for allocation management of cluster resources. It collects the flow information of all nodes in the high-availability cluster, obtains a resource allocation policy according to the collected flow information, and sends the resource allocation policy to the master information layer 120, which sends it on to the other nodes. In addition, after obtaining the resource allocation policy, the master resource allocation layer 110 executes the resource allocation policy and, according to it, sends a resource start instruction and/or a resource stop instruction to the master resource agent layer 130.

Optionally, the master resource allocation layer 110 may comprise a master cluster resource manager, a master local resource manager and a master cluster information base. The master cluster resource manager collects the flow information of all nodes, obtains a resource allocation policy according to the flow information, sends the resource allocation policy to the master information layer 120, and, after obtaining the resource allocation policy, sends a master call instruction to the master local resource manager to invoke it. The master local resource manager sends a resource start instruction and/or a resource stop instruction to the master resource agent layer 130 according to the master call instruction sent by the master cluster resource manager. The master cluster information base stores the configuration information of the master node, wherein the configuration information of the master node is editable information.

The master information layer 120 is used for information interaction. It receives information sent by all the slave nodes and sends information to all the slave nodes, wherein the information sent by a slave node may comprise heartbeat information, and the information sent to all the slave nodes may comprise: heartbeat information, configuration information and/or the resource allocation policy obtained by the master resource allocation layer 110.

Optionally, the master information layer 120 may comprise a master FIFO subprocess, a master heartbeat main process, a master write subprocess and a master read subprocess. The master FIFO subprocess receives information sent by a client and forwards it to the master heartbeat main process. The master read subprocess receives information sent by a slave node and forwards it to the master heartbeat main process. The master heartbeat main process receives the information forwarded by the master FIFO subprocess and the master read subprocess, determines the storage location of the received information, and stores the received information, or sends it to the master write subprocess, or sends it to the corresponding client; it also sends to the master write subprocess any information that needs to be sent to a slave node. The master write subprocess receives the information sent by the master heartbeat main process and sends it to the corresponding slave node. This completes the transfer of information between nodes and between the master node and the client.

Optionally, information may be transferred between the master FIFO subprocess and the client through a FIFO channel; information may be transferred by inter-process communication between the master FIFO subprocess and the master heartbeat main process, between the master heartbeat main process and the master write subprocess, between the master heartbeat main process and the master read subprocess, and between the master heartbeat main process and the client; and information may be transferred through a heartbeat communication plug-in between the master write subprocess and the slave nodes, and between the master read subprocess and the slave nodes.

The master resource agent layer 130 is used for starting and stopping cluster resources. After receiving a resource start instruction sent by the master resource allocation layer 110, it starts the cluster resource corresponding to the resource start instruction; after receiving a resource stop instruction sent by the master resource allocation layer 110, it stops the cluster resource corresponding to the resource stop instruction.

Based on the above technical solution, the master node for a high-availability cluster system provided by the embodiment of the present invention is divided into a master resource allocation layer, a master information layer and a master resource agent layer. The master information layer sends and receives information to exchange information with the other nodes; the master resource allocation layer collects the flow information of each node, compiles statistics on the resource utilization of each node, obtains a resource allocation policy that allocates different cluster resources to each node, and executes the resource allocation policy; and after the master resource allocation layer executes the resource allocation policy, the master resource agent layer starts or stops the corresponding cluster resources. The master node thus has a three-layer structure: one layer for information interaction, one layer for allocation management of cluster resources, and one layer for starting and stopping cluster resources. This simplifies the working mechanism of the master node and makes the master node easier to manage and its working principle easier to understand and learn. When the master node fails, the layer of the master node in which the fault occurred can be identified quickly from the symptoms the master node shows after the fault, and troubleshooting can then be confined to that layer, which narrows the scope of troubleshooting and makes fault diagnosis easier.
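The division of the master node into three layers can be illustrated with a minimal sketch. The class names, the message format and, in particular, the placement rule are assumptions made for illustration: the text does not say how the resource allocation policy is derived from the flow information, so the sketch simply places each resource on the currently least-loaded node.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceAgentLayer:
    """Starts and stops cluster resources on the local node."""
    running: set = field(default_factory=set)

    def handle(self, instruction: str, resource: str) -> None:
        if instruction == "start":
            self.running.add(resource)
        elif instruction == "stop":
            self.running.discard(resource)
        else:
            raise ValueError(f"unknown instruction: {instruction}")


@dataclass
class InformationLayer:
    """Exchanges information with other nodes (here it only records what would be sent)."""
    outbox: list = field(default_factory=list)

    def broadcast(self, kind: str, payload) -> None:
        self.outbox.append((kind, payload))


@dataclass
class MasterResourceAllocationLayer:
    node_name: str
    info_layer: InformationLayer
    agent_layer: ResourceAgentLayer

    def derive_policy(self, flow_info: dict, resources: list) -> dict:
        # Hypothetical rule: put each resource on the least-loaded node,
        # counting one unit of load per placed resource (ties broken by name).
        load = dict(flow_info)
        policy = {}
        for res in resources:
            target = min(load, key=lambda n: (load[n], n))
            policy[res] = target
            load[target] += 1
        return policy

    def apply(self, flow_info: dict, resources: list) -> dict:
        policy = self.derive_policy(flow_info, resources)
        # The policy goes to the information layer, which forwards it to the other nodes.
        self.info_layer.broadcast("resource_allocation_policy", policy)
        # Executing the policy locally means sending start/stop instructions to the agent layer.
        for res, node in policy.items():
            self.agent_layer.handle("start" if node == self.node_name else "stop", res)
        return policy


info = InformationLayer()
agent = ResourceAgentLayer()
allocation = MasterResourceAllocationLayer("master", info, agent)
policy = allocation.apply({"master": 1, "slave1": 0}, ["db", "web", "vip"])
```

Because each layer only talks to its neighbour through one narrow call (`broadcast` or `handle`), a fault that shows up as, say, a resource that never starts can be traced to the agent layer without inspecting the other two, which is the diagnostic benefit the paragraph above describes.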

Optionally, Fig. 2 shows a structural block diagram of the master information layer 120 in the master node for a high-availability cluster system provided by an embodiment of the present invention. With reference to Fig. 2, the master information layer 120 may comprise: a master FIFO subprocess 121, a master read subprocess 122, a master heartbeat main process 123 and a master write subprocess 124; wherein,

The master FIFO subprocess 121 is configured to receive information sent by a client and to forward the information sent by the client to the master heartbeat main process;

The master read subprocess 122 is configured to receive information sent by a slave node and to forward the information sent by the slave node to the master heartbeat main process;

The master heartbeat main process 123 is configured to receive the information forwarded by the master FIFO subprocess and the master read subprocess, determine the storage location of the received information, and store the received information, or send it to the master write subprocess, or send it to the corresponding client, and to send to the master write subprocess any information that needs to be sent to a slave node;

The master write subprocess 124 is configured to receive the information sent by the master heartbeat main process and to send that information to the corresponding slave node.

When the master information layer 120 receives node data information sent by a slave node, the master read subprocess 122 receives the node data information and sends it to the master heartbeat main process 123. After receiving the node data information, the master heartbeat main process 123 determines its storage location: if the storage location is the master node, the node data information is stored; if the storage location is a client, the node data information is sent to that client.

When the master information layer 120 sends node data information to a slave node, the master heartbeat main process 123, after determining the storage location of the node data information, that is, after determining its storage node, sends the node data information to the master write subprocess 124. After receiving the node data information, the master write subprocess 124 sends it to the storage node for storage.

When the master information layer 120 receives client data information sent by a client, the master FIFO subprocess 121 receives the client data information and sends it to the master heartbeat main process 123. After receiving the client data information, the master heartbeat main process 123 determines the corresponding storage location: if the storage location of the client data information is the master node, the client data information is stored; if the storage location of the client data information is a slave node, the client data information is sent to the master write subprocess 124, which sends it to that slave node for storage.
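The three cases above amount to one routing decision made by the master heartbeat main process once the storage location of a piece of information is known. The sketch below shows that decision in isolation; the class name, the `location` field and the in-memory queues standing in for the write subprocess and the clients are all hypothetical.

```python
from collections import deque


class HeartbeatMainProcess:
    """Hypothetical router standing in for the master heartbeat main process."""

    def __init__(self, node_name: str):
        self.node_name = node_name
        self.local_store = []      # information whose storage location is this node
        self.to_write = deque()    # information handed to the write subprocess
        self.to_clients = {}       # client id -> information sent to that client

    def dispatch(self, info: dict) -> str:
        # Assumed field: info["location"] is ("node", name) or ("client", client_id).
        kind, name = info["location"]
        if kind == "client":
            self.to_clients.setdefault(name, []).append(info)
            return "sent_to_client"
        if kind != "node":
            raise ValueError(f"unknown storage location kind: {kind}")
        if name == self.node_name:
            self.local_store.append(info)
            return "stored"
        # Any other node is reached through the write subprocess.
        self.to_write.append(info)
        return "sent_to_write_subprocess"


main = HeartbeatMainProcess("master")
results = [
    main.dispatch({"location": ("node", "master"), "body": "a"}),
    main.dispatch({"location": ("node", "slave1"), "body": "b"}),
    main.dispatch({"location": ("client", "c1"), "body": "c"}),
]
```

The same routine serves information arriving from the read subprocess and from the FIFO subprocess, since in both cases the outcome depends only on the storage location.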

Optionally, Fig. 3 shows a structural block diagram of the master resource allocation layer 110 in the master node for a high-availability cluster system provided by an embodiment of the present invention. With reference to Fig. 3, the master resource allocation layer 110 may comprise: a master cluster resource manager 111, a master local resource manager 112 and a master cluster information base 113; wherein,

The master cluster resource manager 111 is configured to collect flow information of all nodes in the high-availability cluster, obtain a resource allocation policy according to the flow information, send the resource allocation policy to the master information layer, and, after obtaining the resource allocation policy, send a master call instruction to the master local resource manager to invoke the master local resource manager;

Optionally, the master cluster resource manager 111 may comprise: a policy engine, a transmission engine and a master execution engine. The policy engine collects the flow information of all nodes in the high-availability cluster and obtains a resource allocation policy according to the flow information. The transmission engine sends the resource allocation policy to the master information layer 120, which sends it to all the slave nodes. The master execution engine executes the resource allocation policy by sending a master call instruction to the master local resource manager 112 to invoke the master local resource manager 112.

The master local resource manager 112 is configured to start after receiving the master call instruction, and to send a resource start instruction and/or a resource stop instruction to the master resource agent layer 130 according to the master call instruction;

The master cluster information base 113 is configured to store the configuration information of the master node, that is, its own configuration information, wherein the configuration information of the master node is editable.

The configuration information of the master node is editable and can be modified, whereas the configuration information of a slave node is read-only and cannot be modified. If the configuration of a slave node needs to be changed, the configuration information of the master node must be modified first; the master information layer 120 of the master node then sends the modified configuration information to each slave node, where it replaces the existing configuration information.
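By way of example only, the relationship between the editable configuration information of the master node and the read-only configuration information of the slave nodes may be sketched in Python as follows. The class names `MasterClusterInfoBase` and `SlaveClusterInfoBase`, and the use of a dictionary snapshot as the replicated unit, are assumptions made for this sketch; the embodiment does not prescribe a storage format.

```python
from types import MappingProxyType


class MasterClusterInfoBase:
    """Editable configuration held by the master node (hypothetical name)."""

    def __init__(self, config):
        self._config = dict(config)

    def update(self, key, value):
        # Edits are only ever made on the master side.
        self._config[key] = value

    def snapshot(self):
        # Copy handed to the master information layer for distribution.
        return dict(self._config)


class SlaveClusterInfoBase:
    """Read-only configuration held by a slave node (hypothetical name)."""

    def __init__(self):
        self._config = MappingProxyType({})

    def replace(self, snapshot):
        # Only a full replacement pushed by the master is accepted.
        self._config = MappingProxyType(dict(snapshot))

    @property
    def config(self):
        # A mapping proxy rejects item assignment, so callers cannot edit it.
        return self._config
```

A change made with `update` on the master side becomes visible on a slave only after the new snapshot has been passed to `replace`; a direct attempt to assign into `config` on the slave raises `TypeError`.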

In the master node for a high-availability cluster system provided by the embodiment of the present invention, the master node is divided into three layers: one layer for information interaction, one layer for allocation management of cluster resources, and one layer for starting and stopping cluster resources. This simplifies the working mechanism of the master node and makes it easier to manage the master node and to understand its working principle. When the master node fails, the layer in which the failure occurred can be identified quickly from the failure symptoms, so that troubleshooting is performed on that layer only, which narrows the troubleshooting scope and facilitates fault inquiry.

The slave node for a high-availability cluster system provided by the embodiment of the present invention is introduced below. The slave node described below and the master node described above may cooperate with each other within the same high-availability cluster system.

Fig. 4 is a structural block diagram of the slave node for a high-availability cluster system provided by the embodiment of the present invention. The slave node is divided into three layers: one layer for information interaction, one layer for allocation management of cluster resources, and one layer for starting and stopping cluster resources. This simplifies the working mechanism of the slave node and makes it easier to manage the slave node and to understand its working principle. When the slave node fails, the layer in which the failure occurred can be identified quickly from the failure symptoms, so that troubleshooting is performed on that layer only, which narrows the troubleshooting scope and facilitates fault inquiry. With reference to Fig. 4, the slave node for a high-availability cluster system may comprise: a slave resource allocation layer 210, a slave information layer 220 and a slave resource agent layer 230; wherein,

The slave resource allocation layer 210 is used for allocation management of cluster resources: it executes the resource allocation policy sent by the master node and sends a resource start instruction and/or a resource stop instruction to the slave resource agent layer 230.

Optionally, the slave resource allocation layer 210 may comprise a slave cluster resource manager, a slave local resource manager and a slave cluster information base. After obtaining the resource allocation policy sent by the master node, the slave cluster resource manager executes the resource allocation policy and sends a slave call instruction to the slave local resource manager so as to call it. The slave local resource manager starts after receiving the slave call instruction and, according to the slave call instruction, sends a resource start instruction and/or a resource stop instruction to the slave resource agent layer. The slave cluster information base stores the configuration information of the slave node itself, and this configuration information is read-only.

The slave information layer 220 is used for information interaction: it receives information sent by the master node and by the information layers of other slave nodes, and sends information to the master node and to the information layers of other slave nodes. The information sent by the master node and by the information layers of other slave nodes may comprise heartbeat information, configuration information and/or the resource allocation policy; the information sent to the master node and to the information layers of other slave nodes may comprise heartbeat information.

Optionally, the slave information layer 220 may comprise a slave first-in first-out (FIFO) subprocess, a slave heartbeat main process, a slave write subprocess and a slave read subprocess. The slave FIFO subprocess receives the information sent by a client and passes it to the slave heartbeat main process. The slave read subprocess receives the information sent by the master node and by the information layers of other slave nodes and passes it to the slave heartbeat main process. The slave heartbeat main process receives the information passed on by the slave FIFO subprocess and the slave read subprocess, determines the storage location of the received information, and then stores the received information, passes it to the slave write subprocess, or sends it to the corresponding client; it also passes to the slave write subprocess any information that needs to be sent to the master node and/or to the information layers of other slave nodes. The slave write subprocess receives the information passed on by the slave heartbeat main process and sends it to the master node and/or the corresponding slave node. Information transmission between nodes, and between the node and the client, is thereby completed.

Optionally, information may be transmitted between the slave FIFO subprocess and the client through a first-in first-out channel. Between the slave FIFO subprocess and the slave heartbeat main process, between the slave heartbeat main process and the slave write subprocess, between the slave heartbeat main process and the slave read subprocess, and between the slave heartbeat main process and the client, information may be transmitted through inter-process communication. Between the slave write subprocess and the master node and other slave nodes, and between the slave read subprocess and the master node and other slave nodes, information is transmitted through a heartbeat communication plug-in.

The slave resource agent layer 230 is used for starting and stopping cluster resources: after receiving a resource start instruction, it starts the cluster resource corresponding to the resource start instruction; after receiving a resource stop instruction, it stops the cluster resource corresponding to the resource stop instruction.
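As a minimal sketch under stated assumptions, the behaviour of a resource agent layer on receipt of start and stop instructions may be illustrated in Python as follows. The `Instruction` enumeration, the `ResourceAgentLayer` class and the representation of a cluster resource as a plain string are hypothetical; a real agent would start or stop an actual service rather than update a set.

```python
from enum import Enum


class Instruction(Enum):
    START = "start"  # resource start instruction
    STOP = "stop"    # resource stop instruction


class ResourceAgentLayer:
    """Tracks which cluster resources are running on this node."""

    def __init__(self):
        self.running = set()

    def handle(self, instruction, resource):
        if instruction is Instruction.START:
            # Start the cluster resource named by the start instruction.
            self.running.add(resource)
        elif instruction is Instruction.STOP:
            # Stop the cluster resource named by the stop instruction.
            self.running.discard(resource)
        else:
            raise ValueError(f"unknown instruction: {instruction!r}")
        return sorted(self.running)
```

Because this layer only reacts to the two instruction types, a resource that fails to start or stop points to this layer, which is the kind of fault localisation the three-layer division is intended to support.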

Based on the above technical solution, the slave node of the high-availability cluster system provided by the embodiment of the present invention is divided into a slave resource allocation layer, a slave information layer and a slave resource agent layer. The slave information layer sends and receives information and thereby exchanges information with other nodes; the slave resource allocation layer executes the resource allocation policy after obtaining it from the master node; and, once the slave resource allocation layer has executed the resource allocation policy, the slave resource agent layer starts or stops the corresponding cluster resources. The slave node is thus divided into a three-layer structure: one layer for information interaction, one layer for allocation management of cluster resources, and one layer for starting and stopping cluster resources. This simplifies the working mechanism of the slave node and makes it easier to manage the slave node and to understand its working principle. When the slave node fails, the layer in which the failure occurred can be identified quickly from the failure symptoms, so that troubleshooting is performed on that layer only, which narrows the troubleshooting scope and facilitates fault inquiry.

Optionally, Fig. 5 shows a structural block diagram of the slave information layer 220 in the slave node for a high-availability cluster system provided by the embodiment of the present invention. With reference to Fig. 5, the slave information layer 220 may comprise: a slave first-in first-out (FIFO) subprocess 221, a slave read subprocess 222, a slave heartbeat main process 223 and a slave write subprocess 224; wherein,

The slave FIFO subprocess 221 is configured to receive the information sent by a client and to pass the information sent by the client to the slave heartbeat main process;

The slave read subprocess 222 is configured to receive the information sent by the master node and by the information layers of other slave nodes, and to pass that information to the slave heartbeat main process;

The slave heartbeat main process 223 is configured to receive the information passed on by the slave FIFO subprocess and the slave read subprocess, determine the storage location of the received information, and store the received information, or pass it to the slave write subprocess, or send it to the corresponding client; and to pass to the slave write subprocess the information that needs to be sent to the master node and/or to the information layers of other slave nodes;

The slave write subprocess 224 is configured to receive the information passed on by the slave heartbeat main process and to send that information to the master node and/or the corresponding slave node.

When the slave information layer 220 receives node data information sent by the master node or by another slave node, the slave read subprocess 222 receives the node data information and passes it to the slave heartbeat main process 223. After receiving the node data information, the slave heartbeat main process 223 determines its storage location. If the storage location is the slave node itself, the node data information is stored; if the storage location is a client, the node data information is sent to that client.

When the slave information layer 220 sends node data information to the master node or to another slave node, the slave heartbeat main process 223 first determines the storage location of the node data information, that is, the storage node, and then passes the node data information to the slave write subprocess 224. After receiving the node data information, the slave write subprocess 224 sends it to the storage node for storage.

When the slave information layer 220 receives client data information sent by a client, the slave FIFO subprocess 221 receives the client data information and passes it to the slave heartbeat main process 223. After receiving the client data information, the slave heartbeat main process 223 determines the corresponding storage location. If the storage location is the slave node itself, the client data information is stored; if the storage location is the master node or another slave node, the client data information is passed to the slave write subprocess 224, which sends it to the master node or the other slave node for storage.
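For illustration only, the three cases above (data arriving from another node, data leaving for another node, and data arriving from a client) may be sketched in Python with in-memory queues standing in for the inter-process channels. The class `SlaveInformationLayer`, the tuple layout of a message and the `target` value naming the storage location are assumptions made for this sketch; the embodiment does not define a message format.

```python
import queue


class SlaveInformationLayer:
    def __init__(self, node_name):
        self.node_name = node_name
        # Fed by the slave FIFO subprocess 221 and the slave read subprocess 222.
        self.inbox = queue.Queue()
        # Drained by the slave write subprocess 224.
        self.write_queue = queue.Queue()
        # Data handed back to clients.
        self.client_queue = queue.Queue()
        self.store = {}

    def fifo_receive(self, key, value, target):
        # Slave FIFO subprocess 221: information sent by a client.
        self.inbox.put(("client", key, value, target))

    def read_receive(self, key, value, target):
        # Slave read subprocess 222: information sent by another node.
        self.inbox.put(("node", key, value, target))

    def heartbeat_step(self):
        # Slave heartbeat main process 223: handle one message.
        source, key, value, target = self.inbox.get_nowait()
        if target == self.node_name:
            self.store[key] = value
            return "stored"
        if source == "node":
            # Node data not stored locally is delivered to the named client.
            self.client_queue.put((target, key, value))
            return "to_client"
        # Client data stored elsewhere goes out through the write subprocess.
        self.write_queue.put((target, key, value))
        return "to_write"
```

Each call to `heartbeat_step` consumes exactly one queued message, so the order of the returned labels mirrors the order in which the FIFO and read paths delivered their data.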

Optionally, Fig. 6 shows a structural block diagram of the slave resource allocation layer 210 in the slave node for a high-availability cluster system provided by the embodiment of the present invention. With reference to Fig. 6, the slave resource allocation layer 210 may comprise: a slave cluster resource manager 211, a slave local resource manager 212 and a slave cluster information base 213; wherein,

The slave cluster resource manager 211 is configured to execute the resource allocation policy after obtaining it from the master node, and to send a slave call instruction to the slave local resource manager 212 so as to call the slave local resource manager 212;

Optionally, the slave cluster resource manager 211 may comprise a slave execution engine. The slave execution engine executes the resource allocation policy and sends the slave call instruction to the slave local resource manager 212 so as to call the slave local resource manager 212.

The slave local resource manager 212 is configured to start after receiving the slave call instruction, and to send a resource start instruction and/or a resource stop instruction to the slave resource agent layer 230 according to the slave call instruction;

The slave cluster information base 213 is configured to store the configuration information of the slave node itself, wherein this configuration information is read-only.
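The cooperation between the slave cluster resource manager 211 and the slave local resource manager 212 may be sketched in Python as follows, purely as an example. The embodiment does not define the form of the resource allocation policy; this sketch assumes a hypothetical dictionary mapping each node name to the resources that should run on it, and models the call instruction as a method call. The callback `send_to_agent_layer` stands in for the path to the slave resource agent layer 230.

```python
class LocalResourceManager:
    def __init__(self, send_to_agent_layer):
        self.send = send_to_agent_layer
        self.active = set()

    def call(self, desired):
        # Invoked by the call instruction; emits start and stop instructions
        # for the difference between the desired and the active resources.
        for resource in sorted(desired - self.active):
            self.send(("start", resource))
        for resource in sorted(self.active - desired):
            self.send(("stop", resource))
        self.active = set(desired)


class SlaveClusterResourceManager:
    def __init__(self, node_name, local_resource_manager):
        self.node_name = node_name
        self.lrm = local_resource_manager

    def execute_policy(self, policy):
        # policy: hypothetical dict of node name -> iterable of resource names.
        desired = set(policy.get(self.node_name, ()))
        self.lrm.call(desired)
```

With this arrangement the cluster resource manager never addresses the agent layer directly; it only selects the part of the policy that concerns its own node and leaves the start and stop decisions to the local resource manager.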

In the slave node for a high-availability cluster system provided by the embodiment of the present invention, the slave node is divided into a three-layer structure: one layer for information interaction, one layer for allocation management of cluster resources, and one layer for starting and stopping cluster resources. This simplifies the working mechanism of the slave node and makes it easier to manage the slave node and to understand its working principle. When the slave node fails, the layer in which the failure occurred can be identified quickly from the failure symptoms, so that troubleshooting is performed on that layer only, which narrows the troubleshooting scope and facilitates fault inquiry.

The high-availability cluster system provided by the embodiment of the present invention is introduced below. The high-availability cluster system described below is based on the master node for a high-availability cluster system and the slave node for a high-availability cluster system described above.

Fig. 7 is a system block diagram of the high-availability cluster system provided by the embodiment of the present invention. With reference to Fig. 7, the high-availability cluster system may comprise one master node 100 and at least one slave node 200.

The master node 100 is the master node for a high-availability cluster system described above, and the slave node 200 is the slave node for a high-availability cluster system described above.

The master node 100 and each slave node 200 exchange information through the master information layer 120 and the slave information layer 220, and the slave nodes 200 exchange information with one another through their respective slave information layers 220.

Optionally, Fig. 8 shows another system block diagram of the high-availability cluster system provided by the embodiment of the present invention. With reference to Fig. 8, the high-availability cluster system may further comprise a standby master node 300 and/or a standby slave node 400; wherein,

The standby master node 300 comprises a standby master resource allocation layer 310, a standby master information layer 320 and a standby master resource agent layer 330.

When the master node 100 fails, the standby master resource allocation layer 310 is configured to collect the flow information of all nodes in the high-availability cluster, obtain a resource allocation policy according to the collected flow information, and send the resource allocation policy to the standby master information layer 320, which sends it to the other nodes. Meanwhile, after obtaining the resource allocation policy, the standby master resource allocation layer 310 executes the resource allocation policy and, according to it, sends a resource start instruction and/or a resource stop instruction to the standby master resource agent layer 330;

The standby master information layer 320 is configured to receive the information sent by all slave nodes and to send information to all slave nodes, wherein the information sent by the slave nodes may comprise heartbeat information, and the information sent to all slave nodes may comprise heartbeat information, configuration information and/or the resource allocation policy obtained by the standby master resource allocation layer 310;

The standby master resource agent layer 330 is configured to start the cluster resource corresponding to a resource start instruction after receiving the resource start instruction sent by the standby master resource allocation layer 310, and to stop the cluster resource corresponding to a resource stop instruction after receiving the resource stop instruction sent by the standby master resource allocation layer 310.

That is, when the master node 100 fails, the standby master node 300 takes over the work of the master node 100, so that the high-availability cluster system continues to operate normally.

The standby slave node 400 comprises: a standby slave resource allocation layer 410, a standby slave information layer 420 and a standby slave resource agent layer 430; wherein,

When a slave node in the high-availability cluster system fails, the standby slave resource allocation layer 410 is configured to execute the resource allocation policy sent by the master node and to send a resource start instruction and/or a resource stop instruction to the standby slave resource agent layer 430;

The standby slave information layer 420 is configured to receive the information sent by the master node and by the information layers of other slave nodes, and to send information to the master node and to the information layers of other slave nodes, wherein the information sent by the master node and by the information layers of other slave nodes may comprise heartbeat information, configuration information and/or the resource allocation policy, and the information sent to the master node and to the information layers of other slave nodes may comprise heartbeat information;

The standby slave resource agent layer 430 is configured to start the cluster resource corresponding to a resource start instruction after receiving the resource start instruction, and to stop the cluster resource corresponding to a resource stop instruction after receiving the resource stop instruction.

That is, when a slave node in the high-availability cluster system fails, the standby slave node 400 takes over the work of the failed slave node, so that the high-availability cluster system continues to operate normally.
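The takeover by a standby node may be illustrated by the following Python sketch. The embodiment states that nodes exchange heartbeat information but does not state how a failure is detected; the sketch therefore assumes, as one possible approach, that a node is treated as failed once no heartbeat has been seen from it for longer than a timeout. The class `FailoverMonitor`, its methods and the node names used with it are hypothetical.

```python
class FailoverMonitor:
    def __init__(self, timeout):
        self.timeout = timeout    # assumed heartbeat timeout, in seconds
        self.last_seen = {}       # active node -> time of last heartbeat
        self.standby_for = {}     # active node -> its standby node
        self.replaced = {}        # failed node -> standby that took over

    def register(self, node, standby, now):
        self.last_seen[node] = now
        self.standby_for[node] = standby

    def heartbeat(self, node, now):
        # Record heartbeat information received from a node.
        self.last_seen[node] = now

    def check(self, now):
        # Any node silent for longer than the timeout is replaced by its standby.
        for node, seen in self.last_seen.items():
            if node not in self.replaced and now - seen > self.timeout:
                self.replaced[node] = self.standby_for[node]
        return dict(self.replaced)
```

Time is passed in explicitly rather than read from a clock, which keeps the sketch deterministic; a deployment would more likely use a monotonic clock and run `check` periodically.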

Optionally, Fig. 9 shows a schematic diagram of the information interaction between the master information layer 120 and the slave information layer 220 in the high-availability cluster system provided by the embodiment of the present invention.

When a first client sends client data information to the master node 100, the client data information is received by the master FIFO subprocess 121 of the master information layer 120 of the master node 100 and then passed to the master heartbeat main process 123. After receiving the client data information, the master heartbeat main process 123 determines whether the storage location of the client data information is the master node. If so, the client data information is stored. If not, the client data information is passed to the master write subprocess 124, which sends it to the slave node that is to store it. At that slave node, the slave read subprocess 222 of the slave information layer 220 receives the client data information and passes it to the slave heartbeat main process 223. After receiving the client data information, the slave heartbeat main process 223 determines whether the storage location of the client data information is the slave node itself. If so, the client data information is stored; if not, the client data information is sent to the corresponding second client.

Information interaction between the slave information layers of the slave nodes proceeds in the same way as the information interaction between the master information layer and a slave information layer, and is not described again here.

In the high-availability cluster system provided by the embodiment of the present invention, the master node and the slave node are each divided into a three-layer structure: one layer for information interaction, one layer for allocation management of cluster resources, and one layer for starting and stopping cluster resources. This simplifies the working mechanism of the master node and the slave node and makes it easier to manage them and to understand their working principle. Whether it is the master node or a slave node that fails, the layer in which the failure occurred can be identified quickly from the failure symptoms, so that troubleshooting is performed on that layer only, which narrows the troubleshooting scope and facilitates fault inquiry.

The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to one another. Because the device disclosed in an embodiment corresponds to the method disclosed in an embodiment, its description is relatively brief; for the relevant details, refer to the description of the method.

Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or in software depends on the particular application and the design constraints of the technical solution. Those skilled in the art may implement the described functions in different ways for each particular application, but such implementations should not be considered to go beyond the scope of the present invention.

The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The above description of the disclosed embodiments enables those skilled in the art to make or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A master node for a high-availability cluster system, characterized in that the high-availability cluster system comprises one master node and at least one slave node, and the master node comprises: a master resource allocation layer, a master information layer and a master resource agent layer; wherein,

2. The master node according to claim 1, characterized in that the master information layer comprises: a master first-in first-out (FIFO) subprocess, a master heartbeat main process, a master write subprocess and a master read subprocess; wherein,

3. The master node according to claim 2, characterized in that,

information is transmitted between the master FIFO subprocess and the client through a first-in first-out channel;

4. The master node according to claim 1, characterized in that the master resource allocation layer comprises: a master cluster resource manager, a master local resource manager and a master cluster information base; wherein,

5. A slave node for a high-availability cluster system, characterized in that the high-availability cluster system comprises one master node and at least one slave node, and each slave node comprises: a slave resource allocation layer, a slave information layer and a slave resource agent layer; wherein,

6. The slave node according to claim 5, characterized in that the slave information layer comprises: a slave first-in first-out (FIFO) subprocess, a slave heartbeat main process, a slave write subprocess and a slave read subprocess;

7. The slave node according to claim 6, characterized in that,

information is transmitted between the slave FIFO subprocess and the client through a first-in first-out channel;

8. The slave node according to claim 5, characterized in that the slave resource allocation layer comprises: a slave cluster resource manager, a slave local resource manager and a slave cluster information base; wherein,

9. A high-availability cluster system, characterized by comprising one master node according to any one of claims 1-4 and at least one slave node according to any one of claims 5-8.

10. The high-availability cluster system according to claim 9, characterized by further comprising: a standby master node and/or a standby slave node; wherein,

when the master node fails;

when a slave node in the high-availability cluster system fails;