CN105959145A

CN105959145A - Method and system for parallel management server of high availability cluster

Info

Publication number: CN105959145A
Application number: CN201610395528.2A
Authority: CN
Inventors: 沈星宇; 莫庆良; 吴崇峰; 吴健; 余世清
Original assignee: Guangdong Zhongxing Newstart Technology Co Ltd
Current assignee: Guangdong Zhongxing Newstart Technology Co Ltd
Priority date: 2016-06-04
Filing date: 2016-06-04
Publication date: 2016-09-21
Anticipated expiration: 2036-06-04
Also published as: CN105959145B

Abstract

An embodiment of the invention discloses a method and system for parallel server management suitable for a high availability cluster. The method comprises: detecting, based on high availability cluster software, whether a running server in the high availability cluster system has a fault; judging the type of the running server; when the running server is judged to be a first virtual server, switching the Linux Virtual Server (LVS) service carried on the first virtual server to a second virtual server, which then runs the LVS service; and when the running server is judged to be a first real server, switching the real service carried on the first real server to a second real server, which then runs the real service. With this embodiment, the services on every real server and virtual server are provided without interruption, meeting user needs to the greatest extent.

Description

A method and system for parallel server management suitable for a high availability cluster

Technical field

The present invention relates to the field of computer cluster technology, and in particular to a method and system for parallel server management suitable for a high availability cluster.

Background technology

A high availability cluster (HA cluster) is a cluster made up of a group of computers that, as a whole, provide users with a set of network resources. Each individual computer system is a node of the cluster. High availability clusters exist to keep the overall service of the cluster available as much as possible, thereby reducing the losses caused by the fallibility of computer hardware and software. If a node fails, its redundant node takes over its duties within a few seconds, so from the user's point of view the cluster never goes down. The main function of high availability cluster software is to automate fault detection and service switching. A high availability cluster with only two nodes is also called dual-machine hot standby: two servers back each other up, and when one server fails the other takes over its service tasks, so that the system automatically continues to provide service externally without manual intervention. Dual-machine hot standby is one kind of high availability cluster. A high availability cluster system can additionally support more than two nodes and provides more, and more advanced, functions than dual-machine hot standby, so it is better able to meet users' changing needs.

Current high availability clusters introduce the Linux Virtual Server (LVS) and use LVS+Keepalived or LVS+heartbeat to make the virtual server highly available, ensuring that the LVS service provides load balancing without interruption. In the existing architecture, however, LVS and the real servers are managed under different modes. The existing service architecture cannot handle LVS and the real servers in parallel: it lacks effective parallel management of the real servers and LVS, and it lacks any guarantee that the services on each real server and the service on LVS can be provided continuously.

Summary of the invention

The object of the invention is to overcome the deficiencies of the prior art. The invention provides a method and system for parallel server management suitable for a high availability cluster, so that the services on every real server and virtual server are provided without interruption, meeting user needs to the greatest extent.

To solve the above problem, the invention proposes a method for parallel server management suitable for a high availability cluster, comprising the following steps:

detecting, based on high availability cluster software, whether a running server in the high availability cluster system has a fault;

judging the type of the running server and, when the running server is judged to be a first virtual server, switching the Linux Virtual Server (LVS) service carried on the first virtual server to a second virtual server, which runs the LVS service, the first virtual server and the second virtual server being carried by different real servers in the high availability cluster system;

when the running server is judged to be a first real server, switching the real service carried on the first real server to a second real server, which runs the real service.

The step of switching, when the running server is judged to be a first real server, the real service carried on the first real server to a second real server, which runs the real service, comprises:

judging whether the first real server carries the first virtual server; if the first real server is judged to carry the first virtual server, switching the real service carried on the first real server to the second real server, which runs the real service, and switching the LVS service carried on the first virtual server to the second virtual server, which runs the LVS service;

if the first real server is judged not to carry the first virtual server, switching the real service carried on the first real server to the second real server, which runs the real service.

Switching the real service carried on the first real server to the second real server further comprises:

when the first real server is judged to have failed, during the real service switching process the high availability cluster software detects that the service on the first real server no longer exists and deletes the real IP route of the first real server from the LVS routing table of the virtual server; after the real service switching is complete, the real IP route of the second real server is restored in the LVS routing table of the virtual server.

The step of switching, when the running server is judged to be a first real server, the real service carried on the first real server to a second real server, which runs the real service, comprises:

judging the fault type of the first real server; if the fault type is a fault of a real service application associated with the LVS service, either switching the real service carried on the first real server to the second real server, which runs the real service, or triggering an automatic restart of the first real server;

if the fault type is a fault not associated with LVS, switching the real service carried on the first real server to the second real server, which runs the real service.

The method further comprises:

detecting, based on the high availability cluster software, whether the faulty server has recovered, the server being a virtual server or a real server;

after the faulty server is judged to have recovered, performing service failure recovery according to the configuration file in the high availability cluster software, automatically switching the previously switched LVS service back to the first virtual server or automatically switching the previously switched real service back to the recovered first real server; or

after the faulty server is judged to have recovered, performing standby registration according to the configuration file in the high availability cluster software, with the recovered first real server or first virtual server serving as a standby machine in the subsequent fault handling mechanism.

Correspondingly, the invention also proposes a system for parallel server management suitable for a high availability cluster, the system comprising:

a fault detection module, configured to detect, based on high availability cluster software, whether a running server in the high availability cluster system has a fault;

a server type judging module, configured to judge the type of the running server;

a service switching module, configured to perform service switching according to the running server type judged by the server type judging module, comprising:

a virtual service switching unit, configured to, when the running server is judged to be the first virtual server, switch the Linux Virtual Server (LVS) service carried on the first virtual server to the second virtual server, which runs the LVS service, the first virtual server and the second virtual server being carried by different real servers in the high availability cluster system;

a real service switching unit, configured to, when the running server is judged to be the first real server, switch the real service carried on the first real server to the second real server, which runs the real service.

The server type judging module is further configured to judge whether the first real server carries the first virtual server;

when the first real server is judged to carry the first virtual server, the service switching module switches the real service carried on the first real server to the second real server, which runs the real service, and switches the LVS service carried on the first virtual server to the second virtual server, which runs the LVS service; when the first real server is judged not to carry the first virtual server, the service switching module switches the real service carried on the first real server to the second real server, which runs the real service.

The system further comprises:

a cluster LVS routing module, configured to, when the first real server is judged to have failed and, during the real service switching process, the high availability cluster software detects that the service on the first real server no longer exists, delete the real IP route of the first real server from the LVS routing table of the virtual server and, after the real service switching is complete, restore the real IP route of the second real server in the LVS routing table of the virtual server.

The system further comprises:

a fault type module, configured to judge the fault type of the first real server when the running server is judged to be the first real server; when the fault type module judges that the fault type is a fault of a real service application associated with the LVS service, either the service switching module switches the real service carried on the first real server to the second real server, which runs the real service, or an automatic restart of the first real server is triggered;

when the fault type module judges that the fault type is a fault not associated with LVS, the service switching module switches the real service carried on the first real server to the second real server, which runs the real service.

The system further comprises:

a fault recovery detection module, configured to detect, based on the high availability cluster software, whether the faulty server has recovered, the server being a virtual server or a real server;

a fault recovery automatic switchback module, configured to, after the fault recovery detection module judges that the faulty server has recovered, perform service failure recovery according to the configuration file in the high availability cluster software, automatically switching the previously switched LVS service back to the first virtual server or automatically switching the previously switched real service back to the recovered first real server;

a standby registration module, configured to, after the fault recovery detection module judges that the faulty server has recovered, perform standby registration according to the configuration file in the high availability cluster software, with the recovered first real server or first virtual server serving as a standby machine in the subsequent fault handling mechanism.

In the embodiments of the invention, the states of the virtual servers and real servers in the high availability cluster system are detected, which guarantees service switching for both virtual servers and real servers when a fault occurs. In the embodiments, virtual servers are carried by real servers in the high availability cluster system; there are two or more virtual servers, distributed on different real servers. All servers in the whole high availability cluster system are monitored. After a fault is detected on a server, the type of the running server is judged first, and a primary/standby switch is then performed according to that type, ensuring that the corresponding service is not interrupted. This embodiment allows the services on every real server and virtual server to be provided without interruption, meeting user needs to the greatest extent.

Brief description of the drawings

To describe the technical solutions in the embodiments of the invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flow chart of the method for parallel server management suitable for a high availability cluster according to an embodiment of the invention;

Fig. 2 is a schematic structural diagram of the system for parallel server management suitable for a high availability cluster according to an embodiment of the invention;

Fig. 3 is another schematic structural diagram of the system for parallel server management suitable for a high availability cluster according to an embodiment of the invention;

Fig. 4 is a schematic diagram of a specific application of the high availability cluster system according to an embodiment of the invention.

Detailed description of the invention

The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the invention without creative effort fall within the protection scope of the invention.

The method for parallel server management suitable for a high availability cluster in the embodiments of the invention detects, based on high availability cluster software, whether a running server in the high availability cluster system has a fault, and judges the type of the running server. When the running server is judged to be a first virtual server, the Linux Virtual Server (LVS) service carried on the first virtual server is switched to a second virtual server, which runs the LVS service; the first virtual server and the second virtual server are carried by different real servers in the high availability cluster system. When the running server is judged to be a first real server, the real service carried on the first real server is switched to a second real server, which runs the real service.

Correspondingly, Fig. 1 shows a flow chart of the method for parallel server management suitable for a high availability cluster in an embodiment of the invention. The flow comprises the following steps:

S101: detect, based on high availability cluster software, the running servers in the high availability cluster system;

The high availability cluster system involves at least two virtual servers and two or more real servers, the number of real servers being greater than the number of virtual servers. The virtual servers are carried by different real servers: for example, virtual server 1 resides on real server 1 and virtual server 2 resides on real server 2. Virtual server 1 and virtual server 2 are in a primary/backup relationship; normally one virtual server is running, and it runs the LVS service. The real servers in the high availability cluster system may be multiple real servers that support service operation, together with one backup real server. This backup real server may be the same as the real server that carries the backup virtual server, or a different one.

S102: judge whether the running server has a fault; if it does, go to S103, otherwise continue with step S101;

Because the high availability cluster system includes both real servers and virtual servers, and either type of server may fail, it is necessary to judge whether the failed server is a virtual server or a real server.

In a specific implementation, whether a service is running normally can be determined by checking each of its resources; if any resource is abnormal, a switch is performed. Resource faults take the form of a fault of the service itself (such as the service application failing or stopping), an operating system fault, or a hardware fault, any of which prevents the service from running normally; in that case the service can only be switched or automatically restarted. For a service fault, either switching or restarting the service itself can be chosen; for a server fault (such as power loss or a server reboot), the service must be switched. In general, the fault may be a software fault or a hardware fault. If the server runs the LVS service it is a virtual server; otherwise it is a real server. Alternatively, the server type can be specified in the configuration file as virtual server, real server, or hybrid server (a virtual server and a real server sharing one machine).
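The type rule in the preceding paragraph can be sketched as follows. This is a minimal illustration only: the names `ServerType` and `classify_server`, and the string values used for the configured type, are assumptions made for this example and do not come from the patent.

```python
from enum import Enum
from typing import Optional


class ServerType(Enum):
    VIRTUAL = "virtual"
    REAL = "real"
    HYBRID = "hybrid"  # virtual server and real server share one machine


def classify_server(runs_lvs: bool, configured_type: Optional[str] = None) -> ServerType:
    """Return the server type (hypothetical helper).

    An explicit type from the configuration file wins; otherwise a server
    that runs the LVS service is a virtual server and any other server is
    a real server.
    """
    if configured_type is not None:
        return ServerType(configured_type)  # raises ValueError on unknown values
    return ServerType.VIRTUAL if runs_lvs else ServerType.REAL
```

Giving the configured value priority mirrors the text's remark that the type can also be specified in the configuration file, which is the only way to express the hybrid case.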

S103: judge the type of the running server; if it is a virtual server, go to S104; if it is a real server, go to S105;

S104: switch the LVS service carried on the first virtual server to the second virtual server, which runs the LVS service;

When the running server is judged to be the first virtual server, the Linux Virtual Server (LVS) service carried on the first virtual server is switched to the second virtual server, which runs the LVS service. The first virtual server and the second virtual server here are carried by different real servers in the high availability cluster system.

S105: switch the real service carried on the first real server to the second real server, which runs the real service.

In a specific implementation, step S105 can further judge whether the failed real server carries a virtual server. If the failed real server is judged to carry a running virtual server, the real service carried by the failed real server is switched to the backup real server, which runs the real service, and the LVS service carried on the virtual server is switched to the backup virtual server, which runs the LVS service. If the failed real server is judged not to carry a virtual server, the real service carried by the failed real server is switched to the backup real server, which runs the real service. It should be noted that the backup virtual server here may be carried by the backup real server or by a different real server.
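The decision in step S105 can be sketched as a function that lists which services must move. The data model (`RealServer`, `hosted_virtual`) and the function `plan_failover` are hypothetical names introduced for illustration; the patent does not prescribe any data structure.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class RealServer:
    name: str
    services: List[str] = field(default_factory=list)
    # Name of the virtual server carried on this machine, if any (assumption).
    hosted_virtual: Optional[str] = None


def plan_failover(failed: RealServer, backup_real: str,
                  backup_virtual: str) -> List[Tuple[str, str, str]]:
    """Return (service, source, target) moves for a failed real server."""
    # The real services always move to the backup real server.
    moves = [(svc, failed.name, backup_real) for svc in failed.services]
    # If the failed machine also carried a virtual server, LVS moves as well.
    if failed.hosted_virtual is not None:
        moves.append(("LVS", failed.hosted_virtual, backup_virtual))
    return moves
```

For example, a failed `rs1` that runs `web` and carries `vs1` yields two moves, while a real server that carries no virtual server yields only its real service moves.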

In a specific implementation, when the first real server is judged to have failed in step S105, during the real service switching process the high availability cluster software detects that the service on the first real server no longer exists and deletes the real IP route of the first real server from the LVS routing table of the virtual server. After the real service switching is complete, the real IP route of the second real server is restored in the LVS routing table of the virtual server.
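The ordering of the routing update (delete the failed route, switch, then restore a route) can be sketched with an in-memory stand-in for the LVS routing table. The class `LvsRoutingTable` and the function `switch_real_service` are assumptions for illustration; the patent does not say how the table is manipulated, and a real deployment would presumably drive an LVS administration tool such as `ipvsadm` rather than a Python dictionary.

```python
from typing import Callable, Dict, Set


class LvsRoutingTable:
    """Toy model of the virtual server's LVS routing table: VIP -> set of RIPs."""

    def __init__(self) -> None:
        self._routes: Dict[str, Set[str]] = {}

    def add_real(self, vip: str, rip: str) -> None:
        self._routes.setdefault(vip, set()).add(rip)

    def remove_real(self, vip: str, rip: str) -> None:
        self._routes.get(vip, set()).discard(rip)

    def reals(self, vip: str) -> Set[str]:
        return set(self._routes.get(vip, set()))


def switch_real_service(table: LvsRoutingTable, vip: str, failed_rip: str,
                        backup_rip: str, do_switch: Callable[[], None]) -> None:
    """Remove the failed RIP, run the service switch, then add the backup RIP."""
    table.remove_real(vip, failed_rip)   # stop scheduling requests to the failed server
    do_switch()                          # move the real service to the backup server
    table.add_real(vip, backup_rip)      # resume scheduling once the service is up
```

Removing the route before the switch means no request is scheduled to a server whose service is gone; adding the new route only afterwards means no request reaches the backup before its service is running.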

In a specific implementation, the fault type of the first real server can be further judged. If the fault type is a fault of a real service application associated with the LVS service, either the real service carried on the first real server is switched to the second real server, which runs the real service, or an automatic restart of the first real server is triggered; this restart is usually a soft restart. If the fault type is a fault not associated with LVS, the real service carried on the first real server is switched to the second real server, which runs the real service.

In a specific implementation, the failed server may recover on its own or be restored manually. Whether the faulty server has recovered is detected based on the high availability cluster software; the server here may be a virtual server or a real server.

After the faulty server is judged to have recovered, service failure recovery is performed according to the configuration file in the high availability cluster software: the previously switched LVS service is automatically switched back to the first virtual server, or the previously switched real service is automatically switched back to the recovered first real server. Alternatively, after the faulty server is judged to have recovered, standby registration is performed according to the configuration file in the high availability cluster software, and the recovered first real server or first virtual server serves as a standby machine in the subsequent fault handling mechanism.
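The two recovery paths can be sketched as a single configuration-driven choice. The function name `handle_recovery`, the boolean `auto_failback`, and the returned strings are hypothetical; the patent only states that the behaviour is taken from the configuration file of the cluster software.

```python
from typing import List


def handle_recovery(server: str, auto_failback: bool, standby_pool: List[str]) -> str:
    """Decide what happens to a recovered server (illustrative sketch).

    auto_failback=True  -> the service switched away earlier returns to it.
    auto_failback=False -> the server is registered as a standby machine.
    """
    if auto_failback:
        return f"failback:{server}"
    if server not in standby_pool:
        standby_pool.append(server)  # available for later fault handling
    return f"standby:{server}"
```

Keeping the recovered machine as a standby avoids a second service move, at the cost of leaving the service on what used to be the backup.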

Correspondingly, Fig. 2 shows a schematic structural diagram of the system for parallel server management suitable for a high availability cluster in this embodiment. The system includes:

In a specific implementation, the server type judging module is further configured to judge whether the first real server carries the first virtual server. When the first real server is judged to carry the first virtual server, the service switching module switches the real service carried on the first real server to the second real server, which runs the real service, and switches the LVS service carried on the first virtual server to the second virtual server, which runs the LVS service. When the first real server is judged not to carry the first virtual server, the service switching module switches the real service carried on the first real server to the second real server, which runs the real service.

In a specific implementation, whether a service is running normally can be determined by checking each of its resources; if any resource is abnormal, a switch is performed. Resource faults take the form of a fault of the service itself (such as the service application failing or stopping), an operating system fault, or a hardware fault, any of which prevents the service from running normally; in that case the service can only be switched or automatically restarted. For a service fault, either switching or restarting the service itself can be chosen; for a server fault (such as power loss or a server reboot), the service must be switched. In general, the fault may be a software fault or a hardware fault. If the server runs the LVS service it is a virtual server; otherwise it is a real server. Alternatively, the server type can be specified in the configuration file as virtual server, real server, or hybrid server (a virtual server and a real server sharing one machine).

Further, Fig. 3 shows another schematic structural diagram of the system for parallel server management suitable for a high availability cluster in this embodiment. The system includes:

In a specific implementation, the system further includes a cluster LVS routing module, configured to, when the first real server is judged to have failed and, during the real service switching process, the high availability cluster software detects that the service on the first real server no longer exists, delete the real IP route of the first real server from the LVS routing table of the virtual server and, after the real service switching is complete, restore the real IP route of the second real server in the LVS routing table of the virtual server.

In a specific implementation, the system further includes:

a fault type module, configured to judge the fault type of the first real server when the running server is judged to be the first real server; when the fault type module judges that the fault type is a fault of a real service application associated with the LVS service, either the service switching module switches the real service carried on the first real server to the second real server, which runs the real service, or an automatic restart of the first real server is triggered, this restart usually being a soft restart;

In a specific implementation, the system further includes:

a fault recovery automatic switchback module, configured to, after the fault recovery detection module judges that the faulty server has recovered, perform service failure recovery according to the configuration file in the high availability cluster software, automatically switching the previously switched LVS service back to the first virtual server or automatically switching the previously switched real service back to the recovered first real server;

Correspondingly, Fig. 4 shows a schematic diagram of a specific application of the high availability cluster system in an embodiment of the invention. First, at deployment time, dedicated cluster software is installed on the virtual servers and real servers, and all nodes form one cluster. Two of the servers are selected to act as virtual servers as well as real servers; the virtual servers run the LVS service and the real servers run the services the user requires.

Next, at configuration time, the backup node of the virtual servers is used as the shared backup node for the real servers. The externally provided services running on the real servers are placed under the management and control of the cluster software, and the LVS service running on the virtual servers is likewise under the management and control of the cluster software.

After configuration is complete, the cluster software system is started. With the above configuration, the LVS service and the services on the real servers all run under the monitoring of the cluster software system and provide their respective services. If the primary virtual server fails, the LVS service automatically switches to the backup virtual server; if a real server fails, the services of that real server likewise switch automatically to the corresponding backup real server. Users access the VIP (Virtual IP), and LVS distributes user requests to the real servers according to its load balancing scheduling algorithm.
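The patent does not say which load balancing scheduling algorithm is used. As one possible illustration, round-robin distribution of requests arriving at the VIP across the real servers' RIPs can be sketched as below; the function name `round_robin` and the example addresses are assumptions.

```python
from itertools import cycle
from typing import Iterator, List


def round_robin(rips: List[str]) -> Iterator[str]:
    """Yield real server addresses in turn, one per incoming request."""
    if not rips:
        raise ValueError("at least one real server is required")
    return cycle(rips)


# Each request that reaches the VIP is handed to the next RIP in the cycle.
scheduler = round_robin(["192.168.1.11", "192.168.1.12", "192.168.1.13"])
first_four = [next(scheduler) for _ in range(4)]
```

Round-robin is only the simplest choice; weighted or connection-count based scheduling would fit the same description.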

All of the above steps are completed jointly by LVS and the cluster software system. This method of managing virtual servers and real servers in parallel not only provides load balancing across the real servers but also ensures that services are provided externally without interruption, giving higher reliability for uninterrupted service.

Based on the specific embodiment in Fig. 4, the virtual servers and real servers are brought under the management of a dedicated high availability cluster system and managed as one cluster.

Two servers (a primary and a standby) are selected as virtual servers to manage and control the LVS service. These two machines are shared by the virtual server and real server roles, and the standby serves both as the standby of the virtual server and as the standby of all real servers. This is a special case of the cluster N+M mode: 1+1 mode for the virtual servers and N+1 mode for the real servers.

The floating service IP running on each real server is used as the RIP (Real IP) in the LVS configuration.

A problem on a real server in the cluster falls into one of two cases: (1) a service resource of the real server, such as the service application or the service IP (floating IP), has a problem; (2) the real server itself has a problem, such as power loss or a system reboot. In case (1), the service either switches automatically to the standby or restarts automatically; the choice is set by the user in the system configuration file. In case (2), the service switches automatically to the standby (which is shared by the real servers and the virtual server).
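The two cases map to an action as sketched below. The enum `Fault`, the function `choose_action`, and the setting name `on_resource_fault` are hypothetical; the patent says only that the user makes this choice in the system configuration file.

```python
from enum import Enum


class Fault(Enum):
    SERVICE_RESOURCE = 1  # case (1): service application or floating IP fault
    SERVER = 2            # case (2): power loss, system reboot, and similar


def choose_action(fault: Fault, on_resource_fault: str = "switch") -> str:
    """Return "switch" or "restart" for a fault on a real server."""
    if fault is Fault.SERVER:
        return "switch"  # a failed machine cannot restart the service locally
    if on_resource_fault not in ("switch", "restart"):
        raise ValueError("on_resource_fault must be 'switch' or 'restart'")
    return on_resource_fault
```

Only the resource fault case consults the user setting; a server fault always results in a switch to the standby.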

In both of the above cases, whether the service is switched or restarted, from the point of view of the LVS virtual server every real server is always running, and from the user's point of view no interruption of service is visible. This ensures that the service runs continuously, 7*24 hours.

Similarly, when the primary virtual server (master virtual server) fails, the LVS service automatically switches to the standby virtual server (backup virtual server). In a specific implementation, the two virtual servers here are shared with real servers.

After a faulty server (real server or virtual server) recovers, whether the service automatically switches back or does not switch back is set by the user in the system configuration file. If no switchback is selected, the recovered server serves as a standby.

Throughout the specific implementation, the states of the virtual servers and real servers in the high availability cluster system are detected, which guarantees service switching for both virtual servers and real servers when a fault occurs. In the embodiments of the invention, virtual servers are carried by real servers in the high availability cluster system; there are two or more virtual servers, distributed on different real servers. All servers in the whole high availability cluster system are monitored. After a fault is detected on a server, the type of the running server is judged first, and a primary/standby switch is then performed according to that type, ensuring that the corresponding service is not interrupted. This embodiment allows the services on every real server and virtual server to be provided without interruption, meeting user needs to the greatest extent.

A person of ordinary skill in the art will appreciate that all or some of the steps in the methods of the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may include read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, or the like.

The method and system for a parallel management server of a high availability cluster provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. A person of ordinary skill in the art may, following the idea of the invention, make changes to the specific implementation and the scope of application. In summary, the content of this specification should not be construed as limiting the invention.

Claims

1. the method for the concurrent management server being suitable for high availability cluster, it is characterised in that bag Include following steps:

2. the method being suitable for the concurrent management server of high availability cluster as claimed in claim 1, It is characterized in that, described when judging that described runtime server type is the first real server, then The actual services that first real server is carried is switched to the second real server, true by second Server runs described actual services and includes；

3. the method being suitable for the concurrent management server of high availability cluster as claimed in claim 1, It is characterized in that, described the actual services that first real server is carried is switched to second truly takes Business device also includes:

4. the concurrent management service being suitable for high availability cluster as described in any one of claims 1 to 3 The method of device, it is characterised in that described judging that described runtime server type is first truly to take During business device, then the actual services that the first real server is carried is switched to the second real server, Run described actual services by the second real server to include:

5. the method being suitable for the concurrent management server of high availability cluster as claimed in claim 4, It is characterized in that, described method also includes:

6. the system of the concurrent management server being suitable for high availability cluster, it is characterised in that institute The system of stating includes:

7. it is suitable for the system of the concurrent management server of high availability cluster as claimed in claim 6, It is characterized in that, described type of server judge module is additionally operable to judge that described first real server is No carry the first virtual server；

8. it is suitable for the system of the concurrent management server of high availability cluster as claimed in claim 6, It is characterized in that, described system also includes:

9. the concurrent management server being suitable for high availability cluster as described in any one of claim 6 to 8 System, it is characterised in that described system also includes:

10. it is suitable for the system of the concurrent management server of high availability cluster as claimed in claim 9, It is characterized in that, described system also includes: