CN105117300A - Apparatus for realizing high availability of heartbeat - Google Patents
- Publication number
- CN105117300A (application number CN201510493574.1A)
- Authority
- CN
- China
- Prior art keywords
- heartbeat
- high availability
- watchdog
- module
- watchdog module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Debugging And Monitoring (AREA)
Abstract
The present invention discloses an apparatus for realizing high availability of heartbeat. The apparatus comprises a heartbeat component and a watchdog module, wherein the heartbeat component is used for performing a write operation on the watchdog module at a predetermined time interval; the watchdog module is used for triggering an operation of re-booting a system when the write operation is not performed within a preset period; and the predetermined time interval is shorter than or equal to the preset period. According to the apparatus for realizing high availability of heartbeat, provided by the present invention, when heartbeat abnormally terminates or the system fails, the watchdog module can automatically re-boot the system, release cluster resources and avoid the occurrence of data conflict, thereby improving the high-availability of heartbeat.
Description
Technical field
The present invention relates to the technical field of computer storage, and in particular to an apparatus for realizing high availability of heartbeat.
Background art
Heartbeat is a component of the Linux-HA project that implements a high-availability cluster system. The heartbeat service and cluster communication are the two key components of a high-availability cluster, and in the Heartbeat project both functions are implemented by the heartbeat module.
However, Heartbeat can only perform heartbeat monitoring and resource takeover; it cannot monitor the resources or applications it controls. To monitor whether a resource or application is running normally, a third-party plug-in such as ipfail, Mon, or Ldirector must be used. Likewise, Heartbeat cannot detect problems in the operating system itself. If Heartbeat terminates abnormally or the system fails, service may be interrupted on the one hand; on the other hand, the primary node cannot release its resources while the backup node has already taken them over, so the two nodes contend for the same resources at the same time, which leads to data conflicts.
To avoid the above technical problem, the present invention provides an apparatus for realizing high availability of heartbeat.
Summary of the invention
The object of the present invention is to provide an apparatus for realizing high availability of heartbeat, so as to solve the problem in the prior art that Heartbeat is prone to causing data conflicts.
To solve the above technical problem, the present invention provides an apparatus for realizing high availability of heartbeat, comprising a heartbeat component and a watchdog module;
wherein the heartbeat component is configured to perform a write operation on the watchdog module at a predetermined time interval;
the watchdog module is configured to trigger an operation of rebooting the system when no write operation has been performed on it within a preset period; and the predetermined time interval is shorter than or equal to the preset period.
Optionally, the apparatus further comprises:
a starting module, configured to start the watchdog module so that it enters its working state.
Optionally, the watchdog module is implemented by a hardware timer independent of the kernel.
Optionally, the hardware timer is a MAX813, 5045, or IMP813 chip.
Optionally, the watchdog module is implemented by a kernel module in combination with a timer.
Optionally, the preset period is one minute.
Optionally, the Linux kernel provides a corresponding driver for the watchdog module.
Optionally, only one driver of the watchdog module is loaded at any one time.
Optionally, the apparatus further comprises:
a prompting module, configured to prompt the user with information that the system is about to reboot.
In the apparatus for realizing high availability of heartbeat provided by the present invention, the heartbeat component performs a write operation on the watchdog module at a predetermined time interval, and the watchdog module triggers an operation of rebooting the system when no write operation has been performed on it within the preset period. It can be seen that, when Heartbeat terminates abnormally or the system fails, the watchdog module automatically reboots the system, releases cluster resources, and avoids data conflicts, thereby improving the high availability of heartbeat.
Brief description of the drawings
Fig. 1 is a schematic diagram of one embodiment of the apparatus for realizing high availability of heartbeat provided by the present invention.
Detailed description of the embodiments
The core of the present invention is to provide an apparatus for realizing high availability of heartbeat by integrating a watchdog into Heartbeat. When Heartbeat terminates abnormally or the system fails, the watchdog automatically reboots the system, thereby releasing cluster resources and avoiding data conflicts.
To help those skilled in the art better understand the solution of the present invention, the invention is described in further detail below with reference to the drawing and specific embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Fig. 1 is a schematic diagram of one embodiment of the apparatus for realizing high availability of heartbeat provided by the present invention. As shown in Fig. 1, the apparatus comprises:
a heartbeat component 1 and a watchdog module 2;
wherein the heartbeat component 1 is configured to perform a write operation on the watchdog module 2 at a predetermined time interval;
the watchdog module 2 is configured to trigger an operation of rebooting the system when no write operation has been performed on it within a preset period; and the predetermined time interval is shorter than or equal to the preset period.
In the apparatus for realizing high availability of heartbeat provided by the present invention, the heartbeat component performs a write operation on the watchdog module at a predetermined time interval, and the watchdog module triggers an operation of rebooting the system when no write operation has been performed on it within the preset period. It can be seen that, when Heartbeat terminates abnormally or the system fails, the watchdog module automatically reboots the system, releases cluster resources, and avoids data conflicts, thereby improving the high availability of heartbeat.
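The interaction between the heartbeat component and the watchdog module can be illustrated with a small simulation. The following is only a sketch, not the implementation described in the patent: the class SimulatedWatchdog, the function simulate, and the one-second time step are all assumptions introduced for illustration, and the "reboot" is modelled simply as returning the time at which the preset period is exceeded.

```python
class SimulatedWatchdog:
    """Software model of the watchdog module (hypothetical, not a device driver)."""

    def __init__(self, preset_period):
        self.preset_period = preset_period
        self.last_write = None  # None until the watchdog has been started

    def start(self, now):
        # Starting the watchdog begins the countdown.
        self.last_write = now

    def write(self, now):
        # A write operation from the heartbeat component restarts the countdown.
        self.last_write = now

    def expired(self, now):
        # True once more than preset_period has passed without a write.
        return self.last_write is not None and now - self.last_write > self.preset_period


def simulate(interval, preset_period, duration, heartbeat_dies_at=None):
    """Step a fake clock one second at a time.

    Returns the second at which the watchdog would trigger a reboot,
    or None if it never fires within `duration` seconds.
    """
    if interval > preset_period:
        raise ValueError("the write interval must not exceed the preset period")
    watchdog = SimulatedWatchdog(preset_period)
    watchdog.start(0)
    for now in range(1, duration + 1):
        alive = heartbeat_dies_at is None or now < heartbeat_dies_at
        if alive and now % interval == 0:
            watchdog.write(now)
        if watchdog.expired(now):
            return now
    return None


if __name__ == "__main__":
    print(simulate(10, 60, 300))                        # healthy heartbeat
    print(simulate(10, 60, 300, heartbeat_dies_at=25))  # heartbeat killed at t=25
```

With a healthy heartbeat the watchdog never fires; once the heartbeat stops writing, the reboot is triggered as soon as the preset period has elapsed since the last write. The ValueError reflects the constraint that the predetermined time interval must be shorter than or equal to the preset period.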
As one embodiment, the apparatus for realizing high availability of heartbeat provided by the present invention may further comprise:
a starting module, configured to start the watchdog module so that it enters its working state.
As one embodiment, in the apparatus for realizing high availability of heartbeat provided by the present invention, the watchdog module may be implemented either as a hardware circuit or as a software timer, and can automatically reset the system when the system fails.
When the watchdog module is implemented as a hardware circuit, it can be implemented independently of the kernel. Specifically, it may be one of the MAX813, 5045, or IMP813 chips. It should be noted that the watchdog module can take many forms and is not limited to these; none of them affects the implementation of the present invention.
In this embodiment, the preset period may be 1 minute; other values may of course be chosen according to the user's needs.
Under the Linux kernel, the basic working principle of the watchdog is as follows: after the watchdog is started (that is, after the /dev/watchdog device has been opened), if no write operation is performed on /dev/watchdog within a set time interval (1 minute by default), the hardware watchdog circuit or the software timer reboots the system. Here, /dev/watchdog is a character device node with major device number 10 and minor device number 130.
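The open-then-write pattern described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Heartbeat's own code: the function names is_watchdog_device and keep_alive are hypothetical, and the device path is a parameter so that the sketch can be exercised against an ordinary file. Pointing it at a real /dev/watchdog starts the timer, so the machine may reboot once the writes stop.

```python
import os
import time

# Device numbers stated in the text for /dev/watchdog.
WATCHDOG_MAJOR = 10
WATCHDOG_MINOR = 130


def is_watchdog_device(st_rdev):
    """Check a device number (e.g. os.stat(path).st_rdev) against 10:130."""
    return os.major(st_rdev) == WATCHDOG_MAJOR and os.minor(st_rdev) == WATCHDOG_MINOR


def keep_alive(path, interval_seconds, writes, sleep=time.sleep):
    """Open `path` and perform `writes` write operations, one per interval.

    Opening the device is what starts the watchdog; each write restarts
    its countdown. Returns the number of writes performed.
    """
    # buffering=0 so that every write reaches the device immediately.
    with open(path, "wb", buffering=0) as device:
        for i in range(writes):
            device.write(b"\0")
            if i + 1 < writes:
                sleep(interval_seconds)
    return writes
```

The interval passed to keep_alive should be shorter than or equal to the watchdog's preset period, which is the same constraint the apparatus places on the heartbeat component.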
The Linux kernel not only provides drivers for various types of watchdog hardware circuits, but also provides a pure-software watchdog driver based on a timer. The driver source code is located in the drivers/char/watchdog directory of the kernel source tree.
Specifically, the watchdog option can be enabled in the /etc/ha.d/ha.cf configuration file. Heartbeat then writes to the /dev/watchdog file (or device) at an interval equal to deadtime. Consequently, if anything causes Heartbeat to fail to update the watchdog device, the watchdog triggers a kernel panic as soon as the watchdog timeout period (one minute by default) expires. In this embodiment, the kernel panic is configured to reboot the system.
A hardware watchdog must be supported by a hardware circuit; the device node /dev/watchdog corresponds to a real physical device, and each type of hardware watchdog device is managed by its corresponding hardware driver. A software watchdog, by contrast, is implemented by the kernel module softdog.ko using a timer mechanism; /dev/watchdog then does not correspond to a real physical device, but merely provides applications with the same interface as that used to operate a hardware watchdog.
As one embodiment, only one watchdog driver module can be loaded at any given time to manage the /dev/watchdog device node. If the system has no hardware watchdog circuit, the software watchdog driver softdog.ko can be loaded.
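One way to check the "only one driver at a time" condition is to inspect the list of loaded kernel modules. The sketch below is an assumption-laden illustration: it parses text in the format of /proc/modules (module name as the first field), the helper names are hypothetical, and the set of driver names is only an example list (softdog comes from the text; iTCO_wdt and i6300esb are added here as sample hardware watchdog drivers). Drivers built into the kernel do not appear in /proc/modules, so this check is not exhaustive.

```python
# Example driver names only; extend this set for the hardware actually in use.
KNOWN_WATCHDOG_DRIVERS = {"softdog", "iTCO_wdt", "i6300esb"}


def loaded_watchdog_drivers(proc_modules_text, known=KNOWN_WATCHDOG_DRIVERS):
    """Return the known watchdog driver modules named in /proc/modules-style text."""
    loaded = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] in known:
            loaded.append(fields[0])
    return loaded


def choose_action(proc_modules_text):
    """Suggest what to do so that exactly one watchdog driver manages /dev/watchdog."""
    loaded = loaded_watchdog_drivers(proc_modules_text)
    if not loaded:
        # No hardware driver present: fall back to the software watchdog.
        return "load softdog"
    if len(loaded) == 1:
        return "use " + loaded[0]
    return "conflict: " + ", ".join(loaded)


if __name__ == "__main__":
    with open("/proc/modules") as modules:
        print(choose_action(modules.read()))
```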
In the apparatus for realizing high availability of heartbeat provided by the present invention, adding the directive "watchdog /dev/watchdog" to the /etc/ha.d/ha.cf configuration file automatically enables the watchdog function.
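To confirm that the directive is actually present, a configuration file can be scanned for it. The following sketch assumes the simple "directive value" line format with "#" comments; the helpers parse_ha_cf and watchdog_device are hypothetical, and the keepalive, deadtime, and node values in the sample are illustrative only and are not taken from the patent.

```python
SAMPLE_HA_CF = """
# Illustrative values only.
keepalive 2
deadtime 30
watchdog /dev/watchdog
node node1
node node2
"""


def parse_ha_cf(text):
    """Map each directive name to the list of values given for it."""
    directives = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split(None, 1)
        value = parts[1].strip() if len(parts) > 1 else ""
        directives.setdefault(parts[0], []).append(value)
    return directives


def watchdog_device(text):
    """Return the configured watchdog device path, or None if not enabled."""
    values = parse_ha_cf(text).get("watchdog")
    return values[-1] if values else None


if __name__ == "__main__":
    print(watchdog_device(SAMPLE_HA_CF))
```

A commented-out directive is treated as absent, so watchdog_device returns None for a file in which the watchdog line has been disabled.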
Consider the case in which the Heartbeat process on the primary node is closed with the command "killall -9 heartbeat". Because the Heartbeat process is killed abnormally, the resources it controls are not released. After a very short time without a response from the primary node, the backup node concludes that the primary node has failed and takes over its resources. Resource contention then occurs: both nodes hold the same resource, which causes data conflicts.
To address this situation, the apparatus for realizing high availability of heartbeat provided by the present invention uses the kernel monitoring module watchdog provided by Linux and integrates the watchdog into Heartbeat. If Heartbeat terminates abnormally or the system fails, the watchdog automatically reboots the system, thereby releasing cluster resources and avoiding data conflicts.
This embodiment may further comprise:
a prompting module, configured to prompt the user with information that the system is about to reboot.
The prompt tells the user that the system will restart immediately; the restart releases the cluster resources and thus strengthens the high availability of the system.
The apparatus for realizing high availability of heartbeat provided by the present invention can effectively monitor its own health through the watchdog mechanism. Once a node fails, a hard reset of the kernel is triggered within the specified time, so that the resources the node holds are released promptly; this also prevents cluster "split-brain" and improves the high availability of heartbeat.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and for identical or similar parts the embodiments may be referred to one another.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (9)
1. An apparatus for realizing high availability of heartbeat, characterized by comprising a heartbeat component and a watchdog module;
wherein the heartbeat component is configured to perform a write operation on the watchdog module at a predetermined time interval;
the watchdog module is configured to trigger an operation of rebooting the system when no write operation has been performed on it within a preset period; and the predetermined time interval is shorter than or equal to the preset period.
2. The apparatus for realizing high availability of heartbeat according to claim 1, characterized by further comprising:
a starting module, configured to start the watchdog module so that it enters its working state.
3. The apparatus for realizing high availability of heartbeat according to claim 2, characterized in that the watchdog module is implemented by a hardware timer independent of the kernel.
4. The apparatus for realizing high availability of heartbeat according to claim 3, characterized in that the hardware timer is a MAX813, 5045, or IMP813 chip.
5. The apparatus for realizing high availability of heartbeat according to claim 2, characterized in that the watchdog module is implemented by a kernel module in combination with a timer.
6. The apparatus for realizing high availability of heartbeat according to claim 2, characterized in that the preset period is one minute.
7. The apparatus for realizing high availability of heartbeat according to claim 6, characterized in that the Linux kernel provides a corresponding driver for the watchdog module.
8. The apparatus for realizing high availability of heartbeat according to any one of claims 1 to 7, characterized in that only one driver of the watchdog module is loaded at any one time.
9. The apparatus for realizing high availability of heartbeat according to claim 8, characterized by further comprising:
a prompting module, configured to prompt the user with information that the system is about to reboot.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510493574.1A CN105117300A (en) | 2015-08-12 | 2015-08-12 | Apparatus for realizing high availability of heartbeat |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105117300A (en) | 2015-12-02 |
Family
ID=54665300
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201510493574.1A | Pending | CN105117300A (en) | 2015-08-12 | 2015-08-12 | Apparatus for realizing high availability of heartbeat |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105117300A (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101464811A (en) * | 2008-12-29 | 2009-06-24 | 艾默生网络能源有限公司 | Multitask monitoring management system |
CN101980171A (en) * | 2010-10-08 | 2011-02-23 | 广东威创视讯科技股份有限公司 | Failure self-recovery method for software system and software watchdog system used by same |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105634813A (en) * | 2016-01-04 | 2016-06-01 | 浪潮电子信息产业股份有限公司 | Method for automatically switching nodes under dual-computer environment based on network |
CN107528724A (en) * | 2017-07-20 | 2017-12-29 | 北京奇安信科技有限公司 | A kind of optimized treatment method and device of node cluster |
CN107528724B (en) * | 2017-07-20 | 2020-09-29 | 奇安信科技集团股份有限公司 | Optimization processing method and device for node cluster |
CN107577575A (en) * | 2017-09-06 | 2018-01-12 | 长沙曙通信息科技有限公司 | A kind of disaster tolerant backup system management of monitor implementation method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20151202 |