CN110635950A - Double-data-center disaster recovery system - Google Patents

Info

Publication number
CN110635950A
Authority
CN
China
Prior art keywords
data center
heartbeat
data
monitoring
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910939003.4A
Other languages
Chinese (zh)
Inventor
陈辉
强春雨
薛文娟
罗文洁
颜旭乐
谭秀瑶
时琛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Power Supply Co ltd
Original Assignee
Shenzhen Power Supply Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Power Supply Co ltd
Priority to CN201910939003.4A
Publication of CN110635950A
Legal status: Pending

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0654Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0663Performing the actions predefined by failover planning, e.g. switching to standby network elements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing data switching networks
    • H04L43/10Active monitoring, e.g. heartbeat, ping or trace-route

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Cardiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention provides a double-data-center disaster recovery system, which comprises a first data center, a second data center and a centralized disaster recovery switching device, wherein the first data center is connected with the second data center through a network; the first data center and the second data center monitor the fault state of the local data center and obtain the running state of the data center of the opposite side according to the heartbeat response condition fed back by the opposite side; the centralized disaster recovery switching device compares the fault state and the operation state of each data center to form a first comparison result and a second comparison result respectively, identifies a fault data center and a normal data center in the first data center and the second data center according to the first comparison result and the second comparison result, and further enables the normal data center to take over all data services of the fault data center. The invention can automatically identify abnormal conditions of the double data centers and carry out corresponding switching operation.

Description

Double-data-center disaster recovery system
Technical Field
The invention relates to the technical field of data centers, in particular to a disaster recovery system with double data centers.
Background
The 95598 power supply service serves a vast number of households, so its service requirements are high and its social impact is considerable. The customer service center is the company's public-facing department, so the reliability of its information systems is critical, and building a service continuity guarantee system for the 95598 core service system is especially important. Such a system improves the ability of the customer service center's core service system to withstand disasters and major accidents, reduces the losses they cause, ensures the data security and operational continuity of the center's important information systems, avoids serious interruption of important social service functions, and safeguards socio-economic stability.
Service continuity is the goal of disaster recovery construction for the 95598 core service system. The customer service center can adopt a dual-active data center architecture: the two centers simultaneously accept service access from users in different regions, each center completes its service operations locally, and the data of the two centers are backed up to each other through database logical replication. When a disaster or failure occurs at one center, service continuity requires that each of the two data centers be able to provide access services to the remote users of the other center when that center fails.
Therefore, a disaster recovery system capable of automatically identifying abnormal conditions and performing corresponding switching operations for dual data centers is needed.
Disclosure of Invention
The technical problem to be solved by the embodiments of the present invention is to provide a disaster recovery system for dual data centers, which can automatically identify abnormal situations and perform corresponding switching operations for the dual data centers.
In order to solve the above technical problem, an embodiment of the present invention provides a dual data center disaster recovery system, including a first data center and a second data center that are connected to each other, and a centralized disaster recovery switching device that is connected to both the first data center and the second data center; wherein
the first data center is used for monitoring the fault state of a local data center and obtaining the running state of the second data center by receiving the heartbeat response condition fed back by the second data center after sending heartbeat request information to the second data center;
the second data center is used for monitoring the fault state of a local data center and obtaining the running state of the first data center by receiving the heartbeat response condition fed back by the first data center after sending heartbeat request information to the first data center;
the centralized disaster recovery switching device is configured to compare a fault state of the first data center with an operating state of the first data center obtained by the second data center to form a first comparison result, compare a fault state of the second data center with an operating state of the second data center obtained by the first data center to form a second comparison result, identify a fault data center and a normal data center among the first data center and the second data center according to the first comparison result and the second comparison result, and further allow the normal data center to take over all data services of the fault data center.
After the first data center and the second data center are connected, the data services they provide may be the same or different.
The first data center comprises a first local fault state monitoring module and a first peer operation state monitoring module, both of which are connected with the centralized disaster recovery switching device; the first local fault state monitoring module is used for monitoring the fault state of the first data center; the first peer operation state monitoring module is configured to obtain an operation state of the second data center by receiving a heartbeat response condition fed back by the second data center after sending heartbeat request information to the second data center;
the second data center comprises a second local fault state monitoring module and a second peer operation state monitoring module, both of which are connected with the centralized disaster recovery switching device, and the second peer operation state monitoring module also establishes a channel connection with the first peer operation state monitoring module; the second local fault state monitoring module is used for monitoring the fault state of the second data center; the second peer operation state monitoring module is configured to obtain an operation state of the first data center by receiving a heartbeat response condition fed back by the first data center after sending the heartbeat request information to the first data center.
The first local fault state monitoring module comprises a first equipment state monitoring submodule and a first environment monitoring submodule; the first equipment state monitoring submodule is used for monitoring equipment health data in the first data center to obtain an equipment health state in the first data center; the first environment monitoring submodule is used for monitoring environment data in the first data center to obtain an environment state of the first data center;
the second local fault state monitoring module comprises a second equipment state monitoring submodule and a second environment monitoring submodule; the second equipment state monitoring submodule is used for monitoring equipment health data in the second data center to obtain the equipment health state in the second data center; and the second environment monitoring submodule is used for monitoring the environment data in the second data center to obtain the environment state of the second data center.
The equipment health data of the first data center and the second data center respectively comprise an equipment current value and an equipment voltage value; the environmental data of the first data center and the second data center each include a humidity and a temperature.
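As a rough illustration of how a local fault state could be derived from these four readings, the following Python sketch checks each value against an allowed range. The range values, the key names, and the rule that any out-of-range or missing reading counts as a fault are all assumptions made for this example; the description does not specify them.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Range:
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Hypothetical limits: the description names the quantities but gives no values.
DEVICE_LIMITS = {"current_a": Range(0.5, 16.0), "voltage_v": Range(198.0, 242.0)}
ENVIRONMENT_LIMITS = {"humidity_pct": Range(20.0, 80.0), "temperature_c": Range(10.0, 35.0)}


def _state(readings: Mapping[str, float], limits: Mapping[str, Range]) -> str:
    """Return "normal" only if every expected reading is present and in range."""
    for name, limit in limits.items():
        value = readings.get(name)
        if value is None or not limit.contains(value):
            return "fault"
    return "normal"


def device_health_state(readings: Mapping[str, float]) -> str:
    return _state(readings, DEVICE_LIMITS)


def environment_state(readings: Mapping[str, float]) -> str:
    return _state(readings, ENVIRONMENT_LIMITS)


def local_fault_state(readings: Mapping[str, float]) -> str:
    """Assumed combination rule: a fault in either sub-state is a local fault."""
    states = (device_health_state(readings), environment_state(readings))
    return "fault" if "fault" in states else "normal"
```

For example, a reading set with a temperature of 48 under these assumed limits yields "fault" even when the electrical values are in range.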
The first peer operation state monitoring module comprises a first heartbeat request information sending submodule, a first heartbeat response information receiving submodule and a first heartbeat monitoring management submodule connected with the centralized disaster recovery switching device; the first heartbeat request information sending submodule is used for sending heartbeat request information to the second data center; the first heartbeat response information receiving submodule is used for receiving a heartbeat response condition fed back by the second data center; the first heartbeat monitoring management submodule is used for obtaining the running state of the second data center according to the heartbeat response condition fed back by the second data center;
the second peer operation state monitoring module comprises a second heartbeat request information sending submodule, a second heartbeat response information receiving submodule and a second heartbeat monitoring management submodule connected with the centralized disaster recovery switching device; the second heartbeat request information sending submodule is used for sending heartbeat request information to the first data center; the second heartbeat response information receiving submodule is used for receiving a heartbeat response condition fed back by the first data center; and the second heartbeat monitoring management submodule is used for obtaining the running state of the first data center according to the heartbeat response condition fed back by the first data center.
The first heartbeat monitoring management submodule comprises a first timing counting unit and a first running state monitoring management unit; the first timing counting unit is configured to start timing when the first heartbeat request information sending submodule sends heartbeat request information to the second data center, and to increment a count by 1 if the first heartbeat response information receiving submodule has not received heartbeat response information fed back by the second data center once a preset time has elapsed; or to reset the count to zero if the first heartbeat response information receiving submodule receives heartbeat response information fed back by the second data center within the preset time; the first running state monitoring management unit is used for marking the running state of the second data center as a fault if the count of the first timing counting unit is greater than a threshold value, and otherwise marking the running state of the second data center as normal;
the second heartbeat monitoring management submodule comprises a second timing counting unit and a second running state monitoring management unit; the second timing counting unit is configured to start timing when the second heartbeat request information sending submodule sends heartbeat request information to the first data center, and to increment a count by 1 if the second heartbeat response information receiving submodule has not received heartbeat response information fed back by the first data center once the preset time has elapsed; or to reset the count to zero if the second heartbeat response information receiving submodule receives heartbeat response information fed back by the first data center within the preset time; the second running state monitoring management unit is used for marking the running state of the first data center as a fault if the count of the second timing counting unit is greater than the threshold value, and otherwise marking the running state of the first data center as normal.
The first heartbeat request information sending submodule or the second heartbeat request information sending submodule sends heartbeat request information to the other party at regular intervals so as to periodically detect the heartbeat connection condition between the first data center and the second data center.
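The counting behaviour of the timing counting unit and the running state monitoring management unit can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the class name `HeartbeatMonitor`, the `probe` callable (which stands in for sending a request and waiting for the response), and the method name `tick` are all hypothetical.

```python
from typing import Callable


class HeartbeatMonitor:
    """Tracks consecutive missed heartbeats for one peer data center."""

    def __init__(self, probe: Callable[[float], bool], timeout_s: float, threshold: int):
        # probe(timeout_s) is assumed to send one heartbeat request and return
        # True if a response arrived within timeout_s seconds, else False.
        self.probe = probe
        self.timeout_s = timeout_s
        self.threshold = threshold
        self.missed = 0

    def tick(self) -> str:
        """Run one heartbeat round and return the peer's running state."""
        if self.probe(self.timeout_s):
            self.missed = 0      # response in time: reset the count
        else:
            self.missed += 1     # no response within the preset time: count + 1
        # The peer is marked faulty only when the count exceeds the threshold.
        return "fault" if self.missed > self.threshold else "normal"
```

Calling `tick()` at a fixed interval gives the periodic detection described above. With a threshold of 3, the peer is marked as faulty on the fourth consecutive miss, and a single timely response resets it to normal.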
The centralized disaster recovery switching device comprises a monitoring information receiving module, a monitoring information processing module, a fault information management module and a take-over module; wherein
the monitoring information receiving module is used for receiving the fault state of the first data center and the obtained running state of the second data center, and receiving the fault state of the second data center and the obtained running state of the first data center;
the monitoring information processing module is used for comparing the fault state of the first data center with the running state of the first data center obtained by the second data center to form a first comparison result and comparing the fault state of the second data center with the running state of the second data center obtained by the first data center to form a second comparison result according to preset fault characteristic data;
the fault information management module is used for identifying a fault data center and a normal data center in the first data center and the second data center according to the first comparison result and the second comparison result;
and the take-over module is used for generating a corresponding take-over instruction according to a preset fault logic principle, so that the normal data center takes over all data services of the fault data center.
The centralized disaster recovery switching device also comprises a correction module; wherein
the correction module is used for correcting and updating the preset fault logic principle.
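One way to picture a preset fault logic principle that can be corrected and updated is a small rule table keyed by the identified states of the two data centers. The Python sketch below is an assumption-laden illustration: the class `TakeoverRules`, the instruction strings, and the default table are invented for this example and do not come from the description.

```python
from typing import Dict, Optional, Tuple

# (state of first data center, state of second data center)
States = Tuple[str, str]


class TakeoverRules:
    """Hypothetical rule table standing in for the preset fault logic principle."""

    def __init__(self) -> None:
        # Assumed defaults: take over only when exactly one center is faulty.
        self._rules: Dict[States, Optional[str]] = {
            ("normal", "normal"): None,
            ("fault", "normal"): "DC2_TAKES_OVER_DC1",
            ("normal", "fault"): "DC1_TAKES_OVER_DC2",
            ("fault", "fault"): None,
        }

    def instruction_for(self, dc1_state: str, dc2_state: str) -> Optional[str]:
        """Take-over module: look up the instruction for the identified states."""
        return self._rules.get((dc1_state, dc2_state))

    def correct(self, dc1_state: str, dc2_state: str, instruction: Optional[str]) -> None:
        """Correction module: replace the rule for one combination of states."""
        self._rules[(dc1_state, dc2_state)] = instruction
```

Keeping the rules as data rather than hard-coded branches is what makes the correction step a simple table update in this sketch.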
The embodiment of the invention has the following beneficial effects:
according to the invention, the local fault state monitoring module and the opposite end running state monitoring module of each data center respectively monitor the fault state of the data center and the running state of the opposite data center, so that analysis data is provided for the centralized disaster recovery switching device, and the centralized disaster recovery switching device achieves the purposes of automatically identifying abnormal conditions of the double data centers and performing corresponding switching operation.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and other drawings that those skilled in the art obtain from them without inventive effort remain within the scope of the present invention.
Fig. 1 is a schematic structural diagram of a dual data center disaster recovery system according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the first data center in Fig. 1;
Fig. 3 is a schematic diagram of the second data center in Fig. 1;
Fig. 4 is a schematic structural diagram of the centralized disaster recovery switching device in Fig. 1.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings.
As shown in fig. 1, a dual data center disaster recovery system provided in an embodiment of the present invention includes a first data center 1 and a second data center 2 connected to each other, and a centralized disaster recovery switching device 3 connected to both the first data center 1 and the second data center 2; wherein
the first data center 1 is used for monitoring the fault state of the local data center and obtaining the running state of the second data center 2 by receiving the heartbeat response condition fed back by the second data center 2 after sending heartbeat request information to the second data center 2;
the second data center 2 is used for monitoring the fault state of the local data center and obtaining the running state of the first data center 1 by receiving the heartbeat response condition fed back by the first data center 1 after sending the heartbeat request information to the first data center 1;
and the centralized disaster recovery switching device 3 is configured to compare the fault state of the first data center 1 with the operating state of the first data center 1 obtained by the second data center 2 to form a first comparison result, compare the fault state of the second data center 2 with the operating state of the second data center 2 obtained by the first data center 1 to form a second comparison result, identify a faulty data center and a normal data center among the first data center 1 and the second data center 2 according to the first comparison result and the second comparison result, and further enable the normal data center to take over all data services of the faulty data center.
It should be noted that, after the first data center 1 and the second data center 2 are connected, the data services they provide may be the same or different; once either data center fails, the centralized disaster recovery switching device 3 concentrates all data services on the normal data center, which keeps all data services running normally and achieves the disaster recovery effect.
In the embodiment of the present invention, as shown in fig. 2, the first data center 1 includes a first local fault state monitoring module 11 and a first peer operation state monitoring module 12, both of which are connected to the centralized disaster recovery switching device 3; the first local fault state monitoring module 11 is configured to monitor a fault state of the first data center 1; the first peer operation state monitoring module 12 is configured to obtain an operation state of the second data center 2 by receiving a heartbeat response condition fed back by the second data center 2 after sending the heartbeat request information to the second data center 2;
the first local fault state monitoring module 11 includes a first device state monitoring submodule 111 and a first environment monitoring submodule 112; the first equipment state monitoring submodule 111 is configured to monitor equipment health data in the first data center 1 to obtain an equipment health state in the first data center 1; the first environment monitoring submodule 112 is configured to monitor environment data in the first data center 1 to obtain an environment state of the first data center 1; wherein the device health data comprises a device current value and a device voltage value; environmental data includes humidity and temperature;
the first peer operation state monitoring module 12 includes a first heartbeat request information sending submodule 121, a first heartbeat response information receiving submodule 122, and a first heartbeat monitoring management submodule 123 connected to the centralized disaster recovery switching device 3; the first heartbeat request information sending submodule 121 is configured to send heartbeat request information to the second data center 2; the first heartbeat response information receiving submodule 122 is configured to receive a heartbeat response condition fed back by the second data center 2; the first heartbeat monitoring management submodule 123 is configured to obtain an operating state of the second data center 2 according to a heartbeat response condition fed back by the second data center 2;
the first heartbeat monitoring management submodule 123 includes a first timing counting unit 1231 and a first operation state monitoring management unit 1232; the first timing counting unit 1231 is configured to start timing when the first heartbeat request information sending submodule 121 sends heartbeat request information to the second data center 2, and to increment a count by 1 if the first heartbeat response information receiving submodule 122 has not received heartbeat response information fed back by the second data center 2 once a preset time (for example, 10 s) has elapsed; or to clear the count if the first heartbeat response information receiving submodule 122 receives heartbeat response information fed back by the second data center 2 within the preset time (for example, 10 s); the first operation state monitoring management unit 1232 is configured to mark the operation state of the second data center 2 as a fault if the count of the first timing counting unit 1231 is greater than a threshold (e.g., 3); otherwise, the operation state of the second data center 2 is marked as normal.
In the embodiment of the present invention, as shown in fig. 3, the second data center 2 includes a second local fault state monitoring module 21 and a second peer operation state monitoring module 22, both of which are connected to the centralized disaster recovery switching device 3, and the second peer operation state monitoring module 22 further establishes a channel connection with the first peer operation state monitoring module 12; the second local fault state monitoring module 21 is configured to monitor a fault state of the second data center 2; the second peer operation state monitoring module 22 is configured to obtain an operating state of the first data center 1 by receiving a heartbeat response condition fed back by the first data center 1 after sending the heartbeat request information to the first data center 1;
the second local fault status monitoring module 21 includes a second device status monitoring submodule 211 and a second environment monitoring submodule 212; the second equipment state monitoring submodule 211 is configured to monitor the equipment health data in the second data center 2 to obtain the equipment health state in the second data center 2; the second environment monitoring submodule 212 is configured to monitor environment data in the second data center 2 to obtain an environment state of the second data center 2; the equipment health data also comprises an equipment current value and an equipment voltage value; environmental data also includes humidity and temperature;
the second peer operation state monitoring module 22 includes a second heartbeat request information sending submodule 221, a second heartbeat response information receiving submodule 222, and a second heartbeat monitoring management submodule 223 connected to the centralized disaster recovery switching device 3; the second heartbeat request information sending submodule 221 is configured to send heartbeat request information to the first data center 1; the second heartbeat response information receiving submodule 222 is configured to receive a heartbeat response condition fed back by the first data center 1; the second heartbeat monitoring management submodule 223 is configured to obtain an operating state of the first data center 1 according to a heartbeat response condition fed back by the first data center 1;
the second heartbeat monitoring management submodule 223 includes a second timing counting unit 2231 and a second operation state monitoring management unit 2232; the second timing counting unit 2231 is configured to start timing when the second heartbeat request information sending submodule 221 sends the heartbeat request information to the first data center 1, and to increment a count by 1 if the second heartbeat response information receiving submodule 222 has not received heartbeat response information fed back by the first data center 1 once a preset time (for example, 10 s) has elapsed; or to clear the count if the second heartbeat response information receiving submodule 222 receives heartbeat response information fed back by the first data center 1 within the preset time (for example, 10 s); the second operation state monitoring management unit 2232 is configured to mark the operation state of the first data center 1 as a fault if the count of the second timing counting unit 2231 is greater than a threshold (e.g., 3); otherwise, the operation state of the first data center 1 is marked as normal.
It should be noted that the first heartbeat request information sending sub-module 121 or the second heartbeat request information sending sub-module 221 sends heartbeat request information to the other party at regular intervals to periodically detect the heartbeat connection between the first data center 1 and the second data center 2, that is, periodically and automatically identify the abnormal condition of the dual data centers.
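To make the example values concrete, the short Python sketch below estimates how long a silent peer goes undetected. It assumes that requests are sent strictly every `interval_s` seconds regardless of outstanding requests, that each miss is registered `timeout_s` seconds after its request, and that the peer stays silent throughout. The function name and the 30 s interval used in the usage note are assumptions, since the description does not state the sending interval.

```python
def detection_delay_s(interval_s: float, timeout_s: float, threshold: int) -> float:
    """Seconds from the first unanswered request until the peer is marked faulty."""
    # The count must be strictly greater than the threshold, so one more miss
    # than the threshold value is needed.
    misses_needed = threshold + 1
    # The last required request is sent (misses_needed - 1) intervals after the
    # first one, and its miss is registered once the timeout has elapsed.
    return (misses_needed - 1) * interval_s + timeout_s
```

With the example timeout of 10 s, the example threshold of 3, and an assumed interval of 30 s, `detection_delay_s(30.0, 10.0, 3)` evaluates to 3 × 30 + 10 = 100 seconds. Lowering the threshold or the interval shortens detection at the cost of more false alarms on a briefly congested link.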
In the embodiment of the present invention, as shown in fig. 4, the centralized disaster recovery switching device 3 includes a monitoring information receiving module 31, a monitoring information processing module 32, a fault information management module 33, and a takeover module 34; wherein
the monitoring information receiving module 31 is configured to receive a fault state of the first data center 1 and an obtained operating state of the second data center 2, and receive a fault state of the second data center 2 and an obtained operating state of the first data center 1;
the monitoring information processing module 32 is configured to compare the fault state of the first data center 1 with the operating state of the first data center 1 obtained by the second data center 2 to form a first comparison result, and compare the fault state of the second data center 2 with the operating state of the second data center 2 obtained by the first data center 1 to form a second comparison result, according to preset fault feature data;
the fault information management module 33 is configured to identify a fault data center and a normal data center of the first data center and the second data center according to the first comparison result and the second comparison result; it should be noted that at most one of the first data center and the second data center is assumed to be faulty at any time, since otherwise the whole dual data center system is down;
and the takeover module 34 is configured to generate a corresponding takeover instruction according to a preset fault logic principle, so that the normal data center takes over all data services of the fault data center.
Furthermore, the centralized disaster recovery switching device 3 further includes a correction module 35; the correction module 35 is configured to correct and update the preset fault logic principle.
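The comparison and identification steps performed by modules 32 and 33 can be sketched in Python as follows. The description leaves the comparison criteria to the preset fault feature data, so the policy here is an assumption: a data center counts as faulty only when its own fault report and the peer's heartbeat observation agree, and a disagreement yields an "inconsistent" result that triggers no takeover. The function names and the labels "DC1" and "DC2" are hypothetical.

```python
from typing import Optional, Tuple


def compare(local_fault_state: str, peer_observed_state: str) -> str:
    """Comparison result for one data center.

    local_fault_state is what the center reports about itself;
    peer_observed_state is what the other center derived from heartbeats.
    """
    if local_fault_state == peer_observed_state:
        return local_fault_state  # both views agree: "normal" or "fault"
    return "inconsistent"         # views disagree, e.g. a broken link between centers


def identify(first_result: str, second_result: str) -> Optional[Tuple[str, str]]:
    """Return (faulty center, normal center), or None if no takeover is warranted."""
    if first_result == "fault" and second_result == "normal":
        return ("DC1", "DC2")
    if second_result == "fault" and first_result == "normal":
        return ("DC2", "DC1")
    # Both normal, both faulty, or any inconsistent view: no takeover is issued.
    return None
```

Under this assumed policy, a failure of the channel between the two centers makes each center look faulty to the other while both report themselves healthy. Both comparison results are then "inconsistent" and `identify` returns `None`, which avoids both centers trying to take over each other.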
The embodiment of the invention has the following beneficial effects:
according to the invention, the local fault state monitoring module and the opposite end running state monitoring module of each data center respectively monitor the fault state of the data center and the running state of the opposite data center, so that analysis data is provided for the centralized disaster recovery switching device, and the centralized disaster recovery switching device achieves the purposes of automatically identifying abnormal conditions of the double data centers and performing corresponding switching operation.
It should be noted that, in the foregoing system embodiment, each included module is only divided according to functional logic, but is not limited to the above division as long as the corresponding function can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It will be understood by those skilled in the art that all or part of the steps in the method for implementing the above embodiments may be implemented by relevant hardware instructed by a program, and the program may be stored in a computer-readable storage medium, such as ROM/RAM, magnetic disk, optical disk, etc.
The above disclosure presents only preferred embodiments of the present invention and is not intended to limit the scope of its claims; equivalent variations made in accordance with the claims of the present invention therefore remain within the scope of the invention.

Claims (10)

1. A double-data-center disaster recovery system, characterized by comprising a first data center and a second data center which are connected with each other, and a centralized disaster recovery switching device which is connected with both the first data center and the second data center; wherein
the first data center is used for monitoring the fault state of a local data center and obtaining the running state of the second data center by receiving the heartbeat response condition fed back by the second data center after sending heartbeat request information to the second data center;
the second data center is used for monitoring the fault state of a local data center and obtaining the running state of the first data center by receiving the heartbeat response condition fed back by the first data center after sending heartbeat request information to the first data center;
the centralized disaster recovery switching device is configured to compare a fault state of the first data center with an operating state of the first data center obtained by the second data center to form a first comparison result, compare a fault state of the second data center with an operating state of the second data center obtained by the first data center to form a second comparison result, identify a fault data center and a normal data center among the first data center and the second data center according to the first comparison result and the second comparison result, and further allow the normal data center to take over all data services of the fault data center.
2. The dual-data-center disaster recovery system according to claim 1, wherein the first data center and the second data center provide the same or different data services after establishing the connection.
3. The dual-data-center disaster recovery system according to claim 1, wherein the first data center includes a first local failure status monitoring module and a first peer operation status monitoring module both connected to the centralized disaster recovery switching device; the first local fault state monitoring module is used for monitoring the fault state of the first data center; the first peer operation state monitoring module is configured to obtain an operation state of the second data center by receiving a heartbeat response condition fed back by the second data center after sending heartbeat request information to the second data center;
the second data center comprises a second local fault state monitoring module and a second peer operation state monitoring module which are both connected with the centralized disaster recovery switching device, and the second peer operation state monitoring module is also connected by a channel with the first peer operation state monitoring module; the second local fault state monitoring module is used for monitoring the fault state of the second data center; the second peer operation state monitoring module is configured to obtain an operation state of the first data center by receiving a heartbeat response condition fed back by the first data center after sending heartbeat request information to the first data center.
4. The dual data center disaster recovery system of claim 3 wherein said first local failure status monitoring module comprises a first equipment status monitoring submodule and a first environmental monitoring submodule; the first equipment state monitoring submodule is used for monitoring equipment health data in the first data center to obtain an equipment health state in the first data center; the first environment monitoring submodule is used for monitoring environment data in the first data center to obtain an environment state of the first data center;
the second local fault state monitoring module comprises a second equipment state monitoring submodule and a second environment monitoring submodule; the second equipment state monitoring submodule is used for monitoring equipment health data in the second data center to obtain the equipment health state in the second data center; and the second environment monitoring submodule is used for monitoring the environment data in the second data center to obtain the environment state of the second data center.
5. The dual-data-center disaster recovery system of claim 4, wherein the equipment health data of the first data center and the second data center each comprise an equipment current value and an equipment voltage value; the environmental data of the first data center and the second data center each include a humidity and a temperature.
6. The dual-data-center disaster recovery system according to claim 3, wherein the first peer-to-peer operation status monitoring module comprises a first heartbeat request information sending sub-module, a first heartbeat response information receiving sub-module, and a first heartbeat monitoring management sub-module connected to the centralized disaster recovery switching device; the first heartbeat request information sending submodule is used for sending heartbeat request information to the second data center; the first heartbeat response information receiving submodule is used for receiving a heartbeat response condition fed back by the second data center; the first heartbeat monitoring management submodule is used for obtaining the running state of the second data center according to the heartbeat response condition fed back by the second data center;
the second peer-to-peer operation state monitoring module comprises a second heartbeat request information sending submodule, a second heartbeat response information receiving submodule and a second heartbeat monitoring management submodule connected with the centralized disaster recovery switching device; the second heartbeat request information sending submodule is used for sending heartbeat request information to the first data center; the second heartbeat response information receiving submodule is used for receiving a heartbeat response condition fed back by the first data center; and the second heartbeat monitoring management submodule is used for obtaining the running state of the first data center according to the heartbeat response condition fed back by the first data center.
7. The dual data center disaster recovery system according to claim 6, wherein the first heartbeat monitoring management sub-module comprises a first timing counting unit and a first operation status monitoring management unit; the first timing counting unit is configured to start timing when the first heartbeat request information sending sub-module sends heartbeat request information to the second data center, and to add 1 to a count value if the first heartbeat response information receiving sub-module has not received heartbeat response information fed back by the second data center when a preset time is exceeded; or to reset the count value if the first heartbeat response information receiving sub-module receives heartbeat response information fed back by the second data center within the preset time; the first operation status monitoring management unit is configured to mark the running state of the second data center as a fault if the count value of the first timing counting unit is greater than a threshold value, and otherwise to mark the running state of the second data center as normal;
the second heartbeat monitoring management submodule comprises a second timing counting unit and a second operation status monitoring management unit; the second timing counting unit is configured to start timing when the second heartbeat request information sending sub-module sends heartbeat request information to the first data center, and to add 1 to a count value if the second heartbeat response information receiving sub-module has not received heartbeat response information fed back by the first data center when the preset time is exceeded; or to reset the count value if the second heartbeat response information receiving sub-module receives heartbeat response information fed back by the first data center within the preset time; the second operation status monitoring management unit is configured to mark the running state of the first data center as a fault if the count value of the second timing counting unit is greater than the threshold value, and otherwise to mark the running state of the first data center as normal.
8. The dual-data-center disaster recovery system according to claim 6, wherein the first heartbeat request information sending sub-module or the second heartbeat request information sending sub-module sends heartbeat request information to the other at regular intervals to periodically detect a heartbeat connection condition between the first data center and the second data center.
9. The dual data center disaster recovery system according to claim 1, wherein the centralized disaster recovery switching device comprises a monitoring information receiving module, a monitoring information processing module, a fault information management module and a takeover module; wherein:
the monitoring information receiving module is used for receiving the fault state of the first data center and the obtained running state of the second data center, and receiving the fault state of the second data center and the obtained running state of the first data center;
the monitoring information processing module is used for, according to preset fault characteristic data, comparing the fault state of the first data center with the running state of the first data center obtained by the second data center to form a first comparison result, and comparing the fault state of the second data center with the running state of the second data center obtained by the first data center to form a second comparison result;
the fault information management module is used for identifying a fault data center and a normal data center in the first data center and the second data center according to the first comparison result and the second comparison result;
and the takeover module is used for generating a corresponding takeover instruction according to a preset fault logic principle, so that the normal data center takes over all data services of the fault data center.
10. The dual data center disaster recovery system according to claim 9, wherein said centralized disaster recovery switching device further comprises a correction module; wherein:
the correction module is used for correcting and updating the preset fault logic principle.
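
For illustration only, the missed-heartbeat counting of claim 7 and the comparison and takeover steps of claim 9 might be sketched in Python as follows. This is a minimal sketch and not the patented implementation: the names HeartbeatMonitor, compare and plan_takeover, the threshold value, and above all the decision rule inside compare are assumptions, since the claims leave the "preset fault characteristic data" and the "preset fault logic principle" unspecified.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeartbeatMonitor:
    """Missed-heartbeat counter for one peer (claim 7); names are hypothetical."""
    threshold: int   # assumed value; the claims only say "a threshold value"
    missed: int = 0

    def record(self, response_within_preset_time: bool) -> str:
        # A timely response resets the count; a timeout adds 1 to it.
        if response_within_preset_time:
            self.missed = 0
        else:
            self.missed += 1
        # The peer is marked as a fault only when the count exceeds the threshold.
        return "fault" if self.missed > self.threshold else "normal"


def compare(local_fault: Optional[bool], peer_sees_fault: Optional[bool]) -> str:
    """Form one comparison result for a single data center.

    local_fault: the center's own fault report (None if no report arrived).
    peer_sees_fault: the other center's heartbeat verdict (None if unavailable).
    ASSUMED rule: both views must be compatible before a verdict is given.
    """
    if peer_sees_fault is True and local_fault is not False:
        return "fault"
    if local_fault is False and peer_sees_fault is not True:
        return "normal"
    # Disagreement, e.g. a broken heartbeat channel between two healthy centers.
    return "inconclusive"


def plan_takeover(first_result: str, second_result: str) -> Optional[str]:
    """Return a takeover instruction, or None when no switch is warranted."""
    if first_result == "fault" and second_result == "normal":
        return "second takes over first"
    if second_result == "fault" and first_result == "normal":
        return "first takes over second"
    return None


# Usage: the second center stops hearing from the first and stays healthy itself.
monitor = HeartbeatMonitor(threshold=2)
verdicts = [monitor.record(False) for _ in range(3)]
first_result = compare(local_fault=None, peer_sees_fault=verdicts[-1] == "fault")
second_result = compare(local_fault=False, peer_sees_fault=None)
instruction = plan_takeover(first_result, second_result)
```

Under the assumed rule, a broken heartbeat channel between two healthy centers yields two "inconclusive" results and no takeover, which is one plausible reason for comparing each center's own fault report with the peer's heartbeat verdict rather than relying on either alone.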
CN201910939003.4A 2019-09-30 2019-09-30 Double-data-center disaster recovery system Pending CN110635950A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910939003.4A CN110635950A (en) 2019-09-30 2019-09-30 Double-data-center disaster recovery system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910939003.4A CN110635950A (en) 2019-09-30 2019-09-30 Double-data-center disaster recovery system

Publications (1)

Publication Number Publication Date
CN110635950A true CN110635950A (en) 2019-12-31

Family

ID=68973549

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910939003.4A Pending CN110635950A (en) 2019-09-30 2019-09-30 Double-data-center disaster recovery system

Country Status (1)

Country Link
CN (1) CN110635950A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060193252A1 (en) * 2005-02-25 2006-08-31 Cisco Technology, Inc. Active-active data center using RHI, BGP, and IGP anycast for disaster recovery and load distribution
CN103106048A (en) * 2013-01-30 2013-05-15 浪潮电子信息产业股份有限公司 Multi-control multi-activity storage system
CN105574590A (en) * 2015-12-28 2016-05-11 中国民航信息网络股份有限公司 Adaptive general control disaster recovery switching device and system, and signal generation method
CN109451189A (en) * 2018-09-25 2019-03-08 国家电网有限公司客户服务中心 One kind being based on event driven 95598 strange land dual-active system panorama switching system and method
CN110177007A (en) * 2019-04-16 2019-08-27 平安科技(深圳)有限公司 Realize gateway strange land method, apparatus, computer equipment and storage medium mostly living

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112540873A (en) * 2020-12-03 2021-03-23 华云数据控股集团有限公司 Disaster tolerance method and device, electronic equipment and disaster tolerance system
CN112540873B (en) * 2020-12-03 2021-12-31 华云数据控股集团有限公司 Disaster tolerance method and device, electronic equipment and disaster tolerance system
CN113037560A (en) * 2021-03-18 2021-06-25 同盾科技有限公司 Service flow switching method and device, storage medium and electronic equipment
CN113037560B (en) * 2021-03-18 2022-09-30 同盾科技有限公司 Service flow switching method and device, storage medium and electronic equipment
CN113765705A (en) * 2021-08-12 2021-12-07 深圳市珍爱捷云信息技术有限公司 Traffic switching method and traffic management server for cross-public-cloud dual-active structure
WO2023093379A1 (en) * 2021-11-26 2023-06-01 中兴通讯股份有限公司 Disaster recovery switching method and system, electronic device, and storage medium
CN114338359A (en) * 2021-12-29 2022-04-12 中国邮政储蓄银行股份有限公司 Method and device for processing data center abnormity
CN114679376A (en) * 2022-02-22 2022-06-28 兴业证券股份有限公司 Multi-data-center disaster recovery method and system

Similar Documents

Publication Publication Date Title
CN110635950A (en) Double-data-center disaster recovery system
CN107465721B (en) Global load balancing method and system based on double-active architecture and scheduling server
CN102355368B (en) Fault processing method of network equipment and system
CN108737574B (en) Node offline judgment method, device, equipment and readable storage medium
CN107862626A (en) A kind of real-time power failure monitoring method and device based on measuring terminal warning information
WO2016183967A1 (en) Failure alarm method and apparatus for key component, and big data management system
CN112422684B (en) Target message processing method and device, storage medium and electronic device
US20080082630A1 (en) System and method of fault tolerant reconciliation for control card redundancy
CN111901176B (en) Fault determination method, device, equipment and storage medium
CN111565133B (en) Private line switching method and device, electronic equipment and computer readable storage medium
CN113535480A (en) Data disaster recovery system and method
CN110674096A (en) Node troubleshooting method, device and equipment and computer readable storage medium
CN111309515B (en) Disaster recovery control method, device and system
CN111953808B (en) Data transmission switching method of dual-machine dual-activity architecture and architecture construction system
CN109510730B (en) Distributed system, monitoring method and device thereof, electronic equipment and storage medium
US20210326224A1 (en) Method and system for processing device failure
CN107026762B (en) Disaster recovery system and method based on distributed cluster
CN111404737B (en) Disaster recovery processing method and related device
KR20190104759A (en) System and method for intelligent equipment abnormal symptom proactive detection
CN115102862B (en) Automatic synchronization method and device for SDN equipment
CN116506340A (en) Flow link testing method and device, electronic equipment and storage medium
CN112751722A (en) Data transmission quality monitoring method and system
CN109104314A (en) A kind of method and device for modifying log configuration file
JP2015162806A (en) remote monitoring system
CN107590032A (en) The method and storage cluster system of storage cluster failure transfer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191231
