CN114500238A - Automatic switching system and method for block-level disaster recovery, electronic device and medium - Google Patents

Automatic switching system and method for block-level disaster recovery, electronic device and medium Download PDF

Info

Publication number
CN114500238A
Authority
CN
China
Prior art keywords
self
address
healing
application server
control center
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210090705.1A
Other languages
Chinese (zh)
Other versions
CN114500238B (en)
Inventor
董帅 (Dong Shuai)
陈跃俊 (Chen Yuejun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ybm Technologies Pvt ltd
Original Assignee
Ybm Technologies Pvt ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ybm Technologies Pvt ltd filed Critical Ybm Technologies Pvt ltd
Priority to CN202210090705.1A priority Critical patent/CN114500238B/en
Publication of CN114500238A publication Critical patent/CN114500238A/en
Application granted granted Critical
Publication of CN114500238B publication Critical patent/CN114500238B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0631 Management of faults, events, alarms or notifications using root cause analysis; using analysis of correlation between notifications, alarms or events based on decision criteria, e.g. hierarchy, tree or time analysis
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0663 Performing the actions predefined by failover planning, e.g. switching to standby network elements
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0893 Assignment of logical groups to network elements
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805 Monitoring or testing based on specific metrics by checking availability
    • H04L43/0817 Monitoring or testing based on specific metrics by checking availability and by checking functioning

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention provides an automatic switching system and method for block-level disaster recovery, as well as an electronic device and a medium. The automatic switching method for block-level disaster recovery comprises the following steps: monitoring the status of an application port and an IP address in real time; executing a self-healing script when the application port and the IP address are closed; when the application port and the IP address are open, judging whether the application server has a persistent abnormal condition; and, when self-healing fails or the application server has a persistent abnormal condition, generating a snapshot creation instruction, exporting the snapshot, creating a virtual machine on the disaster recovery resource pool from the exported snapshot, and setting the hardware resources of the virtual machine based on the original production environment. The automatic switching method for block-level disaster recovery solves the problem that, in the prior art, block-level disaster recovery in replication mode cannot be switched over automatically.

Description

Automatic switching system and method for block-level disaster recovery, electronic device and medium
Technical Field
The present invention relates to the field of internet technologies, and in particular, to an automatic block-level disaster recovery switching system, method, electronic device, and medium.
Background
Block-level disaster recovery copies all relevant applications, the operating system, and other disk contents to a block replication server through disk block-level replication; the server stores the disk files in a virtualization format, and rollback points are reserved through the snapshot function of the virtualized disk.
At present, under the block-level CDM (Copy Data Management) replication mode used in the industry, emergency takeover must be switched manually, which is a complex operation.
Disclosure of Invention
The invention aims to provide an automatic switching system and method for block-level disaster recovery, as well as an electronic device and a medium; the automatic switching method for block-level disaster recovery can solve the problem that, in the prior art, block-level disaster recovery in replication mode cannot be switched over automatically.
To achieve the above purpose, the invention provides the following technical solution:
an embodiment of the invention provides an automatic switching method for block-level disaster recovery, which specifically comprises the following steps:
monitoring the status of an application port and an IP address in real time;
executing a self-healing script when the application port and the IP address are closed;
when the application port and the IP address are open, judging whether the application server has a persistent abnormal condition;
when self-healing fails or the application server has a persistent abnormal condition, generating a snapshot creation instruction, exporting the snapshot, creating a virtual machine on the disaster recovery resource pool from the exported snapshot, and setting the hardware resources of the virtual machine based on the original production environment.
On the basis of the above technical solution, the invention can be further improved as follows:
Further, monitoring the status of the application port and the IP address in real time includes:
judging whether communication between the control center and the control center gateway fails, and if so, executing a self-healing script;
when communication between the control center and the control center gateway succeeds, judging whether communication between the control center and the application server gateway fails, and if so, sending a warning signal and executing a self-healing script;
and when communication between the control center and the control center gateway succeeds and communication between the control center and the application server gateway succeeds, judging whether the application port and the IP address are closed, and if so, executing a self-healing script.
Further, executing a self-healing script when the application port and the IP address are closed includes:
presetting, in the control center, the number of times the self-healing script is executed and the interval between executions;
after the self-healing script is executed, monitoring whether the application port and the IP address are closed again within a specified time;
and if the application port and the IP address are not closed again, judging that the self-healing result is a success.
Further, monitoring whether the application port and the IP address are closed again within the specified time after the self-healing script is executed includes:
if the application port and the IP address are closed again, judging that the self-healing result is a failure;
sending, by the application server, self-healing failure feedback information to the control center;
and stopping, by the control center, execution of the self-healing script.
Further, judging whether the application server has a persistent abnormal condition when the application port and the IP address are open includes:
acquiring monitoring data of the application server in real time;
and comparing the monitoring data to judge whether it shows prolonged abnormal CPU and memory usage, and if so, judging that the application server is abnormal.
Further, when self-healing fails or the application server has a persistent abnormal condition, generating a snapshot creation instruction, exporting the snapshot, creating a virtual machine on the disaster recovery resource pool from the exported snapshot, and setting the hardware resources of the virtual machine based on the original production environment includes:
setting the hardware resources of the virtual machine based on the original production environment and a preset resource limit;
and executing, by the application server, a port closing action on the open application port and IP address.
Further, when self-healing fails or the application server has a persistent abnormal condition, generating a snapshot creation instruction, exporting the snapshot, creating a virtual machine on the disaster recovery resource pool from the exported snapshot, and setting the hardware resources of the virtual machine based on the original production environment further comprises:
detecting whether the application server starts normally, and if so, monitoring the application port, the IP address, and the running state of the application server after it starts.
An automatic switching system for block-level disaster recovery comprises:
an application server, configured to monitor the status of an application port and an IP address in real time and to execute a self-healing script when the application port and the IP address are closed;
a control center, configured to judge whether the application server has a persistent abnormal condition when the application port and the IP address are open;
and a creation module, configured to generate a snapshot creation instruction when self-healing fails or the application server has a persistent abnormal condition, export the snapshot, create a virtual machine on the disaster recovery resource pool from the exported snapshot, and set the hardware resources of the virtual machine based on the original production environment.
An electronic device comprises a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the above method when executing the computer program.
A non-transitory computer-readable medium has stored thereon a computer program which, when executed by a processor, implements the steps of the above method.
The invention has the following advantages:
The automatic switching method for block-level disaster recovery monitors the status of an application port and an IP address in real time; executes a self-healing script when the application port and the IP address are closed; judges, when the application port and the IP address are open, whether the application server has a persistent abnormal condition; and, when self-healing fails or the application server has a persistent abnormal condition, generates a snapshot creation instruction, exports the snapshot, creates a virtual machine on the disaster recovery resource pool from the exported snapshot, and sets the hardware resources of the virtual machine based on the original production environment. This solves the problem that, in the prior art, block-level disaster recovery in replication mode cannot be switched over automatically.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of an automatic switching method of block-level disaster recovery according to the present invention;
FIG. 2 is a block diagram of an automatic switching system for block-level disaster recovery according to the present invention;
FIG. 3 is a block diagram of the disaster recovery system of the present invention;
fig. 4 is a schematic structural diagram of an electronic device provided in the present invention.
Description of the reference numerals
The system comprises an application server 10, a control center 20, a creation module 30, a disaster recovery execution module 40, an electronic device 50, a processor 501, a memory 502 and a bus 503.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances, such that the embodiments of the application described herein may be implemented in orders other than those illustrated or described here. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In this application, the terms "upper", "lower", "left", "right", "front", "rear", "top", "bottom", "inner", "outer", "middle", "vertical", "horizontal", "lateral", "longitudinal", and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings. These terms are used primarily to better describe the present application and its embodiments, and are not used to limit the indicated devices, elements or components to a particular orientation or to be constructed and operated in a particular orientation.
Moreover, some of the above terms may be used to indicate other meanings besides the orientation or positional relationship, for example, the term "on" may also be used to indicate some kind of attachment or connection relationship in some cases. The specific meaning of these terms in this application will be understood by those of ordinary skill in the art as appropriate.
In addition, the term "plurality" shall mean two as well as more than two.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 is a flowchart of an embodiment of the automatic switching method for block-level disaster recovery. As shown in Fig. 1, the method provided by the embodiment of the present invention includes the following steps:
S101, monitoring the status of an application port and an IP address in real time.
Specifically, whether communication between the control center 20 and the gateway of the control center 20 fails is judged; if so, a self-healing script is executed. When communication between the control center 20 and its gateway succeeds, whether communication between the control center 20 and the gateway of the application server 10 fails is judged; if so, a warning signal is sent and a self-healing script is executed.
the Gateway (Gateway) is also called an internetwork connector and a protocol converter. The gateway realizes network interconnection above a network layer, is a complex network interconnection device and is only used for interconnection of two networks with different high-level protocols. The gateway can be used for interconnection of both wide area networks and local area networks. A gateway is a computer system or device that acts as a switch-operative. The gateway is a translator used between two systems that differ in communication protocol, data format or language, or even in an entirely different architecture. Instead of the bridge simply communicating the information, the gateway repackages the received information to accommodate the needs of the destination system. Same layer-application layer.
When the control center 20 communicates successfully both with its own gateway and with the gateway of the application server 10, whether the application port is closed is judged; if so, a self-healing script is executed.
A port can be regarded as an outlet through which a device communicates with the outside world. Ports are divided into virtual ports and physical ports. Virtual ports are ports inside a computer or inside a switch or router and are not visible, for example ports 80, 21, and 23 on a computer. Physical ports, also called interfaces, are visible ports, such as the RJ45 network port on a computer backplane and the RJ45 ports of switches, routers, and hubs; the RJ11 jack used by telephones also falls into the category of physical ports.
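The real-time port and IP checks described above can be sketched as follows. This is an illustrative assumption, not part of the disclosure: the patent does not name probe mechanisms, so a TCP connect is used for the application port and an ICMP ping (Linux `ping -c 1`) for IP reachability, with hypothetical timeout values.

```python
# Hedged sketch of the real-time monitoring step (S101).
# Probe mechanisms, hosts, ports, and timeouts are illustrative assumptions.
import socket
import subprocess

def ip_reachable(ip: str, timeout_s: int = 2) -> bool:
    """Check IP reachability with a single ICMP ping (Linux `ping -c 1`)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

def port_open(ip: str, port: int, timeout_s: float = 2.0) -> bool:
    """Check whether the application port accepts TCP connections."""
    try:
        with socket.create_connection((ip, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```

In a deployment, the monitor would call these probes in a loop and report a "closed" status when both checks fail.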
S102, executing a self-healing script when the application port and the IP address are closed.
Specifically, the number of times the self-healing script is executed and the interval between executions are preset in the control center 20.
After the self-healing script is executed, whether the application port and the IP address are closed again within the specified time is monitored; if they are not closed again, the self-healing result is judged a success.
If the application port and the IP address are closed again, the self-healing result is judged a failure;
the application server 10 sends self-healing failure feedback information to the control center 20;
and the control center 20 stops executing the self-healing script.
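The retry logic of step S102 might look like the following sketch. The callback names and the default attempt count and interval are assumptions; the patent states only that both values are configurable in the control center.

```python
# Minimal sketch of the self-healing retry logic of S102.
# `run_self_heal_script` and `port_is_closed` are hypothetical callbacks.
import time

def self_heal(run_self_heal_script, port_is_closed,
              max_attempts: int = 3, interval_s: float = 5.0,
              sleep=time.sleep) -> bool:
    """Return True on self-healing success, False on failure."""
    for _attempt in range(max_attempts):
        run_self_heal_script()
        sleep(interval_s)            # wait the preset interval
        if not port_is_closed():     # port stayed open: healed
            return True
    # Port closed again after every attempt: the caller reports failure
    # to the control center, which then stops issuing the script.
    return False
```

On failure, the application server would send this result as feedback to the control center, matching the flow above.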
S103, when the application port and the IP address are open, judging whether the application server has a persistent abnormal condition.
Specifically, monitoring data of the application server 10 is acquired in real time,
and the monitoring data is compared to judge whether it shows prolonged abnormal CPU and memory usage; if so, the application server 10 is judged abnormal.
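One way to realize this comparison of short-term against long-term monitoring data is sketched below. The window size and the ratio threshold are assumptions; the patent says only that the duration and count are configurable.

```python
# Illustrative sketch of the persistent-anomaly check of S103: compare
# recent CPU/memory samples against a longer-term baseline average.
# `window` and `factor` are assumed, configurable parameters.
from statistics import mean

def persistent_anomaly(samples, window: int = 5, factor: float = 2.0) -> bool:
    """True if the last `window` samples average more than `factor` times
    the long-term average (suggesting a CPU or memory leak)."""
    if len(samples) <= window:
        return False                     # not enough history yet
    baseline = mean(samples[:-window])   # long-term average
    recent = mean(samples[-window:])     # short-term average
    return baseline > 0 and recent > factor * baseline
```

The same function could be applied separately to CPU and memory series pushed by the monitor module.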
S104, when self-healing fails or the application server has a persistent abnormal condition, generating a snapshot creation instruction, exporting the snapshot, automatically creating a virtual machine on the disaster recovery resource pool from the exported snapshot, and automatically setting the resources of the virtual machine according to the original production environment and preset resource limits.
Specifically, a port closing action is executed on the open application port and IP address through the application server 10,
and whether the application server 10 starts normally is detected; if so, the application port, the IP address, and the running state of the application server 10 are monitored after it starts.
The disaster recovery system consists of the control center 20, the disaster recovery execution module 40, and a client (the application server 10).
The application server 10 has three functional modules: an agent module, a CDP module, and a monitor module. The agent module communicates with the disaster recovery execution module 40, executes tasks, and transfers data; the CDP module handles the driver, real-time data capture, and local caching; and the monitor module monitors the state of the application server 10 and actively pushes the monitoring results to the control center 20, reporting the status of the CPU, memory, network, and the specified application ports every 3-5 seconds.
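As a rough sketch of the monitor module's push cycle, assuming hypothetical `collect` and `push` callbacks and a 4-second default period (the patent specifies only the 3-5 second cadence):

```python
# Hedged sketch of the monitor module's push loop: every cycle it collects
# CPU, memory, network, and application-port status and pushes the snapshot
# to the control center.  `collect` and `push` are hypothetical callbacks.
import time

def monitor_loop(collect, push, period_s: float = 4.0, cycles=None,
                 sleep=time.sleep):
    """Push one status snapshot per cycle; `cycles=None` runs forever."""
    n = 0
    while cycles is None or n < cycles:
        push(collect())      # e.g. {"cpu": ..., "mem": ..., "ports": ...}
        sleep(period_s)
        n += 1
```

The control center, on the receiving side of `push`, would infer the client IP and port status from these snapshots.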
The control center 20 actively queries the task running status of the disaster recovery execution module 40, issues tasks to it, receives monitor information from the client, checks the conditions returned by the monitor, and infers from them the status of the client IP and the application port.
The number of self-healing attempts and the interval between them are preset in the control center 20. When the monitor module detects that the application port is closed, it automatically executes the preset self-healing script and monitors whether the application port is closed again within the specified time (the number of attempts and the interval are configurable). If the application port self-heals successfully within the specified time, no further operation is performed;
if self-healing of the application port fails, the monitor feeds the information back to the control center 20 and the self-healing operation stops.
First, the control center 20 confirms communication with its own gateway; if this communication fails, the control center 20 itself has a problem and performs no subsequent operation.
Second, the control center 20 checks gateway communication with the application server 10; if the first step succeeded but this communication fails, the network between the control center 20 and the application server 10 has a problem, so only an alarm is raised and no other operation is performed.
Third, the control center checks the status of the application port; if the first and second steps succeeded but the third step fails, the automatic switching process is prepared.
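The three-step cascade just described can be condensed into a small decision function. The probe results are passed in as booleans; the returned action names are illustrative, not taken from the disclosure.

```python
# Sketch of the three-step decision cascade: control-center gateway,
# application-server gateway, then application port.
def decide_action(center_gw_ok: bool, app_gw_ok: bool, port_ok: bool) -> str:
    if not center_gw_ok:
        # Step 1 failed: the control center itself is suspect; do nothing.
        return "no-op"
    if not app_gw_ok:
        # Step 2 failed: network problem between center and server; alarm only.
        return "alarm"
    if not port_ok:
        # Steps 1-2 succeeded but the port check failed: prepare switching.
        return "prepare-switch"
    return "healthy"
```

The ordering matters: a gateway failure closer to the control center masks the later checks, so switching is only prepared when the fault is isolated to the application port.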
Regarding the self-healing judgment conditions: after obtaining the self-healing failure result returned by the client, the control center 20 judges communication between itself and its gateway, communication between itself and the gateway of the application server 10, and the status of the application port, and determines the subsequent operation according to the results.
Judgment based on other conditions mainly concerns persistent abnormal conditions of the application's CPU, memory, and the like, determined by comparing and analyzing short-term and long-term monitoring data. If CPU and memory usage is abnormally high for a long time (the duration and the number of occurrences are configurable), or suddenly stays high for an extended period, there may be a memory or CPU leak, so that the application cannot be provided normally or can only be provided partially. Once these conditions are met, the control center judges communication with its own gateway, communication with the gateway of the application server 10, and the status of the application port; with the added requirement that the first and second steps succeed, the switching process is started.
The switching process is triggered automatically by the judgment result for the application port:
Before switching: the open application port is blocked, and the monitor program executes the port closing action so that the production network card goes offline or is reset to 1.1.1.1; switching then proceeds (switching is performed when the IP is unavailable).
During switching, the following process is completed automatically: a snapshot creation instruction is generated, the snapshot is exported, a virtual machine is automatically created on the disaster recovery resource pool from the exported snapshot, resources such as the MAC address, VLAN number, CPU, and memory are automatically set according to the original production environment and preset resource limits, and the virtual machine starts up automatically once the settings are complete.
After switching: whether the application server 10 starts normally is detected. When it starts normally, communication between the control center 20 and its gateway, communication between the control center 20 and the gateway of the application server 10, and the status of the application port and the IP address are judged; once the IP, application port, CPU, memory, and so on are all normal, the detection result (including any abnormal conditions) is sent to the control center 20.
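The before/during/after flow above can be sketched as a single orchestration function. Every object and method here is hypothetical, since the patent does not name a snapshot or virtualization API; only the ordering of steps and the capping of resources at the preset limits follow the description.

```python
# Hedged sketch of the switching flow: export a snapshot, create a VM in
# the disaster-recovery pool, copy MAC/VLAN/CPU/memory from the original
# production environment subject to preset limits, then power on.
def switch_over(snapshot_api, pool, prod_env, limits):
    snap = snapshot_api.create()          # generate snapshot creation instruction
    exported = snapshot_api.export(snap)  # export the snapshot
    vm = pool.create_vm(exported)         # create VM from the exported snapshot
    vm.configure(
        mac=prod_env["mac"],
        vlan=prod_env["vlan"],
        cpu=min(prod_env["cpu"], limits["cpu"]),          # cap at preset limit
        memory=min(prod_env["memory"], limits["memory"]),
    )
    vm.power_on()                         # start automatically once configured
    return vm
```

After `power_on`, the post-switch checks described above (gateway communication, port and IP status) would run against the new virtual machine.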
Fig. 2 is a block diagram of an embodiment of an automatic switching system for block-level disaster recovery, and Fig. 3 is a block diagram of the disaster recovery system of the present invention. As shown in Figs. 2 and 3, the automatic switching system for block-level disaster recovery provided in the embodiment of the present invention includes:
the application server, configured to monitor the status of an application port and an IP address in real time and to execute a self-healing script when the application port and the IP address are closed;
the control center, configured to judge whether the application server has a persistent abnormal condition when the application port and the IP address are open; to judge whether communication between the control center and the control center gateway fails and, if so, execute a self-healing script; when that communication succeeds, to judge whether communication between the control center and the application server gateway fails and, if so, send a warning signal and execute a self-healing script; and, when both communications succeed, to judge whether the application port and the IP address are closed and, if so, execute a self-healing script; and
the creation module 30, configured to generate a snapshot creation instruction when self-healing fails or the application server has a persistent abnormal condition, export the snapshot, create a virtual machine on the disaster recovery resource pool from the exported snapshot, and set the hardware resources of the virtual machine based on the original production environment.
the disaster recovery backup system is composed of a control center 20, a disaster recovery backup execution module 40 and a client (application server 10).
The application server 10 has three functional modules, which are an agent module, a CDP module and a monitor module; the agent module is used for communicating, executing and transmitting data with the disaster recovery execution module 40; the CDP module is used for driving, real-time data capturing and local caching, the monitor module is used for monitoring the state of the application server 10 and actively pushing the monitoring result to the control center 20 for use, and the monitor module pushes the opening conditions of the CPU, the memory, the network and the application port (specified) to the control center 20 within 3-5 seconds.
The control center 20 actively queries the task running condition of the disaster recovery execution module 40, issues the task to the disaster recovery execution module 40, receives monitor information of the client, detects the condition returned by the monitor, and reversely determines the opening condition of the client IP and the application port according to the condition returned by the monitor.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 4, the electronic device 50 includes: a processor 501 (processor), a memory 502 (memory), and a bus 503;
the processor 501 and the memory 502 communicate with each other through the bus 503;
the processor 501 is configured to call the program instructions in the memory 502 to perform the methods provided by the above method embodiments, for example including: monitoring the status of an application port and an IP address in real time; executing a self-healing script when the application port and the IP address are closed; when the application port and the IP address are open, determining whether the application server has a persistent abnormal condition; and when self-healing fails or the application server has a persistent abnormal condition, generating a snapshot creation instruction, deriving the snapshot creation instruction, creating a virtual machine on the disaster recovery resource pool through the derived snapshot creation instruction, and setting the hardware resources of the virtual machine based on the original production environment.
The present embodiments provide a non-transitory computer-readable medium storing computer instructions that cause a computer to perform the methods provided by the above method embodiments, for example including: monitoring the status of an application port and an IP address in real time; executing a self-healing script when the application port and the IP address are closed; when the application port and the IP address are open, determining whether the application server has a persistent abnormal condition; and when self-healing fails or the application server has a persistent abnormal condition, generating a snapshot creation instruction, deriving the snapshot creation instruction, creating a virtual machine on the disaster recovery resource pool through the derived snapshot creation instruction, and setting the hardware resources of the virtual machine based on the original production environment.
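The method steps recited above can be sketched end to end. This is a hedged illustration only: `run_self_heal`, `create_snapshot` and `create_vm` are hypothetical stand-ins for the agent's self-healing script and the disaster-recovery pool's snapshot/VM calls, and the retry count follows claim 3's "preset times and interval":

```python
import time

def attempt_self_heal(run_self_heal, is_closed, retries=3, interval=0.0):
    """Run the self-healing script up to a preset number of times with a
    preset interval; succeed as soon as the port/IP stays open."""
    for _ in range(retries):
        run_self_heal()
        if interval:
            time.sleep(interval)
        if not is_closed():
            return True   # port/IP no longer closed: self-healing succeeded
    return False          # still closed after all attempts: self-healing failed

def failover(is_closed, run_self_heal, create_snapshot, create_vm,
             production_env):
    """Self-heal first; only on failure derive a snapshot, create a VM in
    the disaster-recovery resource pool, and size its hardware resources
    from the original production environment."""
    if attempt_self_heal(run_self_heal, is_closed):
        return "self-healed"
    snapshot = create_snapshot()
    return create_vm(snapshot,
                     cpus=production_env["cpus"],
                     memory_mb=production_env["memory_mb"])
```

Sizing the replacement VM from the recorded production environment (optionally capped by the preset resource limit of claim 6) is what lets the switched-over instance stand in for the original server without manual tuning.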
Those of ordinary skill in the art will understand that all or part of the steps for implementing the method embodiments may be completed by hardware executing program instructions; the program may be stored in a computer-readable medium and, when executed, performs the steps of the method embodiments; and the aforementioned media include various media that can store program code, such as a ROM, a RAM, a magnetic disk or an optical disk.
The above-described apparatus embodiments are merely illustrative: the units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. Those of ordinary skill in the art can understand and implement this without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods of the various embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (10)

1. An automatic switching method for block-level disaster recovery is characterized by specifically comprising the following steps:
monitoring the conditions of an application port and an IP address in real time;
executing a self-healing script when the application port and the IP address are closed;
when the application port and the IP address are opened, judging whether the application server has continuous abnormal conditions;
when self-healing fails or continuous abnormal conditions exist in the application server, a snapshot creating instruction is generated, the snapshot creating instruction is derived, a virtual machine is created on the disaster recovery resource pool through the derived snapshot creating instruction, and hardware resources of the virtual machine are set based on the original production environment.
2. The method according to claim 1, wherein the monitoring the application port and the IP address in real time comprises:
judging whether the communication between the control center and the control center gateway fails or not, and if so, executing a self-healing script;
when the communication between the control center and the control center gateway is successful, judging whether the communication between the control center and the application server gateway fails or not, if so, sending a warning signal and executing a self-healing script;
and when the control center is successfully communicated with the control center gateway and the control center is successfully communicated with the application server gateway, judging whether the application port and the IP address are closed or not, and if so, executing a self-healing script.
3. The method according to claim 1, wherein the executing a self-healing script when the application port and the IP address are closed comprises:
presetting times and interval time for executing the self-healing script in the control center;
after the self-healing script is executed, monitoring whether the application port and the IP address are closed again within the designated time;
and if the application port and the IP address are not closed again, judging that the self-healing result is successful in self-healing.
4. The method according to claim 3, wherein the monitoring whether the application port and the IP address are closed again within a specified time after the self-healing script is executed comprises:
if the application port and the IP address are closed again, judging that the self-healing result is self-healing failure;
the application server sends the self-healing failure feedback information to a control center;
the control center stops executing the self-healing script.
5. The method according to claim 1, wherein the determining whether the application server has a persistent abnormal condition when the application port and the IP address are open includes:
acquiring monitoring data of the application server in real time;
and comparing the monitoring data, judging whether the monitoring data has long-time CPU and memory occupation abnormity, and if so, judging that the application server has abnormity.
6. The method according to claim 1, wherein the generating a snapshot creating instruction when self-healing fails or a persistent abnormal condition exists in the application server, deriving the snapshot creating instruction, creating a virtual machine on a disaster recovery resource pool through the derived snapshot creating instruction, and setting hardware resources of the virtual machine based on an original production environment includes:
setting hardware resources of the virtual machine based on the original production environment and preset resource limit;
and executing a port closing action on the opened application port and the IP address through an application server.
7. The method according to claim 1, wherein when self-healing fails or a persistent abnormal condition exists in the application server, generating a snapshot creation instruction, deriving the snapshot creation instruction, creating a virtual machine on a disaster recovery resource pool through the derived snapshot creation instruction, and setting hardware resources of the virtual machine based on an original production environment, further comprises:
and detecting whether the application server is normally started, if so, monitoring the application port, the IP address and the operation condition of the application server after the application server is normally started.
8. An automatic switching system for block-level disaster recovery, comprising:
the application server is used for monitoring the conditions of an application port and an IP address in real time and executing a self-healing script when the application port and the IP address are closed;
the control center is used for judging whether the application server has continuous abnormal conditions or not when the application port and the IP address are opened;
and the creating module is used for generating a snapshot creating instruction and deriving the snapshot creating instruction when the self-healing fails or the application server has continuous abnormal conditions, creating a virtual machine on the disaster recovery resource pool through the derived snapshot creating instruction, and setting hardware resources of the virtual machine based on the original production environment.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor realizes the steps of the method according to any one of claims 1 to 7 when executing the computer program.
10. A non-transitory computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202210090705.1A 2022-01-25 2022-01-25 Automatic switching system, method, electronic equipment and medium for block-level disaster recovery Active CN114500238B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210090705.1A CN114500238B (en) 2022-01-25 2022-01-25 Automatic switching system, method, electronic equipment and medium for block-level disaster recovery


Publications (2)

Publication Number Publication Date
CN114500238A true CN114500238A (en) 2022-05-13
CN114500238B CN114500238B (en) 2024-02-20

Family

ID=81474875

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210090705.1A Active CN114500238B (en) 2022-01-25 2022-01-25 Automatic switching system, method, electronic equipment and medium for block-level disaster recovery

Country Status (1)

Country Link
CN (1) CN114500238B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120072685A1 (en) * 2010-09-16 2012-03-22 Hitachi, Ltd. Method and apparatus for backup of virtual machine data
CN110928728A (en) * 2019-11-27 2020-03-27 上海英方软件股份有限公司 Virtual machine copying and switching method and system based on snapshot
CN111580929A (en) * 2020-05-07 2020-08-25 上海英方软件股份有限公司 Validity verification system and method based on virtual machine protection data
CN112380062A (en) * 2020-11-17 2021-02-19 上海英方软件股份有限公司 Method and system for rapidly recovering system for multiple times based on system backup point
WO2021072880A1 (en) * 2019-10-15 2021-04-22 平安科技(深圳)有限公司 Method for asynchronously creating internal snapshot of virtual machine, apparatus, system and storage medium


Also Published As

Publication number Publication date
CN114500238B (en) 2024-02-20

Similar Documents

Publication Publication Date Title
CN110807064B (en) Data recovery device in RAC distributed database cluster system
EP1117040A2 (en) Method and apparatus for resolving partial connectivity in a clustered computing system
US20160134467A1 (en) Method and apparatus for switching between master device and backup device
CN102546135B (en) Active/standby server switched system and method
CN107480014A (en) A kind of High Availabitity equipment switching method and device
CN113300917B (en) Traffic monitoring method and device for Open Stack tenant network
CN109274761A (en) A kind of NAS clustered node, system and data access method
CN113347037A (en) Data center access method and device
CN112256498A (en) Fault processing method and device
CN114840495A (en) Database cluster split-brain prevention method, storage medium and device
CN113965459A (en) Consul-based method for monitoring host network to realize high availability of computing nodes
CN117201507A (en) Cloud platform switching method and device, electronic equipment and storage medium
CN111399978A (en) OpenStack-based fault migration system and migration method
CN114500238A (en) Automatic switching system and method for block-level disaster recovery, electronic device and medium
CN111277593A (en) Multi-line parallel monitoring method based on internal and external network isolation
CN116192885A (en) High-availability cluster architecture artificial intelligent experiment cloud platform data processing method and system
CN105763365A (en) Method and device for processing anomaly
CN101202658A (en) System and method for service take-over of multi-host system
CN114285822A (en) Domain name resolution server switching method and device
CN109408123B (en) Method and device for reloading configuration file
Cisco Release Notes for Cisco MGX 8260 Media Gateway, Version 1.2.3
CN113238893A (en) Disaster recovery system, method, computer device and medium for multiple data centers
CN109828765B (en) Method for upgrading online service, general routing platform and storage medium
CN114826886B (en) Disaster recovery method and device for application software and electronic equipment
KR102221018B1 (en) Relay system and method for deling with fault of secure session for DB connection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant