CN114500238B

CN114500238B - Automatic switching system, method, electronic equipment and medium for block-level disaster recovery

Info

Publication number: CN114500238B
Application number: CN202210090705.1A
Authority: CN
Inventors: 董帅; 陈跃俊
Original assignee: Ybm Technologies Pvt ltd
Current assignee: Ybm Technologies Pvt ltd
Priority date: 2022-01-25
Filing date: 2022-01-25
Publication date: 2024-02-20
Anticipated expiration: 2042-01-25
Also published as: CN114500238A

Abstract

The invention provides an automatic switching system, method, electronic device, and medium for block-level disaster recovery. The automatic switching method comprises the following steps: monitoring the status of an application port and an IP address in real time; executing a self-healing script when the application port and the IP address are closed; when the application port and the IP address are open, judging whether the application server has a persistent abnormal condition; and, when self-healing fails or the application server has a persistent abnormal condition, generating a snapshot creation instruction, deriving the snapshot creation instruction, creating a virtual machine on a disaster recovery resource pool through the derived snapshot creation instruction, and setting hardware resources of the virtual machine based on the primary production environment. The method solves the prior-art problem that ground disaster recovery in replication mode cannot be switched automatically.

Description

Automatic switching system, method, electronic equipment and medium for block-level disaster recovery

Technical Field

The present invention relates to the field of internet technologies, and in particular, to an automatic switching system, method, electronic device, and medium for block-level disaster recovery.

Background

Ground disaster recovery copies all relevant content, such as the application program and the operating system, to a block replication server through disk block-level replication, stores it on that server as virtualized disk files, and preserves rollback points through the snapshot function of the virtualized disk.

At present, in the ground-based CDM replication mode used in the industry, emergency takeover requires manual switching, and the operation is complex.

Disclosure of Invention

The invention aims to provide an automatic switching system, method, electronic device, and medium for block-level disaster recovery, which can solve the prior-art problem that ground disaster recovery in replication mode cannot be switched automatically.

In order to achieve the above object, the present invention provides the following technical solutions:

the embodiment of the invention provides an automatic switching method of block-level disaster recovery equipment, which specifically comprises the following steps:

monitoring the conditions of an application port and an IP address in real time;

executing a self-healing script when the application port and the IP address are closed;

when the application port and the IP address are opened, judging whether a continuous abnormal condition exists in the application server or not;

when self-healing fails or the application server has continuous abnormal conditions, generating a snapshot creation instruction, deriving the snapshot creation instruction, creating a virtual machine on a disaster recovery resource pool through the derived snapshot creation instruction, and setting hardware resources of the virtual machine based on a primary production environment.

Based on the technical scheme, the invention can also be improved as follows:

further, the monitoring the condition of the application port and the IP address in real time includes:

judging whether the communication between the control center and the control center gateway fails, if so, executing a self-healing script;

when the communication between the control center and the control center gateway is successful, judging whether the communication between the control center and the application server gateway is failed, if so, sending out a warning signal and executing a self-healing script;

and when the control center and the control center gateway are successfully communicated and the control center and the application server gateway are successfully communicated, judging whether the application port and the IP address are closed, and if so, executing a self-healing script.

Further, when the application port and the IP address are closed, executing a self-healing script, including:

presetting the times and interval time for executing the self-healing script in a control center;

after executing the self-healing script, monitoring whether the application port and the IP address are closed again in a designated time;

and if the application port and the IP address are not closed again, judging that the self-healing result is self-healing success.

Further, after the self-healing script is executed, monitoring whether the application port and the IP address are closed again in a specified time includes:

if the application port and the IP address are closed again, judging that the self-healing result is self-healing failure;

the application server sends feedback information of self-healing failure to a control center;

and the control center stops executing the self-healing script.

Further, when the application port and the IP address are open, determining whether the persistent abnormal condition exists in the application server includes:

acquiring monitoring data of the application server in real time;

comparing the monitoring data, judging whether the monitoring data has long-time CPU and memory occupation abnormality, if so, judging that the application server has abnormality.

Further, when the self-healing fails or the application server has a continuous abnormal condition, generating a snapshot creation instruction, deriving the snapshot creation instruction, creating a virtual machine on a disaster recovery resource pool through the derived snapshot creation instruction, and setting hardware resources of the virtual machine based on a primary production environment, including:

setting hardware resources of the virtual machine based on a primary production environment and preset resource limitations;

and executing port closing actions on the opened application port and the IP address through the application server.

Further, when the self-healing fails or the application server has a persistent abnormal condition, generating a snapshot creation instruction, deriving the snapshot creation instruction, creating a virtual machine on a disaster recovery resource pool through the derived snapshot creation instruction, and setting hardware resources of the virtual machine based on a primary production environment, further includes:

and detecting whether the application server is normally started, if so, monitoring the operation conditions of the application port, the IP address and the application server after the application server is normally started.

An automatic switching system of block-level disaster recovery, comprising:

the application server is used for monitoring the conditions of an application port and an IP address in real time, and executing a self-healing script when the application port and the IP address are closed;

the control center is used for judging whether the continuous abnormal condition exists in the application server or not when the application port and the IP address are opened;

and the creation module is used for generating a snapshot creation instruction, deriving the snapshot creation instruction, creating a virtual machine on the disaster recovery resource pool through the derived snapshot creation instruction and setting hardware resources of the virtual machine based on the original production environment when the self-healing fails or the application server has continuous abnormal conditions.

An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method when the computer program is executed.

A non-transitory computer readable medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method.

The invention has the following advantages:

the automatic switching method of the block-level disaster recovery equipment monitors the conditions of the application port and the IP address in real time; executing a self-healing script when the application port and the IP address are closed; when the application port and the IP address are opened, judging whether a continuous abnormal condition exists in the application server or not; when self-healing fails or the application server has continuous abnormal conditions, generating a snapshot creation instruction, deriving the snapshot creation instruction, creating a virtual machine on a disaster recovery resource pool through the derived snapshot creation instruction, and setting hardware resources of the virtual machine based on a primary production environment. The problem that ground disaster recovery equipment cannot be automatically switched in a replication mode in the prior art is solved.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of an automatic switching method of block level disaster recovery according to the present invention;

FIG. 2 is a block diagram of an automatic switching system of block level disaster recovery;

FIG. 3 is a block diagram of a disaster recovery system of the present invention;

FIG. 4 is a schematic diagram of the physical structure of an electronic device according to the present invention.

Description of the reference numerals

Application server 10, control center 20, creation module 30, disaster recovery execution module 40, electronic device 50, processor 501, memory 502, bus 503.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

It should be noted that the terms "first," "second," and the like in the description and claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate in order to describe the embodiments of the present application described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

In the present application, the terms "upper", "lower", "left", "right", "front", "rear", "top", "bottom", "inner", "outer", "middle", "vertical", "horizontal", "lateral", "longitudinal" and the like indicate an azimuth or a positional relationship based on that shown in the drawings. These terms are used primarily to better describe the present application and its embodiments and are not intended to limit the indicated device, element or component to a particular orientation or to be constructed and operated in a particular orientation.

Also, some of the terms described above may be used to indicate other meanings in addition to orientation or positional relationships, for example, the term "upper" may also be used to indicate some sort of attachment or connection in some cases. The specific meaning of these terms in this application will be understood by those of ordinary skill in the art as appropriate.

In addition, the term "plurality" shall mean two as well as more than two.

It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.

Fig. 1 is a flowchart of an embodiment of an automatic switching method of block-level disaster recovery equipment according to the present invention, as shown in fig. 1, the method for automatically switching block-level disaster recovery equipment according to the embodiment of the present invention includes the following steps:

S101, monitoring the conditions of an application port and an IP address in real time;

specifically, whether the communication between the control center 20 and the gateway of the control center 20 fails or not is judged, if so, a self-healing script is executed; when the communication between the control center 20 and the gateway of the control center 20 is successful, judging whether the communication between the control center 20 and the gateway of the application server 10 is failed, if so, sending out a warning signal and executing a self-healing script;

A gateway is also called an internetwork connector or protocol converter. A gateway implements network interconnection above the network layer; it is a complex interconnection device used to connect two networks whose higher-layer protocols differ. Gateways can be used for both wide area network and local area network interconnection. A gateway is a computer system or device that performs conversion: between two systems that use different communication protocols, data formats, or languages, or even entirely different architectures, the gateway acts as a translator. Rather than simply forwarding information, the gateway repackages the received information to meet the needs of the destination system. It works at the application layer.

When communication between the control center 20 and its own gateway succeeds, and communication between the control center 20 and the gateway of the application server 10 also succeeds, whether the application port is closed is judged, and if so, the self-healing script is executed.

A "port" can be regarded as an outlet through which a device communicates with the outside world. Ports are divided into virtual ports and physical ports. Virtual ports are ports inside a computer or inside a switch or router that are not visible, such as ports 80, 21, and 23 on a computer. Physical ports, also called interfaces, are visible ports, such as the RJ45 network port on the back panel of a computer and the RJ45 ports of switches, routers, and hubs. The RJ11 jack used by telephones is also a physical port.

S102, executing a self-healing script when an application port and an IP address are closed;

specifically, the number of times and the interval time for executing the self-healing script are preset in the control center 20;

after executing the self-healing script, monitoring whether the application port and the IP address are closed again in the appointed time; if the application port and the IP address are not closed again, judging that the self-healing result is self-healing success;

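if the application port and the IP address are closed again, the self-healing result is judged to be a self-healing failure, and the application server 10 sends feedback information of the self-healing failure to the control center 20;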

the control center 20 stops executing the self-healing script.
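The re-check of the application port after a self-healing attempt can be illustrated with a plain TCP connection attempt. This is only a sketch: the patent does not say how the monitor probes the port, and the function name `is_port_open` and the one-second default timeout are assumptions made for illustration.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Hypothetical helper: a refused or timed-out connection is treated as
    "application port closed".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
```

A probe like this only shows that something is listening on the port; whether the application behind it is healthy is a separate question, which is why the method also checks CPU and memory data.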

S103, judging whether a continuous abnormal condition exists in the application server or not when the application port and the IP address are opened;

specifically, the monitoring data of the application server 10 are obtained in real time;

and comparing the monitoring data, judging whether the monitoring data has long-time CPU and memory occupation abnormality, and if so, judging that the application server 10 has abnormality.

S104, when the self-healing fails or the application server has continuous abnormal conditions, generating a snapshot creation instruction, deriving the snapshot creation instruction, automatically creating a virtual machine on a disaster recovery resource pool through the derived snapshot creation instruction, and automatically setting resources of the virtual machine according to the original production environment and preset resource limitations;

specifically, the application server 10 performs a port closing action on the open application port and the IP address;

and detecting whether the application server 10 is normally started, if so, monitoring the application port, the IP address and the running condition of the application server 10 after the application server 10 is normally started.

The disaster recovery system is composed of a control center 20, a disaster recovery execution module 40 and a client (application server 10).

The application server 10 has three functional modules: an agent module, a CDP module, and a monitor module. The agent module communicates with the disaster recovery execution module 40, executes tasks, and transmits data. The CDP module acts as a driver that captures real-time data and caches it locally. The monitor module monitors the state of the application server 10 and actively pushes the monitoring results to the control center 20; it pushes the CPU, memory, and network status and the open or closed state of the (designated) application port to the control center 20 every 3-5 seconds.

The control center 20 actively queries the task running status of the disaster recovery execution module 40 and issues tasks to it. It also receives the monitor information from the client, checks whether monitor reports are being returned, and from that infers whether the client IP and the application port are open.
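One way to picture how the control center could infer the client state from the monitor reports is the sketch below. The `MonitorReport` fields, the `infer_client_state` function, and the 15-second staleness window are all assumptions for illustration and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MonitorReport:
    timestamp: float    # seconds; when the control center received the push
    cpu_percent: float
    mem_percent: float
    port_open: bool     # state of the designated application port


def infer_client_state(last_report: Optional[MonitorReport],
                       now: float,
                       stale_after: float = 15.0) -> str:
    """Return 'unreachable', 'port_closed', or 'ok'.

    If no report has arrived within stale_after seconds, the client IP is
    treated as unreachable (hypothetical rule); otherwise the reported port
    state is used directly.
    """
    if last_report is None or now - last_report.timestamp > stale_after:
        return "unreachable"
    if not last_report.port_open:
        return "port_closed"
    return "ok"
```

The staleness window would need to be somewhat larger than the push interval so that a single delayed report does not count as an outage.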

The number of self-healing attempts and the interval between them are preset in the control center 20, and both are configurable. When the monitor module detects that the application port is closed, it automatically executes the preset self-healing script and monitors whether the application port is closed again within the designated time. If the application port self-heals successfully within the designated time, no other operation is performed.

If self-healing of the application port fails, the monitor module feeds this back to the control center 20 and stops the self-healing operation.
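The preset attempt count and interval suggest a bounded retry loop. The sketch below is one possible reading and not the patented implementation: `run_self_healing`, its parameters, and the default values are hypothetical, and the healing script and the port probe are passed in as callables.

```python
import time
from typing import Callable


def run_self_healing(heal: Callable[[], None],
                     port_is_open: Callable[[], bool],
                     max_attempts: int = 3,
                     interval_s: float = 10.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Run the self-healing script up to max_attempts times.

    Returns True as soon as the port is observed open after an attempt, and
    False if every attempt leaves the port closed (self-healing failure).
    """
    for _ in range(max_attempts):
        heal()                 # execute the preset self-healing script
        sleep(interval_s)      # wait the designated time before re-checking
        if port_is_open():
            return True
    return False
```

A `False` return corresponds to the point where the monitor reports the failure to the control center and stops further self-healing attempts.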

(1) The control center 20 confirms communication between itself and its own gateway. If this communication fails, the control center 20 itself has a problem, and no subsequent operation is performed.

(2) The control center 20 checks communication with the gateway of the application server 10. If (1) succeeds but communication between the control center 20 and the gateway of the application server 10 fails, the network between the control center 20 and the application server 10 has a problem; only an alarm is raised, and no other subsequent operation is performed.

(3) The control center 20 checks the state of the application port. If (1) succeeds, (2) succeeds, and (3) fails, the automatic switching flow is scheduled.
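The three checks above form a short decision cascade, sketched here under stated assumptions. The `Action` enum and the `decide` function are illustrative names that do not appear in the patent, and the three boolean inputs stand for the results of checks (1), (2), and (3).

```python
from enum import Enum


class Action(Enum):
    NONE = "none"                        # (1) failed: control center at fault
    ALARM_ONLY = "alarm_only"            # (2) failed: network path at fault
    SCHEDULE_SWITCH = "schedule_switch"  # (3) failed: application port closed
    HEALTHY = "healthy"                  # all three checks passed


def decide(center_gw_ok: bool, server_gw_ok: bool, app_port_ok: bool) -> Action:
    """Map the results of checks (1)-(3) to the follow-up action."""
    if not center_gw_ok:
        return Action.NONE
    if not server_gw_ok:
        return Action.ALARM_ONLY
    if not app_port_ok:
        return Action.SCHEDULE_SWITCH
    return Action.HEALTHY
```

Checking in this order keeps a fault in the control center or in the network path from being mistaken for an application failure, which would otherwise trigger an unnecessary switch.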

Judgment for the self-healing condition: after obtaining the self-healing failure result returned by the client, the control center 20 checks communication between the control center 20 and its own gateway, communication between the control center 20 and the gateway of the application server 10, and the state of the application port, and decides the subsequent operation according to the result.

Judgment for other conditions: this mainly covers persistent abnormal conditions of the application, such as CPU and memory usage. Short-term and long-term monitoring data are compared and analyzed. If CPU and memory usage suddenly becomes abnormally high and stays high for a long time (the duration and the number of occurrences are configurable), a memory leak or a CPU problem may exist, so that the application cannot provide service normally or can provide only partial service. Once this condition is met, communication between the control center 20 and its own gateway, communication between the control center 20 and the gateway of the application server 10, and the state of the application port are checked, and the switching flow is started when (1), (2), and (3) all succeed.
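A simple way to express "abnormally high for a long time" is a streak counter over consecutive monitor samples. The following is a sketch only: the 90 percent thresholds, the streak length, the function name `has_persistent_anomaly`, and the choice to flag a sample when either CPU or memory is high are assumptions, since the patent only says that the duration and count are configurable.

```python
from typing import Iterable, Tuple


def has_persistent_anomaly(samples: Iterable[Tuple[float, float]],
                           cpu_limit: float = 90.0,
                           mem_limit: float = 90.0,
                           min_consecutive: int = 20) -> bool:
    """samples: (cpu_percent, mem_percent) pairs in time order.

    Returns True if at least min_consecutive samples in a row exceed either
    limit; a single normal sample resets the streak.
    """
    run = 0
    for cpu, mem in samples:
        if cpu >= cpu_limit or mem >= mem_limit:
            run += 1
            if run >= min_consecutive:
                return True
        else:
            run = 0  # a normal sample breaks the streak
    return False
```

With a push every 3-5 seconds, a streak of 20 samples corresponds to roughly one to two minutes of sustained load, so short spikes do not trigger a switch.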

Automatic switching flow, driven by the application port judgment result:

Before switching: the open application port is blocked. The monitor program performs the closing action on the application port and takes the production network card offline or resets its IP address to 1.1.1.1; the next step of the switch is then performed (the switch proceeds only once the IP is no longer reachable).

During switching, the following steps are completed automatically: a snapshot creation instruction is generated and derived; a virtual machine is created on the disaster recovery resource pool through the derived snapshot creation instruction; resources such as the MAC address, VLAN number, CPU, and memory are set according to the primary production environment and the preset resource limitations; and the virtual machine is powered on automatically once this is complete.

After switching: whether the application server 10 has started normally is detected. Once it has, communication between the control center 20 and its own gateway, communication between the control center 20 and the gateway of the application server 10, the application port, and the IP address are checked. After the IP address, application port, CPU, memory, and so on have been checked, the detection result (including any abnormal conditions) is sent to the control center 20.
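The resource-setting part of the switch can be sketched as copying the production settings and capping them at the preset limits. Reading "preset resource limitations" as upper bounds on CPU and memory is an assumption, and `VmSpec` and `plan_vm_spec` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSpec:
    mac: str
    vlan: int
    cpus: int
    memory_mb: int


def plan_vm_spec(production: VmSpec, max_cpus: int, max_memory_mb: int) -> VmSpec:
    """Copy MAC and VLAN from production; cap CPU and memory at the limits."""
    return VmSpec(
        mac=production.mac,
        vlan=production.vlan,
        cpus=min(production.cpus, max_cpus),
        memory_mb=min(production.memory_mb, max_memory_mb),
    )
```

Keeping the MAC address and VLAN number identical to production is what lets the recovered virtual machine take over the original network identity, which is also why the production network card has to be taken offline before the switch.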

FIG. 2 is a block diagram of an embodiment of the automatic switching system for block-level disaster recovery, and FIG. 3 is a block diagram of the disaster recovery system of the present invention. As shown in FIGS. 2-3, the automatic switching system for block-level disaster recovery provided by the embodiment of the invention includes the following components:

the control center is used for judging whether the continuous abnormal condition exists in the application server or not when the application port and the IP address are opened; judging whether the communication between the control center and the control center gateway fails, if so, executing a self-healing script; when the communication between the control center and the control center gateway is successful, judging whether the communication between the control center and the application server gateway is failed, if so, sending out a warning signal and executing a self-healing script; and when the control center and the control center gateway are successfully communicated and the control center and the application server gateway are successfully communicated, judging whether the application port and the IP address are closed, and if so, executing a self-healing script.

The creation module 30 is used for, when self-healing fails or the application server has a persistent abnormal condition, generating a snapshot creation instruction, deriving the snapshot creation instruction, creating a virtual machine on a disaster recovery resource pool through the derived snapshot creation instruction, and setting hardware resources of the virtual machine based on the primary production environment.

FIG. 4 is a schematic diagram of the physical structure of an electronic device according to an embodiment of the present invention. As shown in FIG. 4, the electronic device 50 includes a processor 501, a memory 502, and a bus 503;

wherein, the processor 501 and the memory 502 complete the communication with each other through the bus 503;

the processor 501 is configured to invoke program instructions in the memory 502 to perform the methods provided by the above-described method embodiments, for example, including: monitoring the conditions of an application port and an IP address in real time; executing a self-healing script when the application port and the IP address are closed; when the application port and the IP address are opened, judging whether a continuous abnormal condition exists in the application server or not; when self-healing fails or the application server has continuous abnormal conditions, generating a snapshot creation instruction, deriving the snapshot creation instruction, creating a virtual machine on a disaster recovery resource pool through the derived snapshot creation instruction, and setting hardware resources of the virtual machine based on a primary production environment.

The present embodiment provides a non-transitory computer readable medium storing computer instructions that cause a computer to perform the methods provided by the above-described method embodiments, for example, including: monitoring the conditions of an application port and an IP address in real time; executing a self-healing script when the application port and the IP address are closed; when the application port and the IP address are opened, judging whether a continuous abnormal condition exists in the application server or not; when self-healing fails or the application server has continuous abnormal conditions, generating a snapshot creation instruction, deriving the snapshot creation instruction, creating a virtual machine on a disaster recovery resource pool through the derived snapshot creation instruction, and setting hardware resources of the virtual machine based on a primary production environment.

Those of ordinary skill in the art will appreciate that: all or part of the steps for implementing the above method embodiments may be implemented by hardware associated with program instructions, where the foregoing program may be stored in a computer readable medium, and when executed, the program performs steps including the above method embodiments; and the aforementioned medium includes: various media that can store program code, such as ROM, RAM, magnetic or optical disks.

The apparatus embodiments described above are merely illustrative, wherein elements illustrated as separate elements may or may not be physically separate, and elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.

From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable medium such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method of the respective embodiments or parts of the embodiments.

Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

The foregoing description is only of the preferred embodiments of the present application and is not intended to limit the same, but rather, various modifications and variations may be made by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application should be included in the protection scope of the present application.

Claims

1. The automatic switching method of the block-level disaster recovery equipment is characterized by comprising the following steps of:

blocking the open application port before switching;

in the switching process, when self-healing fails or the application server has continuous abnormal conditions, a snapshot creation instruction is generated, the snapshot creation instruction is derived, a virtual machine is created on a disaster recovery resource pool through the derived snapshot creation instruction, hardware resources of the virtual machine are set based on a primary production environment, and the hardware resources of the virtual machine are set based on the primary production environment and preset resource limitations; executing port closing action on the opened application port and the IP address through an application server;

after switching, detecting whether the application server is started normally, if so, monitoring the operation conditions of the application port, the IP address and the application server after the application server is started normally, and sending the operation conditions to a control center.

2. The automatic switching method of block-level disaster recovery according to claim 1, wherein the real-time monitoring of the application port and the IP address comprises:

3. The automatic switching method of block-level disaster recovery according to claim 1, wherein when said application port and said IP address are closed, executing a self-healing script comprises:

4. The automatic switching method of block-level disaster recovery according to claim 1, wherein after said executing a self-healing script, monitoring whether the application port and the IP address are closed again within a specified time comprises:

and the control center stops executing the self-healing script.

5. The automatic switching method of block-level disaster recovery according to claim 1, wherein when the application port and the IP address are open, determining whether a persistent abnormal condition exists in the application server includes:

acquiring monitoring data of the application server in real time;

6. An automatic switching system for block-level disaster recovery, comprising:

blocking the open application port before switching;

in the switching process, when the self-healing fails or the application server has continuous abnormal conditions, a creating module generates a snapshot creating instruction, derives the snapshot creating instruction, creates a virtual machine on a disaster recovery resource pool through the derived snapshot creating instruction, sets hardware resources of the virtual machine based on a primary production environment, and sets the hardware resources of the virtual machine based on the primary production environment and preset resource limitations; executing port closing action on the opened application port and the IP address through an application server;

7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 5 when the computer program is executed.

8. A non-transitory computer readable medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 5.