CN112732999A - Static disaster recovery method, system, electronic device and storage medium - Google Patents

Static disaster recovery method, system, electronic device and storage medium

Publication number
CN112732999A
Authority
CN
China
Prior art keywords
disaster recovery
url
disaster
static
state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110080886.5A
Other languages
Chinese (zh)
Other versions
CN112732999B (en)
Inventor
何嘉杰
邓玉
胡仲强
谢潇宇
林浪桥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CCB Finetech Co Ltd
Original Assignee
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CCB Finetech Co Ltd
Priority to CN202110080886.5A
Publication of CN112732999A
Application granted
Publication of CN112732999B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1734 Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention provides a static disaster recovery method, a static disaster recovery system, an electronic device and a storage medium, and relates to the technical field of data processing. The method comprises: after receiving a user request, judging whether a disaster recovery state is configured for the Uniform Resource Locator (URL) of the user request, where the disaster recovery state of the URL is updated according to log monitoring data and a disaster recovery triggering condition, and the log monitoring data is generated from an access log; if a disaster recovery state is configured for the URL service of the user request, reverse-proxying the user request to a disaster recovery cache of the static disaster recovery system, where the disaster recovery cache holds cache data crawled at regular intervals from the URL request history in the access log; and if no disaster recovery state is configured for the URL service of the user request, reverse-proxying the user request to a back-end server. With this method, the static disaster recovery system gains dynamic configuration support, intelligent monitoring and switching, and unified intelligent checking capabilities.

Description

Static disaster recovery method, system, electronic device and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a static disaster recovery method, a system, an electronic device, and a storage medium.
Background
To address reduced system service stability and even service crashes, most enterprises take precautionary measures in terms of system, process and system support; common system-level methods include service degradation and fault-tolerant design. For example, systems can generally be classified into three categories, from simple to complex, according to how their services use caches. The first category is systems whose services use a simple cache, which have a low probability of anomalies. The second category is systems whose services use a complex cache; such caches are typically controlled by the business layer and are affected by application-layer code. If the application code has a problem, the cache cannot be called correctly and pages display abnormally. The third category is systems whose services do not use a cache at all. Because the first category has a low probability of anomalies, it may be left without a fault-tolerant design. The second and third categories have a high probability of anomalies. For the second category, overall stability can be improved by introducing an intelligent static disaster recovery system. For the third category, the dynamic nature of the services means that stability cannot be improved by introducing a static disaster recovery system.
At present, one scheme for improving the overall stability of the second category of system with an intelligent static disaster recovery system is to introduce a fault-tolerance mechanism at the application code level, for example by setting a response time threshold and triggering a timeout mechanism when the service responds slowly, so as to perform fault-tolerant degradation. The common practice in fault-tolerant degradation is to return a uniform default data template so that all users see the same data, which degrades the service from "a thousand faces for a thousand users" (personalized content) to "one face for a thousand users" (uniform content). This implementation scheme mainly involves developers and can solve the problem in most scenarios, but the following hidden dangers exist in production: 1) because the application itself is unstable, the application as a whole may hang; 2) every application service has to introduce its own fault-tolerance mechanism, so the fault-tolerance code is scattered across different applications and is difficult to upgrade and maintain uniformly; 3) the scattered fault-tolerance code is difficult to test, and the overall static disaster recovery capability is difficult to evaluate.
Disclosure of Invention
Based on the problems in the prior art, embodiments of the present invention provide a static disaster recovery method, a system, an electronic device, and a storage medium.
In a first aspect, an embodiment of the present invention provides a static disaster recovery method applied to a static disaster recovery system, the method including: after receiving a user request, judging whether a disaster recovery state is configured for the Uniform Resource Locator (URL) of the user request, where the disaster recovery state of the URL is updated according to log monitoring data and a disaster recovery triggering condition, and the log monitoring data is generated from an access log; if a disaster recovery state is configured for the URL service of the user request, reverse-proxying the user request to a disaster recovery cache of the static disaster recovery system, where the disaster recovery cache holds cache data crawled at regular intervals from the URL request history in the access log; and if no disaster recovery state is configured for the URL service of the user request, reverse-proxying the user request to a back-end server.
Optionally, before judging whether a disaster recovery state is configured for the URL of the user request, the method further includes: judging the switch state of the static disaster recovery system, where the switch state is on or off; if the switch state is on, judging whether a disaster recovery state is configured for the URL service of the user request; and if the switch state is off, reverse-proxying the user request to the back-end server.
Optionally, after reverse-proxying the user request to a backend server, the method further includes: judging whether the returned result of the back-end server is a server error; and if the returned result of the back-end server is a server error, reversely proxying the user request to the disaster recovery cache of the static disaster recovery system.
Optionally, the method further comprises: and if the returned result of the back-end server is a normal result, returning the normal result to the user.
Optionally, the method further comprises: and acquiring a URL list supporting static disaster tolerance, crawling cache data from URL request historical records in an access log at regular time, and storing the cache data into a disaster tolerance cache.
Optionally, the method further comprises: reading a URL service list needing disaster tolerance; reading log monitoring data; and updating the disaster recovery state of the URL in the URL service list according to the log monitoring data and the disaster recovery triggering condition.
Optionally, updating the disaster recovery state of the URL according to the log monitoring data and the disaster recovery triggering condition includes: judging the state of each URL service in the URL service list according to the log monitoring data and the disaster recovery triggering condition; if the state of the URL service is abnormal, setting the URL service to the disaster recovery state, in which case the URL service has a configured disaster recovery state; and if the URL service has recovered from the abnormality, clearing its disaster recovery state, in which case the URL service has no configured disaster recovery state.
Optionally, the disaster recovery triggering condition includes: the response time is greater than or equal to the average response time threshold or tp90 response time threshold.
Optionally, before reading the log monitoring data, the method further includes: and collecting and analyzing the access log to generate log monitoring data.
Optionally, the reading the URL service list that needs disaster tolerance includes: reading the URL service list needing disaster tolerance from the pre-configured key configuration information; the URL service list needing disaster tolerance comprises resource names corresponding to the URL services needing static disaster tolerance, URLs, disaster tolerance triggering conditions and contacts; and the URL service list needing disaster tolerance is configured into the static disaster tolerance system after the URL service of the static disaster tolerance is identified by manual analysis.
Optionally, before the timed crawling of the cached data from the URL request history in the access log, the method further includes: and performing deduplication processing on the URL request history record in the access log.
Optionally, the key of an entry in the disaster recovery cache of the static disaster recovery system is URL + time version; the static disaster recovery system crawls cache data after collecting URL request history data for a first duration, and generates different time versions of a resource according to the crawling frequency.
The embodiment of the invention at least has the following beneficial effects:
1) Dynamic configuration support. Disaster recovery support for a URL service can be configured dynamically, and the system can add static disaster recovery for a newly supported service without restarting.
2) Intelligent monitoring and switching. The stability of a URL service is judged from the response state of its requests, and when an abnormal condition occurs, user requests are automatically proxied to the static disaster recovery cache system.
3) Unified intelligent checking. The system can be managed and maintained in a unified way, and relevant tests can be carried out to evaluate its static disaster recovery capability.
In a second aspect, an embodiment of the present application further provides a static disaster recovery system including a reverse proxy module, a log analysis module, a disaster recovery configuration module, a crawler module and a disaster recovery cache module. The crawler module is used for obtaining from the reverse proxy module a list of URLs that support static disaster recovery, and for crawling cache data at regular intervals from the URL request history in an access log into the disaster recovery cache module; the disaster recovery cache module holds the cache data crawled by the crawler module; the log analysis module is used for generating log monitoring data from the access log; the disaster recovery configuration module is used for updating the disaster recovery state of a URL service in the reverse proxy module according to the log monitoring data and a disaster recovery triggering condition; the reverse proxy module is used for judging, after receiving a user request, whether a disaster recovery state is configured for the Uniform Resource Locator (URL) of the user request; if a disaster recovery state is configured for the URL service of the user request, reverse-proxying the user request to the disaster recovery cache module; and if no disaster recovery state is configured, reverse-proxying the user request to a back-end server.
Optionally, the reverse proxy module is further configured to judge the switch state of the static disaster recovery system before judging whether a disaster recovery state is configured for the URL of the user request, where the switch state is on or off; if the switch state is on, to judge whether a disaster recovery state is configured for the URL service of the user request; and if the switch state is off, to reverse-proxy the user request to the back-end server.
Optionally, the reverse proxy module is further configured to determine whether a return result of the back-end server is a server error after the user request is reversely proxied to the back-end server; and if the returned result of the back-end server is a server error, reversely proxying the user request to the disaster recovery cache of the static disaster recovery system.
Optionally, the reverse proxy module is further configured to return the normal result to the user if the return result of the back-end server is the normal result.
Optionally, the disaster recovery configuration module is specifically configured to read a URL service list that needs disaster recovery; reading log monitoring data; and updating the disaster recovery state of the URL in the URL service list to the reverse proxy module according to the log monitoring data and the disaster recovery triggering condition.
Optionally, the disaster recovery configuration module is specifically configured to judge the state of each URL service in the URL service list according to the log monitoring data and the disaster recovery triggering condition; if the state of the URL service is abnormal, to request the reverse proxy module to set the URL service to the disaster recovery state, in which case the URL service has a configured disaster recovery state; and if the URL service has recovered from the abnormality, to request the reverse proxy module to clear the disaster recovery state of the URL service, in which case the URL service has no configured disaster recovery state.
Optionally, the disaster recovery triggering condition includes: the response time is greater than or equal to the average response time threshold or tp90 response time threshold.
Optionally, the log analysis module is specifically configured to collect and analyze the access log to generate log monitoring data.
Optionally, the disaster recovery configuration module is specifically configured to read the URL service list that needs disaster recovery from the preconfigured key configuration information; the URL service list needing disaster tolerance comprises resource names corresponding to the URL services needing static disaster tolerance, URLs, disaster tolerance triggering conditions and contacts; and the URL service list needing disaster tolerance is configured into the disaster tolerance configuration module after the static disaster tolerance URL service is identified by manual analysis.
Optionally, the crawler module is further configured to perform deduplication processing on the URL request history record in the access log.
Optionally, the key of an entry in the disaster recovery cache module is URL + time version; the crawler module crawls cache data after collecting URL request history data for a first duration, and the disaster recovery cache module generates different time versions of a resource according to the crawling frequency of the crawler module.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the electronic device is operating, the processor executing the machine-readable instructions to perform the steps of the method according to the first aspect.
In a fourth aspect, an embodiment of the present invention provides a storage medium, on which a computer program is stored, which, when executed by a processor, performs the steps of the method according to the first aspect.
The beneficial effects described in the second to fourth aspects above can be referred to the first aspect, and are not described herein again.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 shows a schematic structural diagram of a static disaster recovery system provided in an embodiment of the present application;
Fig. 2 shows a schematic processing flow diagram of a reverse proxy module provided in an embodiment of the present application;
Fig. 3 shows a schematic processing flow diagram of a disaster recovery configuration module provided in an embodiment of the present application;
Fig. 4 shows a schematic structural diagram of an electronic device provided in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. It should be understood that the drawings are for illustrative and descriptive purposes only and do not limit the scope of the present invention, and that the schematic drawings are not necessarily drawn to scale. The flowcharts used in this disclosure illustrate operations implemented according to some embodiments of the present invention. It should be understood that the operations of the flowcharts need not be performed in the order shown, and steps with no logical dependency on one another may be performed in reverse order or simultaneously. Guided by this disclosure, one skilled in the art may add one or more operations to a flowchart or remove one or more operations from it.
In addition, the described embodiments of the present invention are only some embodiments of the present invention, and not all embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the term "comprising" will be used in the embodiments of the invention to indicate the presence of the features stated hereinafter, but does not exclude the addition of further features. It should also be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. In the description of the present invention, it should also be noted that the terms "first", "second", "third", and the like are used for distinguishing the description, and are not intended to indicate or imply relative importance.
Enterprises always pursue stable operation of their system services, but some incidents can reduce service stability or even crash the service. Common factors that harm system service stability include: 1) lack of performance stress testing; 2) lack of high-availability design for the service; 3) lack of fault-tolerance mechanisms; 4) insufficient capacity estimation, so that a large number of slow queries occur under highly concurrent requests; 5) events such as machine room power failure, network disconnection and server hardware faults; 6) frequent changes of application versions.
To address the reduced stability and service crashes caused by factors 1) to 6) and the like, most enterprises take precautionary measures in terms of system, process and system support; common system-level methods include service degradation and fault-tolerant design.
For example, systems can be generally classified into 3 categories from simple to complex, depending on the different situations in which the system services use caches. The cache is a data pool for storing hot spot data (frequently used data), and is generally used to speed up data acquisition or reduce pressure on a system caused by directly reading original data.
The first category is systems whose services use a simple cache. A Content Delivery Network (CDN) cache can be configured directly, and the data held in the CDN cache generally carries no business logic, such as style files, js library files, pictures, audio/video and various documents. Such cache systems as a whole have a low probability of anomalies. The second category is systems whose services use a complex cache. An example is the currently popular "a thousand faces for a thousand users" personalization: big data and related technologies identify the different users accessing the system services and present different response content for the same request depending on the user, in order to achieve precise marketing. These caches are typically controlled by the business layer and are affected by application-layer code. If the application code has a problem, the cache cannot be called correctly and pages display abnormally. The third category is systems whose services do not use a cache at all, such as services for placing orders, reducing inventory, payment and returns, which must be persisted to a database.
Of the three categories, the first has a low probability of anomalies and may be left without a fault-tolerant design. The second and third categories have a high probability of anomalies. For the second category, overall stability can be improved by introducing an intelligent static disaster recovery system. For the third category, the dynamic nature of the services means that stability cannot be improved by introducing a static disaster recovery system. Static disaster recovery means serving static data when the service is abnormal.
At present, in a manner of improving the overall stability of the second type of system by introducing an intelligent static disaster recovery system, the static disaster recovery system generally has the following two implementation schemes.
The first scheme is to introduce a fault-tolerance mechanism at the application code level, for example by setting a response time threshold and triggering a timeout mechanism when the service responds slowly, so as to perform fault-tolerant degradation. The common practice in fault-tolerant degradation is to return a uniform default data template so that all users see the same data, which degrades the service from "a thousand faces for a thousand users" (personalized content) to "one face for a thousand users" (uniform content).
This implementation scheme mainly involves developers and can solve the problem in most scenarios, but the following hidden dangers exist in production: 1) because the application itself is unstable, the application as a whole may hang; 2) every application service has to introduce its own fault-tolerance mechanism, so the fault-tolerance code is scattered across different applications and is difficult to upgrade and maintain uniformly; 3) the scattered fault-tolerance code is difficult to test, and the overall static disaster recovery capability is difficult to evaluate.
The second scheme is to adopt certain screening and updating strategies, store response data of a user request into an independent cache system, enable a cache region to have hot data, and switch the user request to the independent cache system when abnormality occurs.
This implementation scheme mainly involves operation and maintenance personnel. The static disaster recovery system contains no business logic, so its stability depends only on the hardware, and in theory its stability is greatly improved over the first scheme. Its disadvantage is that, in extreme cases, the cached data itself is erroneous, and returning erroneous data to the user defeats the purpose of static disaster recovery.
In view of the above problems in existing implementation schemes, an embodiment of the present invention provides a static disaster recovery system that improves the stability of the second category of system service (which uses a complex cache) while giving the system dynamic configuration support, intelligent monitoring and switching, and unified intelligent checking.
Dynamic configuration support means that disaster recovery support for a Uniform Resource Locator (URL) service can be configured dynamically, so the system can add static disaster recovery for a newly supported service without restarting. Intelligent monitoring and switching means that the stability of a URL service can be judged from the response state of its requests, and that user requests are automatically proxied to the static disaster recovery cache system when an abnormal condition occurs. Unified intelligent checking means that the static disaster recovery capability of the system can be evaluated through unified management and maintenance of the static disaster recovery system.
The static disaster recovery system provided by the embodiment of the present application is exemplarily described below with reference to the drawings.
Fig. 1 shows a schematic structural diagram of a static disaster recovery system provided in an embodiment of the present application. As shown in fig. 1, the static disaster recovery system includes: the system comprises a reverse proxy module, a log analysis module, a disaster recovery configuration module, a crawler module and a disaster recovery cache module.
The reverse proxy module routes user requests, deciding whether to send each request to a back-end server or to the static disaster recovery system.
The log analysis module collects access logs uniformly, manages them centrally and analyzes them in order to monitor the real-time state of each URL service, such as the average response time, tp90 and tp99. Illustratively, tp99 is typically used to reflect the response capability of a service: 99% of requests receive a response within this time.
The disaster recovery configuration module holds key configuration information, which includes the resource name, URL, disaster recovery triggering condition, contact and the like for each URL service that needs static disaster recovery. The key configuration information can be configured into the disaster recovery configuration module after the URL services suitable for static disaster recovery have been identified through manual analysis.
The crawler module obtains from the reverse proxy module the list of URLs that support static disaster recovery, deduplicates the URL request history (including parameters) in the access log, and crawls data at regular intervals.
The disaster recovery cache module stores the cached data; the cache key is URL + time version. Specifically, the crawler module crawls the URL request history data collected within a certain period, so the cache module can generate different time versions of a resource according to the crawling frequency.
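The metrics named above can be computed from per-request response times, as in the following minimal Python sketch. It assumes the nearest-rank definition of a percentile and response times in milliseconds; the function names `percentile` and `summarize` are illustrative and do not come from the patent.

```python
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct% of all samples are less than or equal to it (pct is an integer)."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding issues.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def summarize(response_times_ms):
    """Build the monitoring record for one URL service (assumed layout)."""
    return {
        "avg": mean(response_times_ms),
        "tp90": percentile(response_times_ms, 90),
        "tp99": percentile(response_times_ms, 99),
    }


samples = [10, 20, 30, 40, 50, 60, 70, 80, 90, 1000]
stats = summarize(samples)
```

A single slow request dominates tp99 here while leaving tp90 untouched, which is why tail percentiles are usually monitored alongside the average.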
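As a hedged illustration, one configuration entry and the state update it drives might look like the Python sketch below. The field names, the threshold values and the reading of the triggering condition (average response time at or above its threshold, or tp90 at or above its threshold) are assumptions made for the example; the patent does not fix a data layout.

```python
from dataclasses import dataclass


@dataclass
class DisasterRecoveryEntry:
    """One URL service needing static disaster recovery (hypothetical layout)."""
    resource_name: str
    url: str
    avg_threshold_ms: float   # triggering condition: average response time
    tp90_threshold_ms: float  # triggering condition: tp90 response time
    contact: str


def is_abnormal(entry, stats):
    """Assumed reading of the triggering condition: either metric reaches its threshold."""
    return (stats["avg"] >= entry.avg_threshold_ms
            or stats["tp90"] >= entry.tp90_threshold_ms)


def update_states(entries, monitoring, dr_urls):
    """Set or clear the disaster recovery state of each configured URL.

    monitoring maps URL -> {"avg": ..., "tp90": ...}; dr_urls is the set of
    URLs currently in the disaster recovery state and is updated in place.
    """
    for entry in entries:
        stats = monitoring.get(entry.url)
        if stats is None:
            continue  # assumption: no monitoring data leaves the state unchanged
        if is_abnormal(entry, stats):
            dr_urls.add(entry.url)      # abnormal: set disaster recovery state
        else:
            dr_urls.discard(entry.url)  # recovered: clear disaster recovery state
    return dr_urls


entries = [
    DisasterRecoveryEntry("home page", "/home", 500, 800, "ops@example.com"),
    DisasterRecoveryEntry("list page", "/list", 500, 800, "ops@example.com"),
]
monitoring = {"/home": {"avg": 600, "tp90": 700}, "/list": {"avg": 100, "tp90": 200}}
active = update_states(entries, monitoring, {"/list"})
```

In this run `/home` enters the disaster recovery state because its average exceeds the threshold, and `/list` leaves it because both metrics are back under their thresholds.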
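The deduplication step can be sketched as follows. This Python example assumes that a logged request belongs to a supported URL when its path (the part before the query string) is in the supported list; the patent does not say how the match is made, and the function name is illustrative.

```python
from urllib.parse import urlsplit


def deduplicate_history(requested_urls, supported_paths):
    """Return the unique requested URLs (parameters included) whose path
    supports static disaster recovery, keeping first-seen order."""
    seen = set()
    unique = []
    for url in requested_urls:
        if urlsplit(url).path not in supported_paths:
            continue  # this URL service is not configured for static disaster recovery
        if url not in seen:
            seen.add(url)
            unique.append(url)
    return unique


history = ["/products?id=1", "/products?id=2", "/products?id=1", "/cart?u=9"]
to_crawl = deduplicate_history(history, {"/products"})
```

Because parameters are part of the key, `/products?id=1` and `/products?id=2` are crawled separately, while the repeated request is crawled only once.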
The introduction of the temporal version in the disaster recovery cache module can solve the following problems: in extreme cases, the data of the disaster recovery system is wrong (for example, the backend service returns the status code of 200, but the response content is empty or wrong information), and at this time, the URL cache can be switched to the previous time version data. When a request does not have a static disaster recovery cache within the latest temporal version, the latest temporal version data may be used.
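A minimal Python sketch of a cache keyed by URL + time version is given below. The class name, the `bad_versions` parameter and the timestamp format are assumptions for illustration; the embodiment does not specify how a faulty version is detected or marked.

```python
class VersionedCache:
    """Disaster recovery cache whose key is (URL, time version)."""

    def __init__(self):
        self._store = {}  # (url, time_version) -> cached response body

    def put(self, url, time_version, body):
        self._store[(url, time_version)] = body

    def lookup(self, url, bad_versions=()):
        """Return (time_version, body) from the newest usable version of url,
        or None when nothing is cached for it."""
        versions = sorted(
            (v for (u, v) in self._store if u == url), reverse=True
        )
        for version in versions:
            if version not in bad_versions:
                return version, self._store[(url, version)]
        return None


cache = VersionedCache()
# Time versions are hypothetical sortable timestamps, one per crawl cycle.
cache.put("/api/rates?ccy=USD", "20210121T1000", "rates v1")
cache.put("/api/rates?ccy=USD", "20210121T1100", "")  # crawl stored an empty body
cache.put("/api/news", "20210121T1000", "news v1")    # absent from the 11:00 crawl
```

Passing the faulty version in `bad_versions` models the switch back to the previous time version, while `/api/news` shows the fallback to the most recent version that contains an entry.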
The specific implementation logics of the reverse proxy module and the disaster recovery configuration module are respectively introduced below.
Fig. 2 is a schematic processing flow diagram of a reverse proxy module according to an embodiment of the present application.
As shown in fig. 2, the processing flow of the reverse proxy module may include:
S201, receiving a user request.
S202, judging whether the static disaster recovery system is started or not.
If the switch state of the static disaster recovery system is on, executing S203; if the switch state of the static disaster recovery system is off, S205 is executed.
S203, judging whether the URL service requested by the user is configured with a disaster tolerance state.
If the URL service requested by the user is configured with a disaster tolerance state (i.e., yes), S204 is executed; if it is not configured with a disaster tolerance state (i.e., no), S205 is executed.
S204, reversely proxying the user request to the disaster recovery caching module.
S205, the user request is reversely proxied to the back-end server.
After S205 is executed, S206 is executed next.
S206, judging whether the return result of the back-end server is a server error.
If the return result of the back-end server is a server error, S204 is executed; if the return result of the back-end server is a normal result, the normal result is returned to the user and the process ends.
It can be understood that if the user request is reversely proxied to the disaster recovery caching module, the cached data in the disaster recovery caching module is returned to the user.
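The flow S201 to S206 can be summarized in a short Python sketch. The function and parameter names are hypothetical, the back end and the cache are passed in as plain callables, and interpreting "server error" as an HTTP 5xx status code is an assumption of this sketch.

```python
def route_request(url, switch_on, dr_urls, call_backend, read_cache):
    """Return the response body for one user request (S201)."""
    # S202 + S203: the switch is on and the URL is in the disaster tolerance state.
    if switch_on and url in dr_urls:
        return read_cache(url)            # S204
    status, body = call_backend(url)      # S205
    if 500 <= status <= 599:              # S206 (assumption: server error = HTTP 5xx)
        return read_cache(url)            # S204
    return body                           # normal result goes back to the user


# Hypothetical stand-ins for the back-end server and the disaster recovery cache.
def healthy_backend(url):
    return 200, "live:" + url


def failing_backend(url):
    return 502, ""


def cache(url):
    return "cached:" + url
```

Note that, following Fig. 2 as described, a server error from the back end leads to S204 even when the switch is off, because S206 is reached on every path through S205.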
Fig. 3 shows a schematic processing flow diagram of a disaster recovery configuration module according to an embodiment of the present application.
As shown in fig. 3, the processing flow of the disaster recovery configuration module may include:
S301, reading a URL service list needing disaster tolerance.
S302, reading log monitoring data.
S303, judging the state of the URL service in the URL service list according to the log monitoring data and the disaster recovery triggering condition.
If the status of the URL service changes from normal to abnormal (i.e., an abnormality occurs), executing S304; if the status of the URL service is changed from abnormal to normal (i.e., abnormal recovery), S305 is performed.
S304, requesting the reverse proxy module to set the URL service to the disaster tolerance state.
S305, requesting the reverse proxy module to clear the disaster tolerance state of the URL service.
Optionally, a common disaster recovery triggering condition is related to the response time and includes an average response time threshold and a tp90 response time threshold.
The static disaster recovery system provided by the embodiment of the invention is a highly available, highly stable and intelligent service system. It can provide degradation and fault tolerance when the back-end service is abnormal, which improves the stability of the overall service and reduces the occurrence of major production incidents.
Optionally, in the embodiment of the present invention, the operating system used for implementing the static disaster recovery system may be CentOS 7.4, the Web server may be Nginx 1.12.2, the Lua just-in-time compiler may be LuaJIT-2.1.0-beta3, and the third-party Nginx module may be lua-nginx-module.
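Steps S301 to S305 can be sketched in Python as follows. The field names, threshold values and the use of a shared set to stand in for the requests sent to the reverse proxy module are assumptions for illustration only.

```python
def is_abnormal(metrics, service):
    """Triggering condition: average or tp90 response time reaches its threshold."""
    return (
        metrics["avg"] >= service["avg_threshold_ms"]
        or metrics["tp90"] >= service["tp90_threshold_ms"]
    )


def update_states(services, monitoring, dr_urls):
    """Set or clear the disaster tolerance state of each configured URL service.

    dr_urls is the set of URLs currently in the disaster tolerance state; in the
    described system this state lives in the reverse proxy module.
    """
    for service in services:                      # S301
        metrics = monitoring.get(service["url"])  # S302
        if metrics is None:
            continue                              # no log data yet for this URL
        abnormal = is_abnormal(metrics, service)  # S303
        in_dr = service["url"] in dr_urls
        if abnormal and not in_dr:
            dr_urls.add(service["url"])           # S304: normal -> abnormal
        elif not abnormal and in_dr:
            dr_urls.discard(service["url"])       # S305: abnormal -> normal
    return dr_urls


# Hypothetical configuration and monitoring data.
services = [
    {"url": "/api/rates", "avg_threshold_ms": 500, "tp90_threshold_ms": 1000},
    {"url": "/api/news", "avg_threshold_ms": 500, "tp90_threshold_ms": 1000},
]
monitoring = {
    "/api/rates": {"avg": 620, "tp90": 900},  # average exceeds its threshold
    "/api/news": {"avg": 80, "tp90": 150},    # has recovered
}
state = update_states(services, monitoring, {"/api/news"})
```

Running this pass periodically gives the automatic switching described above: `/api/rates` enters the disaster tolerance state and `/api/news` leaves it.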
As can be seen from the above, the static disaster recovery system provided in the embodiment of the present invention has capabilities such as dynamic service configuration support, intelligent monitoring and switching, and unified intelligent checking. The URL service resources that need static disaster tolerance are identified through manual configuration, and a static disaster tolerance solution with dynamic configuration, intelligent monitoring and automatic switching is realized through the five system modules of the static disaster recovery system, which have clear responsibilities, high cohesion and low coupling. In addition, a time version is introduced into the disaster recovery cache module of the static disaster recovery system, which addresses the problems of erroneous cached data and missing cached data in extreme conditions.
For example, the static disaster recovery system provided by the embodiment of the present invention has at least the following beneficial effects:
1) Dynamic configuration support. Disaster tolerance for URL services can be configured dynamically, and static disaster tolerance can be added for a new service without restarting the system.
2) Intelligent monitoring and switching. The stability of a URL service is judged from the response state of its requests, and user requests are automatically proxied to the static disaster recovery cache when an abnormal condition occurs.
3) Unified intelligent checking. The system can be managed and maintained in a unified manner, and related tests can be run to evaluate its static disaster recovery capability.
Based on the static disaster recovery system provided in the foregoing embodiment, an embodiment of the present invention further provides a static disaster recovery method applied to a static disaster recovery system. The method may include: after receiving a user request, judging whether the uniform resource locator (URL) service requested by the user is configured with a disaster tolerance state. The disaster tolerance state of the URL is updated according to the log monitoring data and the disaster recovery triggering condition, and the log monitoring data is generated from the access log. If the URL service requested by the user is configured with the disaster tolerance state, the user request is reverse-proxied to the disaster recovery cache of the static disaster recovery system. The disaster recovery cache of the static disaster recovery system holds cache data crawled at regular intervals from the URL request history records in the access log. If the URL service requested by the user is not configured with the disaster tolerance state, the user request is reverse-proxied to the back-end server.
Optionally, before judging whether the URL service requested by the user is configured with the disaster tolerance state, the method further includes: judging the switch state of the static disaster recovery system, where the switch state is either on or off. If the switch state of the static disaster recovery system is on, it is judged whether the URL service requested by the user is configured with the disaster tolerance state. If the switch state of the static disaster recovery system is off, the user request is reverse-proxied to the back-end server.
Optionally, after reverse-proxying the user request to the backend server, the method further includes: judging whether a returned result of the back-end server is a server error; and if the returned result of the back-end server is a server error, reversely proxying the user request to the disaster recovery cache of the static disaster recovery system.
Optionally, the method further comprises: and if the returned result of the back-end server is a normal result, returning the normal result to the user.
Optionally, the method further comprises: and acquiring a URL list supporting static disaster tolerance, crawling cache data from URL request historical records in an access log at regular time, and storing the cache data into a disaster tolerance cache.
Optionally, the method further comprises: reading a URL service list needing disaster tolerance; reading log monitoring data; and updating the disaster recovery state of the URL in the URL service list according to the log monitoring data and the disaster recovery triggering condition.
Optionally, updating the disaster recovery state of the URL according to the log monitoring data and the disaster recovery triggering condition includes: judging the state of each URL service in the URL service list according to the log monitoring data and the disaster recovery triggering condition; if the state of the URL service is abnormal, setting the URL service to the disaster tolerance state, after which the URL service is in the configured disaster tolerance state; if the URL service has recovered from the abnormality, clearing the disaster tolerance state of the URL service, after which the URL service is in the disaster tolerance unconfigured state.
Optionally, the disaster recovery triggering condition includes: the response time is greater than or equal to the average response time threshold or tp90 response time threshold.
Optionally, before reading the log monitoring data, the method further includes: and collecting and analyzing the access log to generate log monitoring data.
Optionally, the reading the URL service list that needs disaster tolerance includes: reading a URL service list needing disaster tolerance from the pre-configured key configuration information; the URL service list needing disaster tolerance comprises a resource name corresponding to the URL service needing static disaster tolerance, a URL, a condition for triggering disaster tolerance and a contact person; the URL service list needing disaster tolerance is configured in the static disaster tolerance system after the URL service of the static disaster tolerance is identified by manual analysis.
Optionally, before the timed crawling of the cached data from the URL request history in the access log, the method further includes: and performing deduplication processing on the URL request history record in the access log.
Optionally, the cache key in the disaster recovery cache of the static disaster recovery system is URL + time version; the static disaster recovery system crawls the cache data after collecting the URL request history data for a given period of time, and generates different time versions of the corresponding resources according to the crawling frequency.
It can be understood that the static disaster recovery method may be implemented based on the module architecture of the static disaster recovery system described in the foregoing embodiment, or may be implemented based on a static disaster recovery system of another architecture. The embodiments of the present application do not limit this. Each module in the static disaster recovery system for implementing the static disaster recovery method can form a static disaster recovery device.
The static disaster recovery device may be integrated in a server, a computer, or other devices, and the present invention is not limited thereto. It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the apparatus may refer to the corresponding process of the method described in the foregoing method embodiment, and is not described in detail herein.
It should be understood that the above-described apparatus embodiments are merely exemplary, and that the apparatus and method disclosed in the embodiments of the present invention may be implemented in other ways. For example, the division into modules is only a logical functional division, and other divisions are possible in an actual implementation: multiple modules or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the shown or discussed mutual coupling, direct coupling or communication connection may be an indirect coupling or communication connection of devices or modules through some communication interfaces, and may be in an electrical, mechanical or other form. In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present invention or parts thereof which substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a processor to execute the steps of all or part of the method according to the embodiments of the present invention.
That is, those skilled in the art will appreciate that embodiments of the present invention may be implemented in any form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects.
Based on this, the embodiment of the present invention further provides a program product, which may be a storage medium such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk or an optical disk. The storage medium may store a computer program, and when the computer program is executed by a processor, the steps of the method described in the foregoing method embodiment are performed. The specific implementation and technical effects are similar, and are not described herein again.
Optionally, an embodiment of the present invention further provides an electronic device, where the electronic device may be a server, a computer, or a like device, and fig. 4 illustrates a schematic structural diagram of the electronic device provided in the embodiment of the present invention.
As shown in fig. 4, the electronic device may include: a processor 401, a storage medium 402 and a bus 403, the storage medium 402 storing machine-readable instructions executable by the processor 401, the processor 401 and the storage medium 402 communicating via the bus 403 when the electronic device is operated, the processor 401 executing the machine-readable instructions to perform the steps of the method as described in the previous embodiments. The specific implementation and technical effects are similar, and are not described herein again.
For ease of illustration, only one processor is described in the above electronic device. However, it should be noted that in some embodiments, the electronic device in the present invention may further include multiple processors, and thus, the steps performed by one processor described in the present invention may also be performed by multiple processors in combination or individually.
The above description covers only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (15)

1. A static disaster recovery method, applied to a static disaster recovery system, comprising the following steps:
after receiving a user request, judging whether the Uniform Resource Locator (URL) requested by the user is configured with a disaster tolerance state; the disaster tolerance state of the URL is updated according to log monitoring data and a disaster recovery triggering condition, wherein the log monitoring data are generated according to access logs;
if the URL service requested by the user is configured with a disaster tolerance state, reversely proxying the user request to a disaster recovery cache of the static disaster recovery system, wherein the disaster recovery cache stores cache data regularly crawled from URL request history records in the access logs;
and if the URL service requested by the user is not configured with a disaster tolerance state, reversely proxying the user request to a back-end server.
2. The method of claim 1, wherein before judging whether the Uniform Resource Locator (URL) requested by the user is configured with a disaster tolerance state, the method further comprises:
judging the switch state of the static disaster recovery system, wherein the switch state of the static disaster recovery system is on or off;
if the switch state of the static disaster recovery system is on, judging whether the URL service requested by the user is configured with a disaster tolerance state;
and if the switch state of the static disaster recovery system is closed, reversely proxying the user request to a back-end server.
3. The method of claim 2, wherein after reverse proxying the user request into a backend server, the method further comprises:
judging whether the returned result of the back-end server is a server error;
and if the returned result of the back-end server is a server error, reversely proxying the user request to the disaster recovery cache of the static disaster recovery system.
4. The method of claim 3, further comprising:
and if the returned result of the back-end server is a normal result, returning the normal result to the user.
5. The method of claim 1, further comprising:
and acquiring a URL list supporting static disaster tolerance, crawling cache data from URL request historical records in an access log at regular time, and storing the cache data into a disaster tolerance cache.
6. The method of claim 1, further comprising:
reading a URL service list needing disaster tolerance;
reading log monitoring data;
and updating the disaster recovery state of the URL in the URL service list according to the log monitoring data and the disaster recovery triggering condition.
7. The method of claim 6, wherein the updating the disaster recovery status of the URL according to the log monitoring data and the disaster recovery triggering condition comprises:
judging the state of the URL service in the URL service list according to the log monitoring data and the disaster tolerance triggering condition;
if the state of the URL service is abnormal, setting the URL service as a disaster tolerance state, wherein the configured disaster tolerance state is obtained after the URL service is set as the disaster tolerance state;
and if the URL service has recovered from the abnormality, clearing the disaster tolerance state of the URL service, wherein the cleared disaster tolerance state of the URL service is a disaster tolerance unconfigured state.
8. The method of claim 7, wherein the disaster recovery trigger condition comprises:
the response time is greater than or equal to the average response time threshold or tp90 response time threshold.
9. The method of claim 6, wherein prior to reading log monitoring data, the method further comprises:
and collecting and analyzing the access log to generate log monitoring data.
10. The method of claim 6, wherein reading the list of URL services that require disaster recovery comprises:
reading the URL service list needing disaster tolerance from the pre-configured key configuration information;
the URL service list needing disaster tolerance comprises resource names corresponding to the URL services needing static disaster tolerance, URLs, disaster tolerance triggering conditions and contacts; and the URL service list needing disaster tolerance is configured into the static disaster tolerance system after the URL service of the static disaster tolerance is identified by manual analysis.
11. The method of claim 5, wherein prior to the timed crawling of cached data from the URL request history in the access log, the method further comprises:
and performing deduplication processing on the URL request history record in the access log.
12. The method according to claim 1, wherein the key cached in the disaster recovery cache of the static disaster recovery system is a URL + temporal version;
the static disaster recovery system crawls the cache data after collecting the URL request history data for a given period of time, and generates different time versions of the corresponding resources according to the crawling frequency.
13. A static disaster recovery system, the static disaster recovery system comprising: the system comprises a reverse proxy module, a log analysis module, a disaster recovery configuration module, a crawler module and a disaster recovery cache module;
the crawler module is used for acquiring a URL list supporting static disaster tolerance from the reverse proxy module and regularly crawling cache data from a URL request historical record in an access log into the disaster tolerance cache module; cache data crawled by the crawler module is cached in the disaster recovery cache module;
the log analysis module is used for generating log monitoring data according to the access log; the disaster recovery configuration module is used for updating the disaster recovery state of the URL service to the reverse proxy module according to the log monitoring data and the disaster recovery triggering condition;
the reverse proxy module is used for judging, after receiving a user request, whether the Uniform Resource Locator (URL) requested by the user is configured with a disaster tolerance state; if the URL service requested by the user is configured with a disaster tolerance state, the user request is reversely proxied to the disaster recovery cache module; and if the URL service requested by the user is not configured with a disaster tolerance state, the user request is reversely proxied to a back-end server.
14. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating over the bus when the electronic device is operating, and the processor executing the machine-readable instructions to perform the method of any one of claims 1-12.
15. A storage medium, characterized in that the storage medium has stored thereon a computer program which, when being executed by a processor, performs the method according to any one of claims 1-12.
CN202110080886.5A 2021-01-21 2021-01-21 Static disaster recovery method, system, electronic equipment and storage medium Active CN112732999B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110080886.5A CN112732999B (en) 2021-01-21 2021-01-21 Static disaster recovery method, system, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110080886.5A CN112732999B (en) 2021-01-21 2021-01-21 Static disaster recovery method, system, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112732999A true CN112732999A (en) 2021-04-30
CN112732999B CN112732999B (en) 2023-06-09

Family

ID=75593540

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110080886.5A Active CN112732999B (en) 2021-01-21 2021-01-21 Static disaster recovery method, system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112732999B (en)

Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120209807A1 (en) * 2009-10-20 2012-08-16 Zte Corporation Method and apparatus for data disaster tolerance preprocessing, and service control point
CN103793538A (en) * 2014-03-06 2014-05-14 赛特斯信息科技股份有限公司 System and method for realizing restoration of web service in case of crash of database
CN105528422A (en) * 2015-12-07 2016-04-27 中国建设银行股份有限公司 Focused crawler processing method and apparatus
CN106156231A (en) * 2015-04-24 2016-11-23 阿里巴巴集团控股有限公司 A kind of website disaster recovery method, Apparatus and system
WO2017107812A1 (en) * 2015-12-21 2017-06-29 阿里巴巴集团控股有限公司 User log storage method and device
CN108170561A (en) * 2018-01-03 2018-06-15 杭州时趣信息技术有限公司 A kind of disaster-tolerant backup method, apparatus and system
CN108737470A (en) * 2017-04-19 2018-11-02 贵州白山云科技有限公司 A kind of access request time source method and apparatus
US20190095293A1 (en) * 2016-07-27 2019-03-28 Tencent Technology (Shenzhen) Company Limited Data disaster recovery method, device and system
CN109842518A (en) * 2018-12-13 2019-06-04 平安科技(深圳)有限公司 Content distributing network disaster recovery method, device, computer equipment and storage medium
CN110113224A (en) * 2019-03-19 2019-08-09 深圳壹账通智能科技有限公司 Capacity monitor method, apparatus, computer equipment and storage medium
CN110784498A (en) * 2018-07-31 2020-02-11 阿里巴巴集团控股有限公司 Personalized data disaster tolerance method and device
CN111414523A (en) * 2020-03-11 2020-07-14 中国建设银行股份有限公司 Data acquisition method and device
CN111767495A (en) * 2019-04-01 2020-10-13 北京沃东天骏信息技术有限公司 Method and system for synthesizing webpage
CN111866205A (en) * 2020-06-17 2020-10-30 新浪网技术(中国)有限公司 Method and system for converting IP address into position information
CN112000394A (en) * 2020-08-27 2020-11-27 北京百度网讯科技有限公司 Method, apparatus, device and storage medium for accessing an applet
CN112181723A (en) * 2020-09-22 2021-01-05 中国建设银行股份有限公司 Financial disaster recovery method and device, storage medium and electronic equipment
CN112202631A (en) * 2020-09-17 2021-01-08 北京金山云网络技术有限公司 Resource access method, device and system, electronic equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
诸葛钢铁云: "Nginx+upstream针对后端服务器容错的配置说明", Retrieved from the Internet <URL:https://blog.csdn.net/jj1130050965/article/details/110947245> *

Also Published As

Publication number Publication date
CN112732999B (en) 2023-06-09

Similar Documents

Publication Publication Date Title
US11023355B2 (en) Dynamic tracing using ranking and rating
US8082471B2 (en) Self healing software
Castelli et al. Proactive management of software aging
CN107818431B (en) Method and system for providing order track data
US11093349B2 (en) System and method for reactive log spooling
CN105743730B (en) The method and its system of real time monitoring are provided for the web service of mobile terminal
US20110314138A1 (en) Method and apparatus for cause analysis configuration change
US20160224400A1 (en) Automatic root cause analysis for distributed business transaction
KR100947740B1 (en) System and method for monitoring event in computing network and event management apparatus
CN103488793A (en) User behavior monitoring method based on information retrieval
CN111522703A (en) Method, apparatus and computer program product for monitoring access requests
CN113687974A (en) Client log processing method and device and computer equipment
US8959507B2 (en) Bookmarks and performance history for network software deployment evaluation
US11966884B2 (en) Using distributed databases for network regression analysis
US20120054324A1 (en) Device, method, and storage medium for detecting multiplexed relation of applications
CN108647284B (en) Method and device for recording user behavior, medium and computing equipment
US20140067912A1 (en) System for Remote Server Diagnosis and Recovery
US10180914B2 (en) Dynamic domain name service caching
CN112100035A (en) Page abnormity detection method, system and related device
CN106685744A (en) Fault elimination method, apparatus and system
CN112732999A (en) Static disaster recovery method, system, electronic device and storage medium
US20160085638A1 (en) Computer system and method of identifying a failure
CN113312320A (en) Method and system for acquiring user operation database behavior
CN114077510B (en) Method and device for positioning and displaying fault root cause
CN116701423A (en) Method, device, equipment and storage medium for updating operation logic library

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant