CN112929438B - Business processing method and device of double-site distributed database - Google Patents

Business processing method and device of double-site distributed database

Info

Publication number
CN112929438B
Authority
CN
China
Prior art keywords
copies
service
copy
site
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110166381.0A
Other languages
Chinese (zh)
Other versions
CN112929438A (en)
Inventor
王君轶
黄颢
王爽
陈镛先
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC filed Critical Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202110166381.0A priority Critical patent/CN112929438B/en
Publication of CN112929438A publication Critical patent/CN112929438A/en
Application granted granted Critical
Publication of CN112929438B publication Critical patent/CN112929438B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1095Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0654Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0668Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0677Localisation of faults
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1097Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Hardware Redundancy (AREA)

Abstract

The application provides a service processing method and device for a double-site distributed database, which can be used in the technical field of big data. The method comprises: if all current copies are normal, setting the copy parameter to the total number of copies, and proceeding to the next service only after the service data of the currently processed service has been synchronized to all copies; if the number of currently abnormal copies is less than the majority of all copies, setting the copy parameter to the majority of the number of currently normal copies, and proceeding to the next service after the service data of the currently processed service has been synchronized to a majority of the currently normal copies, where the majority is a positive integer greater than half of the corresponding total. The invention solves the problem of data consistency between the primary and standby sites of a double-site deployment and ensures zero data loss when a site-level fault occurs. After a disaster, automatic fault detection and dynamic parameter adjustment reduce manual intervention, which lowers financial business risk and production operation and maintenance risk.

Description

Business processing method and device of double-site distributed database
Technical Field
The application relates to the technical field of computers, in particular to a business processing method and device for a double-site distributed database.
Background
With the progress of information technology, digital finance has developed at an unprecedented pace. Data has become ever more important to financial enterprises, and data loss or business interruption causes immeasurable losses, especially in the banking industry. With advances in database technology and the wave of IT architecture transformation, more and more large financial institutions are migrating data from traditional databases to distributed databases to improve data security and availability. At present, almost all distributed databases are built on a distributed consistency protocol, under which a majority of the copies of the data are kept strongly consistent. The number of copies is generally set to 3, or to an odd number greater than 3. To protect the data against site-level faults, at least 3 sites would have to be built in the same city. In practice, however, even the banking industry, which has the highest security requirements, builds at most a two-center architecture in one city, or a two-place, three-center architecture. As a result, one site in the city necessarily holds the majority of the copies (the primary site). A disaster at the primary site not only interrupts service but also leaves the standby site (the minority-copy site) missing part of the data, because data is replicated asynchronously between the primary and standby sites. This has a severe impact and also fails to meet regulatory requirements.
Disclosure of Invention
To address the problems in the prior art, the application provides a service processing method and device for a double-site distributed database. It solves the problem of data consistency between the primary and standby sites of a double-site deployment and ensures zero data loss when a site-level fault occurs. After a disaster, automatic fault detection and dynamic parameter adjustment reduce manual intervention, and fully automated emergency handling helps services recover quickly, which effectively relieves the pressure on operation and maintenance personnel and lowers financial service risk and production operation and maintenance risk.
In order to solve the technical problem, the application provides the following technical scheme:
One aspect of the present invention provides a service processing method for a dual-site distributed database, where the dual-site distributed database includes a primary site and a standby site, each of which stores multiple copies, the method including:
if all current copies are normal, setting the copy parameter to the total number of copies, and proceeding to the next service only after the service data of the currently processed service has been synchronized to all copies;
if the number of currently abnormal copies is less than the majority of all copies, setting the copy parameter to the majority of the number of currently normal copies, and proceeding to the next service after the service data of the currently processed service has been synchronized to a majority of the currently normal copies; wherein the majority is a positive integer greater than half of the corresponding total number.
In a preferred embodiment, the method further includes:
if the number of currently abnormal copies is greater than zero, detecting at least one of the server, the network, and the currently abnormal copies with a fault detector to obtain the fault cause;
and handling the fault that caused the abnormal copy according to the fault cause.
In a preferred embodiment, detecting the server with the fault detector to obtain the fault cause includes: locating the cause of the server fault by acquiring the information on the BMC.
In a preferred embodiment, detecting the network with the fault detector to obtain the fault cause includes: checking the network by pinging it, comparing network delays, and checking packet loss rates.
In a preferred embodiment, detecting the currently abnormal copy with the fault detector to obtain the fault cause includes: sending a query instruction to the copy through the fault detector, and judging the copy to be normal if the expected result is returned within a specified time and abnormal otherwise.
In a preferred embodiment, the method further includes: for each abnormal copy, downloading the service data from before the fault from the cloud server and synchronizing the downloaded service data to the corresponding abnormal copy.
In another aspect of the present invention, a service processing apparatus for a dual-site distributed database is provided, where the dual-site distributed database includes a primary site and a standby site, each of which stores multiple copies, the apparatus including:
a normal state processing module, configured to set the copy parameter to the total number of copies if all current copies are normal, and to proceed to the next service only after the service data of the currently processed service has been synchronized to all copies;
an abnormal state processing module, configured to set the copy parameter to the majority of the number of currently normal copies if the number of currently abnormal copies is less than the majority of all copies, and to proceed to the next service after the service data of the currently processed service has been synchronized to a majority of the currently normal copies; wherein the majority is a positive integer greater than half of the corresponding total number.
In a preferred embodiment, the apparatus further includes: a fault detection module, configured to detect at least one of the server, the network, and the currently abnormal copies with a fault detector to obtain the fault cause if the number of currently abnormal copies is greater than zero;
and a fault processing module, configured to handle the fault that caused the abnormal copy according to the fault cause.
In another aspect of the present invention, the present application provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the processor implements the service processing method for the dual-site distributed database.
In still another aspect of the present invention, the present application provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the business processing method of the dual-site distributed database.
According to the above technical scheme, the service processing method for a double-site distributed database provided by the application includes: if all current copies are normal, setting the copy parameter to the total number of copies, and proceeding to the next service only after the service data of the currently processed service has been synchronized to all copies; if the number of currently abnormal copies is less than the majority of all copies, setting the copy parameter to the majority of the number of currently normal copies, and proceeding to the next service after the service data of the currently processed service has been synchronized to a majority of the currently normal copies; wherein the majority is a positive integer greater than half of the corresponding total number. The invention solves the problem of data consistency between the primary and standby sites of a double-site deployment and ensures zero data loss when a site-level fault occurs. After a disaster, automatic fault detection and dynamic parameter adjustment reduce manual intervention, and fully automated emergency handling helps services recover quickly, which effectively relieves the pressure on operation and maintenance personnel and lowers financial service risk and production operation and maintenance risk.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of a business processing method of a dual-site distributed database.
FIG. 2 is a schematic diagram of a fault handling process.
FIG. 3 is a schematic diagram of a database reconstruction process.
FIG. 4 is a schematic structural diagram of a service processing device of a dual-site distributed database.
FIG. 5 is a schematic structural diagram of a fault handling module.
FIG. 6 is a schematic structural diagram of a database reconstruction unit.
FIG. 7 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the service processing method and apparatus for a dual-site distributed database disclosed in the present application can be used in the field of computer technology, and can also be used in any field other than the field of computer technology.
The application provides a service processing method for a dual-site distributed database, where the dual-site distributed database includes a primary site and a standby site, each of which stores multiple copies. As shown in FIG. 1, the method includes:
S1: if all current copies are normal, setting the copy parameter to the total number of copies, and proceeding to the next service only after the service data of the currently processed service has been synchronized to all copies.
Specifically, in a distributed database the copy parameter is the number of copies to which data must be synchronously written before that data can be read from the database. When a copy cluster is established, its copy parameter can be set automatically to half the total number of copies plus one. The current service data is then backed up to a majority of the copies in the cluster, and this majority consistency mechanism is meant to keep service data from being lost. In a double-site distributed database system, however, the majority consistency mechanism easily loses service data, so that services cannot be processed, when the primary site crashes or a majority of the copies become abnormal. For example, assume the database has 11 copies in total, 6 stored at the primary site and 5 at the standby site. Under the majority consistency mechanism, once a service has been processed and its service data has been synchronized to the 6 copies at the primary site, the majority condition is satisfied, so synchronous replication stops and the process begins handling the next service request. If a fault then makes all copies at the primary site abnormal while the copies at the standby site have not yet received the service data through asynchronous replication, the data of that service is lost. In many cases such data loss is unacceptable.
To protect the double-site distributed database against loss of service data, a total-number consistency mechanism is adopted: when all copies are currently normal, the copy parameter is set to the total number of copies, and the next service starts only after the service data of the currently processed service has been synchronized to all copies. Take again a database with 11 copies, 6 stored at the primary site and 5 at the standby site. After the current service completes, the copies are synchronized under the total-number consistency mechanism, that is, the service data is written to the 6 copies at the primary site and the 5 copies at the standby site before the next service is processed. If the 6 copies at the primary site then become abnormal for some reason while the next service is being processed, the copies at the standby site already hold the current service data, so subsequent services can be processed from the standby copies and no service data is lost.
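The contrast between the two mechanisms in the 11-copy example can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name is hypothetical, and it assumes the worst case in which the primary-site copies are the first to acknowledge a write.

```python
def copies_holding_record_after_site_loss(primary, standby, copy_parameter, failed_site):
    """Count surviving copies that hold a record acknowledged under `copy_parameter`.

    Assumption (worst case): the primary-site copies acknowledge first, so the
    `copy_parameter` copies that hold the record are filled from the primary
    site before any standby copy.
    """
    sites = ["primary"] * primary + ["standby"] * standby
    holders = sites[:copy_parameter]  # copies guaranteed to hold the record
    return sum(1 for site in holders if site != failed_site)


# 11 copies: 6 at the primary site, 5 at the standby site.
majority_case = copies_holding_record_after_site_loss(6, 5, 6, "primary")
total_case = copies_holding_record_after_site_loss(6, 5, 11, "primary")
print(majority_case)  # 0 -> the record is lost together with the primary site
print(total_case)     # 5 -> the record survives on the standby site
```

With a copy parameter of 6 the acknowledged write may exist only at the primary site, whereas a copy parameter of 11 guarantees that every standby copy holds it.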
S2: if the number of currently abnormal copies is less than the majority of all copies, setting the copy parameter to the majority of the number of currently normal copies, and proceeding to the next service after the service data of the currently processed service has been synchronized to a majority of the currently normal copies; wherein the majority is a positive integer greater than half of the corresponding total number.
Specifically, the total-number consistency mechanism ensures that no data is lost, but as soon as one copy fails it directly interrupts service. To improve the availability of the double-site deployment, if the number of currently abnormal copies is less than the majority of all copies, the copy parameter is set to the majority of the number of currently normal copies, and the next service starts after the service data of the currently processed service has been synchronized to a majority of the currently normal copies. In other words, after a copy fails, the initial total-number consistency mechanism is degraded to the majority consistency mechanism so that services can continue to be processed. For example, suppose the distributed database cluster has 11 copies in total, 6 stored at the primary site and 5 at the standby site. The fault detector finds that 1 copy at the primary site is abnormal, and the copy parameter of the database is changed from the initial 11 to 6, so the database can serve requests as long as at least 6 copies are in a normal state. Although 1 copy is abnormal, 10 copies are normal, and 10 is more than 6, so the condition for serving requests is met, the next service can be processed normally, and service continuity is not affected.
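Steps S1 and S2 amount to a small rule for choosing the copy parameter. The sketch below is one possible reading under stated assumptions: the function names are hypothetical, "majority" is taken as half (rounded down) plus one as in the examples, and any case outside S1 and S2 simply returns `None` for the caller to handle.

```python
from typing import Optional


def majority(n: int) -> int:
    """Smallest positive integer greater than half of n (half plus one)."""
    return n // 2 + 1


def choose_copy_parameter(total: int, abnormal: int) -> Optional[int]:
    """Return the copy parameter for S1/S2, or None when neither step applies."""
    if not 0 <= abnormal <= total:
        raise ValueError("abnormal must be between 0 and total")
    if abnormal == 0:
        return total  # S1: total-number consistency
    if abnormal < majority(total):
        return majority(total - abnormal)  # S2: majority of the normal copies
    return None  # outside S1 and S2; left to the caller in this sketch


print(choose_copy_parameter(11, 0))  # 11
print(choose_copy_parameter(11, 1))  # 6
```

For the 11-copy example, one abnormal copy leaves 10 normal copies, whose majority is 6, which matches the value given in the text.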
It can be understood that a service in the present invention may be a service between a client device and the distributed sites of the present invention, such as a money transfer or a remittance. The server may be a bank server, and the client device may be a smart phone, a tablet electronic device, a portable computer, a desktop computer, or a personal digital assistant (PDA).
The client device may have a communication module (i.e., a communication unit), and may be in communication connection with a remote bank server to implement data transmission with the server.
The distributed sites and the client devices described above may communicate using any suitable network protocol, including network protocols not yet developed at the filing date of this application. The network protocol may be, for example, TCP/IP, UDP/IP, HTTP, or HTTPS. It may also be a protocol layered on top of these, such as RPC (Remote Procedure Call) or REST (Representational State Transfer).
In the embodiment of the present invention, fault handling is also included. As shown in FIG. 2, the steps are as follows:
S21: if the number of currently abnormal copies is greater than zero, detecting at least one of the server, the network, and the currently abnormal copies with a fault detector to obtain the fault cause.
Specifically, when a copy is found to be abnormal, the fault, which may have various causes, must first be located so that it can be handled accurately. The internal fault detector determines whether the database has a fault and what caused it, which keeps the cluster running healthily and stably, and the precise location of the cause supports the subsequent recovery work. Fault causes are divided into 3 scenarios, from the bottom layer to the top layer:
1. Server failure. The fault detector obtains basic server information by connecting to the BMC (Baseboard Management Controller). The BMC is a control system of an Intel x86 server that is independent of the mainboard, CPU, operating system, and so on. It holds information about the server's mainboard, CPU, disks, power supply, temperature, and more. For example, when a disk fails, a corresponding disk alarm appears on the BMC (the same applies to mainboard, CPU, and power supply faults). By reading the information on the BMC, the system can accurately locate the cause of the server fault and promptly notify the equipment specialists, shortening repair time.
2. Network failure. The fault detector periodically checks the health of the network between sites and between nodes by pinging the network, comparing network delays, and checking packet loss rates. If network delay increases and packet loss rises suddenly, the network is judged to be abnormal and the network specialists are notified promptly, which improves system recovery efficiency.
3. Copy failure. Rather than simply sending a heartbeat to the copy, the fault detector uses an application-level liveness probe: it sends a query instruction to the copy, and if the expected result (such as a timestamp) is returned within the specified time, the copy is judged to be normal; otherwise it is judged to be abnormal. This probing method effectively avoids false liveness reports.
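The network check and the application-level copy probe described above can be sketched as follows. All names, thresholds, and the shape of the probe inputs are assumptions made for illustration; the text does not specify them. The query is passed in as a callable so the sketch stays independent of any particular database client.

```python
import threading
import time


def network_is_abnormal(rtts_ms, baseline_ms, delay_factor=2.0, max_loss_rate=0.05):
    """Judge the network from ping samples (None marks a lost packet).

    Hypothetical thresholds: delay counts as increased when the average
    round-trip time exceeds `delay_factor` times the baseline.
    """
    if not rtts_ms:
        raise ValueError("at least one ping sample is required")
    answered = [r for r in rtts_ms if r is not None]
    if not answered:
        return True  # every packet was lost
    loss_rate = (len(rtts_ms) - len(answered)) / len(rtts_ms)
    average = sum(answered) / len(answered)
    # As in the text: delay has increased and packet loss has risen.
    return average > baseline_ms * delay_factor and loss_rate > max_loss_rate


def probe_copy(run_query, timeout_s, is_expected):
    """Return True only if the query yields an expected result in time."""
    outcome = {}

    def worker():
        try:
            outcome["result"] = run_query()
        except Exception as exc:  # any query error counts as abnormal
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)
    if thread.is_alive() or "error" in outcome:
        return False
    return bool(is_expected(outcome["result"]))


# A copy that answers with a timestamp, and one that responds too slowly.
fast_ok = probe_copy(lambda: time.time(), 0.5, lambda r: isinstance(r, float))
slow_ok = probe_copy(lambda: time.sleep(0.3), 0.05, lambda r: True)
```

Unlike a bare heartbeat, the probe only reports a copy as normal when it actually executes a query and returns something plausible before the deadline.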
S22: handling the fault that caused the abnormal copy according to the fault cause.
Specifically, if the fault is found to be caused by a server failure, it is handled by repairing the server, for example by restarting it. If the fault is found to be caused by the network, it is handled by repairing the network, for example by unplugging and re-plugging the network cable. If the fault is found to be caused by an abnormality of the copy itself, it is handled by repairing the copy, for example by deleting the abnormal copy and then copying a normal copy in its place.
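Step S22 is essentially a dispatch from fault cause to repair action. A minimal sketch, assuming a hypothetical `FaultCause` enumeration and using the example actions from the text as plain descriptions rather than real operations:

```python
from enum import Enum


class FaultCause(Enum):
    """Hypothetical labels for the three fault scenarios in the text."""
    SERVER = "server"
    NETWORK = "network"
    COPY = "copy"


# Example repair actions taken from the description; a real system would
# trigger operational procedures instead of returning strings.
REMEDIATIONS = {
    FaultCause.SERVER: "repair the server, for example restart it",
    FaultCause.NETWORK: "repair the network, for example re-plug the network cable",
    FaultCause.COPY: "delete the abnormal copy and re-copy it from a normal copy",
}


def remediation_for(cause: FaultCause) -> str:
    """Look up the repair action for a detected fault cause."""
    return REMEDIATIONS[cause]


print(remediation_for(FaultCause.NETWORK))
```

Keeping the mapping in one table makes it easy to check that every cause reported by the fault detector has a corresponding action.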
In the embodiment of the invention, if the number of currently abnormal copies is larger than the majority of all copies, the database is rebuilt: the service data from before the fault is downloaded from the cloud server and synchronized to the corresponding copies. As shown in FIG. 3, the specific steps for rebuilding the database include:
S301: restoring the basic environment of the faulty area. The basic environment includes server equipment, the network, the operating system, and so on. The fault cause supplied by the fault detector pinpoints where and why the fault occurred, which improves recovery efficiency.
S302: rebuilding the components taken offline by the fault. On the restored basic environment, the system rebuilds the previously failed components, including the database management component, the database storage component, the database computing component, and so on.
S303: bringing the system back online. The system rejoins the recovered components to the cluster, scales in the components and copies that were temporarily built during the emergency, and restores the cluster deployment architecture that existed before the fault. The recovery is transparent to the service.
The invention is further described with reference to a specific implementation scenario.
Suppose a bank sets up a double-site distributed database in a city to meet business needs. The database has 25 copies in total, 13 stored at the primary site and 12 at the standby site, and its copy parameter is set to 25, that is, total-number consistency.
The bank server has just finished processing a transfer service and has obtained the related data, such as the transfer time, transfer account, transfer amount, and crediting method. The fault detector shows that all copies are in a normal state. According to the copy parameter, the data of the transfer service must be synchronized to 25 copies, that is, all copies at the primary and standby sites, before processing of the next service begins.
Assume the next service request is a deposit service. The system updates the account information of the deposit account and obtains the related service information, such as the deposit account, deposit amount, deposit time, and deposit method. At this point the fault detector detects that one copy at the primary site is in an abnormal state. With the current copy parameter of 25, the deposit service would have to be synchronized to all 25 copies at the primary and standby sites before the next service could begin, which is impossible because one copy is abnormal. To keep the service uninterrupted, the total-number consistency mechanism must therefore be changed to the majority consistency mechanism: the copy parameter is changed to half of the current number of normal copies plus one. Half of 24 is 12, and 12 plus 1 is 13, so the copy parameter is set to 13. The deposit service then needs to be synchronized to only 13 copies before the next service can begin, which is fully achievable.
Assume the service after the deposit service is a bank account opening service. The bank server has completed the processing of the account opening service and has obtained the related data, such as the applicant's identification number and the account opening time. At this point the fault detector detects that all 12 copies at the standby site are abnormal, mainly because network communication between the primary and standby sites has been interrupted. With the current copy parameter of 13, the account opening service would have to be synchronized to 13 copies before the next service could begin, but only 12 copies are now working normally, so the synchronization requirement cannot be met. To avoid interrupting the service, the copy parameter is changed to 7, which is half of the 12 normal copies plus one. The data of the account opening service then needs to be synchronized to only 7 copies before the next service can be processed.
Assume the service after the account opening service is a withdrawal service. The bank server has processed the withdrawal service and has obtained the related data, such as the withdrawal time, withdrawal account number, withdrawal method, and withdrawal amount. At this point the fault detector detects that the server at the primary site has failed, making all copies at the primary site abnormal. No copy in the whole database is now in a normal state, so copy synchronization cannot be performed and the next service cannot begin. To keep the service uninterrupted, a one-click recovery operation is performed: the database is rebuilt with 25 copies in total, 13 at the primary site and 12 at the standby site, and the newly built copies are populated with the pre-fault service data held in the cloud database. The copy parameter is set to 25 at the same time. The withdrawal service is then synchronized to all 25 copies at the primary and standby sites, and the next service begins.
By analogy with the process, the high availability of the double-site distributed database is realized.
In the above specific scenario, an extreme case is assumed, and all failures including copy itself and network and server failures cannot be repaired in a short time. In practice, the processing thread in the failure is parallel to the thread of the business processing. It can be understood that, if a failure is detected in the replica itself, and the service processing thread is still executing service processing at this time, and no synchronization of the replicas is needed, the failure processing thread may delete the failed replica and copy the current normal replica to the position of the deleted replica, thereby implementing failure self-healing.
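The parallel self-healing described above can be sketched as follows. This is only an illustrative toy, not the patented implementation: the `Cluster` class, its method names, and the use of a single lock to model "heal only while no synchronization is in progress" are all assumptions made for the example.

```python
import threading


class Cluster:
    """Toy copy cluster (hypothetical); one lock keeps healing between syncs."""

    def __init__(self, copies: int):
        self._lock = threading.Lock()
        self.copies = [{"healthy": True, "data": []} for _ in range(copies)]

    def sync(self, record: str) -> None:
        # Business thread: write one service record to every healthy copy.
        with self._lock:
            for copy in self.copies:
                if copy["healthy"]:
                    copy["data"].append(record)

    def mark_failed(self, index: int) -> None:
        with self._lock:
            self.copies[index]["healthy"] = False

    def heal(self) -> int:
        # Failure-handling thread: delete each failed copy and replace it
        # with a clone of a current normal copy. Returns the number healed.
        with self._lock:
            source = next((c for c in self.copies if c["healthy"]), None)
            if source is None:
                return 0  # nothing to clone from; a rebuild would be needed
            healed = 0
            for i, copy in enumerate(self.copies):
                if not copy["healthy"]:
                    self.copies[i] = {"healthy": True, "data": list(source["data"])}
                    healed += 1
            return healed


def run_demo(records: int = 100) -> Cluster:
    cluster = Cluster(copies=5)
    cluster.mark_failed(2)
    business = threading.Thread(
        target=lambda: [cluster.sync(f"svc-{i}") for i in range(records)]
    )
    healer = threading.Thread(target=cluster.heal)
    business.start()
    healer.start()
    business.join()
    healer.join()
    cluster.heal()  # no-op if the healer thread already repaired the copy
    return cluster
```

Whenever the healer happens to run, the repaired copy starts from a snapshot of a normal copy and then receives every later record, so all copies converge on the same data.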
As can be seen from the above description, the service processing method for a dual-site distributed database provided by the present invention solves the problem of data consistency between the primary and standby sites and ensures zero data loss when a site-level fault occurs. After a disaster occurs, manual intervention is reduced through automatic fault detection and dynamic parameter adjustment, and fully automated emergency handling helps the service recover quickly, which effectively reduces the pressure on operation and maintenance personnel and lowers both financial service risk and production operation and maintenance risk.
In terms of software, the present application provides an embodiment of a service processing apparatus of a dual-site distributed database for executing all or part of contents in a service processing method of the dual-site distributed database, and referring to fig. 4, the service processing apparatus of the dual-site distributed database specifically includes the following contents:
a normal state processing module, configured to: if all the current copies are normal, set the copy parameter to the total number of copies, synchronize the service data of the currently processed service to all the copies, and then proceed to the next adjacent service;
Specifically, for a distributed database, the copy parameter is the number of copies to which data must be synchronously written before that data can be read from the database. When a copy cluster is created, its copy parameter can be set automatically to half of the total number of copies plus one; that is, the current service data is backed up to a majority of the copies in the cluster, and this majority consistency mechanism is relied on to prevent loss of service data. For a dual-site distributed database, however, a majority consistency mechanism can easily lose service data, so that services cannot be processed, when the primary site crashes or a majority of the copies become abnormal. For example, assume the database has 11 copies in total, 6 stored at the primary site and 5 at the standby site. Under majority consistency, once a service has been processed and its data has been synchronized to the 6 primary-site copies, the majority requirement is satisfied, synchronization stops, and the process begins handling the next service request. If a fault then makes all the primary-site copies abnormal while the standby-site copies have not yet received that service data, the data of the service is lost. In many cases such data loss is unacceptable.
To guard against loss of service data in the dual-site distributed database, a total-number consistency mechanism is adopted: while all copies are normal, the copy parameter is set to the total number of copies, and the service data of the currently processed service is synchronized to all copies before the next adjacent service starts. Take again a database with 11 copies, 6 stored at the primary site and 5 at the standby site. After the current service completes, the copies are synchronized under total-number consistency, that is, the service data is written to all 6 primary-site copies and all 5 standby-site copies before the next service is processed. If the 6 primary-site copies then become abnormal for some reason while the next service is being processed, the standby-site copies already hold the current service data, so the subsequent service can be processed from the standby-site copies and no service data is lost.
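The difference between the two mechanisms in the 11-copy example can be illustrated with a small simulation. It is a sketch under stated assumptions: the `Copy` class and the helper functions are hypothetical names, and the write order (primary-site copies first) is assumed as the worst case the text describes.

```python
from dataclasses import dataclass, field


@dataclass
class Copy:
    site: str                       # "primary" or "standby"
    healthy: bool = True
    records: list = field(default_factory=list)


def build_cluster(primary: int = 6, standby: int = 5) -> list:
    return ([Copy("primary") for _ in range(primary)]
            + [Copy("standby") for _ in range(standby)])


def sync_write(cluster: list, record: str, copy_parameter: int) -> int:
    """Write the record to healthy copies, primary first, until
    copy_parameter copies hold it. Returns the number of copies written."""
    written = 0
    for copy in cluster:
        if written == copy_parameter:
            break
        if copy.healthy:
            copy.records.append(record)
            written += 1
    return written


def survives_primary_loss(copy_parameter: int) -> bool:
    """Does the record still exist on a healthy copy after the primary site fails?"""
    cluster = build_cluster()
    sync_write(cluster, "transfer-001", copy_parameter)
    for copy in cluster:
        if copy.site == "primary":
            copy.healthy = False    # the whole primary site becomes abnormal
    return any(c.healthy and "transfer-001" in c.records for c in cluster)


print(survives_primary_loss(6))     # majority consistency
print(survives_primary_loss(11))    # total-number consistency
```

With a copy parameter of 6 the record reaches only the primary-site copies and is lost with them; with 11 it is already present at the standby site.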
an abnormal state processing module, configured to: if the number of current abnormal copies is less than the majority of all the copies, set the copy parameter to the majority of the number of current normal copies, synchronize the service data of the currently processed service to a majority of the current normal copies, and then proceed to the next adjacent service; wherein the majority is a positive integer greater than half of the corresponding total number.
Specifically, total-number consistency ensures that no data is lost, but a single failed copy would directly interrupt the service. To improve the availability of the two sites, if the number of current abnormal copies is less than the majority of all the copies, the copy parameter is set to the majority of the number of current normal copies, and the service data of the currently processed service is synchronized to a majority of the current normal copies before the next adjacent service starts. In other words, after a copy fails, the initial total-number consistency mechanism is degraded to a majority consistency mechanism so that services can continue to be processed. For example, suppose the distributed database cluster has 11 copies in total, 6 stored at the primary site and 5 at the standby site. The fault detector finds that 1 copy at the primary site is abnormal, so the copy parameter of the database is changed from the initial 11 to 6, the majority of the 10 normal copies (10 / 2 + 1 = 6). As long as at least 6 copies are in a normal state, the database can serve external requests. Although 1 copy is abnormal, 10 copies remain normal, which is more than 6, so the condition for serving requests is met, the next adjacent service can be processed normally, and service continuity is not affected.
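The rule for choosing the copy parameter can be written as a short function. This is a minimal sketch that follows the two conditions as worded above; the function names are hypothetical, and returning `None` for the case the two conditions do not cover is an assumption made for illustration.

```python
from typing import Optional


def majority(n: int) -> int:
    """Smallest positive integer greater than half of n."""
    return n // 2 + 1


def copy_parameter(total: int, abnormal: int) -> Optional[int]:
    """Copy parameter for a cluster with `total` copies, `abnormal` of them failed.

    Returns None when neither condition applies (assumed here to mean that
    other handling, such as rebuilding the database, is required).
    """
    if not 0 <= abnormal <= total:
        raise ValueError("abnormal must be between 0 and total")
    if abnormal == 0:
        return total                          # total-number consistency
    if abnormal < majority(total):
        return majority(total - abnormal)     # majority of the normal copies
    return None


print(copy_parameter(11, 0))   # 11
print(copy_parameter(11, 1))   # 6
```

For the 11-copy example, one abnormal copy leaves 10 normal copies, and the majority of 10 is 6.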
In the embodiment of the present invention, a fault handling module is further included, as shown in fig. 5, the module includes:
a fault detection unit, configured to, if the number of current abnormal copies is greater than zero, detect at least one of the server, the network, and the current abnormal copies through the fault detector to obtain a fault cause.
Specifically, when an abnormal copy is found, the fault, which may have various causes, must first be located so that it can be handled accurately. The internal fault detector determines whether the database has a fault and what caused it, both to keep the cluster running healthily and stably and to pinpoint the cause for the subsequent recovery work. From the bottom layer to the top layer, fault causes fall into 3 scenarios:
1. Server failure. The fault detector obtains basic server information by connecting to the BMC (Baseboard Management Controller), the control system of an Intel x86 server, which is independent of the mainboard, CPU, operating system, and so on. The BMC holds information about the server mainboard, CPU, disks, power supply, temperature, and so on. For example, when a disk fails, a corresponding disk alarm appears on the BMC (the same applies to mainboard, CPU, and power supply faults). By reading the information on the BMC, the system can pinpoint the cause of a server fault, notify equipment specialists promptly, and shorten repair time.
2. Network failure. The fault detector periodically checks network health between sites and between nodes by pinging the network, comparing network latency, and checking the packet loss rate. If latency increases and packet loss rises sharply, the network is judged abnormal and network specialists are notified promptly, which improves system recovery efficiency.
3. Copy failure. The fault detector probes each copy at the application level: rather than simply sending a heartbeat, it sends a query instruction to the copy, and if the expected result (such as a timestamp) is returned within the specified time, the copy is judged to be normal. This probing method effectively avoids falsely judging a copy to be alive.
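The network check and the application-level copy probe can be sketched as two small functions. Everything here is illustrative: the function names, the thresholds (`max_delay_factor`, `max_loss_rate`, `max_skew_s`), and the convention that a lost ping is recorded as `None` are assumptions, and `run_query` stands in for whatever query the real detector would send to a copy.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable, Optional, Sequence


def network_is_abnormal(rtts_ms: Sequence[Optional[float]], baseline_ms: float,
                        max_delay_factor: float = 3.0,
                        max_loss_rate: float = 0.05) -> bool:
    """Judge network health from ping samples; None marks a lost packet."""
    if not rtts_ms:
        return True                            # no samples at all
    received = [r for r in rtts_ms if r is not None]
    loss_rate = 1 - len(received) / len(rtts_ms)
    if loss_rate > max_loss_rate:
        return True                            # packet loss rose sharply
    average = sum(received) / len(received)
    return average > baseline_ms * max_delay_factor


def probe_copy(run_query: Callable[[], float], timeout_s: float = 1.0,
               max_skew_s: float = 5.0) -> bool:
    """Application-level probe: the copy must answer a real query with a
    plausible timestamp within timeout_s, not merely respond to a heartbeat."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(run_query)
        try:
            result = future.result(timeout=timeout_s)
        except FutureTimeout:
            return False                       # no answer in the specified time
        except Exception:
            return False                       # the query itself failed
        return (isinstance(result, (int, float))
                and abs(time.time() - result) <= max_skew_s)
    finally:
        pool.shutdown(wait=False)              # do not block on a hung query
```

A copy that accepts the connection but never returns the query result fails the probe, which is the case a plain heartbeat would miss.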
a fault processing unit, configured to handle the fault that caused the abnormal copy according to the fault cause.
Specifically, if the fault is found to be caused by a server failure, it is handled by repairing the server, for example by restarting it. If the fault is found to be caused by the network, it can be handled by repairing the network, for example by unplugging and re-plugging the network cable. If the fault is found to be caused by an abnormality of the copy itself, it can be handled by repairing the copy, for example by deleting the abnormal copy and then copying a normal copy in its place.
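The cause-to-action mapping above can be expressed as a simple dispatch table. This is a sketch only: the `FaultCause` enum and `repair_plan` function are hypothetical names, and the step strings merely restate the example actions given in the text.

```python
from enum import Enum


class FaultCause(Enum):
    SERVER = "server"
    NETWORK = "network"
    COPY = "copy"


# Example repair actions per cause, taken from the description above.
_REPAIR_PLANS = {
    FaultCause.SERVER: ["restart the server"],
    FaultCause.NETWORK: ["unplug and re-plug the network cable"],
    FaultCause.COPY: ["delete the abnormal copy", "copy a normal copy in its place"],
}


def repair_plan(cause: FaultCause) -> list:
    """Return the ordered repair steps for a detected fault cause."""
    return list(_REPAIR_PLANS[cause])


print(repair_plan(FaultCause.COPY))
```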
In the embodiment of the invention, the system further comprises a reconstruction unit: if the number of current abnormal copies is greater than the majority of all the copies, the database is rebuilt, the service data from before the fault occurred is downloaded from the cloud server, and the downloaded service data is synchronized to the corresponding copies. As shown in fig. 6, the reconstruction unit includes:
a basic environment recovery unit, which restores the basic environment, including server equipment, the network, the operating system, and so on; the fault cause reported by the fault detector pinpoints where and why the fault occurred and improves recovery efficiency.
a fault offline component rebuilding unit, which rebuilds the previously failed components on the restored basic environment, including the database management component, the database storage component, the database computing component, and so on.
an online system recovery unit, which rejoins the recovered components to the cluster, scales in the components and copies that were temporarily rebuilt during the emergency, and restores the cluster deployment architecture that existed before the failure; the recovery is transparent to the service.
The invention is further described with reference to a specific implementation scenario.
Suppose a bank sets up a dual-site distributed database within one city to meet business needs. The database has 25 copies in total, 13 stored at the primary site and 12 at the standby site, and the copy parameter of the database is set to 25, i.e., total-number consistency.
The bank server has just finished processing a transfer service and obtained the related data, such as the transfer time, transfer account, transfer amount, and crediting method. Meanwhile, the fault detector shows that all copies are in a normal state. According to the copy parameter, the transfer data must be synchronized to 25 copies, i.e., all copies at the primary and standby sites, before processing of the next service begins.
Assume that the next service request is a storage service. The system updates the information of the storage account and obtains the related data, such as the storage account, storage amount, storage time, and storage method. At this point, the fault detector detects that one copy at the primary site is in an abnormal state. Under the current copy parameter of 25, the storage data would have to be synchronized to all 25 copies at the primary and standby sites before the next service could be entered, which is impossible because one copy is abnormal. To keep the service uninterrupted, the total-number consistency mechanism must be changed to a majority consistency mechanism: the copy parameter is changed to half of the current number of normal copies plus one, that is, half of 24 is 12, plus 1 gives 13, so the copy parameter is set to 13. The storage data then only needs to be synchronized to 13 copies before the next service is entered. With 24 normal copies available, this is fully achievable.
Assume that the service following the storage service is a bank account opening service. The bank server has completed the processing of the account opening service and has obtained the related data, such as the applicant's identification number and the account opening time. At this point, the failure detector detects that all 12 copies of the standby site are abnormal, mainly because network communication between the primary site and the standby site has been interrupted. Under the current copy parameter of 13, the account opening data must be synchronized to 13 copies before the next service can start, but only 12 copies are working normally, so the synchronization requirement cannot be met. To avoid interrupting the service, the copy parameter is changed to 7, the majority of the 12 normal copies (12 / 2 + 1 = 7). That is, the account opening data only needs to be synchronized to 7 copies before the next service is processed.
Assume that the service following the account opening service is a withdrawal service. The bank server has processed the withdrawal service and obtained the related data, such as the withdrawal time, withdrawal account number, withdrawal method, and withdrawal amount. At this point, the failure detector detects that the server of the primary site has failed, which makes all the copies of the primary site abnormal. No copy in the whole database is now in a normal state, so copy synchronization cannot be performed and the next service cannot be entered. To keep the service uninterrupted, a one-click recovery operation is performed: the database is rebuilt with 25 copies in total, 13 at the primary site and 12 at the standby site, and the newly built copies are populated with the pre-failure service data held in the cloud database. At the same time, the copy parameter is set to 25. The withdrawal data is then synchronized to all 25 copies at the primary and standby sites, and the next service is entered.
Continuing in this manner, high availability of the dual-site distributed database is achieved.
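The sequence of copy parameters in this scenario can be traced with a few lines of code. The sketch follows the scenario as narrated, under the assumption that the parameter is recomputed as the majority of the currently normal copies whenever at least one normal copy remains, and that a rebuild restores all 25 copies; the function and variable names are hypothetical.

```python
from typing import List, Optional, Tuple


def majority(n: int) -> int:
    return n // 2 + 1


def scenario_parameter(total: int, normal: int) -> Optional[int]:
    """Copy parameter as narrated in the scenario; None means rebuild first."""
    if normal == total:
        return total              # total-number consistency
    if normal > 0:
        return majority(normal)   # degraded to majority consistency
    return None                   # no normal copy left


def trace(total: int, steps: List[Tuple[str, int]]) -> List[Tuple[str, int, bool]]:
    """For each (service, normal copies), return (service, parameter, rebuilt)."""
    result = []
    for service, normal in steps:
        parameter = scenario_parameter(total, normal)
        rebuilt = parameter is None
        if rebuilt:
            parameter = total     # one-click recovery rebuilds all copies
        result.append((service, parameter, rebuilt))
    return result


steps = [("transfer", 25), ("storage", 24), ("account opening", 12), ("withdrawal", 0)]
for service, parameter, rebuilt in trace(25, steps):
    print(service, parameter, "rebuilt" if rebuilt else "")
```

The trace yields copy parameters of 25, 13, 7, and 25, matching the four services in the scenario.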
The specific scenario above assumes an extreme case in which none of the failures, whether of a copy itself, the network, or a server, can be repaired in a short time. In practice, the failure-handling thread runs in parallel with the business-processing thread. If a failure of a copy itself is detected while the business-processing thread is still executing a service and no copy synchronization is needed yet, the failure-handling thread can delete the failed copy and copy a current normal copy into the position of the deleted copy, thereby achieving failure self-healing.
As can be seen from the above description, the service processing apparatus for a dual-site distributed database provided by the present invention solves the problem of data consistency between the primary and standby sites and ensures zero data loss when a site-level fault occurs. After a disaster occurs, manual intervention is reduced through automatic fault detection and dynamic parameter adjustment, and fully automated emergency handling helps the service recover quickly, which effectively reduces the pressure on operation and maintenance personnel and lowers both financial service risk and production operation and maintenance risk.
In terms of hardware, the present application provides an embodiment of an electronic device for implementing all or part of contents in a service processing method of a dual-site distributed database, where the electronic device specifically includes the following contents:
fig. 7 is a schematic block diagram of a system configuration of an electronic device 9600 according to an embodiment of the present application. As shown in fig. 7, the electronic device 9600 can include a central processor 9100 and a memory 9140; the memory 9140 is coupled to the central processor 9100. Notably, this fig. 7 is exemplary; other types of structures may also be used in addition to or in place of the structure to implement telecommunications or other functions.
In one embodiment, the business processing functions of the dual site distributed database may be integrated into a central processor. Wherein the central processor may be configured to control:
S1, if all the current copies are normal, setting the copy parameter to the total number of copies, synchronizing the service data of the currently processed service to all the copies, and then proceeding to the next adjacent service;
S2, if the number of current abnormal copies is less than the majority of all the copies, setting the copy parameter to the majority of the number of current normal copies, synchronizing the service data of the currently processed service to a majority of the current normal copies, and then proceeding to the next adjacent service; wherein the majority is a positive integer greater than half of the corresponding total number.
As can be seen from the above description, the electronic device provided in the embodiment of the present application solves the problem of data consistency between the primary and standby sites and ensures zero data loss when a site-level fault occurs. After a disaster occurs, manual intervention is reduced through automatic fault detection and dynamic parameter adjustment, and fully automated emergency handling helps the service recover quickly, which effectively reduces the pressure on operation and maintenance personnel and lowers both financial service risk and production operation and maintenance risk.
In another embodiment, the service processing apparatus of the dual-site distributed database may be configured separately from the central processor 9100, for example, the service processing apparatus of the dual-site distributed database may be configured as a chip connected to the central processor 9100, and the service processing function of the dual-site distributed database is realized by the control of the central processor.
As shown in fig. 7, the electronic device 9600 may further include: a communication module 9110, an input unit 9120, an audio processor 9130, a display 9160, and a power supply 9170. It is noted that the electronic device 9600 also does not necessarily include all of the components shown in fig. 7; further, the electronic device 9600 may further include components not shown in fig. 7, which may be referred to in the art.
As shown in fig. 7, a central processor 9100, sometimes referred to as a controller or operational control, can include a microprocessor or other processor device and/or logic device, which central processor 9100 receives input and controls the operation of the various components of the electronic device 9600.
The memory 9140 can be, for example, one or more of a buffer, a flash memory, a hard drive, a removable medium, a volatile memory, a non-volatile memory, or another suitable device. It may store the failure-related information described above as well as a program for processing that information. The central processor 9100 can execute the program stored in the memory 9140 to implement information storage, processing, and the like.
The input unit 9120 provides input to the central processor 9100. The input unit 9120 is, for example, a key or a touch input device. Power supply 9170 is used to provide power to electronic device 9600. The display 9160 is used for displaying display objects such as images and characters. The display may be, for example, an LCD display, but is not limited thereto.
The memory 9140 can be a solid state memory, e.g., Read Only Memory (ROM), Random Access Memory (RAM), a SIM card, or the like. It may also be a memory that retains information even when power is off and that can be selectively erased and loaded with further data, an example of which is sometimes called an EPROM. The memory 9140 could also be some other type of device. The memory 9140 includes a buffer memory 9141 (sometimes referred to as a buffer). The memory 9140 may include an application/function storage portion 9142, which stores application programs and function programs, or a flow for the central processor 9100 to execute operations of the electronic device 9600.
The memory 9140 can also include a data store 9143, the data store 9143 being used to store data, such as contacts, digital data, pictures, sounds, and/or any other data used by an electronic device. The driver storage portion 9144 of the memory 9140 may include various drivers for the electronic device for communication functions and/or for performing other functions of the electronic device (e.g., messaging applications, contact book applications, etc.).
The communication module 9110 is a transmitter/receiver 9110 that transmits and receives signals via an antenna 9111. The communication module (transmitter/receiver) 9110 is coupled to the central processor 9100 to provide input signals and receive output signals, which may be the same as in the case of a conventional mobile communication terminal.
Based on different communication technologies, a plurality of communication modules 9110, such as a cellular network module, a bluetooth module, and/or a wireless local area network module, may be provided in the same electronic device. The communication module (transmitter/receiver) 9110 is also coupled to a speaker 9131 and a microphone 9132 via an audio processor 9130 to provide audio output via the speaker 9131 and receive audio input from the microphone 9132, thereby implementing ordinary telecommunications functions. The audio processor 9130 may include any suitable buffers, decoders, amplifiers and so forth. In addition, the audio processor 9130 is also coupled to the central processor 9100, thereby enabling recording locally through the microphone 9132 and enabling locally stored sounds to be played through the speaker 9131.
An embodiment of the present application further provides a computer-readable storage medium capable of implementing all the steps in the service processing method of the dual-site distributed database in the foregoing embodiment, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the computer program implements all the steps of the service processing method of the dual-site distributed database, where the execution subject of the computer program is a server or a client, for example, when the processor executes the computer program, the processor implements the following steps:
S1, if all the current copies are normal, setting the copy parameter to the total number of copies, synchronizing the service data of the currently processed service to all the copies, and then proceeding to the next adjacent service;
S2, if the number of current abnormal copies is less than the majority of all the copies, setting the copy parameter to the majority of the number of current normal copies, synchronizing the service data of the currently processed service to a majority of the current normal copies, and then proceeding to the next adjacent service; wherein the majority is a positive integer greater than half of the corresponding total number.
As can be seen from the above description, the computer-readable storage medium provided in the embodiment of the present application solves the problem of data consistency between the primary and standby sites and ensures zero data loss when a site-level fault occurs. After a disaster occurs, manual intervention is reduced through automatic fault detection and dynamic parameter adjustment, and fully automated emergency handling helps the service recover quickly, which effectively reduces the pressure on operation and maintenance personnel and lowers both financial service risk and production operation and maintenance risk.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principle and the implementation mode of the invention are explained by applying specific embodiments in the invention, and the description of the embodiments is only used for helping to understand the method and the core idea of the invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (10)

1. A business processing method of a dual-site distributed database is characterized in that the dual-site distributed database comprises a primary site and a standby site, a plurality of copies are respectively stored in the primary site and the standby site, and the business processing method comprises the following steps:
if all the current copies are normal, setting copy parameters as the total number of the copies, and synchronizing the service data of the current processing service to all the copies and then carrying out the next adjacent service; the copy parameter is used for setting the copy number of the data to be synchronously written;
If the number of the current abnormal copies is less than the majority of all the copies, setting the copy parameters as the majority of the number of the current normal copies, synchronizing the service data of the current processing service to the majority of the current normal copies, and then carrying out the next adjacent service; wherein the majority is a positive integer greater than half of the corresponding total number.
2. The service processing method of the dual-site distributed database according to claim 1, further comprising:
if the number of the current abnormal copies is larger than zero, detecting at least one of the server, the network and the current abnormal copies through a fault detector to obtain a fault reason;
and processing the fault causing the abnormal copy according to the fault reason.
3. The method for processing services of the dual-site distributed database according to claim 2, wherein the detecting the server through the failure detector to obtain the failure cause comprises: and positioning the fault reason of the server by acquiring the information on the BMC.
4. The method for processing services of the dual-site distributed database according to claim 2, wherein the detecting the network by the failure detector to obtain the failure cause comprises: the network is detected by ping the network, comparing network delays and looking up packet loss rates.
5. The method for processing services of the dual-site distributed database according to claim 2, wherein the detecting the current abnormal copy through the failure detector to obtain the failure cause comprises: and sending a query instruction to the copy through the fault detector, and if an expected result is returned within a specified time, judging the copy to be normal, otherwise, judging the copy to be abnormal.
6. The service processing method of the dual-site distributed database according to claim 1, further comprising: and for each abnormal copy, downloading the service data before the fault occurs from the cloud server, and synchronizing the downloaded service data to the corresponding abnormal copy.
7. A service processing apparatus of a dual-site distributed database, wherein the dual-site distributed database includes a primary site and a standby site, and a plurality of copies are stored in the primary site and the standby site respectively, and the service processing apparatus includes:
a normal state processing module, configured to set the copy parameter to the total number of copies if all current copies are normal, and to proceed to the next adjacent service after synchronizing the service data of the currently processed service to all the copies; wherein the copy parameter specifies the number of copies to which data must be synchronously written;
an abnormal state processing module, configured to set the copy parameter to the majority of the current number of normal copies if the current number of abnormal copies is less than the majority of all the copies, and to proceed to the next adjacent service after synchronizing the service data of the currently processed service to the majority of the current normal copies; wherein the majority is a positive integer greater than half of the corresponding total number.
8. The service processing apparatus of the dual-site distributed database according to claim 7, further comprising:
a fault detection module, configured to detect at least one of the server, the network and the current abnormal copy through the fault detector to obtain a fault cause if the number of current abnormal copies is greater than zero;
and a fault processing module, configured to handle the fault of the abnormal copy according to the fault cause.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the service processing method of the dual-site distributed database according to any one of claims 1 to 6 when executing the program.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the service processing method of the dual-site distributed database according to any one of claims 1 to 6.
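
The copy-parameter rule recited in claim 7 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `majority` and `copy_parameter` are hypothetical, and the sketch assumes that "majority" means the smallest integer greater than half of the corresponding total. The claims shown here do not state what happens when the abnormal copies reach a majority of all copies, so the sketch simply returns `None` in that case.

```python
from typing import Optional


def majority(total: int) -> int:
    """Smallest positive integer strictly greater than half of `total`.

    Assumption: the claims only require "a positive integer greater than
    half"; choosing the smallest such integer is an illustrative choice.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return total // 2 + 1


def copy_parameter(total_copies: int, abnormal_copies: int) -> Optional[int]:
    """Number of copies a write must reach before the next service starts.

    Returns None when the abnormal copies are not fewer than a majority of
    all copies; that case is outside the claim text illustrated here.
    """
    if total_copies <= 0:
        raise ValueError("total_copies must be positive")
    if not 0 <= abnormal_copies <= total_copies:
        raise ValueError("abnormal_copies must be between 0 and total_copies")

    # Normal state: every copy is healthy, so write synchronously to all.
    if abnormal_copies == 0:
        return total_copies

    # Abnormal state: fewer than a majority of all copies have failed,
    # so require only a majority of the copies that are still normal.
    if abnormal_copies < majority(total_copies):
        normal_copies = total_copies - abnormal_copies
        return majority(normal_copies)

    return None
```

For example, with six copies in total (an assumed layout of three per site), two abnormal copies leave four normal ones, so `copy_parameter(6, 2)` requires a synchronous write to three copies.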
CN202110166381.0A 2021-02-04 2021-02-04 Business processing method and device of double-site distributed database Active CN112929438B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110166381.0A CN112929438B (en) 2021-02-04 2021-02-04 Business processing method and device of double-site distributed database

Publications (2)

Publication Number Publication Date
CN112929438A CN112929438A (en) 2021-06-08
CN112929438B true CN112929438B (en) 2022-09-13

Family

ID=76170986

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110166381.0A Active CN112929438B (en) 2021-02-04 2021-02-04 Business processing method and device of double-site distributed database

Country Status (1)

Country Link
CN (1) CN112929438B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113687982B (en) * 2021-08-20 2024-02-09 济南浪潮数据技术有限公司 Method and device for constructing off-site disaster recovery cluster and related equipment
CN115167782B (en) * 2022-07-28 2023-02-28 北京志凌海纳科技有限公司 Temporary storage copy management method, system, equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6014669A (en) * 1997-10-01 2000-01-11 Sun Microsystems, Inc. Highly-available distributed cluster configuration database
CN104468651A (en) * 2013-09-17 2015-03-25 南京中兴新软件有限责任公司 Distributed multi-copy storage method and device
CN105354111A (en) * 2015-10-29 2016-02-24 国电南瑞科技股份有限公司 Redundancy backup method suitable for wide-area distributed real-time database
CN111046004A (en) * 2019-12-24 2020-04-21 上海达梦数据库有限公司 Data file storage method, device, equipment and storage medium
CN111381950A (en) * 2020-03-05 2020-07-07 南京大学 Task scheduling method and system based on multiple copies for edge computing environment
CN111581221A (en) * 2020-03-18 2020-08-25 宁波送变电建设有限公司永耀科技分公司 Information redundancy storage and reconstruction method for distributed multi-station fusion system
CN112269772A (en) * 2020-10-30 2021-01-26 深信服科技股份有限公司 File deployment method, system, equipment and computer readable storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10296632B2 (en) * 2015-06-19 2019-05-21 Sap Se Synchronization on reactivation of asynchronous table replication

Also Published As

Publication number Publication date
CN112929438A (en) 2021-06-08

Similar Documents

Publication Publication Date Title
US11892922B2 (en) State management methods, methods for switching between master application server and backup application server, and electronic devices
US20200073761A1 (en) Hot backup system, hot backup method, and computer device
CN112929438B (en) Business processing method and device of double-site distributed database
CN105159795A (en) Data synchronization method, apparatus and system
CN114466027B (en) Cloud primary database service providing method, system, equipment and medium
CN112463451A (en) Cache disaster recovery cluster switching method and soft load balancing cluster device
WO2022174735A1 (en) Data processing method and apparatus based on distributed storage, device, and medium
CN112190924A (en) Data disaster tolerance method, device and computer readable medium
CN115396296A (en) Service processing method and device, electronic equipment and computer readable storage medium
US20120331335A1 (en) High availability database systems and methods
WO2021115043A1 (en) Distributed database system and data disaster backup drilling method
CN107526652B (en) Data synchronization method and storage device
CN110351122B (en) Disaster recovery method, device, system and electronic equipment
CN111241200B (en) Master-slave synchronous processing method and device based on SQLite database
CN110597467B (en) High-availability data zero-loss storage system and method
CN111352959A (en) Data synchronization remediation and storage method and cluster device
CN116193481A (en) 5G core network processing method, device, equipment and medium
CN107404511B (en) Method and device for replacing servers in cluster
CN116166470A (en) Redis cluster clone replication method and device, medium and equipment
CN115292293A (en) Data migration method and device of distributed cache system
CN111639139B (en) Data synchronization method, device, computing equipment and medium for data center
CN110851526B (en) Data synchronization method, device and system
CN113467717B (en) Dual-machine volume mirror image management method, device and equipment and readable storage medium
CN114244638B (en) Multicast network communication method, device, equipment and medium
CN115037745B (en) Method and device for electing in distributed system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant