WO2019020081A1 - Distributed system and fault recovery method, apparatus, product and storage medium thereof - Google Patents

Distributed system and fault recovery method, apparatus, product and storage medium thereof

Info

Publication number
WO2019020081A1
WO2019020081A1 PCT/CN2018/097262 CN2018097262W
Authority
WO
WIPO (PCT)
Prior art keywords
master node
metadata
redo log
node
distributed system
Prior art date
Application number
PCT/CN2018/097262
Other languages
English (en)
French (fr)
Inventor
褚建辉
卢申朋
刘东辉
王新栋
Original Assignee
广东神马搜索科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广东神马搜索科技有限公司 filed Critical 广东神马搜索科技有限公司
Publication of WO2019020081A1 publication Critical patent/WO2019020081A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/07Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14Error detection or correction of the data by redundancy in operation
    • G06F11/1402Saving, restoring, recovering or retrying
    • G06F11/1446Point-in-time backing up or restoration of persistent data
    • G06F11/1458Management of the backup or restore process
    • G06F11/1464Management of the backup or restore process for networked environments

Definitions

  • The present invention relates to the field of distributed technologies, and in particular to a distributed system and a fault recovery method, apparatus, product, and storage medium for the same.
  • FIG. 1 is a schematic diagram showing the structure of a distributed system employing a master-slave architecture.
  • A distributed system with a master-slave architecture typically consists of a master node and a plurality of slave nodes.
  • As the central scheduling node, the master node usually provides functions such as metadata storage and query, cluster node state management, decision making, and task delivery.
  • The metadata managed by the master node is among the most important data in the system, so loss of the data on the master node has a significant impact on the system.
  • The invention provides a distributed system and a fault recovery method, apparatus, product, and storage medium thereof, which acquire a metadata mirror of the master node at one or more moments and record the operations of the master node in a redo log, so that when the master node fails it can be quickly restored to its pre-failure state based on the previously recorded metadata mirror and redo log.
  • According to a first aspect, a distributed system is provided, comprising a master node for scheduling tasks and managing system state and a plurality of slave nodes for running the scheduled tasks, wherein one or more slave nodes and/or the master node acquire and save a metadata mirror recording the scheduling information and system state on the master node at a certain moment; the master node acquires and saves a redo log recording all operations of the master node after that moment; and during failure recovery the master node invokes the metadata mirror and its corresponding redo log to perform the recovery.
  • In this way the master node can be quickly restored to its pre-failure state, and recovery efficiency is improved compared with recording only log files.
  • Preferably, one or more slave nodes and/or the master node perform the acquisition and saving of the metadata mirror when triggered by the master node and/or an external command. Different trigger modes can therefore be configured according to the characteristics of the distributed system.
  • Preferably, the master node responds to a slave node's request only after the corresponding operation has been recorded in the redo log and stored. This ensures that the redo log completely records every operation of the master node.
  • Preferably, the one or more slave nodes and/or the master node continuously acquire and save metadata mirrors of the master node at a plurality of different moments, and the master node continuously acquires and saves the redo logs corresponding to those moments.
  • During failure recovery the master node can invoke the latest metadata mirror and its corresponding redo log; when the latest metadata mirror and/or its corresponding redo log is unavailable, it invokes the data of the most recent moment for which both the metadata mirror and its corresponding redo log are available.
  • Saving multiple memory mirrors at different moments together with their corresponding redo logs thus improves fault tolerance during recovery.
  • Preferably, one or more slave nodes and/or the master node directly acquire and save the memory state of the master node at a certain moment as the metadata mirror.
  • The metadata mirror can be stored according to task groups, so that during subsequent recovery the corresponding mirrors can be organized efficiently by group.
  • According to a second aspect, a fault recovery apparatus for a distributed system is provided, the distributed system including a master node for scheduling tasks and managing system state and a plurality of slave nodes for running tasks. The apparatus is used to perform recovery when the master node fails and includes: a mirror acquisition unit configured to acquire and save a metadata mirror recording the scheduling information and system state on the master node at a certain moment; a redo log acquisition unit configured to acquire and save a redo log recording all operations of the master node after that moment; and a fault recovery unit configured to invoke the metadata mirror and its corresponding redo log to perform failure recovery.
  • Preferably, the mirror acquisition unit performs the acquisition and saving of the metadata mirror when triggered by the master node, the apparatus, and/or an external command.
  • Preferably, the master node responds to a slave node's request only after the corresponding operation has been recorded in the redo log by the redo log acquisition unit and stored.
  • Preferably, the mirror acquisition unit continuously acquires and saves metadata mirrors of the master node at a plurality of different moments, and the redo log acquisition unit continuously acquires and saves the redo logs corresponding to those moments.
  • Preferably, during failure recovery the fault recovery unit invokes the latest metadata mirror and its corresponding redo log.
  • Preferably, when the latest metadata mirror and/or its corresponding redo log is unavailable, the fault recovery unit invokes the data of the most recent moment for which both the metadata mirror and its corresponding redo log are available.
  • Preferably, the mirror acquisition unit directly acquires and saves the memory state of the master node at a certain moment as the metadata mirror.
  • Preferably, the mirror acquisition unit stores the metadata mirror according to task groups.
  • According to a third aspect, a fault recovery method for a distributed system is provided, the distributed system comprising a master node for scheduling tasks and managing system state and a plurality of slave nodes for running tasks. The method is used to perform recovery when the master node fails and includes: acquiring and saving a metadata mirror recording the scheduling information and system state at a certain moment; acquiring and saving a redo log recording all scheduling operations after that moment; and invoking the metadata mirror and its corresponding redo log to perform failure recovery.
  • Preferably, metadata mirrors of the master node at a plurality of different moments are continuously acquired and saved, and the redo logs corresponding to those moments are continuously acquired and saved.
  • Invoking the metadata mirror and its corresponding redo log during failure recovery may include: invoking the latest metadata mirror and its corresponding redo log; and, when the latest metadata mirror and/or its corresponding redo log is unavailable, invoking the data of the most recent moment for which both the metadata mirror and its corresponding redo log are available.
  • Preferably, the memory state of the master node at a certain moment can be directly acquired and saved as the metadata mirror.
  • Preferably, the acquisition and saving of the metadata mirror is performed when triggered by the master node and/or an external command.
  • Preferably, the master node responds to a slave node's request only after the corresponding operation has been recorded in the redo log and stored.
  • Preferably, the metadata mirror is stored according to task groups.
  • According to a fourth aspect, a computer program product is provided, comprising a memory, a processor, and a computer program, wherein the computer program is stored in the memory and is configured to be executed by the processor to perform the method of the third aspect of the invention and any of its preferred variants.
  • A fifth aspect of the invention provides a computer-readable storage medium comprising a program which, when run on a computer, causes the computer to perform the method of the third aspect of the invention and any of its preferred variants.
  • The distributed system and the fault recovery method, apparatus, product, and storage medium of the present invention acquire a metadata mirror of the master node at one or more moments and record the subsequent operations of the master node in a redo log, so that when the master node fails it can be quickly restored to its pre-failure state based on the previously recorded metadata mirror and redo log.
  • FIG. 1 is a schematic diagram showing the architecture of a distributed system of a master-slave architecture.
  • FIG. 2 is a schematic flow chart showing a fault recovery method according to an embodiment of the present invention.
  • FIG. 3 is a diagram showing the continuous storage of a plurality of metadata mirrors and redo logs.
  • FIG. 4 is a schematic block diagram showing the structure of a failure recovery device according to an embodiment of the present invention.
  • FIG. 5 is a structural diagram of a computer program product according to an exemplary embodiment of the present invention.
  • Under that scheme the master node operates as follows: before performing an operation, the master node records the operation in a log file, and only after the record succeeds does it execute the operation, i.e. update the in-memory data based on the operation.
  • The recovery flow on failure is: read the log file and modify the in-memory data in sequence according to the master node operations recorded in it. Recovering only from a log file of write operations is simple to implement, but the recovery process takes a very long time.
  • The inventors found that, while the log file of master node operations is being recorded, a mirror file of the master node's in-memory data at a certain moment can be captured in between; the mirror file represents the master node's current state at the corresponding moment.
  • When the master node fails, the most recent mirror file and the operations recorded in the log file after the moment corresponding to that mirror file can then be invoked, and the master node can be recovered from the invoked data; compared with recording only a log file, this significantly reduces the time required for recovery.
  • Based on this idea, the present invention proposes a failure recovery scheme for the master node of a distributed system; the scheme can be implemented by the distributed system shown in FIG. 1.
  • The distributed system of the present invention may include a master node for scheduling tasks and managing system state and a plurality of slave nodes for running the scheduled tasks. Both the master node and the slave nodes can be deployed on servers; the master node can be deployed on a separate server distinct from the slave nodes, or on the same server as one of the slave nodes. In a preferred embodiment, different nodes are deployed on different servers.
  • The distributed system shown in FIG. 1 consists of one master node and a plurality of slave nodes. It should be understood that the distributed system of the present invention may also include multiple master nodes, as well as devices other than the master and slave nodes, such as a backup master node or a failover database.
  • FIG. 2 is a schematic flow chart showing a fault recovery method according to an embodiment of the present invention.
  • the method shown in FIG. 2 can be implemented by the distributed system shown in FIG. 1, and in particular, can be implemented by a master node in a distributed system.
  • In step S210, a metadata mirror recording the scheduling information and system state on the master node at a certain moment is acquired and saved.
  • For a distributed system with a master-slave architecture, a crash of the master node makes the entire distributed system unavailable. Given its importance, the master node usually does not run specific tasks directly; it is only responsible for keeping the distributed system running and for scheduling and assigning tasks, while specific tasks are executed by the slave nodes. In other words, the master node is mainly responsible for parsing task requests, allocating resources, and locating target data or nodes according to the metadata, while the specific task is executed by the slave node designated by the master node.
  • Metadata is data that describes data; in the present invention it refers specifically to the data that the master node is responsible for saving and managing.
  • Since the master node schedules tasks and manages system state, the metadata may refer to data recording the scheduling information and system state on the master node at a certain moment.
  • For example, in a Hadoop distributed system the metadata may be system description data, system state data, current task scheduling and status data, and so on; for a distributed storage system, the metadata may be data describing state information of user data, such as storage locations.
  • The acquired metadata mirror of the master node at a certain moment can be a mapping of the master node's memory state at that moment, so the memory state of the master node at that moment can be directly acquired and saved as the metadata mirror.
  • In practice, the metadata mirror of the master node at a certain moment can be obtained by means such as a disk snapshot or a dump of the file system.
  • The operation of acquiring the metadata mirror may be performed by the master node, by one or more slave nodes, or by a backup master node in the distributed system.
  • The acquired metadata mirror can be stored persistently on a local disk or in a distributed file system, for example in a failover database.
  • As an optional embodiment, the master node may schedule tasks concurrently by group, in which case the acquired metadata mirror may consist of mirrors under multiple groups. The acquired metadata mirrors can therefore be stored by task group, with mirrors belonging to the same task group stored under the same directory, so that during subsequent recovery the corresponding mirrors can be organized efficiently by group.
  • In step S220, the master node may acquire and save a redo log recording all operations of the master node after that moment.
  • The operations referred to here may be operations performed by the master node on the metadata, or operations performed by the master node on its in-memory data.
  • Each operation executed by the master node can be recorded in the redo log, and the master node's operation information can be recorded in the redo log sequentially.
  • For each operation the master node is about to execute, the operation is recorded in the redo log and persisted first, and only then does the master node execute it. In this way, if the master node fails while executing the operation, the operation can be recovered from the data recorded in the redo log. Conversely, if an operation were executed first and recorded afterwards, a failure during its execution or before the record is saved would make the operation unrecoverable; it could only be redone from scratch.
  • For example, the master node may first record the operation of delivering target data to a slave node in the redo log; only after the record has been made and persisted successfully does the master node respond to the slave node's request and send it the target data.
  • In other words, a slave node's request is answered only after the master node's operation for that request has been recorded in the redo log and stored (persistently stored).
  • In step S230, during failure recovery the metadata mirror and its corresponding redo log are invoked to perform the recovery.
  • The metadata mirror can be regarded as a mapping of the master node's memory state at a certain moment, while the redo log records all operations of the master node. Therefore, when the master node fails, recovery can be performed from the metadata mirror acquired before the failure together with the master node operations recorded in the redo log between the moment corresponding to that mirror and the failure, restoring the master node to its pre-failure state.
  • Taking redo logs recorded in a file system as an example, recovery can proceed as follows: after the master node restarts, it first traverses the metadata mirror directory in the file system, finds the most recent metadata mirror and loads it into memory, then loads the redo log recorded after that latest mirror and starts replaying it; once loading and replay are complete, the whole recovery process is finished.
  • As an optional embodiment, a plurality of metadata mirrors corresponding to different moments may be saved.
  • While the redo log is being recorded, acquisition of a metadata mirror may be performed periodically or in response to a predetermined trigger condition being met.
  • The trigger condition may be, for example, that a certain parameter reaches a predetermined value, that a predetermined interval has elapsed, or a direct response to an external trigger command.
  • For example, a metadata mirror may be acquired every time a predetermined number of operations have been recorded in the redo log, or once every predetermined period of time.
  • FIG. 3 is a schematic diagram showing the principle of continuously saving a plurality of metadata mirror files and their corresponding redo logs.
  • Referring to FIG. 3, metadata mirror 1 of the master node at time t1 is acquired first, and the operations of the master node between t1 and t2 are recorded and stored in redo log 1; at time t2 the metadata mirror of the master node is acquired again.
  • The operations of the master node between t2 and t3 are recorded and stored in redo log 2, and so on, yielding metadata mirrors corresponding to times t1, t2, and t3, together with the redo logs corresponding to the mirrors at those different moments.
  • Assuming the master node crashes at time t4, during failure recovery it can first invoke the latest metadata mirror (i.e. the mirror at time t3) and its corresponding redo log (the redo log covering the t3-t4 segment). If the latest metadata mirror and redo log are unavailable, it can further invoke the next most recent metadata mirror (i.e. the mirror at time t2) and redo log (i.e. the redo log covering the t2-t3 segment), and so on, pushing back until usable data files are obtained.
  • By saving multiple memory mirrors at different moments together with their corresponding redo logs, the fault tolerance of the recovery process is improved.
  • In other words, the solution of the present application can trigger the acquisition and storage of a metadata mirror under certain conditions or commands (for example, saving the state at time t3) and then immediately start continuous recording of the redo log (i.e. recording all operations after t3). After a failure occurs at time t4, the state at time t3 can be restored and all operations after t3 replayed, so that the master node quickly returns to its state at time t4.
  • Because acquiring a mirror takes time and the master node's service is usually not stopped, the metadata mirror 1 acquired at time t1 may already contain some of the operations recorded in redo log 1 after time t1; if the master node then fails at time t2 and is recovered using the time-t1 mirror 1 and the corresponding redo log 1, the recovered state of the master node may be inconsistent with its state before the failure.
  • Therefore, while the metadata mirror at a certain moment is being acquired, the times of the operations being recorded in the redo log can be recorded in real time; once the mirror has been acquired, the corresponding operations can be removed from the redo log. This avoids the acquired metadata mirror containing operations that are also recorded in the redo log, so that each metadata mirror and its corresponding redo log are strictly aligned in time.
  • FIG. 4 is a block diagram showing the structure of a fault recovery apparatus according to an embodiment of the present invention.
  • the functional modules of the fault recovery device 400 may be implemented by hardware, software, or a combination of hardware and software that implements the principles of the present invention.
  • the functional blocks depicted in FIG. 4 can be combined or divided into sub-modules to implement the principles of the above described invention. Accordingly, the description herein may support any possible combination, or division, or further limitation of the functional modules described herein.
  • the fault recovery apparatus 400 shown in FIG. 4 can be used to implement the fault recovery method shown in FIG. 2, and only the functional modules that the fault recovery apparatus 400 can have and the operations that can be performed by the functional modules are briefly described. For details, please refer to the description above in conjunction with FIG. 2, and details are not described herein again. It should be noted that the fault recovery apparatus 400 may be the primary node itself or a backup primary node.
  • the fault recovery apparatus of the present invention may include a mirror acquisition unit 410, a redo log acquisition unit 420, and a failure recovery unit 430.
  • The mirror acquisition unit 410 acquires and saves a metadata mirror recording the scheduling information and system state on the master node at a certain moment; the redo log acquisition unit 420 acquires and saves a redo log recording all operations of the master node after that moment; and the fault recovery unit 430 invokes the metadata mirror and its corresponding redo log to perform failure recovery.
  • Preferably, the mirror acquisition unit 410 performs the acquisition and saving of the metadata mirror when triggered by the master node, the apparatus, and/or an external command.
  • The mirror acquisition unit 410 can directly acquire and save the memory state of the master node at a certain moment as the metadata mirror, and may further store the metadata mirror according to task groups.
  • Preferably, the master node responds to a new request from a slave node only after the corresponding operation has been recorded in the redo log by the redo log acquisition unit 420 and stored.
  • Preferably, the mirror acquisition unit 410 continuously acquires and saves metadata mirrors of the master node at a plurality of different moments, and the redo log acquisition unit 420 continuously acquires and saves the redo logs corresponding to those moments.
  • In that case, during failure recovery the fault recovery unit 430 invokes the latest metadata mirror and its corresponding redo log; when the latest metadata mirror and/or its corresponding redo log is unavailable, the fault recovery unit 430 can invoke the data of the most recent moment for which both the metadata mirror and its corresponding redo log are available.
  • The method according to the invention may also be embodied as a computer program or computer program product comprising computer program code instructions for performing the steps defined in the above method of the invention.
  • Alternatively, the invention may be embodied as a computer program product comprising a memory, a processor, and a computer program, wherein the computer program is stored in the memory and is configured to be executed by the processor to perform the above method of the invention.
  • FIG. 5 is a structural diagram of a computer program product according to an exemplary embodiment of the present invention.
  • Referring to FIG. 5, this embodiment provides a computer program product including at least one processor 51 and a memory 52; one processor 51 is taken as an example in FIG. 5.
  • The processor 51 and the memory 52 are connected by a bus 50.
  • The memory 52 stores instructions executable by the at least one processor 51, and the instructions are executed by the at least one processor 51 to cause the at least one processor 51 to perform the above-described method of the present invention.
  • The present invention may also be embodied as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) having executable code (or a computer program, or computer instruction code) stored thereon; when the executable code (or computer program, or computer instruction code) is executed by a processor of an electronic device (or computing device, server, etc.), the processor is caused to perform the steps of the above method according to the present invention.
  • Each block of the flowcharts or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • It should also be noted that the functions noted in the blocks may occur in an order different from that shown in the drawings; for example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functionality involved.
  • Each block of the block diagrams and/or flowcharts, and combinations of blocks therein, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Retry When Errors Occur (AREA)

Abstract

The invention discloses a distributed system and a fault recovery method, apparatus, product, and storage medium thereof. One or more slave nodes and/or the master node acquire and save a metadata mirror recording the scheduling information and system state on the master node at a certain moment; the master node acquires and saves a redo log recording all operations of the master node after that moment; and during failure recovery the master node invokes the metadata mirror and its corresponding redo log to perform the recovery. As a result, when the master node fails, it can be quickly restored to its pre-failure state based on the previously recorded metadata mirror and redo log.

Description

Distributed system and fault recovery method, apparatus, product and storage medium thereof
Technical Field
The present invention relates to the field of distributed technologies, and in particular to a distributed system and a fault recovery method, apparatus, product, and storage medium thereof.
Background Art
A distributed system organically combines and connects multiple machines so that they cooperate to complete a task, such as a computing task or a storage task; it is a software system built on top of a network. Most existing distributed systems adopt a master-slave architecture, and FIG. 1 is a schematic diagram showing the structure of a distributed system employing a master-slave architecture. As shown in FIG. 1, a master-slave distributed system mostly consists of a master node (master) and a plurality of slave nodes (slaves). As the central scheduling node of the distributed system, the master node usually combines functions such as metadata storage and query, cluster node state management, decision making, and task delivery. Because the metadata managed by the master node is among the most important data in the system, loss of the data on the master node has a significant impact on the system.
A failover mechanism is therefore needed so that when the master node crashes due to an unknown error, the master node can be restored to the state before the error occurred, avoiding loss of the master node's data.
Summary of the Invention
The present invention provides a distributed system and a fault recovery method, apparatus, product, and storage medium thereof, which acquire a metadata mirror of the master node at one or more moments and record the operations of the master node in a redo log, so that when the master node fails it can be quickly restored to its pre-failure state based on the previously recorded metadata mirror and redo log.
According to a first aspect of the present invention, a distributed system is provided, comprising a master node for scheduling tasks and managing system state and a plurality of slave nodes for running the scheduled tasks, wherein one or more of the slave nodes and/or the master node acquire and save a metadata mirror recording the scheduling information and system state on the master node at a certain moment; the master node acquires and saves a redo log recording all operations of the master node after that moment; and during failure recovery the master node invokes the metadata mirror and its corresponding redo log to perform the recovery.
In this way the master node can be quickly restored to its pre-failure state based on the previously recorded metadata mirror and redo log, and recovery efficiency is improved compared with recording only log files.
Preferably, one or more of the slave nodes and/or the master node perform the acquisition and saving of the metadata mirror when triggered by the master node and/or an external command. Different trigger modes can therefore be configured according to the characteristics of the distributed system.
Preferably, the master node responds to a slave node's request only after each of its operations has been recorded in the redo log and stored. This ensures that the redo log completely records every operation of the master node.
Preferably, one or more of the slave nodes and/or the master node continuously acquire and save metadata mirrors of the master node at a plurality of different moments, and the master node continuously acquires and saves the redo logs corresponding to those moments. During failure recovery the master node can invoke the latest metadata mirror and its corresponding redo log; when the latest metadata mirror and/or its corresponding redo log is unavailable, it invokes the data of the most recent moment for which both the metadata mirror and its corresponding redo log are available. Saving multiple memory mirrors at different moments together with their corresponding redo logs thus improves fault tolerance during recovery.
Preferably, one or more of the slave nodes and/or the master node directly acquire and save the memory state of the master node at a certain moment as the metadata mirror. The metadata mirror may be stored according to task groups, so that during subsequent recovery the corresponding mirrors can be organized efficiently by group.
According to a second aspect of the present invention, a fault recovery apparatus for a distributed system is also provided, the distributed system including a master node for scheduling tasks and managing system state and a plurality of slave nodes for running tasks. The apparatus is used to perform recovery when the master node fails and includes: a mirror acquisition unit configured to acquire and save a metadata mirror recording the scheduling information and system state on the master node at a certain moment; a redo log acquisition unit configured to acquire and save a redo log recording all operations of the master node after that moment; and a fault recovery unit configured to invoke the metadata mirror and its corresponding redo log to perform failure recovery.
Preferably, the mirror acquisition unit performs the acquisition and saving of the metadata mirror when triggered by the master node, the apparatus, and/or an external command.
Preferably, the master node responds to a slave node's request only after each of its operations has been recorded in the redo log by the redo log acquisition unit and stored.
Preferably, the mirror acquisition unit continuously acquires and saves metadata mirrors of the master node at a plurality of different moments, and the redo log acquisition unit continuously acquires and saves the redo logs corresponding to those moments.
Preferably, during failure recovery the fault recovery unit invokes the latest metadata mirror and its corresponding redo log.
Preferably, when the latest metadata mirror and/or its corresponding redo log is unavailable, the fault recovery unit invokes the data of the most recent moment for which both the metadata mirror and its corresponding redo log are available.
Preferably, the mirror acquisition unit directly acquires and saves the memory state of the master node at a certain moment as the metadata mirror.
Preferably, the mirror acquisition unit stores the metadata mirror according to task groups.
According to a third aspect of the present invention, a fault recovery method for a distributed system is also provided, the distributed system including a master node for scheduling tasks and managing system state and a plurality of slave nodes for running tasks. The method is used to perform recovery when the master node fails and includes: acquiring and saving a metadata mirror recording the scheduling information and system state at a certain moment; acquiring and saving a redo log recording all scheduling operations after that moment; and invoking the metadata mirror and its corresponding redo log to perform failure recovery.
Preferably, metadata mirrors of the master node at a plurality of different moments are continuously acquired and saved, and the redo logs corresponding to those moments are continuously acquired and saved.
Preferably, invoking the metadata mirror and its corresponding redo log during failure recovery may include: invoking the latest metadata mirror and its corresponding redo log; and, when the latest metadata mirror and/or its corresponding redo log is unavailable, invoking the data of the most recent moment for which both the metadata mirror and its corresponding redo log are available.
Preferably, the memory state of the master node at a certain moment may be directly acquired and saved as the metadata mirror.
Preferably, the acquisition and saving of the metadata mirror are performed when triggered by the master node and/or an external command.
Preferably, the master node responds to a slave node's request only after each of its operations has been recorded in the redo log and stored.
Preferably, the metadata mirror is stored according to task groups.
According to a fourth aspect of the present invention, a computer program product is also provided, comprising a memory, a processor, and a computer program, wherein the computer program is stored in the memory and is configured to be executed by the processor to perform the method of the third aspect of the invention and any of its preferred variants.
A fifth aspect of the present invention provides a computer-readable storage medium comprising a program which, when run on a computer, causes the computer to perform the method of the third aspect of the invention and any of its preferred variants.
The distributed system and the fault recovery method, apparatus, product, and storage medium of the present invention acquire a metadata mirror of the master node at one or more moments and record the subsequent operations of the master node in a redo log, so that when the master node fails it can be quickly restored to its pre-failure state based on the previously recorded metadata mirror and redo log.
Brief Description of the Drawings
The above and other objects, features, and advantages of the present disclosure will become more apparent from the following more detailed description of exemplary embodiments of the present disclosure in conjunction with the accompanying drawings, in which the same reference numerals generally denote the same components.
FIG. 1 is a schematic diagram showing the architecture of a distributed system with a master-slave architecture.
FIG. 2 is a schematic flowchart of a fault recovery method according to an embodiment of the present invention.
FIG. 3 is a schematic diagram showing the continuous saving of multiple metadata mirrors and redo logs.
FIG. 4 is a schematic block diagram showing the structure of a fault recovery apparatus according to an embodiment of the present invention.
FIG. 5 is a structural diagram of a computer program product according to an exemplary embodiment of the present invention.
Detailed Description of the Embodiments
Preferred embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show preferred embodiments of the present disclosure, it should be understood that the present disclosure can be implemented in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the present disclosure to those skilled in the art.
For the master-slave distributed system shown in FIG. 1, the master node stores the data necessary for normal operation and scheduling of the system, such as system state data and current scheduling data, so the loss of this data has a severe impact on the system. A failure recovery mechanism is therefore needed so that when the master node encounters an unknown error it can be restored to a stable and reliable state. To this end, a log file of all operations of the master node can be recorded and stored persistently on disk. Then, once the master node fails, even if all data in its memory is lost, the master node can still be restored to its pre-failure state on the next startup by replaying the recorded log file.
Under that scheme, the master node operates as follows: before executing each operation, it records the operation in the log file, and only after the record succeeds does it execute the operation, i.e. update the in-memory data based on the operation. The recovery flow on failure is: read the log file and modify the in-memory data in sequence according to the master node operations recorded in it. This recovery approach, based solely on a log file of write operations, is simple to implement, but the recovery process takes a very long time.
To this end, after in-depth study the inventors found that, while the log file of the master node's operations is being recorded, a mirror file of the master node's in-memory data at a certain moment can be captured in between. The mirror file represents the master node's current state at the corresponding moment, so that when the master node fails, the most recent mirror file and the operations recorded in the log file after the moment corresponding to that mirror file can be invoked, and the master node can be recovered from the invoked data; compared with recording only a log file, this greatly shortens the time required for recovery.
Based on this idea, the present invention proposes a failure recovery scheme for the master node of a distributed system; the scheme can be implemented by the distributed system shown in FIG. 1. As shown in FIG. 1, the distributed system of the present invention may include a master node for scheduling tasks and managing system state and a plurality of slave nodes for running the scheduled tasks. Both the master node and the slave nodes can be deployed on servers; the master node can be deployed on a separate server distinct from the slave nodes, or on the same server as one of the slave nodes. In a preferred embodiment, different nodes are deployed on different servers. The distributed system shown in FIG. 1 consists of one master node and a plurality of slave nodes; it should be understood that the distributed system of the present invention may also include multiple master nodes, as well as devices other than the master and slave nodes, such as a backup master node or a failover database.
The specific flow by which the distributed system of the present invention implements the failure recovery scheme is described in detail below. FIG. 2 is a schematic flowchart of a fault recovery method according to an embodiment of the present invention. The method shown in FIG. 2 can be implemented by the distributed system shown in FIG. 1 and, in particular, by the master node of the distributed system.
Referring to FIG. 2, in step S210, a metadata mirror recording the scheduling information and system state on the master node at a certain moment is acquired and saved.
For a distributed system with a master-slave architecture, a crash of the master node makes the entire distributed system unavailable. Given its importance, the master node usually does not run specific tasks directly; it is only responsible for keeping the distributed system running and for scheduling and assigning tasks, while specific tasks are executed by the slave nodes. In other words, the master node is mainly responsible for parsing task requests, allocating resources, and locating target data or nodes according to the metadata, while the specific task is executed by the slave node designated by the master node. Metadata is data that describes data; in the present invention it refers specifically to the data that the master node is responsible for saving and managing. Since the master node schedules tasks and manages system state, the metadata may refer to data recording the scheduling information and system state on the master node at a certain moment. For example, in a Hadoop distributed system the metadata may be system description data, system state data, current task scheduling and status data, and so on; for a distributed storage system, the metadata may be data describing state information of user data, such as storage locations.
The acquired metadata mirror of the master node at a certain moment can be a mapping of the master node's memory state at that moment, so the memory state of the master node at that moment can be directly acquired and saved as the metadata mirror. In practice, the metadata mirror of the master node at a certain moment can be obtained by means such as a disk snapshot (Snapshot) or a dump of the file system.
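As a minimal sketch of how such a mirror might be captured, the following Python fragment serializes the master node's in-memory metadata to a timestamped file. The JSON format, file naming, and function name are illustrative assumptions and not details prescribed by this disclosure.

```python
import json
import os
import time

def save_metadata_mirror(metadata: dict, mirror_dir: str) -> str:
    """Serialize the master node's in-memory metadata as a mirror file.

    `metadata` is assumed to be the master's in-memory scheduling/state
    dictionary; the mirror is written atomically (temp file + rename) and
    stamped with the moment it represents.
    """
    os.makedirs(mirror_dir, exist_ok=True)
    ts = int(time.time() * 1000)                    # the "moment" the mirror represents
    tmp_path = os.path.join(mirror_dir, f".mirror_{ts}.tmp")
    final_path = os.path.join(mirror_dir, f"mirror_{ts}.json")
    with open(tmp_path, "w") as f:
        json.dump({"timestamp": ts, "metadata": metadata}, f)
        f.flush()
        os.fsync(f.fileno())                        # persist before it becomes visible
    os.rename(tmp_path, final_path)                 # atomic publish of the mirror
    return final_path
```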
The operation of acquiring the metadata mirror may be performed by the master node, by one or more of the slave nodes, or by a backup master node in the distributed system. The acquired metadata mirror can be stored persistently on a local disk or in a distributed file system, for example persistently in a failover database.
As an optional embodiment of the present invention, the master node may schedule tasks concurrently by group; in that case the acquired metadata mirror may consist of mirrors under multiple groups. The acquired metadata mirrors can therefore be stored by task group, with mirrors belonging to the same task group stored under the same directory, so that during subsequent recovery the corresponding mirrors can be organized efficiently by group.
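The per-group directory layout can be sketched as follows; the path convention and function name are assumptions used only for illustration.

```python
import os

def mirror_path_for_group(base_dir: str, task_group: str, timestamp_ms: int) -> str:
    """Place each task group's mirrors under its own directory, e.g.

        <base_dir>/<task_group>/mirror_<timestamp>.json

    so that recovery can load or skip whole groups by listing one directory.
    """
    group_dir = os.path.join(base_dir, task_group)
    os.makedirs(group_dir, exist_ok=True)
    return os.path.join(group_dir, f"mirror_{timestamp_ms}.json")
```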
In step S220, the master node may acquire and save a redo log recording all operations of the master node after that moment. The operations referred to here may be operations performed by the master node on the metadata, or operations performed by the master node on its in-memory data.
Each operation executed by the master node can be recorded in the redo log, and the master node's operation information can be recorded in the redo log sequentially. For each operation the master node is about to execute, the operation is recorded in the redo log and persisted first, and only then does the master node execute it. In this way, if the master node fails while executing the operation, the operation can be recovered from the data recorded in the redo log. Conversely, if an operation were executed first and recorded afterwards, a failure during its execution or before the record is saved would make the operation unrecoverable; it could only be redone from scratch.
For example, when a slave node requests a task from the master node (such as a computing or storage task), the master node first records the operation of delivering the target data to that slave node in the redo log; only after the record has been made and persisted successfully does the master node respond to the slave node's request and send it the target data. In other words, a slave node's request is answered only after the master node's operation for that request has been recorded in the redo log and stored (persistently stored).
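A minimal sketch of this record-then-respond protocol is given below, assuming a JSON-lines redo log; the class name, callback names, and operation format are hypothetical and not taken from this disclosure.

```python
import json
import os
from typing import Callable

class RedoLog:
    """Append-only redo log: an operation is made durable before it is
    executed or acknowledged to a slave node."""

    def __init__(self, path: str):
        self._f = open(path, "a")

    def append(self, op: dict) -> None:
        self._f.write(json.dumps(op) + "\n")
        self._f.flush()
        os.fsync(self._f.fileno())        # persist the record before proceeding

def handle_slave_request(redo_log: RedoLog, op: dict,
                         apply_op: Callable[[dict], None],
                         respond: Callable[[dict], None]) -> None:
    """Sketch of the record-then-respond flow described above.

    `apply_op` and `respond` are assumed callbacks: one updates the master's
    in-memory metadata, the other sends the reply to the slave node.
    """
    redo_log.append(op)    # 1. record and persist the operation
    apply_op(op)           # 2. execute it against in-memory metadata
    respond(op)            # 3. only now answer the slave node's request
```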
In step S230, during failure recovery the metadata mirror and its corresponding redo log are invoked to perform the recovery.
As described above, the metadata mirror can be regarded as a mapping of the master node's memory state at a certain moment, while the redo log records all operations of the master node. Therefore, when the master node fails, recovery can be performed from the metadata mirror acquired before the failure together with the master node operations recorded in the redo log between the moment corresponding to that mirror and the failure, restoring the master node to its state before the failure. Taking redo logs recorded in a file system as an example, recovery can proceed as follows: after the master node restarts, it first traverses the metadata mirror directory in the file system, finds the most recent metadata mirror and loads it into memory, then loads the redo log recorded after that latest mirror and starts replaying it; once loading and replay are complete, the whole recovery process is finished.
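The restart procedure described above might look like the following sketch, assuming mirrors are named mirror_<timestamp>.json (fixed-width timestamps, as in the earlier example) and that each redo-log entry carries a timestamp field; the operation format in apply_operation is purely illustrative.

```python
import glob
import json
import os

def apply_operation(state: dict, op: dict) -> None:
    # Hypothetical operation format: {"timestamp": ..., "key": ..., "value": ...}
    state[op["key"]] = op["value"]

def recover_master_state(mirror_dir: str, redo_log_path: str) -> dict:
    """Rebuild the master's in-memory metadata after a restart:
    load the newest mirror, then replay redo-log entries recorded after it."""
    mirrors = sorted(glob.glob(os.path.join(mirror_dir, "mirror_*.json")))
    if not mirrors:
        raise RuntimeError("no metadata mirror found")
    with open(mirrors[-1]) as f:             # most recent mirror
        snapshot = json.load(f)
    state = snapshot["metadata"]
    mirror_ts = snapshot["timestamp"]

    with open(redo_log_path) as f:
        for line in f:
            op = json.loads(line)
            if op["timestamp"] <= mirror_ts:
                continue                      # already reflected in the mirror
            apply_operation(state, op)        # replay the operation
    return state
```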
As an optional embodiment of the present invention, when saving metadata mirrors of the master node, multiple mirrors corresponding to different moments may be saved. While the redo log is being recorded, acquisition of a metadata mirror may be performed periodically or in response to a predetermined trigger condition being met. The trigger condition may be, for example, that a certain parameter reaches a predetermined value, that a predetermined interval has elapsed, or a direct response to an external trigger command. For instance, a metadata mirror may be acquired every time a predetermined number of operations have been recorded in the redo log, or once every predetermined period of time.
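One possible trigger policy, combining an operation-count threshold, a time interval, and an external command, is sketched below; the threshold values are arbitrary examples rather than values specified by the disclosure.

```python
import time

class MirrorTrigger:
    """Decide when to take the next metadata mirror: after every `max_ops`
    redo-log records or every `max_seconds`, whichever comes first, or on an
    external command."""

    def __init__(self, max_ops: int = 10_000, max_seconds: float = 300.0):
        self.max_ops = max_ops
        self.max_seconds = max_seconds
        self.ops_since_mirror = 0
        self.last_mirror_time = time.monotonic()

    def record_op(self) -> None:
        self.ops_since_mirror += 1

    def should_mirror(self, external_command: bool = False) -> bool:
        if external_command:                        # direct external trigger
            return True
        if self.ops_since_mirror >= self.max_ops:   # every N operations
            return True
        return time.monotonic() - self.last_mirror_time >= self.max_seconds

    def reset(self) -> None:
        self.ops_since_mirror = 0
        self.last_mirror_time = time.monotonic()
```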
Further, while the master node's operations are being recorded in the redo log, redo logs corresponding to multiple different moments (i.e. to multiple metadata mirrors) can be continuously acquired. FIG. 3 is a schematic diagram showing the principle of continuously saving multiple metadata mirror files and their corresponding redo logs.
Referring to FIG. 3, metadata mirror 1 of the master node at time t1 is acquired first, and the master node's operations between t1 and t2 are recorded and saved in redo log 1; at time t2, metadata mirror 2 of the master node is acquired again, and the master node's operations between t2 and t3 are recorded and saved in redo log 2, and so on. In this way, metadata mirrors corresponding to times t1, t2, and t3 are obtained, together with the redo logs corresponding to the mirrors at the different moments.
Assume the master node crashes at time t4. During failure recovery the master node can first invoke the latest metadata mirror (i.e. the mirror at time t3) and its corresponding redo log (the redo log covering the t3-t4 segment). If the latest metadata mirror and redo log are unavailable, it can further invoke the next most recent metadata mirror (i.e. the mirror at time t2) and redo log (i.e. the redo log covering the t2-t3 segment) to perform the recovery, and so on, pushing back until usable data files are obtained. By saving multiple memory mirrors at different moments together with their corresponding redo logs, the fault tolerance of the recovery process is improved.
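The fallback across mirror/redo-log pairs can be sketched as follows, assuming each mirror mirror_<ts>.json is paired with a redo log redo_<ts>.log; the pairing convention and operation format are illustrative assumptions.

```python
import glob
import json
import os

def recover_with_fallback(mirror_dir: str, redo_dir: str) -> dict:
    """Try mirror/redo-log pairs from newest to oldest (t3, then t2, ...)
    and use the most recent pair that is present and readable."""
    mirrors = sorted(glob.glob(os.path.join(mirror_dir, "mirror_*.json")),
                     reverse=True)                        # newest first
    for mirror_path in mirrors:
        ts = mirror_path.rsplit("_", 1)[1].split(".")[0]  # e.g. "mirror_1234.json" -> "1234"
        redo_path = os.path.join(redo_dir, f"redo_{ts}.log")
        try:
            with open(mirror_path) as f:
                state = json.load(f)["metadata"]
            with open(redo_path) as f:
                for line in f:
                    op = json.loads(line)
                    state[op["key"]] = op["value"]        # replay the operation
            return state                                  # this pair worked
        except (OSError, ValueError, KeyError):
            continue                                      # pair unusable, fall back
    raise RuntimeError("no usable metadata mirror / redo log pair found")
```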
In other words, the solution of the present application can trigger the acquisition and storage of a metadata mirror under certain conditions or commands (for example, saving the state at time t3) and then immediately start continuous recording of the redo log (i.e. recording all operations after t3). After a failure occurs at time t4, the state at time t3 can be restored and all operations after t3 replayed, so that the master node quickly returns to its state at time t4.
When acquiring the metadata mirror of the master node at a certain moment, for example when acquiring metadata mirror 1 at time t1 as shown in FIG. 3, the master node's service is usually not stopped, and acquiring mirror 1 takes some time. The mirror 1 acquired at time t1 is therefore likely to already contain some of the operations recorded in redo log 1 after time t1, so when the master node fails at time t2 and is recovered using the time-t1 mirror 1 and the corresponding redo log 1, the finally recovered state of the master node may well be inconsistent with the state before recovery.
Therefore, as an optional embodiment of the present invention, while the metadata mirror at a certain moment is being acquired, the times of the operations being recorded in the redo log can be recorded in real time; once the mirror at that moment has been acquired, the corresponding operations can be removed from the redo log. This avoids the acquired metadata mirror containing operations that are also recorded in the subsequent redo log, so that each metadata mirror and its corresponding redo log are strictly aligned in time.
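A sketch of removing, from the redo log, the operations already reflected in a freshly taken mirror is shown below; it assumes each redo-log entry carries the timestamp recorded while the mirror was in progress, and the rewrite-and-rename strategy is an illustrative choice.

```python
import json
import os

def trim_redo_log(redo_path: str, mirror_end_ts: int) -> None:
    """Drop redo-log entries whose effects were already captured while the
    mirror was being taken (all entries up to `mirror_end_ts`), so that the
    mirror and its redo log do not overlap in time."""
    kept = []
    with open(redo_path) as f:
        for line in f:
            op = json.loads(line)
            if op["timestamp"] > mirror_end_ts:   # keep only ops after the mirror
                kept.append(line)
    tmp = redo_path + ".tmp"
    with open(tmp, "w") as f:
        f.writelines(kept)
        f.flush()
        os.fsync(f.fileno())
    os.rename(tmp, redo_path)                     # atomically replace the log
```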
The fault recovery method of the present invention has been described above in detail with reference to FIGS. 2-3. In addition, the fault recovery scheme of the present invention can also be implemented by a fault recovery apparatus. FIG. 4 is a block diagram showing the structure of a fault recovery apparatus according to an embodiment of the present invention. The functional modules of the fault recovery apparatus 400 may be implemented by hardware, by software, or by a combination of hardware and software that implements the principles of the present invention. Those skilled in the art will understand that the functional modules depicted in FIG. 4 can be combined or divided into sub-modules to implement the above principles of the invention; accordingly, the description herein supports any possible combination, division, or further definition of the functional modules described herein.
The fault recovery apparatus 400 shown in FIG. 4 can be used to implement the fault recovery method shown in FIG. 2. Only the functional modules of the fault recovery apparatus 400 and the operations they can perform are briefly described below; for details, refer to the description above in conjunction with FIG. 2, which is not repeated here. It should be noted that the fault recovery apparatus 400 may be the master node itself or a backup master node.
As shown in FIG. 4, the fault recovery apparatus of the present invention may include a mirror acquisition unit 410, a redo log acquisition unit 420, and a fault recovery unit 430. The mirror acquisition unit 410 acquires and saves a metadata mirror recording the scheduling information and system state on the master node at a certain moment; the redo log acquisition unit 420 acquires and saves a redo log recording all operations of the master node after that moment; and the fault recovery unit 430 invokes the metadata mirror and its corresponding redo log to perform failure recovery.
Preferably, the mirror acquisition unit 410 performs the acquisition and saving of the metadata mirror when triggered by the master node, the apparatus, and/or an external command. The mirror acquisition unit 410 can directly acquire and save the memory state of the master node at a certain moment as the metadata mirror. Further, the mirror acquisition unit 410 may store the metadata mirror according to task groups.
Preferably, the master node responds to a new request from a slave node only after the corresponding operation has been recorded in the redo log by the redo log acquisition unit 420 and stored.
Preferably, the mirror acquisition unit 410 continuously acquires and saves metadata mirrors of the master node at a plurality of different moments, and the redo log acquisition unit 420 continuously acquires and saves the redo logs corresponding to those moments. In this case, during failure recovery the fault recovery unit 430 invokes the latest metadata mirror and its corresponding redo log; when the latest metadata mirror and/or its corresponding redo log is unavailable, the fault recovery unit 430 can invoke the data of the most recent moment for which both the metadata mirror and its corresponding redo log are available.
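For illustration only, the three units could be composed as in the following sketch, which reuses the helper functions from the earlier sketches in this description (save_metadata_mirror, RedoLog, recover_with_fallback); the class and method names are hypothetical and not taken from the disclosure.

```python
class FaultRecoveryApparatus:
    """Illustrative composition of apparatus 400 from the mirror acquisition
    unit (410), the redo log acquisition unit (420), and the fault recovery
    unit (430), built on the helper sketches defined earlier."""

    def __init__(self, mirror_dir: str, redo_dir: str, redo_log: "RedoLog"):
        self.mirror_dir = mirror_dir
        self.redo_dir = redo_dir
        self.redo_log = redo_log

    # Unit 410: acquire and save a metadata mirror of the master's memory state.
    def acquire_mirror(self, metadata: dict) -> str:
        return save_metadata_mirror(metadata, self.mirror_dir)

    # Unit 420: persist an operation to the redo log before it is executed.
    def record_operation(self, op: dict) -> None:
        self.redo_log.append(op)

    # Unit 430: rebuild the master's state from the newest usable mirror/log pair.
    def recover(self) -> dict:
        return recover_with_fallback(self.mirror_dir, self.redo_dir)
```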
The distributed system and the fault recovery method, apparatus, product, and storage medium according to the present invention have been described above in detail with reference to the accompanying drawings.
Furthermore, the method according to the present invention may also be implemented as a computer program or computer program product comprising computer program code instructions for performing the steps defined in the above method of the present invention.
Alternatively, the present invention may also be implemented as a computer program product comprising a memory, a processor, and a computer program, wherein the computer program is stored in the memory and is configured to be executed by the processor to perform the above method of the present invention.
FIG. 5 is a structural diagram of a computer program product according to an exemplary embodiment of the present invention.
As shown in FIG. 5, this embodiment provides a computer program product including at least one processor 51 and a memory 52; one processor 51 is taken as an example in FIG. 5. The processor 51 and the memory 52 are connected by a bus 50, and the memory 52 stores instructions executable by the at least one processor 51; the instructions are executed by the at least one processor 51 to cause the at least one processor 51 to perform the above method of the present invention.
The related description can be understood with reference to the description and effects corresponding to the steps of FIG. 2, and is not repeated here.
Alternatively, the present invention may also be implemented as a non-transitory machine-readable storage medium (or computer-readable storage medium, or machine-readable storage medium) on which executable code (or a computer program, or computer instruction code) is stored; when the executable code (or computer program, or computer instruction code) is executed by a processor of an electronic device (or computing device, server, etc.), the processor is caused to perform the steps of the above method according to the present invention.
Those skilled in the art will also understand that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or a combination of both.
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems and methods according to various embodiments of the present invention. In this regard, each block of the flowcharts or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that shown in the drawings; for example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks therein, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The embodiments of the present invention have been described above. The above description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, their practical application, or improvements over technologies in the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (25)

  1. A distributed system, comprising a master node for scheduling tasks and managing system state and a plurality of slave nodes for running the scheduled tasks, wherein
    one or more of said slave nodes and/or said master node acquire and save a metadata mirror recording scheduling information and system state on said master node at a certain moment;
    said master node acquires and saves a redo log recording all operations of said master node after said moment; and
    said master node invokes said metadata mirror and its corresponding redo log to perform failure recovery when recovering from a failure.
  2. The distributed system according to claim 1, wherein one or more of said slave nodes and/or said master node perform the acquisition and saving of said metadata mirror when triggered by said master node and/or an external command.
  3. The distributed system according to claim 1, wherein said master node responds to a request from said slave node only after each of its operations has been recorded in said redo log and stored.
  4. The distributed system according to claim 1, wherein one or more of said slave nodes and/or said master node continuously acquire and save metadata mirrors of said master node at a plurality of different moments, and
    said master node continuously acquires and saves redo logs respectively corresponding to said plurality of different moments.
  5. The distributed system according to claim 4, wherein said master node invokes the latest metadata mirror and its corresponding redo log to perform failure recovery when recovering from a failure.
  6. The distributed system according to claim 4, wherein, when the latest metadata mirror and/or its corresponding redo log is unavailable, said master node invokes the data of the most recent moment for which both the metadata mirror and its corresponding redo log are available to perform failure recovery.
  7. The distributed system according to claim 1, wherein one or more of said slave nodes and/or said master node directly acquire and save the memory state of said master node at a certain moment as said metadata mirror.
  8. The distributed system according to claim 1, wherein said metadata mirror is stored according to task groups.
  9. A fault recovery apparatus for a distributed system, wherein the distributed system comprises a master node for scheduling tasks and managing system state and a plurality of slave nodes for running tasks, and the apparatus is used to perform failure recovery when said master node fails and comprises:
    a mirror acquisition unit configured to acquire and save a metadata mirror recording scheduling information and system state on said master node at a certain moment;
    a redo log acquisition unit configured to acquire and save a redo log recording all operations of said master node after said moment; and
    a fault recovery unit configured to invoke said metadata mirror and its corresponding redo log to perform failure recovery when recovering from a failure.
  10. The apparatus according to claim 9, wherein said mirror acquisition unit performs the acquisition and saving of said metadata mirror when triggered by said master node, said apparatus, and/or an external command.
  11. The apparatus according to claim 9, wherein said master node responds to a request from said slave node only after each of its operations has been recorded in said redo log by said redo log acquisition unit and stored.
  12. The apparatus according to claim 9, wherein said mirror acquisition unit continuously acquires and saves metadata mirrors of said master node at a plurality of different moments, and
    said redo log acquisition unit continuously acquires and saves redo logs respectively corresponding to said plurality of different moments.
  13. The apparatus according to claim 12, wherein said fault recovery unit invokes the latest metadata mirror and its corresponding redo log to perform failure recovery when recovering from a failure.
  14. The apparatus according to claim 12, wherein, when the latest metadata mirror and/or its corresponding redo log is unavailable, said fault recovery unit invokes the data of the most recent moment for which both the metadata mirror and its corresponding redo log are available to perform failure recovery.
  15. The apparatus according to claim 9, wherein said mirror acquisition unit directly acquires and saves the memory state of said master node at a certain moment as said metadata mirror.
  16. The apparatus according to claim 9, wherein said mirror acquisition unit stores said metadata mirror according to task groups.
  17. A fault recovery method for a distributed system, wherein the distributed system comprises a master node for scheduling tasks and managing system state and a plurality of slave nodes for running tasks, and the method is used to perform failure recovery when said master node fails, the method comprising:
    acquiring and saving a metadata mirror recording scheduling information and system state at a certain moment;
    acquiring and saving a redo log recording all scheduling operations after said moment; and
    invoking said metadata mirror and its corresponding redo log to perform failure recovery when recovering from a failure.
  18. The method according to claim 17, wherein
    metadata mirrors of the master node at a plurality of different moments are continuously acquired and saved, and
    redo logs respectively corresponding to said plurality of different moments are continuously acquired and saved.
  19. The method according to claim 18, wherein invoking said metadata mirror and its corresponding redo log to perform failure recovery when recovering from a failure comprises:
    invoking the latest metadata mirror and its corresponding redo log to perform failure recovery when recovering from a failure; and
    when the latest metadata mirror and/or its corresponding redo log is unavailable, invoking the data of the most recent moment for which both the metadata mirror and its corresponding redo log are available to perform failure recovery.
  20. The method according to claim 17, wherein the memory state of the master node at a certain moment is directly acquired and saved as said metadata mirror.
  21. The method according to claim 17, wherein the acquisition and saving of said metadata mirror are performed when triggered by said master node and/or an external command.
  22. The method according to claim 17, wherein said master node responds to a request from said slave node only after each of its operations has been recorded in said redo log and stored.
  23. The method according to claim 17, wherein said metadata mirror is stored according to task groups.
  24. A computer program product, comprising:
    a memory; a processor; and a computer program;
    wherein the computer program is stored in the memory and is configured to be executed by the processor to perform the method according to any one of claims 17 to 23.
  25. A computer-readable storage medium, comprising a program which, when run on a computer, causes the computer to perform the method according to any one of claims 17 to 23.
PCT/CN2018/097262 2017-07-28 2018-07-26 Distributed system and fault recovery method, apparatus, product and storage medium thereof WO2019020081A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710630823.6 2017-07-28
CN201710630823.6A CN107357688B (zh) 2017-07-28 2017-07-28 分布式***及其故障恢复方法和装置

Publications (1)

Publication Number Publication Date
WO2019020081A1 true WO2019020081A1 (zh) 2019-01-31

Family

ID=60285161

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/097262 WO2019020081A1 (zh) 2017-07-28 2018-07-26 Distributed system and fault recovery method, apparatus, product and storage medium thereof

Country Status (2)

Country Link
CN (1) CN107357688B (zh)
WO (1) WO2019020081A1 (zh)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107357688B (zh) * 2017-07-28 2020-06-12 广东神马搜索科技有限公司 分布式***及其故障恢复方法和装置
CN108390771B (zh) * 2018-01-25 2021-04-16 ***股份有限公司 一种网络拓扑重建方法和装置
CN108427728A (zh) * 2018-02-13 2018-08-21 百度在线网络技术(北京)有限公司 元数据的管理方法、设备及计算机可读介质
CN109189480B (zh) * 2018-07-02 2021-11-09 新华三技术有限公司成都分公司 文件***启动方法及装置
CN109144792A (zh) * 2018-10-08 2019-01-04 郑州云海信息技术有限公司 数据恢复方法、装置及、***及计算机可读存储介质
CN109656911B (zh) * 2018-12-11 2023-08-01 江苏瑞中数据股份有限公司 分布式并行处理数据库***及其数据处理方法
CN111104226B (zh) * 2019-12-25 2024-01-26 东北大学 一种多租户服务资源的智能管理***及方法
CN112379977A (zh) * 2020-07-10 2021-02-19 中国航空工业集团公司西安飞行自动控制研究所 一种基于时间触发的任务级故障处理方法
CN111880969B (zh) * 2020-07-30 2024-06-04 上海达梦数据库有限公司 存储节点恢复方法、装置、设备和存储介质
CN115563028B (zh) * 2022-12-06 2023-03-14 苏州浪潮智能科技有限公司 一种数据缓存方法、装置、设备和存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103294701A (zh) * 2012-02-24 2013-09-11 联想(北京)有限公司 一种分布式文件***以及数据处理的方法
CN104216802A (zh) * 2014-09-25 2014-12-17 北京金山安全软件有限公司 一种内存数据库恢复方法和设备
US9053123B2 (en) * 2010-09-02 2015-06-09 Microsoft Technology Licensing, Llc Mirroring file data
CN107357688A (zh) * 2017-07-28 2017-11-17 广东神马搜索科技有限公司 分布式***及其故障恢复方法和装置

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9053123B2 (en) * 2010-09-02 2015-06-09 Microsoft Technology Licensing, Llc Mirroring file data
CN103294701A (zh) * 2012-02-24 2013-09-11 联想(北京)有限公司 一种分布式文件***以及数据处理的方法
CN104216802A (zh) * 2014-09-25 2014-12-17 北京金山安全软件有限公司 一种内存数据库恢复方法和设备
CN107357688A (zh) * 2017-07-28 2017-11-17 广东神马搜索科技有限公司 分布式***及其故障恢复方法和装置

Also Published As

Publication number Publication date
CN107357688B (zh) 2020-06-12
CN107357688A (zh) 2017-11-17

Similar Documents

Publication Publication Date Title
WO2019020081A1 (zh) 分布式***及其故障恢复方法、装置、产品和存储介质
US11809726B2 (en) Distributed storage method and device
CN105389230B (zh) 一种结合快照技术的持续数据保护***及方法
WO2019154394A1 (zh) 分布式数据库集群***、数据同步方法及存储介质
US10817478B2 (en) System and method for supporting persistent store versioning and integrity in a distributed data grid
WO2017177941A1 (zh) 主备数据库切换方法和装置
JP2021002369A (ja) 索引更新パイプライン
US8949190B2 (en) Point-in-time database recovery using log holes
CN101539873B (zh) 数据恢复的方法、数据节点及分布式文件***
US9652520B2 (en) System and method for supporting parallel asynchronous synchronization between clusters in a distributed data grid
WO2017128764A1 (zh) 基于缓存集群的缓存方法和***
US10831741B2 (en) Log-shipping data replication with early log record fetching
WO2018098972A1 (zh) 一种日志恢复方法、存储装置和存储节点
CN102158540A (zh) 分布式数据库实现***及方法
WO2021226905A1 (zh) 一种数据存储方法、***及存储介质
US9830228B1 (en) Intelligent backup model for snapshots
WO2015184925A1 (zh) 分布式文件***的数据处理方法及分布式文件***
US11500812B2 (en) Intermediate file processing method, client, server, and system
US11042454B1 (en) Restoration of a data source
CN109726211B (zh) 一种分布式时序数据库
CN113946471A (zh) 基于对象存储的分布式文件级备份方法及***
JP5154843B2 (ja) クラスタシステム、計算機、および障害回復方法
CN116389233A (zh) 容器云管理平台主备切换***、方法、装置和计算机设备
CN112650447B (zh) 一种ceph分布式块存储的备份方法、***及装置
CN103095767B (zh) 分布式缓存***及基于分布式缓存***的数据重构方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18837616

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18837616

Country of ref document: EP

Kind code of ref document: A1