WO2024041363A1 - Serverless architecture distributed fault-tolerant system, method, apparatus, device and medium - Google Patents

Serverless architecture distributed fault-tolerant system, method, apparatus, device and medium

Info

Publication number
WO2024041363A1
Authority
WO
WIPO (PCT)
Prior art keywords
unit
computing
target task
persistent storage
computing node
Application number
PCT/CN2023/111562
Other languages
English (en)
French (fr)
Inventor
马林
石海洋
徐锦来
林鹏
吴凯
刘啸
许伟
宫大伟
陈宏智
张帅
Original Assignee
抖音视界有限公司
脸萌有限公司
Application filed by 抖音视界有限公司, 脸萌有限公司
Publication of WO2024041363A1


Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 - Management of faults, events, alarms or notifications
    • H04L41/0654 - Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0668 - Management of faults, events, alarms or notifications using network fault recovery by dynamic selection of recovery network elements, e.g. replacement by the most appropriate element after failure
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 - Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 - Management of faults, events, alarms or notifications
    • H04L41/0654 - Management of faults, events, alarms or notifications using network fault recovery
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L67/01 - Protocols
    • H04L67/10 - Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 - Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Definitions

  • the present disclosure relates to the field of data processing, and in particular to a serverless architecture distributed fault-tolerant system, method, device, equipment and storage medium.
  • with the maturity of cloud, big data, container and other technologies, the serverless architecture has emerged. Under the Serverless architecture, users only need to focus on the code implementation of application logic, while the deployment and maintenance of infrastructure such as servers and the elastic expansion and contraction of computing resources are all handled by the Serverless platform. Serverless architecture distributed processing systems are typically large in scale.
  • the present disclosure provides a serverless architecture distributed fault-tolerant system, which includes:
  • a serverless architecture control module and computing nodes based on a distributed architecture, wherein:
  • there is a communication connection between the serverless architecture control module and the computing nodes based on the distributed architecture; the computing nodes based on the distributed architecture are used to receive and execute the assigned target tasks;
  • the serverless architecture control module is used to monitor the working status of the computing nodes based on the distributed architecture, and when a faulty computing node is detected, to build a replica computing node for the faulty computing node based on the persistent storage unit in the faulty computing node;
  • the replica computing node is used to replace the faulty computing node and continue to execute the target task undertaken by the faulty computing node;
  • the persistent storage unit is used to store graph data and state snapshot data corresponding to the target task, and the state snapshot data includes intermediate state data generated during the execution of the target task;
  • the replica computing node is configured to restore and continue executing the target task based on the graph data and state snapshot data corresponding to the target task stored in the persistent storage unit.
  • the serverless architecture control module is specifically configured to build an agent unit for the persistent storage unit in the faulty computing node;
  • the constructed agent unit is used to build a computing unit for the persistent storage unit in the faulty computing node, the replica computing node including the constructed computing unit and the agent unit; and to control the constructed computing unit to restore and continue executing the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit in the faulty computing node.
  • the computing node based on the distributed architecture includes an agent unit, a computing unit and a persistent storage unit, and the persistent storage unit is used to store graph data and state snapshot data corresponding to the target task,
  • the status snapshot data includes intermediate status data generated during the execution of the target task, and the system further includes:
  • a master agent unit wherein there is a communication connection between the master agent unit and the agent unit in the computing node based on the distributed architecture
  • the main agent unit is used to monitor the working status of the agent unit, and when a faulty agent unit is detected, build a replica agent unit for the faulty agent unit;
  • the replica agent unit is used to build, for the persistent storage unit corresponding to the faulty agent unit, a computing unit corresponding to the faulty agent unit, and to control the computing unit corresponding to the faulty agent unit to restore and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit corresponding to the faulty agent unit.
  • the computing node based on the distributed architecture includes an agent unit, a computing unit and a persistent storage unit, and the persistent storage unit is used to store graph data and state snapshot data corresponding to the target task,
  • the state snapshot data includes intermediate state data generated during the execution of the target task;
  • the agent unit is configured to, when a faulty computing unit is detected, create a replica computing unit for the faulty computing unit, and to control the replica computing unit, in place of the faulty computing unit, to restore and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit.
  • the constructed agent unit is also used to notify the agent units in other computing nodes, based on the communication connection between the agent units, to suspend execution of the assigned target tasks, and, when the target task is restored and continues to be executed, to notify the agent units in other computing nodes to continue executing the assigned target tasks.
  • the constructed agent unit is specifically configured to build a computing unit for the persistent storage unit in the faulty computing node, and to control the constructed computing unit to restore and continue executing the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit in the faulty computing node, as well as state snapshot data from other computing nodes.
  • the persistent storage unit adopts a hierarchical structure of memory, persistent storage media and hard disks,
  • the persistent storage unit is specifically configured to store the graph data and status snapshot data corresponding to the target task to the corresponding storage layer in descending order of priority of the three-level storage layers of memory, persistent storage media, and hard disk.
  • the persistent storage unit adopts a hierarchical structure of memory and persistent storage media
  • the persistent storage unit is specifically configured to store the graph data and status snapshot data corresponding to the target task to the corresponding storage layer in descending order of priority of the secondary storage layer of the memory and persistent storage medium.
  • the persistent storage medium includes persistent memory.
  • the computing node based on the distributed architecture includes an agent unit, a computing unit and a persistent storage unit, and the persistent storage unit is used to store graph data and state snapshot data corresponding to the target task,
  • the state snapshot data includes intermediate state data generated during the execution of the target task;
  • the system also includes a main agent unit, and there is a communication connection between the main agent unit and the agent unit in the computing node based on the distributed architecture;
  • the main agent unit is used to monitor the working status of the agent unit, and when a faulty agent unit is detected, build a replica agent unit for the faulty agent unit;
  • the replica agent unit is used to build, for the persistent storage unit corresponding to the faulty agent unit, a computing unit corresponding to the faulty agent unit, and to control the computing unit corresponding to the faulty agent unit to restore and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit corresponding to the faulty agent unit;
  • the agent unit is configured to create a replica computing unit for the faulty computing unit when a fault of a computing unit in the computing node is detected, and to control the replica computing unit to replace the faulty computing unit and to restore and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit in the computing node.
  • the present disclosure also provides a serverless architecture distributed fault tolerance method, which method includes:
  • monitoring the working status of computing nodes based on a distributed architecture; when a faulty computing node is detected, a replica computing node is constructed for the faulty computing node based on the persistent storage unit in the faulty computing node;
  • the replica computing node is used to replace the faulty computing node and continue to execute the target task assigned to the faulty node; the persistent storage unit is used to store graph data and status snapshot data corresponding to the target task, and the status snapshot data includes intermediate state data generated during the execution of the target task;
  • building a replica computing node for the faulty computing node based on the persistent storage unit in the faulty computing node includes:
  • building an agent unit for the persistent storage unit in the faulty computing node, and controlling the constructed agent unit to build a computing unit for the persistent storage unit in the faulty computing node;
  • Controlling the replica computing node to restore and continue executing the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit includes:
  • the computing unit constructed under the control of the agent unit recovers and continues to execute the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit.
  • the computing node includes an agent unit, a computing unit and a persistent storage unit.
  • the persistent storage unit is used to store graph data and status snapshot data corresponding to the target task.
  • the status snapshot data includes intermediate state data generated during the execution of the target task, and the method further includes:
  • using a master agent unit to monitor the working status of the agent units in each computing node and, when a faulty agent unit is detected, building a replica agent unit for the faulty agent unit; controlling the replica agent unit to build, for the persistent storage unit corresponding to the faulty agent unit, a computing unit corresponding to the faulty agent unit; and controlling the constructed computing unit to restore and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit corresponding to the faulty agent unit.
  • the computing node includes an agent unit, a computing unit and a persistent storage unit.
  • the persistent storage unit is used to store graph data and status snapshot data corresponding to the target task.
  • the status snapshot data includes intermediate state data generated during the execution of the target task, and the method further includes:
  • when the agent unit in a computing node detects that a computing unit in the computing node has failed, a replica computing unit is created for the faulty computing unit;
  • the replica computing unit is controlled to replace the failed computing unit, and based on the state snapshot data and graph data of the target task stored in the persistent storage unit in the computing node, restore and continue executing the target task.
  • the method further includes:
  • based on the communication connection between the agent units, the agent unit is used to notify other agent units to suspend execution of the target task;
  • when it is detected that the target task is restored and continues to be executed, the agent units in other computing nodes are notified to continue executing the assigned target tasks.
  • using the agent unit to control the constructed computing unit to restore and continue executing the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit includes:
  • using the agent unit to control the constructed computing unit to restore and continue executing the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit, as well as state snapshot data from other computing nodes.
  • the method further includes:
  • the master agent unit is used to monitor the working status of the agent units in each computing node, and when a faulty agent unit is detected, a replica agent unit is constructed for the faulty agent unit; the replica agent unit is controlled to build, for the persistent storage unit corresponding to the faulty agent unit, the computing unit corresponding to the faulty agent unit, and the computing unit corresponding to the faulty agent unit is controlled to recover and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit corresponding to the faulty agent unit;
  • when the agent unit in a computing node detects that a computing unit in the computing node has failed, a replica computing unit is created for the faulty computing unit; the replica computing unit is controlled to replace the faulty computing unit and to restore and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit corresponding to the faulty computing unit.
  • the present disclosure also provides a serverless architecture distributed fault-tolerant device, which includes:
  • the first building module is used to monitor the working status of computing nodes based on a distributed architecture;
  • when a faulty computing node is detected, a replica computing node is built for the faulty computing node based on the persistent storage unit in the faulty computing node;
  • the replica computing node is used to replace the faulty computing node and continue to execute the target task assigned to the faulty node;
  • the persistent storage unit is used to store the graph data and status snapshot data corresponding to the target task;
  • the status snapshot data includes intermediate status data generated during the execution of the target task;
  • the first recovery execution module is used to control the replica computing node to recover and continue to execute the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit.
  • the present disclosure provides a computer-readable storage medium in which instructions are stored; when the instructions are run on a terminal device, the terminal device is caused to implement the above method.
  • the present disclosure provides a data processing device, including a memory, a processor, and a computer program stored on the memory and runnable on the processor;
  • when the processor executes the computer program, the serverless architecture distributed fault-tolerance method of any of the above embodiments is implemented.
  • the present disclosure provides a computer program product, which includes a computer program/instructions that, when executed by a processor, implements the serverless architecture distributed fault-tolerance method of any of the above embodiments.
  • the present disclosure provides a computer program that includes instructions that, when executed by a processor, cause the processor to implement the serverless architecture distributed fault-tolerance method described in any of the foregoing embodiments.
  • Figure 1 is an architectural diagram of a serverless architecture distributed fault-tolerant system provided by an embodiment of the present disclosure
  • Figure 2 is an architectural diagram of another serverless architecture distributed fault-tolerant system provided by an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of yet another serverless architecture distributed fault-tolerant system provided by an embodiment of the present disclosure
  • FIG. 4 is a schematic diagram of another serverless architecture distributed fault-tolerant system provided by an embodiment of the present disclosure.
  • Figure 5 is a flow chart of a serverless architecture distributed fault tolerance method provided by an embodiment of the present disclosure
  • Figure 6 is a schematic structural diagram of a serverless architecture distributed fault-tolerant device provided by an embodiment of the present disclosure
  • Figure 7 is a schematic structural diagram of a serverless architecture distributed fault-tolerant device provided by an embodiment of the present disclosure.
  • Serverless architecture is a software design approach that allows developers to build and run services without managing the underlying architecture.
  • when an event that triggers execution of the function code is detected, the serverless system allocates multiple computing nodes to it to execute tasks related to the function code; users do not need to care about computing-resource issues, which makes development more convenient, reduces the development burden on users and brings them a better experience.
  • the serverless architecture is used in many fields such as general big data processing and distributed machine learning; for example, it can be applied to graph computing and graph mining, where computing tasks and mining tasks over large-scale graphs with billions or even trillions of edges need to be distributed across a large number of computing nodes.
  • the present disclosure provides a serverless architecture distributed fault-tolerant system, which can be deployed in a physical cluster or in a cloud environment; it takes the fault-tolerance function as a basic capability to reduce the impact of node failures on the execution progress of tasks.
  • embodiments of the present disclosure provide a serverless architecture distributed fault-tolerant system.
  • Figure 1 is an architectural diagram of a serverless architecture distributed fault-tolerant system provided by some embodiments of the present disclosure.
  • the serverless architecture distributed fault-tolerant system 100 includes a serverless architecture control module 101 and computing nodes based on a distributed architecture, taking computing node 102 and computing node 104 as an example.
  • the serverless architecture control module 101 has a communication connection with each computing node based on a distributed architecture, and the computing nodes based on a distributed architecture are used to receive and execute assigned target tasks.
  • the serverless architecture control module 101 is used to monitor the working status of each computing node based on the distributed architecture.
  • when a faulty computing node is detected (assume the faulty computing node is computing node 102), a replica computing node 103 is built for the faulty computing node 102 based on the persistent storage unit 1021 in the faulty computing node 102;
  • the replica computing node 103 is used to replace the faulty computing node 102 and continue to execute the target task undertaken by the faulty computing node 102;
  • the persistent storage unit 1021 is used to store graph data and state snapshot data corresponding to the target task, where the state snapshot data includes intermediate state data generated during the execution of the target task;
  • the replica computing node 103 is configured to restore and continue executing the target task based on the graph data and state snapshot data corresponding to the target task stored in the persistent storage unit 1021 .
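To make the monitoring-and-rebuild flow above easier to follow, here is a minimal, purely illustrative Python sketch. The names (`ControlModule`, `ComputeNode`, `PersistentStore` and their methods) are hypothetical and not taken from the patent; the sketch only shows the idea that a replica node is bound to the failed node's surviving persistent storage unit and resumes from the latest state snapshot.

```python
from dataclasses import dataclass, field

@dataclass
class PersistentStore:            # survives the node failure (e.g. PMEM/SSD backed)
    graph_data: dict = field(default_factory=dict)
    snapshots: list = field(default_factory=list)   # intermediate-state checkpoints

@dataclass
class ComputeNode:
    node_id: str
    store: PersistentStore
    healthy: bool = True
    progress: int = 0             # e.g. last completed superstep

    def restore(self) -> int:
        """Rebuild in-memory state from graph data plus the latest snapshot."""
        self.progress = max(self.store.snapshots, default=0)
        return self.progress

class ControlModule:
    """Hypothetical serverless control module: monitors nodes, builds replicas."""
    def __init__(self, nodes):
        self.nodes = {n.node_id: n for n in nodes}

    def monitor_once(self):
        for node_id, node in list(self.nodes.items()):
            if not node.healthy:
                replica = ComputeNode(node_id + "-replica", node.store)  # reuse storage
                replica.restore()                                        # resume from snapshot
                self.nodes[node_id] = replica

# usage: node "n1" fails after checkpointing superstep 7; its replica resumes there
store = PersistentStore(graph_data={"edges": [(0, 1)]}, snapshots=[3, 7])
n1 = ComputeNode("n1", store, healthy=False)
cm = ControlModule([n1])
cm.monitor_once()
print(cm.nodes["n1"].progress)   # -> 7
```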
  • computing nodes based on distributed architecture can be implemented based on container Pods.
  • Each computing node includes an agent unit, a computing unit and a persistence storage unit.
  • the agent unit, computing unit and persistent storage unit in the same computing node are all deployed in the same Pod.
  • in addition, the same Pod can include one or more computing units, and each computing unit corresponds to a process.
  • the agent unit in the Pod is used to control one or more computing units.
  • the persistent storage unit can be implemented based on a persistent storage medium (such as Optane Persistent Memory, PMEM), based on a hard disk (such as a Solid State Drive, SSD), or based on a hybrid storage design of memory, persistent storage medium and hard disk, so that data is stored persistently and is not affected by power outages or other faults.
  • a computing unit, also called a worker, refers to an application-coupled entity that holds computing resources and executes the assigned computing task load.
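As a rough illustration of the "one Pod = one agent unit plus one or more computing-unit processes" layout, the sketch below spawns each computing unit as a separate process under the control of a single agent object. All names and the shard contents are illustrative assumptions, not the patent's implementation.

```python
import multiprocessing as mp

def worker_main(worker_id: int, shard: list) -> None:
    # A computing unit ("worker"): holds its task shard and does the actual work.
    total = sum(shard)
    print(f"worker {worker_id} processed shard -> {total}")

class AgentUnit:
    """One agent per Pod; it starts, tracks and joins its computing units."""
    def __init__(self, shards):
        self.shards = shards
        self.procs = []

    def start(self):
        for i, shard in enumerate(self.shards):
            p = mp.Process(target=worker_main, args=(i, shard))
            p.start()
            self.procs.append(p)

    def join(self):
        for p in self.procs:
            p.join()

if __name__ == "__main__":
    agent = AgentUnit(shards=[[1, 2, 3], [4, 5, 6]])  # two computing units in one Pod
    agent.start()
    agent.join()
```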
  • the persistent storage unit adopts a hierarchical structure of memory, persistent storage media, and hard disks.
  • the persistent storage unit is specifically used to store the graph data and status snapshot data corresponding to the target task to the corresponding storage layer in descending order of priority of the three storage layers of memory, persistent storage medium and hard disk.
  • the graph data and state snapshot data corresponding to the target task stored in the persistent storage unit of each computing node may be different.
  • the persistent storage unit adopts a hierarchical structure of memory and persistent storage media.
  • the persistent storage unit is specifically configured to store the graph data and status snapshot data corresponding to the target task to the corresponding storage layer in descending order of priority of the secondary storage layer of the memory and the persistent storage medium.
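The priority-ordered tiering can be pictured as a write path that tries the fastest layer first and falls back when a layer is full. This is a minimal sketch under assumed capacities; a real unit would target DRAM, PMEM and SSD rather than plain Python dicts, and the class and method names are hypothetical.

```python
class TieredStore:
    """Write to memory first, then the persistent-memory tier, then the hard-disk tier."""
    def __init__(self, mem_capacity: int, pmem_capacity: int):
        # (name, capacity, backing dict) ordered by descending priority
        self.tiers = [
            ("memory", mem_capacity, {}),
            ("pmem",   pmem_capacity, {}),
            ("disk",   float("inf"), {}),   # disk treated as effectively unbounded here
        ]

    def put(self, key: str, value: bytes) -> str:
        for name, capacity, data in self.tiers:
            if len(data) < capacity:
                data[key] = value
                return name                  # which tier actually absorbed the write
        raise RuntimeError("no tier accepted the write")

    def get(self, key: str) -> bytes:
        for _, _, data in self.tiers:
            if key in data:
                return data[key]
        raise KeyError(key)

store = TieredStore(mem_capacity=1, pmem_capacity=2)
print(store.put("graph_shard_0", b"..."))   # -> memory
print(store.put("snapshot_0", b"..."))      # memory full -> pmem
print(store.get("graph_shard_0"))
```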
  • Embodiments of the present disclosure can perform fault tolerance processing for faults of computing nodes implemented based on containers, achieve container-level fault tolerance at the resource scheduling level, and reduce the impact on task execution progress due to computing node failures.
  • in order to facilitate further understanding of the above-mentioned serverless architecture distributed fault-tolerant system, some embodiments of the present disclosure provide an architectural diagram of another serverless architecture distributed fault-tolerant system, as shown in Figure 2.
  • the serverless architecture distributed system 200 includes a serverless architecture control module 201 and computing nodes based on a distributed architecture, taking computing nodes 202 and 204 as an example.
  • the computing node 202 may include an agent unit 2021, a computing unit 2022, and a persistent storage unit 2023 with corresponding relationships.
  • the serverless architecture control module 201 is specifically used to monitor the working status of the computing nodes based on the distributed architecture; when a faulty computing node (assumed to be the computing node 202) is detected, a new agent unit 2031 (which may be called a first replica agent unit) is built for the persistent storage unit 2023 in the faulty computing node 202.
  • the agent unit 2031 is used to build a new computing unit 2032 (which can be called the first replica computing unit) for the persistent storage unit 2023 in the faulty computing node 202.
  • the agent unit 2031 controls the constructed computing unit 2032 to restore and continue executing the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit 2023 of the failed computing node 202.
  • the constructed agent unit 2031 and computing unit 2032, together with the persistent storage unit 2023 in the failed computing node 202, are all located in the same computing node, that is, the replica computing node 203.
  • the replica computing node 203 is used to replace the faulty computing node 202 and continue to execute the target tasks undertaken by the faulty computing node 202.
  • the agent unit, computing unit and persistent storage unit in the same computing node have the same index identifier, and the agent unit, computing unit and persistent storage unit with the index identifier can be located through the index identifier.
  • the agent unit, computing unit and persistent storage unit in the same computing node have the same index identifier 1, and the agent unit, computing unit and persistent storage unit with the index identifier 1 can be located through the index identifier 1.
  • the corresponding relationship between each agent unit, computing unit, and persistent storage unit and the index identifier can be set and stored in advance.
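One way to read the "same index identifier" correspondence is as a small lookup table keyed by the index, mapping it to the agent unit, computing unit and persistent storage unit that belong to one computing node. The identifiers in the sketch below are invented purely for illustration.

```python
from typing import Dict, NamedTuple

class NodeMembers(NamedTuple):
    agent_unit: str
    compute_unit: str
    storage_unit: str

# index identifier -> units that belong to the same computing node
index_registry: Dict[int, NodeMembers] = {
    1: NodeMembers("agent-1", "worker-1", "store-1"),
    2: NodeMembers("agent-2", "worker-2", "store-2"),
}

def locate(index_id: int) -> NodeMembers:
    """Locate the agent/compute/storage units sharing one index identifier."""
    return index_registry[index_id]

print(locate(1).storage_unit)   # -> store-1
```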
  • each computing unit is mounted with a persistent storage unit.
  • the computing unit reads the data shards corresponding to the target task from a remote file system (such as the Hadoop distributed file system, HDFS) in advance, and stores the data shards corresponding to the target task in the persistent storage unit for subsequent execution of the target task.
  • the embodiments of the present disclosure can minimize resource competition by mounting a separate persistent storage unit for each computing unit instead of sharing the persistent storage unit among the computing units.
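The pre-loading step could be sketched as follows. The `read_remote_shard` helper stands in for a real HDFS client call (for example through a library such as pyarrow) and is hypothetical, as is the local mount path used for the worker's persistent storage unit.

```python
import json
from pathlib import Path

def read_remote_shard(shard_id: int) -> dict:
    # Placeholder for a real HDFS read; returns a small fake edge list instead.
    return {"shard": shard_id, "edges": [(shard_id, shard_id + 1)]}

def preload_shards(mount_dir: str, shard_ids: list) -> None:
    """Copy each worker's graph shards from the remote file system into the
    persistent storage unit mounted for that worker, before execution starts."""
    mount = Path(mount_dir)
    mount.mkdir(parents=True, exist_ok=True)
    for sid in shard_ids:
        shard = read_remote_shard(sid)
        (mount / f"shard_{sid}.json").write_text(json.dumps(shard))

preload_shards("/tmp/worker-0-store", shard_ids=[0, 1])
print(sorted(p.name for p in Path("/tmp/worker-0-store").iterdir()))
```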
  • the agent unit can control the corresponding computing unit to execute the target task. Specifically, it controls the computing unit to obtain the graph data corresponding to the target task from the persistent storage unit and execute the target task.
  • the agent unit controls the computing unit to periodically write the status snapshot data generated during the execution of the target task into the corresponding persistent storage unit, so as to achieve persistent storage of the status snapshot data; in this way, when a fault is detected later, the execution status of the target task can be restored based on the status snapshot data of the target task.
  • the status snapshot data can also be called checkpoint data, which is used to record the execution status data of the target task. Based on the execution status data, a certain transient state of the target task can be restored.
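A periodic checkpoint loop of the kind described above could look like the following sketch. The interval, file layout and the `compute_step` function are illustrative assumptions, not the patent's actual mechanism; the point is only that intermediate state is written durably often enough for a replica to resume from the newest snapshot.

```python
import json
from pathlib import Path

def compute_step(state: dict) -> dict:
    # One unit of work on the target task; here just a counter for illustration.
    state["superstep"] += 1
    return state

def run_with_checkpoints(store_dir: str, steps: int, every: int) -> dict:
    """Execute the task, writing a state snapshot (checkpoint) to the persistent
    store every `every` steps; on restart, resume from the newest snapshot."""
    store = Path(store_dir)
    store.mkdir(parents=True, exist_ok=True)
    snapshots = sorted(store.glob("snapshot_*.json"))
    state = json.loads(snapshots[-1].read_text()) if snapshots else {"superstep": 0}
    while state["superstep"] < steps:
        state = compute_step(state)
        if state["superstep"] % every == 0:
            path = store / f"snapshot_{state['superstep']:06d}.json"
            path.write_text(json.dumps(state))   # durable intermediate state
    return state

print(run_with_checkpoints("/tmp/ckpt-demo", steps=10, every=3))
```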
  • a computing node based on a distributed architecture includes an agent unit, a computing unit, and a persistent storage unit.
  • the persistent storage unit is used to store graph data and state snapshot data corresponding to the target task.
  • the status snapshot data includes intermediate status data generated during the execution of the target task.
  • each agent unit can synchronize each other's calculation result data based on negotiation communication and jointly perform target tasks.
  • the serverless architecture control module determines the persistent storage unit included in the faulty computing node, and builds a new computing node (that is, a replica computing node) based on that persistent storage unit, which is used to replace the failed computing node and continue to perform the target task.
  • the agent unit in the new computing node can notify other agent units to suspend execution of the target task based on the communication connection between the agent units, and instruct other agent units to synchronize the status snapshot data of the target task to the new computing node, so that the new computing node can replace the failed computing node and resume executing the target task.
  • the agent unit in the new computing node is specifically used to restore and continue executing the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit in the computing node, as well as the state snapshot data from other computing nodes.
  • when the new agent unit determines that execution of the target task has been restored to the latest state, it can also notify other agent units to continue executing the target task based on the communication connection between the agent units.
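The pause / collect-snapshots / resume exchange between the recovering agent and its peers can be summarized as below. This is a schematic, in-process stand-in: in a real system the calls would be RPCs between agent units, and the class names, method names and the naive merge policy are all assumptions.

```python
class PeerAgent:
    """A healthy agent in another computing node, seen from the recovering agent."""
    def __init__(self, name: str, snapshot: dict):
        self.name = name
        self.snapshot = snapshot
        self.paused = False

    def pause(self):  self.paused = True
    def resume(self): self.paused = False
    def share_snapshot(self) -> dict:
        return self.snapshot

def recover_with_peers(local_snapshot: dict, peers: list) -> dict:
    """Recovering agent: pause peers, pull their snapshots, rebuild the latest
    state, then let every peer continue the assigned target task."""
    for p in peers:
        p.pause()
    merged = dict(local_snapshot)
    for p in peers:
        merged.update(p.share_snapshot())    # simple merge, for illustration only
    for p in peers:
        p.resume()
    return merged

peers = [PeerAgent("agent-2", {"partition_2": 7}), PeerAgent("agent-3", {"partition_3": 7})]
print(recover_with_peers({"partition_1": 5}, peers))
```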
  • the serverless architecture distributed fault-tolerant system also includes a main agent unit. There is a communication connection between the main agent unit and each agent unit.
  • the main agent unit is used to maintain the status of each agent unit and to store the index information of each agent unit.
  • the serverless architecture distributed fault-tolerant system 300 includes a serverless architecture control module 304, a main agent unit 301, a state storage module 302 and a computing node 303; there is a communication connection between the main agent unit 301 and the agent units in each computing node.
  • the main agent unit 301 uses the state storage module 302 to store the index information of each agent unit.
  • the main agent unit and other agent units can communicate based on negotiation to achieve data synchronization, etc.
  • the main agent unit is used to monitor the working status of the agent units in each computing node, and when a faulty agent unit is detected (assume it is the agent unit 3031 in the computing node 303), a replica agent unit 3034 (which may be called a second replica agent unit) is built for the faulty agent unit 3031.
  • the replica agent unit 3034 is used to build a new computing unit 3035 (which may be called a second replica computing unit) for the persistent storage unit 3033 corresponding to the faulty agent unit 3031, and the replica agent unit 3034 controls the computing unit 3035 to restore and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit 3033.
  • the failed agent unit 3031 may be an agent unit in any computing node. Since the agent unit 3031 fails, its corresponding computing unit 3032 will be recycled. At this time, it is necessary to create a new agent unit (such as the agent unit 3034) and then create a new computing unit 3035 from the new agent unit 3034.
  • after detecting a faulty agent unit, the master agent unit 301 notifies the agent units in each computing node to suspend execution of the assigned target tasks based on the communication connection with each agent unit, and when the target task is restored and continues to be executed, the agent units in other computing nodes are notified to continue executing the assigned target task.
  • the index correspondence between computing nodes, agent units, computing units and persistent storage units having the same index identifier can be stored in advance at each agent unit, or, in order to save the storage resources of each agent unit, stored only in the state storage module 302 of the main agent unit 301; based on the negotiation communication between the agent units, the required index information can be obtained on demand from the state storage module 302 of the main agent unit 301.
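The role of the state storage module can be pictured as a tiny key-value service owned by the master agent, which ordinary agents query on demand instead of caching the whole index table locally. The names and the shape of the query are illustrative assumptions.

```python
class StateStorageModule:
    """Holds the index information for every agent/compute/storage unit."""
    def __init__(self):
        self._index = {}   # index id -> {"agent": ..., "worker": ..., "store": ...}

    def register(self, index_id, agent, worker, store):
        self._index[index_id] = {"agent": agent, "worker": worker, "store": store}

    def lookup(self, index_id):
        return self._index[index_id]

class MasterAgent:
    """Monitors agents and answers index queries from its state storage module."""
    def __init__(self):
        self.state = StateStorageModule()

    def query_index(self, index_id):
        # Ordinary agents call this on demand instead of storing the table themselves.
        return self.state.lookup(index_id)

master = MasterAgent()
master.state.register(3031, agent="agent-3031", worker="worker-3032", store="store-3033")
print(master.query_index(3031)["store"])   # -> store-3033
```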
  • the serverless architecture distributed fault-tolerant system provided by the embodiments of the present disclosure is an elastic fault-tolerant framework based on agent units; it can support not only fault tolerance at the computing node level but also fault tolerance at the agent unit level, further reducing the probability that a fault affects task execution progress in the serverless architecture distributed fault-tolerant system.
  • some embodiments of the present disclosure can also support fault tolerance at the computing unit level.
  • some embodiments of the present disclosure provide a schematic diagram of yet another serverless architecture distributed fault-tolerant system, as shown in Figure 4.
  • the serverless architecture distributed fault-tolerant system 400 includes computing nodes based on a distributed architecture.
  • the following takes computing nodes 401 and 402 as examples.
  • the agent unit 4011 in the computing node 401 is used to monitor the working status of the computing unit 4012 in the computing node 401, and when a failure of the computing unit 4012 is detected, create a replica computing unit 4014 (which may be called a third replica computing unit) for the failed computing unit 4012; the agent unit 4011 controls the replica computing unit 4014 to restore and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit 4013 corresponding to the faulty computing unit 4012.
  • when the agent unit 4011 determines that its corresponding computing unit 4012 has failed, it obtains the index information stored in the state storage module of the main agent unit based on the communication connection between the agent units, determines the persistent storage unit with the same index identifier as the agent unit 4011, that is, the persistent storage unit 4013, and creates a new computing unit 4014 for the persistent storage unit 4013 to resume execution of the target task.
  • the agent unit 4011 can be an agent unit in any computing node.
  • when the agent unit 4011 determines that its corresponding computing unit 4012 has failed, it notifies the agent units in other computing nodes (such as the agent unit 4021 shown in Figure 4), based on the communication connections between the agent units, to suspend execution of the assigned target task, and when the target task is restored and continues to be executed, the agent units in other computing nodes are notified to continue executing the assigned target task.
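Putting the computing-unit-level path together, a minimal sketch of an agent's reaction to a dead worker might look like this. All names are hypothetical, and failure detection is reduced to a boolean flag rather than a real liveness probe.

```python
class Worker:
    def __init__(self, store: dict):
        self.store = store          # mounted persistent storage unit
        self.alive = True
        self.state = dict(store.get("snapshot", {}))

class Agent:
    """Watches its worker; a dead worker is replaced by a replica bound to the
    same persistent storage unit, which then resumes from the stored snapshot."""
    def __init__(self, worker: Worker, peers: list):
        self.worker = worker
        self.peers = peers          # callables standing in for other agents

    def check_and_recover(self) -> Worker:
        if self.worker.alive:
            return self.worker
        for notify in self.peers:
            notify("pause")
        replica = Worker(self.worker.store)   # same store; a fresh process in reality
        self.worker = replica
        for notify in self.peers:
            notify("resume")
        return replica

store = {"snapshot": {"superstep": 4}, "graph": [(0, 1), (1, 2)]}
failed = Worker(store)
failed.alive = False
agent = Agent(failed, peers=[lambda msg: print("peer told to", msg)])
print(agent.check_and_recover().state)   # -> {'superstep': 4}
```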
  • the serverless architecture distributed fault-tolerant system provided by the embodiments of the present disclosure can support fault tolerance at the distributed computing node level, the agent unit level and the computing unit level, reducing the impact on task execution progress caused by failures of computing nodes, agent units, computing units and the like.
  • in addition, the serverless architecture distributed fault-tolerant system provided by the embodiments of the present disclosure handles failures of container-based computing nodes, agent units and computing units separately, achieving container-level fault tolerance at the resource scheduling level, agent-level fault tolerance at the distributed control plane, and worker-level fault tolerance for the computing units of the application itself.
  • the multi-dimensional fault tolerance forms an end-to-end fault tolerance guarantee, thereby ensuring the end-to-end fault tolerance experience of graph data processing applications and reducing the occurrence of application error alarms and other situations in the entire loop of users using graph data processing applications.
  • embodiments of the present disclosure also provide a serverless architecture distributed fault-tolerant method. Refer to Figure 5, which is a flowchart of a serverless architecture distributed fault-tolerant method provided by some embodiments of the present disclosure. The method includes steps S501 to S502.
  • in step S501, the working status of the computing nodes based on the distributed architecture is monitored.
  • a replica computing node is constructed for the faulty computing node based on the persistent storage unit in the faulty computing node.
  • the replica computing node is used to replace the faulty computing node and continue to execute the target task assigned by the faulty node.
  • the persistent storage unit is used to store the graph data and status snapshot data corresponding to the target task.
  • the status snapshot data includes intermediate state data generated during the execution of the target task.
  • the serverless architecture distributed fault-tolerant method provided by the embodiments of the present disclosure can be applied to the above-mentioned serverless architecture distributed fault-tolerant system.
  • the serverless architecture distributed fault-tolerant system includes a serverless architecture control module and computing nodes based on a distributed architecture corresponding to the target task.
  • the computing node based on the distributed architecture in the embodiment of the present disclosure includes an agent unit, a computing unit and a persistent storage unit.
  • the persistent storage unit is used to store graph data and status snapshot data corresponding to the target task.
  • the status snapshot data includes intermediate status data generated during the execution of the target task.
  • the persistent storage unit adopts a hierarchical structure of memory, persistent storage media, and hard disks.
  • the persistent storage unit is specifically configured to store the graph data and status snapshot data corresponding to the target task to the corresponding storage layer in descending order of priority of the three-level storage layers of memory, persistent storage medium, and hard disk.
  • the persistent storage unit may also adopt a hierarchical structure of memory and persistent storage media; specifically, the persistent storage unit is configured to store the graph data and status snapshot data corresponding to the target task to the corresponding storage layer in descending order of priority of the two storage layers of memory and persistent storage medium.
  • the persistent storage medium in the embodiment of the present disclosure may include persistent memory PMEM or the like.
  • building a replica computing node for the faulty computing node based on the persistent storage unit in the faulty computing node may specifically include: first building an agent unit (which may be called a first replica agent unit) for the persistent storage unit in the faulty computing node, and then controlling that agent unit to build a computing unit (which may be called a first replica computing unit) for the persistent storage unit in the faulty computing node, thereby constructing a new computing node for the failed computing node.
  • the new computing unit constructed under the control of the agent unit can be used to restore and continue executing the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit in the failed computing node.
  • the serverless architecture control module can determine the persistent storage unit corresponding to the faulty computing node based on the pre-stored index relationship between the computing node, the agent unit, the computing unit and the persistent storage unit.
  • the index relationship between computing nodes, agent units, computing units and persistent storage units can be stored in each agent unit in advance, or in the main agent unit in a serverless architecture distributed fault-tolerant system.
  • the serverless architecture control module can communicate with each agent unit to determine the persistent storage unit corresponding to the failed computing node.
  • in order to avoid the system overhead of moving and re-storing the graph data, the serverless architecture control module can keep the graph data stored in the persistent storage unit unchanged, and instead re-create a new computing node, that is, the replica computing node of the failed computing node, for processing the graph data in that persistent storage unit; the replica computing node can subsequently process the graph data in the persistent storage unit to resume execution of the target task.
  • the master agent unit can be used to monitor the working status of the agent units in each computing node.
  • when a faulty agent unit is detected, a replica agent unit (which may be called a second replica agent unit) is constructed for the faulty agent unit; the replica agent unit is then controlled to build the computing unit corresponding to the faulty agent unit (which may be called a second replica computing unit) for the persistent storage unit corresponding to the faulty agent unit, and the constructed computing unit corresponding to the faulty agent unit is controlled to recover and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit corresponding to the faulty agent unit.
  • the embodiment of the present disclosure can realize the fault tolerance function at the agent unit level and reduce the impact on the execution progress of the target task.
  • the agent unit in the same computing node is used to monitor the working status of the computing unit in the computing node.
  • when the agent unit detects a failure of a computing unit, it creates a replica computing unit (which may be called a third replica computing unit) for the failed computing unit, and then controls the replica computing unit to replace the failed computing unit and to restore and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit in the same computing node.
  • in step S502, the replica computing node is controlled to restore and continue executing the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit.
  • the calculation result data and status snapshot data generated by each computing node during the execution of the target task can be synchronized to other computing nodes according to the preset period, so that the various computing nodes can collaborate to complete the target task.
  • a new computing unit constructed under the control of the agent unit can restore and continue executing the target task based on the state snapshot data and graph data corresponding to the target task stored in its corresponding persistent storage unit, as well as state snapshot data from other computing nodes.
  • during recovery, the agent unit corresponding to the faulty computing unit may notify other agent units to suspend execution of the target task based on the communication connection between the agent units.
  • when the target task is restored and continues to be executed, the agent units in other computing nodes may also be notified to continue executing the assigned target tasks.
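When the replica combines its local checkpoint with checkpoints pulled from other nodes, one simple policy is to roll every partition back to the highest checkpoint epoch that all of them have completed. The helper below sketches that policy; the epoch-per-partition representation is an assumption for illustration, not the patent's recovery rule.

```python
def latest_common_epoch(local_epochs: dict, peer_epochs: list) -> int:
    """Return the newest checkpoint epoch available on every partition, i.e. the
    most recent globally consistent point the recovered task can resume from."""
    candidates = [max(local_epochs.values())]
    for peer in peer_epochs:
        candidates.append(max(peer.values()))
    return min(candidates)

local = {"partition_0": 7}                        # snapshots kept by the replica's store
peers = [{"partition_1": 9}, {"partition_2": 7}]  # snapshots reported by other nodes
print(latest_common_epoch(local, peers))          # -> 7: every partition reached epoch 7
```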
  • the serverless architecture distributed fault-tolerance method provided by the embodiments of the present disclosure can support fault-tolerance functions from the distributed computing node level, agent unit level, and computing unit level respectively, and reduce the number of faults caused by failures of computing nodes, agent units, computing units, etc. Impact on task execution progress.
  • the present disclosure also provides a serverless architecture distributed fault-tolerant device.
  • Figure 6 is a schematic structural diagram of a serverless architecture distributed fault-tolerant device provided by an embodiment of the present disclosure.
  • the device includes:
  • the first building module 601 is used to monitor the working status of computing nodes based on a distributed architecture and, when a faulty computing node is detected, to build a replica computing node for the faulty computing node based on the persistent storage unit in the faulty computing node; the replica computing node is used to replace the faulty computing node and continue to execute the target task assigned to the faulty node; the persistent storage unit is used to store the graph data and status snapshot data corresponding to the target task, and the status snapshot data includes intermediate state data generated during the execution of the target task;
  • the first recovery execution module 602 is used to control the replica computing node to recover and continue to execute the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit.
  • the first building module 601 includes:
  • the first construction sub-module is used to build a proxy unit (which may be called a first replica proxy unit) for the persistent storage unit in the faulty computing node;
  • a control submodule configured to control the constructed agent unit to construct a computing unit (which may be called a first replica computing unit) for the persistent storage unit in the faulty computing node;
  • the recovery execution module is specifically used for:
  • the computing unit constructed under the control of the agent unit recovers and continues to execute the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit.
  • the computing node includes an agent unit, a computing unit and a persistent storage unit.
  • the persistent storage unit is used to store graph data and status snapshot data corresponding to the target task.
  • the status snapshot data includes intermediate state data generated during the execution of the target task; the device further includes:
  • the second building module is used to use the master agent unit to monitor the working status of the agent units in each computing node; when a faulty agent unit is detected, a replica agent unit (which may be called a second replica agent unit) is constructed for the faulty agent unit;
  • the second recovery execution module is used to control the replica agent unit to build the computing unit corresponding to the faulty agent unit (which may be called a second replica computing unit) for the persistent storage unit corresponding to the faulty agent unit, and to control the constructed computing unit corresponding to the faulty agent unit to recover and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit corresponding to the faulty agent unit.
  • the computing node includes an agent unit, a computing unit and a persistent storage unit.
  • the persistent storage unit is used to store graph data and status snapshot data corresponding to the target task.
  • the status snapshot data includes intermediate state data generated during the execution of the target task; the device further includes:
  • a third building module configured to build a replica computing unit for the failed computing unit when the agent unit detects that the computing unit has failed
  • the third recovery execution module is used to control the replica computing unit to replace the failed computing unit, and to restore and continue executing the target task based on the status snapshot data and graph data of the target task stored in the persistent storage unit.
  • the device further includes:
  • a notification module configured to use the agent unit to notify other agent units to suspend execution of the target task based on the communication connection between the agent units; and, when it is detected that the target task is restored and continues to be executed, to notify the agent units in other computing nodes to continue executing the assigned target task.
  • the first recovery execution module is specifically used to:
  • the computing unit constructed under the control of the agent unit recovers and continues to execute the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit, as well as state snapshot data from other computing nodes.
  • the persistent storage unit adopts a hierarchical structure of memory, persistent storage media and hard disk;
  • the persistent storage unit is specifically configured to store the graph data and status snapshot data corresponding to the target task to the corresponding storage layer in descending order of priority of the three-level storage layers of memory, persistent storage media and hard disk.
  • the persistent storage unit adopts a hierarchical structure of memory and persistent storage media
  • the persistent storage unit is specifically configured to store the graph data and status snapshot data corresponding to the target task to the corresponding storage layer in descending order of priority of the secondary storage layer of the memory and persistent storage medium.
  • the persistent storage medium includes persistent memory.
  • the serverless architecture distributed fault-tolerant device provided by the embodiments of the present disclosure can support fault-tolerant functions from the distributed computing node level, the agent unit level, and the computing unit level respectively, and reduce the number of faults caused by failures of computing nodes, agent units, computing units, etc. Impact on task execution progress.
  • embodiments of the present disclosure also provide a computer-readable storage medium; instructions are stored in the computer-readable storage medium, and when the instructions are run on a terminal device, the terminal device is caused to implement the serverless architecture distributed fault-tolerant method described in the embodiments of the present disclosure.
  • An embodiment of the present disclosure also provides a computer program product.
  • the computer program product includes a computer program/instructions; when the computer program/instructions are executed by a processor, the serverless architecture distributed fault-tolerant method described in the embodiments of the present disclosure is implemented.
  • embodiments of the present disclosure also provide a serverless architecture distributed fault-tolerant device, as shown in Figure 7, which may include:
  • the number of processors 701 in the serverless architecture distributed fault-tolerant device can be one or more. In Figure 7, one processor is taken as an example.
  • the processor 701, the memory 702, the input device 703, and the output device 704 may be connected through a bus or other means, wherein the connection through the bus is taken as an example in FIG. 7 .
  • the memory 702 can be used to store software programs and modules.
  • the processor 701 executes various functional applications and data processing of the serverless architecture distributed fault-tolerant device by running the software programs and modules stored in the memory 702 .
  • the memory 702 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, at least one application program required for a function, and the like.
  • memory 702 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.
  • the input device 703 may be used to receive input numeric or character information and generate signal input related to user settings and functional control of the serverless architecture distributed fault-tolerant device.
  • the processor 701 loads the executable files corresponding to the processes of one or more application programs into the memory 702, and the processor 701 runs the application programs stored in the memory 702 to implement the various functions of the above serverless architecture distributed fault-tolerant device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Hardware Redundancy (AREA)
  • Retry When Errors Occur (AREA)

Abstract

The present disclosure provides a serverless architecture distributed fault-tolerant system, method, apparatus, device and medium. The system includes a serverless architecture control module and computing nodes based on a distributed architecture. The serverless architecture control module monitors the working status of each computing node; when a faulty computing node is detected, a replica computing node is built for the faulty computing node based on the persistent storage unit in the faulty computing node, and the replica computing node replaces the faulty computing node to continue executing the target task undertaken by the faulty computing node. The replica computing node restores and continues executing the target task based on the graph data and state snapshot data corresponding to the target task stored in the persistent storage unit.

Description

Serverless architecture distributed fault-tolerant system, method, apparatus, device and medium
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is based on and claims priority to the CN application with application number 202211010834.1 filed on August 23, 2022, the disclosure of which is hereby incorporated into this application in its entirety.
TECHNICAL FIELD
The present disclosure relates to the field of data processing, and in particular to a serverless architecture distributed fault-tolerant system, method, apparatus, device and storage medium.
BACKGROUND
With the maturity of technologies such as cloud, big data and containers, the serverless architecture has emerged. Under the serverless architecture, users only need to focus on the code implementation of the application logic, while the deployment and maintenance of infrastructure such as servers and the elastic scaling of computing resources are handled by the serverless platform. Serverless architecture distributed processing systems are usually large in scale.
SUMMARY
In a first aspect, the present disclosure provides a serverless architecture distributed fault-tolerant system, which includes:
a serverless architecture control module and computing nodes based on a distributed architecture, wherein:
there is a communication connection between the serverless architecture control module and the computing nodes based on the distributed architecture; the computing nodes based on the distributed architecture are used to receive and execute the assigned target tasks;
the serverless architecture control module is used to monitor the working status of the computing nodes based on the distributed architecture, and when a faulty computing node is detected, build a replica computing node for the faulty computing node based on the persistent storage unit in the faulty computing node;
the replica computing node is used to replace the faulty computing node and continue to execute the target task undertaken by the faulty computing node;
the persistent storage unit is used to store graph data and state snapshot data corresponding to the target task, and the state snapshot data includes intermediate state data generated during the execution of the target task;
the replica computing node is used to restore and continue executing the target task based on the graph data and state snapshot data corresponding to the target task stored in the persistent storage unit.
In some embodiments, the serverless architecture control module is specifically configured to build an agent unit for the persistent storage unit in the faulty computing node;
the constructed agent unit is used to build a computing unit for the persistent storage unit in the faulty computing node, the replica computing node including the constructed computing unit and the agent unit; and to control the constructed computing unit to restore and continue executing the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit in the faulty computing node.
In some embodiments, the computing node based on the distributed architecture includes an agent unit, a computing unit and a persistent storage unit; the persistent storage unit is used to store graph data and state snapshot data corresponding to the target task, and the state snapshot data includes intermediate state data generated during the execution of the target task; the system further includes:
a master agent unit, wherein there is a communication connection between the master agent unit and the agent units in the computing nodes based on the distributed architecture;
the master agent unit is used to monitor the working status of the agent units, and when a faulty agent unit is detected, build a replica agent unit for the faulty agent unit;
the replica agent unit is used to build, for the persistent storage unit corresponding to the faulty agent unit, a computing unit corresponding to the faulty agent unit, and to control the computing unit corresponding to the faulty agent unit to restore and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit corresponding to the faulty agent unit.
In some embodiments, the computing node based on the distributed architecture includes an agent unit, a computing unit and a persistent storage unit; the persistent storage unit is used to store graph data and state snapshot data corresponding to the target task, and the state snapshot data includes intermediate state data generated during the execution of the target task;
the agent unit is configured to, when a faulty computing unit is detected, create a replica computing unit for the faulty computing unit, and control the replica computing unit, in place of the faulty computing unit, to restore and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit.
In some embodiments, the constructed agent unit is further used to notify the agent units in other computing nodes, based on the communication connection between the agent units, to suspend execution of the assigned target tasks, and, when the target task is restored and continues to be executed, to notify the agent units in other computing nodes to continue executing the assigned target tasks.
In some embodiments, the constructed agent unit is specifically used to build a computing unit for the persistent storage unit in the faulty computing node, and to control the constructed computing unit to restore and continue executing the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit in the faulty computing node, as well as state snapshot data from other computing nodes.
In some embodiments, the persistent storage unit adopts a hierarchical structure of memory, persistent storage medium and hard disk;
the persistent storage unit is specifically used to store the graph data and state snapshot data corresponding to the target task to the corresponding storage layer in descending order of priority of the three storage layers of memory, persistent storage medium and hard disk.
In some embodiments, the persistent storage unit adopts a hierarchical structure of memory and persistent storage medium;
the persistent storage unit is specifically used to store the graph data and state snapshot data corresponding to the target task to the corresponding storage layer in descending order of priority of the two storage layers of memory and persistent storage medium.
In some embodiments, the persistent storage medium includes persistent memory.
In some embodiments, the computing node based on the distributed architecture includes an agent unit, a computing unit and a persistent storage unit; the persistent storage unit is used to store graph data and state snapshot data corresponding to the target task, and the state snapshot data includes intermediate state data generated during the execution of the target task; the system further includes a master agent unit, and there is a communication connection between the master agent unit and the agent units in the computing nodes based on the distributed architecture;
the master agent unit is used to monitor the working status of the agent units, and when a faulty agent unit is detected, build a replica agent unit for the faulty agent unit;
the replica agent unit is used to build, for the persistent storage unit corresponding to the faulty agent unit, a computing unit corresponding to the faulty agent unit, and to control the computing unit corresponding to the faulty agent unit to restore and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit corresponding to the faulty agent unit;
the agent unit is used to, when it is detected that a computing unit in the computing node has failed, create a replica computing unit for the faulty computing unit, and control the replica computing unit, in place of the faulty computing unit, to restore and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit in the computing node.
In a second aspect, the present disclosure further provides a serverless architecture distributed fault-tolerant method, which includes:
monitoring the working status of computing nodes based on a distributed architecture, and when a faulty computing node is detected, building a replica computing node for the faulty computing node based on the persistent storage unit in the faulty computing node, where the replica computing node is used to replace the faulty computing node and continue to execute the target task assigned to the faulty node, the persistent storage unit is used to store graph data and state snapshot data corresponding to the target task, and the state snapshot data includes intermediate state data generated during the execution of the target task;
controlling the replica computing node to restore and continue executing the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit.
In some embodiments, building a replica computing node for the faulty computing node based on the persistent storage unit in the faulty computing node includes:
building an agent unit for the persistent storage unit in the faulty computing node;
controlling the constructed agent unit to build a computing unit for the persistent storage unit in the faulty computing node;
controlling the replica computing node to restore and continue executing the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit includes:
using the agent unit to control the constructed computing unit to restore and continue executing the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit.
In some embodiments, the computing node includes an agent unit, a computing unit and a persistent storage unit; the persistent storage unit is used to store graph data and state snapshot data corresponding to the target task, and the state snapshot data includes intermediate state data generated during the execution of the target task; the method further includes:
using a master agent unit to monitor the working status of the agent units in each computing node, and when a faulty agent unit is detected, building a replica agent unit for the faulty agent unit;
controlling the replica agent unit to build, for the persistent storage unit corresponding to the faulty agent unit, a computing unit corresponding to the faulty agent unit, and controlling the constructed computing unit corresponding to the faulty agent unit to restore and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit corresponding to the faulty agent unit.
In some embodiments, the computing node includes an agent unit, a computing unit and a persistent storage unit; the persistent storage unit is used to store graph data and state snapshot data corresponding to the target task, and the state snapshot data includes intermediate state data generated during the execution of the target task; the method further includes:
when the agent unit in a computing node detects that a computing unit in the computing node has failed, creating a replica computing unit for the faulty computing unit;
controlling the replica computing unit, in place of the faulty computing unit, to restore and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit in the computing node.
In some embodiments, the method further includes:
based on the communication connection between the agent units, using the agent unit to notify other agent units to suspend execution of the target task;
when it is detected that the target task is restored and continues to be executed, notifying the agent units in other computing nodes to continue executing the assigned target tasks.
In some embodiments, using the agent unit to control the constructed computing unit to restore and continue executing the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit includes:
using the agent unit to control the constructed computing unit to restore and continue executing the target task based on the state snapshot data and graph data corresponding to the target task stored in the persistent storage unit, as well as state snapshot data from other computing nodes.
In some embodiments, the method further includes:
using a master agent unit to monitor the working status of the agent units in each computing node, and when a faulty agent unit is detected, building a replica agent unit for the faulty agent unit; controlling the replica agent unit to build, for the persistent storage unit corresponding to the faulty agent unit, a computing unit corresponding to the faulty agent unit, and controlling the computing unit corresponding to the faulty agent unit to restore and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit corresponding to the faulty agent unit;
when the agent unit in a computing node detects that a computing unit in the computing node has failed, creating a replica computing unit for the faulty computing unit; controlling the replica computing unit, in place of the faulty computing unit, to restore and continue executing the target task based on the state snapshot data and graph data of the target task stored in the persistent storage unit corresponding to the faulty computing unit.
第三方面,本公开还提供了一种无服务器架构分布式容错装置,所述装置包括:
第一构建模块,用于监测基于分布式架构的计算节点的工作状态,当监测到故障计算节点时,基于所述故障计算节点中的持久化存储单元,为所述故障计算节点构建副本计算节点,所述副本计算节点用于替代所述故障计算节点继续执行所述故障计算节点被分配的目标任务,所述持久化存储单元用于存储所述目标任务对应的图数据和状态快照数据,所述状态快照数据包括所述目标任务执行过程中产生的中间状态数据;
第一恢复执行模块,用于控制所述副本计算节点基于所述持久化存储单元中存储的所述目标任务对应的状态快照数据和图数据,恢复并继续执行所述目标任务。
第四方面,本公开提供了一种计算机可读存储介质,所述计算机可读存储介质中存储有指令,当所述指令在终端设备上运行时,使得所述终端设备实现上述的方法。
第五方面,本公开提供了一种数据处理设备,包括:存储器,处理器,及存储在所述存储器上并可在所述处理器上运行的计算机程序,所述处理器执行所述计算机程序时,实现上述任意实施例的无服务器架构分布式容错方法。
第六方面,本公开提供了一种计算机程序产品,所述计算机程序产品包括计算机程序/指令,所述计算机程序/指令被处理器执行时实现上述任意实施例的无服务器架构分布式容错方法。
第七方面,本公开提供了一种计算机程序,包括指令,当所述指令被处理器执行时使所述处理器实现前述任意实施例所述的无服务器架构分布式容错方法。
附图说明
此处的附图被并入说明书中并构成本说明书的一部分,示出了符合本公开的实施例,并与说明书一起用于解释本公开的原理。
为了更清楚地说明本公开实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍,显而易见地,对于本领域普通技术人员而言,在不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。
图1为本公开实施例提供的一种无服务器架构分布式容错系统的架构图;
图2为本公开实施例提供的另一种无服务器架构分布式容错系统的架构图;
图3为本公开实施例提供的又一种无服务器架构分布式容错系统的示意图;
图4为本公开实施例提供的又一种无服务器架构分布式容错系统的示意图;
图5为本公开实施例提供的一种无服务器架构分布式容错方法的流程图;
图6为本公开实施例提供的一种无服务器架构分布式容错装置的结构示意图;
图7为本公开实施例提供的一种无服务器架构分布式容错设备的结构示意图。
具体实施方式
为了能够更清楚地理解本公开的上述目的、特征和优点,下面将对本公开的方案进行进一步描述。需要说明的是,在不冲突的情况下,本公开的实施例及实施例中的特征可以相互组合。
在下面的描述中阐述了很多具体细节以便于充分理解本公开,但本公开还可以采用其他不同于在此描述的方式来实施;显然,说明书中的实施例只是本公开的一部分实施例,而不是全部的实施例。
无服务器Serverless架构是一种软件设计方法,允许开发人员构建和运行服务,无需管理底层架构体系。无服务器架构为用户提供用户态的服务,用户只需编写实现应用逻辑的函数代码,然后将函数代码上传至无服务器系统即可。在检测到触发函数代码执行的事件发生时,无服务器系统为其分配多个计算节点,用于执行函数代码相关的任务,用户不需要关心计算资源层面的问题,使得用户开发变得更加便捷,减轻用户的开发负担,给用户带来更好的体验。
无服务器架构用于通用大数据处理和分布式机器学习等多个领域,例如可以应用于图计算和图挖掘领域。对于数十亿甚至数万亿条边的大规模图的计算任务和挖掘任务,需要将繁重的任务分布在大量的计算节点上。发明人发现,图计算和图挖掘领域的无服务器架构分布式系统中,较少将容错作为其基本能力加以考虑;由于容错能力不足,一旦某个节点出现故障,整个作业都会失败,而重启作业需要从头开始执行,导致无服务器架构分布式系统中任务的执行进度受到影响。为此,本公开提供了一种无服务器架构分布式容错系统,可以部署于物理集群,也可以部署于云环境中,将容错功能作为基本能力,降低因节点等的故障对任务的执行进度的影响。
具体的,本公开实施例提供了一种无服务器架构分布式容错系统,参考图1,为本公开一些实施例提供的一种无服务器架构分布式容错系统的架构图。
无服务器架构分布式容错系统100中包括无服务器架构控制模块101和基于分布式架构的计算节点,以计算节点102和计算节点104为例。无服务器架构控制模块101与各个基于分布式架构的计算节点之间具有通信连接,基于分布式架构的计算节点用于接收并执行被分配的目标任务。
无服务器架构控制模块101,用于监测各个基于分布式架构的计算节点的工作状态,在监测到故障计算节点(假设故障计算节点为计算节点102)时,基于故障计算节点102中的持久化存储单元1021,为故障计算节点102构建副本计算节点103,所述副本计算节点103用于替代所述故障计算节点102继续执行所述故障计算节点102承接的目标任务;所述持久化存储单元1021用于存储所述目标任务对应的图数据和状态快照数据,所述状态快照数据包括所述目标任务执行过程中产生的中间状态数据;
所述副本计算节点103,用于基于所述持久化存储单元1021中存储的所述目标任务对应的图数据和状态快照数据,恢复并继续执行所述目标任务。
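为便于理解上述控制模块监测计算节点并基于持久化存储单元重建副本计算节点的流程,下面给出一个极简的Python示意(仅用于说明,其中ControlModule、ComputeNode等类名和字段均为示例性假设,并非本公开实施例的实际实现):

```python
from dataclasses import dataclass, field

@dataclass
class PersistentStore:
    """持久化存储单元:保存目标任务对应的图数据与状态快照数据。"""
    graph_data: dict = field(default_factory=dict)
    snapshot: dict = field(default_factory=dict)   # 任务执行过程中产生的中间状态数据

@dataclass
class ComputeNode:
    """计算节点:持有一个持久化存储单元,负责执行被分配的目标任务。"""
    node_id: int
    store: PersistentStore
    healthy: bool = True

    def resume_task(self) -> None:
        # 基于持久化存储单元中的图数据和状态快照,恢复并继续执行目标任务
        print(f"node {self.node_id}: resume from {self.store.snapshot}, "
              f"graph partitions = {len(self.store.graph_data)}")

class ControlModule:
    """无服务器架构控制模块:监测各计算节点,故障时基于其持久化存储单元重建副本计算节点。"""
    def __init__(self, nodes: list[ComputeNode]):
        self.nodes = {n.node_id: n for n in nodes}

    def monitor_once(self) -> None:
        for node_id, node in list(self.nodes.items()):
            if not node.healthy:
                replica = ComputeNode(node_id=node_id, store=node.store)  # 复用原持久化存储单元
                self.nodes[node_id] = replica
                replica.resume_task()
```

该示意中,副本计算节点直接复用故障计算节点原有的持久化存储单元,因此无需重新加载图数据即可从快照处恢复。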
在一些实施例中,基于分布式架构的计算节点可以基于容器Pod实现,每个计算节点中包括代理单元、计算单元和持久化存储单元,同一个计算节点中的代理单元、计算单元和持久化存储单元均部署于同一个Pod中。另外,同一个Pod中可以包括一个或多个计算单元,每个计算单元对应于一个进程。该Pod中的代理单元用于对该一个或多个计算单元进行控制。
其中,持久化存储单元可以基于持久化存储介质(如持久内存Optane Persistent Memory,PMEM)实现,也可以基于硬盘(如固态硬盘Solid State Drive,SSD)实现,还可以基于内存、持久化存储介质和硬盘的混合存储设计实现,用于持久化存储数据,不因断电等故障情况影响数据的存储。计算单元(也称为worker)是指持有计算资源并执行分配的计算任务负载的应用程序耦合实体。
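作为一种可能的部署形态,下面用一个Python字典粗略示意上述同一Pod内包含代理单元、计算单元和持久化存储单元的组成方式(其中镜像名、卷名等均为假设值;多个计算单元也可以实现为同一容器内的多个进程,此处仅为示意):

```python
# 仅为示意:一个计算节点对应的Pod组成,代理单元与计算单元容器挂载同一个持久化存储卷
pod_spec = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "compute-node-1", "labels": {"index-id": "1"}},
    "spec": {
        "containers": [
            {"name": "agent", "image": "example/agent:latest",            # 代理单元
             "volumeMounts": [{"name": "pmem-store", "mountPath": "/data"}]},
            {"name": "worker-0", "image": "example/worker:latest",        # 计算单元,对应一个进程
             "volumeMounts": [{"name": "pmem-store", "mountPath": "/data"}]},
        ],
        "volumes": [
            {"name": "pmem-store",                                        # 持久化存储单元
             "persistentVolumeClaim": {"claimName": "pmem-claim-node-1"}},
        ],
    },
}
```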
在一些实施例中,持久化存储单元采用内存、持久化存储介质和硬盘的分层结构。
具体的,所述持久化存储单元,具体用于按照内存、持久化存储介质和硬盘三级存储层的优先级降序,将所述目标任务对应的图数据和状态快照数据存储至对应的存储层。每个计算节点的持久化存储单元存储的所述目标任务对应的图数据和状态快照数据可以不同。
另在一些实施例中,持久化存储单元采用内存和持久化存储介质的分层结构。
具体的,所述持久化存储单元,具体用于按照内存和持久化存储介质二级存储层的优先级降序,将所述目标任务对应的图数据和状态快照数据存储至对应的存储层。
本公开实施例能够针对基于容器实现的计算节点的故障进行容错处理,实现资源调度层面的容器级别的容错,降低因计算节点发生故障,对任务执行进度的影响。
为了便于对上述无服务器架构分布式容错系统的进一步理解,本公开一些实施例提供了另一种无服务器架构分布式容错系统的架构图,如图2所示。
无服务器架构分布式系统200包括无服务器架构控制模块201和基于分布式架构的计算节点,以计算节点202和204为例。具体的,计算节点202中可以包括具有对应关系的代理单元2021、计算单元2022和持久化存储单元2023。
无服务器架构控制模块201,具体用于监测基于分布式架构的计算节点的工作状态,在监测到故障计算节点(假设为计算节点202)时,为所述故障计算节点202中的持久化存储单元2023构建新的代理单元2031(可以称为第一副本代理单元)。
代理单元2031,用于为所述故障计算节点202中的持久化存储单元2023构建新的计算单元2032(可以称为第一副本计算单元),代理单元2031控制构建的所述计算单元2032,基于所述故障计算节点202中的持久化存储单元2023中存储的所述目标任务对应的状态快照数据和图数据,恢复并继续执行所述目标任务。
构建的代理单元2031和计算单元2032,以及故障计算节点202中的持久化存储单元2023,均处于同一个计算节点,即副本计算节点203中,副本计算节点203用于替代故障计算节点202继续执行所述故障计算节点202承接的目标任务。
在一些实施例中,同一计算节点中的代理单元、计算单元和持久化存储单元具有相同的索引标识,通过该索引标识能够定位到具有该索引标识的代理单元、计算单元和持久化存储单元。例如,同一计算节点中的代理单元、计算单元和持久化存储单元具有相同的索引标识1,通过索引标识1能够定位到具有该索引标识1的代理单元、计算单元和持久化存储单元。具体的,各个代理单元、计算单元和持久化存储单元分别与索引标识的对应关系可以预先设置并存储。
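索引标识与各单元的对应关系可以用一个简单的映射表示意如下(表中的名称均为假设值,实际系统中该映射可由主代理单元的状态存储模块维护):

```python
# 索引标识 -> 同一计算节点内各单元的对应关系(名称均为示例)
index_table = {
    1: {"agent": "agent-1", "worker": "worker-1", "store": "pmem-volume-1"},
    2: {"agent": "agent-2", "worker": "worker-2", "store": "pmem-volume-2"},
}

def locate(index_id: int) -> dict:
    """通过索引标识定位具有该标识的代理单元、计算单元和持久化存储单元。"""
    return index_table[index_id]
```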
实际应用中,每个计算单元分别挂载有持久化存储单元,计算单元预先从远程文件系统(如Hadoop分布式文件系统HDFS)中读取目标任务对应的数据分片,并将目标任务对应的数据分片存储于持久化存储单元中,用于后续目标任务的执行。本公开实施例通过为每个计算单元挂载单独的持久化存储单元,而不是各个计算单元共享持久化存储单元,能够较大限度地降低资源竞争。
代理单元可以控制对应的计算单元执行目标任务,具体的,控制计算单元从持久化存储单元中获取目标任务对应的图数据,执行目标任务。
另外,代理单元控制计算单元将执行目标任务过程中产生的状态快照数据,周期性地写入对应的持久化存储单元中,实现对状态快照数据的持久化存储,以便后续在监测到故障时,可以基于目标任务的状态快照数据恢复目标任务的执行状态。其中,状态快照数据也可以称为检查点(checkpoint)数据,用于记录目标任务的执行状态数据,基于该执行状态数据能够恢复目标任务的某个暂态。
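下面给出计算单元周期性写入状态快照并在重启后从快照处继续执行的一个极简Python示意(此处用普通字典代表挂载的持久化存储单元,快照内容仅记录迭代轮次,均为示例性假设):

```python
import threading
import time

class Worker:
    """计算单元示意:周期性地将执行状态写入持久化存储单元,重启后可从快照处继续。"""
    def __init__(self, persistent_store: dict, interval_s: float = 30.0):
        self.store = persistent_store      # 用普通字典代表挂载的持久化存储单元
        self.interval_s = interval_s
        self.iteration = 0
        self._stop = threading.Event()

    def checkpoint(self) -> None:
        # 状态快照(checkpoint):记录当前迭代轮次等中间状态
        self.store["snapshot"] = {"iteration": self.iteration, "ts": time.time()}

    def run(self, total_iterations: int) -> None:
        threading.Thread(target=self._periodic_checkpoint, daemon=True).start()
        start = self.store.get("snapshot", {}).get("iteration", 0)   # 从已有快照处恢复
        for self.iteration in range(start, total_iterations):
            pass                            # 此处执行一轮图计算迭代
        self._stop.set()
        self.checkpoint()                   # 结束时再落一次快照

    def _periodic_checkpoint(self) -> None:
        while not self._stop.wait(self.interval_s):
            self.checkpoint()
```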
在一些实施例中,基于分布式架构的计算节点中包括代理单元、计算单元和持久化存储单元,所述持久化存储单元用于存储所述目标任务对应的图数据和状态快照数据,所述状态快照数据包括所述目标任务执行过程中产生的中间状态数据。
本公开实施例中,各计算节点中的代理单元之间具有通信连接,各代理单元之间可以基于协商通信同步彼此的计算结果数据,共同执行目标任务。
在一些实施例中,无服务器架构控制模块在监测到故障计算节点时,确定故障计算节点中包括的持久化存储单元,并基于该持久化存储单元构建新的计算节点(例如,副本计算节点),用于替代故障计算节点继续执行目标任务。另外,新的计算节点中的代理单元,可以基于各代理单元之间的通信连接,通知其他代理单元暂停执行目标任务,并指示其他代理单元将目标任务的状态快照数据同步至新的计算节点,以便新的计算节点能够代替故障计算节点恢复执行目标任务。
具体的,新的计算节点中的代理单元,具体用于基于该计算节点中的持久化存储单元中存储的目标任务对应的状态快照数据、图数据以及来自其他计算节点的状态快照数据,恢复并继续执行所述目标任务。
值得注意的是,新的代理单元在确定恢复执行目标任务至最新状态后,还可以基于各代理单元之间的通信连接,通知其他代理单元可以继续执行目标任务。
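上述暂停、同步、恢复的协商过程可以示意如下(方法名与数据结构均为假设,实际实现中代理单元之间通过网络通信完成协商):

```python
class Agent:
    """代理单元示意:故障恢复时与其他代理单元之间的暂停、同步、恢复协商。"""
    def __init__(self, index_id: int):
        self.index_id = index_id
        self.peers: list["Agent"] = []        # 其他计算节点中的代理单元
        self.paused = False
        self.local_snapshot: dict = {}
        self.received: dict[int, dict] = {}   # 来自其他计算节点的状态快照数据

    def pause(self) -> None:
        self.paused = True

    def resume(self) -> None:
        self.paused = False

    def send_snapshot_to(self, target: "Agent") -> None:
        target.received[self.index_id] = dict(self.local_snapshot)

    def recover(self, graph_data: dict) -> dict:
        for peer in self.peers:               # 1. 通知其他代理单元暂停执行目标任务
            peer.pause()
        for peer in self.peers:               # 2. 请求其他代理单元同步状态快照
            peer.send_snapshot_to(self)
        state = {                             # 3. 基于本地快照、图数据及同步来的快照恢复执行
            "local": self.local_snapshot,
            "peers": self.received,
            "graph_partitions": len(graph_data),
        }
        for peer in self.peers:               # 4. 恢复到最新状态后,通知其他代理单元继续执行
            peer.resume()
        return state
```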
在上述实施例的基础上,无服务器架构分布式容错系统中还包括主代理单元,主代理单元与各代理单元之间具有通信连接,主代理单元用于维护各个代理单元的状态以及存储各个代理单元的索引信息。
如图3所示,为本公开实施例提供的另一种无服务器架构分布式容错系统的架构图,其中,无服务器架构分布式容错系统300中包括无服务器架构控制模块304、主代理单元301、状态存储模块302和计算节点303,主代理单元301与各计算节点中的代理单元之间具有通信连接,主代理单元301利用状态存储模块302存储各个代理单元的索引信息。主代理单元与其他各代理单元之间可以基于协商通信,实现数据同步等。
另外,所述主代理单元,用于监测各个计算节点中的代理单元的工作状态,并在监测到故障代理单元(假设为计算节点303中的代理单元3031)时,为所述故障代理单元3031构建副本代理单元3034(可以称为第二副本代理单元)。
所述副本代理单元3034,用于为所述故障代理单元3031对应的持久化存储单元3033构建新的计算单元3035(可以称为第二副本计算单元),并由副本代理单元3034控制所述计算单元3035,基于所述持久化存储单元3033中存储的所述目标任务的状态快照数据和图数据,恢复并继续执行所述目标任务。
具体的,发生故障的代理单元3031可以为任一计算节点中的代理单元。由于代理单元3031发生故障,其对应的计算单元3032会被回收,此时需要在创建新的代理单元(如代理单元3034)之后,由新的代理单元3034创建新的计算单元3035。
在一些实施例中,主代理单元301在监测到故障代理单元之后,基于与各代理单元之间的通信连接,通知各个计算节点中的代理单元暂停执行被分配的目标任务,并在恢复并继续执行所述目标任务时,通知其他计算节点中的代理单元继续执行被分配的目标任务。
在一些实施例中,具有相同索引标识的计算节点、代理单元、计算单元和持久化存储单元之间的对应关系,可以分别预先存储于各个代理单元处,也可以单独存储于主代理单元301的状态存储模块302中,以便节省各代理单元的存储资源。基于各代理单元之间的协商通信,可以基于需求从主代理单元301的状态存储模块302中获取所需的索引信息。
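主代理单元对代理单元层级的容错可以示意如下(其中状态存储模块简化为一个字典,ReplicaAgent、MasterAgent等名称均为假设):

```python
class ReplicaAgent:
    """副本代理单元示意:为故障代理单元对应的持久化存储单元重建计算单元并恢复执行。"""
    def __init__(self, index_id: int, store_ref: dict):
        self.index_id = index_id
        self.store_ref = store_ref

    def rebuild_and_recover(self) -> None:
        snapshot = self.store_ref.get("snapshot", {})
        graph = self.store_ref.get("graph", {})
        print(f"agent {self.index_id}: new worker resumes at {snapshot}, partitions={len(graph)}")

class MasterAgent:
    """主代理单元示意:维护各代理单元的索引信息并监测其工作状态。"""
    def __init__(self):
        self.state_store: dict[int, dict] = {}   # 状态存储模块:索引标识 -> 持久化存储单元引用
        self.agents: dict[int, dict] = {}        # 索引标识 -> 代理单元状态(此处仅用健康位示意)

    def register(self, index_id: int, store_ref: dict) -> None:
        self.state_store[index_id] = store_ref
        self.agents[index_id] = {"healthy": True}

    def monitor_once(self) -> None:
        for index_id, agent in list(self.agents.items()):
            if not agent["healthy"]:
                replica = ReplicaAgent(index_id, self.state_store[index_id])
                replica.rebuild_and_recover()
                self.agents[index_id] = {"healthy": True}
```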
本公开实施例提供的无服务器架构分布式容错系统,是一个基于代理单元的弹性容错框架,不仅能够支持计算节点层级的容错,还能够支持代理单元层级的容错,进一步降低因无服务器架构分布式容错系统中发生故障影响任务执行进度的概率。
另外,在上述实施例的基础上,本公开一些实施例还可以支持计算单元层级的容错,具体的,如图4所示,为本公开一些实施例提供的又一种无服务器架构分布式容错系统的示意图。
其中,无服务器架构分布式容错系统400中包括基于分布式架构的计算节点,以下以计算节点401和402为例。其中,计算节点401中的代理单元4011,用于监测计算节点401中的计算单元4012的工作状态,并在监测到计算单元4012发生故障时,为发生故障的计算单元4012创建副本计算单元4014(可以称为第三副本计算单元),并由代理单元4011控制副本计算单元4014基于所述故障计算单元4012对应的持久化存储单元4013中存储的目标任务的状态快照数据和图数据,恢复执行所述目标任务。
在一些实施例中,代理单元4011在确定自身对应的计算单元4012发生故障时,基于代理单元之间的通信连接,获取存储于主代理单元的状态存储模块中的索引信息,确定与代理单元4011具有相同索引标识的持久化存储单元,即持久化存储单元4013,由代理单元4011为持久化存储单元4013创建新的计算单元4014,用于恢复执行目标任务。具体的,代理单元4011可以为任一计算节点中的代理单元。
在一些实施例中,代理单元4011在确定自身对应的计算单元4012发生故障时,基于各代理单元之间的通信连接,通知其他计算节点中的代理单元(如图4所示的代理单元4021)暂停执行被分配的目标任务,并在恢复并继续执行所述目标任务时,通知其他计算节点中的代理单元继续执行被分配的目标任务。
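代理单元对计算单元层级的容错,本质上是对计算单元进程的监督与重建,可以用Python的multiprocessing粗略示意如下(以非零退出码模拟计算单元故障,仅为示例):

```python
import multiprocessing as mp

def worker_main(store) -> None:
    """计算单元进程:从持久化存储单元中记录的快照处继续执行目标任务。"""
    start = store.get("snapshot", {}).get("iteration", 0)
    for i in range(start, 100):
        store["snapshot"] = {"iteration": i}      # 写入状态快照(此处每轮都写,仅为示意)

def agent_supervise() -> None:
    """代理单元示意:监测计算单元进程,故障后为其创建副本计算单元并恢复执行。"""
    manager = mp.Manager()
    store = manager.dict()                        # 用托管字典代表挂载的持久化存储单元
    while True:
        proc = mp.Process(target=worker_main, args=(store,))
        proc.start()
        proc.join()
        if proc.exitcode == 0:                    # 正常结束
            break
        # 非零退出码视为计算单元故障:回到循环开头,基于快照重建(副本)计算单元

if __name__ == "__main__":
    agent_supervise()
```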
可见,本公开实施例提供的无服务器架构分布式容错系统,能够分别从分布式计算节点层级、代理单元层级以及计算单元层级,支持容错功能,降低因计算节点、代理单元、计算单元等发生故障后,对任务执行进度的影响。
本公开实施例提供的无服务器架构分布式容错系统,能够针对基于容器实现的计算节点的故障、代理单元的故障以及计算单元的故障分别进行容错处理,实现从资源调度层面的容器级别容错,到分布式控制面的代理单元Agent级别容错,再到应用本身的计算单元worker级别容错的多维度容错,形成了端到端的容错保障,从而保证了图数据处理应用的端到端容错体验,降低在用户使用图数据处理应用的整个回路中出现应用出错报警等情况的发生。
基于上述无服务器架构分布式容错系统的实施例描述,本公开实施例还提供了一种无服务器架构分布式容错方法,参考图5,为本公开一些实施例提供的一种无服务器架构分布式容错方法的流程图。该方法包括:步骤S501~S502。
在步骤S501中,监测基于分布式架构的计算节点的工作状态,当监测到故障计算节点时,基于所述故障计算节点中的持久化存储单元,为所述故障计算节点构建副本计算节点。
所述副本计算节点用于替代所述故障计算节点继续执行所述故障计算节点被分配的目标任务,所述持久化存储单元用于存储所述目标任务对应的图数据和状态快照数据,所述状态快照数据包括所述目标任务执行过程中产生的中间状态数据。
本公开实施例提供的无服务器架构分布式容错方法可以应用于上述无服务器架构分布式容错系统,该无服务器架构分布式容错系统包括无服务器架构控制模块以及目标任务对应的基于分布式架构的计算节点。
本公开实施例中的基于分布式架构的计算节点中包括代理单元、计算单元和持久化存储单元,所述持久化存储单元用于存储所述目标任务对应的图数据和状态快照数据,所述状态快照数据包括所述目标任务执行过程中产生的中间状态数据。
在一些实施例中,持久化存储单元采用内存、持久化存储介质和硬盘的分层结构。具体的,持久化存储单元具体用于按照内存、持久化存储介质和硬盘三级存储层的优先级降序,将所述目标任务对应的图数据和状态快照数据存储至对应的存储层。
在一些实施例中,持久化存储单元采用内存和持久化存储介质的分层结构;具体的,持久化存储单元具体用于按照内存和持久化存储介质二级存储层的优先级降序,将所述目标任务对应的图数据和状态快照数据存储至对应的存储层。
另外,本公开实施例中的持久化存储介质可以包括持久内存PMEM等。
在一些实施例中,基于故障计算节点中的持久化存储单元,为故障计算节点构建副本计算节点,具体可以包括:首先为故障计算节点中的持久化存储单元构建代理单元(可以称为第一副本代理单元),然后控制该代理单元为故障计算节点中的持久化存储单元构建计算单元(可以称为第一副本计算单元),实现故障计算节点的新的计算节点的构建。
然后,可以利用代理单元控制构建的新的计算单元,基于故障计算节点中的持久化存储单元中存储的目标任务对应的状态快照数据和图数据,恢复并继续执行目标任务。
在一些实施例中,无服务器架构控制模块可以基于预先存储的计算节点、代理单元、计算单元以及持久化存储单元之间的索引关系,确定故障计算节点对应的持久化存储单元。
实际应用中,计算节点、代理单元、计算单元以及持久化存储单元之间的索引关系可以预先存储于各个代理单元中,也可以预先存储于无服务器架构分布式容错系统中的主代理单元中。无服务器架构控制模块可以与各个代理单元进行通信,以确定故障计算节点对应的持久化存储单元。
本公开实施例中,无服务器架构控制模块在确定故障计算节点对应的持久化存储单元之后,为了避免图数据的存入存出的系统消耗,可以保持该持久化存储单元中存储的图数据不动,而是重新创建用于处理该持久化存储单元中的图数据的新的计算节点,即故障计算节点的副本计算节点,后续可以利用副本计算节点对该持久化存储单元中的图数据进行处理,以恢复执行目标任务。
在一些实施例中,可以利用主代理单元监测各计算节点中的代理单元的工作状态,当监测到故障代理单元时,为所述故障代理单元构建副本代理单元(可以称为第二副本代理单元),然后控制副本代理单元为故障代理单元对应的持久化存储单元构建所述故障代理单元对应的计算单元(可以称为第二副本计算单元),并控制构建的所述故障代理单元对应的计算单元基于所述故障代理单元对应的持久化存储单元中存储的目标任务的状态快照数据和图数据,恢复并继续执行所述目标任务。
本公开实施例通过为故障代理单元构建副本代理单元,能够实现代理单元层级的容错功能,降低对目标任务的执行进度的影响。
在一些实施例中,同一计算节点中的代理单元用于监测该计算节点中的计算单元的工作状态。当代理单元监测到计算单元发生故障时,为发生故障的所述计算单元创建副本计算单元(可以称为第三副本计算单元),然后控制所述副本计算单元替代发生故障的所述计算单元,基于同一计算节点中所述持久化存储单元中存储的所述目标任务的状态快照数据和图数据,恢复并继续执行所述目标任务。
在步骤S502中,控制所述副本计算节点基于所述持久化存储单元中存储的所述目标任务对应的状态快照数据和图数据,恢复并继续执行所述目标任务。
实际应用中,各计算节点在执行目标任务的过程中产生的计算结果数据和状态快照数据等,可以按照预设周期同步至其他计算节点,以便各个计算节点之间协同完成目标任务。
在一些实施例中,可以利用代理单元控制构建的新的计算单元,基于其对应的持久化存储单元中存储的目标任务对应的状态快照数据、图数据以及来自其他计算节点的状态快照数据,恢复并继续执行所述目标任务。
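在利用本地快照与来自其他计算节点的快照共同恢复时,需要选定一个各节点都能对齐的恢复点;下面给出一种假设的合并策略示意(以各快照记录的迭代轮次的最小值作为全局恢复点,实际策略由具体实现决定):

```python
def merge_snapshots(local: dict, peers: dict[int, dict]) -> dict:
    """以各快照记录的迭代轮次的最小值作为全局恢复点(仅为一种假设的合并策略)。"""
    iterations = [local.get("iteration", 0)]
    iterations += [snap.get("iteration", 0) for snap in peers.values()]
    return {"iteration": min(iterations)}

# 示例:本地恢复到第8轮,其他两个节点分别同步到第10轮和第9轮,则从第8轮继续执行
resume_point = merge_snapshots({"iteration": 8}, {2: {"iteration": 10}, 3: {"iteration": 9}})
```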
在一些实施例中,可以基于各代理单元之间的通信连接,由故障计算单元中的代理单元通知其他代理单元暂停执行所述目标任务。
当监测到恢复并继续执行所述目标任务时,还可以通知其他计算节点中的代理单元继续执行被分配的目标任务。
本公开实施例提供的无服务器架构分布式容错方法,能够分别从分布式计算节点层级、代理单元层级以及计算单元层级,支持容错功能,降低因计算节点、代理单元、计算单元等发生故障后,对任务执行进度的影响。
基于上述方法实施例,本公开还提供了一种无服务器架构分布式容错装置,参考图6,为本公开实施例提供的一种无服务器架构分布式容错装置的结构示意图,所述装置包括:
第一构建模块601,用于监测基于分布式架构的计算节点的工作状态,当监测到故障计算节点时,基于所述故障计算节点中的持久化存储单元,为所述故障计算节点构建副本计算节点,所述副本计算节点用于替代所述故障计算节点继续执行所述故障计算节点被分配的目标任务,所述持久化存储单元用于存储所述目标任务对应的图数据和状态快照数据,所述状态快照数据包括所述目标任务执行过程中产生的中间状态数据;
第一恢复执行模块602,用于控制所述副本计算节点基于所述持久化存储单元中存储的所述目标任务对应的状态快照数据和图数据,恢复并继续执行所述目标任务。
在一些实施例中,所述第一构建模块601,包括:
第一构建子模块,用于为所述故障计算节点中的持久化存储单元构建代理单元(可以称为第一副本代理单元);
控制子模块,用于控制构建的所述代理单元为所述故障计算节点中的持久化存储单元构建计算单元(可以称为第一副本计算单元);
相应的,所述恢复执行模块,具体用于:
利用所述代理单元控制构建的所述计算单元,基于所述持久化存储单元中存储的所述目标任务对应的状态快照数据和图数据,恢复并继续执行所述目标任务。
在一些实施例中,所述计算节点中包括代理单元、计算单元和持久化存储单元,所述持久化存储单元用于存储所述目标任务对应的图数据和状态快照数据,所述状态快照数据包括所述目标任务执行过程中产生的中间状态数据;所述装置还包括:
第二构建模块,用于利用主代理单元监测各计算节点中的代理单元的工作状态,当监测到故障代理单元时,为所述故障代理单元构建副本代理单元(可以称为第二副本代理单元);
第二恢复执行模块,用于控制所述副本代理单元为所述故障代理单元对应的持久化存储单元构建所述故障代理单元对应的计算单元(可以称为第二副本计算单元),并控制构建的所述故障代理单元对应的计算单元基于所述故障代理单元对应的持久化存储单元中存储的所述目标任务的状态快照数据和图数据,恢复并继续执行所述目标任务。
在一些实施例中,所述计算节点中包括代理单元、计算单元和持久化存储单元,所述持久化存储单元用于存储所述目标任务对应的图数据和状态快照数据,所述状态快照数据包括所述目标任务执行过程中产生的中间状态数据;所述装置还包括:
第三构建模块,用于当所述代理单元监测到所述计算单元发生故障时,为发生故障的所述计算单元构建副本计算单元;
第三恢复执行模块,用于控制所述副本计算单元替代发生故障的所述计算单元,基于所述持久化存储单元中存储的所述目标任务的状态快照数据和图数据,恢复并继续执行所述目标任务。
在一些实施例中,所述装置还包括:
通知模块,用于基于各代理单元之间的通信连接,利用所述代理单元通知其他代理单元暂停执行所述目标任务;以及当监测到恢复并继续执行所述目标任务时,通知其他计算节点中的代理单元继续执行被分配的目标任务。
在一些实施例中,所述第一恢复执行模块,具体用于:
利用所述代理单元控制构建的所述计算单元,基于所述持久化存储单元中存储的所述目标任务对应的状态快照数据、图数据以及来自其他计算节点的状态快照数据,恢复并继续执行所述目标任务。
在一些实施例中,所述持久化存储单元采用内存、持久化存储介质和硬盘的分层结构;
所述持久化存储单元具体用于按照内存、持久化存储介质和硬盘三级存储层的优先级降序,将所述目标任务对应的图数据和状态快照数据存储至对应的存储层。
在一些实施例中,所述持久化存储单元采用内存和持久化存储介质的分层结构;
所述持久化存储单元具体用于按照内存和持久化存储介质二级存储层的优先级降序,将所述目标任务对应的图数据和状态快照数据存储至对应的存储层。
在一些实施例中,所述持久化存储介质包括持久内存。
本公开实施例提供的无服务器架构分布式容错装置,能够分别从分布式计算节点层级、代理单元层级以及计算单元层级,支持容错功能,降低因计算节点、代理单元、计算单元等发生故障后,对任务执行进度的影响。
除了上述方法和装置以外,本公开实施例还提供了一种计算机可读存储介质,计算机可读存储介质中存储有指令,当所述指令在终端设备上运行时,使得所述终端设备实现本公开实施例所述的无服务器架构分布式容错方法。
本公开实施例还提供了一种计算机程序产品,所述计算机程序产品包括计算机程序/指令,所述计算机程序/指令被处理器执行时实现本公开实施例所述的无服务器架构分布式容错方法。
另外,本公开实施例还提供了一种无服务器架构分布式容错设备,参见图7所示,可以包括:
处理器701、存储器702、输入装置703和输出装置704。无服务器架构分布式容错设备中的处理器701的数量可以是一个或多个,图7中以一个处理器为例。在本公开的一些实施例中,处理器701、存储器702、输入装置703和输出装置704可通过总线或其它方式连接,其中,图7中以通过总线连接为例。
存储器702可用于存储软件程序以及模块,处理器701通过运行存储在存储器702的软件程序以及模块,从而执行无服务器架构分布式容错设备的各种功能应用以及数据处理。存储器702可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序等。此外,存储器702可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件、或其他非易失性固态存储器件。输入装置703可用于接收输入的数字或字符信息,以及产生与无服务器架构分布式容错设备的用户设置以及功能控制有关的信号输入。
具体在本实施例中,处理器701会按照如下的指令,将一个或一个以上的应用程序的进程对应的可执行文件加载到存储器702中,并由处理器701来运行存储在存储器702中的应用程序,从而实现上述无服务器架构分布式容错设备的各种功能。
需要说明的是,在本文中,诸如“第一”和“第二”等之类的关系术语仅仅用来将一个实体或者操作与另一个实体或操作区分开来,而不一定要求或者暗示这些实体或操作之间存在任何这种实际的关系或者顺序。而且,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者设备不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者设备所固有的要素。在没有更多限制的情况下,由语句“包括一个……”限定的要素,并不排除在包括所述要素的过程、方法、物品或者设备中还存在另外的相同要素。
以上所述仅是本公开的具体实施方式,使本领域技术人员能够理解或实现本公开。对这些实施例的多种修改对本领域的技术人员来说将是显而易见的,本文中所定义的一般原理可以在不脱离本公开的精神或范围的情况下,在其它实施例中实现。因此,本公开将不会被限制于本文所述的这些实施例,而是要符合与本文所公开的原理和新颖特点相一致的最宽的范围。

Claims (22)

  1. 一种无服务器架构分布式容错系统,包括:
    无服务器架构控制模块和基于分布式架构的计算节点,其中:
    所述无服务器架构控制模块与所述基于分布式架构的计算节点之间具有通信连接;所述基于分布式架构的计算节点用于接收并执行被分配的目标任务;
    所述无服务器架构控制模块用于监测所述基于分布式架构的计算节点的工作状态,在监测到故障计算节点时,基于所述故障计算节点中的持久化存储单元为所述故障计算节点构建副本计算节点;
    所述副本计算节点用于替代所述故障计算节点继续执行所述故障计算节点承接的目标任务;
    所述持久化存储单元用于存储所述目标任务对应的图数据和状态快照数据,所述状态快照数据包括所述目标任务执行过程中产生的中间状态数据;
    所述副本计算节点用于基于所述持久化存储单元中存储的所述目标任务对应的图数据和状态快照数据,恢复并继续执行所述目标任务。
  2. 根据权利要求1所述的无服务器架构分布式容错系统,其中:
    所述无服务器架构控制模块,具体用于为所述故障计算节点中的持久化存储单元构建代理单元;
    构建的所述代理单元用于为所述故障计算节点中的持久化存储单元构建计算单元,所述副本计算节点中包括构建的所述计算单元和所述代理单元;控制构建的所述计算单元,基于所述故障计算节点中的持久化存储单元中存储的所述目标任务对应的状态快照数据和图数据,恢复并继续执行所述目标任务。
  3. 根据权利要求1或2所述的无服务器架构分布式容错系统,其中,所述基于分布式架构的计算节点中包括代理单元、计算单元和持久化存储单元,所述持久化存储单元用于存储所述目标任务对应的图数据和状态快照数据,所述状态快照数据包括所述目标任务执行过程中产生的中间状态数据,所述系统还包括:
    主代理单元,其中,所述主代理单元与基于分布式架构的计算节点中的代理单元之间具有通信连接;
    所述主代理单元用于监测代理单元的工作状态,并在监测到故障代理单元时,为所述故障代理单元构建副本代理单元;
    所述副本代理单元用于为所述故障代理单元对应的持久化存储单元构建所述故障代理单元对应的计算单元,并控制所述故障代理单元对应的计算单元基于所述故障代理单元对应的持久化存储单元中存储的所述目标任务的状态快照数据和图数据,恢复并继续执行所述目标任务。
  4. 根据权利要求1-2任一项所述的无服务器架构分布式容错系统,其中,所述基于分布式架构的计算节点中包括代理单元、计算单元和持久化存储单元,所述持久化存储单元用于存储所述目标任务对应的图数据和状态快照数据,所述状态快照数据包括所述目标任务执行过程中产生的中间状态数据;
    所述代理单元,用于在监测到故障计算单元时,为所述故障计算单元创建副本计算单元,控制所述副本计算单元替代所述故障计算单元基于所述持久化存储单元中存储的所述目标任务的状态快照数据和图数据,恢复并继续执行所述目标任务。
  5. 根据权利要求2-4任一项所述的无服务器架构分布式容错系统,其中,
    构建的所述代理单元,还用于基于各代理单元之间的通信连接,通知其他计算节点中的代理单元暂停执行被分配的目标任务,并在恢复并继续执行所述目标任务时,通知其他计算节点中的代理单元继续执行被分配的目标任务。
  6. 根据权利要求2-4任一项所述的无服务器架构分布式容错系统,其中,
    构建的所述代理单元,具体用于为所述故障计算节点中的持久化存储单元构建计算单元,并控制构建的所述计算单元,基于所述故障计算节点中的持久化存储单元中存储的所述目标任务对应的状态快照数据、图数据以及来自其他计算节点的状态快照数据,恢复并继续执行所述目标任务。
  7. 根据权利要求1-6任一项所述的无服务器架构分布式容错系统,其中,所述持久化存储单元采用内存、持久化存储介质和硬盘的分层结构,
    所述持久化存储单元,具体用于按照内存、持久化存储介质和硬盘三级存储层的优先级降序,将所述目标任务对应的图数据和状态快照数据存储至对应的存储层。
  8. 根据权利要求1-7任一项所述的无服务器架构分布式容错系统,其中,所述持久化存储单元采用内存和持久化存储介质的分层结构,
    所述持久化存储单元,具体用于按照内存和持久化存储介质二级存储层的优先级降序,将所述目标任务对应的图数据和状态快照数据存储至对应的存储层。
  9. 根据权利要求7或8所述的无服务器架构分布式容错系统,其中,所述持久化存储介质包括持久内存。
  10. 根据权利要求1或2所述的无服务器架构分布式容错系统,其中,所述基于分布式架构的计算节点中包括代理单元、计算单元和持久化存储单元,所述持久化存储单元用于存储所述目标任务对应的图数据和状态快照数据,所述状态快照数据包括所述目标任务执行过程中产生的中间状态数据;所述系统还包括主代理单元,所述主代理单元与基于分布式架构的计算节点中的代理单元之间具有通信连接;
    所述主代理单元,用于监测代理单元的工作状态,并在监测到故障代理单元时,为所述故障代理单元构建副本代理单元;
    所述副本代理单元,用于为所述故障代理单元对应的持久化存储单元构建所述故障代理单元对应的计算单元,并控制所述故障代理单元对应的计算单元基于所述故障代理单元对应的持久化存储单元中存储的所述目标任务的状态快照数据和图数据,恢复并继续执行所述目标任务;
    所述代理单元,用于在监测到所述计算节点中的计算单元发生故障时,为故障计算单元创建副本计算单元,控制所述副本计算单元替代所述故障计算单元,基于所述计算节点中的持久化存储单元中存储的所述目标任务的状态快照数据和图数据,恢复并继续执行所述目标任务。
  11. 一种无服务器架构分布式容错方法,包括:
    监测基于分布式架构的计算节点的工作状态,当监测到故障计算节点时,基于所述故障计算节点中的持久化存储单元,为所述故障计算节点构建副本计算节点,所述副本计算节点用于替代所述故障计算节点继续执行所述故障节点被分配的目标任务,所述持久化存储单元用于存储所述目标任务对应的图数据和状态快照数据,所述状态快照数据包括所述目标任务执行过程中产生的中间状态数据;
    控制所述副本计算节点基于所述持久化存储单元中存储的所述目标任务对应的状态快照数据和图数据,恢复并继续执行所述目标任务。
  12. 根据权利要求11所述的无服务器架构分布式容错方法,其中,所述基于所述故障计算节点中的持久化存储单元,为所述故障计算节点构建副本计算节点包括:
    为所述故障计算节点中的持久化存储单元构建代理单元;
    控制构建的所述代理单元为所述故障计算节点中的持久化存储单元构建计算单元;
    所述控制所述副本计算节点基于所述持久化存储单元中存储的所述目标任务对应的状态快照数据和图数据,恢复并继续执行所述目标任务包括:
    利用所述代理单元控制构建的所述计算单元,基于所述持久化存储单元中存储的所述目标任务对应的状态快照数据和图数据,恢复并继续执行所述目标任务。
  13. 根据权利要求11或12所述的无服务器架构分布式容错方法,其中,所述计算节点中包括代理单元、计算单元和持久化存储单元,所述持久化存储单元用于存储所述目标任务对应的图数据和状态快照数据,所述状态快照数据包括所述目标任务执行过程中产生的中间状态数据,所述方法还包括:
    利用主代理单元监测各计算节点中的代理单元的工作状态,当监测到故障代理单元时,为所述故障代理单元构建副本代理单元;
    控制所述副本代理单元为所述故障代理单元对应的持久化存储单元构建所述故障代理单元对应的计算单元,并控制构建的所述故障代理单元对应的计算单元基于所述故障代理单元对应的持久化存储单元中存储的所述目标任务的状态快照数据和图数据,恢复并继续执行所述目标任务。
  14. 根据权利要求11-12任一项所述的无服务器架构分布式容错方法,其中,所述计算节点中包括代理单元、计算单元和持久化存储单元,所述持久化存储单元用于存储所述目标任务对应的图数据和状态快照数据,所述状态快照数据包括所述目标任务执行过程中产生的中间状态数据,所述方法还包括:
    当计算节点中的代理单元监测到所述计算节点中的计算单元发生故障时,为故障计算单元创建副本计算单元;
    控制所述副本计算单元替代所述故障计算单元,基于所述计算节点中的持久化存储单元中存储的所述目标任务的状态快照数据和图数据,恢复并继续执行所述目标任务。
  15. 根据权利要求12-14任一项所述的无服务器架构分布式容错方法,还包括:
    基于各代理单元之间的通信连接,利用所述代理单元通知其他代理单元暂停执行所述目标任务;
    当监测到恢复并继续执行所述目标任务时,通知其他计算节点中的代理单元继续执行被分配的目标任务。
  16. 根据权利要求12-14任一项所述的无服务器架构分布式容错方法,其中,所述利用所述代理单元控制构建的所述计算单元,基于所述持久化存储单元中存储的所述目标任务对应的状态快照数据和图数据,恢复并继续执行所述目标任务包括:
    利用所述代理单元控制构建的所述计算单元,基于所述持久化存储单元中存储的所述目标任务对应的状态快照数据、图数据以及来自其他计算节点的状态快照数据,恢复并继续执行所述目标任务。
  17. 根据权利要求11或12所述的无服务器架构分布式容错方法,还包括:
    利用主代理单元监测各计算节点中的代理单元的工作状态,并在监测到故障代理单元时,为所述故障代理单元构建副本代理单元;控制所述副本代理单元为所述故障代理单元对应的持久化存储单元构建所述故障代理单元对应的计算单元,并控制所述故障代理单元对应的计算单元基于所述故障代理单元对应的持久化存储单元中存储的所述目标任务的状态快照数据和图数据,恢复并继续执行所述目标任务;
    当计算节点中的代理单元监测到所述计算节点中的计算单元发生故障时,为故障计算单元创建副本计算单元;控制所述副本计算单元替代所述故障计算单元,基于所述故障计算单元对应的持久化存储单元中存储的所述目标任务的状态快照数据和图数据,恢复并继续执行所述目标任务。
  18. 一种无服务器架构分布式容错装置,包括:
    第一构建模块,用于监测基于分布式架构的计算节点的工作状态,当监测到故障计算节点时,基于所述故障计算节点中的持久化存储单元,为所述故障计算节点构建副本计算节点,所述副本计算节点用于替代所述故障计算节点继续执行所述故障节点被分配的目标任务,所述持久化存储单元用于存储所述目标任务对应的图数据和状态快照数据,所述状态快照数据包括所述目标任务执行过程中产生的中间状态数据;
    第一恢复执行模块,用于控制所述副本计算节点基于所述持久化存储单元中存储的所述目标任务对应的状态快照数据和图数据,恢复并继续执行所述目标任务。
  19. 一种计算机可读存储介质,其中,所述计算机可读存储介质中存储有指令,当所述指令在终端设备上运行时,使得所述终端设备实现权利要求11-17任一项所述的无服务器架构分布式容错方法。
  20. 一种分布式图数据处理设备,包括:存储器,处理器,及存储在所述存储器上并可在所述处理器上运行的计算机程序,所述处理器执行所述计算机程序时,实现如权利要求11-17任一项所述的无服务器架构分布式容错方法。
  21. 一种计算机程序产品,其包括计算机程序/指令,当所述计算机程序/指令被处理器执行时,致使所述处理器实现权利要求11-17任一项所述的无服务器架构分布式容错方法。
  22. 一种计算机程序,包括指令,当所述指令被处理器执行时使所述处理器实现权利要求11-17任一项所述的无服务器架构分布式容错方法。
PCT/CN2023/111562 2022-08-23 2023-08-07 无服务器架构分布式容错系统、方法、装置、设备及介质 WO2024041363A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211010834.1 2022-08-23
CN202211010834.1A CN115378800B (zh) 2022-08-23 2022-08-23 无服务器架构分布式容错系统、方法、装置、设备及介质


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170003899A1 (en) * 2015-07-01 2017-01-05 Oracle International Corporation System and method for distributed persistent store archival and retrieval in a distributed computing environment
US10152386B1 (en) * 2016-07-29 2018-12-11 Nutanix, Inc. Efficient disaster rollback across heterogeneous storage systems
US20200042496A1 (en) * 2018-08-02 2020-02-06 MemVerge, Inc Key Value Store Snapshot in a Distributed Memory Object Architecture
CN111736996A (zh) * 2020-06-17 2020-10-02 上海交通大学 一种面向分布式非易失内存系统的进程持久化方法及装置
CN115378800A (zh) * 2022-08-23 2022-11-22 抖音视界有限公司 无服务器架构分布式容错系统、方法、装置、设备及介质

Also Published As

Publication number Publication date
CN115378800A (zh) 2022-11-22
CN115378800B (zh) 2024-07-16
