WO2021219208A1 - Method and system for backing up data - Google Patents

Method and system for backing up data

Info

Publication number: WO2021219208A1
Authority: WIPO (PCT)
Prior art keywords: data, image data, backup, storage apparatus, image
Application number: PCT/EP2020/061833
Other languages: French (fr)
Inventor: Assaf Natanzon
Original Assignee: Huawei Technologies Co., Ltd.
Application filed by Huawei Technologies Co., Ltd.
Priority to CN202080015592.8A (published as CN113892086A)
Priority to PCT/EP2020/061833 (published as WO2021219208A1)
Publication of WO2021219208A1

Classifications

    • G06F11/1448 Management of the data involved in backup or backup restore
    • G06F11/1464 Management of the backup or restore process for networked environments
    • G06F11/1469 Backup restoration techniques
    • G06F11/1456 Hardware arrangements for backup
    • G06F11/2071 Redundant persistent mass storage using active fault-masking, redundant by mirroring using a plurality of controllers
    • G06F2201/84 Using snapshots, i.e. a logical point-in-time copy of the data

Abstract

A method for backing up image data in a data volume associated to a virtual machine image is described. The method comprises: evaluating, for each one of one or more data movement methods for transferring image data to a backup data storage apparatus, and for each one of one or more data formats for storing image data at the backup data storage apparatus, a recovery time and recovery cost for recovering the image data from the backup data storage apparatus, a bandwidth cost for transferring image data, a data storage cost for storing image data and a cost of data loss of the image data; selecting a data movement method and data format on the basis of the evaluation; and transferring image data from the data volume to the backup data storage apparatus on the basis of the selection.

Description

METHOD AND SYSTEM FOR BACKING UP DATA
TECHNICAL FIELD
The present disclosure relates to a system and method for optimizing data backup. In particular, the methods and systems described herein optimise the transfer, storage and recovery of image data associated to virtual machine images from a backup storage apparatus.
BACKGROUND
Data loss and recovery can be very costly for businesses and organizations. On the other hand, backing up data to the cloud may also be expensive, depending on the method that is used, the type of data and the resources that are available to perform the backup.
In a traditional backup system a snapshot of the data volume is generated and communicated to a backup site for storage. In cloud-based backup systems snapshots are stored in an unstructured manner in object storage. Backup systems that back up data to object storage are preferable in some instances as object storage is cheaper than other storage architectures. However, recovery of data from object storage is often slow.
Disaster recovery systems are designed to minimize the data loss and recovery time in the event of a disaster. Some disaster recovery systems use continuous data replication to replicate a disk image in a data volume at a replication site. Disaster recovery systems that use continuous data replication methods consume a higher amount of bandwidth during data transfer than backup systems however they also reduce the recovery time.
SUMMARY
It is an object of the invention to provide a method for backing up image data associated to a virtual machine image.
The foregoing and other objects are achieved by the features of the independent claims. Further implementation forms are apparent from the dependent claims, the description and the figures.
According to a first aspect, a method for backing up image data in a data volume associated to a virtual machine image is provided. The method comprises: evaluating, for each one of one or more data movement methods for transferring image data to a backup data storage apparatus, and for each one of one or more data formats for storing image data at the backup data storage apparatus, a recovery time and recovery cost for recovering the image data from the backup data storage apparatus, a bandwidth cost for transferring image data, a data storage cost for storing image data and a cost of data loss of the image data; selecting a data movement method and data format on the basis of the evaluation; and transferring image data from the data volume to the backup data storage apparatus on the basis of the selection.
The method according to the first aspect enables a user to optimize the transfer and storage of image data to a backup data storage apparatus. This optimization is made on the basis of multiple factors including the recovery time, cost and available bandwidth.
According to a second aspect, a computing system is provided. The computing system comprises a data volume for hosting a virtual machine image and a processor communicatively coupled to the data volume and arranged to: access image data in the data volume associated to a virtual machine image; evaluate, for each one of one or more data movement methods for transferring image data to a backup data storage apparatus, and for each one of one or more data formats for storing image data at the backup data storage apparatus, a recovery time and recovery cost for recovering the image data from the backup data storage apparatus, a bandwidth cost for transferring image data, a data storage cost for storing image data and a cost of data loss of the image data; select a data movement method and data format on the basis of the evaluation; and transfer image data from the data volume to the backup data storage apparatus on the basis of the selection.
In a first implementation form, the method comprises storing the image data in a block storage device at the backup data storage apparatus.
In a second implementation form, the method comprises storing the image data in an object storage device at the backup data storage apparatus.
In a third implementation form, the method further comprises evaluating a cost for transferring image data between storage devices at the backup data storage apparatus and selecting a data movement method and data format on the basis of the evaluation.
The method according to the third implementation form further optimizes the selection of the data movement method and data format on the basis of the cost of transferring the data between storage devices at the backup site. This implementation may be used to dynamically change the data movement method and storage based on the workload at the backup site.
In a fourth implementation form, one of the one or more data movement methods comprises replicating image data to a replicated data volume at the backup data storage apparatus.
In a fifth implementation form replicating image data in the replicated data volume comprises intercepting a data write operation to the data volume associated to the virtual machine image, communicating the write operation to the replicated data volume and replicating the write operation in the replicated data volume.
In a sixth implementation form the method comprises generating a first snapshot of image data and storing image data at the backup data storage apparatus on the basis of the first snapshot.
In a seventh implementation form the method further comprises generating a second snapshot of image data, subsequent to the first snapshot, identifying image data associated to the first snapshot stored at the backup data storage apparatus, determining a difference between the image data of the first and second snapshot and storing image data at the backup data storage apparatus on the basis of the determination.
The method according to the seventh implementation form provides an efficient method of generating backup data of a data volume.
In an eighth implementation form the method comprises accessing image data stored at the object storage device of the backup data storage apparatus and transferring the image data to a block storage device.
This method according to the eighth implementation form allows backed up data to be recovered more efficiently compared to recovering data directly from the object storage device.
In a ninth implementation form the method comprises generating a snapshot of the replicated data volume and transferring the snapshot to an object storage device at the backup data storage apparatus.
The method according to the ninth implementation form provides a cheap way of storing data that has been transferred to a replicated data volume.
These and other aspects of the invention will be apparent from the embodiment(s) described below.
BRIEF DESCRIPTION OF THE DRAWINGS
For a more complete understanding of the present disclosure, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
Figure 1 is a schematic diagram of an apparatus for backing up image data, according to an example.
Figure 2 is a block diagram of a method for backing up image data, according to an example.
Figure 3 is a schematic diagram of a system comprising a memory and program code for backing up image data, according to an example.
DETAILED DESCRIPTION
Example embodiments are described below in sufficient detail to enable those of ordinary skill in the art to embody and implement the systems and processes herein described. It is important to understand that embodiments can be provided in many alternate forms and should not be construed as limited to the examples set forth herein.
Accordingly, while embodiments can be modified in various ways and take on various alternative forms, specific embodiments thereof are shown in the drawings and described in detail below as examples. There is no intent to limit to the particular forms disclosed. On the contrary, all modifications, equivalents, and alternatives falling within the scope of the appended claims should be included. Elements of the example embodiments are consistently denoted by the same reference numerals throughout the drawings and detailed description where appropriate.
The terminology used herein to describe embodiments is not intended to limit the scope. The articles “a,” “an,” and “the” are singular in that they have a single referent, however the use of the singular form in the present document should not preclude the presence of more than one referent. In other words, elements referred to in the singular can number one or more, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises,” “comprising,” “includes,” and/or “including,” when used herein, specify the presence of stated features, items, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, items, steps, operations, elements, components, and/or groups thereof. Unless otherwise defined, all terms (including technical and scientific terms) used herein are to be interpreted as is customary in the art. It will be further understood that terms in common usage should also be interpreted as is customary in the relevant art and not in an idealized or overly formal sense unless expressly so defined herein.
Figure 1 is a block diagram of an apparatus 100. The apparatus 100 comprises a host computing system 110. In one example, the host computing system 110 is a server in a data centre. There may be multiple such computing systems similar to the host computing system 110. The host computing system 110 is coupled to a local block storage device 115. In the local block storage device 115 data is organised into blocks. The blocks emulate traditional disk or tape storage. Each block has an identifier that allows the block to be addressed and retrieved by the host computing system 110.
In the example shown in Figure 1, the host computing system 110 comprises a hypervisor 120. The hypervisor 120 may be implemented in software, firmware or hardware. The hypervisor 120 creates and executes one or more virtual machines on the host system 110. Each virtual machine may be referred to herein as an instance, virtual machine instance, guest, guest instance or guest machine. The hypervisor 120 presents each instance with a virtual operating platform and manages the execution of a virtualized operating system. Multiple instances share the same underlying hardware resources of the host computing system 110 such as the local block storage device 115.
The hypervisor 120 may provide a user with one or more administrative capabilities that allow the user to control the operation of the virtual machines that are executing on the host computing system 110. In Figure 1, the host 110 controls a single virtual machine instance 130 that is managed by the hypervisor 120. The computer file(s) containing the contents and structure of the virtual machine instance 130 are stored on the local block storage device 115 of the host computing system 110 as the virtual machine volume 135.
In the example shown in Figure 1, the host computing system 110 is in communication with a backup data storage apparatus 140. In examples, the backup data storage apparatus 140 is remote from the host system 110. For example the backup data storage apparatus 140 may be managed by a third-party cloud service provider.
The backup data storage apparatus 140 comprises a block storage device 150. In the block storage device 150, data is organised into blocks, similar to the data stored in the local block storage device 115. The backup data storage apparatus 140 further comprises an object storage device 160. In contrast to the block storage device 150, the object storage device 160 manages data as objects. Each object comprises data, a variable amount of metadata, and a unique identifier. Storing data as objects allows the management of low-level data storage operations to be abstracted away at higher levels. However, performing operations on objects has performance implications. For example, modifying data associated to an object requires that the object is retrieved from the object storage device 160, the data is modified, and the entire object is written back to the object storage device 160. This makes object storage less useful for storing data that changes very frequently. On the other hand, object storage is useful for storing data if that data does not change frequently, or at all, as may be the case when storing backup data, for example.
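To make the distinction concrete, the following is a minimal sketch of the two update models, assuming hypothetical BlockDevice and ObjectStore classes invented purely for illustration; the disclosure does not define such interfaces:

```python
# Illustrative sketch only; the classes and method names are assumptions, not part of the disclosure.

class BlockDevice:
    """Fixed-size blocks addressed by block number; updates happen in place."""
    def __init__(self, block_size=4096, num_blocks=1024):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]

    def write_block(self, block_no, data):
        # Only the affected block is rewritten.
        self.blocks[block_no] = data.ljust(self.block_size, b"\0")


class ObjectStore:
    """Whole objects addressed by key; an update rewrites the entire object."""
    def __init__(self):
        self.objects = {}  # key -> (data, metadata)

    def put(self, key, data, metadata=None):
        self.objects[key] = (data, metadata or {})

    def get(self, key):
        return self.objects[key]


def modify_object(store, key, offset, patch):
    # Read-modify-write: the whole object is fetched and re-uploaded even if only
    # a few bytes change, which is why object storage suits rarely-changing backup
    # data better than frequently updated live volumes.
    data, metadata = store.get(key)
    new_data = data[:offset] + patch + data[offset + len(patch):]
    store.put(key, new_data, metadata)
```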
In examples described herein, data is moved from the virtual machine data volume 135 of the host system 110 to the backup data storage apparatus 140 using one of two data movement methods. The first method, referred to as continuous data replication, replicates the write operations of the virtual machine 130 to the virtual machine data volume 135. The virtual machine 130 executes a splitter 170. The splitter 170 intercepts the Input/Output (I/O) operations to the virtual machine data volume 135 and copies every write I/O to an application running on the backup storage apparatus 140. This write I/O is replicated by the application to a replica of the virtual machine volume 135 at the block storage device 150.
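As a rough illustration of the splitter behaviour, the sketch below models the volume as an in-memory mapping and the backup-site application as a simple object; these names and the in-memory representation are assumptions made for readability, not an implementation prescribed by the disclosure:

```python
# Sketch of a write splitter for continuous data replication (illustrative assumptions only).

class ReplicationTarget:
    """Stand-in for the application running on the backup storage apparatus 140."""
    def __init__(self, replica_volume):
        self.replica_volume = replica_volume  # replica of the virtual machine volume

    def replicate_write(self, block_no, data):
        # The application replays the write I/O against the replica volume.
        self.replica_volume[block_no] = data


class Splitter:
    """Intercepts write I/O to the local volume and copies it to the replication target."""
    def __init__(self, local_volume, replication_target):
        self.local_volume = local_volume              # e.g. dict of block_no -> bytes
        self.replication_target = replication_target

    def write(self, block_no, data):
        # 1. Apply the write to the production volume as usual.
        self.local_volume[block_no] = data
        # 2. Copy the same write I/O to the backup-site application.
        self.replication_target.replicate_write(block_no, data)


# Usage sketch: every write reaches both the local volume and the replica.
replica = {}
splitter = Splitter(local_volume={}, replication_target=ReplicationTarget(replica))
splitter.write(7, b"new data")
```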
Continuous replication consumes a large amount of bandwidth as I/O updates are continually being communicated across the network. However, the recovery time in the event of a failure of the host system 110 is fast. Furthermore, continuous replication may allow restoration to any point in time, and the recovery point may be close to synchronous with the virtual machine data volume 135. In other words, the recovery point objective (RPO) is low and can be measured in seconds, compared to hours to recover from the object storage device 160. According to examples described herein, the backup storage apparatus 140 may be arranged to periodically create a snapshot 155 of the replica volume at the block storage device 150, to allow point-in-time recovery.
The second data movement method is based on backup to the object storage device 160. An application 180 running on the host system 110 generates a snapshot of the virtual machine volume 135. The snapshot 185 is stored in the local block storage device 115. The application 180 determines a list of changes between the snapshot and the previous snapshot. If it is the first time a snapshot has been generated, the snapshot comprises the whole virtual machine volume 135. A chunking algorithm may be used to divide the data into chunks so that the amount of data communicated can be reduced using deduplication. Deduplication comprises determining which chunks are already present in the object storage device 160. New chunks are communicated by the application 180 to the object storage device 160 and the data is stored together with metadata that indicates where the chunks are located.
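A possible shape of this chunk-and-deduplicate step is sketched below. The fixed chunk size, the SHA-256 content hashing and the use of a plain dictionary standing in for the object storage device 160 are illustrative assumptions; the disclosure only requires that chunks already present are not re-sent and that metadata records where the chunks are located:

```python
import hashlib
import json

CHUNK_SIZE = 4 * 1024 * 1024  # assumed fixed-size chunking; the disclosure does not fix a size


def backup_snapshot(snapshot_bytes, object_store, image_id, version):
    """Upload only chunks not already present, plus a manifest of chunk locations (sketch)."""
    manifest = []
    for offset in range(0, len(snapshot_bytes), CHUNK_SIZE):
        chunk = snapshot_bytes[offset:offset + CHUNK_SIZE]
        key = "chunk/" + hashlib.sha256(chunk).hexdigest()
        # Deduplication: only chunks that are not already present are communicated.
        if key not in object_store:
            object_store[key] = chunk
        manifest.append({"offset": offset, "chunk": key})
    # Metadata indicating where the chunks of this image version are located.
    object_store[f"manifest/{image_id}/{version}"] = json.dumps(manifest).encode()
    return manifest
```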
Image data is restored from backup in the object storage device 160 as follows: firstly, an image to restore is selected from the object storage device 160 and a new disk volume is created in the block storage device 150. The image is rebuilt on the disk volume from the object storage device 160. The metadata of the chunked image data may be used to identify which data chunks correspond to the image. Finally, the disk volume is attached to a new virtual machine instance. In this case the recovery of the virtual machine 130 is performed in the backup storage apparatus 140.
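Continuing the same illustrative model (a dictionary standing in for the object storage device 160 and a byte buffer standing in for the new disk volume), a restore might look like the sketch below; the manifest layout matches the hypothetical one used in the backup sketch above:

```python
import json


def restore_image(object_store, image_id, version):
    """Rebuild a selected image from its backed-up chunks into a new volume (sketch)."""
    # The metadata of the chunked image data identifies which chunks belong to the image.
    manifest = json.loads(object_store[f"manifest/{image_id}/{version}"].decode())
    volume = bytearray()  # stands in for a new disk volume created in the block storage device
    for entry in sorted(manifest, key=lambda e: e["offset"]):
        chunk = object_store[entry["chunk"]]
        volume[entry["offset"]:entry["offset"] + len(chunk)] = chunk
    # The rebuilt volume would then be attached to a new virtual machine instance at the
    # backup site; attaching the volume is outside the scope of this sketch.
    return bytes(volume)
```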
In examples described herein, a method of switching from continuous data replication of the virtual machine 130 to the block storage device 150, to backup to the object storage device 160, is described. Firstly, the application 180 generates a snapshot of the virtual machine volume 135 in the local block storage device 115. The continuous data replication to the backup block storage device 150 is continued and further snapshots are generated. Once all the data of the local snapshot of volume 135 has arrived at the block storage device 150, the continuous replication stops. A process to copy all the snapshots generated at the replica site into the object storage device 160 begins. The oldest snapshot of the virtual machine volume 135 is transferred to a backup tier in the object storage device 160. Then the application 180 moves through successive snapshots, determining a list of changes and transmitting data to the object storage device 160. Once this data has been uploaded to the object storage device 160, from that point onwards the virtual machine 130 may be backed up to the object storage device 160 using the method previously described: a second snapshot of the volume 135 on the production site is created and the difference between the second snapshot and the first snapshot is transferred.
Similarly, a method of switching from backup to the object storage device 160, to continuous data replication to the block storage device 150, is described. Initially, the host system 110 is communicating image data to the object storage device 160. The splitter 170 is activated in the host system 110 and I/O operations to the local block storage device 115 are tracked. A final snapshot is generated and a backup to the object storage device 160 is performed. The most recent snapshot is restored from the object storage device to a replica volume at the backup block storage device 150. Continuous data replication to the replica volume in the block storage device 150 then starts, and the changes that occurred since tracking started are synchronized.
When the data is in the replication format on the replicated data volume, the data may also be periodically transferred to a backup format. Instead of keeping multiple snapshots of the replica volume at the block storage device 150, the snapshot 155 is created and immediately transferred to the object storage device 160 using a differential from the previous snapshot of the replica volume.
According to examples described herein, using these methods, for each virtual machine instance running on the host system 110, there are four options for storage and data transfer:
1. Data is transferred continuously and is kept in the block storage device 150. This method uses a block storage device snapshot 155 to keep a historical point-in-time record. This method, based on replication, allows fast recovery, uses high network bandwidth, has low RPO, and relies on expensive storage, namely block storage with fewer point-in-time snapshots to recover from in the event of failure.
2. Data is transferred continuously and is kept in the block storage device 150, but only the last few points in time, or just the latest point in time is kept in the block storage device 150. Old data copies are kept in the object storage device 160. This method uses replication and backup and allows fast recovery, uses high network bandwidth, has a low RPO, but uses expensive storage: both block and object storage with many points in time to recover from in the event of failure.
3. Data is transferred using the application 180 and is kept in the backup format in the object storage device 160. This method has a slow recovery, but has cheap storage, low bandwidth to perform backup and many points in time to choose from in the event of failure.
4. Data is transferred using the application 180 and is kept in backup format in the object storage device 160, but the latest images are also transferred to block format in the block storage device 150 to allow fast recovery. This method uses backup and replication, which allows for fast recovery, uses less bandwidth and has a high RPO, but suffers more data loss in the event of a disaster, and storage is expensive as both block and object storage devices are used. There are many points in time to choose from in the event of failure.
Using the methods described above, it is possible to move between these options. For example, data may be transferred as snapshots and stored only in the object storage device 160 (option 3), but then transferred to the block storage device 150 and from then on data may be backed up using data replication to the block storage device 150 (option 1 or 2). The cost of each of these options is estimated for each virtual machine instance, based on past behaviour of the virtual machine instance. The total bandwidth required for each option to move the data to the backup data storage apparatus 140 is estimated. The cost of storage at the backup data storage apparatus 140 is estimated based on the workload and the size of the instance. A recovery time and recovery cost for recovering the image data from the backup data storage apparatus 140 may also be estimated. Data loss cost may be estimated (RPO price). The data loss cost may depend on parameters supplied by a user. For example, in some cases, the recovery cost in financial terms may be higher or lower depending on how business-critical data is to a user. These estimations enable an overall cost function to be evaluated and a cost optimization to be performed across virtual machine instances. A user may set a target for recovery cost and the method described herein optimizes the transfer and storage of data based on the target.
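As a rough illustration of how such an evaluation could be carried out per instance, the sketch below scores each of the four options with a simple additive, weighted cost function and picks the cheapest. The numeric values, the weighting scheme and the field names are assumptions for illustration only; the disclosure does not prescribe a particular cost model:

```python
from dataclasses import dataclass


@dataclass
class OptionEstimate:
    """Per-option estimates for one virtual machine instance, derived from past behaviour."""
    name: str
    bandwidth_cost: float   # cost of moving data to the backup data storage apparatus
    storage_cost: float     # cost of storing the data (block and/or object storage)
    recovery_time: float    # estimated time to recover the image
    recovery_cost: float    # cost of performing the recovery
    data_loss_cost: float   # "RPO price": expected cost of data lost in a disaster


def total_cost(est, weights):
    # Simple additive weighting; the weights stand in for user-supplied parameters such as
    # how business-critical the data is and the target recovery cost.
    return (weights["bandwidth"] * est.bandwidth_cost
            + weights["storage"] * est.storage_cost
            + weights["recovery_time"] * est.recovery_time
            + weights["recovery_cost"] * est.recovery_cost
            + weights["data_loss"] * est.data_loss_cost)


def select_option(estimates, weights):
    """Select the data movement method and data format with the lowest overall cost."""
    return min(estimates, key=lambda est: total_cost(est, weights))


# Usage sketch with invented numbers for the four options listed above.
estimates = [
    OptionEstimate("1: replication, block only",          9.0, 8.0, 1.0, 2.0, 1.0),
    OptionEstimate("2: replication, block + object",      9.0, 6.0, 1.0, 2.0, 1.0),
    OptionEstimate("3: backup, object only",              2.0, 2.0, 8.0, 5.0, 6.0),
    OptionEstimate("4: backup, object + latest in block", 3.0, 5.0, 2.0, 3.0, 6.0),
]
weights = {"bandwidth": 1.0, "storage": 1.0, "recovery_time": 1.0,
           "recovery_cost": 1.0, "data_loss": 1.0}
best = select_option(estimates, weights)
```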
In some cases, the data workload changes and, using the methods described herein, the data movement method and storage is changed dynamically. The cost and method selection may also depend on how long the images need to be kept. For example, if images need to be kept for months or years, it may be preferable to keep older images in a backup format in object storage 160 even if the recovery needs to be fast as fast recovery may only be needed for newer images.
Figure 2 shows a block diagram of a method 200, according to an example. The method 200 is implemented in conjunction with the other methods and systems described herein. In particular the method 200 may be implemented on the apparatus 100 shown in Figure 1.
At block 210, for each one of one or more data movement methods for transferring image data to a backup data storage apparatus, and for each one of one or more data formats for storing image data at the backup data storage apparatus, a recovery time and recovery cost for recovering the image data from the backup data storage apparatus, a bandwidth cost for transferring image data, a data storage cost for storing image data and a cost of data loss of the image data are evaluated. As discussed herein, the data movement method may comprise continuous data replication or the snapshot method, and the data format may comprise the block data format, an object storage format, or both. At block 220, a data movement method and data format is selected on the basis of the evaluation. At block 230, image data is transferred from the data volume to the backup data storage apparatus on the basis of the selection.
Individually, both the methods of continuous data replication to block storage and backup to object storage have drawbacks. Backup systems consume less bandwidth and can use cheaper storage, while continuous data replication requires faster storage and more bandwidth, but the recovery times are much faster when a disaster occurs. Most users would like to use continuous data replication all the time; however, due to the cost and bandwidth requirements, this is impractical.
The methods and systems described herein enable a user to dynamically and seamlessly switch between methods of transferring and storing data in an optimal fashion that takes into account multiple factors including a recovery time and recovery cost for recovering the image data from the backup data storage apparatus, a bandwidth cost for transferring image data, a data storage cost for storing image data and cost of data loss of the image data.
Examples in the present disclosure can be provided as methods, systems or machine-readable instructions, such as any combination of software, hardware, firmware or the like. Such machine-readable instructions may be included on a computer readable storage medium (including but not limited to disc storage, CD-ROM, optical storage, etc.) having computer readable program codes therein or thereon.
The present disclosure is described with reference to flow charts and/or block diagrams of the method, devices and systems according to examples of the present disclosure. Although the flow diagrams described above show a specific order of execution, the order of execution may differ from that which is depicted. Blocks described in relation to one flow chart may be combined with those of another flow chart. In some examples, some blocks of the flow diagrams may not be necessary and/or additional blocks may be added. It shall be understood that each flow and/or block in the flow charts and/or block diagrams, as well as combinations of the flows and/or diagrams in the flow charts and/or block diagrams can be realized by machine readable instructions.
The machine-readable instructions may, for example, be executed by a general-purpose computer, a special purpose computer, an embedded processor or processors of other programmable data processing devices to realize the functions described in the description and diagrams. In particular, a processor or processing apparatus may execute the machine-readable instructions. Thus, modules of apparatus may be implemented by a processor executing machine-readable instructions stored in a memory, or a processor operating in accordance with instructions embedded in logic circuitry. The term 'processor' is to be interpreted broadly to include a CPU, processing unit, logic unit, or programmable gate set etc. The methods and modules may all be performed by a single processor or divided amongst several processors. Such machine-readable instructions may also be stored in a computer readable storage that can guide the computer or other programmable data processing devices to operate in a specific mode.
For example, the instructions may be provided on a non-transitory computer readable storage medium encoded with instructions, executable by a processor. Figure 3 shows an example of a processor 310 associated with a memory 320. The memory 320 includes program code 330 which is executable by the processor 310. The program code 330 provides instructions to: access image data in the data volume associated to a virtual machine image, evaluate, for each one of one or more data movement methods for transferring image data to a backup data storage apparatus, and for each one of one or more data formats for storing image data at the backup data storage apparatus, a recovery time and recovery cost for recovering the image data from the backup data storage apparatus, a bandwidth cost for transferring image data, a data storage cost for storing image data and cost of data loss of the image data, select a data movement method and data format on the basis of the evaluation and transfer image data from the data volume to the backup data storage apparatus on the basis of the selection.
Such machine-readable instructions may also be loaded onto a computer or other programmable data processing devices, so that the computer or other programmable data processing devices perform a series of operations to produce computer-implemented processing, thus the instructions executed on the computer or other programmable devices provide an operation for realizing functions specified by flow(s) in the flow charts and/or block(s) in the block diagrams.
Further, the teachings herein may be implemented in the form of a computer software product, the computer software product being stored in a storage medium and comprising a plurality of instructions for making a computer device implement the methods recited in the examples of the present disclosure.
While the method, apparatus and related aspects have been described with reference to certain examples, various modifications, changes, omissions, and substitutions can be made without departing from the present disclosure. In particular, a feature or block from one example may be combined with or substituted by a feature/block of another example.
It should be appreciated that one or more steps of the embodiment methods provided herein may be performed by corresponding units or modules. The respective units or modules may be hardware, software, or a combination thereof. For instance, one or more of the units or modules may be an integrated circuit, such as field programmable gate arrays (FPGAs) or application-specific integrated circuits (ASICs). Although the present disclosure and its advantages have been described in detail, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the disclosure as defined by the appended claims.

Claims

1. A method for backing up image data in a data volume associated to a virtual machine image, the method comprising: evaluating, for each one of one or more data movement methods for transferring image data to a backup data storage apparatus, and for each one of one or more data formats for storing image data at the backup data storage apparatus, a recovery time and recovery cost for recovering the image data from the backup data storage apparatus, a bandwidth cost for transferring image data, a data storage cost for storing image data and cost of data loss of the image data; selecting a data movement method and data format on the basis of the evaluation; and transferring image data from the data volume to the backup data storage apparatus on the basis of the selection.
2. The method of claim 1, comprising storing the image data in a block storage device at the backup data storage apparatus.
3. The method of claim 1 or 2, comprising storing the image data in an object storage device at the backup data storage apparatus.
4. The method of claim 1, further comprising: evaluating a cost for transferring image data between storage devices at the backup data storage apparatus; and selecting a data movement method and data format on the basis of the evaluation.
5. The method of claim 1, wherein one of the one or more data movement methods comprises replicating image data to a replicated data volume at the backup data storage apparatus.
6. The method of claim 5, wherein replicating image data in the replicated data volume comprises: intercepting a data write operation to the data volume associated to the virtual machine image; communicating the write operation to the replicated data volume; and replicating the write operation in the replicated data volume.
7. The method of claim 1, comprising: generating a first snapshot of image data; and storing image data at the backup data storage apparatus on the basis of the first snapshot.
8. The method of claim 7, further comprising: generating a second snapshot of image data subsequent to the first snapshot; identifying image data associated to the first snapshot stored at the backup data storage apparatus; determining a difference between the image data of the first and second snapshot; and storing image data at the backup data storage apparatus on the basis of the determination.
9. The method of claim 3, comprising accessing image data stored at the object storage device of the backup data storage apparatus and transferring the image data to a block storage device.
10. The method of claims 5 or 6, comprising generating a snapshot of the replicated data volume and transferring the snapshot to an object storage device at the backup data storage apparatus.
11. A computing system comprising: a data volume for hosting a virtual machine image; a processor communicatively coupled to the data volume and arranged to: access image data in the data volume associated to a virtual machine image; evaluate, for each one of one or more data movement methods for transferring image data to a backup data storage apparatus, and for each one of one or more data formats for storing image data at the backup data storage apparatus, a recovery time and recovery cost for recovering the image data from the backup data storage apparatus, a bandwidth cost for transferring image data, a data storage cost for storing image data and cost of data loss of the image data; select a data movement method and data format on the basis of the evaluation; and transfer image data from the data volume to the backup data storage apparatus on the basis of the selection.
12. The computing system of claim 11, wherein the processor is further arranged to: evaluate a cost for transferring image data between storage devices at the backup data storage apparatus; and select a data movement method and data format on the basis of the evaluation.
13. The computing system of claim 11, wherein for one of the one or more data movement methods the processor is arranged to replicate image data to a replicated data volume at the backup data storage apparatus.
14. The computing system of claim 11, wherein the processor is arranged to: intercept a data write operation to the data volume associated to the virtual machine image; and communicate the write operation to the replicated data volume.
15. The computing system of claim 11, wherein the replicated data volume is a block storage volume.
16. The computing system of claim 11, wherein, for one of the one or more data movement methods, the processor is arranged to: generate a snapshot of image data in the data volume; and communicate image data for storage at a data storage device on the backup data storage apparatus on the basis of the snapshot.
17. The computing system of claim 11, wherein the processor is arranged to: determine a difference between the snapshot and image data associated to the virtual machine image stored at the data storage device; and communicate the image data for storage at the data storage device on the basis of the determination.
18. The computing system of claim 11, wherein the data storage device is an object storage device.
19. An apparatus, comprising: a computing system according to any one of claims 11 to 18; and a data storage apparatus in communication with the computing system, the data storage apparatus comprising one or more data storage devices to store image data received from the computing system.
20. The apparatus of claim 19, wherein at least one of the one or more data storage devices is a block storage device.
21. The apparatus of claim 19 or 20, wherein at least one of the one or more data storage devices is an object storage device.
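Purely as a non-limiting sketch, and not as part of the claimed subject matter, the replication-based and snapshot-based data movement methods recited in claims 5 to 8 could be illustrated as follows; the class and function names are assumptions introduced only for this illustration.

class ReplicatedVolume:
    # Replicated data volume at the backup data storage apparatus (claims 5 and 6).
    def __init__(self, size: int):
        self.blocks = bytearray(size)

    def apply_write(self, offset: int, data: bytes) -> None:
        # Replay a communicated write operation in the replicated data volume.
        self.blocks[offset:offset + len(data)] = data

def intercept_write(data_volume: bytearray, replica: ReplicatedVolume,
                    offset: int, data: bytes) -> None:
    # Intercept a write to the data volume associated to the virtual machine
    # image and communicate it to the replicated data volume.
    data_volume[offset:offset + len(data)] = data
    replica.apply_write(offset, data)

def snapshot_difference(first_snapshot: bytes, second_snapshot: bytes,
                        block_size: int = 4096) -> dict:
    # Determine the difference between the image data of the first and second
    # snapshot (claims 7 and 8); only the changed blocks would then be stored
    # at the backup data storage apparatus.
    changed = {}
    for offset in range(0, len(second_snapshot), block_size):
        old_block = first_snapshot[offset:offset + block_size]
        new_block = second_snapshot[offset:offset + block_size]
        if old_block != new_block:
            changed[offset] = new_block
    return changed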
PCT/EP2020/061833 2020-04-29 2020-04-29 Method and system for backing up data WO2021219208A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080015592.8A CN113892086A (en) 2020-04-29 2020-04-29 Method and system for backing up data
PCT/EP2020/061833 WO2021219208A1 (en) 2020-04-29 2020-04-29 Method and system for backing up data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2020/061833 WO2021219208A1 (en) 2020-04-29 2020-04-29 Method and system for backing up data

Publications (1)

Publication Number Publication Date
WO2021219208A1 (en) 2021-11-04

Family

ID=70480275

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2020/061833 WO2021219208A1 (en) 2020-04-29 2020-04-29 Method and system for backing up data

Country Status (2)

Country Link
CN (1) CN113892086A (en)
WO (1) WO2021219208A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120123999A1 (en) * 2010-11-16 2012-05-17 Actifio, Inc. System and method for managing data with service level agreements that may specify non-uniform copying of data
US8484356B1 (en) * 2011-06-08 2013-07-09 Emc Corporation System and method for allocating a storage unit for backup in a storage system with load balancing
US20170031613A1 (en) * 2015-07-30 2017-02-02 Unitrends, Inc. Disaster recovery systems and methods
US10467102B1 (en) * 2016-12-15 2019-11-05 EMC IP Holding Company LLC I/O score-based hybrid replication in a storage system

Also Published As

Publication number Publication date
CN113892086A (en) 2022-01-04

Similar Documents

Publication Publication Date Title
US11797395B2 (en) Application migration between environments
US11782891B2 (en) Automated log-based remediation of an information management system
US11042446B2 (en) Application-level live synchronization across computing platforms such as cloud platforms
US11074143B2 (en) Data backup and disaster recovery between environments
US10114581B1 (en) Creating a virtual access point in time on an object based journal replication
CN107111533B (en) Virtual machine cluster backup
US10055148B1 (en) Storing application data as an enhanced copy
US10235061B1 (en) Granular virtual machine snapshots
CN107003890B (en) Efficiently providing virtual machine reference points
US20170262345A1 (en) Backup, Archive and Disaster Recovery Solution with Distributed Storage over Multiple Clouds
US7831787B1 (en) High efficiency portable archive with virtualization
US11809287B2 (en) On-the-fly PiT selection in cloud disaster recovery
US11645169B2 (en) Dynamic resizing and re-distribution of destination data storage resources for bare metal restore operations in a data storage management system
US10776211B1 (en) Methods, systems, and apparatuses to update point in time journal using map reduce to create a highly parallel update
US11675674B2 (en) Instant recovery of databases
WO2021219208A1 (en) Method and system for backing up data
US20220391288A1 (en) Continuous data protection in cloud using streams
US11836512B2 (en) Virtual machine replication strategy based on predicted application failures
US20220391328A1 (en) Continuous data protection in cloud using streams
US11151101B2 (en) Adjusting growth of persistent log
US20220391287A1 (en) Continuous data protection in cloud using streams

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20723092

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20723092

Country of ref document: EP

Kind code of ref document: A1