US20160150012A1 - Content-based replication of data between storage units - Google Patents

Content-based replication of data between storage units

Info

Publication number
US20160150012A1
Authority
US
United States
Prior art keywords
upstream
chunk
checksum
array
downstream
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/950,456
Inventor
Tomasz Barszczak
Gurunatha Karaje
Nimesh Bhagat
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hewlett Packard Enterprise Development LP
Original Assignee
Nimble Storage Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nimble Storage Inc filed Critical Nimble Storage Inc
Priority to US14/950,456
Assigned to NIMBLE STORAGE, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: BARSZCZAK, TOMASZ; BHAGAT, NIMESH; KARAJE, GURUNATHA
Publication of US20160150012A1
Assigned to HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: NIMBLE STORAGE, INC.


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10 File systems; File servers
    • G06F 16/18 File system types
    • G06F 16/182 Distributed file systems
    • G06F 16/184 Distributed file systems implemented as replicated file system
    • G06F 16/1844 Management specifically adapted to replicated file systems
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/22 Indexing; Data structures therefor; Storage structures
    • G06F 16/2228 Indexing structures
    • G06F 16/2237 Vectors, bitmaps or matrices
    • G06F 16/23 Updating
    • G06F 16/2365 Ensuring data consistency and integrity
    • G06F 16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0601 Interfaces specially adapted for storage systems
    • G06F 3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F 3/0604 Improving or facilitating administration, e.g. storage management
    • G06F 3/0607 Improving or facilitating administration, e.g. storage management by facilitating the process of upgrading existing storage systems, e.g. for improving compatibility between host and storage device
    • G06F 3/0614 Improving the reliability of storage systems
    • G06F 3/0619 Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G06F 3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F 3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F 3/065 Replication mechanisms
    • G06F 3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F 3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G06F 3/0671 In-line storage system
    • G06F 3/0683 Plurality of storage devices
    • G06F 3/0689 Disk arrays, e.g. RAID, JBOD
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • H04L 67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Definitions

  • The present embodiments relate to methods, systems, and programs for replicating data in a networked storage system.
  • Network storage, also referred to as network storage systems or storage systems, is computer data storage connected to a computer network providing data access to heterogeneous clients.
  • Typically, network storage systems process a large number of Input/Output (IO) requests, and high availability, speed, and reliability are desirable characteristics of network storage.
  • Sometimes data is copied from one system to another, such as when an organization upgrades to a new data storage device, when backing up data to a different location, or when backing up data for the purpose of disaster recovery.
  • In these cases, the data needs to be migrated or replicated from the old device to the new device.
  • What is needed is a network storage device, software, and systems that provide verification of the correct transfer of large amounts of data from one system to another, as well as ways to correct errors found during the replication process.
  • The present embodiments relate to fixing problems when data is replicated from a first system to a second system. It should be appreciated that the present embodiments can be implemented in numerous ways, such as a method, an apparatus, a system, a device, or a computer program on a computer readable medium. Several embodiments are described below.
  • One aspect includes a method for replicating data across storage systems.
  • The method includes an operation for transferring a snapshot of a volume from an upstream array to a downstream array, the volume being a single accessible storage area within the upstream array.
  • The method further includes comparing an upstream snapshot checksum of the snapshot in the upstream array with a downstream snapshot checksum of the snapshot in the downstream array.
  • When the upstream snapshot checksum is different from the downstream snapshot checksum, a plurality of chunks is defined in the snapshot. For each chunk in the snapshot, an upstream chunk checksum calculated by the upstream array is compared with a downstream chunk checksum calculated by the downstream array.
  • Further, the method includes an operation for sending, from the upstream array to the downstream array, data of the chunk when the upstream chunk checksum is different from the downstream chunk checksum.
  • One general aspect includes a method for replicating data across storage systems.
  • The method includes an operation for transferring a snapshot of a volume from an upstream array to a downstream array, the volume being a single accessible storage area within the upstream array.
  • The method also includes an operation for comparing an upstream snapshot checksum of the snapshot in the upstream array with a downstream snapshot checksum of the snapshot in the downstream array.
  • When the upstream snapshot checksum is different from the downstream snapshot checksum, a plurality of chunks is defined in the snapshot.
  • For each chunk in the snapshot, an upstream chunk checksum calculated by the upstream array is compared with a downstream chunk checksum calculated by the downstream array.
  • When the upstream chunk checksum is different from the downstream chunk checksum, a plurality of blocks is defined in the chunk.
  • Further, for each block in the chunk, an upstream block checksum calculated by the upstream array is compared with a downstream block checksum calculated by the downstream array.
  • When the upstream block checksum is different from the downstream block checksum, data of the block is sent from the upstream array to the downstream array.
  • One aspect includes a non-transitory computer-readable storage medium storing a computer program for replicating data across storage systems.
  • The computer-readable storage medium includes program instructions for transferring a snapshot of a volume from an upstream array to a downstream array, the volume being a single accessible storage area within the upstream array, and program instructions for comparing an upstream snapshot checksum of the snapshot in the upstream array with a downstream snapshot checksum of the snapshot in the downstream array.
  • When the upstream snapshot checksum is different from the downstream snapshot checksum, a plurality of chunks is defined in the snapshot. For each chunk in the snapshot, an upstream chunk checksum calculated by the upstream array is compared with a downstream chunk checksum calculated by the downstream array.
  • The storage medium further includes program instructions for sending, from the upstream array to the downstream array, data of the chunk when the upstream chunk checksum is different from the downstream chunk checksum.
  • FIG. 1 illustrates the replication of the snapshots from one system to another, according to one embodiment.
  • FIG. 2 illustrates the partition of a volume into a plurality of chunks, where each chunk may include a plurality of blocks, according to one embodiment.
  • FIG. 3 illustrates the content-based replication (CBR) method for validating data and correcting erroneous data between two volumes, according to one embodiment.
  • FIG. 4 illustrates the CBR process which includes checking block checksums, according to one embodiment.
  • FIG. 5 illustrates the read and write paths within the storage array, according to one embodiment.
  • FIG. 6 illustrates an example of a configuration where multiple arrays can be made part of a group (i.e., a cluster), in accordance with one embodiment.
  • FIG. 7 illustrates the architecture of a storage array, according to one embodiment.
  • FIG. 8 is a flow chart of a method for replicating data across storage systems, according to one embodiment.
  • In some implementations, a Snapshot Delta Replication (SDR) method is used to replicate snapshots of data volumes in a network storage device.
  • However, something could have gone wrong during the replication, so a check is made to determine whether the replicated snapshot is correct. If the replication is not completely correct, the data would have to be resent, which may be very costly in resources.
  • In order to avoid having to replicate all the data again, a Content-Based Replication (CBR) method is used to minimize the amount of data needed to fix the replicated snapshot.
  • With the CBR method, volume checksums are made at the upstream system (the system being replicated) and the downstream system (the system where the replicated data is received). If the checksums do not match, the volume is divided into large pieces of data, referred to herein as chunks (e.g., 16 MB, although other values are also possible). Checksums are then performed for each chunk, at the upstream system and the downstream system. If the corresponding pair of checksums for the same chunk do not match at the upstream and the downstream systems, then the upstream system sends the chunk of data to the downstream system.
  • In one embodiment, another level of iteration is used to further divide the chunks into smaller pieces and perform checksums on the smaller pieces. For example, checksums of the blocks within a chunk can be compared, and then only the blocks that have mismatched checksums are transmitted over the network, as illustrated by the sketch below.
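  • As an illustration only, the following minimal sketch shows this two-level comparison on in-memory byte buffers. The function names, the use of SHA-1, and the 4 KB/16 MB sizes are assumptions drawn from examples given elsewhere in this text; a real system would exchange checksums over the network rather than hold both copies locally.

```python
# Minimal sketch of two-level CBR; illustrative only, not the patent's code.
import hashlib

BLOCK_SIZE = 4 * 1024          # 4 KB block (example value from the text)
CHUNK_SIZE = 16 * 1024 * 1024  # 16 MB chunk (example value from the text)

def checksum(data) -> bytes:
    return hashlib.sha1(data).digest()  # SHA-1 is one option named in the text

def cbr_fix(upstream: bytearray, downstream: bytearray) -> int:
    """Make `downstream` match `upstream`; return the number of bytes re-sent.

    Assumes both buffers have the same length.
    """
    if checksum(upstream) == checksum(downstream):
        return 0                                   # snapshot already validated
    sent = 0
    for c in range(0, len(upstream), CHUNK_SIZE):
        up_chunk = upstream[c:c + CHUNK_SIZE]
        down_chunk = downstream[c:c + CHUNK_SIZE]
        if checksum(up_chunk) == checksum(down_chunk):
            continue                               # chunk validated, skip it
        # Second level: only blocks with mismatched checksums are "re-sent".
        for b in range(0, len(up_chunk), BLOCK_SIZE):
            up_block = up_chunk[b:b + BLOCK_SIZE]
            if checksum(up_block) != checksum(down_chunk[b:b + BLOCK_SIZE]):
                downstream[c + b:c + b + len(up_block)] = up_block
                sent += len(up_block)
    return sent
```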
  • In another embodiment, an automated program determines when CBR is to be performed, based on system parameters (designed by the system designers), user configuration (e.g., once a week), or heuristics that determine when the risk of an incorrect replication is high (e.g., after installing a new release). For example, in one embodiment, CBR could be more efficient than SDR for replication seeding when a common base snapshot is not found between the upstream and the downstream volumes but the downstream volume already has blocks of the volume due to an earlier SDR.
  • FIG. 1 illustrates the replication of the snapshots from one system to another, according to one embodiment.
  • A volume is a single accessible storage area, which may be reserved for one application or one host, for a group of users of an organization, or to segment/separate types of data for security or accessibility.
  • The data of the volume is divided into blocks, and the data from the volume is accessed by identifying a block (e.g., identifying an offset associated with the block being retrieved). That is, data from the volume is accessed by the host in units of the size of the block, and the block is the smallest amount of data that can be requested from the volume.
  • The networked storage device where the data is stored is also referred to herein as a storage array or a storage system.
  • A first system creates snapshots of a volume over time (e.g., S1, S2, S3, etc.).
  • The volume replicates one or more of the snapshots to a second volume, for example to provide backup of the data in a different location or in a different storage array.
  • The storage array that holds the source data to be copied is referred to as the upstream storage array, the upstream system, or the base storage array, and the storage array that receives a copy of the data is referred to as the downstream storage array or the downstream system.
  • When SDR is in the process of replicating a snapshot to create a replicated snapshot in another storage array, SDR uses a base snapshot that is already present on the downstream as well as on the upstream to compute which blocks need to be transferred. This common snapshot is also referred to as the common ancestor snapshot. After SDR is complete, the replicated snapshot is present on both the upstream and the downstream storage arrays.
  • Initially, replication means copying all the data from the upstream volume to the downstream volume.
  • The replication of a later snapshot includes copying only the data that has changed, which is also referred to herein as the delta data or the difference between the two snapshots. It is noted that not all the snapshots in the upstream volume have to be replicated to the downstream volume.
  • In the example of FIG. 1, the upstream volume has over time generated five snapshots: S1, S2, S3, S4, and S5.
  • The replication policy specifies that every other snapshot in the upstream volume is to be copied to the downstream volume. Therefore, the downstream volume has replicated snapshots S1′, S3′, and S5′.
  • The snapshots with the apostrophe mark refer to the data in the downstream system.
  • Replicating snapshot S1 requires copying all the data from S1 to S1′, because no previous snapshots have been replicated. However, replicating snapshot S3 requires copying only the difference between S3 and S1 [S3−S1]. In one embodiment, this method of replicating snapshots from the upstream to the downstream volume by copying the difference between two snapshots in time is referred to herein as snapshot delta replication (SDR).
  • Sometimes SDR is an efficient process, but at other times SDR is very inefficient.
  • For example, assume two blocks, B1 and B2, are written to the volume after snapshot S1 is taken but before snapshot S3 is taken. If SDR is performed for snapshot S3 using snapshot S1 as the common snapshot, only blocks B1 and B2 will be replicated (i.e., transmitted to the downstream system), and SDR is efficient in this case (see the sketch below).
  • However, if snapshot S1 is not available in the downstream system, then SDR would be inefficient, as the complete volume would have to be transmitted to the downstream system.
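  • The delta computation at the heart of SDR can be sketched with a toy snapshot model (a dict from block offset to block contents); the model and names are illustrative assumptions, not the patent's data structures.

```python
# Hedged sketch of SDR: replicate only the blocks that changed between a
# common ancestor snapshot and the snapshot being replicated.
from typing import Dict

Snapshot = Dict[int, bytes]   # block offset -> block contents

def sdr_delta(common: Snapshot, new: Snapshot) -> Snapshot:
    """Return only the blocks of `new` that differ from `common`."""
    return {off: data for off, data in new.items()
            if common.get(off) != data}

# Example: only B1 and B2 changed between S1 and S3, so only they are sent.
s1 = {0: b"A", 4096: b"B", 8192: b"C"}
s3 = {0: b"A", 4096: b"B1", 8192: b"C", 12288: b"B2"}
assert sdr_delta(s1, s3) == {4096: b"B1", 12288: b"B2"}
```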
  • FIG. 2 illustrates the partition of a volume into a plurality of chunks, where each chunk includes a plurality of blocks, according to one embodiment.
  • Sometimes a downstream snapshot is not exactly the same as the upstream snapshot, e.g., because of a failure during the communication of the data from the upstream to the downstream volume.
  • The detection that the snapshots are not exactly equal may be performed by computing checksums of the upstream and downstream volumes. If the checksums do not match, then there is a problem with the copied data. An obvious but expensive solution is to recopy all the data until the checksums match. However, re-copying large amounts of data may cause distress in the data storage system and impact performance, which means that the transfer of large amounts of data should be avoided during normal operating hours. Therefore, resending all the data is not a desirable solution.
  • Instead, the volume is logically divided into large groups of data, referred to herein as chunks.
  • In one embodiment, the size of a block is 4 KB, but other values are also possible, such as in the range from 256 bytes to 50 KB or more.
  • A chunk (e.g., 16 MB) is usually much larger than a block, so the chunk includes a plurality of blocks.
  • Unlike the block, the chunk is not addressable for accessing data from the volume; the chunk is only utilized for correcting the replication of snapshots, as described in more detail below.
  • Other embodiments may include other sizes for chunks, such as in the range from 1 megabyte to 100 megabytes, or in the range from 100 megabytes to one or several gigabytes.
  • In one embodiment, the size of the chunk is 100 times the size of the block, but other multipliers are also possible, such as 50 to 5000; that is, the chunk may be 50 to 5000 times bigger than the block (see the worked example below).
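  • As a worked example combining the illustrative sizes above (the combination is for illustration, not prescribed by the text):

```python
# Worked size example using the example values given above.
BLOCK_SIZE = 4 * 1024            # 4 KB block
CHUNK_SIZE = 16 * 1024 * 1024    # 16 MB chunk
blocks_per_chunk = CHUNK_SIZE // BLOCK_SIZE
assert blocks_per_chunk == 4096  # within the 50x-5000x multiplier range
```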
  • FIG. 2 shows a volume that has been divided into chunks C1, C2, C3, etc. Further, each chunk contains blocks; for example, chunk C1 includes blocks B1, B2, B3, etc.
  • The checksums performed can be of any type. In one embodiment, a cryptographically strong checksum is utilized, such as SHA-1, which produces a 20-byte digest (e.g., about 5 GB of checksum data per TB if a checksum is transmitted for every 4 KB uncompressed block). In another embodiment, a 16-byte digest is utilized. In yet another embodiment, the checksum is SHA-2.
  • In another embodiment, the checksum is a Fletcher checksum. For example, a Fletcher checksum may be utilized for snapshots, while a SHA-1 checksum may be utilized for chunks or blocks.
  • In one embodiment, the checksum type is negotiated between the upstream and the downstream storage arrays during the CBR initialization period.
  • Checksums may be performed over compressed or uncompressed data.
  • In one embodiment, the checksum of uncompressed data is utilized, but this requires decompression, which causes higher resource utilization.
  • In another embodiment, the checksum is performed over compressed data; however, this option may stop working when the compression of blocks starts differing between the upstream and the downstream (e.g., due to strong background recompression taking place in the downstream system).
  • In yet another embodiment, uncompressed checksums are stored for certain data ranges, and a larger checksum is formed by combining the stored uncompressed checksums. This way, there is no need to decompress the blocks to obtain the checksums of the chunks, as in the sketch below.
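  • A minimal sketch of this combining idea follows, assuming per-block SHA-1 digests of uncompressed data are stored, and that a chunk checksum is derived by hashing the concatenated block digests; the combining rule is an assumption, not the patent's scheme.

```python
# Sketch: derive a chunk checksum from stored per-block checksums of the
# uncompressed data, so blocks never need to be decompressed.
import hashlib
from typing import List

def block_checksum(uncompressed_block: bytes) -> bytes:
    return hashlib.sha1(uncompressed_block).digest()

def chunk_checksum(stored_block_checksums: List[bytes]) -> bytes:
    # Both arrays must combine the stored digests in the same agreed order.
    return hashlib.sha1(b"".join(stored_block_checksums)).digest()
```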
  • FIG. 3 illustrates the content-based replication (CBR) method for validating data and correcting erroneous data transfers between two volumes, according to one embodiment.
  • A snapshot S1 is copied from an upstream volume to a snapshot S1′ in the downstream volume.
  • The snapshots can be replicated by using the SDR method described above.
  • In one embodiment, the network storage system may limit the CBR process to one volume at a time, in order to limit the stress on the system.
  • Further, one or more volumes may skip the CBR process if the volumes were created after a certain time (e.g., the time when the storage array was upgraded past a known release with a potential replication problem).
  • Initially, the upstream and the downstream arrays may exchange CBR-related information, such as the checksum type, the checksum size, and how much data is covered by each checksum (e.g., the size of the chunk, and how many blocks are in each chunk).
  • The validation of the snapshots can be initiated in different ways. For example, an administrator may request a storage array to check the validity of a snapshot in a downstream volume, or an automated validating process may be initiated by the storage array. For example, a validating process may be initiated periodically, after the data center updates the software of one or more storage arrays, or as additional hardware (e.g., another storage array) is added to the network data system.
  • The upstream volume computes the checksum of S1, i.e., the checksum of the complete snapshot S1.
  • The upstream volume then sends a request to the downstream volume to provide the checksum of the downstream snapshot S1′.
  • In other embodiments, the downstream volume initiates the process for comparing the checksums.
  • In other words, some of the methods described herein include operations performed by the upstream volume (e.g., initiating the validation procedure, comparing checksums, etc.), but the same principles may be applied when the downstream volume performs these operations (e.g., initiating the validation of the replicated data).
  • The downstream volume then calculates the S1′ checksum (or retrieves the checksum from memory if the checksum is already available) and sends the checksum to the upstream volume.
  • The upstream volume compares the two checksums of S1 and S1′, and if the checksums match, the snapshot is assumed to be correct (i.e., validated). However, if the checksums do not match, then the content-based replication (CBR) process is started.
  • A principle of CBR is to calculate checksums over large amounts of data (e.g., for each chunk) instead of comparing the checksums of each of the individual blocks in the volume.
  • In one embodiment, when a mismatch is detected, the system administrator gets an alert (on the downstream array, on the upstream array, or on both). The alert indicates that the replicated snapshot is compromised (and maybe older snapshots too). After executing CBR, the system administrator will get another alert indicating that the mismatch has been fixed in the most recent replicated snapshot.
  • The upstream volume sends a request to the downstream volume to start the CBR process, and sends information related to the process, such as the checksum type to be performed, the chunk size, and a cursor used to indicate at which chunk to start the CBR process.
  • The cursor is useful in case the CBR process is suspended for any reason, such as a system suffering downtime or a network-related problem (e.g., a network disconnect). This way, when the upstream and the downstream volumes are ready to continue with the suspended CBR process, the process does not have to be restarted from the beginning, but can resume from the place associated with the value of the cursor.
  • The cursor may be kept in the upstream volume, in the downstream volume, or in both places.
  • In one embodiment, the cursor is an identifier for a chunk in the volume, where all the chunks that precede the identified chunk are considered to have already been validated.
  • For each chunk, the upstream and the downstream systems calculate the respective checksums Ci and Ci′. Then the downstream array sends checksum Ci′ to the upstream array, and the upstream array compares checksums Ci and Ci′. If the checksums match, the process continues with the next chunk, until all the chunks are validated. However, if the checksums Ci and Ci′ do not match, the upstream storage array sends the data for chunk Ci to the downstream array. When the last chunk has been validated, the upstream storage array sends a CBR-complete notification to the downstream array (see the cursor sketch below).
  • In one embodiment, the upstream array and the downstream array coordinate the validation process by checksumming and comparing a plurality of chunks simultaneously (e.g., in parallel); that is, the arrays do not have to wait until a chunk validation is completed to perform the validation of the next chunk, and several chunk-validation processes may be performed in parallel.
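  • The cursor-driven chunk loop might be sketched as follows; fetch_downstream_checksum, send_chunk, and save_cursor are hypothetical stand-ins for the network protocol and for persisting the cursor, not actual interfaces.

```python
# Sketch of resumable chunk validation driven by a persisted cursor.
import hashlib
from typing import Callable

CHUNK_SIZE = 16 * 1024 * 1024   # example chunk size from the text

def validate_chunks(volume: bytes,
                    cursor: int,    # index of the first unvalidated chunk
                    fetch_downstream_checksum: Callable[[int], bytes],
                    send_chunk: Callable[[int, bytes], None],
                    save_cursor: Callable[[int], None]) -> None:
    n_chunks = (len(volume) + CHUNK_SIZE - 1) // CHUNK_SIZE
    for i in range(cursor, n_chunks):           # resume where we left off
        chunk = volume[i * CHUNK_SIZE:(i + 1) * CHUNK_SIZE]
        if hashlib.sha1(chunk).digest() != fetch_downstream_checksum(i):
            send_chunk(i, chunk)    # only mismatched chunks are re-sent
        save_cursor(i + 1)          # chunks before the cursor are validated
```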
  • SDR and CBR may coexist in the same storage array, or even in the same volume, because at different times and under different circumstances one method may be preferred over the other.
  • In one embodiment, a per-volume state is maintained, in both the upstream and the downstream arrays, for managing and tracking content-based replication of each volume (a sketch of such a state record follows below).
  • The downstream volume's state is consulted during the replication protocol phase that occurs prior to the SDR data transfer phase. If the upstream or the downstream volume state indicates the need for content-based replication to occur, the upstream and/or the downstream array coordinates with the storage control system to initiate CBR.
  • The upstream array sends an indication to the downstream array during the snapshot replication phase as to whether or not content-based replication was carried out. This allows the downstream array to update the volume state, which includes clearing a flag that indicates content-based replication is needed, and updating the state to indicate the snapshot ID at which content-based replication occurred. Also, the downstream array will issue an alert if the volume record indicates that errors took place (which need to be fixed at this point).
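  • A sketch of such a per-volume state record follows; the field and function names are illustrative assumptions, not the actual record layout.

```python
# Sketch of per-volume CBR tracking state.
from dataclasses import dataclass

@dataclass
class VolumeCbrState:
    cbr_needed: bool = False       # flag cleared once CBR has been carried out
    cbr_snapshot_id: int = -1      # snapshot ID at which CBR last occurred
    errors_detected: bool = False  # downstream issues an alert while set
    cursor: int = 0                # first chunk not yet validated

def on_snapshot_replication_done(state: VolumeCbrState,
                                 cbr_performed: bool,
                                 snapshot_id: int) -> None:
    if cbr_performed:
        state.cbr_needed = False
        state.cbr_snapshot_id = snapshot_id
        state.errors_detected = False
        state.cursor = 0
```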
  • FIG. 4 illustrates the CBR process which includes checking block checksums, according to one embodiment.
  • The purpose of CBR is to perform checksums over large groups of data and, if the checksums fail, to send only the data that was incorrectly replicated.
  • The volume has been divided into chunks, as shown in FIG. 2, but the process may be performed iteratively, further dividing each chunk into sub-chunks that are smaller than the chunks.
  • Checksums for the sub-chunks are calculated and compared, and the data for the sub-chunks that fail the validation is sent from the upstream array to the downstream array, instead of having to send the whole chunk.
  • In one embodiment, the size of the chunk is between 5 times and 1000 times the size of the sub-chunk, but other multipliers are also possible.
  • FIG. 4 is an example of a two-level CBR process, where the first level of validation is performed for the chunks, as described above with reference to FIG. 3, and the second level of validation is performed at the block level.
  • Although blocks are utilized as sub-chunks in this example, any other size of sub-chunk may also be utilized.
  • The operations described in FIG. 4 are initially the same as in the method of FIG. 3, but the method diverges once the checksum for a chunk fails.
  • At that point, the second-level CBR is initiated at the block level.
  • The upstream volume sends a command to the downstream volume indicating that there has been a chunk checksum mismatch and that block checksumming is initiated.
  • The command includes information regarding the second-level validation, such as a block cursor (similar to the chunk cursor), the chunk identifier for the validation, the number of blocks to be validated in the chunk, etc.
  • The upstream and the downstream volumes then calculate the checksums for a block Bj, and the downstream volume sends the checksum of Bj′.
  • The upstream volume compares the checksums of Bj and Bj′, and if there is a mismatch, the data for block Bj is sent to the downstream array.
  • After the blocks of the chunk have been processed, an indication is sent to the downstream array that the validation of that chunk has been completed.
  • In one embodiment, the checksums for the chunk are then rechecked to validate that the chunk is now correctly replicated.
  • In another embodiment, the downstream array compares the checksums and notifies the upstream array which blocks to re-send.
  • In summary, the upstream and downstream arrays compute checksums, and if the checksums do not match, the upstream array sends data to fix the mismatch.
  • The two stages of verification and fixing can be done sequentially, or they can be parallelized; for example, if the checksum of chunk 0 to 16 MB of bin1 does not match, the system will start fixing the 0 to 16 MB range while performing the checksum on the next chunk, 16 MB to 32 MB, as in the sketch below.
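  • The overlap between fixing one chunk and checksumming the next might look like the following sketch, which uses a thread pool purely to illustrate the pipelining; the helper names are assumptions.

```python
# Sketch: verify chunk i+1 while a mismatched chunk i is still being fixed.
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK = 16 * 1024 * 1024   # 16 MB, the example chunk size from the text

def verify_and_fix(volume: bytes, downstream_checksums, fix_chunk) -> None:
    n_chunks = (len(volume) + CHUNK - 1) // CHUNK
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = None                        # fix currently in flight, if any
        for i in range(n_chunks):
            data = volume[i * CHUNK:(i + 1) * CHUNK]
            # This checksum computation overlaps with the previous fix.
            if hashlib.sha1(data).digest() != downstream_checksums[i]:
                if pending is not None:
                    pending.result()          # wait for the previous fix
                pending = pool.submit(fix_chunk, i, data)
        if pending is not None:
            pending.result()
```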
  • FIG. 5 illustrates the read and write paths within the storage array, according to one embodiment.
  • For the write path, the initiator 106 in the host 104 sends a write request to the storage array 102.
  • The write data is written into NVRAM 108, and an acknowledgment is sent back to the initiator (e.g., the host or application making the request).
  • In one embodiment, storage array 102 supports variable block sizes. Data blocks in the NVRAM 108 are grouped together to form a segment that includes a plurality of data blocks, which may be of different sizes. The segment is compressed and then written to HDD 110.
  • In one embodiment, the segment is also written to the SSD cache 112.
  • In one embodiment, the segment is written to the SSD 112 in parallel with writing the segment to HDD 110.
  • The performance of the write path is driven by the flushing of the NVRAM 108 to disk 110, as illustrated in the sketch below.
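  • A toy sketch of this write path follows: writes are acknowledged once they land in NVRAM, and a later flush coalesces the buffered blocks into a compressed segment written to HDD (and here also to the SSD cache when the data is judged cache-worthy). All structures are illustrative stand-ins for the array's internals.

```python
# Sketch of the acknowledged-from-NVRAM write path described above.
import zlib

def handle_write(nvram: list, offset: int, data: bytes) -> str:
    nvram.append((offset, data))
    return "ACK"              # acknowledged as soon as the data is in NVRAM

def flush_nvram(nvram: list, hdd: dict, ssd_cache: dict, cache_worthy) -> None:
    segment = nvram[:]        # coalesce buffered blocks into one segment
    nvram.clear()
    compressed = zlib.compress(b"".join(data for _, data in segment))
    hdd.setdefault("segments", []).append(compressed)
    for offset, data in segment:      # optionally mirror into the SSD cache
        if cache_worthy(offset, data):
            ssd_cache[offset] = data
```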
  • For the read path, the initiator 106 sends a read request to storage array 102.
  • The requested data may be found in any of the different levels of storage media of the storage array 102.
  • First, a check is made to see if the data is found in RAM (not shown), which is a shadow memory of NVRAM 108; if the data is found in RAM, the data is read from RAM and sent back to the initiator 106.
  • In the shadow RAM memory (e.g., DRAM), the data is also written, so that read operations can be served from the shadow RAM, leaving the NVRAM free for processing write operations.
  • If the data is not found in the shadow RAM, a check is made to determine if the data is in cache, and if so (i.e., a cache hit), the data is read from the flash cache 112 and sent to the initiator 106. If the data is found in neither the NVRAM 108 nor the flash cache 112, then the data is read from the hard drives 110 and sent to the initiator 106. In addition, if the data being served from hard disk 110 is cache-worthy, then the data is also cached in the SSD cache 112. This cascade is summarized in the sketch below.
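  • The read-path cascade can be summarized with the following sketch, where the tiers are plain dict-like stores; these are illustrative assumptions, not the storage OS interfaces.

```python
# Sketch of the read cascade: shadow RAM, then flash cache, then HDD,
# populating the SSD cache with cache-worthy HDD reads.
def read_block(offset, shadow_ram, flash_cache, hdd, cache_worthy):
    if offset in shadow_ram:           # shadow copy of the NVRAM contents
        return shadow_ram[offset]
    if offset in flash_cache:          # SSD cache hit
        return flash_cache[offset]
    data = hdd[offset]                 # fall back to the hard drives
    if cache_worthy(offset, data):     # cache-worthy data is kept in SSD
        flash_cache[offset] = data
    return data
```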
  • FIG. 6 illustrates an example of a configuration where multiple arrays can be made part of a group (i.e., a cluster), in accordance with one embodiment.
  • In this example, a group is configured from storage arrays that have also been associated with pools 1150, 1152.
  • Array 1 and array 2 are associated with pool 1150.
  • Arrays 1 and 2 of pool 1150 are configured with volume 1 1160-1, and array 3 is configured in pool 1152 for managing volume 2 1160-2.
  • Pool 1152, which currently contains volume 2, can be grown by adding additional arrays to increase performance and storage capacity. Further illustrated is the ability to replicate a particular group, such as group A to group B, while maintaining the configuration settings for the pools and volumes associated with group A.
  • A volume can be configured to span multiple storage arrays of a storage pool.
  • Arrays in a volume are members of a storage pool.
  • The default storage pool may be pool 1150, which includes array 1 and array 2.
  • Pools can be used to separate organizationally sensitive data, such as finance and human resources data, to meet security requirements.
  • Pooling can also be done by application type. In some embodiments, it is possible to selectively migrate volumes from one pool to another pool.
  • The migration of volumes can include migration of their associated snapshots, and volumes can support reads and writes during migration processes.
  • Existing pools can add arrays to boost performance and storage capacity, or evacuate arrays (e.g., when storage and/or performance is no longer needed, or when one array is being replaced with another array).
  • In one embodiment, logic in the storage OS allows for the merging of pools of a group. This is useful when combining storage resources that were previously in separate pools, thus increasing performance scaling across multiple arrays.
  • Groups aggregate arrays for management, while storage pools aggregate arrays for capacity and performance.
  • Some operations on storage pools include creating and deleting storage pools, adding arrays to or removing arrays from storage pools, merging storage pools, and the like.
  • A command line may be provided to access a particular pool, which allows management of multiple storage arrays via the command-line interface (CLI).
  • A scale-out setup can be created either by performing a group merge or by adding an array.
  • A group merge is meant to merge two arrays that are already set up and have objects and data stored thereon. The merge process ensures that there are no duplicate objects and that the merge adheres to other rules regarding replication, online volumes, etc.
  • Multi-array groups can also be created by adding an underutilized array to another existing array.
  • Storage pools are rebalanced when storage objects, such as arrays, pools, and volumes, are added, removed, or merged.
  • Rebalancing is a non-disruptive, low-impact process that allows application IO to continue uninterrupted, even to the data sets being migrated. Pool rebalancing gives highest priority to active data IO and performs the rebalancing process with a lower priority.
  • A group may be associated with several arrays, and at least one array is designated as the group leader (GL).
  • The group leader has the configuration files and data that it maintains to manage the group of arrays.
  • A backup group leader (BGL) may be identified among the member storage arrays.
  • The GL is the storage array manager, while the other arrays of the group are member arrays.
  • The GL may be migrated to another member array in case of a failure, or possible failure, at the array operating as the GL.
  • Because the configuration files are replicated at the BGL, the BGL is the one that takes the role of new GL, and another member array is then designated as the BGL, as in the sketch below.
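  • A hedged sketch of this GL/BGL handover follows; the Group model and method names are illustrative, not the storage OS's actual interfaces.

```python
# Sketch of group-leader failover: the BGL (which already holds the
# replicated configuration) becomes the new GL, and a new BGL is chosen.
from dataclasses import dataclass
from typing import List

@dataclass
class Group:
    members: List[str]
    gl: str     # group leader
    bgl: str    # backup group leader, holds replicated configuration

    def on_gl_failure(self) -> None:
        failed = self.gl
        self.gl = self.bgl             # BGL takes over as the new GL
        candidates = [m for m in self.members if m not in (self.gl, failed)]
        # Designate a new BGL if another member is available.
        self.bgl = candidates[0] if candidates else self.gl
```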
  • In one embodiment, volumes are striped across a particular pool of arrays.
  • Group configuration data (the configuration files and data managed by a GL) is stored in a common location and is replicated to the BGL.
  • A single management IP (Internet Protocol) address is used to access the group.
  • Benefits of a centrally managed group include single volume collections across the group, snapshot and replication schedules spanning the group, an added level of security by creating pools, shared access control lists (ACLs), high availability, general array administration that operates at the group level, and CLI command access to the specific group.
  • The storage scale-out architecture allows management of a storage cluster that spreads volumes and their IO requests between multiple arrays.
  • As a result, a host cannot assume that a volume can be accessed through specific paths to one specific array or another.
  • Instead, the disclosed storage scale-out clusters advertise one IP address (e.g., for iSCSI discovery).
  • Volume IO requests are redirected to the appropriate array by leveraging deep integration with the host operating system platforms (e.g., Microsoft, VMware, etc.), or by using iSCSI redirection, as sketched below.
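  • A toy sketch of such redirection follows; the volume-to-array mapping and the redirect message are illustrative assumptions, loosely analogous to an iSCSI login redirect, not the actual implementation.

```python
# Sketch: the group advertises one discovery address, and requests for a
# volume are redirected to the array that currently serves that volume.
VOLUME_OWNER = {"vol1": "192.168.1.11", "vol2": "192.168.1.12"}  # hypothetical

def handle_io_request(volume: str, serving_array: str) -> str:
    owner = VOLUME_OWNER[volume]
    if owner == serving_array:
        return "PROCESS_LOCALLY"
    return f"REDIRECT {owner}:3260"    # point the host at the owning array
```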
  • FIG. 7 illustrates the architecture of a storage array, according to one embodiment.
  • Storage array 102 includes an active controller 1120, a standby controller 1124, one or more HDDs 110, and one or more SSDs 112.
  • The active controller 1120 includes non-volatile RAM (NVRAM) 1118, which stores incoming data as the data arrives at the storage array. After the data is processed (e.g., compressed and organized in segments, i.e., coalesced), the data is transferred from the NVRAM 1118 to HDD 110, or to SSD 112, or to both.
  • The active controller 1120 further includes CPU 1108, general-purpose RAM 1112 (e.g., used by the programs executing on CPU 1108), an input/output module 1110 for communicating with external devices (e.g., USB port, terminal port, connectors, plugs, links, etc.), one or more network interface cards (NICs) 1114 for exchanging data packets through network 1156, one or more power supplies 1116, a temperature sensor (not shown), and a storage connect module 1122 for sending and receiving data to and from the HDD 110 and SSD 112.
  • In one embodiment, standby controller 1124 includes the same components as active controller 1120.
  • Active controller 1120 is configured to execute one or more computer programs stored in RAM 1112.
  • One of the computer programs is the storage operating system (OS) used to perform operating system functions for the active controller device.
  • In one embodiment, one or more expansion shelves 1130 may be coupled to storage array 102 to increase HDD 1132 capacity, or SSD 1134 capacity, or both.
  • Active controller 1120 and standby controller 1124 have their own NVRAMs, but they share HDDs 110 and SSDs 112.
  • The standby controller 1124 receives copies of what gets stored in the NVRAM 1118 of the active controller 1120 and stores the copies in its own NVRAM. If the active controller 1120 fails, standby controller 1124 takes over the management of the storage array 102.
  • Servers, also referred to herein as hosts, send read/write requests (e.g., IO requests) to the storage array 102, and the storage array 102 stores the sent data or sends back the requested data to host 104.
  • Host 104 is a computing device including a CPU 1150, memory (RAM) 1146, permanent storage (HDD) 1142, a NIC card 1152, and an IO module 1154.
  • The host 104 includes one or more applications 1136 executing on CPU 1150, a host operating system 1138, and a computer program, storage array manager 1140, that provides an interface for accessing storage array 102 to applications 1136.
  • Storage array manager 1140 includes an initiator 1144 and a storage OS interface program 1148. When an IO operation is requested by one of the applications 1136, the initiator 1144 establishes a connection with storage array 102 in one of the supported formats (e.g., iSCSI, Fibre Channel, or any other protocol).
  • The storage OS interface 1148 provides console capabilities for managing the storage array 102 by communicating with the active controller 1120 and the storage OS 1106 executing therein.
  • To process IO requests, resources from the storage array 102 are required. Some of these resources may become a bottleneck in the processing of storage requests because the resources are over-utilized, or are slow, or for any other reason.
  • For example, the CPU and the hard drives of the storage array 102 can become over-utilized and become performance bottlenecks.
  • The CPU may become very busy because the CPU is utilized for processing storage IO requests while also performing background tasks, such as garbage collection, snapshots, replication, alert reporting, etc.
  • In one example, the SSD cache, which is a fast-responding system, may press the CPU for cycles, thus causing potential bottlenecks for other requested IOs or for processing background operations.
  • The hard disks may also become a bottleneck because the inherent access speed to data is slow when compared to accessing data from memory (e.g., NVRAM) or SSD.
  • Embodiments presented herein are described with reference to CPU and HDD bottlenecks, but the same principles may be applied to other resources, such as a system with an insufficient amount of NVRAM.
  • SSDs functioning as flash cache should be understood to operate the SSD as a cache for block-level data access, providing service to read operations instead of only reading from HDDs 110.
  • In addition, the storage operating system 1106 is configured with an algorithm that allows for the intelligent writing of certain data (e.g., cache-worthy data) to the SSDs 112, while all data is written directly to the HDDs 110 from NVRAM 1118.
  • The algorithm, in one embodiment, is configured to select cache-worthy data for writing to the SSDs 112, in a manner that provides an increased likelihood that a read operation will access data from the SSDs 112.
  • In some implementations, the algorithm is referred to as a cache-accelerated sequential layout (CASL) architecture, which intelligently leverages unique properties of flash and disk to provide high performance and optimal use of capacity.
  • CASL caches “hot” active data onto SSD in real time—without the need to set complex policies. This way, the storage array can instantly respond to read requests—as much as ten times faster than traditional bolt-on or tiered approaches to flash caching.
  • Reference is made herein to CASL as being an algorithm processed by the storage OS.
  • However, optimizations, modifications, additions, and subtractions to versions of CASL may take place from time to time.
  • As such, reference to CASL should be understood to represent exemplary functionality, and the functionality may change from time to time, and may be modified to include or exclude features referenced herein or incorporated by reference herein.
  • The embodiments described herein are just examples, and many more examples and/or implementations may be defined by combining elements and/or omitting elements described with reference to the claimed features.
  • SSDs 112 may be referred to as flash, flash cache, flash-based memory cache, flash drives, storage flash, or simply cache. Consistent with the use of these terms, in the context of storage array 102, the various implementations of SSD 112 provide block-level caching to storage, as opposed to instruction-level caching. As mentioned above, one functionality enabled by algorithms of the storage OS 1106 is to provide storage of cache-worthy block-level data to the SSDs, so that subsequent read operations are optimized (i.e., reads that are likely to hit the flash cache will be stored to SSDs 112, as a form of storage caching, to accelerate the performance of the storage array 102).
  • The "block-level processing" of SSDs 112 serving as storage cache is different from "instruction-level processing," which is a common function in microprocessor environments.
  • Microprocessor environments utilize main memory and various levels of cache memory (e.g., L1, L2, etc.).
  • Instruction-level caching is differentiated further because instruction-level caching is block-agnostic, meaning that instruction-level caching is not aware of what type of application is producing or requesting the data processed by the microprocessor.
  • Generally speaking, the microprocessor is required to treat all instruction-level caching equally, without discriminating or differentiating processing of different types of applications.
  • In contrast, the storage caching facilitated by SSDs 112 is implemented by algorithms exercised by the storage OS 1106, which can differentiate between the types of blocks being processed for each type of application or applications. That is, block data being written to storage 1130 can be associated with specific applications. For instance, one application may be a mail system application, while another application may be a financial database application, and yet another may be for a website-hosting application. Each application can have different storage access patterns and/or requirements.
  • FIG. 8 is a flow chart of a method for replicating data across storage systems, according to one embodiment.
  • Operation 802 is for transferring a snapshot of a volume from an upstream array to a downstream array, the volume being a single accessible storage area within the upstream array. From operation 802, the method flows to operation 804 for comparing an upstream snapshot checksum (usc) of the snapshot in the upstream array with a downstream snapshot checksum (dsc) of the snapshot in the downstream array.
  • A check is made to determine if the usc is equal to the dsc. If the usc is equal to the dsc, the method flows to operation 818; if the usc is not equal to the dsc, the method flows to operation 808. In operation 808, a plurality of chunks is defined in the snapshot.
  • From operation 808, the method flows to operation 810, where a comparison is made of an upstream chunk checksum (ucc) calculated by the upstream array with a downstream chunk checksum (dcc) calculated by the downstream array.
  • The snapshot operations described herein solve the problem of having to re-send all the data of a volume from one storage system to another when a problem occurs while replicating data.
  • The data of the original snapshot is kept in permanent storage of the upstream array, and the replicated data is kept in permanent storage of the downstream array.
  • The volume is divided into logical groups of data, referred to as chunks, and then each chunk is validated. When all the chunks have been validated, the replicated version of the volume is considered validated.
  • The operations described herein refer to the exchange of information between two separate storage devices, which exchange data and transfer parameters to validate the replicated data. By re-transmitting only the data of the chunks that have been improperly replicated, savings in time and resources are attained, because the data of the chunks that have been correctly replicated does not have to be re-transmitted.
  • One or more embodiments can also be fabricated as computer readable code on a non-transitory computer readable storage medium.
  • The non-transitory computer readable storage medium is any non-transitory data storage device that can store data which can thereafter be read by a computer system. Examples of the non-transitory computer readable storage medium include hard drives, network attached storage (NAS), read-only memory, random-access memory, CD-ROMs, CD-Rs, CD-RWs, magnetic tapes, and other optical and non-optical data storage devices.
  • The non-transitory computer readable storage medium can include computer readable storage media distributed over a network-coupled computer system so that the computer readable code is stored and executed in a distributed fashion.

Abstract

Methods, systems, and computer programs are presented for replicating data across storage systems. One method includes an operation for transferring a snapshot of a volume from an upstream array to a downstream array. The method further includes an operation for comparing an upstream snapshot checksum of the snapshot in the upstream array with a downstream snapshot checksum of the snapshot in the downstream array. When the upstream snapshot checksum is different from the downstream snapshot checksum, a plurality of chunks in the snapshot are defined. Further, for each chunk in the snapshot, a comparison is made of an upstream chunk checksum calculated by the upstream array with a downstream chunk checksum calculated by the downstream array. When the upstream chunk checksum is different from the downstream chunk checksum then the data of the chunk is sent from the upstream array to the downstream array.

Description

    CLAIM OF PRIORITY
  • This application claims priority from U.S. Provisional Patent Application No. 62/084,395, filed Nov. 25, 2014, entitled “Content-Based Replication of Data Between Storage Units,” and from U.S. Provisional Patent Application No. 62/084,403, filed Nov. 25, 2014, entitled “Content-Based Replication of Data in Scale Out System.” These provisional applications are herein incorporated by reference.
  • CROSS REFERENCE TO RELATED APPLICATIONS
  • This application is related by subject matter to U.S. patent application Ser. No. ______ (Attorney Docket No. NIMSP112) filed on the same day as the instant application and entitled “Content-Based Replication of Data in Scale Out System”, which is incorporated herein by reference.
  • BACKGROUND
  • 1. Field of the Invention
  • The present embodiments relate to methods, systems, and programs for replicating data in a networked storage system.
  • 2. Description of the Related Art
  • Network storage, also referred to as network storage systems or storage systems, is computer data storage connected to a computer network providing data access to heterogeneous clients. Typically, network storage systems process a large number of Input/Output (IO) requests, and high availability, speed, and reliability are desirable characteristics of network storage.
  • Sometimes data is copied from one system to another, such as when an organization upgrades to a new data storage device, when backing up data to a different location, or when backing up data for the purpose of disaster recovery. The data needs to be migrated or replicated to the new device from the old device.
  • However, when transferring large volumes of data, there could be some glitches during the transfer/replication process, and some of the data may be improperly transferred. It may be very expensive, resource-wise, to retransfer all the data, because it may take a large amount of processor and network resources, which may impact the ongoing operation of the data service. Also, when data is being replicated to a different storage system, there could be a previous snapshot of the data in both systems. If a change is detected between snapshots being replicated, it may be very expensive to transmit large amounts of data over the network if only a small portion of the data has changed. Further yet, if a common base snapshot is lost, resending all the data may be very expensive.
  • What is needed is a network storage device, software, and systems that provide verification of the correct transfer of large amounts of data from one system to another, as well as ways to correct errors found during the replication process.
  • It is in this context that embodiments arise.
  • SUMMARY
  • The present embodiments relate to fixing problems when data is replicated from a first system to a second system. It should be appreciated that the present embodiments can be implemented in numerous ways, such as a method, an apparatus, a system, a device, or a computer program on a computer readable medium. Several embodiments are described below.
  • One aspect includes a method for replicating data across storage systems. The method includes an operation for transferring a snapshot of a volume from an upstream array to a downstream array, the volume being a single accessible storage area within the upstream array. The method further includes comparing an upstream snapshot checksum of the snapshot in the upstream array with a downstream snapshot checksum of the snapshot in the downstream array. When the upstream snapshot checksum is different from the downstream snapshot checksum, a plurality of chunks is defined in the snapshot. For each chunk in the snapshot, an upstream chunk checksum calculated by the upstream array is compared with a downstream chunk checksum calculated by the downstream array. Further, the method includes an operation for sending, from the upstream array to the downstream array, data of the chunk when the upstream chunk checksum is different from the downstream chunk checksum.
  • One general aspect includes a method for replicating data across storage systems. The method includes an operation for transferring a snapshot of a volume from an upstream array to a downstream array, the volume being a single accessible storage area within the upstream array. The method also includes an operation for comparing an upstream snapshot checksum of the snapshot in the upstream array with a downstream snapshot checksum of the snapshot in the downstream array. When the upstream snapshot checksum is different from the downstream snapshot checksum, a plurality of chunks is defined in the snapshot. For each chunk in the snapshot, an upstream chunk checksum calculated by the upstream array is compared with a downstream chunk checksum calculated by the downstream array. When the upstream chunk checksum is different from the downstream chunk checksum, a plurality of blocks is defined in the chunk. Further, for each block in the chunk, an upstream block checksum calculated by the upstream array is compared with a downstream block checksum calculated by the downstream array. When the upstream block checksum is different from the downstream block checksum, data of the block is sent from the upstream array to the downstream array.
  • One aspect includes a non-transitory computer-readable storage medium storing a computer program for replicating data across storage systems. The computer-readable storage medium includes program instructions for transferring a snapshot of a volume from an upstream array to a downstream array, the volume being a single accessible storage area within the upstream array, and program instructions for comparing an upstream snapshot checksum of the snapshot in the upstream array with a downstream snapshot checksum of the snapshot in the downstream array. When the upstream snapshot checksum is different from the downstream snapshot checksum, a plurality of chunks is defined in the snapshot. For each chunk in the snapshot, an upstream chunk checksum calculated by the upstream array is compared with a downstream chunk checksum calculated by the downstream array. The storage medium further includes program instructions for sending, from the upstream array to the downstream array, data of the chunk when the upstream chunk checksum is different from the downstream chunk checksum.
  • Other aspects will become apparent from the following detailed description, taken in conjunction with the accompanying drawings.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The embodiments may best be understood by reference to the following description taken in conjunction with the accompanying drawings.
  • FIG. 1 illustrates the replication of the snapshots from one system to another, according to one embodiment.
  • FIG. 2 illustrates the partition of a volume into a plurality of chunks, where each chunk may include a plurality of blocks, according to one embodiment.
  • FIG. 3 illustrates the content-based replication (CBR) method for validating data and correcting erroneous data between two volumes, according to one embodiment.
  • FIG. 4 illustrates the CBR process which includes checking block checksums, according to one embodiment.
  • FIG. 5 illustrates the read and write paths within the storage array, according to one embodiment.
  • FIG. 6 illustrates an example of a configuration where multiple arrays can be made part of a group (i.e., a cluster), in accordance with one embodiment.
  • FIG. 7 illustrates the architecture of a storage array, according to one embodiment.
  • FIG. 8 is a flow chart of a method for replicating data across storage systems, according to one embodiment.
  • DETAILED DESCRIPTION
  • The following embodiments describe methods, devices, systems, and computer programs for replicating data across storage systems. It will be apparent that the present embodiments may be practiced without some or all of these specific details. In other instances, well-known process operations have not been described in detail in order not to unnecessarily obscure the present embodiments.
  • In some implementations, a Snapshot Delta Replication (SDR) method is used to replicate snapshots of data volumes in a network storage device. However, something could go wrong during the replication, so a check is made to determine whether the replicated snapshot is correct. If the replication is not completely correct, the data would have to be resent, which may be very costly in resources. In order to avoid having to replicate all the data again, a Content-Based Replication (CBR) method is used to minimize the amount of data needed to fix the replicated snapshot.
  • With the CBR method, volume checksums are made at the upstream system (the system being replicated) and the downstream system (the system where the replicated data is received). If the checksums do not match, the volume is divided into large pieces of data, referred to herein as chunks (e.g., 16 MB, although other values are also possible). Then checksums are performed for each chunk, at the upstream system and the downstream system. If the corresponding checksums for the same chunk do not match at the upstream and the downstream systems, then the upstream system sends the chunk of data to the downstream system.
  • In one embodiment, another level of iteration is used to further divide the chunks into smaller pieces and perform checksums on the smaller pieces. For example, checksums of the blocks within a chunk can be compared, and then the blocks that have mismatched checksums are transmitted over the network.
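  • As a concrete illustration, the following minimal sketch models this two-level comparison over in-memory byte buffers. The use of SHA-1, the 16 MB/4 KB sizes, and all function names are assumptions made for illustration; they are not mandated by the method.

```python
import hashlib

CHUNK_SIZE = 16 * 1024 * 1024   # illustrative 16 MB chunk
BLOCK_SIZE = 4 * 1024           # illustrative 4 KB block

def checksum(data) -> bytes:
    return hashlib.sha1(data).digest()   # SHA-1 chosen for illustration

def cbr_sync(upstream: bytes, downstream: bytearray) -> int:
    """Fix the downstream copy in place; returns the number of bytes resent."""
    resent = 0
    if checksum(upstream) == checksum(downstream):
        return resent   # snapshot checksums match: replica is validated
    for c in range(0, len(upstream), CHUNK_SIZE):
        if checksum(upstream[c:c + CHUNK_SIZE]) == checksum(downstream[c:c + CHUNK_SIZE]):
            continue    # chunk validated; move to the next chunk
        # Second level: compare per-block checksums inside the bad chunk only.
        for b in range(c, min(c + CHUNK_SIZE, len(upstream)), BLOCK_SIZE):
            blk = upstream[b:b + BLOCK_SIZE]
            if checksum(blk) != checksum(downstream[b:b + BLOCK_SIZE]):
                downstream[b:b + len(blk)] = blk   # resend only this block
                resent += len(blk)
    return resent
```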
  • In another embodiment, an automated program determines when CBR is to be performed, based on system parameters (defined by the system designers), on user configuration (e.g., once a week), or on heuristics that determine when the risk of an incorrect replication is high (e.g., after installing a new release). For example, in one embodiment, CBR could be more efficient than SDR for replication seeding when a common base snapshot is not found between the upstream and the downstream volumes, because the downstream volume may already have blocks of the volume due to an earlier SDR.
  • FIG. 1 illustrates the replication of the snapshots from one system to another, according to one embodiment. In one embodiment, a volume is a single accessible storage area, reserved for one application or one host, for a group of users of an organization, or to segment/separate types of data for security or accessibility. In one embodiment, the data of the volume is divided into blocks, and the data from the volume is accessed by identifying a block (e.g., identifying an offset associated with the block being retrieved). That is, data from the volume is accessed by the host in units of a size of the block, and the block is the smallest amount of data that can be requested from the volume. The networked storage device where the data is stored is also referred to herein as a storage array or a storage system.
  • In one embodiment, a first system creates snapshots of a volume over time (e.g., S1, S2, S3, etc.). The first system replicates one or more of the snapshots to a second volume, for example to provide a backup of the data in a different location or in a different storage array.
  • The storage array that holds the source data to be copied is referred to as the upstream storage array, the upstream system, or the base storage array, and the storage array that receives a copy of the data is referred to as the downstream storage array or the downstream system. When SDR is replicating a snapshot to create a replicated snapshot in another storage array, SDR computes which blocks need to be transferred by using a base snapshot that is already present on the downstream as well as on the upstream. This common snapshot is also referred to as the common ancestor snapshot. After SDR is complete, the replicated snapshot is present on both the upstream and the downstream storage arrays.
  • In one embodiment, replication means copying all the data from the upstream volume to the downstream volume. In some embodiments, if the common ancestor snapshot of the volume has already been replicated, the replication of a later snapshot includes copying only the data that has changed, which is also referred to herein as the delta data or the difference between the two snapshots. It is noted that not all the snapshots in the upstream volume have to be replicated to the downstream volume.
  • For example, in the exemplary embodiment of FIG. 1, the upstream volume has over time generated five snapshots, S1, S2, S3, S4, and S5. The replication policy specifies that every other snapshot in the upstream volume is to be copied to the downstream volume. Therefore, the downstream volume has replicated snapshots S1′, S3′, and S5′. As used herein, the snapshots with the apostrophe mark refer to the data in the downstream system.
  • Replicating snapshot S1 requires copying all the data from S1 to S1′ because there are no previous snapshots that have been replicated. However, replicating snapshot S3 requires only copying the difference between S3 and S1 [S3-S1]. In one embodiment, this method for replicating snapshots from the upstream to the downstream volume by copying the difference between two snapshots in time is referred to herein as snapshot delta replication (SDR).
  • Sometimes, SDR is an efficient process, but other times SDR is very inefficient. For example, in one scenario, two blocks, B1 and B2, are written to the volume after snapshot S1 is taken but before snapshot S3 is taken. If SDR is performed for snapshot S3 using snapshot S1 as the common snapshot, only blocks B1 and B2 will be replicated (i.e., transmitted to the downstream system), and SDR is efficient in this case. However, if, for some reason, snapshot S1 is not available in the downstream system, then SDR would be inefficient, as the complete volume would have to be transmitted to the downstream system.
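  • A toy version of this delta computation may help; representing a snapshot as a mapping from block offsets to block contents is an assumption made purely for illustration.

```python
def sdr_delta(base: dict, snap: dict) -> dict:
    """Blocks to transfer when `base` is already present downstream."""
    return {off: data for off, data in snap.items() if base.get(off) != data}

s1 = {0: b"A", 4096: b"B", 8192: b"C"}
s3 = {**s1, 4096: b"B1", 12288: b"B2"}   # B1 and B2 written after S1

print(sorted(sdr_delta(s1, s3)))   # [4096, 12288]: only B1 and B2 travel
print(sorted(sdr_delta({}, s3)))   # all four blocks travel when S1 is unavailable
```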
  • FIG. 2 illustrates the partition of a volume into a plurality of chunks, where each chunk includes a plurality of blocks, according to one embodiment. Sometimes, a downstream snapshot is not exactly the same as the upstream snapshot, e.g., because of a failure during the communication of the data from the upstream to the downstream volume.
  • In one embodiment, the detection that the snapshots are not exactly equal may be performed by computing checksums of the upstream and downstream volumes. If the checksums do not match, then there is a problem with the copied data. An obvious but expensive solution is to recopy all the data until the checksums match. However, re-copying large amounts of data may strain the data storage system and impact performance, which means that the transfer of large amounts of data should be avoided during normal operating hours. Therefore, resending all the data is not a desirable solution.
  • In one embodiment, the volume is logically divided into large groups of data, referred to herein as chunks. In one embodiment, the size of a block is 4 KBytes, but other values are also possible, such as in the range from 256 bytes to 50 Kbytes or more.
  • A chunk (e.g., 16 MB) is usually much larger than a block, so the chunk includes a plurality of blocks. In one embodiment, the chunk is not addressable for accessing data from the volume, and the chunk is only utilized for correcting the replication of snapshots, as described in more detail below. Other embodiments may include other sizes for chunks, such as in the range from 1 megabyte to 100 megabytes, or in the range from 100 megabytes to one or several gigabytes. In one embodiment, the size of the chunk is 100 times the size of the block, but other multipliers are also possible, such as 50 to 5000; that is, the chunk may be 50 to 5000 times bigger than the block.
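  • As a worked calculation (not a required configuration), the example sizes above give a multiplier within the stated range:

```python
block_size = 4 * 1024              # 4 KB block, the addressable unit
chunk_size = 16 * 1024 * 1024      # 16 MB chunk, used only for CBR
print(chunk_size // block_size)    # 4096 blocks per chunk, inside the 50-5000 range
```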
  • FIG. 2 shows a volume that has been divided into chunks C1, C2, C3, etc. Further, each chunk contains blocks, such as chunk C1, which includes blocks B1, B2, B3, etc. The checksums performed can be of any type. In one embodiment, a cryptographically strong checksum is utilized. For example, SHA-1, which requires a data read and a checksum computation, provides a 20-byte checksum (about 5 GB of checksum data per TB if a checksum is transmitted for every 4 KB uncompressed block). In another embodiment, a 16-byte digest is utilized. In another embodiment, the checksum is SHA-2.
  • Another possible checksum is a Fletcher checksum. Further, several types of checksums may be utilized depending on the size of the data to be checksummed. For example, a Fletcher checksum may be utilized for snapshots, and an SHA-1 checksum may be utilized for chunks or blocks. In one embodiment, the checksum type is negotiated between the upstream and the downstream storage arrays during the CBR initialization period.
  • Further, the checksums may be performed over compressed or uncompressed data. In one embodiment, the checksum of uncompressed data is utilized, but this requires decompression, which causes higher resource utilization. In another embodiment, the checksum is performed over compressed data; however, this option may stop working when the compression of blocks starts differing between the upstream and the downstream (e.g., due to strong background recompression taking place in the downstream system). In yet another embodiment, uncompressed checksums are stored for certain data ranges, and a larger checksum is formed by combining the data from the uncompressed checksums. This way, there is no need to decompress the blocks to obtain the checksums of the chunks.
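  • The following sketch shows one possible realization of that last embodiment, assuming the array keeps an uncompressed SHA-1 digest per block: the chunk checksum is derived by hashing the ordered block digests, so compressed blocks never need to be decompressed. The composition scheme itself is an assumption for illustration.

```python
import hashlib

def block_digest(uncompressed_block: bytes) -> bytes:
    return hashlib.sha1(uncompressed_block).digest()   # stored at write time

def chunk_digest(block_digests) -> bytes:
    h = hashlib.sha1()
    for d in block_digests:   # block order matters for a stable result
        h.update(d)
    return h.digest()

blocks = [bytes([i]) * 4096 for i in range(4)]
print(chunk_digest([block_digest(b) for b in blocks]).hex())
```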
  • FIG. 3 illustrates the content-based replication (CBR) method for validating data and correcting erroneous data transfers between two volumes, according to one embodiment. In one embodiment, a snapshot S1 is copied from an upstream volume to a snapshot S1′ in the downstream volume. For example, the snapshots can be replicated by using the SDR method described above. In one embodiment, the network storage system may limit the CBR process to one volume at a time, in order to limit the stress on the system. In another embodiment, one or more volumes may skip the CBR process if the volumes have been created after a certain time (e.g., time when the storage array was upgraded past a known release with a potential replication problem).
  • At start time, the upstream and the downstream arrays may exchange CBR-related information, such as checksum type, checksum size, and how much data is covered by each checksum (e.g., size of the chunk, how many blocks in each chunk).
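  • One possible shape for this exchanged information is sketched below; the field names and default values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class CbrParams:
    checksum_type: str = "sha1"           # negotiated checksum algorithm
    checksum_size: int = 20               # bytes per checksum
    chunk_size: int = 16 * 1024 * 1024    # data covered by each chunk checksum
    blocks_per_chunk: int = 4096          # how many blocks in each chunk
    cursor: int = 0                       # chunk at which to (re)start CBR

print(CbrParams())
```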
  • The validation of the snapshots can be initiated in different ways. For example, an administrator may request a storage array to check the validity of a snapshot in a downstream volume, or an automated validating process may be initiated by the storage array. For example, a validating process may be initiated periodically, or may be initiated after the data center updates the software of one or more storage arrays, or as additional hardware (e.g., another storage array) is added to the network data system.
  • In one embodiment, the upstream volume computes the checksum of S1, i.e., the checksum of the complete snapshot S1. The upstream volume then sends a request to the downstream volume to provide the checksum of the downstream snapshot S1′. In another embodiment, the downstream volume initiates the process for comparing the checksums. In general, some of the methods described herein include operations performed by the upstream volume (e.g., initiating the validation procedure, comparing checksums, etc.), but the same principles may be applied when the downstream volume performs these operations (e.g., initiating the validation of the replicated data).
  • The downstream volume then calculates the S1′ checksum (or retrieves the checksum from memory if the checksum is already available) and sends the checksum to the upstream volume. The upstream volume compares the two checksums of S1 and S1′, and if the checksums match, the snapshot is assumed to be correct (e.g., validated). However, if the checksums do not match, then the content-based replication (CBR) process is started.
  • A principle of CBR is to calculate the checksums of large amounts of data (e.g., for each chunk) instead of comparing the checksums for each of the individual blocks in the volume. In one embodiment, when a mismatch is found, the system administrator gets an alert (on the downstream array, or on the upstream array, or on both). The alert indicates that the replicated snapshot is compromised (and maybe older snapshots too). After executing CBR, the system administrator will get another alert that the mismatch has been fixed in the most recent replicated snapshot.
  • The upstream volume sends a request to the downstream volume to start the CBR process, and sends information related to the process, such as the checksum type to be performed, the chunk size, and a cursor used to indicate at what chunk to start the CBR process. The cursor is useful in case the CBR process is suspended for any reason, such as a system suffering downtime or a network-related problem (e.g., a network disconnect). This way, when the upstream and the downstream volumes are ready to continue with the suspended CBR process, the process does not have to be restarted from the beginning, but from the place associated with the value of the cursor. In one embodiment, the cursor may be kept in the upstream volume, in the downstream volume, or in both places. In one embodiment, the cursor is an identifier for a chunk in the volume, wherein all the chunks that precede the identified chunk are considered to have been already validated.
  • For each chunk, the upstream and the downstream systems calculate the respective checksums Ci and Ci′. Then the downstream array sends the Ci′ checksum to the upstream array, and the upstream array compares checksums Ci and Ci′. If the checksums match, the process continues with the next chunk, until all the chunks are validated. However, if the checksums Ci and Ci′ do not match, the upstream storage array sends the data for chunk Ci to the downstream array. When the last chunk has been validated, the upstream storage array sends a CBR complete notification to the downstream array.
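  • A minimal sketch of this per-chunk loop, including cursor-based resumption, follows; the callables standing in for the network exchange (get_downstream_checksum, send_chunk) are hypothetical stand-ins for the replication protocol.

```python
import hashlib

def validate_chunks(up_chunks, get_downstream_checksum, send_chunk, cursor=0):
    """Validate chunks starting at `cursor`; returns the cursor to persist."""
    for i in range(cursor, len(up_chunks)):
        ci = hashlib.sha1(up_chunks[i]).digest()   # upstream checksum Ci
        if ci != get_downstream_checksum(i):       # downstream checksum Ci'
            send_chunk(i, up_chunks[i])            # resend only this chunk
        cursor = i + 1   # persisting this value lets a suspended run resume here
    return cursor

up = [b"a" * 8, b"b" * 8, b"c" * 8]
down = [b"a" * 8, b"X" * 8, b"c" * 8]
resent = {}
validate_chunks(up,
                lambda i: hashlib.sha1(down[i]).digest(),
                lambda i, data: resent.update({i: data}))
print(sorted(resent))   # [1]: only the mismatched chunk was resent
```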
  • In some embodiments, the upstream array and the downstream array coordinate the validation process by checksumming and comparing a plurality of chunks simultaneously (e.g., in parallel); that is, the arrays do not have to wait until the validation of one chunk is completed to begin validating the next chunk, and several chunk-validation processes may be performed in parallel.
  • It is noted that SDR and CBR may coexist in the same storage array, or even in the same volume, because at different times and under different circumstances one method may be preferred over others.
  • In one embodiment, a per-volume state is maintained, in both the upstream and the downstream array, for managing and tracking content based replication of each volume. The downstream volume's state is consulted during the replication protocol phase that occurs prior to the SDR data transfer phase. If the upstream or the downstream volume state indicates the need for content based replication to occur, the upstream and/or the downstream array coordinate with the storage control system to initiate CBR.
  • Once the data transfer phase has completed, the upstream array sends an indication to the downstream array during the snapshot replication phase as to whether or not content based replication was carried out. This allows the downstream array to update the volume state, which includes clearing a flag that indicates a content based replication is needed, and updating a state to indicate the snapshot ID at which content based replication occurred. Also, the downstream array will issue an alert if the volume record indicates that errors took place (which need to be fixed at this point).
  • FIG. 4 illustrates the CBR process which includes checking block checksums, according to one embodiment. As discussed above with reference to FIG. 3, the purpose of CBR is to perform checksums on large groups of data and, if the checksums fail, to send only the data that was incorrectly replicated. The volume has been divided into chunks, as shown in FIG. 2, but the process may be performed iteratively to further divide each chunk into sub-chunks, which are smaller than the chunks.
  • If the checksum for a chunk fails, then checksums for the sub-chunks are calculated and compared, and the data for the sub-chunks that fail the validation is sent from the upstream array to the downstream array, instead of having to send the whole chunk. In one embodiment, the size of the chunk is between 5 times and 1000 times the size of the sub-chunk, but other multipliers are also possible.
  • FIG. 4 is an example of a two-level CBR process, where the first level of validation is performed for the chunks, as described above with reference to FIG. 3, and the second level of validation is performed at the block level. Although blocks are being utilized as sub-chunks, any other size of sub-chunk may also be utilized.
  • The operations described in FIG. 4 are initially the same as in the method of FIG. 3, but the method diverges once the checksum for a chunk fails. In this case, the second-level CBR is initiated at the block level. The upstream volume sends a command to the downstream volume indicating that there has been a chunk checksum mismatch and that block-level checksumming is initiated. The command includes information regarding the second-level validation, such as a block cursor (similar to the chunk cursor), the chunk identifier for the validation, the number of blocks to be validated in the chunk, etc.
  • The upstream and the downstream volumes then calculate the checksum for a block Bj, and the downstream volume sends the checksum of Bj′. The upstream volume compares the checksums of Bj and Bj′, and if there is a mismatch, the data for block Bj is sent to the downstream array. In one embodiment, once all the blocks in the chunk are validated, an indication is sent to the downstream array that the validation of that chunk has been completed. In one embodiment, the checksums for the chunk are rechecked to validate that the chunk is now correctly replicated. In one embodiment, the downstream array compares the checksums and notifies the upstream array which blocks to re-send.
  • In CBR, the upstream and downstream arrays compute checksums, and if the checksums do not match, the upstream array sends data to fix the mismatch. The two stages of verification and fixing can be performed sequentially, or they can be parallelized; for example, if the checksum of chunk 0-16 MB of bin1 does not match, the system will start fixing 0-16 MB while performing the checksum on the next chunk, 16-32 MB.
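  • A small sketch of this overlap, assuming a thread pool, is shown below: chunk pairs are verified in parallel while any mismatched chunk is repaired; all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import hashlib

def mismatch(pair) -> bool:
    up, down = pair
    return hashlib.sha1(up).digest() != hashlib.sha1(down).digest()

def fix(index, up_chunk):
    print(f"resending chunk {index} ({len(up_chunk)} bytes)")

pairs = [(b"a" * 8, b"a" * 8), (b"b" * 8, b"X" * 8), (b"c" * 8, b"c" * 8)]
with ThreadPoolExecutor() as pool:
    for i, bad in enumerate(pool.map(mismatch, pairs)):   # verify in parallel
        if bad:
            pool.submit(fix, i, pairs[i][0])   # repair while verification continues
```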
  • FIG. 5 illustrates the read and write paths within the storage array, according to one embodiment. Regarding the write path, the initiator 106 in the host 104 sends the write request to the storage array 102. As the write data comes in, the write data is written into NVRAM 108, and an acknowledgment is sent back to the initiator (e.g., the host or application making the request). In one embodiment, storage array 102 supports variable block sizes. Data blocks in the NVRAM 108 are grouped together to form a segment that includes a plurality of data blocks, which may be of different sizes. The segment is compressed and then written to HDD 110. In addition, if the segment is considered to be cache-worthy (i.e., important enough to be cached or likely to be accessed again) the segment is also written to the SSD cache 112. In one embodiment, the segment is written to the SSD 112 in parallel while writing the segment to HDD 110.
  • In one embodiment, the performance of the write path is driven by the flushing of NVRAM 108 to disk 110. With regard to the read path, the initiator 106 sends a read request to storage array 102. The requested data may be found in any of the different levels of storage mediums of the storage array 102. First, a check is made to see if the data is found in RAM (not shown), which is a shadow memory of NVRAM 108, and if the data is found in RAM then the data is read from RAM and sent back to the initiator 106. In one embodiment, the shadow RAM memory (e.g., DRAM) keeps a copy of the data in the NVRAM, and the read operations are served from the shadow RAM memory. When data is written to the NVRAM, the data is also written to the shadow RAM so the read operations can be served from the shadow RAM, leaving the NVRAM free for processing write operations.
  • If the data is not found in the shadow RAM, a check is made to determine if the data is in cache, and if so (i.e., a cache hit), the data is read from the flash cache 112 and sent to the initiator 106. If the data is found neither in the NVRAM 108 nor in the flash cache 112, then the data is read from the hard drives 110 and sent to the initiator 106. In addition, if the data being served from hard disk 110 is cache worthy, then the data is also cached in the SSD cache 112.
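  • The tiered lookup order of this read path can be sketched as follows, with each tier modeled as a simple mapping keyed by block offset (an illustrative simplification of the actual data structures).

```python
def read_block(offset, shadow_ram, flash_cache, hdd, cache_worthy):
    if offset in shadow_ram:         # 1. shadow copy of the NVRAM contents
        return shadow_ram[offset]
    if offset in flash_cache:        # 2. SSD cache hit
        return flash_cache[offset]
    data = hdd[offset]               # 3. slowest tier: the hard drives
    if cache_worthy(offset):
        flash_cache[offset] = data   # cache-worthy data is cached on the way out
    return data

cache = {}
print(read_block(0, {}, cache, {0: b"payload"}, lambda off: True))
print(0 in cache)   # True: a later read would be served from the flash cache
```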
  • FIG. 6 illustrates an example of a configuration where multiple arrays can be made part of a group (i.e., a cluster), in accordance with one embodiment. In this example, a group is configured from storage arrays that have also been associated with pools 1150, 1152. For example, array 1 and array 2 are associated with pool 1150. Arrays 1 and 2 of pool 1150 are configured with volume 1 1160-1, and array 3 is configured in pool 1152 for managing volume 2 1160-2. Pool 1152, which currently contains volume 2, can be grown by adding additional arrays to increase performance and storage capacity. Further illustrated is the ability to replicate a particular group, such as group A to group B, while maintaining the configuration settings for the pools and volumes associated with group A.
  • As shown, a volume can be configured to span multiple storage arrays of a storage pool. In this configuration, the arrays holding a volume are members of a storage pool. In one example, if an array is added to a group and the array is not assigned to a particular pool, the array will be made a member of a default storage pool. For instance, in FIG. 6, the default storage pool may be pool 1150, which includes array 1 and array 2. In one embodiment, pools can be used to separate organizationally sensitive data, such as finance and human resources, to meet security requirements. In addition to pooling by organization, pooling can also be done by application type. In some embodiments, it is possible to selectively migrate volumes from one pool to another pool. The migration of pools can include migration of their associated snapshots, and volumes can support reads/writes during migration processes. In yet another feature, existing pools can add arrays to boost performance and storage capacity, or evacuate arrays from existing pools (e.g., when storage and/or performance is no longer needed or when one array is being replaced with another array). Still further, logic in the storage OS allows for merging of pools of a group. This is useful when combining storage resources that were previously in separate pools, thus increasing performance scaling across multiple arrays.
  • The difference between groups and storage pools is that groups aggregate arrays for management, while storage pools aggregate arrays for capacity and performance. As noted above, some operations on storage pools may include creating and deleting storage pools, adding and removing arrays to or from storage pools, merging storage pools, and the like. In one example, a command line may be provided to access a particular pool, which allows management of multiple storage arrays via the command line interface (CLI). In one embodiment, a scale-out setup can be created by either performing a group merge or adding an array. A group merge is meant to merge two arrays that are already set up and have objects and data stored thereon. The merge process ensures that there are no duplicate objects and that the merge adheres to other rules around replication, online volumes, etc. Multi-array groups can also be created by adding an underutilized array to another existing array.
  • In one embodiment, storage pools are rebalanced when storage objects such as arrays, pools, and volumes are added, removed, or merged. Rebalancing is a non-disruptive, low-impact process that allows application IO to continue uninterrupted, even to the data sets being migrated. Pool rebalancing gives highest priority to active data IO and performs the rebalancing process at a lower priority.
  • As noted, a group may be associated with several arrays, and at least one array is designated as the group leader (GL). The group leader has the configuration files and data that it maintains to manage the group of arrays. In one embodiment, a backup group leader (BGL) may be identified as one of the member storage arrays. Thus, the GL is the storage array manager, while the other arrays of the group are member arrays. In some cases, the GL may be migrated to another member array in case of a failure, or possible failure, at the array operating as the GL. Because the configuration files are replicated at the BGL, the BGL is the one that takes the role of the new GL, and another member array is designated as the BGL. In one embodiment, volumes are striped across a particular pool of arrays. As noted, group configuration data (configuration files and data managed by a GL) is stored in a common location and is replicated to the BGL.
  • In one embodiment, only a single management IP (Internet Protocol) address is used to access the group. Benefits of a centrally managed group include single volume collections across the group, snapshot and replication schedules spanning the group, an added level of security by creating pools, shared access control lists (ACLs), high availability, general array administration that operates at the group level, and CLI command access to the specific group.
  • In one implementation, the storage scale-out architecture allows management of a storage cluster that spreads volumes and their IO requests between multiple arrays. A host cannot assume that a volume can be accessed through specific paths to one specific array or another. Instead of advertising all of the iSCSI interfaces on the array, the disclosed storage scale-out clusters advertise one IP address (e.g., iSCSI discovery). Volume IO requests are redirected to the appropriate array by leveraging deep integration with the host operating system platforms (e.g., Microsoft, VMware, etc.), or using iSCSI redirection.
  • FIG. 7 illustrates the architecture of a storage array, according to one embodiment. In one embodiment, storage array 102 includes an active controller 1120, a standby controller 1124, one or more HDDs 110, and one or more SSDs 112. In one embodiment, the controller 1120 includes non-volatile RAM (NVRAM) 1118, which is for storing the incoming data as the data arrives to the storage array. After the data is processed (e.g., compressed and organized in segments (e.g., coalesced)), the data is transferred from the NVRAM 1118 to HDD 110, or to SSD 112, or to both.
  • In addition, the active controller 1120 further includes CPU 1108, general-purpose RAM 1112 (e.g., used by the programs executing in CPU 1108), input/output module 1110 for communicating with external devices (e.g., USB port, terminal port, connectors, plugs, links, etc.), one or more network interface cards (NICs) 1114 for exchanging data packages through network 1156, one or more power supplies 1116, a temperature sensor (not shown), and a storage connect module 1122 for sending and receiving data to and from the HDD 110 and SSD 112. In one embodiment, standby controller 1124 includes the same components as active controller 1120.
  • Active controller 1120 is configured to execute one or more computer programs stored in RAM 1112. One of the computer programs is the storage operating system (OS) used to perform operating system functions for the active controller device. In some implementations, one or more expansion shelves 1130 may be coupled to storage array 102 to increase HDD 1132 capacity, or SSD 1134 capacity, or both.
  • Active controller 1120 and standby controller 1124 have their own NVRAMs, but they share HDDs 110 and SSDs 112. The standby controller 1124 receives copies of what gets stored in the NVRAM 1118 of the active controller 1120 and stores the copies in its own NVRAM. If the active controller 1120 fails, standby controller 1124 takes over the management of the storage array 102. When servers, also referred to herein as hosts, connect to the storage array 102, read/write requests (e.g., IO requests) are sent over network 1156, and the storage array 102 stores the sent data or sends back the requested data to host 104.
  • Host 104 is a computing device including a CPU 1150, memory (RAM) 1146, permanent storage (HDD) 1142, a NIC card 1152, and an IO module 1154. The host 104 includes one or more applications 1136 executing on CPU 1150, a host operating system 1138, and a computer program storage array manager 1140 that provides an interface for accessing storage array 102 to applications 1136. Storage array manager 1140 includes an initiator 1144 and a storage OS interface program 1148. When an IO operation is requested by one of the applications 1136, the initiator 1144 establishes a connection with storage array 102 in one of the supported formats (e.g., iSCSI, Fibre Channel, or any other protocol). The storage OS interface 1148 provides console capabilities for managing the storage array 102 by communicating with the active controller 1120 and the storage OS 1106 executing therein.
  • To process the IO requests, resources from the storage array 102 are required. Some of these resources may be a bottleneck in the processing of storage requests because the resources are over utilized, or are slow, or for any other reason. In general, the CPU and the hard drives of the storage array 102 can become over utilized and become performance bottlenecks. For example, the CPU may become very busy because the CPU is utilized for processing storage IO requests while also performing background tasks, such as garbage collection, snapshots, replication, alert reporting, etc. In one example, if there are many cache hits (i.e., the SSD contains the requested data during IO requests), the SSD cache, which is a fast responding system, may press the CPU for cycles, thus causing potential bottlenecks for other requested IOs or for processing background operations.
  • The hard disks may also become a bottleneck because the inherent access speed to data is slow when compared to accessing data from memory (e.g., NVRAM) or SSD. Embodiments presented herein are described with reference to CPU and HDD bottlenecks, but the same principles may be applied to other resources, such as a system with insufficient amount of NVRAM.
  • As used herein, SSDs functioning as flash cache should be understood to operate the SSD as a cache for block level data access, providing service to read operations instead of only reading from HDDs 110. Thus, if data is present in SSDs 112, reading will occur from the SSDs instead of requiring a read to the HDDs 110, which is a slower operation. As mentioned above, the storage operating system 1106 is configured with an algorithm that allows for intelligent writing of certain data to the SSDs 112 (e.g., cache-worthy data), while all data is written directly to the HDDs 110 from NVRAM 1118.
  • The algorithm, in one embodiment, is configured to select cache-worthy data for writing to the SSDs 112, in a manner that provides an increased likelihood that a read operation will access data from SSDs 112. In some embodiments, the algorithm is referred to as a cache accelerated sequential layout (CASL) architecture, which intelligently leverages unique properties of flash and disk to provide high performance and optimal use of capacity. In one embodiment, CASL caches "hot" active data onto SSD in real time, without the need to set complex policies. This way, the storage array can instantly respond to read requests, as much as ten times faster than traditional bolt-on or tiered approaches to flash caching.
  • For purposes of discussion and understanding, reference is made to CASL as being an algorithm processed by the storage OS. However, it should be understood that optimizations, modifications, additions, and subtractions to versions of CASL may take place from time to time. As such, reference to CASL should be understood to represent exemplary functionality, and the functionality may change from time to time, and may be modified to include or exclude features referenced herein or incorporated by reference herein. Still further, it should be understood that the embodiments described herein are just examples, and many more examples and/or implementations may be defined by combining elements and/or omitting elements described with reference to the claimed features.
  • In some implementations, SSDs 112 may be referred to as flash, flash cache, flash-based memory cache, flash drives, storage flash, or simply cache. Consistent with the use of these terms, in the context of storage array 102, the various implementations of SSD 112 provide block level caching to storage, as opposed to instruction level caching. As mentioned above, one functionality enabled by algorithms of the storage OS 1106 is to provide storage of cache-worthy block level data to the SSDs, so that subsequent read operations are optimized (i.e., reads that are likely to hit the flash cache will be stored to SSDs 112, as a form of storage caching, to accelerate the performance of the storage array 102).
  • In one embodiment, it should be understood that the "block level processing" of SSDs 112, serving as storage cache, is different from "instruction level processing," which is a common function in microprocessor environments. In one example, microprocessor environments utilize main memory and various levels of cache memory (e.g., L1, L2, etc.). Instruction level caching is differentiated further because instruction level caching is block-agnostic, meaning that instruction level caching is not aware of what type of application is producing or requesting the data processed by the microprocessor. Generally speaking, the microprocessor is required to treat all instruction level caching equally, without discriminating or differentiating processing of different types of applications.
  • In the various implementations described herein, the storage caching facilitated by SSDs 112 is implemented by algorithms exercised by the storage OS 1106, which can differentiate between the types of blocks being processed for each type of application or applications. That is, block data being written to storage 1130 can be associated with block data specific applications. For instance, one application may be a mail system application, while another application may be a financial database application, and yet another may be for a website-hosting application. Each application can have different storage accessing patterns and/or requirements. In accordance with several embodiments described herein, block data (e.g., associated with the specific applications) can be treated differently when processed by the algorithms executed by the storage OS 1106, for efficient use of flash cache 112.
  • FIG. 8 is a flow chart of a method for replicating data across storage systems, according to one embodiment. Operation 802 is for transferring the snapshot of a volume from an upstream array to a downstream array, the volume being a single accessible storage area within the upstream array. From operation 802, the method flows to operation 804 for comparing an upstream snapshot checksum (usc) of the snapshot in the upstream array with a downstream snapshot checksum (dsc) of the snapshot in the downstream array.
  • In operation 806, a check is made to determine if usc is equal to dsc. If usc is equal to dsc, the method flows to operation 818, and if usc is not equal to dsc, the method flows to operation 808. In operation 808, a plurality of chunks is defined in the snapshot.
  • From operation 808, the method flows to operation 810 where a comparison is made of an upstream chunk checksum (ucc) calculated by the upstream array with a downstream chunk checksum (dcc) calculated by the downstream array.
  • In operation 812, a check is made to determine if ucc is equal to dcc. If ucc is equal to dcc, the method flows to operation 814, where the chunk is considered validated. If ucc is not equal to dcc, the method flows to operation 816, where data of the chunk is sent from the upstream array to the downstream array. Operations 810, 812, and 814 or 816 are repeated for all the chunks defined in operation 808. When all the chunks have been validated, the snapshot is considered validated in operation 818.
  • The snapshot operations described herein solve the problem of having to re-send all the data of a volume from one storage system to another storage system when a problem occurs while replicating data. The data of the original snapshot is kept in permanent storage of the upstream array, and the replicated data is kept in permanent storage in the downstream array. Instead of having to resend all the data of the volume, the volume is divided into logical groups of data, referred to as chunks, and then each chunk is validated. When all the chunks have been validated, the replicated version of the volume is considered validated. The operations described herein refer to the exchange of information between two separate storage devices, which exchange data and transfer parameters to validate the replicated data. By re-transmitting only the chunks that have been improperly replicated, savings in time and resources are attained, because the data of the chunks that have been correctly replicated does not have to be re-transmitted.
  • One or more embodiments can also be fabricated as computer readable code on a non-transitory computer readable storage medium. The non-transitory computer readable storage medium is any non-transitory data storage device that can store data which can thereafter be read by a computer system. Examples of the non-transitory computer readable storage medium include hard drives, network attached storage (NAS), read-only memory, random-access memory, CD-ROMs, CD-Rs, CD-RWs, magnetic tapes, and other optical and non-optical data storage devices. The non-transitory computer readable storage medium can include computer readable storage media distributed over a network-coupled computer system so that the computer readable code is stored and executed in a distributed fashion.
  • Although the method operations were described in a specific order, it should be understood that other housekeeping operations may be performed between operations, or the operations may be adjusted so that they occur at slightly different times, or may be distributed in a system that allows the occurrence of the processing operations at various intervals associated with the processing, as long as the processing of the overlay operations is performed in the desired way.
  • Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, it will be apparent that certain changes and modifications can be practiced within the scope of the appended claims. Accordingly, the present embodiments are to be considered as illustrative and not restrictive, and the embodiments are not to be limited to the details given herein, but may be modified within the scope and equivalents of the described embodiments.

Claims (20)

1. A method for replicating data across storage systems, the method comprising:
transferring a snapshot of a volume from an upstream array to a downstream array, the volume being a single accessible storage area within the upstream array;
comparing an upstream snapshot checksum of the snapshot in the upstream array with a downstream snapshot checksum of the snapshot in the downstream array;
when the upstream snapshot checksum is different from the downstream snapshot checksum, defining a plurality of chunks in the snapshot; and
for each chunk in the snapshot,
comparing an upstream chunk checksum calculated by the upstream array with a downstream chunk checksum calculated by the downstream array; and
sending, from the upstream array to the downstream array, data of the chunk when the upstream chunk checksum is different from the downstream chunk checksum.
2. The method as recited in claim 1, further including:
exchanging, before defining the plurality of chunks, transfer parameters between the upstream array and the downstream array, the transfer parameters including one or more of checksum type for calculating the upstream chunk checksum and the downstream chunk checksum, or a checksum size, or a chunk size, or a cursor indicating at what chunk to start the comparing of the upstream chunk checksum and the downstream chunk checksum.
3. The method as recited in claim 2, further including:
starting comparing the upstream chunk checksum with the downstream chunk checksum at the chunk indicated by the cursor.
4. The method as recited in claim 1, wherein comparing the upstream chunk checksum with the downstream chunk checksum further includes:
calculating, by the upstream array, the upstream chunk checksum;
sending, from the upstream array to the downstream array, a request to get the downstream chunk checksum;
calculating, by the downstream array, the downstream chunk checksum;
sending the downstream chunk checksum to the upstream array; and
comparing, by the upstream array, the upstream chunk checksum with the downstream chunk checksum.
5. The method as recited in claim 1, wherein transferring the snapshot of the volume includes transferring all data from the snapshot from the upstream array to the downstream array, wherein the upstream array is a first storage system that includes a first processor, a first volatile memory, and a first permanent storage, wherein the downstream array is a second storage system that includes a second processor, a second volatile memory, and a second permanent storage, wherein the upstream array stores the snapshot of the volume to be replicated to the downstream array, wherein a volume holds data for the single accessible storage area, wherein data of the volume is accessible by a host in communication with the storage system.
6. The method as recited in claim 1, further including:
sending, from the upstream array to the downstream array, a confirmation message indicating that the snapshot has been validated.
7. The method as recited in claim 1, wherein the snapshot of the volume includes one or more blocks, wherein data from the snapshot is accessed by a host in units of a size of the block, the method further including:
storing, in the upstream array, checksums of blocks of the snapshot; and
calculating the upstream chunk checksum based on the checksums of the blocks in the chunk, wherein the chunk is not uncompressed to calculate the upstream chunk checksum.
8. The method as recited in claim 7, wherein a chunk includes one hundred or more blocks, wherein data from the chunk is not directly addressable by the host.
9. The method as recited in claim 7, wherein a chunk has a size in a range from 1 Megabyte to 16 Megabytes, wherein a block has a block size in a range from 1 Kilobyte to 256 Kilobytes.
10. A method for replicating data across storage systems, the method comprising:
transferring a snapshot of a volume from an upstream array to a downstream array, the volume being a single accessible storage area within the upstream array;
comparing an upstream snapshot checksum of the snapshot in the upstream array with a downstream snapshot checksum of the snapshot in the downstream array;
when the upstream snapshot checksum is different from the downstream snapshot checksum, defining a plurality of chunks in the snapshot; and
for each chunk in the snapshot,
comparing an upstream chunk checksum calculated by the upstream array with a downstream chunk checksum calculated by the downstream array;
when the upstream chunk checksum is different from the downstream chunk checksum, defining a plurality of blocks in the chunk; and
for each block in the chunk,
comparing an upstream block checksum calculated by the upstream array with a downstream block checksum calculated by the downstream array; and
sending, from the upstream array to the downstream array, data of the block when the upstream block checksum is different from the downstream block checksum.
11. The method as recited in claim 10, further including:
exchanging, before defining the plurality of chunks, transfer parameters between the upstream array and the downstream array, the transfer parameters including one or more of checksum type for calculating the upstream chunk checksum and the downstream chunk checksum, or a checksum size, or a chunk size, or a cursor indicating at what chunk to start the comparing of the upstream chunk checksum and the downstream chunk checksum.
12. The method as recited in claim 11, further including:
starting comparing the upstream chunk checksum with the downstream chunk checksum at the chunk indicated by the cursor.
13. The method as recited in claim 10, wherein data from the snapshot is accessed by a host in units of a size of the block, the method further including:
storing, in the upstream array, checksums of blocks of the snapshot; and
calculating the upstream chunk checksum based on the checksums of the blocks in the chunk, wherein the chunk is not uncompressed to calculate the upstream chunk checksum.
14. The method as recited in claim 10, wherein operations of the method are performed by a computer program when executed by one or more processors, the computer program being embedded in a non-transitory computer-readable storage medium.
15. A non-transitory computer-readable storage medium storing a computer program for replicating data across storage systems, the computer-readable storage medium comprising:
program instructions for transferring a snapshot of a volume from an upstream array to a downstream array, the volume being a single accessible storage area within the upstream array;
program instructions for comparing an upstream snapshot checksum of the snapshot in the upstream array with a downstream snapshot checksum of the snapshot in the downstream array;
program instructions for, when the upstream snapshot checksum is different from the downstream snapshot checksum, defining a plurality of chunks in the snapshot; and
for each chunk in the snapshot,
program instructions for comparing an upstream chunk checksum calculated by the upstream array with a downstream chunk checksum calculated by the downstream array; and
program instructions for sending, from the upstream array to the downstream array, data of the chunk when the upstream chunk checksum is different from the downstream chunk checksum.
16. The storage medium as recited in claim 15, further including:
program instructions for exchanging, before defining the plurality of chunks, transfer parameters between the upstream array and the downstream array, the transfer parameters including one or more of checksum type for calculating the upstream chunk checksum and the downstream chunk checksum, or a checksum size, or a chunk size, or a cursor indicating at what chunk to start the comparing of the upstream chunk checksum and the downstream chunk checksum.
17. The storage medium as recited in claim 16, further including:
program instructions for starting comparing the upstream chunk checksum with the downstream chunk checksum at the chunk indicated by the cursor.
18. The storage medium as recited in claim 15, wherein comparing the upstream chunk checksum with the downstream chunk checksum further includes:
program instructions for calculating, by the upstream array, the upstream chunk checksum;
program instructions for sending, from the upstream array to the downstream array, a request to get the downstream chunk checksum;
program instructions for calculating, by the downstream array, the downstream chunk checksum;
program instructions for sending the downstream chunk checksum to the upstream array; and
program instructions for comparing, by the upstream array, the upstream chunk checksum with the downstream chunk checksum.
19. The storage medium as recited in claim 15, wherein transferring the snapshot of the volume includes transferring all data from the snapshot from the upstream array to the downstream array.
20. The storage medium as recited in claim 15, further including:
program instructions for sending, from the upstream array to the downstream array, a confirmation message indicating that the snapshot has been validated.
US14/950,456 2014-11-25 2015-11-24 Content-based replication of data between storage units Abandoned US20160150012A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/950,456 US20160150012A1 (en) 2014-11-25 2015-11-24 Content-based replication of data between storage units

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201462084403P 2014-11-25 2014-11-25
US201462084395P 2014-11-25 2014-11-25
US14/950,456 US20160150012A1 (en) 2014-11-25 2015-11-24 Content-based replication of data between storage units

Publications (1)

Publication Number Publication Date
US20160150012A1 true US20160150012A1 (en) 2016-05-26

Family

ID=56010446

Family Applications (2)

Application Number Title Priority Date Filing Date
US14/950,456 Abandoned US20160150012A1 (en) 2014-11-25 2015-11-24 Content-based replication of data between storage units
US14/950,482 Active 2036-11-17 US10467246B2 (en) 2014-11-25 2015-11-24 Content-based replication of data in scale out system

Family Applications After (1)

Application Number Title Priority Date Filing Date
US14/950,482 Active 2036-11-17 US10467246B2 (en) 2014-11-25 2015-11-24 Content-based replication of data in scale out system

Country Status (1)

Country Link
US (2) US20160150012A1 (en)

Cited By (142)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10296451B1 (en) 2018-11-01 2019-05-21 EMC IP Holding Company LLC Content addressable storage system utilizing content-based and address-based mappings
US10324640B1 (en) 2018-01-22 2019-06-18 EMC IP Holding Company LLC Storage system with consistent initiation of data replication across multiple distributed processing modules
US10338851B1 (en) 2018-01-16 2019-07-02 EMC IP Holding Company LLC Storage system with consistent termination of data replication across multiple distributed processing modules
US10356273B2 (en) * 2017-07-12 2019-07-16 Kyocera Document Solutions Inc. Image reading device and image reading method
US10359965B1 (en) * 2017-07-28 2019-07-23 EMC IP Holding Company LLC Signature generator for use in comparing sets of data in a content addressable storage system
US10394485B1 (en) 2018-03-29 2019-08-27 EMC IP Holding Company LLC Storage system with efficient re-synchronization mode for use in replication of data from source to target
US10437855B1 (en) 2017-07-28 2019-10-08 EMC IP Holding Company LLC Automatic verification of asynchronously replicated data
US10467246B2 (en) 2014-11-25 2019-11-05 Hewlett Packard Enterprise Development Lp Content-based replication of data in scale out system
US10466925B1 (en) 2017-10-25 2019-11-05 EMC IP Holding Company LLC Compression signaling for replication process in a content addressable storage system
US10496324B2 (en) 2018-03-30 2019-12-03 EMC IP Holding Company LLC Storage system with concurrent fan-out asynchronous replication using decoupled replication sessions
US10496489B1 (en) 2017-11-21 2019-12-03 EMC IP Holding Company LLC Storage system configured for controlled transition between asynchronous and synchronous replication modes
US10496674B2 (en) 2017-08-07 2019-12-03 International Business Machines Corporation Self-describing volume ancestry for data synchronization
US10521317B1 (en) 2017-10-26 2019-12-31 EMC IP Holding Company LLC Compressing data to be replicated utilizing a compression method selected based on network behavior
CN110737542A (en) * 2018-07-19 2020-01-31 慧与发展有限责任合伙企业 Freezing and unfreezing upstream and downstream rolls
US10558613B1 (en) 2018-07-19 2020-02-11 EMC IP Holding Company LLC Storage system with decrement protection of reference counts
US10585762B2 (en) 2014-04-29 2020-03-10 Hewlett Packard Enterprise Development Lp Maintaining files in a retained file system
US10592161B1 (en) 2019-01-22 2020-03-17 EMC IP Holding Company LLC Storage system with flexible scanning supporting storage volume addition and/or recovery in asynchronous replication
US10592159B2 (en) 2018-06-20 2020-03-17 EMC IP Holding Company LLC Processing device configured for data integrity testing utilizing replicated test metadata file
US10606519B1 (en) 2018-10-19 2020-03-31 EMC IP Holding Company LLC Edge case handling in system with dynamic flow control
US10635533B2 (en) 2018-07-30 2020-04-28 EMC IP Holding Company LLC Efficient computation of parity data in storage system implementing data striping
US10671320B2 (en) 2018-07-24 2020-06-02 EMC IP Holding Company LLC Clustered storage system configured with decoupling of process restart from in-flight command execution
US10684915B2 (en) 2018-07-25 2020-06-16 EMC IP Holding Company LLC Efficient packing of compressed data in storage system implementing data striping
US10691355B2 (en) 2018-11-02 2020-06-23 EMC IP Holding Company LLC Apparatus, method and computer program product for controlled ordering of data pages for migration from source storage system into target storage system
US10691373B2 (en) 2018-07-18 2020-06-23 EMC IP Holding Company LLC Object headers facilitating storage of data in a write buffer of a storage system
US10691551B2 (en) 2018-07-24 2020-06-23 EMC IP Holding Company LLC Storage system with snapshot generation control utilizing monitored differentials of respective storage volumes
US10698772B2 (en) 2018-07-17 2020-06-30 EMC IP Holding Company LLC Storage system with multiple write journals supporting synchronous replication failure recovery
US10705965B2 (en) 2018-07-23 2020-07-07 EMC IP Holding Company LLC Metadata loading in storage systems
US10719253B2 (en) 2018-10-31 2020-07-21 EMC IP Holding Company LLC Efficient compression of data in storage systems through offloading computation to storage devices
US10725855B2 (en) 2018-10-22 2020-07-28 EMC IP Holding Company LLC Storage system with data integrity verification performed in conjunction with internal data movement
US20200257514A1 (en) * 2019-02-11 2020-08-13 Salesforce.Com, Inc. Scalable artifact distribution
US10747474B2 (en) 2018-10-22 2020-08-18 EMC IP Holding Company LLC Online cluster expansion for storage system with decoupled logical and physical capacity
US10747677B2 (en) 2018-07-27 2020-08-18 EMC IP Holding Company LLC Snapshot locking mechanism
US10754736B2 (en) 2018-10-25 2020-08-25 EMC IP Holding Company LLC Storage system with scanning and recovery of internal hash metadata structures
US10754559B1 (en) 2019-03-08 2020-08-25 EMC IP Holding Company LLC Active-active storage clustering with clock synchronization
US10754575B2 (en) 2018-07-16 2020-08-25 EMC IP Holding Company LLC Storage system with replication process utilizing simulated target responses
US10761933B2 (en) 2018-09-21 2020-09-01 EMC IP Holding Company LLC Prefill of raid stripes in a storage system by reading of existing data
US10783134B2 (en) 2018-07-31 2020-09-22 EMC IP Holding Company LLC Polling process for monitoring interdependent hardware components
US10817385B2 (en) 2018-07-31 2020-10-27 EMC IP Holding Company LLC Storage system with backup control utilizing content-based signatures
US10824512B2 (en) 2018-07-31 2020-11-03 EMC IP Holding Company LLC Managing journaling resources with copies stored in multiple locations
US10826990B2 (en) 2018-07-23 2020-11-03 EMC IP Holding Company LLC Clustered storage system configured for bandwidth efficient processing of writes at sizes below a native page size
US10831407B2 (en) 2019-01-31 2020-11-10 EMC IP Holding Company LLC Write flow offloading to raid array storage enclosure
US10831735B2 (en) 2018-07-25 2020-11-10 EMC IP Holding Company LLC Processing device configured for efficient generation of a direct mapped hash table persisted to non-volatile block memory
US10838863B2 (en) 2019-02-01 2020-11-17 EMC IP Holding Company LLC Storage system with write cache release protection
US10846178B2 (en) 2019-01-11 2020-11-24 EMC IP Holding Company LLC Hash-based remote rebuild assistance for content addressable storage systems
US10852965B2 (en) 2018-10-30 2020-12-01 EMC IP Holding Company LLC Write folding mechanism using reusable shared striping in a storage system
US10852999B2 (en) 2018-07-31 2020-12-01 EMC IP Holding Company LLC Storage system with decoupling of reference count updates
US10860241B2 (en) 2018-10-24 2020-12-08 EMC IP Holding Company LLC Storage system configured for token-based data transfer in active-active configuration with synchronous replication
US10866969B2 (en) 2018-03-28 2020-12-15 EMC IP Holding Company LLC Storage system with loopback replication process providing unique identifiers for collision-free object pairing
US10866735B2 (en) 2019-03-26 2020-12-15 EMC IP Holding Company LLC Storage system with variable granularity counters
US10866760B2 (en) 2019-04-15 2020-12-15 EMC IP Holding Company LLC Storage system with efficient detection and clean-up of stale data for sparsely-allocated storage in replication
US10871991B2 (en) 2019-01-18 2020-12-22 EMC IP Holding Company LLC Multi-core processor in storage system executing dedicated polling thread for increased core availability
US10871960B2 (en) 2019-04-23 2020-12-22 EMC IP Holding Company LLC Upgrading a storage controller operating system without rebooting a storage system
US10884650B1 (en) 2017-10-25 2021-01-05 EMC IP Holding Company LLC Opportunistic compression of replicated data in a content addressable storage system
US10884651B2 (en) 2018-07-23 2021-01-05 EMC IP Holding Company LLC Storage system with multi-phase verification of synchronously replicated data
US10884799B2 (en) 2019-01-18 2021-01-05 EMC IP Holding Company LLC Multi-core processor in storage system executing dynamic thread for increased core availability
US10891195B2 (en) 2019-03-19 2021-01-12 EMC IP Holding Company LLC Storage system with differential scanning of non-ancestor snapshot pairs in asynchronous replication
US10901847B2 (en) 2018-07-31 2021-01-26 EMC IP Holding Company LLC Maintaining logical to physical address mapping during in place sector rebuild
US10909001B1 (en) 2019-08-23 2021-02-02 EMC IP Holding Company LLC Storage system with snapshot group split functionality
US10922147B2 (en) 2018-07-19 2021-02-16 EMC IP Holding Company LLC Storage system destaging based on synchronization object with watermark
US10929047B2 (en) 2018-07-31 2021-02-23 EMC IP Holding Company LLC Storage system with snapshot generation and/or preservation control responsive to monitored replication data
US10929239B2 (en) 2019-07-19 2021-02-23 EMC IP Holding Company LLC Storage system with snapshot group merge functionality
US10929050B2 (en) 2019-04-29 2021-02-23 EMC IP Holding Company LLC Storage system with deduplication-aware replication implemented using a standard storage command protocol
US10936240B2 (en) * 2018-12-04 2021-03-02 International Business Machines Corporation Using merged snapshots to increase operational efficiency for network caching based disaster recovery
US10936010B2 (en) 2019-03-08 2021-03-02 EMC IP Holding Company LLC Clock synchronization for storage systems in an active-active configuration
US10942654B2 (en) 2018-11-01 2021-03-09 EMC IP Holding Company LLC Hash-based data recovery from remote storage system
US10951699B1 (en) * 2017-11-28 2021-03-16 EMC IP Holding Company LLC Storage system with asynchronous messaging between processing modules for data replication
US10956078B2 (en) 2018-03-27 2021-03-23 EMC IP Holding Company LLC Storage system with loopback replication process providing object-dependent slice assignment
US10977216B2 (en) 2018-05-29 2021-04-13 EMC IP Holding Company LLC Processing device utilizing content-based signature prefix for efficient generation of deduplication estimate
US10983962B2 (en) 2018-05-29 2021-04-20 EMC IP Holding Company LLC Processing device utilizing polynomial-based signature subspace for efficient generation of deduplication estimate
US10990286B1 (en) 2019-10-30 2021-04-27 EMC IP Holding Company LLC Parallel upgrade of nodes in a storage system
US10996887B2 (en) 2019-04-29 2021-05-04 EMC IP Holding Company LLC Clustered storage system with dynamic space assignments across processing modules to counter unbalanced conditions
US10996871B2 (en) 2018-11-01 2021-05-04 EMC IP Holding Company LLC Hash-based data recovery from remote storage system responsive to missing or corrupted hash digest
US10997072B1 (en) 2019-10-16 2021-05-04 EMC IP Holding Company LLC Host-based acceleration of a content addressable storage system
US10996898B2 (en) 2018-05-29 2021-05-04 EMC IP Holding Company LLC Storage system configured for efficient generation of capacity release estimates for deletion of datasets
US11003629B2 (en) 2018-10-31 2021-05-11 EMC IP Holding Company LLC Dual layer deduplication for application specific file types in an information processing system
US11010251B1 (en) 2020-03-10 2021-05-18 EMC IP Holding Company LLC Metadata update journal destaging with preload phase for efficient metadata recovery in a distributed storage system
US11030314B2 (en) 2018-07-31 2021-06-08 EMC IP Holding Company LLC Storage system with snapshot-based detection and remediation of ransomware attacks
US11036602B1 (en) 2019-11-25 2021-06-15 EMC IP Holding Company LLC Storage system with prioritized RAID rebuild
US11055014B2 (en) 2019-03-26 2021-07-06 EMC IP Holding Company LLC Storage system providing automatic configuration updates for remote storage objects in a replication process
US11055188B2 (en) 2019-04-12 2021-07-06 EMC IP Holding Company LLC Offloading error processing to raid array storage enclosure
US11055028B1 (en) 2020-02-03 2021-07-06 EMC IP Holding Company LLC Storage system with reduced read latency
US11061618B1 (en) 2020-02-25 2021-07-13 EMC IP Holding Company LLC Disk array enclosure configured to determine metadata page location based on metadata identifier
US11079957B2 (en) 2019-11-01 2021-08-03 Dell Products L.P. Storage system capacity expansion using mixed-capacity storage devices
US11079969B1 (en) 2020-02-25 2021-08-03 EMC IP Holding Company LLC Disk array enclosure configured for metadata and data storage processing
US11079961B1 (en) 2020-02-03 2021-08-03 EMC IP Holding Company LLC Storage system with write-via-hash functionality for synchronous replication of logical storage volumes
US11086558B2 (en) 2018-11-01 2021-08-10 EMC IP Holding Company LLC Storage system with storage volume undelete functionality
US11093161B1 (en) 2020-06-01 2021-08-17 EMC IP Holding Company LLC Storage system with module affinity link selection for synchronous replication of logical storage volumes
US11093159B2 (en) 2019-01-15 2021-08-17 EMC IP Holding Company LLC Storage system with storage volume pre-copy functionality for increased efficiency in asynchronous replication
US11099767B2 (en) 2019-10-25 2021-08-24 EMC IP Holding Company LLC Storage system with throughput-based timing of synchronous replication recovery
US11099766B2 (en) 2019-06-21 2021-08-24 EMC IP Holding Company LLC Storage system configured to support one-to-many replication
US11106557B2 (en) 2020-01-21 2021-08-31 EMC IP Holding Company LLC Persistence points based coverage mechanism for flow testing in high-performance storage systems
US11106810B2 (en) 2018-07-30 2021-08-31 EMC IP Holding Company LLC Multi-tenant deduplication with non-trusted storage system
US11126361B1 (en) 2020-03-16 2021-09-21 EMC IP Holding Company LLC Multi-level bucket aggregation for journal destaging in a distributed storage system
US11137929B2 (en) 2019-06-21 2021-10-05 EMC IP Holding Company LLC Storage system configured to support cascade replication
US11144461B2 (en) 2020-03-09 2021-10-12 EMC IP Holding Company LLC Bandwidth efficient access to persistent storage in a distributed storage system
US11144232B2 (en) 2020-02-21 2021-10-12 EMC IP Holding Company LLC Storage system with efficient snapshot pair creation during synchronous replication of logical storage volumes
US11144229B2 (en) 2018-11-01 2021-10-12 EMC IP Holding Company LLC Bandwidth efficient hash-based migration of storage volumes between storage systems
US11151048B2 (en) 2019-10-25 2021-10-19 Dell Products L.P. Host-based read performance optimization of a content addressable storage system
CN113569047A (en) * 2021-07-23 2021-10-29 China CITIC Bank Co., Ltd. Inter-system data verification method, apparatus, device, and readable storage medium
US11169880B1 (en) 2020-04-20 2021-11-09 EMC IP Holding Company LLC Storage system configured to guarantee sufficient capacity for a distributed raid rebuild process
US11194664B2 (en) 2020-04-20 2021-12-07 EMC IP Holding Company LLC Storage system configured to guarantee sufficient capacity for a distributed raid rebuild process
US11204716B2 (en) 2019-01-31 2021-12-21 EMC IP Holding Company LLC Compression offloading to RAID array storage enclosure
US11216443B2 (en) 2018-06-20 2022-01-04 EMC IP Holding Company LLC Processing device configured for data integrity testing utilizing signature-based multi-phase write operations
US11232128B2 (en) * 2019-01-14 2022-01-25 EMC IP Holding Company LLC Storage systems configured with time-to-live clustering for replication in active-active configuration
US11232010B2 (en) * 2020-01-20 2022-01-25 EMC IP Holding Company LLC Performance monitoring for storage system with core thread comprising internal and external schedulers
US11249654B2 (en) 2020-02-18 2022-02-15 EMC IP Holding Company LLC Storage system with efficient data and parity distribution across mixed-capacity storage devices
US11249834B2 (en) 2019-05-15 2022-02-15 EMC IP Holding Company LLC Storage system with coordinated recovery across multiple input-output journals of different types
US11275765B2 (en) * 2019-01-28 2022-03-15 EMC IP Holding Company LLC Storage systems configured for storage volume addition in synchronous replication using active-active configuration
US11281386B2 (en) 2020-02-25 2022-03-22 EMC IP Holding Company LLC Disk array enclosure with metadata journal
US11288286B2 (en) 2019-01-22 2022-03-29 EMC IP Holding Company LLC Storage system with data consistency checking in synchronous replication using active snapshot set
US11308125B2 (en) 2018-03-27 2022-04-19 EMC IP Holding Company LLC Storage system with fast recovery and resumption of previously-terminated synchronous replication
US11314416B1 (en) 2020-10-23 2022-04-26 EMC IP Holding Company LLC Defragmentation of striped volume in data storage system
US11327812B1 (en) 2020-10-19 2022-05-10 EMC IP Holding Company LLC Distributed storage system with per-core rebalancing of thread queues
US11360712B2 (en) 2020-02-03 2022-06-14 EMC IP Holding Company LLC Storage system with continuous data verification for synchronous replication of logical storage volumes
US11372772B2 (en) 2018-06-20 2022-06-28 EMC IP Holding Company LLC Content addressable storage system configured for efficient storage of count-key-data tracks
US11372570B1 (en) * 2021-02-12 2022-06-28 Hitachi, Ltd. Storage device, computer system, and data transfer program for deduplication
US11379142B2 (en) 2019-03-29 2022-07-05 EMC IP Holding Company LLC Snapshot-enabled storage system implementing algorithm for efficient reclamation of snapshot storage space
US11386042B2 (en) 2019-03-29 2022-07-12 EMC IP Holding Company LLC Snapshot-enabled storage system implementing algorithm for efficient reading of data from stored snapshots
US11392551B2 (en) 2019-02-04 2022-07-19 EMC IP Holding Company LLC Storage system utilizing content-based and address-based mappings for deduplicatable and non-deduplicatable types of data
US11392295B2 (en) 2020-05-27 2022-07-19 EMC IP Holding Company LLC Front-end offload of storage system processing
US11397705B2 (en) 2019-02-26 2022-07-26 EMC IP Holding Company LLC Storage system configured to generate sub-volume snapshots
US11429517B2 (en) 2018-06-20 2022-08-30 EMC IP Holding Company LLC Clustered storage system with stateless inter-module communication for processing of count-key-data tracks
US11435921B2 (en) 2020-11-19 2022-09-06 EMC IP Holding Company LLC Selective deduplication in a distributed storage system
US11436138B2 (en) 2020-10-21 2022-09-06 EMC IP Holding Company LLC Adaptive endurance tuning of solid-state storage system
US11467906B2 (en) 2019-08-02 2022-10-11 EMC IP Holding Company LLC Storage system resource rebuild based on input-output operation indicator
US11481291B2 (en) 2021-01-12 2022-10-25 EMC IP Holding Company LLC Alternative storage node communication channel using storage devices group in a distributed storage system
US11494301B2 (en) 2020-05-12 2022-11-08 EMC IP Holding Company LLC Storage system journal ownership mechanism
US11494405B2 (en) 2020-12-21 2022-11-08 EMC IP Holding Company LLC Lock contention resolution for active-active replication performed in conjunction with journal recovery
US11494103B2 (en) 2019-08-02 2022-11-08 EMC IP Holding Company LLC Memory-efficient processing of RAID metadata bitmaps
US11507597B2 (en) 2021-03-31 2022-11-22 Pure Storage, Inc. Data replication to meet a recovery point objective
US11513882B2 (en) 2020-06-08 2022-11-29 EMC IP Holding Company LLC Dynamic modification of IO shaping mechanisms of multiple storage nodes in a distributed storage system
US11520527B1 (en) 2021-06-11 2022-12-06 EMC IP Holding Company LLC Persistent metadata storage in a storage system
US11531470B2 (en) 2020-10-21 2022-12-20 EMC IP Holding Company LLC Offload of storage system data recovery to storage devices
US11593313B2 (en) 2018-05-29 2023-02-28 EMC IP Holding Company LLC Processing device configured for efficient generation of data reduction estimates for combinations of datasets
US11609883B2 (en) 2018-05-29 2023-03-21 EMC IP Holding Company LLC Processing device configured for efficient generation of compression estimates for datasets
US11616722B2 (en) 2020-10-22 2023-03-28 EMC IP Holding Company LLC Storage system with adaptive flow control using multiple feedback loops
US11645174B2 (en) 2019-10-28 2023-05-09 Dell Products L.P. Recovery flow with reduced address lock contention in a content addressable storage system
US11687245B2 (en) 2020-11-19 2023-06-27 EMC IP Holding Company LLC Dynamic slice assignment in a distributed storage system
US11775202B2 (en) 2021-07-12 2023-10-03 EMC IP Holding Company LLC Read stream identification in a distributed storage system
US11853568B2 (en) 2020-10-21 2023-12-26 EMC IP Holding Company LLC Front-end offload of storage system hash and compression processing
US11875198B2 (en) 2021-03-22 2024-01-16 EMC IP Holding Company LLC Synchronization object issue detection using object type queues and associated monitor threads in a storage system
US11886911B2 (en) 2020-06-29 2024-01-30 EMC IP Holding Company LLC End-to-end quality of service mechanism for storage system using prioritized thread queues

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10956079B2 (en) * 2018-04-13 2021-03-23 Hewlett Packard Enterprise Development LP Data resynchronization
US11100135B2 (en) * 2018-07-18 2021-08-24 EMC IP Holding Company LLC Synchronous replication in a storage system
US11226984B2 (en) * 2019-08-13 2022-01-18 Capital One Services, Llc Preventing data loss in event driven continuous availability systems

Family Cites Families (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB9811574D0 (en) * 1998-05-30 1998-07-29 IBM Indexed file system and a method and a mechanism for accessing data records from such a system
US7827214B1 (en) 2003-02-14 2010-11-02 Google Inc. Maintaining data in a file system
US7836387B1 (en) 2005-04-29 2010-11-16 Oracle America, Inc. System and method for protecting data across protection domain boundaries
US7650394B2 (en) 2006-09-15 2010-01-19 Microsoft Corporation Synchronizing email recipient lists using block partition information
US7895501B2 (en) 2007-02-06 2011-02-22 Vision Solutions, Inc. Method for auditing data integrity in a high availability database
US8060812B2 (en) * 2007-07-27 2011-11-15 International Business Machines Corporation Methods, systems, and computer program products for class verification
US7814074B2 (en) 2008-03-14 2010-10-12 International Business Machines Corporation Method and system for assuring integrity of deduplicated data
US8209283B1 (en) 2009-02-19 2012-06-26 EMC Corporation System and method for highly reliable data replication
US8504517B2 (en) 2010-03-29 2013-08-06 Commvault Systems, Inc. Systems and methods for selective data replication
US9116831B2 (en) * 2010-10-06 2015-08-25 Cleversafe, Inc. Correcting an errant encoded data slice
US8762336B2 (en) * 2011-05-23 2014-06-24 Microsoft Corporation Geo-verification and repair
US9910904B2 (en) 2011-08-30 2018-03-06 International Business Machines Corporation Replication of data objects from a source server to a target server
US9600371B2 (en) * 2011-09-09 2017-03-21 Oracle International Corporation Preserving server-client session context
US9449014B2 (en) 2011-11-29 2016-09-20 Dell Products L.P. Resynchronization of replicated data
US8972678B2 (en) 2011-12-21 2015-03-03 EMC Corporation Efficient backup replication
US9152659B2 (en) 2011-12-30 2015-10-06 Bmc Software, Inc. Systems and methods for migrating database data
US8977602B2 (en) 2012-06-05 2015-03-10 Oracle International Corporation Offline verification of replicated file system
US20140007189A1 (en) * 2012-06-28 2014-01-02 International Business Machines Corporation Secure access to shared storage resources
US9075529B2 (en) * 2013-01-04 2015-07-07 International Business Machines Corporation Cloud based data migration and replication
WO2015070160A1 (en) * 2013-11-08 2015-05-14 MustBin Inc. Bin enabled data object encryption and storage apparatuses, methods and systems
US9767106B1 (en) 2014-06-30 2017-09-19 EMC IP Holding Company LLC Snapshot based file verification
US20170193004A1 (en) 2014-07-22 2017-07-06 Hewlett Packard Enterprise Development LP Ensuring data integrity of a retained file upon replication
US9697268B1 (en) * 2014-09-16 2017-07-04 Amazon Technologies, Inc. Bulk data distribution system
US20160150012A1 (en) 2014-11-25 2016-05-26 Nimble Storage, Inc. Content-based replication of data between storage units

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110099148A1 (en) * 2008-07-02 2011-04-28 Bruning Iii Theodore E Verification Of Remote Copies Of Data
US20100179941A1 (en) * 2008-12-10 2010-07-15 Commvault Systems, Inc. Systems and methods for performing discrete data replication
US20120317079A1 (en) * 2011-06-08 2012-12-13 Kurt Alan Shoens Systems and methods of data replication of a file system
US20130007366A1 (en) * 2011-07-01 2013-01-03 International Business Machines Corporation Delayed instant copy operation for short-lived snapshots
US9740583B1 (en) * 2012-09-24 2017-08-22 Amazon Technologies, Inc. Layered keys for storage volumes
US20140201153A1 (en) * 2013-01-11 2014-07-17 Commvault Systems, Inc. Partial file restore in a data storage system

Also Published As

Publication number Publication date
US10467246B2 (en) 2019-11-05
US20160147855A1 (en) 2016-05-26

Similar Documents

Publication number Publication date Title
US20160150012A1 (en) Content-based replication of data between storage units
US11716385B2 (en) Utilizing cloud-based storage systems to support synchronous replication of a dataset
US11836155B2 (en) File system operation handling during cutover and steady state
US20210019067A1 (en) Data deduplication across storage systems
US20220091771A1 (en) Moving Data Between Tiers In A Multi-Tiered, Cloud-Based Storage System
US9977746B2 (en) Processing of incoming blocks in deduplicating storage system
US10459632B1 (en) Method and system for automatic replication data verification and recovery
US9928003B2 (en) Management of writable snapshots in a network storage device
US10152527B1 (en) Increment resynchronization in hash-based replication
US8706694B2 (en) Continuous data protection of files stored on a remote storage device
US8930947B1 (en) System and method for live migration of a virtual machine with dedicated cache
US10409508B2 (en) Updating of pinned storage in flash based on changes to flash-to-disk capacity ratio
US9274956B1 (en) Intelligent cache eviction at storage gateways
US20170177479A1 (en) Cached volumes at storage gateways
US9559889B1 (en) Cache population optimization for storage gateways
US10042719B1 (en) Optimizing application data backup in SMB
WO2021226344A1 (en) Providing data management as-a-service
US9053033B1 (en) System and method for cache content sharing
US10210060B2 (en) Online NVM format upgrade in a data storage system operating with active and standby memory controllers
US11003541B2 (en) Point-in-time copy on a remote system
US11216204B2 (en) Degraded redundant metadata, DRuM, technique
US9864661B2 (en) Cache-accelerated replication of snapshots between storage devices
US20230034463A1 (en) Selectively using summary bitmaps for data synchronization
US20230110067A1 (en) Systems, methods, and devices for near storage elasticity
US11620190B2 (en) Techniques for performing backups using hints

Legal Events

Date Code Title Description
AS Assignment

Owner name: NIMBLE STORAGE, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BARSZCZAK, TOMASZ;KARAJE, GURUNATHA;BHAGAT, NIMESH;REEL/FRAME:037135/0137

Effective date: 20151123

AS Assignment

Owner name: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP, TEXAS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:NIMBLE STORAGE, INC.;REEL/FRAME:042810/0906

Effective date: 20170601

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION