US20140337578A1 - Redundant array of inexpensive disks (raid) system configured to reduce rebuild time and to prevent data sprawl - Google Patents
Redundant array of inexpensive disks (raid) system configured to reduce rebuild time and to prevent data sprawl
- Publication number
- US20140337578A1
- Authority
- US
- United States
- Prior art keywords
- virtual memory
- raid
- memory addresses
- computer
- pds
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0638—Organizing or formatting or addressing of data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/08—Error detection or correction by redundancy in data representation, e.g. by using checking codes
- G06F11/10—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
- G06F11/1076—Parity data used in redundant arrays of independent storages, e.g. in RAID systems
- G06F11/1092—Rebuilding, e.g. when physically replacing a failing disk
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/06—Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
- G06F12/0615—Address space extension
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0614—Improving the reliability of storage systems
- G06F3/0619—Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0662—Virtualisation aspects
- G06F3/0665—Virtualisation aspects at area level, e.g. provisioning of virtual or logical volumes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/0671—In-line storage system
- G06F3/0683—Plurality of storage devices
- G06F3/0689—Disk arrays, e.g. RAID, JBOD
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/06—Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
- G06F12/0646—Configuration or reconfiguration
-
- G06F2003/0695—
Definitions
- the invention relates generally to storage systems that implement Redundant Array of Inexpensive Disks (RAID) technology. More particularly, the invention relates to a RAID system that is capable of performing rebuild processes in a reduced amount of time and that is capable of reducing or preventing data sprawl.
- a storage array or disk array is a data storage device that includes multiple disk drives or similar persistent storage units.
- a storage array can allow large amounts of data to be stored in an efficient manner.
- a storage array also can provide redundancy to promote reliability, as in the case of a RAID system.
- RAID systems simultaneously use two or more hard disk drives, referred to herein as physical disk drives (PDs), to achieve greater levels of performance, reliability and/or larger data volume sizes.
- the phrase “RAID” is generally used to describe computer data storage schemes that divide and replicate data among multiple PDs.
- one or more PDs are set up as a RAID virtual disk drive (VD).
- data might be distributed across multiple PDs, but the VD is seen by the user and by the operating system of the computer as a single disk.
- the VD is “virtual” in that storage space in the VD maps to the physical storage space in the PDs, but the VD usually does not itself represent a single physical storage device.
- RAID has seven basic levels corresponding to different system designs.
- the seven basic RAID levels are typically referred to as RAID levels 0-6.
- RAID level 5 uses striping in combination with distributed parity.
- striping means that logically sequential data, such as a single data file, is fragmented and assigned to multiple PDs in a round-robin fashion. Thus, the data is said to be “striped” over multiple PDs when the data is written.
- distributed parity means that the parity bits that are calculated for each strip of data are distributed over all of the PDs rather than being stored on one or more dedicated parity PDs. Striping improves performance because the data fragments that make up each data stripe are written in parallel to different PDs and read in parallel from the different PDs. Distributing the parity bits also improves performance in that the parity bits associated with different data stripes can be written in parallel to different PDs using parallel write operations as opposed to having to use sequential write operations to a dedicated parity PD.
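The striping and distributed-parity scheme described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names `xor_blocks` and `stripe_raid5`, the zero padding of the last stripe, and the particular parity rotation order are assumptions chosen for illustration; real RAID level 5 implementations use a variety of layouts.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_raid5(data, num_pds, strip_size):
    """Distribute data over num_pds PDs with rotating parity.

    Returns one list of strips per PD. The rotation order and the zero
    padding are illustrative assumptions, not a layout from the patent.
    """
    if num_pds < 3:
        raise ValueError("RAID level 5 needs at least three PDs")
    per_stripe = (num_pds - 1) * strip_size
    # Pad so that the data fills a whole number of stripes.
    data = data + bytes(-len(data) % per_stripe)
    pds = [[] for _ in range(num_pds)]
    for stripe_no, start in enumerate(range(0, len(data), per_stripe)):
        chunk = data[start:start + per_stripe]
        # Logically sequential data is cut into strips in round-robin order.
        strips = [chunk[i:i + strip_size] for i in range(0, per_stripe, strip_size)]
        parity = xor_blocks(strips)
        # The parity strip moves to a different PD for each stripe.
        parity_pd = (num_pds - 1 - stripe_no) % num_pds
        strips.insert(parity_pd, parity)
        for pd, strip in zip(pds, strips):
            pd.append(strip)
    return pds
```

Because the parity strip of each stripe is the XOR of that stripe's data strips, the XOR of all strips in a stripe, parity included, is all zeros.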
- FIG. 1 illustrates a block diagram of a known RAID system 2 comprising a computer 3, a RAID controller 4, and an array 5 of PDs 6.
- When the computer 3 has data to write, an OS 7 of the computer 3 generates a write command, which is received by a file system (FS) 8 of the OS 7.
- the FS 8 then issues an input/output (IO) command to the RAID controller 4.
- the IO command contains the data to be written and virtual memory addresses where the data is currently located in a virtual memory 9 .
- a RAID processor 4 a of the RAID controller 4 receives the IO command and then maps the virtual memory addresses to physical addresses in one or more of the PDs 6 .
- the RAID processor 4 a maintains a table of the virtual-to-physical address mapping in a local memory device 4 b of the RAID controller 4 .
- the RAID controller 4 then causes the data to be written to the physical addresses in one or more of the PDs 6 .
- the failed PD 6 is rebuilt by reading all of the stripes from the PDs 6 other than the failed PD 6 , computing the data and parity of the failed PD 6 from all of the stripes read from the other PDs 6 , and writing the computed data and parity to a replacement PD.
- the main issues associated with this rebuild technique are that it (1) takes a very long time to perform, (2) consumes a large amount of resources, and (3) detrimentally impacts system performance during the rebuild process.
- During the rebuild, the RAID system 2 is at a lower level of protection or is without protection from data integrity risks in the event that another of the PDs 6 fails. Rebuilds can take days or weeks, and the performance of the RAID system 2 is detrimentally impacted during that time period.
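The conventional rebuild described above, in which every stripe is read from the surviving PDs and the missing strip is recomputed, can be sketched as follows. This is a simplified illustration assuming single-parity XOR redundancy; `xor_blocks` and `rebuild_failed_pd` are hypothetical names, and each PD is modeled as a plain list of equally sized strips.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_failed_pd(pds, failed):
    """Recompute every strip of the failed PD from the surviving PDs.

    With XOR parity, the missing strip of a stripe (data or parity) is
    the XOR of the strips that the other PDs hold for that stripe.
    """
    survivors = [pd for index, pd in enumerate(pds) if index != failed]
    # zip(*survivors) walks the surviving PDs stripe by stripe.
    return [xor_blocks(strips) for strips in zip(*survivors)]
```

Note that the loop visits every stripe regardless of whether it holds live data, which is why the time required grows with the capacity of the PD rather than with the amount of data actually in use.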
- One technique that has been used to reduce the amount of data and parity that has to be computed during a rebuild involves only rebuilding “used” portions of the failed PD 6 .
- a portion of a PD 6 is considered “used” if it has been written with data.
- the RAID controller 4 of the RAID system 2 marks zones on the PDs 6 that have been written so that it is able to distinguish between zones that have been written and zones that have not been written. If a PD 6 subsequently fails, new data and parity are only computed for zones in the failed PD 6 that were marked as written at the time of the failure.
- One drawback is that the FS 8 often moves data around, which causes the same data to be stored in different zones of the PDs 6 at different times.
- the OS 7 may subsequently free data, but although the FS 8 is aware that the data has been freed, the RAID controller 4 is not made aware that the data has been freed. Therefore, the RAID controller 4 continues to consider the zone in the PD 6 in which the freed data is stored as “used”. Consequently, any zone in the failed PD 6 that was “touched” (i.e., written) at any point in time will be rebuilt. This results in more data being rebuilt than is necessary, and the process tends to be degenerative over time.
- Another disadvantage of this technique is that services and applications exist that by their nature use inordinate amounts of space on PDs 6 temporarily and then free the data. Again, while the FS 8 is aware that the data has been freed, the RAID controller 4 is not, and so any zones in the failed PD 6 that were “touched” are considered “used” and therefore will be rebuilt. Consequently, much more data and parity are rebuilt than is necessary.
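The weakness of the zone-marking technique can be seen in a small sketch. The class `ZoneTrackingController` and its fixed zone size are assumptions made for illustration only; the point is that the controller receives writes but never learns about frees, so a zone that has been marked stays marked.

```python
class ZoneTrackingController:
    """Illustrative model of a controller that marks written zones."""

    def __init__(self, zone_size):
        self.zone_size = zone_size
        self.written_zones = set()

    def write(self, physical_address, length):
        """Mark every zone touched by a write of the given length."""
        first = physical_address // self.zone_size
        last = (physical_address + length - 1) // self.zone_size
        self.written_zones.update(range(first, last + 1))

    # There is deliberately no free() method: the FS frees data without
    # informing the controller, so marks are never cleared.

    def zones_to_rebuild(self):
        """Zones that would be rebuilt if the PD failed now."""
        return sorted(self.written_zones)
```

A temporary file that briefly occupies many zones therefore enlarges every later rebuild, even after the FS has freed it.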
- Another drawback results from the way in which FSs typically operate.
- FSs are typically designed such that when making a choice between writing data to space that has never been written and writing data to space that has been written and subsequently freed, they choose to write data to space that has never been written. This results in “data sprawl” in that data gets written to more areas in the PDs than is necessary.
- the RAID controller is unaware that the data has been freed and considers the corresponding zones in the PDs as used. Consequently, if a PD fails, any zones that were previously written, even if subsequently freed, will be rebuilt, which results in more data being rebuilt than is necessary.
- a need also exists for a way to reduce the amount of data that needs to be rebuilt when performing a rebuild in a RAID system.
- a need also exists for a way to prevent data sprawl in a RAID system.
- the invention is directed to a RAID system, method and controller for reducing the amount of time that is required to perform a rebuild process.
- the invention is also directed to computer-readable mediums (CRMs) having computer instructions stored thereon for reducing rebuild time.
- the RAID system comprises a computer, an array of PDs, and a RAID controller interfaced with the computer and with the array of PDs.
- the computer comprises at least a first processor and a first local storage system.
- An OS of the computer runs on the first processor and uses the first local storage system.
- a file system (FS) running on the computer uses a portion of the first local storage system as virtual memory and maintains a virtual memory table in the first local storage system.
- the virtual memory table comprises at least entries identifying addresses in the virtual memory that are currently being used by the FS.
- Virtual memory addresses are currently being used by the FS if they have been written by the FS and have not been freed by the FS subsequent to being written by the FS.
- the OS causes the virtual memory addresses that have been written by the FS and the corresponding data to be output from the computer.
- the RAID controller identifies to a filter driver running on the computer one or more virtual memory disks in the virtual memory that contain the data that is stored in the PD for which the rebuild process is being performed.
- the filter driver scans a portion of the virtual memory table corresponding to the identified virtual memory disk or disks and identifies used virtual memory addresses.
- the filter driver then causes the used virtual memory addresses to be output to the RAID controller.
- the RAID controller translates the used virtual memory addresses into used physical addresses and causes data and parity to be reconstructed for the used physical addresses.
- the method for performing a rebuild in a RAID system comprises the following: in the event that a rebuild process is to be performed for one of the PDs, outputting from the RAID controller to a filter driver running on a computer of the RAID system identifiers of one or more virtual memory disks of a virtual memory of the computer.
- the identified virtual memory addresses contain data corresponding to data stored in the physical memory addresses of the PD or PDs for which the rebuild process is being performed; with the filter driver, receiving the identifiers in the computer and scanning a virtual memory table of the OS of the computer to identify used virtual memory addresses associated with the identifiers; with the filter driver, causing the used virtual memory addresses to be output from the computer to the RAID controller; in the RAID controller, translating the used virtual memory addresses into used physical memory addresses of the PD for which the rebuild process is being performed, and in the RAID controller, rebuilding data and parity for the used physical addresses of the PD for which the rebuild process is being performed.
- a RAID controller for performing a rebuild in a RAID system comprises at least an interface for interfacing with the computer and an array of PDs of the RAID system and a RAID processor for performing the rebuild.
- the RAID controller receives virtual memory addresses and corresponding data.
- the virtual memory addresses correspond to entries in a virtual memory table maintained by an FS of the computer of the RAID system.
- the entries identify virtual memory addresses in a virtual memory of a first local storage system of the computer that have been written by the FS and that have not been freed by the FS subsequent to being written by the FS.
- the RAID processor translates the virtual memory addresses into physical memory addresses in one or more of the PDs and writes the corresponding data to the corresponding physical memory addresses.
- the RAID processor identifies, via the interface, to a filter driver of the computer one or more virtual memory disks in the virtual memory that correspond to the physical memory addresses of one or more of the PDs for which the rebuild process is to be performed.
- the RAID processor receives, via the interface, virtual memory addresses identified by the filter driver as currently used virtual memory addresses and translates the currently used virtual memory addresses into currently used physical addresses and causes data and parity to be reconstructed for the currently used physical addresses.
- the CRM comprises one or more computer software programs for performing a rebuild in a RAID system.
- the computer software program or programs comprises a first code segment for execution by the RAID controller, a second code segment for execution by the computer, and third and fourth code segments for execution by the RAID controller.
- the first code segment causes identifiers of one or more virtual memory disks of a virtual memory that correspond to physical memory addresses of the PD or PDs for which the rebuild process is to be performed to be output from the RAID controller.
- the second code segment is a filter driver code segment that receives the identifiers output from the RAID controller and scans the virtual memory table of the OS of the computer of the RAID system to identify used virtual memory addresses associated with the identifiers. Used virtual memory addresses are virtual memory addresses that have been written by an FS of the computer and that have not been freed by the FS subsequent to being written by the FS.
- the filter driver code segment causes the used virtual memory addresses to be output from the computer to the RAID controller.
- the third code segment translates the used virtual memory addresses into used physical memory addresses of the PD or PDs for which the rebuild process is being performed.
- the fourth code segment then reconstructs data and parity for the used physical addresses of the PD for which the rebuild process is being performed.
- FIG. 1 illustrates a block diagram of a known RAID system.
- FIG. 2 illustrates a block diagram of a RAID system in accordance with an illustrative embodiment configured to reduce the amount of time that is required to perform a rebuild process and to reduce the amount of data and parity that have to be rebuilt during the rebuild process.
- FIG. 3 illustrates a flowchart that demonstrates the rebuild process in accordance with an illustrative embodiment.
- FIG. 4 illustrates the array of PDs shown in FIG. 2 and demonstrates the manner in which data sprawl is reduced or prevented in accordance with an illustrative embodiment.
- FIG. 5 illustrates a flowchart that represents the method performed by the RAID system shown in FIG. 2 to prevent data sprawl.
- a filter driver is provided in the OS of the computer of the RAID system that, in the event that one of the PDs is to be rebuilt, scans the virtual memory table of the computer to identify virtual memory addresses that are used and communicates the identified virtual memory addresses to the RAID controller.
- the RAID controller translates the identified virtual memory addresses into physical addresses of the PD being rebuilt.
- the RAID controller then rebuilds data and parity only for physical addresses in the PD that are associated with the virtual memory addresses identified by the filter driver. This reduces the amount of data and parity that are rebuilt during a rebuild process and reduces the amount of time that is required to perform the rebuild process.
- data is stored in the PDs in a way that limits data sprawl.
- By limiting data sprawl, the number of addresses in the PDs containing data and parity that have to be rebuilt is reduced, thereby reducing the amount of time that is required to perform the rebuild process.
- the first and second aspects of the invention may be employed together or separately.
- Embodiments of the invention use these known computational methods to reconstruct data and parity, but reduce the amount of data and parity that have to be reconstructed, and therefore reduce the amount of time that is required to rebuild the PD being replaced. Illustrative, or exemplary, embodiments of the first aspect of the invention will now be described with reference to FIGS. 2-3 .
- FIG. 2 illustrates a block diagram of a RAID system 100 in accordance with an illustrative embodiment configured to reduce the amount of time that is required to perform a rebuild process and to reduce the amount of data and parity that have to be rebuilt during the rebuild process.
- the RAID system 100 includes a computer 110 , a RAID controller 120 , and an array 130 of PDs 131 .
- the computer 110 may be any type of computer, but it is typically a server.
- the computer 110 includes an OS 140 having an FS 150 , a virtual memory 160 , a virtual memory table 170 , and a filter driver 200 .
- Although the filter driver 200 is depicted as being separate from the FS 150, the filter driver 200 may be part of the FS 150.
- the OS 140 , the FS 150 , and the filter driver 200 are typically implemented as computer software programs that reside in a local storage system 210 of the computer 110 and that are executed by at least one processor 220 of the computer 110 .
- the local storage system 210 typically comprises at least one hard disk drive (HDD) (not shown) and at least one solid state memory device (not shown).
- the virtual memory 160 and the virtual memory table 170 reside in the local storage system 210 of the computer 110 .
- When the computer 110 has data to write, the OS 140 generates a write command, which is received by the FS 150.
- the FS 150 then writes the data to addresses in the virtual memory 160 and creates entries in the virtual memory table 170 that indicate where the data is stored in the virtual memory 160 .
- the FS 150 then issues an IO command to the RAID controller 120 .
- the IO command contains the data to be written and the virtual memory addresses where the data is currently located in the virtual memory 160 .
- a RAID processor 120 a of the RAID controller 120 receives the IO command and then maps the virtual memory addresses to physical memory addresses in one or more of the PDs 131 of the array 130 .
- the RAID processor 120 a maintains a mapping table of the virtual-to-physical address mapping in a local memory device 120 b of the RAID controller 120 .
- the mapping table could be stored in an external memory device (not shown) that is accessible by the RAID processor 120 a .
- the RAID controller 120 then causes the data to be written to the physical addresses in one or more of the PDs 131 .
- one of the problems with the known rebuild technique results from the fact that when the OS 7 frees data, the RAID controller 4 is unaware that the data has been freed. Therefore, the RAID controller 4 does not know to free the corresponding data in the PDs 6 . As a result, the corresponding physical addresses in the PDs 6 are considered by the RAID controller 4 to be used, i.e., to contain valid data. Consequently, if one of the PDs 6 fails, any addresses in the failed PD 6 that were written at any point in time are rebuilt, even if those addresses contain data that has been freed in the virtual memory 9 by the OS 7 .
- If a rebuild is to be performed for one of the PDs 131, the filter driver 200 identifies used virtual memory addresses in the virtual memory 160 that correspond to physical addresses in the PD 131 being rebuilt. Virtual memory addresses that are used are those which have been written by the FS 150 and not subsequently freed by the FS 150. The filter driver 200 then causes the used virtual memory addresses to be communicated to the RAID controller 120. The RAID processor 120 a translates the used virtual memory addresses into their corresponding physical addresses in the PD 131 being rebuilt. The RAID controller 120 then rebuilds data and parity only for the physical addresses in the PD 131 that correspond to the used virtual memory addresses identified by the filter driver 200.
- FIG. 3 illustrates a flowchart that demonstrates an example of a method for determining which data needs to be rebuilt and for rebuilding the data.
- the RAID controller 120 identifies to the filter driver 200 the virtual memory disk or disks that correspond to the PD 131 being rebuilt, as indicated by block 201 .
- the filter driver 200 then scans the corresponding portion of the virtual memory table 170 and identifies the used virtual memory addresses, as indicated by block 203 .
- the filter driver 200 then causes the used virtual memory addresses to be output to the RAID controller 120 , as indicated by block 205 .
- the RAID controller 120 translates the used virtual memory addresses into physical memory addresses in the PD 131 , as indicated by block 207 .
- the RAID controller 120 reconstructs data and parity only for those physical addresses of the PD 131 being rebuilt, as indicated by block 209 .
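The flow of blocks 201 through 209 can be sketched in Python as follows. The data structures are assumptions, since the patent does not specify the layout of the virtual memory table 170 or of the mapping table: here the virtual memory table is a list of hypothetical `VmEntry` records with a `freed` flag, and the mapping table is a dictionary from a virtual memory address to a (PD index, physical address) pair.

```python
from dataclasses import dataclass


@dataclass
class VmEntry:
    """One hypothetical row of the virtual memory table."""
    virtual_disk: str    # identifier of the virtual memory disk
    address: int         # virtual memory address written by the FS
    freed: bool = False  # True once the FS has freed the address


def scan_used_addresses(vm_table, disk_ids):
    """Filter-driver step (blocks 203 and 205).

    Returns the addresses on the identified virtual memory disks that
    were written and not subsequently freed.
    """
    return [entry.address for entry in vm_table
            if entry.virtual_disk in disk_ids and not entry.freed]


def addresses_to_rebuild(used_virtual, mapping, failed_pd):
    """Controller step (block 207).

    Translates used virtual addresses into physical addresses and keeps
    only those that lie on the PD being rebuilt.
    """
    targets = []
    for virtual in used_virtual:
        pd_index, physical = mapping[virtual]
        if pd_index == failed_pd:
            targets.append(physical)
    return sorted(targets)
```

The controller would then reconstruct data and parity (block 209) only for the addresses returned by `addresses_to_rebuild`, using whatever reconstruction method the RAID level prescribes.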
- the method is partially performed by the filter driver 200 of the OS 140 and partially by the RAID controller 120 .
- Both parts are typically implemented as computer software programs.
- the computer software program corresponding to the filter driver 200 is executed by the processor 220 of the computer 110 running the OS 140 .
- the computer software program performed by the RAID controller 120 is executed by the processor 120 a of the RAID controller 120 .
- the computer software programs are typically stored on two separate computer-readable mediums (CRMs), one of which typically resides in the local storage system 210 of the computer 110 and the other of which typically resides in the local memory element 120 b of the RAID controller 120 . Any type of CRM may be used for this purpose including solid state memory devices, magnetic memory devices and optical memory devices.
- Solid state memory devices that are suitable for this purpose include, for example, Random Access Memory (RAM) devices, Read-Only Memory (ROM) devices, programmable ROM (PROM) devices, erasable PROM (EPROM) devices, and flash memory devices. It should be noted, however, that the method could be performed in hardware or in a combination of hardware and software or firmware.
- the second aspect of the invention relates to reducing data sprawl in a RAID system so that in the event that a rebuild has to be performed, the amount of data that has to be rebuilt and the amount of time that is required to perform the rebuild process are reduced.
- the FS 8 typically causes data to be spread around the array 5 for the reasons described above, thereby resulting in data sprawl.
- the RAID controller 120 allocates storage space in the PDs 131 in a way that prevents data from being spread around the array 130 .
- the RAID controller 120 allocates less than all of the storage space of the array 130 for use by the OS 140 .
- the space that is initially allocated comprises addresses that are typically contiguous, or at least substantially contiguous. For example, if the array 130 has an available storage capacity of 1 terabyte (TB), the RAID controller 120 may initially allocate 200 gigabytes (GBs), or about 20%.
- the RAID controller 120 writes the data to addresses in the PDs 131 of the array 130 that are in the initially allocated space, thereby confining the data and parity to particular portions of the array 130 .
- the RAID controller 120 allocates additional space in the array 130 that is contiguous or substantially contiguous with the initially allocated space.
- space allocated earlier in time is filled before space allocated later in time is filled, and the data is confined to the allocated space. In this way, data sprawl is prevented or at least reduced, which reduces disk seek times and the amount of time that is required to perform a rebuild in the event that one of the PDs 131 fails.
- FIG. 4 illustrates the array 130 of PDs 131 shown in FIG. 2 in accordance with an exemplary embodiment in which the array 130 is made up of three PDs 131 1 , 131 2 and 131 3 .
- each of the PDs 131 1 , 131 2 and 131 3 is made up of N blocks of storage space, wherein N is a positive integer.
- the RAID controller 120 initially allocates M blocks of storage space in each of the PDs 131 1 , 131 2 and 131 3 for use by the FS 150 , where M is a positive integer that is greater than 0 and less than N.
- the RAID controller 120 causes the data and parity to be stored in the M blocks that were initially allocated for use.
- When the M blocks of storage space are close to being full (e.g., 90% full), the RAID controller 120 allocates additional space comprising P blocks of storage space in the PDs 131 1 , 131 2 and 131 3 , where P is a positive integer that is less than N and that is less than, equal to or greater than M. Typically, P will be less than or equal to M.
- the rebuild process can be performed in less time due to the fact that the data is more confined as opposed to being spread around throughout the array 130 .
- If the second aspect of the invention is combined with the first aspect of the invention such that data that is written by the OS 140 and subsequently freed by the OS 140 is not rebuilt by the RAID controller 120, the amount of time that is required to perform the rebuild process can be even further reduced.
- FIG. 5 illustrates a flowchart that represents the method performed by the RAID system 100 shown in FIG. 2 to prevent data sprawl.
- the RAID controller 120 initially allocates M blocks of storage space in each of the PDs 131 as available for use by the FS 150 , as indicated by block 301 .
- the RAID controller 120 causes the data and parity to be stored in addresses within the M blocks that were initially allocated for use, as indicated by block 303 .
- a determination is then made as to whether X percentage of the allocated space has been filled, where X is an integer that is less than or equal to 100 , as indicated by block 305 .
- Alternatively, X could correspond to a percentage of the allocated space that is unfilled, e.g., 10%. If the query of block 305 is answered in the affirmative, then the process proceeds to block 307, at which the RAID controller 120 allocates P blocks of additional storage space in the PDs 131, assuming there is storage space remaining in the PDs 131. The process then returns to block 303.
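The loop of blocks 301 through 307 can be sketched as follows. This is a simplified model under stated assumptions: it tracks a single PD, fills blocks strictly in ascending order, and treats X as a percentage of allocated space that has been filled. The class name `SprawlLimitingAllocator` and its parameter names are hypothetical.

```python
class SprawlLimitingAllocator:
    """Illustrative model of threshold-driven, contiguous allocation."""

    def __init__(self, n_blocks, m_initial, p_increment, threshold_pct):
        if not 0 < m_initial < n_blocks:
            raise ValueError("M must be greater than 0 and less than N")
        self.n_blocks = n_blocks            # N: capacity of the PD in blocks
        self.p_increment = p_increment      # P: blocks added per extension
        self.threshold_pct = threshold_pct  # X: fill percentage that triggers growth
        self.allocated = m_initial          # block 301: initial allocation of M blocks
        self.filled = 0

    def write_block(self):
        """Store one block and return the index of the block used."""
        if self.filled >= self.allocated:
            raise RuntimeError("no allocated space left")
        block = self.filled  # block 303: lowest unfilled allocated block
        self.filled += 1
        # Block 305: has X percent of the allocated space been filled?
        if self.filled * 100 >= self.threshold_pct * self.allocated:
            # Block 307: extend the allocation by P contiguous blocks,
            # limited by the space remaining on the PD.
            self.allocated = min(self.n_blocks, self.allocated + self.p_increment)
        return block
```

With N = 100, M = 20, P = 10 and X = 90, the eighteenth write reaches the 90% threshold and the allocation grows from 20 to 30 blocks, so data stays confined to the low end of the PD.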
- Although each PD 131 is described above as having N blocks of storage capacity, the PDs 131 may not always be identical types of storage devices and/or may not always have the same storage capacity.
- Although block 305 is described as using a threshold percentage, X, the threshold could instead be a particular number of blocks filled or a particular number of blocks remaining unfilled. For example, rather than determining when a particular percentage of the allocated space has been filled or remains unfilled, the process may determine when X of the allocated blocks have been filled or remain unfilled.
- Although the M blocks that are initially allocated and the P blocks that are subsequently allocated are typically contiguous portions of the array 130, this is not a requirement of the invention. Although using contiguous portions of the array 130 reduces disk seek times, thereby improving performance and reducing rebuild time, the allocated portions may be noncontiguous while still achieving a reduction in disk seek times and rebuild time.
- Although the methods represented by the flowcharts of FIGS. 3 and 5 may be performed separately or together, performing them together further reduces the amount of data that is required to be rebuilt and the amount of time that is spent performing a rebuild process.
- the method represented by the flowchart of FIG. 5 is typically implemented in a computer software program that is stored in the local memory 120 b of the RAID controller 120 and executed by the processor 120 a of the RAID controller 120 .
- the method represented by the flowchart of FIG. 5 may, however, be implemented in hardware or in a combination of hardware and software and/or firmware, as will be understood by those skilled in the art in view of the description being provided herein.
Abstract
Description
- The present application is a divisional application of, and claims the benefit of the filing date of, U.S. application Ser. No. 13/037,895, entitled “A REDUNDANT ARRAY OF INEXPENSIVE DISKS (RAID) SYSTEM CONFIGURED TO REDUCE REBUILD TIME AND TO PREVENT DATA SPRAWL,” which has been allowed.
- The invention relates generally to storage systems that implement Redundant Array of Inexpensive Disks (RAID) technology. More particularly, the invention relates to a RAID system that is capable of performing rebuild processes in a reduced amount of time and that is capable of reducing or preventing data sprawl.
- A storage array or disk array is a data storage device that includes multiple disk drives or similar persistent storage units. A storage array can allow large amounts of data to be stored in an efficient manner. A storage array also can provide redundancy to promote reliability, as in the case of a RAID system. In general, RAID systems simultaneously use two or more hard disk drives, referred to herein as physical disk drives (PDs), to achieve greater levels of performance, reliability and/or larger data volume sizes. The phrase “RAID” is generally used to describe computer data storage schemes that divide and replicate data among multiple PDs. In RAID systems, one or more PDs are set up as a RAID virtual disk drive (VD). In a RAID VD, data might be distributed across multiple PDs, but the VD is seen by the user and by the operating system of the computer as a single disk. The VD is “virtual” in that storage space in the VD maps to the physical storage space in the PDs, but the VD usually does not itself represent a single physical storage device.
- Although a variety of different RAID system designs exist, all have two key design goals, namely: (1) to increase data reliability and (2) to increase input/output (I/O) performance. RAID has seven basic levels corresponding to different system designs. The seven basic RAID levels are typically referred to as RAID levels 0-6.
- RAID level 5 uses striping in combination with distributed parity. The term “striping” means that logically sequential data, such as a single data file, is fragmented and assigned to multiple PDs in a round-robin fashion. Thus, the data is said to be “striped” over multiple PDs when the data is written. The term “distributed parity” means that the parity bits that are calculated for each strip of data are distributed over all of the PDs rather than being stored on one or more dedicated parity PDs. Striping improves performance because the data fragments that make up each data stripe are written in parallel to different PDs and read in parallel from the different PDs. Distributing the parity bits also improves performance in that the parity bits associated with different data stripes can be written in parallel to different PDs using parallel write operations as opposed to having to use sequential write operations to a dedicated parity PD.
- In order to implement distributed parity, all but one of the PDs must be present for the system to operate. Failure of any one of the PDs necessitates replacement of the PD, but does not cause the system to fail. Upon failure of one of the PDs, the data and parity that was on the failed PD can be rebuilt by using the data and parity stored on the other PDs to reconstruct the data and parity that was stored on the failed PD.
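The striping and parity reconstruction described above can be illustrated with a minimal sketch. It assumes byte-wise XOR parity and a simple rotating parity position (`stripe_index % num_pds`); real RAID 5 layouts vary, and the function names here are hypothetical, not taken from any particular controller.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripe(data_strips, stripe_index, num_pds):
    """Lay out one stripe across num_pds disks.

    The parity strip is the XOR of the data strips. Its position rotates
    with the stripe index (an assumed layout) so that parity is spread
    over all PDs instead of living on a dedicated parity PD.
    """
    assert len(data_strips) == num_pds - 1
    parity_pd = stripe_index % num_pds
    strips = list(data_strips)
    strips.insert(parity_pd, xor_blocks(data_strips))
    return strips  # strips[i] is what PD i stores for this stripe


def rebuild_strip(surviving_strips):
    """Recompute the strip of a failed PD from the strips on the other PDs."""
    return xor_blocks(surviving_strips)


# One stripe over four PDs; PD 1 holds parity for stripe 1 in this layout.
stripe = write_stripe([b"AAAA", b"BBBB", b"CCCC"], stripe_index=1, num_pds=4)
failed_pd = 2
survivors = [s for i, s in enumerate(stripe) if i != failed_pd]
recovered = rebuild_strip(survivors)
```

Because XOR is its own inverse, the same routine recovers either a data strip or a parity strip, which is why any single PD can be lost without losing data.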
- In order to demonstrate the manner in which a rebuild process is typically performed, the manner in which a known RAID system typically operates will be described with reference to FIG. 1. FIG. 1 illustrates a block diagram of a known RAID system 2 comprising a computer 3, a RAID controller 4, and an array 5 of PDs 6. When the computer 3 has data to write, an OS 7 of the computer 3 generates a write command, which is received by a file system (FS) 8 of the OS 7. The FS 8 then issues an input/output (IO) command to the RAID controller 4. The IO command contains the data to be written and virtual memory addresses where the data is currently located in a virtual memory 9. A RAID processor 4 a of the RAID controller 4 receives the IO command and then maps the virtual memory addresses to physical addresses in one or more of the PDs 6. The RAID processor 4 a maintains a table of the virtual-to-physical address mapping in a local memory device 4 b of the RAID controller 4. The RAID controller 4 then causes the data to be written to the physical addresses in one or more of the PDs 6.
- If one of the PDs 6 fails, the failed PD 6 is rebuilt by reading all of the stripes from the PDs 6 other than the failed PD 6, computing the data and parity of the failed PD 6 from all of the stripes read from the other PDs 6, and writing the computed data and parity to a replacement PD. The main issues associated with this rebuild technique are that it (1) takes a very long time to perform, (2) consumes a large amount of resources, and (3) detrimentally impacts system performance during the rebuild process. In addition, while the rebuild process is ongoing, the RAID system 2 is at a lower level of protection or is without protection from data integrity risks in the event that another of the PDs 6 fails. Rebuilds can take days or weeks, and the performance of the RAID system 2 is detrimentally impacted during that time period.
- In addition, as technological improvements in storage devices are made, their storage capacity greatly increases over time. For example, for some types of storage devices, storage capacity doubles every eighteen months or so. These increases in storage capacity mean that, in the event that one of the PDs fails, an even larger number of stripes are used to compute the new data and parity, which results in an even larger number of computations. Consequently, the amount of time that is required to perform the rebuild is further increased. Interestingly, a large part of the failed PD 6 is typically unused, but because this is not known to the RAID controller 4, it has no other option but to rebuild the failed PD 6 in its entirety.
- One technique that has been used to reduce the amount of data and parity that has to be computed during a rebuild involves only rebuilding “used” portions of the failed PD 6. A portion of a PD 6 is considered “used” if it has been written with data. With this technique, the RAID controller 4 of the RAID system 2 marks zones on the PDs 6 that have been written so that it is able to distinguish between zones that have been written and zones that have not been written. If a PD 6 subsequently fails, new data and parity are only computed for zones in the failed PD 6 that were marked as written at the time of the failure.
- This technique has several disadvantages. One drawback is that the FS 8 often moves data around, which causes the same data to be stored in different zones of the PDs 6 at different times. The OS 7 may subsequently free data, but although the FS 8 is aware that the data has been freed, the RAID controller 4 is not made aware that the data has been freed. Therefore, the RAID controller 4 continues to consider the zone in the PD 6 in which the freed data is stored as “used”. Consequently, any zone in the failed PD 6 that was “touched” (i.e., written) at any point in time will be rebuilt. This results in more data being rebuilt than is necessary, and the process tends to be degenerative over time. Another disadvantage of this technique is that services and applications exist that by their nature use inordinate amounts of space on PDs 6 temporarily and then free the data. Again, while the FS 8 is aware that the data has been freed, the RAID controller 4 is not, and so any zones in the failed PD 6 that were “touched” are considered “used” and therefore will be rebuilt. Consequently, much more data and parity are rebuilt than is necessary.
- Yet another drawback of this technique results from the manner in which FSs typically operate. FSs are typically designed such that when making a choice between writing data to space that has never been written and writing data to space that has been written and subsequently freed, they choose to write data to space that has never been written. This results in “data sprawl” in that data gets written to more areas in the PDs than is necessary. Even if the data is subsequently freed, the RAID controller is unaware that the data has been freed and considers the corresponding zones in the PDs as used. Consequently, if a PD fails, any zones that were previously written, even if subsequently freed, will be rebuilt, which results in more data being rebuilt than is necessary. In addition, data sprawl can also result in only a small portion of a zone actually being used while other portions of the same zone are unused. When the zone is rebuilt, both the used and unused portions of the zone are rebuilt. Again, this results in more data being rebuilt than is necessary.
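The weakness of the zone-marking technique described above can be shown with a small sketch. The `ZoneTracker` class and its fields are hypothetical: `written` models what the RAID controller can see (a zone was touched at some point), while `live_blocks` models what only the FS knows (whether the zone still holds data that has not been freed).

```python
class ZoneTracker:
    """Toy model of write-marked zones on one PD (illustrative assumption)."""

    def __init__(self, num_zones):
        self.written = [False] * num_zones   # controller's view: ever written
        self.live_blocks = [0] * num_zones   # FS's view: blocks not yet freed

    def write(self, zone):
        self.written[zone] = True
        self.live_blocks[zone] += 1

    def free(self, zone):
        # The FS frees the data, but nothing tells the controller,
        # so the "written" mark is never cleared.
        self.live_blocks[zone] -= 1

    def zones_rebuilt(self):
        """Zones the known technique would rebuild after a failure."""
        return [z for z, touched in enumerate(self.written) if touched]

    def zones_in_use(self):
        """Zones that actually hold live data."""
        return [z for z, n in enumerate(self.live_blocks) if n > 0]


tracker = ZoneTracker(num_zones=8)
for zone in range(5):        # data sprawls over zones 0..4
    tracker.write(zone)
for zone in (1, 2, 3):       # most of it is later freed
    tracker.free(zone)

rebuilt = tracker.zones_rebuilt()   # every zone ever touched
in_use = tracker.zones_in_use()     # only the zones with live data
```

In this toy run five zones would be rebuilt although only two hold live data, and the gap can only grow as more zones are touched over time.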
- Accordingly, a need exists for a way to reduce the amount of time that is required to perform a rebuild process in a RAID system. A need also exists for a way to reduce the amount of data that needs to be rebuilt when performing a rebuild in a RAID system. A need also exists for a way to prevent data sprawl in a RAID system.
- The invention is directed to a RAID system, method and controller for reducing the amount of time that is required to perform a rebuild process. The invention is also directed to computer-readable mediums (CRMs) having computer instructions stored thereon for reducing rebuild time. The RAID system comprises a computer, an array of PDs, and a RAID controller interfaced with the computer and with the array of PDs. The computer comprises at least a first processor and a first local storage system. An OS of the computer runs on the first processor and uses the first local storage system. A file system (FS) running on the computer uses a portion of the first local storage system as virtual memory and maintains a virtual memory table in the first local storage system.
- The virtual memory table comprises at least entries identifying addresses in the virtual memory that are currently being used by the FS. Virtual memory addresses are currently being used by the FS if they have been written by the FS and have not been freed by the FS subsequent to being written by the FS. The OS causes the virtual memory addresses that have been written by the FS and the corresponding data to be output from the computer. In the event that a rebuild process is to be performed for one of the PDs, the RAID controller identifies to a filter driver running on the computer one or more virtual memory disks in the virtual memory that contain the data that is stored in the PD for which the rebuild process is being performed. The filter driver scans a portion of the virtual memory table corresponding to the identified virtual memory disk or disks and identifies used virtual memory addresses. The filter driver then causes the used virtual memory addresses to be output to the RAID controller. The RAID controller translates the used virtual memory addresses into used physical addresses and causes data and parity to be reconstructed for the used physical addresses.
- The method for performing a rebuild in a RAID system comprises the following: in the event that a rebuild process is to be performed for one of the PDs, outputting from the RAID controller to a filter driver running on a computer of the RAID system identifiers of one or more virtual memory disks of a virtual memory of the computer. The identified virtual memory addresses contain data corresponding to data stored in the physical memory addresses of the PD or PDs for which the rebuild process is being performed; with the filter driver, receiving the identifiers in the computer and scanning a virtual memory table of the OS of the computer to identify used virtual memory addresses associated with the identifiers; with the filter driver, causing the used virtual memory addresses to be output from the computer to the RAID controller; in the RAID controller, translating the used virtual memory addresses into used physical memory addresses of the PD for which the rebuild process is being performed, and in the RAID controller, rebuilding data and parity for the used physical addresses of the PD for which the rebuild process is being performed.
- A RAID controller for performing a rebuild in a RAID system comprises at least an interface for interfacing with the computer and an array of PDs of the RAID system and a RAID processor for performing the rebuild. The RAID controller receives virtual memory addresses and corresponding data. The virtual memory addresses correspond to entries in a virtual memory table maintained by an FS of the computer of the RAID system. The entries identify virtual memory addresses in a virtual memory of a first local storage system of the computer that have been written by the FS and that have not been freed by the FS subsequent to being written by the FS. The RAID processor translates the virtual memory addresses into physical memory addresses in one or more of the PDs and writes the corresponding data to the corresponding physical memory addresses.
- In the event that a rebuild is to be performed for one or more of the PDs, the RAID processor identifies, via the interface, to a filter driver of the computer one or more virtual memory disks in the virtual memory that correspond to the physical memory addresses of one or more of the PDs for which the rebuild process is to be performed. The RAID processor receives, via the interface, virtual memory addresses identified by the filter driver as currently used virtual memory addresses and translates the currently used virtual memory addresses into currently used physical addresses and causes data and parity to be reconstructed for the currently used physical addresses.
- The CRM comprises one or more computer software programs for performing a rebuild in a RAID system. The computer software program or programs comprises a first code segment for execution by the RAID controller, a second code segment for execution by the computer, and third and fourth code segments for execution by the RAID controller.
- In the event that a rebuild process is to be performed for one or more of the PDs, the first code segment causes identifiers of one or more virtual memory disks of a virtual memory that correspond to physical memory addresses of the PD or PDs for which the rebuild process is to be performed to be output from the RAID controller. The second code segment is a filter driver code segment that receives the identifiers output from the RAID controller and scans the virtual memory table of the OS of the computer of the RAID system to identify used virtual memory addresses associated with the identifiers. Used virtual memory addresses are virtual memory addresses that have been written by an FS of the computer and that have not been freed by the FS subsequent to being written by the FS. The filter driver code segment causes the used virtual memory addresses to be output from the computer to the RAID controller. The third code segment translates the used virtual memory addresses into used physical memory addresses of the PD or PDs for which the rebuild process is being performed. The fourth code segment then reconstructs data and parity for the used physical addresses of the PD for which the rebuild process is being performed.
- These and other features and advantages of the invention will become apparent from the following description, drawings and claims.
- FIG. 1 illustrates a block diagram of a known RAID system.
- FIG. 2 illustrates a block diagram of a RAID system in accordance with an illustrative embodiment configured to reduce the amount of time that is required to perform a rebuild process and to reduce the amount of data and parity that have to be rebuilt during the rebuild process.
- FIG. 3 illustrates a flowchart that demonstrates the rebuild process in accordance with an illustrative embodiment.
- FIG. 4 illustrates the array of PDs shown in FIG. 2 and demonstrates the manner in which data sprawl is reduced or prevented in accordance with an illustrative embodiment.
- FIG. 5 illustrates a flowchart that represents the method performed by the RAID system shown in FIG. 2 to prevent data sprawl.
- In accordance with a first aspect of the invention, a filter driver is provided in the OS of the computer of the RAID system that, in the event that one of the PDs is to be rebuilt, scans the virtual memory table of the computer to identify virtual memory addresses that are used and communicates the identified virtual memory addresses to the RAID controller. The RAID controller translates the identified virtual memory addresses into physical addresses of the PD being rebuilt. The RAID controller then rebuilds data and parity only for physical addresses in the PD that are associated with the virtual memory addresses identified by the filter driver. This reduces the amount of data and parity that are rebuilt during a rebuild process and reduces the amount of time that is required to perform the rebuild process.
- In accordance with a second aspect of the invention, data is stored in the PDs in a way that limits data sprawl. By limiting data sprawl, the number of addresses in the PDs containing data and parity that have to be rebuilt is reduced, thereby reducing the amount of time that is required to perform the rebuild process. The first and second aspects of the invention may be employed together or separately.
- The terms “rebuild,” “rebuilding,” “rebuilding process,” and the like, as those terms are used herein, are intended to denote the known process of reconstructing data and parity when a PD is being replaced, either due to its failure or for any other reason, such as to upgrade the RAID system. As is known in the art, data and parity associated with addresses in a PD being replaced are computed using data and parity stored in the other PDs, typically by exclusively ORing the data and parity from the other PDs using known equations. Therefore, in the interest of brevity, the manner in which these computations are performed will not be described herein. Embodiments of the invention use these known computational methods to reconstruct data and parity, but reduce the amount of data and parity that have to be reconstructed, and therefore reduce the amount of time that is required to rebuild the PD being replaced. Illustrative, or exemplary, embodiments of the first aspect of the invention will now be described with reference to FIGS. 2-3.
- FIG. 2 illustrates a block diagram of a RAID system 100 in accordance with an illustrative embodiment configured to reduce the amount of time that is required to perform a rebuild process and to reduce the amount of data and parity that have to be rebuilt during the rebuild process. The RAID system 100 includes a computer 110, a RAID controller 120, and an array 130 of PDs 131. The computer 110 may be any type of computer, but it is typically a server. The computer 110 includes an OS 140 having an FS 150, a virtual memory 160, a virtual memory table 170, and a filter driver 200. Although the filter driver 200 is depicted as being separate from the FS 150, the filter driver 200 may be part of the FS 150.
- The OS 140, the FS 150, and the filter driver 200 are typically implemented as computer software programs that reside in a local storage system 210 of the computer 110 and that are executed by at least one processor 220 of the computer 110. The local storage system 210 typically comprises at least one hard disk drive (HDD) (not shown) and at least one solid state memory device (not shown). The virtual memory 160 and the virtual memory table 170 reside in the local storage system 210 of the computer 110.
- When the computer 110 has data to write, the OS 140 generates a write command, which is received by the FS 150. The FS 150 then writes the data to addresses in the virtual memory 160 and creates entries in the virtual memory table 170 that indicate where the data is stored in the virtual memory 160. The FS 150 then issues an IO command to the RAID controller 120. The IO command contains the data to be written and the virtual memory addresses where the data is currently located in the virtual memory 160. A RAID processor 120 a of the RAID controller 120 receives the IO command and then maps the virtual memory addresses to physical memory addresses in one or more of the PDs 131 of the array 130. The RAID processor 120 a maintains a mapping table of the virtual-to-physical address mapping in a local memory device 120 b of the RAID controller 120. Alternatively, the mapping table could be stored in an external memory device (not shown) that is accessible by the RAID processor 120 a. The RAID controller 120 then causes the data to be written to the physical addresses in one or more of the PDs 131.
- As indicated above with reference to FIG. 1, one of the problems with the known rebuild technique results from the fact that when the OS 7 frees data, the RAID controller 4 is unaware that the data has been freed. Therefore, the RAID controller 4 does not know to free the corresponding data in the PDs 6. As a result, the corresponding physical addresses in the PDs 6 are considered by the RAID controller 4 to be used, i.e., to contain valid data. Consequently, if one of the PDs 6 fails, any addresses in the failed PD 6 that were written at any point in time are rebuilt, even if those addresses contain data that has been freed in the virtual memory 9 by the OS 7.
- In contrast to the known RAID system and rebuild technique, in accordance with embodiments of the invention, if a rebuild is to be performed for one of the PDs 131, the filter driver 200 identifies used virtual memory addresses in the virtual memory 160 that correspond to physical addresses in the PD 131 being rebuilt. Virtual memory addresses that are used are those which have been written by the FS 150 and not subsequently freed by the FS 150. The filter driver 200 then causes the used virtual memory addresses to be communicated to the RAID controller 120. The RAID processor 120 a translates the used virtual memory addresses into their corresponding physical addresses in the PD 131 being rebuilt. The RAID controller 120 then rebuilds data and parity only for the physical addresses in the PD 131 that correspond to the used virtual memory addresses identified by the filter driver 200.
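The rebuild flow described above can be sketched as three small steps: the filter driver scans the virtual memory table for used addresses, the RAID controller translates them through its mapping table, and only the resulting physical addresses are reconstructed. Everything below is an illustrative assumption rather than the patented implementation: the table shapes, the function names, the use of integers as blocks, and the simplification that every PD holds one block per physical address with the blocks at a given address XORing to zero.

```python
from functools import reduce


def scan_used(vm_table, vd_ids):
    """Filter-driver step: virtual addresses written and not since freed."""
    return [key for key, used in vm_table.items() if used and key[0] in vd_ids]


def translate(mapping, used_virtual, failed_pd):
    """Controller step: used virtual addresses -> physical addresses on the failed PD."""
    return sorted({
        phys
        for key in used_virtual
        for pd, phys in [mapping[key]]
        if pd == failed_pd
    })


def rebuild(pds, failed_pd, phys_addrs):
    """Controller step: XOR the surviving PDs, but only at the used addresses."""
    replacement = {}
    for addr in phys_addrs:
        survivors = [pd[addr] for i, pd in enumerate(pds) if i != failed_pd]
        replacement[addr] = reduce(lambda a, b: a ^ b, survivors)
    return replacement


# Three PDs, four addresses each; PD 2 holds the XOR of PD 0 and PD 1.
pds = [[1, 2, 3, 4], [5, 6, 7, 8], [1 ^ 5, 2 ^ 6, 3 ^ 7, 4 ^ 8]]
failed_pd = 1

# Virtual memory table: (virtual disk, virtual address) -> still in use?
vm_table = {("vd0", 0): True, ("vd0", 1): False, ("vd0", 2): True}
# Controller mapping table: virtual address -> (PD index, physical address).
mapping = {("vd0", 0): (1, 0), ("vd0", 1): (1, 1), ("vd0", 2): (0, 2)}

used_virtual = scan_used(vm_table, vd_ids={"vd0"})
used_physical = translate(mapping, used_virtual, failed_pd)
rebuilt = rebuild(pds, failed_pd, used_physical)
```

In this toy run only physical address 0 of the failed PD is reconstructed: address 1 was freed by the FS, and the remaining used data lives on a different PD. A fuller model would also rebuild the parity blocks that the failed PD held for stripes containing used data.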
- FIG. 3 illustrates a flowchart that demonstrates an example of a method for determining which data needs to be rebuilt and for rebuilding the data. When one of the PDs 131 is to be rebuilt, the RAID controller 120 identifies to the filter driver 200 the virtual memory disk or disks that correspond to the PD 131 being rebuilt, as indicated by block 201. The filter driver 200 then scans the corresponding portion of the virtual memory table 170 and identifies the used virtual memory addresses, as indicated by block 203. The filter driver 200 then causes the used virtual memory addresses to be output to the RAID controller 120, as indicated by block 205. The RAID controller 120 translates the used virtual memory addresses into physical memory addresses in the PD 131, as indicated by block 207. The RAID controller 120 then reconstructs data and parity only for those physical addresses of the PD 131 being rebuilt, as indicated by block 209.
- It can be seen from the above description of FIG. 3 that the method is partially performed by the filter driver 200 of the OS 140 and partially by the RAID controller 120. Both parts are typically implemented as computer software programs. The computer software program corresponding to the filter driver 200 is executed by the processor 220 of the computer 110 running the OS 140. The computer software program performed by the RAID controller 120 is executed by the processor 120 a of the RAID controller 120. The computer software programs are typically stored on two separate computer-readable mediums (CRMs), one of which typically resides in the local storage system 210 of the computer 110 and the other of which typically resides in the local memory element 120 b of the RAID controller 120. Any type of CRM may be used for this purpose including solid state memory devices, magnetic memory devices and optical memory devices. Solid state memory devices that are suitable for this purpose include, for example, Random Access Memory (RAM) devices, Read-Only Memory (ROM) devices, programmable ROM (PROM) devices, erasable PROM (EPROM) devices, and flash memory devices. It should be noted, however, that the method could be performed in hardware or in a combination of hardware and software or firmware.
- Illustrative, or exemplary, embodiments of the aforementioned second aspect of the invention will now be described with reference to FIGS. 4 and 5. As mentioned above, the second aspect of the invention relates to reducing data sprawl in a RAID system so that in the event that a rebuild has to be performed, the amount of data that has to be rebuilt and the amount of time that is required to perform the rebuild process are reduced. With the known RAID system 2 shown in FIG. 1, the FS 8 typically causes data to be spread around the array 5 for the reasons described above, thereby resulting in data sprawl. In addition to increasing disk seek times, data sprawl increases the number of addresses in a failed PD 6 that have to be rebuilt because any address in the failed PD 6 that has been “touched”, i.e., written, is rebuilt, even if the corresponding data in virtual memory 9 was subsequently freed after being written. The manner in which data sprawl and its effects are prevented will now be described with reference to FIGS. 2, 4 and 5.
- With reference again to FIG. 2, the RAID controller 120 allocates storage space in the PDs 131 in a way that prevents data from being spread around the array 130. At initialization, the RAID controller 120 allocates less than all of the storage space of the array 130 for use by the OS 140. In addition, the space that is initially allocated comprises addresses that are typically contiguous, or at least substantially contiguous. For example, if the array 130 has an available storage capacity of 1 terabyte (TB), the RAID controller 120 may initially allocate 200 gigabytes (GBs), or about 20%. As the FS 150 of the OS 140 writes data, the RAID controller 120 writes the data to addresses in the PDs 131 of the array 130 that are in the initially allocated space, thereby confining the data and parity to particular portions of the array 130. As the initially allocated space becomes close to being filled, the RAID controller 120 allocates additional space in the array 130 that is contiguous or substantially contiguous with the initially allocated space. Thus, space allocated earlier in time is filled before space allocated later in time is filled, and the data is confined to the allocated space. In this way, data sprawl is prevented or at least reduced, which reduces disk seek times and the amount of time that is required to perform a rebuild in the event that one of the PDs 131 fails.
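The incremental allocation described above can be sketched as a small allocator that exposes only part of the array and grows the exposed region once it is nearly full. The class name, the growth increment, the 90% threshold default, and the use of one block per gigabyte are assumptions made for illustration; the text only requires that less than the full capacity be allocated initially and that more be allocated as the allocated space fills.

```python
class SprawlLimitingAllocator:
    """Toy model of threshold-driven, contiguous space allocation."""

    def __init__(self, capacity, initial, growth, threshold_pct=90):
        if not 0 < initial < capacity:
            raise ValueError("initial allocation must be less than capacity")
        self.capacity = capacity          # total blocks in the array
        self.allocated = initial          # blocks currently exposed for writes
        self.growth = growth              # blocks added per extension
        self.threshold_pct = threshold_pct
        self.filled = 0                   # blocks written so far

    def write_block(self):
        """Store one block at the lowest free address in the allocated space."""
        if self.filled >= self.allocated:
            raise RuntimeError("array is full")
        address = self.filled             # contiguous, lowest-first placement
        self.filled += 1
        # Once the allocated space is filled to the threshold, extend it
        # with the next contiguous region, capped at the array capacity.
        if self.filled * 100 >= self.threshold_pct * self.allocated:
            self.allocated = min(self.capacity, self.allocated + self.growth)
        return address


# 1 TB array modelled as 1000 one-GB blocks, 200 GB exposed initially.
alloc = SprawlLimitingAllocator(capacity=1000, initial=200, growth=200)
for _ in range(179):
    alloc.write_block()
allocated_before = alloc.allocated        # still the initial region
last_address = alloc.write_block()        # the 180th block crosses 90%
allocated_after = alloc.allocated         # region extended contiguously
```

Because writes always land in the lowest allocated region first, a rebuild in this model never needs to consider addresses beyond `alloc.allocated`.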
- FIG. 4 illustrates the array 130 of PDs 131 shown in FIG. 2 in accordance with an exemplary embodiment in which the array 130 is made up of three PDs 131. Each of the PDs 131 has N blocks of storage capacity. The RAID controller 120 initially allocates M blocks of storage space in each of the PDs 131 as available for use by the FS 150, where M is a positive integer that is less than N. As the FS 150 writes data by sending IO commands to the RAID controller 120, the RAID controller 120 causes the data and parity to be stored in the M blocks that were initially allocated for use. When the M blocks of storage space are close to being full (e.g., 90% full), the RAID controller 120 allocates additional space comprising P blocks of storage space in the PDs 131.
- Because of the manner in which storage space in the PDs 131 is allocated, disk seek times are reduced, which improves performance. In addition, in the event that one of the PDs 131 fails, the rebuild process can be performed in less time due to the fact that the data is more confined as opposed to being spread around throughout the array 130. In addition, if the second aspect of the invention is combined with the first aspect of the invention such that data that is written by the OS 140 and subsequently freed by the OS 140 is not rebuilt by the RAID controller 120, the amount of time that is required to perform the rebuild process can be even further reduced.
FIG. 5 illustrates a flowchart that represents the method performed by the RAID system 100 shown in FIG. 2 to prevent data sprawl. At the start, i.e., when the RAID system 100 is initialized, the RAID controller 120 initially allocates M blocks of storage space in each of the PDs 131 as available for use by the FS 150, as indicated by block 301. As the FS 150 writes data by sending IO commands to the RAID controller 120, the RAID controller 120 causes the data and parity to be stored in addresses within the M blocks that were initially allocated for use, as indicated by block 303. A determination is then made as to whether X percent of the allocated space has been filled, where X is an integer that is less than or equal to 100, as indicated by block 305. Alternatively, X could correspond to a percentage of the allocated space that is unfilled, e.g., 10%. If the query of block 305 is answered in the affirmative, then the process proceeds to block 307, at which the RAID controller 120 allocates P blocks of additional storage space in the PDs 131, assuming there is storage space remaining in the PDs 131. The process then returns to block 303.

It should be noted that many variations may be made to the process described above with reference to FIGS. 4 and 5. For example, although it is assumed in the illustrative embodiment that each PD 131 has N blocks of storage capacity, the PDs 131 may not always be identical types of storage devices and/or may not always have the same storage capacity. Also, while block 305 uses a threshold percentage, X, the threshold could instead be a particular number of blocks filled or a particular number of blocks remaining unfilled. For example, rather than determining when a particular percentage of the allocated space has been filled or remains unfilled, the process may determine when X of the allocated blocks have been filled or remain unfilled. In addition, while the M blocks that are initially allocated and the P blocks that are subsequently allocated are typically contiguous portions of the array 130, this is not a requirement of the invention. Although using contiguous portions of the array 130 reduces disk seek times, thereby improving performance and reducing rebuild time, the allocated portions may be noncontiguous while still achieving a reduction in disk seek times and rebuild time.

As indicated above, although the methods represented by the flowcharts of FIGS. 3 and 5 may be performed separately or together, performing them together further reduces the amount of data that is required to be rebuilt and the amount of time that is spent performing a rebuild process. The method represented by the flowchart of FIG. 5 is typically implemented in a computer software program that is stored in the local memory 120 b of the RAID controller 120 and executed by the processor 120 a of the RAID controller 120. The method represented by the flowchart of FIG. 5 may, however, be implemented in hardware or in a combination of hardware and software and/or firmware, as will be understood by those skilled in the art in view of the description being provided herein.

It should be noted that the invention has been described herein with reference to a few illustrative embodiments for the purposes of describing the principles and concepts of the invention. The invention is not limited to the embodiments described herein, as will be understood by persons skilled in the art in view of the description provided herein. Modifications may be made to the embodiments described herein and all such modifications are within the scope of the invention, as will be understood by persons skilled in the art in view of the description provided herein.
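For illustration only, the allocation loop of FIG. 5 (blocks 301 through 307) might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the controller software the description refers to: the class name `SprawlLimiter`, its field names, and the per-PD block accounting are hypothetical, every PD is assumed to have the same capacity N, and X is treated as a "percentage filled" threshold.

```python
from dataclasses import dataclass, field


@dataclass
class SprawlLimiter:
    """Hypothetical per-PD bookkeeping for the FIG. 5 allocation loop."""

    capacity_blocks: int   # N: total blocks on each PD (assumed identical PDs)
    initial_blocks: int    # M: blocks made available at initialization (block 301)
    grow_blocks: int       # P: blocks added on each expansion (block 307)
    threshold_pct: int     # X: percent-filled threshold, 0 < X <= 100 (block 305)
    allocated: int = field(init=False, default=0)
    used: int = field(init=False, default=0)

    def __post_init__(self) -> None:
        # Block 301: expose only the first M blocks, never more than N.
        self.allocated = min(self.initial_blocks, self.capacity_blocks)

    def write(self, blocks: int) -> None:
        """Block 303: store data within the allocated region, then check X."""
        if self.used + blocks > self.allocated:
            raise ValueError("write would fall outside the allocated blocks")
        self.used += blocks
        self._expand_if_needed()

    def _expand_if_needed(self) -> None:
        # Block 305: has X percent of the allocated space been filled?
        filled = self.used * 100 >= self.threshold_pct * self.allocated
        # Block 307: allocate P more blocks, if any space remains on the PD.
        if filled and self.allocated < self.capacity_blocks:
            self.allocated = min(
                self.allocated + self.grow_blocks, self.capacity_blocks
            )


if __name__ == "__main__":
    limiter = SprawlLimiter(
        capacity_blocks=1000, initial_blocks=100, grow_blocks=50, threshold_pct=90
    )
    limiter.write(89)
    print(limiter.allocated)  # still the initial M blocks
    limiter.write(1)
    print(limiter.allocated)  # threshold reached, so P more blocks were added
```

Because the file system can only write inside the allocated region, a rebuild in this sketch would need to cover at most `allocated` blocks per PD rather than all N. The block-count variation mentioned above could be modeled by comparing `used` against a fixed number of blocks instead of a percentage.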
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/445,145 US20140337578A1 (en) | 2011-03-01 | 2014-07-29 | Redundant array of inexpensive disks (raid) system configured to reduce rebuild time and to prevent data sprawl |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/037,895 US8825950B2 (en) | 2011-03-01 | 2011-03-01 | Redundant array of inexpensive disks (RAID) system configured to reduce rebuild time and to prevent data sprawl |
US14/445,145 US20140337578A1 (en) | 2011-03-01 | 2014-07-29 | Redundant array of inexpensive disks (raid) system configured to reduce rebuild time and to prevent data sprawl |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/037,895 Division US8825950B2 (en) | 2011-03-01 | 2011-03-01 | Redundant array of inexpensive disks (RAID) system configured to reduce rebuild time and to prevent data sprawl |
Publications (1)
Publication Number | Publication Date |
---|---|
US20140337578A1 (en) | 2014-11-13 |
Family
ID=46754025
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/037,895 Expired - Fee Related US8825950B2 (en) | 2011-03-01 | 2011-03-01 | Redundant array of inexpensive disks (RAID) system configured to reduce rebuild time and to prevent data sprawl |
US14/445,145 Abandoned US20140337578A1 (en) | 2011-03-01 | 2014-07-29 | Redundant array of inexpensive disks (raid) system configured to reduce rebuild time and to prevent data sprawl |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/037,895 Expired - Fee Related US8825950B2 (en) | 2011-03-01 | 2011-03-01 | Redundant array of inexpensive disks (RAID) system configured to reduce rebuild time and to prevent data sprawl |
Country Status (1)
Country | Link |
---|---|
US (2) | US8825950B2 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105260132A (en) * | 2015-09-18 | 2016-01-20 | 久盈世纪(北京)科技有限公司 | Method and device for hot loading disk filter drive |
US11625193B2 (en) | 2020-07-10 | 2023-04-11 | Samsung Electronics Co., Ltd. | RAID storage device, host, and RAID system |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8527698B2 (en) * | 2010-06-22 | 2013-09-03 | Lsi Corporation | Parity-based raid system configured to protect against data corruption caused by the occurrence of write holes |
US9891993B2 (en) | 2014-05-23 | 2018-02-13 | International Business Machines Corporation | Managing raid parity stripe contention |
KR102580123B1 (en) | 2016-05-03 | 2023-09-20 | 삼성전자주식회사 | Raid storage device and management method thereof |
CN108733314B (en) * | 2017-04-17 | 2021-06-29 | 伊姆西Ip控股有限责任公司 | Method, apparatus, and computer-readable storage medium for Redundant Array of Independent (RAID) reconstruction |
US10459807B2 (en) * | 2017-05-23 | 2019-10-29 | International Business Machines Corporation | Determining modified portions of a RAID storage array |
US10372561B1 (en) * | 2017-06-12 | 2019-08-06 | Amazon Technologies, Inc. | Block storage relocation on failure |
US20190317889A1 (en) * | 2018-04-15 | 2019-10-17 | Synology Inc. | Apparatuses and methods and computer program products for a redundant array of independent disk (raid) reconstruction |
US11269562B2 (en) * | 2019-01-29 | 2022-03-08 | EMC IP Holding Company, LLC | System and method for content aware disk extent movement in raid |
US20230409245A1 (en) * | 2022-06-21 | 2023-12-21 | Samsung Electronics Co., Ltd. | Method and system for solid state drive (ssd)-based redundant array of independent disks (raid) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110107028A1 (en) * | 2008-07-07 | 2011-05-05 | Louis James L | Dynamically Expanding Storage Capacity of a Storage Volume |
US8209587B1 (en) * | 2007-04-12 | 2012-06-26 | Netapp, Inc. | System and method for eliminating zeroing of disk drives in RAID arrays |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7490270B2 (en) * | 2004-11-09 | 2009-02-10 | Dell Products L.P. | Method, system, and software for rebuilding a storage drive |
US20090271659A1 (en) | 2008-04-24 | 2009-10-29 | Ulf Troppens | Raid rebuild using file system and block list |
JP2010033261A (en) * | 2008-07-28 | 2010-02-12 | Hitachi Ltd | Storage device and control method |
Also Published As
Publication number | Publication date |
---|---|
US20120226853A1 (en) | 2012-09-06 |
US8825950B2 (en) | 2014-09-02 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US8825950B2 (en) | Redundant array of inexpensive disks (RAID) system configured to reduce rebuild time and to prevent data sprawl | |
US10606491B2 (en) | Providing redundancy in a virtualized storage system for a computer system | |
US10140041B1 (en) | Mapped RAID (redundant array of independent disks) in a data storage system with RAID extent sub-groups that are used to perform drive extent allocation and data striping for sequential data accesses to a storage object | |
US9378093B2 (en) | Controlling data storage in an array of storage devices | |
US10459814B2 (en) | Drive extent based end of life detection and proactive copying in a mapped RAID (redundant array of independent disks) data storage system | |
US6898668B2 (en) | System and method for reorganizing data in a raid storage system | |
US9846544B1 (en) | Managing storage space in storage systems | |
US10073621B1 (en) | Managing storage device mappings in storage systems | |
US8977894B2 (en) | Operating a data storage system | |
US9990263B1 (en) | Efficient use of spare device(s) associated with a group of devices | |
US20100306466A1 (en) | Method for improving disk availability and disk array controller | |
US8041891B2 (en) | Method and system for performing RAID level migration | |
US10678641B2 (en) | Techniques for optimizing metadata resiliency and performance | |
US20050091452A1 (en) | System and method for reducing data loss in disk arrays by establishing data redundancy on demand | |
US11449402B2 (en) | Handling of offline storage disk | |
CN111124262B (en) | Method, apparatus and computer readable medium for managing Redundant Array of Independent Disks (RAID) | |
US10579540B2 (en) | Raid data migration through stripe swapping | |
KR20110087272A (en) | A loose coupling between raid volumes and drive groups for improved performance | |
US11256447B1 (en) | Multi-BCRC raid protection for CKD | |
CN110569000A (en) | Host RAID (redundant array of independent disk) management method and device based on solid state disk array | |
US20130179634A1 (en) | Systems and methods for idle time backup of storage system volumes | |
US8140752B2 (en) | Method of executing a background task and an array controller | |
US10853257B1 (en) | Zero detection within sub-track compression domains | |
US8935488B2 (en) | Storage system and storage control method | |
US11144445B1 (en) | Use of compression domains that are more granular than storage allocation units |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: LSI CORPORATION, CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: BERT, LUCA; REEL/FRAME: 033456/0266. Effective date: 20140725 |
AS | Assignment | Owner name: AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: LSI CORPORATION; REEL/FRAME: 035390/0388. Effective date: 20140814 |
AS | Assignment | Owner name: BANK OF AMERICA, N.A., AS COLLATERAL AGENT, NORTH CAROLINA. Free format text: PATENT SECURITY AGREEMENT; ASSIGNOR: AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD.; REEL/FRAME: 037808/0001. Effective date: 20160201 |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
AS | Assignment | Owner name: AVAGO TECHNOLOGIES GENERAL IP (SINGAPORE) PTE. LTD., SINGAPORE. Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENTS; ASSIGNOR: BANK OF AMERICA, N.A., AS COLLATERAL AGENT; REEL/FRAME: 041710/0001. Effective date: 20170119 |