WO2016194095A1 - Information processing system, storage apparatus, and storage device - Google Patents
Information processing system, storage apparatus, and storage device
- Publication number: WO2016194095A1 (PCT/JP2015/065719)
- Authority: WO — WIPO (PCT)
- Prior art keywords: data, storage, parity, storage device, old
Classifications
- G—PHYSICS; G06—COMPUTING; CALCULATING OR COUNTING; G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/1076—Parity data used in redundant arrays of independent storages, e.g. in RAID systems
- G06F12/0804—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with main memory updating
- G06F3/0613—Improving I/O performance in relation to throughput
- G06F3/0659—Command handling arrangements, e.g. command buffers, queues, command scheduling
- G06F3/0689—Disk arrays, e.g. RAID, JBOD
Definitions
- SSD Solid State Drive
- HDD Hard Disk Drive
- SSDs can be accessed faster than HDDs, so a storage apparatus can be sped up by installing SSDs as its storage devices.
- there are also storage devices that use a non-volatile semiconductor memory such as ReRAM (Resistance Random Access Memory) or PRAM (Phase Change Random Access Memory).
- ReRAM Resistance Random Access Memory
- PRAM Phase Change Random Access Memory
- the storage system uses RAID (Redundant Array of Independent (or Inexpensive) Disks) technology to make the system highly reliable.
- RAID manages a plurality of storage devices as a group (hereinafter referred to as a RAID group) and creates redundant data called parity from the data. Data and parity are stored in different storage devices in the storage system, and when a storage device fails, the data stored in the failed storage device can be restored from the data and parity stored in the other storage devices.
- RAID also has a method of achieving high reliability by duplicating data; that is, the same data as certain data is stored as redundant data in two different storage devices. In either method, every time data is written to a storage device, the redundant data must be updated, which reduces the processing performance of the storage controller. In recent years, there is a great need to analyze the large amounts of data called big data, and the data transfer load on the storage controller that performs the transfers is increasing.
- the load on the storage controller is reduced by the technology described below.
- the storage controller transfers the new data received from the host computer to the first storage device in which the old data is stored, and the first storage device generates an intermediate parity based on the old data and the new data.
- the storage controller reads the intermediate parity from the first storage device and transfers the intermediate parity to the second storage device in which the old parity is stored.
- the second storage device generates a new parity based on the old data and the intermediate parity. As a result, the parity is updated as the data is updated.
- the information processing system forms a RAID group, and includes a plurality of storage devices that are connected to one bus and communicate with each other.
- Each of the plurality of storage devices has a device controller and a storage medium for storing data.
- the plurality of storage devices include a first storage device that stores old data and a second storage device that stores old parity corresponding to the old data.
- the first device controller of the first storage device generates an intermediate parity based on the old data and the new data that updates the old data, identifies the second storage device that stores the old parity corresponding to the old data, and transmits the intermediate parity to the second storage device; the second device controller of the second storage device then generates a new parity based on the intermediate parity and the old parity.
- the load on the storage controller is reduced by reducing the number of data transfers between the storage controller and the storage device, thus speeding up the write process.
- the storage apparatus 20 includes a storage controller 200 and storage devices 31 to 34.
- the storage devices 31 to 34 are each connected to one bus 270 and can communicate with each other.
- the storage controller 200 is connected to the same bus 270 as each of the storage devices 31 to 34.
- the storage controller 200 receives a read command or a write command from a host computer or the like outside the storage apparatus 20, and accesses a storage device according to a request from the host computer.
- Each of the storage devices 31 to 34 includes a device controller and a storage medium (not shown).
- the device controller stores data received from a device external to the storage device in a storage medium, reads data from the storage medium, and transfers the data to a device external to the storage device.
- the storage medium is a nonvolatile semiconductor memory in this embodiment.
- a RAID group of RAID 5 (3 Data + 1 Parity) is configured from the four storage devices.
- parity is generated for each stripe according to a predetermined rule.
- old parity (old P) 0 is generated based on old data (old D) 0, old data (old D) 1, and old data (old D) 2.
- old parity 0 is generated by exclusive OR operation (hereinafter referred to as XOR operation) of old data 0, old data 1, and old data 2.
- Old data 0, old data 1, old data 2, and old parity 0 are distributed and stored in the storage devices 31 to 34 one by one. When old data 1 cannot be read due to a failure of the storage device, etc., old data 1 is restored by XOR operation of old data 0, old data 2, and old parity 0.
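- the generation and restoration above can be written as a short sketch (Python; a minimal illustration assuming 16 KB logical blocks as in this embodiment, with helper and variable names that are ours, not the patent's):

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Bitwise XOR of equal-length data blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            result[i] ^= b
    return bytes(result)

BLOCK = 16 * 1024  # 16 KB logical block, as in this embodiment
old_d0, old_d1, old_d2 = (bytes([x]) * BLOCK for x in (1, 2, 3))

# Old parity 0 is the XOR of old data 0, old data 1, and old data 2.
old_p0 = xor_blocks(old_d0, old_d1, old_d2)

# If old data 1 becomes unreadable, it is restored from the rest.
assert xor_blocks(old_d0, old_d2, old_p0) == old_d1
```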
- the parity is generated in case data is lost. For this reason, when the data in the stripe is updated, the parity of the stripe including the data must also be updated.
- the storage controller 200 receives new data 0 from the host computer. Next, the storage controller 200 transfers the new data 0 via the bus 270 to the storage device 31 that stores the old data 0. The new data 0 is data for updating the old data 0.
- Storage device 31 receives new data 0 (S1001).
- the device controller of the storage device 31 performs an XOR operation on the new data 0 and the old data 0 to generate an intermediate parity (intermediate P) 0 (S1002).
- the storage device 31 specifies the storage device 34 storing the old parity 0, and transmits the intermediate parity 0 to the storage device 34 via the bus 270 (S1003).
- the storage device 34 receives the intermediate parity 0, performs an XOR operation of the intermediate parity 0 and the old parity 0, and generates a new parity (new P) 0 (S1004).
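- steps S1001 to S1004 can be condensed into the following sketch (reusing xor_blocks and the blocks from the sketch above; the bus transfer between storage devices 31 and 34 is reduced to a function call here):

```python
def parity_update_offload(new_d0: bytes, old_d0: bytes, old_p0: bytes) -> bytes:
    # S1002: storage device 31 XORs old data 0 and new data 0
    # to generate intermediate parity 0.
    intermediate_p0 = xor_blocks(old_d0, new_d0)
    # S1003/S1004: storage device 34 XORs intermediate parity 0
    # and old parity 0 to generate new parity 0.
    return xor_blocks(intermediate_p0, old_p0)

new_d0 = bytes([9]) * BLOCK
# The offloaded result equals the parity recomputed from all data.
assert parity_update_offload(new_d0, old_d0, old_p0) == \
       xor_blocks(new_d0, old_d1, old_d2)
```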
- This embodiment can be applied to the information processing system shown in FIGS.
- FIG. 2 is a diagram showing a physical configuration (hereinafter referred to as system configuration 1) of the storage apparatus 20 according to the embodiment of the present invention.
- the storage apparatus 20 is connected to the host computer 10 via the network 260.
- the network 260 is, for example, a SAN (Storage Area Network) or a LAN (Local Area Network).
- the storage apparatus 20 is connected to the management computer 15 in the same manner as to the host computer.
- the host computer 10 includes hardware resources such as a processor, a memory, an input / output device, and a host bus adapter, and software resources such as a device driver, an operating system (OS), and an application program.
- in the host computer 10, the processor generates a command (for example, a read command or a write command) according to a program on the memory, and transmits the command to the storage apparatus 20 via the network 260.
- the configuration of the management computer 15 is the same as that of the host computer 10.
- the storage apparatus 20 includes a storage controller 200, a switch 280, and a plurality of (for example, four) storage devices 31 to 34.
- Each of the plurality of storage devices 31 to 34 is connected to the switch 280 by an internal bus (for example, a PCI-Express (PCIe) bus).
- the plurality of storage devices are connected to each other, and end-to-end communication is possible between the storage devices.
- the storage controller 200 and the switch 280 are connected, and the storage controller 200 can access a plurality of storage devices.
- the storage controller 200 includes a processor 210, a memory 220, a switch 230, a host interface 240, an I / O interface 250, and a management interface 290.
- the storage controller 200 receives commands from the host computer 10, controls the entire storage apparatus, and provides a management screen 1800 as shown in FIG.
- the processor 210 analyzes the command received from the host computer 10 based on the program, performs arithmetic processing, and gives an instruction to each part of the storage controller 200 to control the entire storage apparatus 20.
- the memory 220 stores management information (for example, RAID management information 810 and lock management information 910) of the entire storage apparatus 20, and temporarily stores read commands and write commands from the host computer 10 and target data of the commands.
- the switch 230 connects the processor 210, the memory 220, the host interface 240, and the I / O interface 250 in the storage controller 200, and routes data exchanged between the parts according to the address and ID.
- the host interface 240 is connected to the host computer 10 via the network 260.
- the host interface 240 performs data transmission / reception with the host computer 10 in accordance with an instruction from the processor 210 or a request from the host computer 10.
- Data transmitted / received by the host interface 240 is stored in the memory 220.
- the management interface 290 has the same configuration as the host interface 240 and is connected to the management computer 15.
- the I / O interface 250 is connected to the storage devices 31 to 34 via the bus 270.
- the I / O interface 250 transmits and receives data to and from the storage devices 31 to 34 in accordance with an instruction from the processor 210 or a request from the storage devices 31 to 34.
- Data transmitted / received by the I / O interface 250 is stored in the memory 220.
- the bus 270 is, for example, a PCIe bus.
- FIG. 3 shows a physical configuration of a server according to an embodiment of the present invention (hereinafter referred to as system configuration 2).
- This system includes a database server (hereinafter referred to as a server) 80 connected to a network 86.
- An example of the network 86 is a LAN.
- a plurality of client terminals are connected to the network 86, and the server 80 receives a database processing request generated in the client terminal or the server 80 and returns an analysis result.
- an example in which the server 80 is a database server is shown, but a server that provides services other than a database, such as a file server, may be used.
- the server 80 acquires information from the storage devices 31 to 34 in the server, and provides a management screen 1800 as shown in FIG. 17 to the user.
- the server 80 includes a processor 81, a memory 82, a network interface 83, a chip set 84, and an expander 85.
- the processor 81 analyzes a request generated in the client terminal or the server 80 based on the program, and controls the entire server 80 and performs various arithmetic processes.
- the memory 82 stores a program executed by the processor 81, stores management information (for example, RAID management information 810, lock management information 910) of the entire server 80, and temporarily stores requests and analysis target data.
- the network interface 83 is connected to the network 86.
- the network interface 83 performs data transmission / reception with the client terminal in accordance with an instruction from the processor 81 or a request from the client terminal connected to the network 86.
- Data transmitted / received by the network interface 83 is stored in the memory 82.
- the chipset 84 connects the processor 81, the memory 82, the network interface 83, and the expander 85 in the server 80, and routes data exchanged between the respective parts according to addresses and IDs.
- the expander 85 includes a plurality of (for example, four) storage devices 31 to 34 and a switch 88 inside.
- the storage devices 31 to 34 may be directly connected to the chip set 84 without the expander 85.
- the storage devices 31 to 34 inside the expander 85 are connected to the switch 88.
- the switch 88 is connected to the chip set 84 via the bus 87.
- the bus 87 is a PCI Express bus or SAS, for example.
- FIG. 4 shows a configuration example of a storage device using NVM (Non-Volatile Memory) as a storage medium.
- the storage device 31 includes a device controller 310 and an NVM array 410.
- the device controller 310 and the NVM array 410 are connected by a plurality of buses 318.
- the device controller 310 includes a processor 311, a memory 312, a data buffer 313, a parity operation unit 314, an I / O interface 315, an NVM interface 316, and a switch 317. These may be configured as a single semiconductor element such as an ASIC (Application Specific Integrated Circuit) or FPGA (Field Programmable Gate Array), or as a configuration in which multiple individual ICs (Integrated Circuits) are connected to one another.
- the processor 311 analyzes requests from the storage controller 200 and the other storage devices 32 to 34 based on the program, performs arithmetic processing, and controls the entire storage device 31.
- the memory 312 stores a program executed by the processor 311, device management information 279 regarding the NVM array 410, and RAID management information distributed from the host device.
- the RAID configuration information may not be distributed from the host device and may not be stored in the memory 312; in this case, information indicating the RAID configuration is added to the parity write command described later.
- Data buffer 313 temporarily stores read / write command data and data being processed.
- Parity operation unit 314 is an operation device that performs processing necessary for parity generation.
- the parity operation unit 314 is a hardware circuit that performs an XOR operation, for example.
- in this embodiment, the parity operation unit 314 is described as a hardware circuit, but the device controller 310 may instead provide the parity operation function by having the processor 311 execute a program.
- the I / O interface 315 is connected to the storage controller 200 and the other storage devices 32 to 34 via the bus 270.
- the I / O interface 315 performs data transmission / reception with the storage controller 200 or the other storage devices 32 to 34 in accordance with an instruction from the processor 311 or a request from the storage controller 200 or the other storage devices 32 to 34.
- Data transmitted / received by the I / O interface 315 is stored in the data buffer 313.
- the NVM interface 316 is connected to the NVM array 410 by a plurality of buses 318.
- the NVM interface 316 transmits and receives data to and from the NVM array 410 according to instructions from the processor 311. Data to be transmitted / received is stored in the data buffer 313.
- the switch 317 is connected to each part in the device controller 310 and relays data transfer between the parts.
- the NVM array 410 includes a plurality of NVM chips 411.
- the NVM chip 411 is, for example, a NAND flash memory chip.
- Each NVM chip 411 has a plurality of blocks (physical blocks), and each block has a plurality of pages (physical pages).
- in the NVM chip 411, data cannot be overwritten; data is erased in units of blocks and read / written in units of pages.
- in this embodiment, one page is described as 16 KB; however, since the page size differs depending on the design of the chip used, the size is not limited to this.
- the NVM chip may be a memory, such as PRAM or ReRAM, that can be accessed at higher speed than flash memory.
- the devices external to the storage device are the processor 210 of the storage controller 200 and the processor 81 of the server 80.
- the logical storage space is composed of a plurality of logical blocks, and each logical block is associated with a logical address.
- the processor 210 of the storage controller 200 can read / write data in the logical area corresponding to a designated logical address by issuing an access command designating the logical address to the storage device 31.
- a physical address used in the storage device 31 is associated with each of a plurality of physical areas constituting a physical storage space constituted by a plurality of NVM chips 411.
- the physical address indicating the position of a physical area in the storage device 31 is called "PBA (Physical Block Address)", and the logical address indicating the position of a logical area of the storage device 31 is called "device LBA (Logical Block Address)".
- PBA represents a position in units of 16 KB, which is the page size of NVM.
- the NVM page size and the PBA management unit are described as being the same, but they may be different.
- Flash memory is a write-once memory, and data cannot be overwritten. For this reason, when the device controller 310 updates data, the new data is stored in a page (called a new page) different from the page where the old data is stored (called the old page), and the correspondence of the device LBA is changed to a correspondence between the PBA of the new page and the device LBA. Old pages that no longer have a corresponding relationship with a device LBA are subject to erasure processing.
- the storage device 31 manages the association between the PBA and the device LBA using, for example, the following address conversion table.
- FIG. 6 shows a configuration example of the address conversion table 610.
- the address conversion table 610 is stored in the memory 312 of the storage device 31, and has a record for each logical block of the logical storage space provided by the storage device 31. Each record includes the device LBA 611, the PBA 612, and the update information 613.
- the device LBA 611 represents a head address for each logical block obtained by dividing the logical storage space provided by the storage device 31 into logical blocks of a predetermined size.
- in this embodiment, the logical block size is 16 KB. This is the same as the page unit (16 KB in this embodiment), the unit in which the NVM can be accessed, so logical blocks and pages can be made to correspond one-to-one, which facilitates access control to the NVM chip 411.
- the logical block size may be any size as long as it is smaller than the size of the logical storage space provided by the storage device 31.
- the PBA 612 indicates the position of a page, which is a physical area.
- Update information 613 indicates information held during parity update processing. “Yes” in the update information 613 means that the parity is being updated, and information indicating the storage location of the new data is stored.
- the information indicating the storage location of the new data is, for example, address information indicating an area where the new data is stored in the data buffer 313 of the storage device 31, or PBA storing the new data.
- the reason why the update information 613 exists in this embodiment is as follows. Normally, when the storage device acquires new data, the PBA storing the old data is changed to the PBA storing the new data. The old data can no longer be read out and is to be erased. Although details will be described later, in this embodiment, when the storage device receives a parity write command for instructing parity update, even if new data associated therewith is acquired, the old data is not updated until the completion of parity update is confirmed. Keep it readable. This is to improve the reliability by allowing the process to be resumed from the place where the old data is read when an error occurs during the parity update process. For this reason, during the parity update process, the storage device needs to manage the storage location of both old data and new data.
- device LBA “0” is associated with PBA “0”.
- the device controller 310 reads data from the page indicated by the PBA “0” and responds to the storage controller 200. Regardless of the presence / absence of the update information 613, data is read from the PBA value stored in the PBA 612.
- when the storage device is formatted, for example by writing zero data to the NVM, the state of each device LBA may be managed as "unallocated"; alternatively, when zero data is written to the NVM at format time, each device LBA may be managed as being assigned a PBA in which zero data is stored.
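- a minimal Python model of the address conversion table 610 and the role of the update information 613 (the class shape and method names are our assumptions for illustration; the real table is a per-logical-block record in the memory 312):

```python
class AddressConversionTable:
    """Device LBA -> PBA mapping with update information (613)."""

    def __init__(self):
        self.pba = {}     # device LBA -> PBA of committed data (PBA 612)
        self.update = {}  # device LBA -> location of uncommitted new data (613)

    def read_location(self, lba):
        # Reads always use the PBA 612 value, regardless of whether
        # update information is present, so old data stays readable.
        return self.pba.get(lba)  # None models "unallocated"

    def stage(self, lba, new_location):
        # On a parity write: record where the new data is (a data buffer
        # address or a PBA) without touching the old PBA.
        self.update[lba] = new_location

    def commit(self, lba):
        # On a commit command: the new location becomes current and the
        # old page becomes a target of erasure processing.
        self.pba[lba] = self.update.pop(lba)
```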
- in this embodiment, RAID is applied. A case where RAID 5 of 3 Data + 1 Parity is applied in system configuration 1 and one RAID group is configured from the storage devices 31 to 34 will be described as an example.
- a configuration in which one parity is generated for data stored in three different storage devices will be described, but the present invention is not limited to this example.
- One parity may be generated for data stored in five different storage devices.
- three parities may be generated for data stored in three different storage devices to form one RAID group. There may be two or more RAID groups.
- Fig. 7 shows the relationship between the logical volume 50 and the RAID group in this example.
- the storage controller 200 provides the logical volume 50 as a data storage area to the host computer 10. Note that the storage controller 200 may provide a plurality of logical volumes to the host computer 10.
- the logical volume 50 is managed by being divided into storage areas called a plurality of logical blocks. Each logical block is identified by being assigned an LBA.
- the host computer 10 can access an arbitrary storage area by specifying the logical volume identification number and the storage LBA.
- the logical storage spaces 51 to 54 provided to the storage controller 200 by the storage devices 31 to 34 are also divided into a plurality of logical blocks.
- the LBA of the logical volume 50 provided by the storage controller 200 is referred to as the storage LBA.
- the LBAs of the logical storage spaces 51 to 54 provided by the storage devices 31 to 34 are called device LBAs.
- the logical blocks in the logical volume 50 are associated with the logical blocks in the logical storage spaces 51-54.
- the storage controller 200 specifies a storage device and a device LBA from the storage LBA specified by the host computer 10 and accesses the storage device.
- parity of 1 logical block is generated for data of 3 continuous logical blocks, and a stripe of 4 logical blocks is configured.
- stripe 0 is composed of Data 0 to Data 2 and Parity 0, which is obtained by bitwise XOR of the 16 KB blocks Data 0 to Data 2.
- the storage LBA is managed in units of 16 KB
- Data0, Data1,... are all 16 KB data.
- the device #s of the four storage devices 31 to 34 are assumed to be # 0, 1, 2, and 3, respectively.
- the device # and in-device address of the RAID group corresponding to the storage LBA can be uniquely identified by the following calculation using the storage LBA value. Specifically, device # is the remainder of dividing the storage LBA value by the number of devices in the RAID group.
- the in-device address can be obtained by dividing the value of the storage LBA by the number of data pieces 3 in the stripe (rounded down).
- for example, for storage LBA 6, the device # is the remainder of 6 ÷ 4, i.e. 2, and the in-device address is 6 ÷ 3 rounded down, i.e. 2; if the number of devices in the RAID group and the number of data per stripe are known, the location is uniquely obtained by calculation, as shown below.
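- a worked form of this calculation (a sketch; function and parameter names are illustrative):

```python
def locate(storage_lba: int, devices_in_group: int = 4,
           data_per_stripe: int = 3) -> tuple[int, int]:
    device_no = storage_lba % devices_in_group       # remainder of LBA / devices
    in_device_addr = storage_lba // data_per_stripe  # LBA / data count, rounded down
    return device_no, in_device_addr

# Storage LBA 6: device # is 6 mod 4 = 2, in-device address is 6 // 3 = 2.
assert locate(6) == (2, 2)
```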
- the parity is also updated when any of the data in the same stripe is updated. For example, when the parity 0 is calculated by the XOR operation of the data 0 to 2, the new parity is calculated by the XOR operation of the old data (Data6), the new data (Write data), and the old parity (Parity0). Details of the parity update process will be described later.
- Figure 8 shows an example of the PCIe bus address space.
- This address map is created as follows. First, when the storage device is initialized or when the storage controller 200 newly recognizes a storage device, the storage controller 200 inquires what address space is set for each storage device. Each storage device responds to the storage controller 200 with the range of the address space (device address) and the size of each of the logical storage space and the communication space. The storage controller 200 creates an address map by setting an address offset so that each storage device can be identified in the logical storage space using the response result. Then, the storage controller 200 sets the address map in the PCIe root complex (I / O interface 250) and the switch 280. For example, an address map is stored in a memory in the switch 280. This makes it possible to uniquely identify the address of each storage device in a storage apparatus including a plurality of storage devices, and the switch 280 can route a packet to the corresponding address.
- the logical storage space address offsets of devices # 0 to # 3 are "0", "100000", "200000", and "300000", and the communication space address offsets of devices # 0 to # 3 are "90000", "190000", "290000", and "390000", respectively.
- the communication space is mapped to some or all of the registers of the processors 311 and the data buffers 313 of the storage devices 31 to 34. By using the addresses of this communication space, the storage devices can exchange control information with the storage controller 200 and store data read from other storage devices.
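- a sketch of how the offsets identify each device on the bus (the offset values follow the example above and are treated as plain integers, since the text does not state their base; the helper is an assumption):

```python
LOGICAL_OFFSET = {0: 0, 1: 100000, 2: 200000, 3: 300000}   # logical storage space
COMM_OFFSET = {0: 90000, 1: 190000, 2: 290000, 3: 390000}  # communication space

def bus_address(device_no: int, device_lba: int) -> int:
    # The switch 280 routes a packet to the storage device whose
    # address window contains this value.
    return LOGICAL_OFFSET[device_no] + device_lba
```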
- Fig. 9 shows an example of RAID management information 810.
- the RAID management information 810 is stored in the memory 220 of the storage apparatus 20 (in system configuration 2, in the memory 82 of the server 80).
- the storage controller accesses each storage device with reference to the RAID management information.
- RAID group # 811 is information for uniquely identifying a RAID group in the storage apparatus 20.
- the RAID level 812 indicates the RAID control method of the RAID group.
- the RAID level is, for example, RAID1, RAID5, RAID6, RAID10, or the like.
- the RAID configuration 813 indicates the number of data in the stripe and the number of parity in the RAID group. For example, for one RAID stripe, when three storage devices store data and one storage device stores parity, the RAID configuration is represented as “3D1P”.
- the stripe size 814 is the size of each data and parity in the stripe.
- Device # 815 is information for uniquely identifying a device within a RAID group.
- the device offset 816 indicates the start position of the address of each storage device in the logical address space when a plurality of storage devices are used as one logical storage space.
- the device size 817 indicates the size of the logical storage space of the storage device, and an address space corresponding to the device size 817 from the device offset 816 is an accessible logical storage space in each storage device.
- since the storage controller 200 cannot distinguish the storage devices by the device LBA alone, an address offset is set for each device #; by using the value obtained by adding the device LBA to the address offset, it becomes possible to uniquely access a storage area of an arbitrary storage device among the storage devices 31 to 34 connected to the PCIe bus.
- the communication space information is an address of the communication space of each storage device. Thereby, each storage device can access another storage device in the RAID group.
- as the RAID management information 810, an example including RAID group # 811, RAID level 812, RAID configuration 813, stripe size 814, device # 815, device offset 816, and device size 817 has been described.
- the information may be any information that allows the storage device to instruct data transfer with another storage device in order to perform the parity update process.
- the storage controller 200 may notify the device start address and the device end address instead of the device offset 816 and the device size 817; in the example, described later, in which SAS is used for the bus, the device address and the device size may be notified.
- the device offset 816 and device size 817, the device start address and device end address, and the device address and device size are all information for uniquely identifying each storage area of the plurality of storage devices, and are called storage device identification information.
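- one possible in-memory shape for a row of the RAID management information 810, mirroring the columns described above (a sketch; the field names are our assumptions):

```python
from dataclasses import dataclass

@dataclass
class RaidManagementEntry:
    raid_group: int       # RAID group # (811)
    raid_level: str       # RAID level (812), e.g. "RAID5"
    raid_config: str      # RAID configuration (813), e.g. "3D1P"
    stripe_size: int      # stripe size (814): size of each data/parity element
    device_no: int        # device # (815) within the RAID group
    device_offset: int    # device offset (816): start of the logical space
    device_size: int      # device size (817): size of the logical space
    comm_space_addr: int  # communication space address of the device
```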
- FIG. 10 is an example of the lock management information 910.
- the lock management information 910 holds information on the relationship between the process executed by the processor 210 and the lock state. Locking means prohibiting access (for example, reading and writing) to an area targeted for locking by a process other than the process that has secured the lock. Specific information held by the lock management information 910 is as follows.
- Process # 911 indicates an identification number of a process that the processor 210 is executing or is scheduled to execute in the future.
- when starting a new process, the processor 210 searches the lock management information 910 to identify an empty process #, and records the process type, lock state, and the like in the row corresponding to that process #.
- when a process completes, the processor 210 deletes the contents of the corresponding entry.
- Process type 912 indicates the type of process such as write or read.
- RAID group # 913 indicates the identification number of the RAID group to be processed.
- Device # 914 indicates the identification number of the device in the RAID group to be processed.
- the device LBA 915 indicates the device LBA of the storage device to be processed.
- the lock state 915 indicates whether the process has secured the lock and the lock target.
- “Stripe lock” means that access by other processes to the stripe corresponding to the device LBA is prohibited until the execution of the process for the target device LBA is completed. When the process that secures the lock completes, the lock is released.
- in this embodiment, the entire stripe is locked. This guarantees the order of data updates and the parity updates that accompany them. If another read or write occurs in the stripe while a write process is updating data and parity, it may be impossible to determine whether the access sees the data before or after the update, and an inconsistency may occur. Securing the lock therefore guarantees process ordering and prevents inconsistencies. For a read, no data update occurs, so no lock is necessary.
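- a minimal sketch of this stripe lock discipline (Python threading stands in for the table-based lock management information 910; only writes take the lock):

```python
import threading
from collections import defaultdict

stripe_locks: dict[int, threading.Lock] = defaultdict(threading.Lock)

def write_stripe(stripe_no: int, do_data_and_parity_update) -> None:
    # A write waits here while another process holds the stripe lock,
    # then performs the data update and parity update in order.
    with stripe_locks[stripe_no]:
        do_data_and_parity_update()

def read_stripe(stripe_no: int, do_read):
    # A read causes no data update, so no lock is necessary.
    return do_read()
```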
- parity update processing in random write of system configuration 1 is shown.
- when the ratio of logical blocks to be updated among the plurality of continuous logical blocks in one stripe is less than a predetermined value, the write is called a random write; when it is equal to or greater than the predetermined value, it is called a sequential write.
- parity update processing in random write is described first; parity update processing in sequential write is described afterwards.
- new parity 0 can be generated using old data 0 and old parity 0 as shown in FIG.
- new parity can be generated by XOR operation of these three data.
- a random write parity update process may be applied to new data 0 and new data 1 in order.
- when, for example, new data 1 and new data 2 that updates the first half of old data 2 are received, it may be more efficient to update the parity as a sequential write. Whether to process a write as a random write or a sequential write may be set appropriately depending on the stripe size; the decision reduces to a threshold test, as sketched below.
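- a hedged sketch of that decision; the threshold stands in for the "predetermined value" above and its default here is an assumption:

```python
def is_sequential_write(updated_blocks: int, blocks_per_stripe: int,
                        threshold: float = 2 / 3) -> bool:
    # Random write when the updated fraction is below the threshold,
    # sequential write when it is equal to or above it.
    return updated_blocks / blocks_per_stripe >= threshold
```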
- FIG. 11 shows the data flow between each device for write processing.
- FIG. 12 (FIGS. 12-1 and 12-2) shows a ladder chart of a write process with parity update.
- the host computer 10 transfers a write command to the storage apparatus 20. Specifically, first, the host computer 10 generates new data 0 and a write command in the memory.
- the write command includes a storage LBA indicating the logical block to which new data 0 is written, and information indicating the storage location of new data 0 in the memory of the host computer 10 (for example, a memory address in the host computer 10).
- the host computer 10 notifies the storage device 20 that a new write command has been generated. For example, the host computer 10 notifies the generation of a command by incrementing the value of a specific area on the memory 220.
- the processor 210 of the storage controller 200 that has received the notification issues an instruction to the host interface 240 and transfers the newly generated write command to the memory 220. Note that the transfer of the write command may be performed by the host computer serving as the master without notifying the storage apparatus 20.
- the storage controller 200 acquires new data 511 from the host computer 10. Specifically, in the storage controller 200, the processor 210 instructs the host interface 240 to transfer new data from the memory of the host computer 10 designated by the write command to the memory 220. The transfer of new data 0 may be performed with the host computer 10 serving as a master. When the storage controller 200 acquires the new data 0, the storage controller 200 may transmit a completion response to the write command to the host computer 10. In this case, the subsequent processing is executed asynchronously with the write command.
- the storage controller 200 secures the lock of stripe 0 to which the new data 0 belongs. Specifically, based on the RAID management information 810, the processor 210 first identifies the storage device 31 and device LBA that are the write destination of the new data 0 from the storage LBA specified by the acquired write command. Then, the processor 210 refers to the lock management information 910 stored in the memory 220 and confirms the lock state of the stripe 0 corresponding to the device LBA of the new data 0. If stripe 0 is already locked, processing is suspended until the lock is released. When the stripe lock is released, the processor 210 updates the lock management information 910 in the memory 220 to secure the stripe 0 lock.
- the processor 210 determines whether to perform parity update as random write or sequential write. In this embodiment, since new data 0 for one logical block is received, it is determined to update the parity as a random write.
- the storage controller 200 transfers a parity write command to the storage device 31. Specifically, the processor 210 creates a parity write command on the memory 220.
- the parity write command is a command for instructing writing of new data 0 and updating of parity corresponding to new data 0.
- the parity write command includes address information indicating the storage location of the new data 0 in the memory 220 and the device LBA to which the new data 0 is written.
- the RAID management information 810 since the RAID management information 810 is distributed in advance to each of the storage devices 31 to 34, it is not necessary to include parity storage position information in the parity write command. For this reason, the load on the processor 210 due to the creation of the parity write command is reduced.
- alternatively, the processor 210 may include in the parity write command the device LBA of the storage device in which the old parity 0 is stored. In that case, each storage device need not store the RAID management information 810 in the memory 312, and the capacity of the memory 312 can be reduced.
- the processor 210 instructs the I / O interface 250 to notify the storage device 31 of command generation.
- in the storage device 31 that received the notification, the processor 311 instructs the I / O interface 315 to transfer the parity write command on the memory 220 to the data buffer 313.
- the transfer of the parity write command may be performed with the storage device 31 serving as a master.
- the device controller 310 acquires new data 0 from the storage controller 200. Specifically, the processor 311 of the device controller 310 analyzes the acquired parity write command and specifies the area (address) of the memory 220 of the storage controller 200 in which the new data 0 is stored. Next, the processor 311 instructs the I / O interface 315 to transfer new data 0 from the area of the memory 220 designated by the parity write command to the data buffer 313. The transfer of the new data 0 may be performed with the storage controller 200 as a master.
- the device controller 310 reads the old data 0 from the NVM array 410.
- the processor 311 of the device controller 310 refers to the address conversion table 610 and identifies the PBA of the physical area in which the old data 0 is stored from the device LBA specified by the parity write command.
- the processor 311 issues an instruction to the NVM interface 316, reads the old data 0 from the page of the NVM chip 411 in the NVM array 410 based on the specified PBA, and stores it in the data buffer 313.
- the device controller 310 updates the address conversion table 610 to manage the storage location of the new data 0, and keeps both the new data 0 and the old data 0 readable by one of the following two processes. Even after receiving the new data 0, the device controller 310 maintains, without updating, the PBA storing the old data in the address conversion table 610. When the device controller 310 later receives a commit command (described later), it marks the old data as discardable. This processing improves the reliability of the storage apparatus in preparation for the occurrence of an error.
- the processor 311 of the device controller 310 stores the address of the data buffer 313 storing the new data 0 in the update information 613 in association with the device LBA of the address conversion table 610.
- intermediate parity can be generated without writing new data 0 to the NVM array, so the time until the completion of parity update is shortened and performance is improved.
- the device controller 310 writes new data 0 to the NVM array 410.
- the processor 311 selects a free page to which new data 0 is written, and stores the PBA of the free page in the update information 613 in association with the device LBA of the address conversion table 610.
- the processor 311 issues an instruction to the NVM interface 316 based on the selected PBA, and causes the NVM chip 411 in the NVM array 410 to write new data 0.
- in the second process, the new data 0 is stored in the NVM array, which is a non-volatile memory, so the new data 0 is not lost even if a sudden power failure occurs during the parity update process.
- in process S708, the device controller 310 generates an intermediate parity 0 based on the old data 0 and the new data 0. Specifically, the processor 311 issues an instruction to the parity operation unit 314, reads the old data 0 and the new data 0 on the data buffer 313, performs the parity operation, and stores the result in the data buffer 313 as intermediate parity 0.
- the device controller 310 transfers a parity update command to the device controller 340.
- the parity update command is a command for instructing generation of a new parity based on the intermediate parity and the old parity.
- the processor 311 of the device controller 310 creates a parity update command in the data buffer 313.
- the processor 311 refers to the RAID management information 810, identifies the device LBA of the storage device in which the old parity 0 is stored, and includes the identified device LBA in the parity update command.
- the processor 311 includes the address of the data buffer 313 in which the intermediate parity 0 generated in step S708 is stored and the address of the data buffer 313 in which the parity update command is stored in the parity update command. Then, the processor 311 instructs the I / O interface 315 to notify the storage device 34 of the generation of the parity update command. In the storage device 34 that has received the notification, the processor 341 instructs the I / O interface 345 to transfer the parity update command on the data buffer 313 of the storage device 31 to the data buffer 343 of the storage device 34. The transfer of the parity update command may be performed by the device controller 310 serving as a master. When the parity write command includes the device LBA of the storage device in which the old parity 0 is stored, the processor 311 may include the device LBA in the parity update command.
- if the device controller 310 cannot receive a completion response to the parity update command even after a predetermined time has elapsed, it notifies the storage controller 200 that a Timeout error has occurred. Processing when an error occurs will be described later.
- the device controller 340 acquires the intermediate parity 0 from the device controller 310. Specifically, in the device controller 340, the processor 341 causes the I / O interface 345 to transfer the intermediate parity 0 from the address of the data buffer 313 designated by the parity update command to the data buffer 343. The transfer of the intermediate parity 0 may be performed with the storage device 31 serving as a master.
- the device controller 340 reads the old parity 0 from the NVM array 440.
- the processor 341 refers to the address conversion table 610 to identify the PBA in which the old parity 0 is stored from the device LBA included in the parity update command, issues an instruction to the NVM interface 346, and, based on that PBA, reads the old parity 0 from the NVM chip 441 in the NVM array 440 and stores it in the data buffer 343.
- in process S712, the device controller 340 generates a new parity 0 based on the old parity 0 and the intermediate parity 0. Specifically, the processor 341 issues an instruction to the parity operation unit 344, reads the old parity 0 and the intermediate parity 0 on the data buffer 343, performs the parity operation, and stores the result in the data buffer 343 as the new parity 0.
- the device controller 340 maintains the new parity 0 and the old parity 0 in a readable state even after the new parity 0 is generated by one of the following two processes. This is the same as the device controller 310 managing the new data 0 and the old data 0 in process S707.
- in the first process, the processor 341 of the device controller 340 stores the address of the data buffer 343 storing the new parity 0 in the update information 613 in association with the device LBA of the address conversion table 610. Since the completion response of S712 can then be transmitted without writing the new parity 0 to the NVM array, the time until the completion of the parity update is shortened and performance is improved.
- in the second process, the device controller 340 writes the new parity 0 to the NVM array 440.
- the processor 341 selects a free page to which the new parity 0 is written, and stores the PBA of the free page in the update information 613 in association with the device LBA of the address conversion table 610.
- the processor 341 issues an instruction to the NVM interface 346 based on the selected PBA, and causes the new parity 0 to be written to the NVM chip 441 in the NVM array 440.
- the device controller 340 transmits a parity update command completion response to the device controller 310. Specifically, first, in the device controller 340, the processor 341 creates a completion response to the parity update command in the data buffer 343. Next, the processor 341 issues an instruction to the I / O interface 345 and transfers the created completion response to the data buffer 313 in the storage device 31. The creation of the completion response may be notified to the storage device 31, and the completion response may be transferred using the storage device 31 as a master.
- the device controller 340 may transmit a completion response before acquisition of the old parity 0 after acquiring the intermediate parity 0. In this case, after sending the completion response, the device controller 340 acquires the old parity 0 from the NVM array and generates a new parity 0. Thereby, the time until the parity update is completed is further shortened.
- the device controller 310 transfers a completion response to the parity write command to the storage controller 200 in response to receiving the completion response to the parity update command from the device controller 340. Specifically, first, in the device controller 310, the processor 311 creates a completion response to the parity write command in the data buffer 313. Next, the processor 311 issues an instruction to the I / O interface 315 and transfers the created completion response to the memory 220 in the storage controller 200. Note that the creation of the completion response may be notified to the storage controller 200, and the completion response may be transferred with the storage controller 200 as a master.
- in processes S716 and S717, in response to receiving the completion response to the parity write command from the device controller 310, the storage controller 200 transfers a commit command for reflecting the data update to each of the storage devices 31 and 34.
- processes S716, S718, and S720 will be described by taking the commit command for the new data 0 of the storage device 31 as an example.
- the processor 210 upon receiving a completion response to the parity write command from the storage device 31, the processor 210 creates a commit command to the storage device 31 on the memory 220.
- the commit command is a command for notifying completion of parity update processing.
- the commit command can also be said to be a command for discarding old data 0 and confirming it as new data 0.
- the commit command includes a device LBA indicating the storage destination of the new data 0 in the storage device 31.
- the processor 210 instructs the I / O interface 250 to notify the storage device 31 of the creation of the command. Receiving the notification, the storage device 31 instructs the I / O interface 315 to transfer the commit command on the memory 220 to the data buffer 313.
- the transfer of the commit command may be performed with the storage controller 200 serving as a master.
- in process S718, the address conversion table 610 is updated. Specifically, in the device controller 310, the processor 311 selects a page for storing the new data 0 held in the data buffer 313, and stores the PBA of that page in the PBA 612 of the address conversion table 610 in correspondence with the device LBA of the new data 0. Then, the processor 311 deletes the information in the update information 613 of the address conversion table 610 and writes the new data 0 to the page of the selected PBA.
- the storage device 31 returns the old data 0 for the read to the device LBA before the execution of the process S718, but after the execution of the process S718, the storage device 31 returns the new data 0 for the read to the same device LBA.
- once the device controller 310 receives the commit command, the old data 0 can be discarded. The actual erasure of the old data 0 is performed asynchronously with the reception of the commit command.
- when the new data 0 has already been written to the NVM array (the second process), the PBA value stored in the update information 613 is copied to the PBA 612 field, and the information in the update information 613 is deleted, as sketched below.
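- in terms of the table model sketched earlier, the two commit paths look as follows (illustrative; nvm and select_free_page are assumed helpers, and a bytes value stands for data still held in the data buffer 313):

```python
def handle_commit(table: "AddressConversionTable", lba, nvm, select_free_page):
    staged = table.update.pop(lba)
    if isinstance(staged, bytes):
        # First process: new data 0 is still in the data buffer 313;
        # select a free page, write it, and record the PBA in PBA 612.
        pba = select_free_page()
        nvm.write_page(pba, staged)
        table.pba[lba] = pba
    else:
        # Second process: new data 0 is already on a page; copy the PBA
        # held in the update information 613 into PBA 612.
        table.pba[lba] = staged
    # Either way the old page loses its device LBA association and
    # becomes a target of asynchronous erasure processing.
```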
- the device controller 310 returns a completion response to the commit command to the storage controller 200. Specifically, first, the processor 311 of the device controller 310 creates a completion response to the commit command in the data buffer 313. Next, the processor 311 issues an instruction to the I / O interface 315 and transfers the created completion response to the memory 220 in the storage controller 200. The creation of the completion response may be notified to the storage controller 200, and the completion response may be transferred with the storage controller 200 as a master.
- the processes S717, S719, and S721 are similarly performed for the storage device 34 storing the new parity 0 to reflect the update of the new parity 0.
- the storage device 31 in steps S716, 718, and 720 may be the storage device 34, and the new data 0 may be changed to the new parity 0.
- the storage controller 200 executes processing S722.
- the storage controller 200 releases the lock on the stripe 0.
- the processor 210 deletes the information of the lock management information 910 in the memory 220, thereby releasing the lock of the stripe 0.
- the storage controller 200 returns a completion response to the write command to the host computer 10.
- the processor 210 creates a completion response to the write command in the memory 220.
- the processor 210 issues an instruction to the host interface 240 and transfers the created completion response to the memory in the host computer 10.
- the creation of a completion response may be notified to the host computer 10 and the completion response may be transferred with the host computer 10 as a master.
- the number of data transfers between the storage controller and the storage device due to the parity update becomes one, the transfer load of the storage controller is reduced, and the write processing is accelerated.
- the restart process when a Timeout error occurs is explained next. If the device controller 310 cannot receive a completion response to the parity update command even after a predetermined time has elapsed, it notifies the storage controller 200 that a Timeout error has occurred. Receiving this notification, the storage controller 200 instructs the management computer 15 to display on the management screen that a Timeout has occurred between storage devices, with the storage device 31 as the transfer source and the storage device 34 as the transfer destination. Further, the storage controller 200 resumes from process S704, which transmits the parity write command to the storage device 31. Since the storage controller 200 cannot recognize the progress of the process until it receives the completion response to the parity write command in process S715, the process is resumed from process S704 when a Timeout or other error occurs.
- since the old data 0 is managed in a readable state in the storage device 31, when the device controller 310 receives the parity write command again, it can acquire the old data 0 in process S706. If the PBA indicating the storage location of the old data 0 had already been overwritten with the information indicating the storage location of the new data 0, the old data 0 would be lost and the processing could not be resumed.
- the device controller 310 generates an intermediate parity in process S708, and transmits a parity update command to the storage device 34 in process S709. Also in the storage device 34, since the old parity 0 is managed in a readable state, the process of reading the old parity 0 can be executed in the process S711.
- until the storage controller recognizes the completion of the parity update and sends the commit command, the old data and old parity are maintained in a readable state in preparation for a Timeout error or other failure, so reliability can be improved.
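- the controller-side restart can be sketched as a retry loop (hedged; the retry count and exception type are assumptions, and send_parity_write stands for processes S704 through S715):

```python
def parity_write_with_restart(send_parity_write, max_attempts: int = 3):
    for attempt in range(max_attempts):
        try:
            # Blocks until the completion response of process S715 arrives.
            return send_parity_write()
        except TimeoutError:
            # Safe to resume from S704: the storage devices still hold
            # old data 0 and old parity 0 in a readable state.
            continue
    raise RuntimeError("parity write not completed after retries")
```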
- FIG. 13 shows a configuration example of the management screen 1400 displayed on the management computer 15 of the storage apparatus 20 or the server 80.
- the management screen shown in FIG. 13 (FIGS. 13-1 and 13-2) provides the administrator with information such as the offload function usage status of each of the storage devices 31 to 34 and the status of the communication paths between storage devices.
- the management screen 1400 includes an inter-storage device path status table 1410 and an offload function status table 1420.
- the inter-storage device path status table 1410 shows the communication state between the storage devices.
- each communication path is shown to the user in one of two states: in use or unused.
- the in-use state indicates that the target data path is being used in the current settings; the number of Timeout occurrences is also displayed.
- in FIG. 13A, it can be seen that a large number of Timeouts occur when the storage device 34 is the transfer destination, so there is a high possibility that some abnormality has occurred in the storage device 34.
- the unused state indicates that the path is not used in the current setting.
- the user can check whether there is communication on each inter-storage-device path and whether each path is normal. If Timeouts occur frequently, there may be a problem with the communication path or the storage device. By displaying the occurrence status of Timeouts in inter-storage-device communication on the management screen in this way, the administrator can grasp the occurrence of abnormalities, and maintenance management to keep the system reliable becomes easy.
- The offload function status table 1420 shows the status of the parity update and double write offload functions for each storage device.
- For each storage device's offload functions, such as parity update and double write, the table presents one of three states to the user: in use, not in use, and not supported.
- The in-use state indicates that the storage device supports the target offload function, is currently included in a RAID group, and is in a state where processing can be offloaded to it.
- The not-in-use state indicates that the storage device supports the target offload function but processing cannot be offloaded to it because it is not currently included in a RAID group.
- The not-supported state indicates that the storage device does not support the target offload function.
- The operation of Example 1 has been described above.
- In Example 1, the storage controller transfers the new data to the storage device that stores the old data, and the device storing the old data generates the intermediate parity and transfers it to the device storing the old parity.
- Alternatively, the storage controller may transfer the new data to the storage device that stores the old parity; the storage device storing the old parity then acquires the old data from the storage device that stores the old data and, after updating the parity, transfers the new data to the storage device that will store it.
- The present invention is not limited to this; any configuration that includes a plurality of storage devices and a higher-level device that manages them may be used.
- For example, the present invention can also be applied to system configuration 2.
- In that case, the storage apparatus 20 may be read as the server 80.
- As described above, in the parity update processing for random writes, it becomes unnecessary for the storage controller to acquire the old data, old parity, and intermediate parity from the storage devices when the parity is updated.
- The parity generation processing by the storage controller itself is also unnecessary. This reduces the I/O processing load and data transfer load of the storage controller.
- The concentration of the data transfer load on the bus inside the storage controller is also eliminated.
- The storage controller is thus prevented from becoming a performance bottleneck, the performance of high-speed storage devices can be exploited, and write processing is sped up.
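- The parity arithmetic that the devices exchange among themselves can be stated compactly. A minimal Python sketch of the read-modify-write update described above (the function names are illustrative, not the patent's):

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-sized logical blocks."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))

# Device storing the data: intermediate parity = old data XOR new data.
def intermediate_parity(old_data: bytes, new_data: bytes) -> bytes:
    return xor_blocks(old_data, new_data)

# Device storing the parity: new parity = old parity XOR intermediate parity.
def new_parity(old_parity: bytes, intermediate: bytes) -> bytes:
    return xor_blocks(old_parity, intermediate)
```

- Only the intermediate parity crosses the inter-device path; the storage controller never touches the old data, old parity, or intermediate parity.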
- In the first embodiment, a host device of the storage devices, such as a storage controller or a server, causes the parity to be updated by communication between the storage devices when it transfers data to a storage device.
- In the second embodiment, by contrast, there is no data transfer from the host device to the storage device; a process is described in which the host device gives an instruction that causes new data to be generated, or data to be updated, inside the storage device.
- For example, consider a system in which the host device instructs a storage device to search the data it stores; the storage device searches the stored data based on the instruction from the host device and newly creates search result data.
- the search result data is provided to the host device in association with the device LBA provided by the storage device.
- When the storage device belongs to a RAID group, it becomes necessary to update either the parity data held by another storage device of the stripe corresponding to the device LBA storing the search result data, or the area of a second storage device storing the duplicated data.
- Example 2 shows a parity update operation in system configuration 2 for data generated in a storage device.
- In this example, the storage devices 31 to 34 form one 3-data + 1-parity RAID 5 group.
- the operation when the old data 0 is updated to the new data 0 in the storage device 31 according to an instruction from the host device will be described.
- As the old data 0 is updated, the old parity 0 in the storage device 34 is updated to the new parity 0.
- Here, the update of the data in the storage device 31 occurs because the result of database processing offloaded to the storage device 31 is stored.
- In this example the data update arises from database processing, but it may equally arise from, for example, storing the result of a physical simulation. In the following, detailed description of processing that is the same as in the first embodiment is omitted.
- FIG. 14 shows the data flow between the devices for the operation in which data in the storage device 31 is updated by database processing initiated from the server 80. The parity update processing accompanying the database processing is shown in the ladder chart of FIG. 15.
- First, the server 80 secures in advance the lock on stripe 0, to which the device LBA that is the write destination of the database processing result belongs. Specifically, the processor 81 first determines the storage device that stores the data to be processed, and determines the device LBA in that storage device where the result of the database processing will be stored. After the database processing is completed, the server can access the processing result using this device LBA. The processor 81 then refers to the lock management information 910 stored in the memory 82 and checks the lock state of stripe 0 corresponding to the device LBA. If stripe 0 is already locked, processing is suspended until the lock is released. When the lock on stripe 0 is released, the processor 81 updates the lock management information in the memory 82 to secure the lock on stripe 0.
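- A minimal sketch of the stripe-lock bookkeeping that the lock management information 910 implies (the class and its internals are assumptions; the patent does not specify a data structure):

```python
import threading

class StripeLocks:
    """Per-stripe locks, as in the lock management information 910."""
    def __init__(self):
        self._guard = threading.Condition()
        self._locked = set()  # stripe numbers currently locked

    def acquire(self, stripe: int):
        # Suspend processing until the stripe lock is released, as in the text.
        with self._guard:
            while stripe in self._locked:
                self._guard.wait()
            self._locked.add(stripe)

    def release(self, stripe: int):
        with self._guard:
            self._locked.discard(stripe)
            self._guard.notify_all()
```

- The processor 81's check-then-secure sequence corresponds to `acquire(0)` here; commit processing would end with `release(0)`.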
- Next, the processor 81 transfers a database processing offload command to the device controller 310. Specifically, the processor 81 first generates the database processing offload command in the memory 82 of the server 80.
- The database processing offload command includes information such as the device LBA of the data to be processed, the device LBA where the processing result is to be stored, and the contents of the requested database processing.
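- Shaped as a record, the command might carry the following fields (a sketch; the field names are assumptions, not the patent's wording):

```python
from dataclasses import dataclass

@dataclass
class DbOffloadCommand:
    target_lba: int   # device LBA of the data to process
    result_lba: int   # device LBA where the processing result is stored
    query: bytes      # contents of the requested database processing
```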
- The processor 81 then instructs the chipset 84 to notify the storage device 31 that a new command has been generated, for example by incrementing the value of a specific area in the memory 312.
- In the storage device 31 that received the notification, the processor 311 issues an instruction to the I/O interface 315 to transfer the command generated in the memory 82 to the data buffer 313.
- The command transfer may instead be performed with the server 80 as the master.
- Next, the storage device 31 performs the designated database processing.
- Specifically, the storage device 31 analyzes the database processing offload command and performs the requested database processing.
- The processor 311 reads the data to be analyzed from the NVM array 410 into the data buffer 313 based on the device LBA specified in the command, performs the database processing, and stores the obtained analysis result in the data buffer 313.
- Here, the processing is executed in response to a command, but the database processing may instead be executed at predetermined intervals as a preconfigured batch process.
- As a result of the analysis, a write occurs for storing the new data 0 obtained as the analysis result in the NVM array 410.
- The subsequent parity update processing is the same as when a random write occurs.
- Processes S1104 to S1112 are the same as processes S706 to S714 of the first embodiment, and a description thereof is omitted. Processes S1113 to S1120 are the same as processes S715 to S722 of the first embodiment with the storage controller 200 read as the server 80, and their description is likewise omitted.
- Even for data generated inside a storage device, the parity can thus be updated by communication between the storage devices. Further, since the server itself does not perform the database search or analysis, the load on the server is further reduced and the search and analysis processing is accelerated.
- The operation of Example 2 has been described above.
- Here, the operation of the server in system configuration 2 has been described.
- However, the present invention is not limited to this; any configuration may be used that includes a plurality of storage devices and a higher-level device that can issue an instruction accompanied by a data update without transferring data to the storage devices.
- In the first embodiment, using a RAID configuration with parity as an example, a plurality of storage devices were described as updating the parity by communicating with each other without going through the storage controller or server.
- In the third embodiment, an example is described in which a plurality of storage devices hold the same data in order to maintain data redundancy. In other words, data identical to the write request target data is stored as redundant data in a storage device different from the one that stores the write request target data.
- Example 3 shows the operation when a write request occurs in a data storage configuration to which RAID 1 is applied.
- Here, a storage configuration is assumed in which, in system configuration 1, a RAID group 600 is formed from the storage devices 31 and 32 and data is duplicated.
- the data recording configuration according to the third embodiment will be described.
- the logical storage spaces 61 and 62 provided by the storage devices 31 and 32 belonging to the RAID group 600 are provided to the host computer 10 as one logical volume 60.
- the storage area configured by the RAID group 600 may be provided to the host computer 10 as a plurality of logical volumes.
- FIG. 16 shows the data arrangement in the logical storage spaces 61 and 62 in the RAID group 600.
- each logical block in the logical volume 60 is associated with one logical block in both the logical storage spaces 61 and 62.
- the data of the logical block whose storage LBA is 2 is stored in both the logical block of the device LBA 2 in the logical storage space 61 and the logical block of the device LBA 2 in the logical storage space 62.
- a stripe is composed of two logical blocks that store the same data.
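- Because RAID 1 simply mirrors, the address mapping is the identity on both devices. A sketch of the mapping implied by FIG. 16 (the helper name and device labels are illustrative):

```python
def raid1_map(storage_lba: int) -> list[tuple[str, int]]:
    """Storage LBA n maps to device LBA n on both mirrored devices."""
    return [("storage_device_31", storage_lba),
            ("storage_device_32", storage_lba)]

# e.g. storage LBA 2 is held at device LBA 2 in both logical storage spaces.
assert raid1_map(2) == [("storage_device_31", 2), ("storage_device_32", 2)]
```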
- FIG. 17 shows the data flow between the devices during the write. The ladder chart in FIG. 18 shows the processing at the time of the write.
- the processes S1201 to S1203 are the same as the processes S701 to S703 in the first embodiment, and thus description thereof is omitted.
- the storage controller 200 transfers a double write command to the storage device 31.
- Specifically, the processor 210 identifies, from the storage LBA specified by the write command acquired in process S1201, the storage devices 31 and 32 that are the storage destinations of the new data 0 and the respective storage destination device LBAs.
- the processor 210 creates a double write command on the memory 220.
- In addition to the storage location of the new data 0 in the memory 220 and the write destination device LBA in the storage device 31, the double write command includes, as storage destination information for the data to be duplicated, information identifying the storage device 32 and the write destination device LBA in the storage device 32. Note that the storage destination information for the duplicated data may instead be distributed in advance to the storage devices 31 and 32 as RAID management information, rather than being included in the double write command.
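- The double write command can thus be pictured as the ordinary write command plus the mirror destination. A sketch with assumed field names (the patent specifies the contents, not a layout):

```python
from dataclasses import dataclass

@dataclass
class DoubleWriteCommand:
    src_addr: int       # location of new data 0 in the memory 220
    dest_lba: int       # write destination device LBA in the storage device 31
    mirror_device: str  # storage device 32, destination of the duplicate
    mirror_lba: int     # write destination device LBA in the storage device 32
```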
- the processor 210 instructs the I / O interface 250 to notify the storage device 31 of the creation of the double write command.
- In the storage device 31, the processor 311 instructs the I/O interface 315 to transfer the double write command in the memory 220 to the data buffer 313.
- the device controller 310 acquires new data 0 from the storage controller 200.
- the processor 311 issues an instruction to the I / O interface 315 to transfer new data 0 from the area of the memory 220 designated by the double write command to the data buffer 313.
- the device controller 310 writes new data 0 to the NVM array 410.
- the processor 311 issues an instruction to the NVM interface 316 based on the device LBA included in the double write command, and causes the new data 0 to be written to the NVM chip 411 in the NVM array 410.
- the device controller 310 transfers a write command to the device controller 320.
- the processor 311 creates a write command in the data buffer 313.
- Based on the double write command acquired in process S1203, the write command includes the device LBA that stores the duplicated data and the address of the new data 0 in the data buffer 313. The created write command is then transferred to the device controller 320.
- the device controller 320 acquires new data 0 from the device controller 310.
- the processor 321 causes the I / O interface 325 to transfer new data 0 from the address of the data buffer 313 specified by the write command to the data buffer 323.
- the device controller 320 writes new data 0 to the NVM array 420.
- Specifically, the processor 321 issues an instruction to the NVM interface 326 based on the device LBA included in the write command, and writes the new data 0 to the NVM chip 421 in the NVM array 420.
- Note that the subsequent processing may be performed before process S1208 completes.
- Processes S1210 to S1219 are the same as processes S714 to S723 of the first embodiment, and a description thereof is omitted.
- Example 3 has been described above.
- Here, the operation of the storage apparatus in system configuration 1 has been described.
- However, the present invention is not limited to this; any configuration that includes a plurality of storage devices and a higher-level device that manages them may be used.
- The write request may also originate in a storage controller or a storage device.
- According to the third embodiment, when data in a storage device is updated, the storage controller or server that manages and uses the RAID group only has to write the data to one storage device, and the double write of the data to two devices is carried out between the devices. This reduces the I/O processing load and data transfer load of the storage controller, and speeds up write processing.
- In the embodiments so far, the parity update processing that occurs with random writes has been shown; the fourth embodiment described below shows the parity update processing that occurs with sequential writes.
- In general, parity calculation for a sequential write is performed efficiently by a host device (storage controller or server) that temporarily holds all the data in the stripe needed for the parity calculation.
- However, if the host device has no parity calculation function, the parity cannot be calculated there. Even if the host device has a parity operation function, performance may deteriorate when the host device is heavily loaded. In such cases, the parity update processing of the fourth embodiment is performed. Note that the command transfer processing is the same as in the previous embodiments, and detailed description is omitted as appropriate.
- In this example, the storage devices 31 to 34 form one 3-data + 1-parity RAID group.
- The operation when the host computer 10 transfers to the storage apparatus 20 a write command that updates the old data 0 to 2, stored in the storage devices 31 to 33, to the new data 0 to 2 is described. It is assumed that the old parity 0 in the storage device 34 is updated to the new parity 0 along with the update of the old data 0 to 2.
- FIG. 19 shows the data flow between the devices during the write.
- The write processing is shown in the ladder chart of FIG. 20 (FIGS. 20-1 and 20-2).
- Processes S1301 to S1303 are the same as the processes S701 to S703 of the first embodiment, and a description thereof will be omitted.
- Here, the data acquired by the storage controller 200 from the host computer 10 is the new data 0 to 2.
- The processor 210 determines whether to perform the parity update as a random write or as a sequential write. In this example, since new data 0 to 2 covering the three logical blocks in the stripe have been received, it determines that the parity should be updated as a sequential write.
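- The decision rule can be sketched as: if the write covers every data block of the stripe, compute the parity directly from the new data; otherwise fall back to the read-modify-write of the first embodiment (a sketch; the patent does not give the exact criterion):

```python
def choose_parity_method(blocks_written: int, data_blocks_per_stripe: int) -> str:
    """Full-stripe writes can generate parity without reading old data."""
    if blocks_written == data_blocks_per_stripe:
        return "sequential"  # new parity = XOR of all new data blocks
    return "random"          # intermediate-parity read-modify-write

# Three new logical blocks in a 3-data + 1-parity stripe -> sequential.
assert choose_parity_method(3, 3) == "sequential"
```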
- the storage controller 200 transfers a parity write command to the storage device 34.
- Specifically, the processor 210 refers to the RAID management information 810 and, from the storage LBA specified by the write command acquired in process S1301, identifies the storage devices 31 to 33 that are the storage destinations of the new data 0 to 2 and the respective device LBAs.
- The storage device 34 and the device LBA that are the storage destination of the parity 514 of stripe 0, to which the update target logical blocks belong, are also identified.
- the processor 210 creates a parity write command on the memory 220.
- the parity write command here is a command for instructing to calculate a new parity from the new data 0 to 2 to be transferred.
- the parity write command includes the device LBA to which the new parity 0 is written, the storage location information of the new data 0 to 2 in the memory 220, and the information for specifying the storage device that is the storage destination of each of the new data 0 to 2.
- the device controller 340 acquires new data 0 to 2 from the storage controller 200.
- In process S1306, the device controller 340 generates the new parity 0 based on the new data 0 to 2. Specifically, in the device controller 340, the processor 341 instructs the parity operation unit 344 to read the new data 0 to 2 stored in the data buffer 343, perform the parity operation, and store the result in the data buffer 343 as the new parity 0.
- As in the first embodiment, the processor 341 records the storage location of the new parity 0 in the update information 613 of the address translation table 610 and manages both the new parity 0 and the old parity 0; detailed description is omitted.
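- The parity operation unit's computation in S1306 reduces to an XOR across the new data blocks. A sketch, assuming equal-sized blocks (the function name is illustrative):

```python
from functools import reduce

def full_stripe_parity(new_blocks: list[bytes]) -> bytes:
    """New parity 0 = XOR of the new data 0 to 2 held in the data buffer."""
    return reduce(lambda acc, blk: bytes(a ^ b for a, b in zip(acc, blk)),
                  new_blocks)
```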
- Next, the device controller 340 creates a write command for each of the storage devices 31 to 33 and transfers the created write commands. The writing of the new data 0 to 2 to the storage devices 31 to 33 proceeds in the same way, from the write command transfer in process S1308 to the completion response to the write command in process S1311.
- In the following, processes S1308 to S1311 are described taking the writing of the new data 0 to the storage device 31 as an example.
- the processor 341 of the device controller 340 creates a write command in the data buffer 343.
- In the write command, the processor 341 includes the storage location of the new data 0 in the data buffer 343 and the storage destination device LBA of the new data 0, identified based on the RAID management information 810 and on the storage destination device information for the new data 0 included in the parity write command. The processor 341 then instructs the I/O interface 345 to notify the storage device 31 that the write command has been generated. In the storage device 31 that received the notification, the processor 311 instructs the I/O interface 315 to transfer the write command in the data buffer 343 to the data buffer 313.
- the device controller 310 acquires new data 0 from the device controller 340.
- As in the first embodiment, the processor 311 records the storage location of the new data 0 in the update information 613 of the address translation table 610 and manages both the new data 0 and the old data 0; detailed description is omitted.
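- A sketch of the address-translation bookkeeping that keeps both generations readable until commit (the structure is assumed from the description of table 610 and update information 613):

```python
class AddressTranslationTable:
    """Maps a device LBA to a current PBA plus a pending (new) PBA."""
    def __init__(self):
        self.current = {}  # device LBA -> PBA of old data (still readable)
        self.pending = {}  # device LBA -> PBA of new data (update information)

    def stage(self, lba: int, new_pba: int):
        self.pending[lba] = new_pba  # old data stays readable

    def commit(self, lba: int):
        # Commit command received: switch to the new data, discard the old.
        self.current[lba] = self.pending.pop(lba)

    def rollback(self, lba: int):
        # Timeout recovery: the old PBA was never overwritten, so the old
        # data can still be read and the parity update can be redone.
        self.pending.pop(lba, None)
```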
- the device controller 310 returns a write command completion response to the device controller 340.
- When the device controller 340 has received a completion response to the write command from each of the storage devices 31 to 33, it returns a completion response to the parity write command to the storage controller 200.
- Processes S1313 to S1318 are the same as processes S716 to S723 of the first embodiment, and a description thereof is omitted. Since the fourth embodiment targets sequential writes, the storage controller 200 sends a commit command to each of the storage devices 31 to 34 in the RAID group in S1313 and, after receiving a completion response from each of the storage devices 31 to 34, releases the lock in S1315.
- According to the fourth embodiment, the storage controller does not need to generate the parity, so the load on the storage controller is reduced. Furthermore, efficient parity generation is possible even when the storage controller has no parity generation function, and write processing is sped up.
- the operation of the fourth embodiment has been described above.
- In FIGS. 19 and 20 (FIGS. 20-1 and 20-2), an example was described in which the storage controller transfers the new data to the device that stores the old parity, and the device storing the old parity generates the new parity and then instructs the writing of data to the other storage devices belonging to the same RAID stripe. However, the storage controller may instead transfer the new data to the device storing the old data, and the device storing the old data may instruct the writing of data to the other storage devices belonging to the same RAID stripe.
- Although the fourth embodiment targets sequential writes generated in the host computer, sequential writes generated in a storage device, as in the second embodiment, may also be targeted.
- In that case, the storage device in which the sequential write occurred performs the parity operation and updates the parity by instructing each storage device holding a logical block to be written to update its data.
- the operation of the storage apparatus shown in the system configuration 1 has been described.
- the present invention is not limited to this. Any configuration that includes a plurality of storage devices and a higher-level device that manages them may be used.
- the present invention may be applied to the system configuration 2.
- Example 5 shows the parity update processing during the rebuild of a storage device.
- Here, it is assumed that the device controller 310 has an XOR operation function and can execute data restoration processing, and that the storage device 32 has failed and been replaced with a new storage device 35.
- the storage controller 200 instructs the storage device 35 to execute rebuilding.
- the device controller 310 of the storage device 35 restores the data in order from the top of the device LBA.
- Specifically, the device controller 310 of the storage device 35 instructs the other storage devices 31, 33, and 34 in the RAID group to transfer the data and parity belonging to stripe 0.
- The device controller 310 of the storage device 35 executes the XOR operation on the transferred Data0, Data2, and Parity0 to restore Data1. By restoring Data5, Parity2, and so on in the same sequential manner, all the data that was stored in the storage device 32 is restored. After all the data has been restored, the device controller 310 of the storage device 35 sends a response indicating that the rebuild is complete to the storage controller 200.
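- The restoration step is the usual RAID 5 identity: the lost block is the XOR of the surviving blocks of the stripe. A sketch (the helper name is illustrative; block sizes are assumed uniform):

```python
def restore_block(survivors: list[bytes]) -> bytes:
    """Lost Data1 = Data0 XOR Data2 XOR Parity0 for stripe 0."""
    acc = bytearray(survivors[0])
    for blk in survivors[1:]:
        for i, b in enumerate(blk):
            acc[i] ^= b
    return bytes(acc)
```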
- When the storage controller 200 receives a write command instructing a data update to the storage device 35 while it is being rebuilt, an inconsistency occurs if the area specified by the write command has not yet been restored and the write data is simply written and the parity updated. Data consistency can therefore be maintained by one of the following two processes.
- FIG. 21 shows a flowchart of data update processing to the storage device during rebuilding by the storage controller.
- FIG. 22 shows a flowchart of the storage device during rebuilding.
- the storage controller 200 receives a write command from the host computer 10.
- The storage controller 200 that has received the write command identifies the storage device that is the data update target and determines whether that storage device is being rebuilt; here it is assumed to be the storage device 35. If the storage device 35 is being rebuilt (S2102: YES), the process proceeds to S2103.
- the storage controller 200 instructs the storage device 35 to secure the lock of the device LBA to be updated.
- the device controller 310 of the storage device 35 receives a lock securing instruction.
- the device controller 310 determines whether data restoration of the instructed device LBA has been completed.
- If the data restoration has been completed, in process S2203 the device controller 310 responds to the storage controller 200 with "data restored"; normal write processing can be executed once the data has been restored. If the data restoration has not been completed, in process S2204 the device controller 310 determines whether the instructed device LBA is currently being restored.
- If it is being restored, the device controller 310 responds "lock not possible" to the storage controller 200 in process S2205. An inconsistency would occur if the data were updated during the restoration processing, so the controller responds that the lock cannot be secured in order to avoid this.
- Otherwise, the device controller 310 secures the lock on the designated device LBA.
- While the lock is secured, the device controller 310 does not restore data for that device LBA.
- The device controller 310 then responds "lock secured" to the storage controller 200.
- the storage controller 200 receives a response to the lock securing instruction from the storage device 35.
- the storage controller 200 that has received the response confirms the content of the response and determines the next process to be executed accordingly.
- If the content of the response is "lock not possible", the storage controller 200 executes process S2103 again. This avoids inconsistencies caused by updating data during the restoration processing.
- If the content of the response is "lock secured", the write processing described in the first embodiment is executed for the storage device 35 in process S2106. Since the storage device 35 does not execute data restoration processing while the lock is secured, no data inconsistency occurs even if the write processing is executed.
- the storage controller 200 instructs the storage device 35 to release the lock in process S2107. The storage device 35 that has received the lock release instruction releases the lock of the locked device LBA.
- In the second process, the storage controller 200 instructs the storage device 35 to preferentially restore the data of the device LBA that is the target of the write processing.
- the device controller 310 of the storage device 35 executes data restoration processing for the instructed device LBA.
- the device controller 310 records the device LBA for which the data restoration is completed in the memory 312 and notifies the storage controller 200 of the completion.
- The storage controller that has received this completion notification executes the write processing described in the first embodiment. With this process as well, data and parity can be updated while maintaining data consistency during the rebuild.
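- Pulling the two flowcharts together, the device-side response to a lock request during rebuild is a three-way branch. A sketch mirroring S2202 to S2207 (the status strings and signature are illustrative):

```python
def respond_to_lock_request(lba: int, restored: set[int],
                            in_restore: set[int], locked: set[int]) -> str:
    """Device controller 310's reply to a lock request from the controller."""
    if lba in restored:
        return "data restored"      # normal write processing may proceed
    if lba in in_restore:
        return "lock not possible"  # updating mid-restoration would corrupt
    locked.add(lba)                 # restoration of this LBA is deferred
    return "lock secured"
```

- On "lock not possible" the controller simply re-issues the request (S2103); on "lock secured" it runs the Example 1 write processing and then releases the lock.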
- 10: Host computer / 20: Storage apparatus / 200: Storage controller / 31-34: Storage devices / 80: Server
Description
Furthermore, because the data of all the devices constituting the RAID group is transferred via the storage controller, the data transfer load concentrates on the bus inside the storage controller. Consequently, even if the number of storage devices is increased, the bus becomes a bottleneck and no performance improvement is achieved.
The memory 220 stores management information for the entire storage apparatus 20 (for example, the RAID management information 810 and the lock management information 910) and temporarily stores read and write commands from the host computer 10 and the data targeted by those commands.
As described above, in the parity update processing for random writes, it becomes unnecessary for the storage controller to acquire the old data, old parity, and intermediate parity from the storage devices when the parity is updated. The parity generation processing by the storage controller itself is also unnecessary. This reduces the I/O processing load and data transfer load of the storage controller, eliminates the concentration of the data transfer load on the bus inside the storage controller, prevents the storage controller from becoming a performance bottleneck, allows the performance of high-speed storage devices to be exploited, and speeds up write processing.
As an example, the operation when a write command that updates the old data 0 to the new data 0 in the storage device 31 is transferred to the storage apparatus 20 is described. As shown in FIG. 17, the old data 0 before the update is stored in each of the logical storage spaces 61 and 62 of the storage devices 31 and 32, and the data in both storage devices is updated as the write proceeds.
20: Storage apparatus
200: Storage controller
31-34: Storage devices
80: Server
Claims (24)
1. An information processing system comprising a plurality of storage devices that constitute a RAID group, are connected to one bus, and communicate with one another, wherein each of the plurality of storage devices has a device controller and a storage medium that stores data; the plurality of storage devices include a first storage device that stores old data and a second storage device that stores an old parity corresponding to the old data; the first device controller of the first storage device generates an intermediate parity based on the old data and on new data that updates the old data, designates the second storage device storing the old parity corresponding to the old data, and transmits the intermediate parity to the second storage device; and the second device controller of the second storage device generates a new parity based on the intermediate parity and the old parity.
2. The information processing system according to claim 1, wherein the device controller of each of the plurality of storage devices receives RAID management information, and the RAID management information includes the RAID level of the RAID group; the number of data and the number of parities included in a stripe that contains a plurality of data and a parity generated from the plurality of data; the size of each piece of data and parity included in the stripe; and device LBA (Logical Block Address) information that uniquely identifies the storage area of each of the plurality of storage devices.
3. The information processing system according to claim 2, wherein the first device controller identifies the second storage device storing the old parity based on the RAID management information, designates the device LBA corresponding to the old parity in the second storage device, and transmits an instruction to generate the new parity based on the intermediate parity and the old parity.
4. The information processing system according to claim 3, wherein the first device controller, after receiving the new data, maintains the old data and the new data in a readable state.
5. The information processing system according to claim 4, wherein the first device controller discards the old data after receiving a commit command notifying completion of the new parity update.
6. The information processing system according to claim 3, wherein the second device controller, after generating the new parity, maintains the old parity and the new parity in a readable state.
7. The information processing system according to claim 4, wherein the second device controller discards the old parity after receiving a commit command notifying completion of the new parity update.
8. The information processing system according to claim 1, wherein the second device controller, upon receiving a plurality of new data that respectively update a plurality of old data, generates a new parity based on the plurality of new data and transfers each of the plurality of new data to a respective one of the other storage devices included in the RAID group.
9. A storage apparatus comprising: a plurality of storage devices, each having a device controller and a storage medium that stores data, that communicate with one another; and a storage controller that is connected to the plurality of storage devices via a bus and controls the plurality of storage devices as a RAID group, wherein the plurality of storage devices include a first storage device that stores old data and a second storage device that stores an old parity corresponding to the old data; the storage controller transmits new data that updates the old data to the first storage device; the first device controller of the first storage device generates an intermediate parity based on the old data and the new data, designates the second storage device storing the old parity corresponding to the old data, and transmits the intermediate parity to the second storage device; and the second device controller of the second storage device generates a new parity based on the intermediate parity and the old parity.
10. The storage apparatus according to claim 9, wherein the storage controller transmits to the first storage device a parity write command instructing an update to the new parity corresponding to the new data.
11. The storage apparatus according to claim 10, wherein the first storage device, upon receiving the parity write command, transmits to the second storage device a parity update command instructing generation of the new parity based on the intermediate parity and the old parity, and, upon receiving a completion response to the parity update command from the second storage device, transmits a completion response to the parity write command to the storage controller.
12. The storage apparatus according to claim 11, wherein the storage controller, upon receiving the completion response to the parity write command from the first storage device, transmits to the first storage device and the second storage device a commit command notifying completion of the new parity update.
13. The storage apparatus according to claim 12, wherein the first device controller notifies the storage controller that a timeout has occurred if it does not receive a completion response from the second storage device within a predetermined time after transmitting the parity update command to the second storage device.
14. The storage apparatus according to claim 13, wherein the storage controller is connected to a management computer having an output screen and, upon receiving the notification that the timeout has occurred, causes the output screen of the management computer to display the occurrence status of the timeout between the first storage device and the second storage device.
15. A storage device having a device controller and a storage medium that stores old data, wherein the device controller is connected so as to communicate with a plurality of other storage devices and, when the storage device and the plurality of other storage devices are controlled as a RAID group, generates an intermediate parity based on the old data and on new data that updates the old data, designates a specific storage device among the plurality of other storage devices that stores an old parity corresponding to the old data, transfers the intermediate parity to the specific storage device, and transmits to the specific storage device an instruction to generate a new parity based on the intermediate parity and the old parity.
16. The storage device according to claim 15, wherein the device controller receives RAID management information, and the RAID management information includes the RAID level of the RAID group; the number of data and the number of parities included in a stripe that contains a plurality of data and a parity generated from the plurality of data; the size of each piece of data and parity included in the stripe; and device LBA (Logical Block Address) information that uniquely identifies the storage area of each of the plurality of storage devices.
17. The storage device according to claim 16, wherein the device controller identifies the specific storage device storing the old parity based on the RAID management information, designates the device LBA corresponding to the old parity in the specific storage device, and transmits an instruction to generate the new parity based on the intermediate parity and the old parity.
18. The storage device according to claim 17, wherein the device controller, after receiving the new data, maintains the old data and the new data in a readable state.
19. The storage device according to claim 18, wherein the device controller discards the old data after receiving a commit command notifying completion of the new parity update.
20. An information processing system comprising a plurality of storage devices that constitute a RAID group, are connected to one bus, and communicate with one another, wherein each of the plurality of storage devices has a device controller and a storage medium that stores data; the plurality of storage devices include a first storage device that stores first old data and a second storage device that stores second old data that corresponds to, and is identical to, the first old data; and the first device controller of the first storage device, when the first old data is updated to first new data, designates the second storage device storing the second old data corresponding to the data and transmits the first new data to the second storage device.
21. The information processing system according to claim 20, wherein the device controller of each of the plurality of storage devices receives RAID management information, and the RAID management information includes the RAID level of the RAID group, the number of data included in a stripe containing a plurality of data, the size of each piece of data included in the stripe, and device LBA (Logical Block Address) information that uniquely identifies the storage area of each of the plurality of storage devices.
22. The information processing system according to claim 21, wherein the first device controller identifies the second storage device storing the second old data based on the RAID management information, designates the device LBA corresponding to the second old data in the second storage device, and transmits the first new data.
23. The information processing system according to claim 22, wherein the first device controller, after receiving the first new data, maintains the first old data and the first new data in a readable state.
24. The information processing system according to claim 23, wherein the first device controller discards the first old data after receiving a commit command notifying completion of the storage of the first new data in the second storage device.