JPH07261945A - Disk array device and disk array dividing method

Info

Publication number: JPH07261945A
Application number: JP6072655A
Authority: JP
Inventors: Hitoshi Tsunoda; Yoshifumi Takamoto; Yoshihisa Kamo
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1994-03-17
Filing date: 1994-03-17
Publication date: 1995-10-13

Abstract

PURPOSE: To lower the probability of data loss and to improve performance under failure by recovering the data of a failed drive in ascending order of parity group level. CONSTITUTION: A logical group 10 contains a mixture of a parity group that forms level 5 from five data items and one parity (5D+1P), a parity group that forms level 5 from three data items and one parity (3D+1P), and a parity group formed by duplication. These parity groups are ranked into parity group levels, a higher level meaning better reliability and performance. When, for instance, a first drive 12 fails and the data stored on it is to be recovered, the parity groups are processed in ascending order of level, that is, starting from the groups with the highest probability of data loss should a second drive fail.

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to a disk file system suitable for use in computer systems and the like, and more particularly to a disk array device and a disk array partitioning method that enable high-performance input/output operation.

[0002]

[Prior Art] In current computer systems, data required by the host side, such as a CPU, is stored in a secondary storage device, and data is written to and read from the secondary storage device as the CPU or the like requires. A nonvolatile storage medium is generally used for this secondary storage device; typical examples are magnetic disk devices (hereinafter called drives) and optical disks.

[0003] In recent years, with the advance of the information society, higher performance has been demanded of this kind of secondary storage device in computer systems. One proposed solution is a disk array built from a large number of relatively small-capacity drives.

[0004] A known reference is "D. Patterson, G. Gibson, and R. H. Katz; A Case for Redundant Arrays of Inexpensive Disks (RAID), in ACM SIGMOD Conference, Chicago, IL, (June 1988)". This paper reports a study of the performance and reliability of disk arrays that duplicate data (level 1), disk arrays that divide data and process it in parallel (level 3), and disk arrays that distribute data and handle it independently (levels 4 and 5). The schemes described in this paper are currently regarded as the most common disk array configurations.

[0005] First, the level 3 disk array is briefly described. In a level 3 disk array, a single piece of data supplied by the host is divided, and the pieces are distributed across a plurality of drives for storage. To read that data, the divided pieces are collected from the drives, reassembled, and transferred to the host. Level 3 therefore allows parallel processing across multiple drives and can raise the transfer rate.

[0006] Next, the disk array that distributes individual data items without dividing them and handles them independently (level 5) is described. Level 4 is the variant in which the parity, which in level 5 is distributed over the drives making up the disk array, is instead stored on a single drive dedicated to parity.

[0007] In level 4 and level 5 disk arrays, individual data items are handled independently without being divided, and are distributed over a large number of relatively small-capacity drives. In the secondary storage devices of the general-purpose mainframe computer systems in common use today, the capacity per drive is large, so a request frequently had to wait because the drive it needed was busy serving another read/write request. In a level 5 (or level 4) disk array, the large-capacity drive used in such secondary storage is replaced by many relatively small-capacity drives across which the data is distributed. Even if read/write requests increase, they can be spread over the multiple drives of the disk array, so requests are kept waiting less often.

[0008] Next, parity in a disk array is described. Because a disk array replaces a conventional large-capacity drive with many relatively small-capacity drives, the number of components increases and so does the probability of a failure. For this reason, a disk array provides parity.

[0009] FIG. 5 shows how parity is created in level 5 (the same applies to level 4), and FIG. 3 shows how this data and parity are stored in a conventional level 5 disk array.

[0010] As shown in FIG. 5, the parity (Parity) is created by taking the exclusive OR of the corresponding bits of data #1 (Data#1) through data #5 (Data#5). The parity created in this way is stored on a drive other than the drives holding the data that contributed to it.
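
The parity generation described above can be sketched as follows. This is a minimal illustration, assuming each data item is an equal-length byte string; the function name `make_parity` and the sample byte values are hypothetical and do not appear in the specification.

```python
from functools import reduce

def make_parity(blocks):
    """XOR equal-length data blocks bit by bit to form one parity block."""
    if not blocks or len({len(b) for b in blocks}) != 1:
        raise ValueError("blocks must be non-empty and equal in length")
    # zip(*blocks) walks the blocks column by column (one byte from each block).
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Five illustrative data items standing in for Data#1 to Data#5.
data = [bytes([0x0F, 0xA0]), bytes([0xF0, 0x0A]), bytes([0x33, 0x55]),
        bytes([0xCC, 0xAA]), bytes([0x01, 0x10])]
parity = make_parity(data)
```

XOR-ing the parity together with all five data items yields an all-zero block, which is the property the recovery procedure relies on.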

[0011] Specifically, as shown in FIG. 3, the mutually independent data #1 through data #5 are each stored on one of drives #1 through #6, and the parity created from them is stored on a drive other than those holding that data.

[0012] Since FIG. 3 shows a level 5 disk array, the parity is distributed over the six drives #1 to #6. In level 4, parity is stored on a single drive, so all parity would be stored together on drive #6, for example.
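
The difference between the two parity placements can be sketched as below. The rotation rule used for level 5 is an assumption made only for illustration (the specification states that parity is distributed over the six drives but does not give the rotation), and the function name `parity_drive` is hypothetical.

```python
def parity_drive(stripe, n_drives, level):
    """Return the 1-based number of the drive holding the parity of a stripe."""
    if level == 4:
        # Level 4: one dedicated parity drive, e.g. drive #6 of six.
        return n_drives
    if level == 5:
        # Assumed rotation: parity moves one drive to the left per stripe.
        return n_drives - (stripe % n_drives)
    raise ValueError("only levels 4 and 5 are modelled")

level5 = [parity_drive(s, 6, 5) for s in range(6)]
level4 = [parity_drive(s, 6, 4) for s in range(6)]
```

Under this assumed rule every drive holds the parity of exactly one stripe in six for level 5, whereas level 4 sends every parity update to the same drive.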

[0013] In FIG. 3, drives #1 through #6 making up the disk array are called a logical group. Within a logical group, the set consisting of the data from which a parity is created (data #1 to #5 in FIG. 5) together with the parity created from that data is called a parity group. In FIG. 3, the logical group consists of six drives #1 to #6, and every parity group in this logical group consists of five data items and one parity.

[0014] If any one of drives #1 through #6 making up the logical group fails, each data item on the failed drive can be restored using the data on the normal drives. That is, for each parity group to which a data item on the failed drive belongs, taking the exclusive OR of the data and parity on the normal drives restores that data item.
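
The restoration step above can be sketched as follows, assuming one parity per parity group and equal-length byte strings. The helper `xor_blocks`, the sample values, and the choice of which drive fails are all hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bitwise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# One parity group: five data items and the parity created from them.
data = [b"\x11\x22", b"\x33\x44", b"\x55\x66", b"\x77\x88", b"\x99\xaa"]
parity = xor_blocks(data)

# Suppose the drive holding the third data item fails.
failed = 2
survivors = [blk for k, blk in enumerate(data) if k != failed] + [parity]

# XOR of the surviving data and the parity restores the lost item.
rebuilt = xor_blocks(survivors)
```

The same call restores a lost parity block if the failed drive happened to hold the parity of that group.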

[0015] Meanwhile, Japanese Patent Laid-Open No. 5-257611 discloses a method of dividing a logical group into a plurality of areas (partitions) and setting a different RAID level for each of the areas. In this method, when a logical group consists of six drives, for example, each area either spans the six drives or each drive is divided into six areas. Logical units are then built from such areas, so that a plurality of logical units are configured on a single disk drive set.

[0016]

[Problems to be Solved by the Invention] Conventional disk arrays have the following two problems.

[0017] The first problem concerns availability. In a disk array, even if any one drive in a logical group fails, the data can be recovered using parity, so no data is lost. However, if any of the remaining drives in the same logical group fails before the first failed drive has been recovered, recovery by parity is impossible and data is lost.

[0018] Consequently, the larger the number of drives in the logical group (the number of data items in a parity group), the higher the probability of a second failure and hence of data loss. To avoid such data loss, when a drive fails, whatever processing is in progress must be suspended, however important it is, and the data recovered as quickly as possible.
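
The dependence of the second-failure probability on group size can be illustrated with a simple model. The model is an assumption introduced here, not taken from the specification: each surviving drive is taken to fail independently with the same probability during the recovery window. The function name and the value 0.01 are hypothetical.

```python
def second_failure_probability(drives_in_group, p_fail_during_recovery):
    """Probability that at least one surviving drive fails before recovery ends.

    Assumes independent failures with an identical per-drive probability.
    """
    remaining = drives_in_group - 1
    return 1.0 - (1.0 - p_fail_during_recovery) ** remaining

p6 = second_failure_probability(6, 0.01)  # 5D+1P: five surviving drives
p4 = second_failure_probability(4, 0.01)  # 3D+1P: three surviving drives
p2 = second_failure_probability(2, 0.01)  # duplication: one surviving drive
```

Under this model the probability grows with the number of surviving drives that must all stay healthy, which matches the ordering stated in the text.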

[0019] Japanese Patent Application No. 3-94728 discloses a method in which two parities are provided per parity group and the way recovery is performed differs depending on whether one drive or two drives have failed. When the first drive in a logical group fails, no data is lost even if one more drive fails, so recovery is performed at a relatively relaxed pace; when two drives have failed, the failure of one more drive would cause data loss, so recovery is performed urgently.

[0020] With this method, a single drive failure does not require urgent recovery, so availability improves. However, two parities must be provided uniformly for every parity group in the logical group, so one more drive's worth of capacity is consumed by parity, which raises cost.
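
The capacity cost of the second parity can be made concrete with a short calculation. The 5D+1P and 5D+2P layouts are taken as examples; the function name `parity_overhead` is hypothetical.

```python
from fractions import Fraction

def parity_overhead(data_items, parity_items):
    """Fraction of a parity group's capacity that holds parity rather than data."""
    return Fraction(parity_items, data_items + parity_items)

single = parity_overhead(5, 1)  # five data items, one parity
double = parity_overhead(5, 2)  # five data items, two parities
```

For five data items, the parity share of the stored capacity rises from one sixth to two sevenths when a second parity is added to every parity group.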

[0021] File servers, a promising future application for disk arrays, are growing in scale, and a single file server is coming to be used by many users. In such an environment the file server must be highly available, so it becomes difficult to stop the disk array for a long time to recover data as soon as a drive fails. Moreover, for a file server, the cost paid for such availability must be kept as low as possible.

[0022] The second problem concerns performance degradation under failure. As described above, when one parity is provided per parity group, the data on a single failed drive in the logical group can be recovered, parity group by parity group, from the data and parity on the remaining normal drives. Using this capability, if any drive in a logical group has failed and the CPU issues a read or write request for data on that failed drive, the request can still be accepted by means of the recovery function.

[0023] Specifically, when the CPU issues a read request for data on the failed drive, that data is restored from all the data and parity on the normal drives in the logical group and transferred to the CPU. When the CPU issues a write request to the failed drive, then, as with a read, a parity is created from all the data on the normal drives in the parity group and the write data, and the parity is updated by storing it on the drive that holds the parity.
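
The write path for a failed drive can be sketched as follows, assuming one parity per parity group. The names `xor_blocks` and `degraded_write_parity` and the sample byte values are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bitwise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def degraded_write_parity(surviving_data, new_data):
    """Parity to store when the write targets the failed drive.

    The new data cannot be written to its drive, so it is folded into the
    parity together with the data read from the normal drives.
    """
    return xor_blocks(list(surviving_data) + [new_data])

surviving = [b"\x01\x02", b"\x04\x08", b"\x10\x20", b"\x40\x80"]
new_data = b"\xaa\x55"
new_parity = degraded_write_parity(surviving, new_data)

# A later read of the failed drive restores exactly the data just written.
read_back = xor_blocks(surviving + [new_parity])
```

Both the write and the later read touch every normal drive of the parity group, which is the source of the performance loss discussed next.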

[0024] Thus, if a read or write request is issued to a failed drive before recovery has been performed, read requests are issued to all the drives in the logical group, and performance falls sharply. In particular, the more drives in the logical group (the more data items in a parity group), the more read requests are issued and the larger the performance degradation.
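
The growth in read requests can be counted directly. The sketch below assumes one redundancy item per parity group, so that restoring one block needs every surviving member of the group; the function name `degraded_read_count` is hypothetical.

```python
def degraded_read_count(data_items):
    """Drive reads needed to restore one block of a failed drive.

    With one redundancy item per group, the surviving data items
    (data_items - 1) and the single parity must all be read.
    """
    return (data_items - 1) + 1

reads = {name: degraded_read_count(d)
         for name, d in [("5D+1P", 5), ("3D+1P", 3), ("duplication", 1)]}
```

A 5D+1P group therefore costs five reads per degraded access, a 3D+1P group three, and a duplicated pair only one.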

[0025] On the other hand, according to the method of the above-mentioned Japanese Patent Laid-Open No. 5-257611, a plurality of logical units of different RAID levels can be configured on the drives making up a logical group.

[0026] However, that method has the problem that parity groups of various configurations, such as a parity group of five data items and one parity, a parity group of three data items and one parity, and a parity group of five data items and two parities, cannot be mixed within one logical group. In addition, since failure recovery is performed per logical group, that publication does not show failure recovery tailored to the characteristics of each RAID level (probability of data loss, performance under failure) for the multiple logical units of various RAID levels set in the logical group.

[0027] An object of the present invention is to improve the disk array. Another object of the present invention is to provide a disk array device and a disk array partitioning method in which parity groups of various configurations can be mixed within one logical group. A further object of the present invention is to provide a disk array device and a disk array partitioning method that perform failure recovery in a distributed manner according to characteristics that differ with the RAID level and the parity group configuration, such as the probability of data loss and the degree of performance degradation under failure.

[0028]

[Means for Solving the Problems] The present invention provides a disk array device comprising a disk device that includes a plurality of drives forming a logical group and a control device that manages the disk device, characterized in that an area for storing a first parity group, composed of i data items (i being an integer, i ≥ 1) and j error-correction data items (j being an integer, j ≥ 1) created from that data, and an area for storing a second parity group whose configuration differs from that of the first parity group are mixed within the logical group. The parity groups include the duplicated parity group (i = j = 1).

[0029] The present invention also provides a disk array device comprising a disk device that includes n drives forming a logical group and a control device that manages the disk device, characterized in that each drive in the logical group is divided into partitions, any number of arbitrary partitions on m arbitrary drives (m being n or fewer) are selected, a partition group is set from the selected partitions, and the logical group is composed of a plurality of partition groups whose values of m differ from one another.
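
One way to picture this arrangement is the sketch below, which models a partition as a (drive number, partition number) pair. The class `PartitionGroup`, the helper names, and the example layout are hypothetical, and the no-overlap check is an added assumption rather than a stated requirement.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PartitionGroup:
    name: str
    partitions: frozenset  # of (drive_number, partition_number) pairs

    @property
    def m(self):
        """Number of distinct drives the group spans."""
        return len({drive for drive, _ in self.partitions})

def make_group(name, drives, partition_numbers):
    slots = frozenset((d, p) for d in drives for p in partition_numbers)
    return PartitionGroup(name, slots)

def is_valid_logical_group(groups, n):
    """Check m <= n for every group, mutually different m, and no shared partition."""
    used = [slot for g in groups for slot in g.partitions]
    no_overlap = len(used) == len(set(used))
    widths = {g.m for g in groups}
    distinct_m = len(groups) >= 2 and len(widths) == len(groups)
    return no_overlap and distinct_m and all(g.m <= n for g in groups)

wide = make_group("six-drive", range(1, 7), [0])    # m = 6
narrow = make_group("four-drive", range(1, 5), [1])  # m = 4
pair = make_group("two-drive", [5, 6], [1])          # m = 2
ok = is_valid_logical_group([wide, narrow, pair], n=6)
```

In this layout the six-drive group could hold 5D+1P parity groups, the four-drive group 3D+1P parity groups, and the two-drive group duplicated data.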

[0030] Furthermore, the present invention provides a disk array device comprising a disk device that stores, or is to store, the data addressed by data input/output requests from a host device, and a control device that manages the disk device, characterized in that the disk device is composed of a large number of drives; these drives are divided into logical groups of n drives each (n being two or more); each drive in each logical group is divided into partitions; any number of arbitrary partitions on m arbitrary drives (m being n or fewer) are selected; a partition group is set from the selected partitions; and the logical group is composed of a plurality of partition groups whose values of m differ from one another.

[0031]

[Operation] As described above, in the present invention the parity group level is variable among the parity groups making up a logical group, and the logical group is built from several parity group levels. That is, multiple kinds of parity group can be mixed within a logical group. Each parity group has its own characteristics with respect to reliability and performance. The probability of data loss when a second drive fails differs with the parity group configuration, and parity groups are classified into parity group levels accordingly. The same holds for performance degradation under failure.

[0032] Accordingly, when the first drive fails, recovery is performed sequentially starting, for example, from the parity groups with the lowest parity group level (those with the highest probability of data loss should a second failure occur).
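
The recovery ordering can be sketched as a simple sort. The dictionary layout, the numeric level values (1 being the lowest), and the function name `recovery_order` are assumptions made for illustration.

```python
def recovery_order(parity_groups):
    """Sort damaged parity groups so that the lowest level is rebuilt first."""
    return sorted(parity_groups, key=lambda g: g["level"])

# Parity groups that had a member on the failed drive, in arbitrary order.
damaged = [
    {"name": "duplication", "level": 3},
    {"name": "5D+1P", "level": 1},
    {"name": "3D+1P", "level": 2},
]
order = [g["name"] for g in recovery_order(damaged)]
```

The groups most exposed to a second failure are rebuilt first, while the groups with the most redundancy per data item wait the longest.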

[0033] By thus making the parity group level variable among the parity groups making up the logical volume, building the logical group from several parity group levels, and performing recovery in ascending order of parity group level, reliability is secured and recovery can be performed efficiently, while performance degradation under failure is also suppressed.

[0034]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings.

[0035] [Embodiment 1] FIG. 1 shows the overall configuration of a disk array device according to the first embodiment of the present invention.

[0036] The disk array of this embodiment broadly consists of a disk array controller (hereinafter called ADC) 2, which performs RAID level 5 control, and a logical group 10. CPU 1 is the host device that issues read and write commands to the disk array.

[0037] The logical group 10 consists of m drives 12 and drive unit paths 11-1 and 11-2 connecting each drive 12 to the ADC 2. There is no particular restriction on the number of drives 12 for obtaining the effects of the present invention.

[0038] Within the logical group 10, the set consisting of the data from which a parity is created together with the parity created from that data is a parity group. Conventionally, the logical group 10 was the unit of recovery from drive failure. In this embodiment, parity groups of different configurations are mixed within the logical group 10, and failure recovery is performed according to the characteristics of each parity group. This is described in detail later.

[0039] Next, the internal structure of the ADC 2 is described with reference to FIG. 1.

[0040] The ADC 2 broadly consists of a channel path director 3, channel paths 8-1 and 8-2, a cache memory 16, and drive paths 9-1 and 9-2. The paths are divided broadly into two clusters 7-1 and 7-2.

[0041] The channel paths 8-1 and 8-2 are paths between the channel path director 3 and the cache memory 16. The cache memory 16 is a semiconductor memory made nonvolatile by battery backup or the like. The drive paths 9-1 and 9-2 are paths between the cache memory 16 and the drives 12. The channel paths 8-1 and 8-2 and the drive paths 9-1 and 9-2 are connected to each other through the cache memory 16.

[0042] The cache memory 16 stores data and an address table, and also serves as a work area when parity is created. The cache memory 16 and the address table in it are shared by all the clusters 7 in the ADC 2. The address table is described in detail later with reference to FIG. 4.

[0043] When the system is powered on, the microprocessor (MP) 20 in the channel path 8 automatically reads the address table into the cache memory 16 from one or more specific drives 12 in the logical group 10, without the involvement of CPU 1. Conversely, when the power is turned off, the MP 20 automatically stores the address table held in the cache memory 16 back to the predetermined location on the drive 12 from which it was read, again without the involvement of CPU 1.

[0044] A command issued by CPU 1 passes through the external interface path 4 and enters the channel path director 3 of the ADC 2. When CPU 1 issues a command, the channel path director 3 in the ADC 2 first determines whether the command can be accepted.

[0045] Specifically, as shown in FIG. 1, a command sent from CPU 1 to the ADC 2 is first taken in by the interface adapter (hereinafter called IF Adp) 5. Next, the MP 20 checks whether any of the external interface paths 4 is available. If an external interface path 4 is available, the MP 20 switches the channel path switch 6 and performs command acceptance processing. If the command cannot be accepted, a response indicating that it cannot be accepted is sent to CPU 1.

[0046] Once the command has been accepted in this way, the read processing or write processing (new write or update) described later is started. Before the read and write processing are described in detail, the method of setting parity group levels at initial setup and the address translation method are described.

[0047] (Method of setting parity group levels)

[0048] FIG. 2 shows the data layout within the logical group 10 in this embodiment. As shown in this figure, the logical group 10 in this embodiment consists of six drives 12, SD#1 through SD#6.

[0049] In FIG. 2, three kinds of parity group are mixed within the logical group 10: a parity group that forms level 5 from five data items and one parity (denoted 5D+1P), a parity group that forms level 5 from three data items and one parity (3D+1P), and a parity group formed by duplication. Across these three kinds, reliability rises in the order 5D+1P, 3D+1P, duplication, and performance degradation under failure shrinks in the same order. Parity groups are therefore classified so that the better the reliability and performance of a parity group, the higher its parity group level.
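
The classification above can be sketched as follows. The rule used here, ranking by the number of data items that share one redundancy item, is an assumption that reproduces the stated order 5D+1P, 3D+1P, duplication; the numeric level values and the function name `rank_parity_groups` are hypothetical.

```python
def rank_parity_groups(configs):
    """Map each (data_items, parity_items) layout to a level; 1 is the lowest.

    Assumption: with one redundancy item per group, fewer data items mean
    fewer drives whose second failure loses data, hence a higher level.
    """
    ordered = sorted(set(configs), key=lambda c: c[0], reverse=True)
    return {cfg: level for level, cfg in enumerate(ordered, start=1)}

# 5D+1P, 3D+1P, and duplication (one data item plus one copy).
levels = rank_parity_groups([(5, 1), (3, 1), (1, 1)])
```

A layout with more than one parity per group would need a different ranking rule, which this sketch does not attempt.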

[0050] To obtain the desired operating environment, the user specifies in advance, at initial setup, which levels of parity group to provide in the logical group 10 and how much of each. The user gives this setting to the ADC 2 as an instruction, which is received by the MP 20 in the ADC 2. In accordance with the instruction, the MP 20 creates the address table in the cache memory 16.

[0051] FIG. 4 shows the structure of the address table created in the cache memory 16. Shown here is the address table for the case where the three kinds of parity group 5D+1P, 3D+1P, and duplication are provided as in FIG. 2. As shown in FIG. 4 (a), (b), and (c), an address table is created for each parity group level.

[0052] The address table for each parity group level 21 consists of the following fields: data name 22, cache address 23, data drive number (DDrive No.) 24, failure flag 25, parity drive number (PDrive No.) 26, and in-SCSI Addr 27.

[0053] The data name 22 holds the logical address specified by CPU 1. Entries in which no data name 22 is registered represent free areas. If the data corresponding to the data name 22 is present in the cache memory 16, its address in the cache memory 16 is registered in the cache address 23. Conversely, if no cache address 23 is registered (the field is blank), the data of that data name 22 is not present in the cache memory 16.

[0054] The data drive number (DDrive No.) 24 holds the number of the drive 12 on which the data of the data name 22 is stored. The failure flag 25 is on (1) when the drive 12 storing this data has failed; if the drive is normal, the failure flag 25 is off (0). The parity drive number (PDrive No.) 26 holds the number of the drive 12 storing the parity to which the data of the data name 22 contributes within its parity group.

[0055] The in-SCSI Addr 27 holds the physical address, within the drive 12, at which the data of the data name 22 and the parity to which it contributes are stored. As shown in FIG. 17, the physical address indicated by the in-SCSI Addr 27 is expressed by the position of the cylinder containing the track on which the data is stored, the head address that selects that track within the cylinder, and the position of the record within the track.

【００５６】具体的には、要求データが格納されている
当該ドライブ１２の番号がデータドライブ番号（ＤＤｒ
ｉｖｅＮｏ．）２４に登録されており、当該ドライブ
１２内のシリンダ番号であるシリンダアドレスと、シリ
ンダにおいてトラックを選択するヘッドの番号であるヘ
ッドアドレスと、トラック内の当該レコードの位置を示
すレコードアドレスとが、ＳＣＳＩ内Ａｄｄｒ２７に登
録されている。Specifically, the number of the drive 12 in which the requested data is stored is registered in the data drive number (DDrive No.) 24, and the cylinder address (the cylinder number within the drive 12), the head address (the number of the head that selects the track within the cylinder), and the record address (the position of the record within the track) are registered in the SCSI Addr 27.
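As an illustration only, the physical address described above could be modeled as follows. This is a minimal Python sketch; the class and field names (`ScsiAddr`, `PhysicalLocation`, `cylinder`, `head`, `record`) and the numeric values are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScsiAddr:
    """Physical address inside one drive (SCSI Addr 27); names are illustrative."""
    cylinder: int  # cylinder address: cylinder number within the drive
    head: int      # head address: selects the track within the cylinder
    record: int    # record address: position of the record within the track


@dataclass(frozen=True)
class PhysicalLocation:
    drive_no: str   # data drive number (DDrive No. 24)
    addr: ScsiAddr  # address within that drive


# Hypothetical values, chosen only to show the shape of the data.
loc = PhysicalLocation(drive_no="SD#2", addr=ScsiAddr(cylinder=10, head=3, record=7))
```

Freezing both classes makes a location usable as a dictionary key, which may be convenient if a table is indexed by physical position.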

【００５７】本実施例では、論理グループ１０を構成す
る各ドライブ１２において、パリティグループを構成す
るデータと、これらのデータから作成されたパリティと
は、同一のＳＣＳＩ内Ａｄｄｒ２７の位置に格納され
る。In the present embodiment, in each drive 12 constituting the logical group 10, the data constituting the parity group and the parity created from these data are stored at the same SCSI Addr 27 position.

【００５８】具体的には、例えば図４（ａ）において、
データ名２２がＤａｔａ＃１からＤａｔａ＃５のデータ
に対してはＳＣＳＩ内Ａｄｄｒ２７がＤＡＤＲ１である
が、これは、ドライブＳＤ＃１に格納されているＤａｔ
ａ＃１と、ドライブＳＤ＃２に格納されているＤａｔａ
＃２と、ドライブＳＤ＃３に格納されているＤａｔａ＃
３と、ドライブＳＤ＃４に格納されているＤａｔａ＃４
と、ドライブＳＤ＃５に格納されているＤａｔａ＃５
と、ドライブＳＤ＃６に格納されているパリティとが、
それぞれのドライブのアドレスＤＡＤＲ１の位置に格納
されているということを示している。これらのデータの
集合が、パリティグループを構成する。Specifically, for example, in FIG. 4(a), the SCSI Addr 27 for the data whose data names 22 are Data#1 to Data#5 is DADR1. This indicates that Data#1 stored in drive SD#1, Data#2 stored in drive SD#2, Data#3 stored in drive SD#3, Data#4 stored in drive SD#4, Data#5 stored in drive SD#5, and the parity stored in drive SD#6 are each stored at address DADR1 of the respective drive. This set of data forms a parity group.

【００５９】なお、本実施例では、パリティグループを
構成するデータおよびパリティを同一のＳＣＳＩ内Ａｄ
ｄｒ２７に格納するようにしたが、ＳＣＳＩ内Ａｄｄｒ
２７がドライブごとに異なっていてもよい。その場合
は、ドライブ番号２４，２６に、ドライブ番号に加えて
各ドライブ内のＳＣＳＩ内Ａｄｄｒを格納するようにす
ればよい。In this embodiment, the data and parity forming a parity group are stored at the same SCSI Addr 27, but the SCSI Addr 27 may differ from drive to drive. In that case, the drive number fields 24 and 26 may hold, in addition to the drive number, the SCSI Addr within each drive.

【００６０】本実施例では、パリティグループの設定は
シリンダ単位とする。図１７に示すように、ドライブ１
２のディスクには同心円上にトラックが設定されてい
る。このトラック上にデータは記録される。各ディスク
に対応した読み書きヘッドは全て１個のアクチュエータ
に取り付けられており、このアクチュエータの移動にと
もない全ての読み書きヘッドは一様に移動する。アクチ
ュエータの一回の移動で位置付けられる各ディスク上の
トラックの集合をシリンダと言う。In this embodiment, parity groups are set in units of cylinders. As shown in FIG. 17, tracks are laid out concentrically on each disk of the drive 12, and data is recorded on these tracks. The read/write heads corresponding to the disks are all attached to a single actuator, and all of them move together as the actuator moves. The set of tracks on the disks that is positioned by one movement of the actuator is called a cylinder.

【００６１】各パリティグループレベルに対し、パリテ
ィグループを設定する場合は、図１８に示すように、こ
のシリンダの集合単位で設定する。図１８では、外周側
のシリンダの集合（斜線部分）にパリティグループレベ
ルが５Ｄ＋１Ｐのパリティグループを設定し、内周側の
シリンダの集合（黒く塗り潰した部分）にパリティグル
ープレベルが３Ｄ＋１Ｐおよび二重化のパリティグルー
プを混在して設定している。When parity groups are set for each parity group level, they are set in units of sets of cylinders, as shown in FIG. 18. In FIG. 18, parity groups with a parity group level of 5D+1P are set in the set of cylinders on the outer peripheral side (hatched portion), and parity groups with parity group levels of 3D+1P and duplexing are set together in the set of cylinders on the inner peripheral side (blacked-out portion).

【００６２】ユーザは、初期設定する際に、各パリティ
グループレベルに割り当てる領域（シリンダの集合）に
対応したＤＤｒｉｖｅＮｏ．２４とＰＤｒｉｖｅＮ
ｏ．２６とそれらのＳＣＳＩ内Ａｄｄｒ２７とを、図４
のアドレステーブル上に確保する。At the time of initial setting, the user reserves, on the address table of FIG. 4, the DDrive No. 24 and PDrive No. 26 corresponding to the area (set of cylinders) allocated to each parity group level, together with their SCSI Addr 27.

【００６３】なお、本実施例では、このようにシリンダ
単位でパリティグループの設定をおこなったが、トラッ
ク単位、レコード単位で設定しても本発明の効果が得ら
れることは明らかであり、設定する単位の制約は無い。In this embodiment, the parity group is set in the cylinder unit as described above, but it is clear that the effect of the present invention can be obtained even if the parity group is set in the track unit and the record unit. There is no unit restriction.

【００６４】また、本実施例ではパリティグループレベ
ルを５Ｄ＋１Ｐ、３Ｄ＋１Ｐ、および二重化の３通りと
したが、この設定は使用環境等により自由に設定するこ
とが可能である。例えば、より高信頼なパリティグルー
プが必要であれば４Ｄ＋２Ｐのパリティグループレベル
を設定したり、２Ｄ＋１Ｐのようなパリティグループレ
ベルを設定することも可能である。４Ｄ＋２Ｐのよう
に、２つのパリティを設ける場合は、図４のパリティド
ライブ番号（ＰＤｒｉｖｅＮｏ．）２６のフィールド
を２つ設ければよい。In this embodiment, the parity group level is set to 5D + 1P, 3D + 1P, and duplex, but this setting can be freely set depending on the environment of use. For example, if a more reliable parity group is required, a parity group level of 4D + 2P can be set, or a parity group level such as 2D + 1P can be set. When two parities are provided as in 4D + 2P, two fields of the parity drive number (PDrive No.) 26 in FIG. 4 may be provided.

【００６５】また、論理グループ１０を構成するパリテ
ィグループにおけるパリティグループレベルの割合の設
定も使用環境等により自由に設定することが可能であ
る。例えば、図２において、５Ｄ＋１Ｐのパリティグル
ープレベルのパリティグループの数（割当て量）を減ら
し、３Ｄ＋１Ｐと二重化のパリティグループの割当て量
を増やしたり、代わりに４Ｄ＋２Ｐのパリティグループ
を設定したりすることも可能である。Further, the proportion of each parity group level among the parity groups forming the logical group 10 can also be set freely according to the usage environment. For example, in FIG. 2, it is possible to reduce the number (allocation) of parity groups at the 5D+1P level and increase the allocation of 3D+1P and duplexed parity groups, or to set 4D+2P parity groups instead.

【００６６】以上のように、本発明では、初期設定の際
にディスクアレイの使用される環境を考慮して、ディス
クアレイの論理グループ１０を構成するパリティグルー
プのパリティグループレベルの種類や、構成内容、さら
には、割合を自由に設定することが可能である。As described above, in the present invention, the types of parity group levels of the parity groups forming the logical group 10 of the disk array, their configurations, and their proportions can be set freely at initial setting, taking into account the environment in which the disk array is used.

【００６７】（アドレス変換法）(Address conversion method)

【００６８】次に、図４のアドレステーブルを用いたア
ドレスの変換法について説明する。Next, an address conversion method using the address table of FIG. 4 will be described.

【００６９】図１のＣＰＵ１は、論理アドレスとしてデ
ータ名２２を指定して、書き込み命令や読み出し命令を
発行する。これに応じて、ＡＤＣ２のＭＰ２０は、指定
された論理アドレスで図４のアドレステーブルを参照す
ることにより、そのデータが実際に格納されているドラ
イブ番号２４、ＳＣＳＩ内Ａｄｄｒ２７、およびキャッ
シュアドレス２３を決定する。The CPU 1 of FIG. 1 designates the data name 22 as a logical address and issues a write command or a read command. In response, the MP 20 of the ADC 2 refers to the address table of FIG. 4 using the designated logical address and determines the drive number 24 and SCSI Addr 27 at which the data is actually stored, as well as the cache address 23.

【００７０】例えば、ＣＰＵ１からＤａｔａ＃２に対す
る要求が発行された場合、図４のアドレステーブルか
ら、当該データの位置は、ドライブＳＤ＃２のＳＣＳＩ
内Ａｄｄｒ２７がＤＡＤＲ１の位置であることが分か
る。このようにして、論理アドレスとしてのデータ名
が、物理的なアドレスへ変換される。For example, when the CPU 1 issues a request for Data#2, the address table of FIG. 4 shows that the data is located at SCSI Addr 27 = DADR1 of drive SD#2. In this way, the data name serving as the logical address is converted into a physical address.

【００７１】また、このときアドレステーブルにおいて
Ｄａｔａ＃２のキャッシュアドレス２３にＣＡＤＲ１が
登録されているため、このデータはキャッシュメモリ１
６内のＣＡＤＲ１に存在することが分かる。もし、キャ
ッシュアドレス２３に登録されていない場合は、当該デ
ータはキャッシュメモリ１６内には存在しないことにな
る。さらに、このＤａｔａ＃２が関与したパリティにつ
いては、図４のアドレステーブルから、パリティドライ
ブ番号（ＰＤｒｉｖｅＮｏ）２６がＳＤ＃６であるド
ライブ１２の、ＳＣＳＩ内Ａｄｄｒ２７がＤＡＤＲ１の
位置に、格納されていることがわかる。At this time, since CADR1 is registered in the cache address 23 of Data#2 in the address table, this data is found to exist at CADR1 in the cache memory 16. If nothing is registered in the cache address 23, the data does not exist in the cache memory 16. Further, the address table of FIG. 4 shows that the parity to which Data#2 contributes is stored at SCSI Addr 27 = DADR1 of the drive 12 whose parity drive number (PDrive No.) 26 is SD#6.

【００７２】このようにして、ＭＰ２０は、ＣＰＵ１か
ら指定された論理アドレス２２を実際に読み出し／書き
込みを行うドライブ１２の物理的なアドレスに変換した
後、その物理アドレスに、読み出しまたは書き込み要求
を発行する。In this way, the MP 20 converts the logical address 22 designated by the CPU 1 into the physical address of the drive 12 that is actually read from or written to, and then issues a read or write request to that physical address.
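The address conversion described above can be sketched as a table lookup. The following Python sketch is illustrative only: the `AddressEntry` class, the `translate` function, and the table contents other than the Data#2 row described in the text are assumptions, not part of the specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AddressEntry:
    """One row of the address table of FIG. 4 (field names are illustrative)."""
    data_name: Optional[str]   # logical address (22); None marks a free area
    cache_addr: Optional[str]  # cache address (23); None means not in cache
    ddrive_no: str             # data drive number (24)
    failure_flag: bool         # failure flag (25); True corresponds to on (1)
    pdrive_no: str             # parity drive number (26)
    scsi_addr: str             # SCSI Addr (27), shared by data and parity


def translate(table: List[AddressEntry], data_name: str) -> AddressEntry:
    """Convert a logical address (data name) into its table entry."""
    for entry in table:
        if entry.data_name == data_name:
            return entry
    raise KeyError(f"{data_name} is not registered in the address table")


# The Data#2 row follows the example in the text; the Data#1 row is hypothetical.
table = [
    AddressEntry("Data#1", None, "SD#1", False, "SD#6", "DADR1"),
    AddressEntry("Data#2", "CADR1", "SD#2", False, "SD#6", "DADR1"),
]
entry = translate(table, "Data#2")
```

A linear scan keeps the sketch short; a dictionary keyed by data name would be the more natural choice for a large table.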

【００７３】次に、このようなアドレス変換を行い、具
体的にデータを読み出しまたは書き込む際の処理法につ
いて説明する。Next, a processing method for performing such address conversion and specifically reading or writing data will be described.

【００７４】（新規書き込み処理）(New writing process)

【００７５】まず、新規にデータを書き込む方法につい
て、図１を用いて説明する。また、新規にデータを書き
込む際のＭＰ２０の処理フローを図６に示す。First, a method of newly writing data will be described with reference to FIG. Further, FIG. 6 shows a processing flow of the MP20 when newly writing data.

【００７６】まず図１を参照して説明する。ＣＰＵ１か
らのコマンドを受け取ると、ＡＤＣ２のＭＰ２０は、そ
のコマンドを処理可能かどうか調べ、可能な場合は処理
可能だという応答をＣＰＵ１へ返す。ＣＰＵ１より発行
されたコマンドは、ＩＦＡｄｐ５を介してＡＤＣ２に
取り込まれ、ＭＰ２０により読み出し要求か書き込み要
求か解読される。書き込み要求の場合は、以下のように
処理する。First, description will be made with reference to FIG. Upon receiving the command from the CPU 1, the MP 20 of the ADC 2 checks whether the command can be processed, and if possible, returns a response indicating that the command can be processed to the CPU 1. The command issued by the CPU 1 is taken into the ADC 2 via the IF Adp 5 and is decoded by the MP 20 as a read request or a write request. In the case of a write request, it is processed as follows.

【００７７】ＣＰＵ１では処理可能だという応答を受け
取った後に、ＡＤＣ２へ書き込みデータを転送する。こ
のとき、ＡＤＣ２では、ＭＰ２０の指示により、チャネ
ルパスディレクタ３のチャネルパススイッチ６が当該外
部インターフェースパス４とＩＦＡｄｐ５を当該チャ
ネルパス８と接続し、ＣＰＵ１とＡＤＣ２との間の接続
を確立する。ＣＰＵ１とＡＤＣ２との間の接続を確立し
た後、ＣＰＵ１からのデータ転送を受け付ける。After receiving the response that the processing is possible, the CPU 1 transfers the write data to the ADC 2. At this time, in the ADC 2, the channel path switch 6 of the channel path director 3 connects the external interface path 4 and the IF Adp 5 to the channel path 8 according to the instruction from the MP 20, and establishes the connection between the CPU 1 and the ADC 2. After establishing the connection between the CPU 1 and the ADC 2, the data transfer from the CPU 1 is accepted.

【００７８】ＣＰＵ１からは、論理アドレス（データ
名）と書き込みデータが転送される。チャネルインター
フェース回路（ＣＨＩＦ）１３は、ＭＰ２０の指示に
より、ＣＰＵ１から転送されたこれらのデータに対しプ
ロトコル変換を施す。これにより、ＣＰＵ１からのデー
タは、外部インターフェースパス４での転送速度からＡ
ＤＣ２内での処理速度に速度調整される。具体的には、
ＣＰＵ１とＡＤＣ２との間のチャネルインターフェース
を光のインターフェースにした場合、ＣＨＩＦ１３
は、光のインターフェースのプロトコルをＡＤＣ２内の
電気処理でのプロトコルに変換する。A logical address (data name) and write data are transferred from the CPU 1. Under the instruction of the MP 20, the channel interface circuit (CH IF) 13 performs protocol conversion on the data transferred from the CPU 1. As a result, the data from the CPU 1 is adjusted from the transfer speed on the external interface path 4 to the processing speed inside the ADC 2. Specifically, when the channel interface between the CPU 1 and the ADC 2 is an optical interface, the CH IF 13 converts the protocol of the optical interface into the protocol used for electrical processing inside the ADC 2.

【００７９】ＣＨＩＦ１３におけるプロトコル変換お
よび速度制御の完了後、データは、データ制御回路（Ｄ
ＣＣ）１４によるデータ転送制御を受け、キャッシュア
ダプタ回路（ＣＡｄｐ）１５に転送され、ＣＡｄｐ
１５によりキャッシュメモリ１６内に格納される。After the protocol conversion and speed control in the CH IF 13 are complete, the data is subjected to data transfer control by the data control circuit (DCC) 14, transferred to the cache adapter circuit (C Adp) 15, and stored in the cache memory 16 by the C Adp 15.

【００８０】ＣＡｄｐ１５は、ＭＰ２０の指示に応じ
て、キャッシュメモリ１６に対するデータの読み出しお
よび書き込みを行う回路であり、キャッシュメモリ１６
の状態の監視、各読み出し、および書き込み要求に対
し、排他制御を行う回路である。The C Adp 15 is a circuit that reads data from and writes data to the cache memory 16 in accordance with instructions from the MP 20; it also monitors the state of the cache memory 16 and performs exclusive control over read and write requests.

【００８１】図６のフローチャートを参照して、ＭＰ２
０による処理の手順をさらに詳しく説明する。ＭＰ２０
が書き込み要求のコマンドを認識し、しかも、書き込む
データが初めて書き込まれる新規データと認識すると
（ステップ３０）、ＭＰ２０は論理アドレスとしてＣＰ
Ｕ１から送られてきたデータ名をアドレステーブルへ登
録する処理を開始する。The processing procedure of the MP 20 is described in more detail with reference to the flowchart of FIG. 6. When the MP 20 recognizes a write request command and further recognizes that the data to be written is new data being written for the first time (step 30), the MP 20 starts the process of registering the data name sent from the CPU 1 as a logical address in the address table.

【００８２】まず、ＭＰ２０は、パリティグループレベ
ルを認識する（ステップ３１）。すなわち、上位からパ
リティグループレベルの指定がある場合は、当該パリテ
ィグループレベルの領域に格納するように、用いるべき
アドレステーブルを決定する。ここでは、書き込み要求
のコマンドが与えられる際、データを書き込むパリティ
グループレベルがＣＰＵ１から指定されるものとする。First, the MP 20 recognizes the parity group level (step 31). That is, when the parity group level is designated from the upper level, the address table to be used is determined so as to be stored in the area of the parity group level. Here, it is assumed that when a write request command is given, the parity group level for writing data is designated by the CPU 1.

【００８３】具体的には、５Ｄ＋１Ｐのパリティグルー
プレベルの領域に格納するようにＣＰＵ１から指示され
た場合は図４（ａ）のアドレステーブルとし、３Ｄ＋１
Ｐでは図４（ｂ）のアドレステーブルとし、二重化の場
合は図４（ｃ）のアドレステーブルとする。Specifically, when the CPU 1 instructs that the data be stored in the area of the 5D+1P parity group level, the address table of FIG. 4(a) is used; for 3D+1P, the address table of FIG. 4(b) is used; and for duplexing, the address table of FIG. 4(c) is used.

【００８４】このようにアドレステーブルを決定した
後、ＭＰ２０は、当該アドレステーブルにおいてデータ
名が登録されていない空き領域を探す（ステップ３
２）。本実施例では、図４のアドレステーブルにおいて
データ名２２が登録されていない場合、その項に登録さ
れているＤＤｒｉｖｅＮｏ．２４のドライブ１２内の
ＳＣＳＩ内Ａｄｄｒ２７の領域は、データが格納されて
いない空き領域である。After determining the address table in this way, the MP 20 searches that address table for a free area in which no data name is registered (step 32). In this embodiment, when no data name 22 is registered in an entry of the address table of FIG. 4, the area at the SCSI Addr 27 in the drive 12 given by the DDrive No. 24 registered in that entry is a free area in which no data is stored.

【００８５】この空き領域には、初期設定の段階からデ
ータが書き込まれていない場合と、以前はデータが書き
込まれていたがこのデータが不要となり削除した場合と
がある。データの削除は、ＭＰ２０により、図４のアド
レステーブルにおいて当該データ名２２を削除すること
により行われる。また、このときのパリティは、このよ
うにデータが格納されていない領域の全てのビットが０
のデータと見なして作成され、アドレステーブルで指定
されているＰＤｒｉｖｅＮｏ．２６のドライブ１２
の、データと同じＳＣＳＩ内Ａｄｄｒ２７の位置に、格
納される。A free area arises either because no data has been written there since initial setting, or because data was previously written but was deleted when it became unnecessary. The MP 20 deletes data by deleting the corresponding data name 22 from the address table of FIG. 4. The parity in this case is created by treating an area in which no data is stored as data whose bits are all 0, and is stored in the drive 12 given by the PDrive No. 26 specified in the address table, at the same SCSI Addr 27 position as the data.

【００８６】次に、新規書き込みデータをキャッシュメ
モリ１６に格納する（ステップ３３）。キャッシュメモ
リ１６に格納するまでのデータの流れは、図１を参照し
て上述した。そして、ステップ３２で探し出したアドレ
ステーブルの空き領域の項へデータ名２２を登録し、さ
らにキャッシュアドレス２３に新規書き込みデータを格
納したキャッシュメモリ１６内のアドレスを登録するこ
とにより、新規書き込みデータの登録を行う（ステップ
３４）。Next, the new write data is stored in the cache memory 16 (step 33). The flow of data up to its storage in the cache memory 16 was described above with reference to FIG. 1. The new write data is then registered by registering the data name 22 in the free entry of the address table found in step 32 and registering, in the cache address 23, the address in the cache memory 16 at which the new write data was stored (step 34).
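Steps 31 to 34 can be sketched as choosing a table by parity group level, scanning it for an entry with no data name, and filling that entry in. The Python below is a minimal sketch under that reading; the dictionary layout and the names `make_entry` and `register_new_data` are hypothetical.

```python
from typing import Dict, List, Optional


def make_entry(ddrive_no: str, pdrive_no: str, scsi_addr: str,
               data_name: Optional[str] = None) -> dict:
    """Build one address-table row; a row without a data name is a free area."""
    return {"data_name": data_name, "cache_addr": None, "ddrive_no": ddrive_no,
            "pdrive_no": pdrive_no, "scsi_addr": scsi_addr}


def register_new_data(tables: Dict[str, List[dict]], level: str,
                      data_name: str, cache_addr: str) -> dict:
    table = tables[level]                    # step 31: pick the table for the level
    for entry in table:                      # step 32: look for a free area
        if entry["data_name"] is None:
            entry["data_name"] = data_name   # step 34: register the data name
            entry["cache_addr"] = cache_addr  # and where step 33 cached the data
            return entry
    raise RuntimeError(f"no free area at parity group level {level}")


# Hypothetical table contents for the 5D+1P level.
tables = {
    "5D+1P": [
        make_entry("SD#1", "SD#6", "DADR1", data_name="Data#1"),
        make_entry("SD#2", "SD#6", "DADR1"),
    ],
}
entry = register_new_data(tables, "5D+1P", "Data#2", "CADR1")
```

The returned entry already carries the drive numbers and SCSI Addr needed for the subsequent drive writes, because those fields are reserved at initial setting.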

【００８７】次に、ＭＰ２０は、新規書き込みデータを
用いてパリティを更新し、論理グループ１０内のドライ
ブ１２へ、新規書き込みデータと新パリティを格納する
処理を開始する。以下、更新前のパリティを旧パリテ
ィ、更新されたパリティを新パリティと、それぞれ呼ぶ
こととする。Next, the MP 20 updates the parity by using the new write data, and starts the process of storing the new write data and the new parity in the drive 12 in the logical group 10. Hereinafter, the parity before update will be referred to as the old parity, and the updated parity will be referred to as the new parity.

【００８８】まず、ＭＰ２０は、いまアドレステーブル
に登録した新規データに対応するＤＤｒｉｖｅＮｏ.
２９およびＰＤｒｉｖｅＮｏ.３４と、それらのドラ
イブ内の物理的なアドレスであるＳＣＳＩ内Ａｄｄｒ２
７とを認識する（ステップ３５）。そして、ＤＤｒｉｖ
ｅＮｏ.２９のドライブに対して、キャッシュメモリ
１６に格納されている新規データの書き込みを指示する
（ステップ３６）。First, the MP 20 recognizes the DDrive No. 29 and PDrive No. 34 corresponding to the new data just registered in the address table, and the SCSI Addr 27 that is the physical address within those drives (step 35). It then instructs the drive given by DDrive No. 29 to write the new data stored in the cache memory 16 (step 36).

【００８９】また、ＰＤｒｉｖｅＮｏ.３４のドライ
ブに対して、パリティの更新を指示する。すなわち、Ｐ
ＤｒｉｖｅＮｏ.３４のドライブから旧パリティを読
み出し（ステップ３７）、旧パリティと新規書き込みデ
ータとから新パリティを作成し（ステップ３８）、当該
新パリティをＰＤｒｉｖｅＮｏ.３４のドライブに書
き込む（ステップ３９）。The MP 20 also instructs the drive given by PDrive No. 34 to update the parity. That is, the old parity is read from the drive given by PDrive No. 34 (step 37), a new parity is created from the old parity and the new write data (step 38), and the new parity is written to the drive given by PDrive No. 34 (step 39).

【００９０】図２および図４の例を用いて、ステップ３
５以降の処理を具体的に説明する。The processing from step 35 onward is described concretely using the example of FIGS. 2 and 4.

【００９１】図２および図４において、ＭＰ２０は、Ｄ
ＤｒｉｖｅＮｏ．２４がＳＤ＃２のドライブ１２のＳ
ＣＳＩ内Ａｄｄｒ２７がＤＡＤＲ１の位置に、新規にＤ
ａｔａ＃２を書き込むと、認識したとする。また、ＭＰ
２０は、図４のアドレステーブルから、この新規書き込
みデータに対応するパリティの格納位置（ＰＤｒｉｖｅ
Ｎｏ．２６がＳＤ＃６のドライブ１２のＳＣＳＩ内Ａ
ｄｄｒ２７がＤＡＤＲ１の位置）を認識する。In FIGS. 2 and 4, suppose the MP 20 has recognized that Data#2 is to be newly written at SCSI Addr 27 = DADR1 of the drive 12 whose DDrive No. 24 is SD#2. From the address table of FIG. 4, the MP 20 also recognizes the storage position of the parity corresponding to this new write data (SCSI Addr 27 = DADR1 of the drive 12 whose PDrive No. 26 is SD#6).

【００９２】このように新規に書き込むデータとこのデ
ータを書き込んだ後に関与するパリティの論理グループ
１０内の物理アドレスを認識した後、ＭＰ２０は、ドラ
イブＳＤ＃２に対し新規に書き込むデータの書き込み処
理を開始し、またドライブＳＤ＃６に対しパリティの更
新処理を開始する。After recognizing the data to be newly written and the physical address in the logical group 10 of the parity involved after writing this data, the MP 20 performs the writing process of the data to be newly written to the drive SD # 2. Then, the parity update process is started for the drive SD # 6.

【００９３】図１を参照して、パリティの更新処理につ
いて説明する。Parity update processing will be described with reference to FIG.

【００９４】パリティの更新処理では、まずＭＰ２０
が、ドライブインターフェース（ＤｒｉｖｅＩＦ）１
８に対し、当該ドライブ１２へ旧パリティの読み出し要
求を発行するように指示する。ＤｒｉｖｅＩＦ１８で
は、ＳＣＳＩの読み出し処理手順に従って、読み出しコ
マンドをドライブユニットパス１１を介して発行する。In the parity update process, the MP 20 first instructs the drive interface (Drive IF) 18 to issue a read request for the old parity to the drive 12. The Drive IF 18 issues a read command via the drive unit path 11 in accordance with the SCSI read processing procedure.

【００９５】ＤｒｉｖｅＩＦ１８から読み出しコマン
ドを発行された当該ドライブ１２においては、指示され
たＳＣＳＩ内Ａｄｄｒ２７へシーク、回転待ちのアクセ
ス処理を行なう。当該ドライブ１２におけるアクセス処
理が完了した後、当該ドライブ１２は、当該旧パリティ
を読み出し、ドライブユニットパス１１を介してＤｒｉ
ｖｅＩＦ１８へ転送する。The drive 12 that receives the read command from the Drive IF 18 performs the access processing (seek and rotational latency) for the designated SCSI Addr 27. After the access processing in the drive 12 is complete, the drive 12 reads the old parity and transfers it to the Drive IF 18 via the drive unit path 11.

【００９６】ＤｒｉｖｅＩＦ１８では、転送されてき
た当該旧パリティをドライブ１２側のキャッシュアダプ
タ回路（ＣＡｄｐ）１７に転送する。ＣＡｄｐ１７
は、キャッシュメモリ１６に当該旧パリティを格納す
る。このとき、ＣＡｄｐ１７は、ＭＰ２０に対し、キ
ャッシュメモリ１６に当該旧パリティを格納したことを
報告する。The Drive IF 18 transfers the received old parity to the cache adapter circuit (C Adp) 17 on the drive 12 side. The C Adp 17 stores the old parity in the cache memory 16 and reports to the MP 20 that the old parity has been stored in the cache memory 16.

【００９７】このように、当該旧パリティをキャッシュ
メモリ１６に読み出した後、ＭＰ２０は、パリティ生成
回路（ＰＧ）１９に対し、新パリティの作成を指示す
る。ＰＧ１９は、キャッシュメモリ１６内に格納されて
いる新規書き込みデータと当該旧パリティとで排他的論
理和の計算を行ない更新後の新パリティを作成する。作
成した新パリティは、キャッシュメモリ１６へ格納され
る。In this way, after reading the old parity into the cache memory 16, the MP 20 instructs the parity generation circuit (PG) 19 to create a new parity. The PG 19 calculates the exclusive OR of the new write data stored in the cache memory 16 and the old parity, and creates the updated new parity. The created new parity is stored in the cache memory 16.
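The parity calculation for a new write can be sketched as a bytewise exclusive OR. The Python below is a minimal sketch, assuming that blocks are equal-length byte strings and, as stated earlier, that a free area counts as all-zero data when the old parity was formed; the function name `xor_blocks` and the sample values are hypothetical.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise exclusive OR of equal-length blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# Hypothetical 2-byte blocks: one existing data block and one free (all-zero) area.
d1 = bytes([0x0F, 0xF0])
free_area = bytes(2)
old_parity = xor_blocks(d1, free_area)

# New write into the free area: new parity = old parity XOR new write data.
new_data = bytes([0x33, 0x55])
new_parity = xor_blocks(old_parity, new_data)
```

Because the free area contributed only zeros to the old parity, XOR-ing in the new data yields the same result as recomputing the parity over every data block in the group.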

【００９８】新パリティのキャッシュメモリ１６への格
納完了後、ＭＰ２０は、新規書き込みデータ（この例で
は、Ｄａｔａ＃２）をドライブＳＤ＃２に、新パリティ
をドライブＳＤ＃６に、それぞれ書き込む制御を行う。
それぞれのドライブにおける格納位置は、ＳＣＳＩ内Ａ
ｄｄｒ２７に示されるＤＡＤＲ１である。After the new parity has been stored in the cache memory 16, the MP 20 performs control to write the new write data (Data#2 in this example) to drive SD#2 and the new parity to drive SD#6. The storage position in each drive is DADR1, as indicated by the SCSI Addr 27.

【００９９】次に、新規書き込みデータや新パリティの
ドライブ１２への書き込み処理について説明する。Next, the process of writing new write data and new parity to the drive 12 will be described.

【０１００】まず、ＭＰ２０が、ＤｒｉｖｅＩＦ１８
に対し、当該ドライブ１２へ書き込み要求（新規書き込
みデータや新パリティの書き込み）を発行するように指
示する。ＤｒｉｖｅＩＦ１８では、ＳＣＳＩの書き込
み処理手順に従って、当該ドライブ１２に対し書き込み
コマンドをドライブユニットパス１１を介して発行す
る。First, the MP 20 instructs the Drive IF 18 to issue a write request (for the new write data or the new parity) to the drive 12. The Drive IF 18 issues a write command to the drive 12 via the drive unit path 11 in accordance with the SCSI write processing procedure.

【０１０１】ＤｒｉｖｅＩＦ１８から書き込みコマン
ドを発行された当該ドライブ１２においては、指示され
たＳＣＳＩ内Ａｄｄｒ２７へシーク、回転待ちのアクセ
ス処理を行なう。当該ドライブ１２におけるアクセス処
理が完了した後、キャッシュアダプタ回路(ＣＡｄｐ)
１７は、キャッシュメモリ１６から新規書き込みデータ
または新パリティを読み出し、ＤｒｉｖｅＩＦ１８へ
転送する。The drive 12 that receives the write command from the Drive IF 18 performs the access processing (seek and rotational latency) for the designated SCSI Addr 27. After the access processing in the drive 12 is complete, the cache adapter circuit (C Adp) 17 reads the new write data or the new parity from the cache memory 16 and transfers it to the Drive IF 18.

【０１０２】ＤｒｉｖｅＩＦ１８では、転送されてき
た新規書き込みデータまたは新パリティをドライブユニ
ットパス１１を介して当該ドライブ１２に転送し、当該
ドライブ１２の当該アドレスに新規書き込みデータまた
は新パリティを書き込む。このとき、ＣＡｄｐ１７
は、ＭＰ２０に対し、当該ドライブ１２に新規書き込み
データまたは新パリティを格納したことを報告する。The Drive IF 18 transfers the received new write data or new parity to the drive 12 via the drive unit path 11 and writes it at the designated address of the drive 12. At this time, the C Adp 17 reports to the MP 20 that the new write data or new parity has been stored in the drive 12.

【０１０３】（書き込み処理）(Write processing)

【０１０４】次に、すでにドライブ１２内に書き込まれ
ているデータを新しいデータに書き換える更新の場合に
ついて説明する。図７は、更新の書き込み処理の手順を
示すフローチャートである。Next, the case of updating in which the data already written in the drive 12 is rewritten to new data will be described. FIG. 7 is a flowchart showing the procedure of the update writing process.

【０１０５】データの更新要求が発生する（ステップ４
０）と、ＭＰ２０は、ＣＰＵ１が指定した論理アドレス
（データ名）からアドレステーブルを参照し、データお
よびパリティが格納されているドライブ１２のＤＤｒｉ
ｖｅＮｏ．２４とＰＤｒｉｖｅＮｏ．２６、それら
のドライブ１２内の物理的なアドレスであるＳＣＳＩ内
Ａｄｄｒ２７、キャッシュアドレス２３、および障害フ
ラグ２５を認識する（ステップ４１）。When a data update request occurs (step 40), the MP 20 refers to the address table using the logical address (data name) designated by the CPU 1 and recognizes the DDrive No. 24 and PDrive No. 26 of the drives 12 in which the data and parity are stored, the SCSI Addr 27 that is the physical address within those drives 12, the cache address 23, and the failure flag 25 (step 41).

【０１０６】例えば、図２および図４の例で、ＣＰＵ１
からドライブＳＤ＃２のＳＣＳＩ内Ａｄｄｒ２７がＤＡ
ＤＲ１のＤａｔａ＃２に対し、更新する書き込み要求が
発行されたとする。このとき、まずＭＰ２０は、アドレ
ステーブルにより更新されるデータ（旧データ）のＤＤ
ｒｉｖｅＮｏ．２４、ＳＣＳＩ内Ａｄｄｒ２７、キャ
ッシュアドレス２３、および更新されるパリティ（旧パ
リティ）のＰＤｒｉｖｅＮｏ．２６を認識する。For example, in the example of FIGS. 2 and 4, suppose the CPU 1 issues a write request to update Data#2 at SCSI Addr 27 = DADR1 of drive SD#2. The MP 20 first uses the address table to recognize the DDrive No. 24, SCSI Addr 27, and cache address 23 of the data to be updated (the old data), and the PDrive No. 26 of the parity to be updated (the old parity).

【０１０７】次に、当該データの格納されている当該ド
ライブ１２に対するアドレステーブル内の障害フラグ２
５がオフ（０）か否か判別する。障害フラグ２５がオン
（１）なら、障害時の更新処理（図８）に進む。障害フ
ラグ２５がオフ（０）なら、ＭＰ２０は、このドライブ
１２は正常と認識し、以下のように処理する。Next, the MP 20 determines whether the failure flag 25 in the address table for the drive 12 storing the data is off (0). If the failure flag 25 is on (1), processing proceeds to the update processing at the time of failure (FIG. 8). If the failure flag 25 is off (0), the MP 20 recognizes the drive 12 as normal and proceeds as follows.

【０１０８】更新する新データは、新規データの書き込
みのときの新規書き込みデータと同様に、ＣＰＵ１から
キャッシュメモリ１６に格納される（ステップ４３）。
そして、ＭＰ２０は、アドレステーブルの当該キャッシ
ュアドレス２３に、書き込む新データを格納したキャッ
シュメモリ１６内のアドレスを登録する（ステップ４
４）。The new data for the update is stored in the cache memory 16 from the CPU 1 in the same manner as new write data in a new write (step 43). The MP 20 then registers, in the corresponding cache address 23 of the address table, the address in the cache memory 16 at which the new data to be written was stored (step 44).

【０１０９】旧データがキャッシュメモリ１６にない場
合は、旧データと旧パリティを読み出しキャッシュメモ
リ１６に格納する（ステップ４８，４５）。旧データが
キャッシュメモリ１６にある場合は、ステップ４８はス
キップして、旧パリティのみドライブ１２から読み出し
キャッシュメモリ１６に格納する（ステップ４５）。こ
のときの旧データおよび旧パリティをドライブ１２から
読み出しキャッシュメモリ１６に格納する方法は、先に
説明した新規書き込み時のパリティ更新処理における旧
パリティのドライブ１２からキャッシュメモリ１６への
読み出し方法と同じである。If the old data is not in the cache memory 16, the old data and the old parity are read and stored in the cache memory 16 (steps 48 and 45). If the old data is in the cache memory 16, step 48 is skipped and only the old parity is read from the drive 12 and stored in the cache memory 16 (step 45). The method of reading the old data and old parity from the drive 12 into the cache memory 16 is the same as the method, described above for the parity update process in a new write, of reading the old parity from the drive 12 into the cache memory 16.

【０１１０】データを格納するＤＤｒｉｖｅＮｏ．２
４のドライブでは、旧データの読み出しの後、新データ
を当該ドライブのＳＣＳＩ内Ａｄｄｒ２７の位置に書き
込む（ステップ４９）。パリティについては、このよう
に読み出した旧データ、旧パリティと書き込む新データ
とで、先に説明した新規書き込み時のパリティ更新処理
と同様に排他的論理和の計算を行ない、更新後の新パリ
ティを作成し、キャッシュメモリ１６に格納する（ステ
ップ４６）。そして、作成した新パリティを、ＰＤｒｉ
ｖｅＮｏ．２６のドライブのＳＣＳＩ内Ａｄｄｒ２７
の位置に書き込む（ステップ４７）。In the drive given by the DDrive No. 24, which stores the data, the new data is written at the SCSI Addr 27 position of that drive after the old data has been read (step 49). For the parity, the exclusive OR of the old data and old parity read in this way and the new data to be written is calculated, as in the parity update process for a new write described above, to create the updated new parity, which is stored in the cache memory 16 (step 46). The new parity is then written at the SCSI Addr 27 position of the drive given by the PDrive No. 26 (step 47).
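The update rule in steps 46 and 47 amounts to new parity = old data XOR old parity XOR new data. The following Python sketch illustrates it on hypothetical 2-byte blocks; `xor_blocks` and `updated_parity` are illustrative names, and a single-parity (for example 5D+1P or 3D+1P) group is assumed.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise exclusive OR of equal-length blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def updated_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Remove the old data's contribution from the parity and add the new data's."""
    return xor_blocks(old_data, old_parity, new_data)


# Hypothetical group of two data blocks: 'other' stays unchanged, 'old_data' is updated.
other = bytes([0xAA, 0x0F])
old_data = bytes([0x12, 0x34])
old_parity = xor_blocks(other, old_data)
new_data = bytes([0x56, 0x78])
new_parity = updated_parity(old_data, old_parity, new_data)
```

Only the updated data drive and the parity drive are touched; the other data drives in the parity group do not need to be read.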

【０１１１】新データおよび新パリティの書き込みは、
先に説明した新規書き込み時と同様にして行う。The new data and new parity are written in the same manner as in the new write described above.

【０１１２】（障害ドライブ１２への書き込み処理）(Processing for writing to the failed drive 12)

【０１１３】次に、すでにドライブ１２内に書き込まれ
ているデータを新しいデータに書き換える更新の際に、
更新される旧データが格納されているドライブ１２に障
害が発生している場合の処理方法について説明する。図
８は、そのような障害時の更新処理の手順を示すフロー
チャートである。Next, a processing method is described for the case where, in an update that rewrites data already written in the drive 12 with new data, a failure has occurred in the drive 12 storing the old data to be updated. FIG. 8 is a flowchart showing the procedure of the update process in the event of such a failure.

【０１１４】図８の処理が開始される前に、ＭＰ２０
は、すでに図７のステップ４１で、ＣＰＵ１が指定した
論理アドレス（データ名）からアドレステーブルを参照
し、データおよびパリティが格納されているドライブ１
２のＤＤｒｉｖｅＮｏ．２４とＰＤｒｉｖｅＮｏ．
２６、それらのドライブ１２内の物理的なアドレスであ
るＳＣＳＩ内Ａｄｄｒ２７、キャッシュアドレス２３、
および障害フラグ２５を認識している。ステップ４２
で、当該データの格納されている当該ドライブ１２に対
するアドレステーブル内の障害フラグ２５がオン（１）
なら、ＭＰ２０は、このドライブ１２は異常と認識し、
図８の処理を開始する。Before the processing of FIG. 8 starts, the MP 20 has already, in step 41 of FIG. 7, referred to the address table using the logical address (data name) designated by the CPU 1 and recognized the DDrive No. 24 and PDrive No. 26 of the drives 12 in which the data and parity are stored, the SCSI Addr 27 that is the physical address within those drives 12, the cache address 23, and the failure flag 25. In step 42, if the failure flag 25 in the address table for the drive 12 storing the data is on (1), the MP 20 recognizes the drive 12 as abnormal and starts the processing of FIG. 8.

【０１１５】例えば、図２および図４の例で、ＣＰＵ１
からドライブＳＤ＃１のＳＣＳＩ内Ａｄｄｒ２７がＤＡ
ＤＲ１のＤａｔａ＃１に対し、更新する書き込み要求が
発行されたとする。このとき、まずＭＰ２０は、アドレ
ステーブルにより更新されるデータ（旧データ）のＤＤ
ｒｉｖｅＮｏ．２４、ＳＣＳＩ内Ａｄｄｒ２７、キャ
ッシュアドレス２３、および更新されるパリティ（旧パ
リティ）のＰＤｒｉｖｅＮｏ．２６を認識する。この
とき、アドレステーブルにおいて、旧データＤａｔａ＃
１の格納されているドライブＳＤ＃１の障害フラグ２５
がオン（１）となっているため、ＭＰ２０は、ドライブ
ＳＤ＃１に障害が発生していると認識し、図８のフロー
チャートに示す障害時の書き込み処理を開始する。For example, in the example of FIGS. 2 and 4, suppose the CPU 1 issues a write request to update Data#1 at SCSI Addr 27 = DADR1 of drive SD#1. The MP 20 first uses the address table to recognize the DDrive No. 24, SCSI Addr 27, and cache address 23 of the data to be updated (the old data), and the PDrive No. 26 of the parity to be updated (the old parity). Because the failure flag 25 of drive SD#1, which stores the old data Data#1, is on (1) in the address table, the MP 20 recognizes that a failure has occurred in drive SD#1 and starts the write processing at the time of failure shown in the flowchart of FIG. 8.

【０１１６】図８において、まず更新する新データは、
新規データの書き込みのときの新規書き込みデータと同
様に、ＣＰＵ１からキャッシュメモリ１６に格納される
（ステップ５０）。次に、ＭＰ２０は、アドレステーブ
ルの当該キャッシュアドレス２３に、キャッシュメモリ
１６内の書き込む新データを格納したアドレスを登録す
る（ステップ５１）。In FIG. 8, the new data for the update is first stored in the cache memory 16 from the CPU 1 in the same manner as new write data in a new write (step 50). Next, the MP 20 registers, in the corresponding cache address 23 of the address table, the address in the cache memory 16 at which the new data to be written was stored (step 51).

【０１１７】次に、ＭＰ２０は、パリティグループ内で
旧データの回復処理に関する全データおよびパリティの
アドレスを、アドレステーブルにより認識する（ステッ
プ５２）。そして、ＭＰ２０は、それぞれのドライブ１
２に対し、これらのデータおよびパリティを読み出し、
キャッシュメモリ１６に格納する（ステップ５３）。こ
のとき、これらのデータおよびパリティの中で、アドレ
ステーブルにおいてキャッシュアドレス２３にアドレス
が登録されているものは、もうすでにキャッシュメモリ
１６に存在しているとして、ＭＰ２０はドライブ１２か
らの読み出し処理は行わない。Next, the MP 20 uses the address table to recognize the addresses of all the data and the parity in the parity group that are involved in recovering the old data (step 52). The MP 20 then has these data and the parity read from their respective drives 12 and stored in the cache memory 16 (step 53). Any of these data or the parity whose address is registered in the cache address 23 of the address table is regarded as already present in the cache memory 16, and the MP 20 does not read it from the drive 12.

【０１１８】なお、これらのデータおよびパリティをド
ライブ１２から読み出しキャッシュメモリ１６に格納す
る方法は、正常時の書き込み処理と同様に、先に説明し
た新規書き込み時のパリティ更新処理における旧パリテ
ィのドライブ１２からキャッシュメモリ１６への読み出
し方法と同じである。As in normal write processing, the method of reading these data and the parity from the drive 12 and storing them in the cache memory 16 is the same as the method, described above for the parity update process in a new write, of reading the old parity from the drive 12 into the cache memory 16.

【０１１９】このようにして読み出した全データおよび
旧パリティと書き込む新データとで、正常時の書き込み
処理と同様に先に説明した新規書き込み時のパリティ更
新処理のように、排他的論理和計算を行ない、更新後の
新パリティを作成しキャッシュメモリ１６に格納する
（ステップ５４）。このようにパリティの更新が完了し
たら、先に説明した新規書き込み時のパリティ更新処理
と同様な方法で、新パリティのみを当該ドライブ１２
（ＰＤｒｉｖｅＮｏ．２６のドライブ）の当該ＳＣＳ
Ｉ内Ａｄｄｒ２７の位置に格納することで、パリティの
更新のみを行う（ステップ５５）。Using all the data and the old parity read in this way together with the new data to be written, exclusive OR calculations are performed, as in normal write processing and the parity update process for a new write described above, to create the updated new parity, which is stored in the cache memory 16 (step 54). Once the parity has been updated, only the new parity is stored at the SCSI Addr 27 position of the corresponding drive 12 (the drive given by the PDrive No. 26), by the same method as in the parity update process for a new write described above, so that only the parity is updated (step 55).
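One way to read steps 52 to 55 is sketched below in Python: the old data on the failed drive is first recovered from the surviving data and the old parity, and the normal update rule is then applied. This interpretation, the function names `xor_blocks` and `degraded_update`, and the sample values are assumptions made for illustration; the specification does not spell out the order of the exclusive OR operations.

```python
from typing import List, Tuple


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise exclusive OR of equal-length blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def degraded_update(surviving: List[bytes], old_parity: bytes,
                    new_data: bytes) -> Tuple[bytes, bytes]:
    """Return (recovered old data, new parity) when the data drive has failed."""
    old_data = xor_blocks(old_parity, *surviving)             # recover the lost block
    new_parity = xor_blocks(old_data, old_parity, new_data)   # normal update rule
    return old_data, new_parity


# Hypothetical 3D+1P group in which the drive holding d1 has failed.
d1, d2, d3 = bytes([0x01, 0x10]), bytes([0x22, 0x02]), bytes([0x40, 0x04])
old_parity = xor_blocks(d1, d2, d3)
new_d1 = bytes([0x7E, 0xE7])
recovered, new_parity = degraded_update([d2, d3], old_parity, new_d1)
```

The resulting parity equals the exclusive OR of the surviving data and the new data, so the new data can later be rebuilt from the parity even though it was never written to the failed drive.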

【０１２０】（読み出し処理）(Reading process)

【０１２１】次に、すでにドライブ１２内に書き込まれ
ているデータを読み出す場合について説明する。図９
は、このようなデータの読み出し処理の手順を示すフロ
ーチャートである。Next, the case of reading data already written in the drive 12 is described. FIG. 9 is a flowchart showing the procedure of this data read process.

【０１２２】データの読み出し要求が発生する（ステッ
プ５６）と、ＭＰ２０は、ＣＰＵ１が指定した論理アド
レス（データ名）からアドレステーブルを参照し、読み
出したいデータが格納されているドライブ１２のＤｒｉ
ｖｅＮｏ．２４、そのドライブ１２内のＳＣＳＩ内Ａ
ｄｄｒ２７、および障害フラグ２５を認識する（ステッ
プ５７）。そして、当該データの格納されている当該ド
ライブ１２に対するアドレステーブル内の障害フラグ２
５がオフ（０）か否か判別する（ステップ５８）。障害
フラグ２５がオン（１）なら、障害時の読み出し処理
（図１０）に進む。障害フラグ２５がオフ（０）なら、
ＭＰ２０は、このドライブ１２は正常と認識し、以下の
ように処理する。When a data read request occurs (step 56), the MP 20 refers to the address table using the logical address (data name) designated by the CPU 1 and recognizes the Drive No. 24 of the drive 12 storing the data to be read, the SCSI Addr 27 within that drive 12, and the failure flag 25 (step 57). It then determines whether the failure flag 25 in the address table for the drive 12 storing the data is off (0) (step 58). If the failure flag 25 is on (1), processing proceeds to the read processing at the time of failure (FIG. 10). If the failure flag 25 is off (0), the MP 20 recognizes the drive 12 as normal and proceeds as follows.

【０１２３】まず、ＭＰ２０は、アドレステーブルの当
該データ名に対するキャッシュアドレス２３を調べ（ス
テップ５９）、キャッシュメモリ１６内に読み出したい
データが存在するかどうか判定する（ステップ６０）。
キャッシュアドレス２３のフィールドにアドレスが登録
されており、キャッシュメモリ１６内に読み出したいデ
ータが格納されている場合（キャッシュヒット）は、Ｍ
Ｐ２０が、キャッシュメモリ１６から当該データを読み
出す制御を開始する（ステップ６１）。キャッシュメモ
リ１６内に無い場合（キャッシュミス）は、当該ドライ
ブ１２に対し、その内部の当該データを読み出す制御を
開始する（ステップ６２以降）。First, the MP 20 checks the cache address 23 for the data name in the address table (step 59) and determines whether the requested data is present in the cache memory 16 (step 60). If an address is registered in the cache address 23 field and the requested data is stored in the cache memory 16 (cache hit), the MP 20 starts control to read the data from the cache memory 16 (step 61). If the data is not in the cache memory 16 (cache miss), the MP 20 starts control to read the data from the relevant drive 12 (step 62 onward).

【０１２４】キャッシュヒット時、ＭＰ２０は、アドレ
ステーブルによりＣＰＵ１から指定してきた論理アドレ
ス（データ名）を、当該データが格納されているキャッ
シュメモリ１６内のキャッシュアドレス２３に変換し、
キャッシュメモリ１６へ当該データを読み出しに行く
（ステップ６１）。具体的には、ＭＰ２０の指示の元
で、キャッシュアダプタ回路（ＣＡｄｐ）１５により
キャッシュメモリ１６から当該データは読み出される。On a cache hit, the MP 20 uses the address table to convert the logical address (data name) designated by the CPU 1 into the cache address 23 in the cache memory 16 where the data is stored, and reads the data from the cache memory 16 (step 61). Specifically, the data is read from the cache memory 16 by the cache adapter circuit (C Adp) 15 under the direction of the MP 20.

【０１２５】ＣＡｄｐ１５により読み出されたデータ
は、データ制御回路（ＤＣＣ）１４の制御により、チャ
ネルインターフェース回路（ＣＨＩＦ）１３に転送さ
れる。ＣＨＩＦ１３では、この読み出しデータに対
し、ＣＰＵ１におけるチャネルインターフェースのプロ
トコルに変換する処理を施し、チャネルインターフェー
スに対応する速度に速度調整する。ＣＨＩＦ１３にお
けるプロトコル変換および速度調整後は、チャネルパス
ディレクタ３において、チャネルパススイッチ６が外部
インターフェースパス４を選択し、ＩＦＡｄｐ５によ
りＣＰＵ１へデータ転送を行なう。The data read by the C Adp 15 is transferred to the channel interface circuit (CH IF) 13 under the control of the data control circuit (DCC) 14. The CH IF 13 performs a process of converting the read data into the protocol of the channel interface in the CPU 1, and adjusts the speed to a speed corresponding to the channel interface. After the protocol conversion and speed adjustment in the CH IF 13, in the channel path director 3, the channel path switch 6 selects the external interface path 4, and the IF Adp 5 transfers the data to the CPU 1.

【０１２６】一方、ステップ６０の判別において、キャ
ッシュミスの場合、ＭＰ２０は、キャッシュヒット時と
同様に、アドレステーブルにより、ＣＰＵ１が指定した
論理アドレス（データ名）に対応するＤＤｒｉｖｅＮ
ｏ．２４、そのドライブ内の物理的なアドレスであるＳ
ＣＳＩ内Ａｄｄｒ２７、および障害フラグ２５を認識す
る（ステップ６２）。On the other hand, if step 60 determines a cache miss, the MP 20 uses the address table, as in the cache-hit case, to identify the DDrive No. 24 corresponding to the logical address (data name) designated by the CPU 1, the SCSI Addr 27 that is the physical address within that drive, and the failure flag 25 (step 62).

【０１２７】次に、ＭＰ２０は、その物理アドレスに対
し、ＤｒｉｖｅＩＦ１８に、当該ドライブ１２への読
み出し要求を発行するように指示する（ステップ６
３）。ＤｒｉｖｅＩＦ１８では、ＳＣＳＩの読み出し
処理手順に従って、読み出しコマンドをドライブユニッ
トパス１１を介して発行する。Next, the MP 20 instructs the Drive IF 18 to issue a read request for that physical address to the drive 12 (step 63). The Drive IF 18 issues a read command via the drive unit path 11 in accordance with the SCSI read procedure.

【０１２８】ＤｒｉｖｅＩＦ１８から読み出しコマン
ドを発行された当該ドライブ１２においては、指示され
たＳＣＳＩ内Ａｄｄｒ２７へシーク、回転待ちのアクセ
ス処理を行なう。当該ドライブ１２におけるアクセス処
理が完了した後、当該ドライブ１２は、当該データを読
み出し、ドライブユニットパス１１を介してＤｒｉｖｅ
ＩＦ１８へ転送する。The drive 12 that receives the read command from the Drive IF 18 performs the access processing (seek and rotational latency) to the designated SCSI Addr 27. After the access processing completes, the drive 12 reads the data and transfers it to the Drive IF 18 via the drive unit path 11.

【０１２９】ＤｒｉｖｅＩＦ１８では、転送されてき
た当該データをドライブ１２側のキャッシュアダプタ回
路（ＣＡｄｐ）１７に転送し、ＣＡｄｐ１７では、
キャッシュメモリ１６にデータを格納する（ステップ６
３）。このとき、ＣＡｄｐ１７は、ＭＰ２０に対し、
キャッシュメモリ１６にデータを格納したことをＭＰ２
０に報告する。ＭＰ２０は、この報告を元に、アドレス
テーブルのＣＰＵ１が読み出し要求を発行した論理アド
レス（データ名）のキャッシュアドレス２３に、データ
を格納したキャッシュメモリ１６内のアドレスを登録す
る（ステップ６５）。以降は、キャッシュヒット時と同
様な手順で、ＣＰＵ１へ当該データを転送する（ステッ
プ６４）。The Drive IF 18 transfers the received data to the cache adapter circuit (C Adp) 17 on the drive 12 side, and the C Adp 17 stores the data in the cache memory 16 (step 63). At this time, the C Adp 17 reports to the MP 20 that the data has been stored in the cache memory 16. Based on this report, the MP 20 registers the address in the cache memory 16 where the data was stored in the cache address 23 entry for the logical address (data name) of the CPU 1's read request in the address table (step 65). Thereafter, the data is transferred to the CPU 1 by the same procedure as on a cache hit (step 64).
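The normal read path of steps 57 to 65 can be pictured with a minimal Python sketch. The class and field names (`Entry`, `ReadPath`, `drive_reads`) and the trivial cache-address allocation are assumptions made for illustration; the source specifies only the address-table fields and the order of the steps.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    """One address-table row (field names are illustrative)."""
    drive_no: int                     # DDrive No. 24
    scsi_addr: int                    # SCSI Addr 27
    failed: bool = False              # failure flag 25
    cache_addr: Optional[int] = None  # cache address 23 (None = not cached)


class ReadPath:
    def __init__(self, table, drives):
        self.table = table      # data name -> Entry
        self.drives = drives    # drive number -> {SCSI address: bytes}
        self.cache = {}         # cache address -> bytes
        self.drive_reads = 0    # counts physical drive accesses

    def read(self, name):
        entry = self.table[name]                      # step 57: look up the table
        if entry.failed:                              # step 58: failure flag check
            raise NotImplementedError("failed-drive read (FIG. 10) not modelled")
        if entry.cache_addr is None:                  # steps 59-60: cache miss
            data = self.drives[entry.drive_no][entry.scsi_addr]  # steps 62-63
            self.drive_reads += 1
            entry.cache_addr = len(self.cache)        # hypothetical allocation
            self.cache[entry.cache_addr] = data       # store and register
        return self.cache[entry.cache_addr]           # step 61: read from cache
```

A second read of the same data name is served from the cache without touching the drive, which mirrors the cache-hit branch described above.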

【０１３０】（障害ドライブ１２からの読み出し処理）(Reading process from faulty drive 12)

【０１３１】次に、障害ドライブ１２内に書き込まれて
いるデータを読み出す場合について説明する。図１０
は、障害ドライブからの読み出し処理の手順を示すフロ
ーチャートである。Next, the case of reading data written in a failed drive 12 will be described. FIG. 10 is a flowchart showing the procedure of the read process from a failed drive.

【０１３２】図１０の処理が開始される前に、ＭＰ２０
は、すでに図９のステップ５７で、ＣＰＵ１が指定した
論理アドレス（データ名）からアドレステーブルを参照
し、読み出したいデータが格納されているドライブ１２
のＤｒｉｖｅＮｏ．２４、そのドライブ１２内のＳＣ
ＳＩ内Ａｄｄｒ２７、および障害フラグ２５を認識して
いる。また、ステップ５８で、当該データの格納されて
いる当該ドライブ１２に対するアドレステーブル内の障
害フラグ２５を判別する。障害フラグ２５がオン（１）
なら、ＭＰ２０は、このドライブ１２は異常と認識し、
図１０の処理を開始する。Before the processing of FIG. 10 starts, the MP 20 has already, in step 57 of FIG. 9, referred to the address table using the logical address (data name) designated by the CPU 1 and identified the Drive No. 24 of the drive 12 in which the requested data is stored, the SCSI Addr 27 within that drive 12, and the failure flag 25. In step 58, it examines the failure flag 25 in the address table for the drive 12 holding the data. If the failure flag 25 is on (1), the MP 20 recognizes the drive 12 as abnormal and starts the processing of FIG. 10.

【０１３３】まず、ＭＰ２０は、アドレステーブルの当
該データ名に対するキャッシュアドレス２３を調べ（ス
テップ６６）、キャッシュメモリ１６内に読み出したい
データが存在するかどうか判定する（ステップ６７）。
キャッシュアドレス２３のフィールドにアドレスが登録
されており、キャッシュメモリ１６内に読み出したいデ
ータが格納されている場合（キャッシュヒット）は、Ｍ
Ｐ２０が、キャッシュメモリ１６から当該データを読み
出しＣＰＵ１へ当該データを転送する（ステップ６
８）。キャッシュメモリ１６内に無い場合（キャッシュ
ミス）は、当該ドライブ１２に対し、その内部の当該デ
ータを読み出す制御を開始する（ステップ６９以降）。First, the MP 20 checks the cache address 23 for the data name in the address table (step 66) and determines whether the requested data is present in the cache memory 16 (step 67). If an address is registered in the cache address 23 field and the requested data is stored in the cache memory 16 (cache hit), the MP 20 reads the data from the cache memory 16 and transfers it to the CPU 1 (step 68). If the data is not in the cache memory 16 (cache miss), control to read the data held in the relevant drive 12 is started (step 69 onward).

【０１３４】キャッシュミスの場合、ＭＰ２０は、アド
レステーブルにより、パリティグループ内で旧データの
回復処理に関与する全データおよびパリティが格納され
ているドライブ１２のＤＤｒｉｖｅＮｏ．２４および
ＰＤｒｉｖｅＮｏ．２６、そのドライブ内の物理的な
アドレスであるＳＣＳＩ内Ａｄｄｒ２７、並びに障害フ
ラグ２５を認識する（ステップ６９）。On a cache miss, the MP 20 uses the address table to identify the DDrive No. 24 and PDrive No. 26 of the drives 12 that hold all the data and the parity in the parity group involved in recovering the old data, the SCSI Addr 27 that is the physical address within each drive, and the failure flag 25 (step 69).

【０１３５】次に、ＭＰ２０は、それぞれのドライブ１
２に対し、これらのデータおよびパリティを読み出し、
キャッシュメモリ１６に格納する（ステップ７０）。こ
のとき、これらのデータおよびパリティの中で、アドレ
ステーブルにおいてキャッシュアドレス２３にアドレス
が登録されているものは、もうすでにキャッシュメモリ
１６に存在しているとして、ＭＰ２０は、ドライブ１２
からの読み出し処理は行わない。Next, the MP 20 reads these data and the parity from their respective drives 12 and stores them in the cache memory 16 (step 70). Any of these data or parity whose address is already registered in the cache address 23 of the address table is regarded as already present in the cache memory 16, and the MP 20 does not read it from the drive 12.

【０１３６】なお、これらのデータおよびパリティをド
ライブ１２から読み出しキャッシュメモリ１６に格納す
る方法は、先に説明した新規書き込み時のパリティ更新
処理における旧パリティのドライブ１２からキャッシュ
メモリ１６への読み出し方法と同じである。The method of reading these data and the parity from the drives 12 and storing them in the cache memory 16 is the same as the method, described above for the parity update process of a new write, of reading the old parity from the drive 12 into the cache memory 16.

【０１３７】次に、ＭＰ２０は、ＰＧ１９に対し、デー
タの復元を指示する。ＰＧ１９は、ステップ７０で読み
出したデータおよびパリティを用いて排他的論理和計算
を行ない、障害ドライブ１２内に格納されている当該デ
ータを復元し、キャッシュメモリ１６に格納する（ステ
ップ７１）。このとき、ＣＡｄｐ１７は、ＭＰ２０に
対し、キャッシュメモリ１６にデータを格納することを
報告する。ＭＰ２０は、アドレステーブルのＣＰＵ１が
読み出し要求を発行した論理アドレス（データ名）のキ
ャッシュアドレス２３に、データを格納したキャッシュ
メモリ１６内のアドレスを登録する（ステップ７２）。
以降は、キャッシュヒット時と同様な手順で、ＣＰＵ１
へ当該データを転送する（ステップ７１）。Next, the MP 20 instructs the PG 19 to restore the data. The PG 19 computes the exclusive OR of the data and parity read in step 70, restores the data stored in the failed drive 12, and stores it in the cache memory 16 (step 71). At this time, the C Adp 17 reports to the MP 20 that the data is stored in the cache memory 16. The MP 20 registers the address in the cache memory 16 where the data was stored in the cache address 23 entry for the logical address (data name) of the CPU 1's read request in the address table (step 72). Thereafter, the data is transferred to the CPU 1 by the same procedure as on a cache hit (step 71).
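The exclusive OR reconstruction performed by the PG 19 can be illustrated with a short Python sketch. The function names are hypothetical, and the blocks are modelled as equal-length byte strings, which is an assumption about the data layout rather than something the source states.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a non-empty list of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Parity of a parity group: XOR of all its data blocks."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_data, parity):
    """Recover the one missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving_data) + [parity])
```

Because XOR is its own inverse, combining the surviving data blocks with the parity yields exactly the block that was on the failed drive, whatever the size of the parity group.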

【０１３８】（障害回復処理）(Failure recovery processing)

【０１３９】次に、ドライブ１２に障害が発生した場合
の、障害ドライブ１２内のデータを回復する手順につい
て説明する。図１１は、そのような障害回復の手順を示
すフローチャートである。Next, a procedure for recovering the data in the failed drive 12 when the failure occurs in the drive 12 will be described. FIG. 11 is a flowchart showing a procedure for such failure recovery.

【０１４０】まず、ＭＰ２０は、論理グループ１０内の
任意のドライブ１２に障害が発生したことを認識する
（ステップ７３）と、障害が発生したドライブ１２に関
するアドレステーブル上の全ての項に対し、障害フラグ
２５をオン（１）とする。例えば、ドライブＳＤ＃１に
障害が発生したときは、ＭＰ２０により、図４のアドレ
ステーブルに示すように、ドライブＳＤ＃１に関する項
の全ての障害フラグ２５がオン（１）とされる。その
後、ステップ７５以降の処理に進む。First, when the MP 20 recognizes that a failure has occurred in any drive 12 in the logical group 10 (step 73), it turns on (1) the failure flag 25 for every entry in the address table relating to the failed drive 12. For example, when a failure occurs in the drive SD#1, the MP 20 turns on (1) all the failure flags 25 of the entries for the drive SD#1, as shown in the address table of FIG. 4. Processing then proceeds to step 75 and the subsequent steps.

【０１４１】ステップ７５以降の処理の説明の前に、本
実施例の障害回復方式について例を用いて説明する。Before the description of the processing after step 75, the failure recovery method of the present embodiment will be described using an example.

【０１４２】図２および図４に示す例では、論理グルー
プ１０を構成するドライブ１２内には、５個のデータと
１個のパリティ（５Ｄ＋１Ｐ）、３個のデータと１個の
パリティ（３Ｄ＋１Ｐ）、および二重化の３種類のパリ
ティグループレベルのパリティグループが存在する。Ｄ
ＡＤＲ１，２，３はパリティグループレベルが５Ｄ＋１
Ｐのパリティグループで、ＤＡＤＲ４，５，６はパリテ
ィグループレベルが３Ｄ＋１Ｐと二重化のパリティグ
ループが混在している。In the example shown in FIGS. 2 and 4, the drives 12 constituting the logical group 10 contain parity groups of three parity group levels: five data and one parity (5D+1P), three data and one parity (3D+1P), and duplexing. DADR1, 2, and 3 hold parity groups at the 5D+1P level, while DADR4, 5, and 6 hold a mix of parity groups at the 3D+1P level and duplexed parity groups.

【０１４３】この３種類のパリティグループレベルで
は、５Ｄ＋１Ｐ、３Ｄ＋１Ｐ、二重化の順に信頼性は高
く、障害時の性能も高い。Among these three parity group levels, both reliability and performance at the time of failure increase in the order 5D+1P, 3D+1P, duplexing.

【０１４４】具体的には、ＤＡＤＲ１，２，３の５Ｄ＋
１Ｐのパリティグループでは、１台目のドライブ１２に
障害が発生し、回復処理前にさらに論理グループ１０内
の任意のドライブ１２に障害が発生すると、データ消失
となる。一方、ＤＡＤＲ４，５，６の３Ｄ＋１Ｐのパリ
ティグループでは、１台目のドライブ１２に障害が発生
し、論理グループ１０で回復処理前にさらにもう１台の
ドライブ１２に障害が発生しても、パリティグループを
構成しているデータが格納されているドライブ１２でな
ければデータ消失とはならない。さらに、二重化のパリ
ティグループに関しては、二重化のペアのドライブ１２
でなければデータ消失とはならない。Specifically, in the 5D+1P parity groups at DADR1, 2, and 3, if a first drive 12 fails and any other drive 12 in the logical group 10 fails before recovery, data is lost. In the 3D+1P parity groups at DADR4, 5, and 6, on the other hand, even if a first drive 12 fails and one more drive 12 in the logical group 10 fails before recovery, data is not lost unless that drive 12 holds data constituting the parity group. For a duplexed parity group, data is not lost unless the second failure is in the drive 12 that forms the duplexed pair.

【０１４５】このように、パリティグループの構成によ
り、２台目のドライブ１２に障害が発生することにより
データ消失する確率が異なる。逆にいえば、そのような
確率が異なるパリティグループは、異なるパリティグル
ープレベルとして分類している。As described above, the probability of data loss due to a failure in the second drive 12 differs depending on the configuration of the parity group. Conversely, parity groups having different probabilities are classified as different parity group levels.

【０１４６】また、このパリティグループレベルによる
分類は、以下に示すように、障害時における性能低下に
おいても適用できる。例えば、先の障害時における書き
込み、読み出し処理で説明したように、障害が発生した
ドライブ１２に対し書き込みまたは読み出し要求が発生
した場合、５Ｄ＋１Ｐでは５台のドライブ１２に対し読
み出し要求を発行しなければならない。しかし、３Ｄ＋
１Ｐでは３台のドライブ１２に対する読み出し要求です
み、二重化に対しては１台のみである。このように、障
害が発生したドライブ１２に対する読み出し要求または
書き込み要求が発生した場合、回復処理において読み出
し要求を発行しなければならないドライブ１２の数が５
Ｄ＋１Ｐ、３Ｄ＋１Ｐ、二重化の順に少ないため、この
順に障害時の性能は高い。This classification by parity group level also applies to performance degradation at the time of failure, as follows. As explained for the write and read processing at the time of failure, when a write or read request is issued to the failed drive 12, a 5D+1P parity group requires read requests to five drives 12. A 3D+1P parity group requires read requests to only three drives 12, and a duplexed parity group to only one. Because the number of drives 12 to which read requests must be issued during recovery decreases in the order 5D+1P, 3D+1P, duplexing, performance at the time of failure increases in that order.
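The drive counts quoted above (five, three, and one) follow from the group sizes, as the following Python sketch shows. The function name and the level strings such as `"5D+1P"` and `"duplex"` are illustrative labels, and the sketch assumes single-parity groups in which the failed drive held a data block.

```python
def reads_to_rebuild(level: str) -> int:
    """Drives that must be read to serve one request aimed at a failed drive."""
    if level == "duplex":
        return 1                      # only the surviving copy is read
    data_part, parity_part = level.upper().split("+")
    n_data = int(data_part.rstrip("D"))
    n_parity = int(parity_part.rstrip("P"))
    if n_parity != 1:
        raise ValueError("this sketch covers single-parity groups only")
    # The remaining data blocks plus the parity block are XORed together.
    return (n_data - 1) + n_parity
```

If the failed drive held the parity block instead, all the data blocks of the group would be read, which gives the same count.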

【０１４７】そこで、本実施例では、１台目のドライブ
１２に障害が発生し、この障害が発生したドライブ１２
内のデータを復元する回復処理を行う場合、パリティグ
ループレベルの低い（２台目の障害が発生した場合にデ
ータ消失となる確率（データ消失確率）の高い）パリテ
ィグループから、図１１および図１２に示すように、順
次回復処理を行なうようにしている。Therefore, in this embodiment, when a first drive 12 fails and recovery processing is performed to restore the data in the failed drive 12, recovery proceeds sequentially, as shown in FIGS. 11 and 12, starting from the parity groups with the lowest parity group level, that is, those with the highest probability of data loss if a second drive fails (the data loss probability).

【０１４８】現在、ドライブ１２の容量の増加に伴いデ
ィスクアレイの有力な適用先の一つであるファイルサー
バの大規模化が進み、多くのユーザが使用するようにな
ってきた。このため、ファイルサーバのディスクに障害
が発生し、障害回復のために長時間停止すると、多くの
ユーザの業務を停止することになり、大きな被害を生じ
る。このため、多くのユーザが使用している時間帯では
なく、なるべくファイルシステムを使用するユーザが少
ないときに停止して回復処理を行いたい。At present, as the capacity of the drive 12 increases, the file server, which is one of the promising application destinations of the disk array, has become larger in size, and has been used by many users. For this reason, when a failure occurs in the disk of the file server and the operation is stopped for a long time to recover from the failure, the work of many users is stopped, which causes a great damage. For this reason, it is desirable to stop and perform recovery processing when the number of users who use the file system is as small as possible rather than during the time when many users are using it.

【０１４９】本発明では、高い信頼性を確保するととも
に障害時の性能低下を最小限に抑さえ、回復処理を集中
してではなく、時間を分散して行う。そこで、具体的な
回復方法を以下に示す。In the present invention, high reliability is ensured and performance deterioration at the time of failure is suppressed to a minimum, and recovery processing is performed not in a concentrated manner but in a dispersed manner. Therefore, a specific recovery method is shown below.

【０１５０】図２および図４において、ドライブＳＤ＃
１に障害が発生したとする。ドライブＳＤ＃１には、Ｓ
ＣＳＩ内Ａｄｄｒ２７がＤＡＤＲ１の位置にＤａｔａ＃
１が、ＤＡＤＲ２にＤａｔａ＃６が、ＤＡＤＲ３にＤａ
ｔａ＃１１が、ＤＡＤＲ４にＤａｔａ＃１６が、ＤＡＤ
Ｒ５にＤａｔａ＃２０が、ＤＡＤＲ６にパリティが、そ
れぞれ格納されている。Ｄａｔａ＃１，Ｄａｔａ＃６，
Ｄａｔａ＃１１は５Ｄ＋１Ｐのパリティグループレベル
のパリティグループに所属し、Ｄａｔａ＃１６は二重化
のパリティグループレベルのパリティグループに所属
し、Ｄａｔａ＃２０，パリティは３Ｄ＋１Ｐのパリティ
グループレベルのパリティグループに所属している。In FIGS. 2 and 4, suppose that the drive SD#1 fails. The drive SD#1 stores Data#1 at SCSI Addr 27 position DADR1, Data#6 at DADR2, Data#11 at DADR3, Data#16 at DADR4, Data#20 at DADR5, and a parity at DADR6. Data#1, Data#6, and Data#11 belong to parity groups at the 5D+1P level, Data#16 belongs to a parity group at the duplexing level, and Data#20 and the parity belong to parity groups at the 3D+1P level.

【０１５１】そこで、ＭＰ２０は、まずデータ消失確率
が高く障害時の性能が低い５Ｄ＋１Ｐのパリティグルー
プレベルのパリティグループに所属するデータの回復を
考える。回復とは、具体的には、この障害が発生したド
ライブＳＤ＃１を正常なドライブ１２に交換したり（例
えば、ドライブの交換を促すメッセージを出力しユーザ
にドライブを交換させる）、ドライブ１２の障害に備え
予備のドライブ１２を予め用意してある場合は、この予
備のドライブ１２に切り換えるなどの処理である。Therefore, the MP 20 first considers recovery of the data belonging to the parity groups at the 5D+1P level, which has the highest data loss probability and the lowest performance at the time of failure. Concretely, recovery involves processing such as replacing the failed drive SD#1 with a normal drive 12 (for example, by outputting a message prompting the user to replace the drive) or, if a spare drive 12 has been prepared in advance against drive failures, switching to that spare drive 12.

【０１５２】図１１に戻って、上述したような回復処理
の手順を説明する。Returning to FIG. 11, the procedure of the recovery process as described above will be described.

【０１５３】まず、ＭＰ２０は、上位のＣＰＵ１から、
回復処理を行おうとしている障害ドライブ１２が所属す
る論理グループ１０へ発行される読み出しおよび書き込
み要求数（ＩＯ数）を調べる（ステップ７５）。そし
て、そのＩＯ数が、あらかじめ設定してある設定値以下
か否かを判別する（ステップ７６）。First, the MP 20 checks the number of read and write requests (the IO count) issued by the host CPU 1 to the logical group 10 to which the failed drive 12 to be recovered belongs (step 75). It then determines whether the IO count is less than or equal to a preset value (step 76).

【０１５４】ＩＯ数が、その設定値を越えている場合
に、障害が発生したドライブ１２への読み出しまたは書
き込み要求が発生したら、上述の障害ドライブへの読み
出し処理または書き込み処理で説明したように、パリテ
ィを用いて上位からの読み出しまたは書き込み要求に答
える（ステップ７７）。一方、ＩＯ数が設定値以下にな
ったら、５Ｄ＋１Ｐのパリティグループレベルのパリテ
ィグループに所属するデータの回復処理を行い障害フラ
グをオフにする（ステップ７８）。回復処理は、障害ド
ライブ１２への読み出しで行ったように個々のデータを
復元する。While the IO count exceeds the preset value, any read or write request to the failed drive 12 is served using the parity, as described above for the read and write processing for a failed drive (step 77). Once the IO count falls to or below the preset value, the data belonging to the parity groups at the 5D+1P level is recovered and the failure flags are turned off (step 78). Recovery restores the individual data items in the same way as a read from the failed drive 12.

【０１５５】個々のデータの復元について、具体的に説
明する。まず、ＭＰ２０は、Ｄａｔａ＃１の復元を行
う。ＭＰ２０は、アドレステーブルにより、Ｄａｔａ＃
２，３，４，５とこれらのデータから作成されたパリテ
ィが格納されているＤＤｒｉｖｅＮｏ．２４とＰＤｒ
ｉｖｅＮｏ．２６、それらのドライブ１２内の物理的
なアドレスであるＳＣＳＩ内Ａｄｄｒ２７、および障害
フラグ２５を認識する。The restoration of individual data is described concretely. First, the MP 20 restores Data#1. Using the address table, the MP 20 identifies the DDrive No. 24 and PDrive No. 26 of the drives holding Data#2, 3, 4, and 5 and the parity created from these data, the SCSI Addr 27 that is the physical address within each of those drives 12, and the failure flag 25.

【０１５６】次に、ＭＰ２０は、それぞれのドライブ１
２に対し、これらのデータおよびパリティを読み出し、
キャッシュメモリ１６に格納する。このとき、これらの
データおよびパリティの中で、図４のアドレステーブル
ではＤａｔａ＃２，４，５に対してキャッシュアドレス
２３にアドレスが登録されているため、これらのデータ
はもうすでにキャッシュメモリ１６に存在している。そ
の場合は、ＭＰ２０はドライブ１２からの読み出し処理
は行わない。Next, the MP 20 reads these data and the parity from their respective drives 12 and stores them in the cache memory 16. In the address table of FIG. 4, addresses are registered in the cache address 23 for Data#2, 4, and 5, so these data are already present in the cache memory 16. In that case, the MP 20 does not read them from the drives 12.

【０１５７】なお、これらのデータおよびパリティをド
ライブ１２から読み出しキャッシュメモリ１６に格納す
る方法は、先に説明したドライブ１２からキャッシュメ
モリ１６への読み出し方法と同じである。The method of reading these data and parity from the drive 12 and storing them in the cache memory 16 is the same as the method of reading from the drive 12 to the cache memory 16 described above.

【０１５８】ＭＰ２０は、ＰＧ１９に対し、データの復
元とキャッシュメモリ１６への格納を指示する。ＰＧ１
９は、上述したように読み出したデータおよびパリティ
で排他的論理和計算を行ない、障害ドライブ１２内に格
納されているＤａｔａ＃１を復元し、キャッシュメモリ
１６に格納する。このとき、ＣＡｄｐ１７は、ＭＰ２
０に対して、キャッシュメモリ１６にデータを格納する
ことを報告する。The MP 20 instructs the PG 19 to restore the data and store it in the cache memory 16. The PG 19 computes the exclusive OR of the data and parity read as described above, restores Data#1, which was stored in the failed drive 12, and stores it in the cache memory 16. At this time, the C Adp 17 reports to the MP 20 that the data is stored in the cache memory 16.

【０１５９】ＭＰ２０は、この報告を元に、アドレステ
ーブルのＤａｔａ＃１のキャッシュアドレス２３に、キ
ャッシュメモリ１６内のデータ格納アドレスを登録し、
その障害フラグ２５をオフ（０）とする。このように復
元したデータを、例えば交換した正常なドライブ１２に
格納する。Ｄａｔａ＃６，１１についても同様に復元
し、交換した正常なドライブ１２に格納する。Based on this report, the MP 20 registers the data storage address in the cache memory 16 in the cache address 23 of Data#1 in the address table and turns its failure flag 25 off (0). The data restored in this way is stored, for example, in the replacement normal drive 12. Data#6 and Data#11 are restored in the same way and stored in the replacement normal drive 12.

【０１６０】このように、５Ｄ＋１Ｐのパリティグルー
プレベルのパリティグループに所属するデータを復元
し、回復した後は、データ消失確率は低下し、障害時の
性能は向上する。As described above, after the data belonging to the parity group at the parity group level of 5D + 1P is restored and recovered, the probability of data loss is reduced and the performance at the time of failure is improved.

【０１６１】５Ｄ＋１Ｐのパリティグループレベルの回
復が完了したら、ＭＰ２０は、再び上位のＣＰＵ１か
ら、障害ドライブ１２（実際は、すでにステップ７８で
正常なドライブへの交換あるいは予備ドライブへの交換
が行われているが、この時点では復元されているのは５
Ｄ＋１Ｐのみであり、他のパリティグループレベルは復
元されていないので、便宜上、障害ドライブと呼ぶもの
とする）が所属する論理グループ１０へ発行される読み
出しおよび書き込み要求数（ＩＯ数）を調べ、そのＩＯ
数があらかじめ設定してある設定値以下か否かを判別す
る（ステップ７９）。When recovery of the 5D+1P parity group level is complete, the MP 20 again checks the number of read and write requests (the IO count) issued by the host CPU 1 to the logical group 10 to which the failed drive 12 belongs, and determines whether the IO count is less than or equal to the preset value (step 79). (In fact, the drive has already been replaced with a normal drive or switched to a spare drive in step 78; however, only the 5D+1P level has been restored at this point and the other parity group levels have not, so it is still called the failed drive for convenience.)

【０１６２】ＩＯ数が、その設定値を越えている場合
に、障害ドライブ１２の３Ｄ＋１Ｐまたは二重化のパリ
ティグループへの読み出しまたは書き込み要求が発生し
たら、上述の障害ドライブへの読み出し処理または書き
込み処理で説明したように、パリティを用いて上位から
の読み出しまたは書き込み要求に答える（ステップ８
０）。一方、ＩＯ数が設定値以下になったら、３Ｄ＋１
Ｐのパリティグループレベルのパリティグループに所属
するデータの回復処理を行い障害フラグをオフにする
（ステップ８１）。回復処理方法は、５Ｄ＋１Ｐのパリ
ティグループレベルと同様である。While the IO count exceeds the preset value, any read or write request to a 3D+1P or duplexed parity group on the failed drive 12 is served using the parity, as described above for the read and write processing for a failed drive (step 80). Once the IO count falls to or below the preset value, the data belonging to the parity groups at the 3D+1P level is recovered and the failure flags are turned off (step 81). The recovery method is the same as for the 5D+1P parity group level.

【０１６３】このように、３Ｄ＋１Ｐのパリティグルー
プレベルのパリティグループに所属するデータを復元
し、回復した後は、データ消失確率はさらに低下し、障
害時の性能はさらに向上する。As described above, after the data belonging to the parity group of the parity group level of 3D + 1P is restored and recovered, the probability of data loss further decreases and the performance at the time of failure further improves.

【０１６４】３Ｄ＋１Ｐのパリティグループレベルの回
復が完了したら、ＭＰ２０は、再び上位のＣＰＵ１か
ら、障害ドライブ１２が所属する論理グループ１０へ発
行される読み出しおよび書き込み要求数（ＩＯ数）を調
べ、そのＩＯ数があらかじめ設定してある設定値以下か
否かを判別する（ステップ８２）。When the recovery of the parity group level of 3D + 1P is completed, the MP 20 checks the number of read and write requests (the number of IOs) issued from the upper CPU 1 to the logical group 10 to which the failed drive 12 belongs, and the IO It is determined whether the number is less than or equal to a preset value (step 82).

【０１６５】ＩＯ数が、その設定値を越えている場合
に、障害ドライブ１２の二重化のパリティグループへの
読み出しまたは書き込み要求が発生したら、上述の障害
ドライブへの読み出し処理または書き込み処理で説明し
たように、二重化の障害ドライブでない方のドライブを
用いて上位からの読み出しまたは書き込み要求に答える
（ステップ８３）。一方、ＩＯ数が設定値以下になった
ら、二重化のパリティグループレベルのパリティグルー
プに所属するデータの回復処理を行い障害フラグをオフ
にする（ステップ８４）。While the IO count exceeds the preset value, any read or write request to a duplexed parity group on the failed drive 12 is served, as described above for the read and write processing for a failed drive, using the drive of the duplexed pair that has not failed (step 83). Once the IO count falls to or below the preset value, the data belonging to the parity groups at the duplexing level is recovered and the failure flags are turned off (step 84).

【０１６６】回復処理方法は、例えば、ドライブＳＤ＃
１に格納されているＤａｔａ＃１６の二重化データが格
納されているドライブＳＤ＃２からＤａｔａ＃１６をキ
ャッシュメモリ１６に読み出して、キャッシュメモリ１６
からドライブＳＤ＃１のＤＡＤＲ４にこのデータを書き
込むことにより行う。ドライブＳＤ＃２からＤａｔａ＃
１６をキャッシュメモリ１６に読み出すのは、先に述べ
た正常時の読み出し処理における、キャッシュミスの時
と同じである。また、キャッシュメモリ１６からドライ
ブＳＤ＃１にＤａｔａ＃１６を書き込むのは、先に述べ
た正常時の書き込み処理と同じである。In this recovery method, for example, Data#16 is read into the cache memory 16 from the drive SD#2, which holds the duplexed copy of the Data#16 stored in the drive SD#1, and is then written from the cache memory 16 to DADR4 of the drive SD#1. Reading Data#16 from the drive SD#2 into the cache memory 16 is the same as the cache-miss case of the normal read process described above. Writing Data#16 from the cache memory 16 to the drive SD#1 is the same as the normal write process described above.
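The staged recovery of steps 75 to 84 amounts to a loop that rebuilds one parity group level at a time, and only while host I/O is light. The Python sketch below models this under stated assumptions: the IO count is sampled once per polling interval, at most one level is rebuilt per interval, and the names `staged_recovery`, `io_samples`, and `threshold` are hypothetical.

```python
# Recovery order of this embodiment: highest data loss probability first.
RECOVERY_ORDER = ["5D+1P", "3D+1P", "duplex"]


def staged_recovery(io_samples, threshold, order=RECOVERY_ORDER):
    """Return [(interval_index, level)] showing when each level is rebuilt.

    io_samples: observed IO counts, one per polling interval.
    threshold:  the preset value compared in steps 76, 79 and 82.
    """
    pending = list(order)
    schedule = []
    for tick, io_count in enumerate(io_samples):
        if not pending:
            break
        if io_count <= threshold:
            # Quiet interval: rebuild the next level (steps 78, 81, 84).
            schedule.append((tick, pending.pop(0)))
        # Busy interval: keep serving degraded I/O (steps 77, 80, 83).
    return schedule
```

For example, with IO samples `[9, 2, 8, 7, 1, 3]` and a threshold of 3, the three levels are rebuilt in intervals 1, 4, and 5, so the recovery work is spread over the quiet periods instead of being done in one block.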

【０１６７】図１２（ａ）は、従来方法の障害回復処理
と本実施例の障害回復処理とで、データ消失確率がどの
ように変化するかを示したものである。本実施例では、
ＩＯ状況が設定値以下の場合に、５Ｄ＋１Ｐの回復、３
Ｄ＋１Ｐの回復、および二重化の回復に、分散して回復
処理を行っている。FIG. 12(a) shows how the data loss probability changes under the failure recovery process of the conventional method and under that of this embodiment. In this embodiment, recovery is performed when the IO load is at or below the preset value and is distributed over time into 5D+1P recovery, 3D+1P recovery, and duplexing recovery.

【０１６８】５Ｄ＋１Ｐの回復を行う前の状態では、５
Ｄ＋１Ｐのパリティグループレベルについては、この時
点で障害を起こしているドライブ以外のどのドライブが
障害を起こしたとしても、１００％データ消失する。５
Ｄ＋１Ｐの回復を行った後では、３Ｄ＋１Ｐのパリティ
グループレベルについては、障害ドライブ以外の５台の
ドライブのうち３Ｄ＋１Ｐに関与する２台の何れかが障
害を起こさない限りデータ消失はしないから、データ消
失確率は４０％といえる。Before 5D+1P recovery is performed, for the 5D+1P parity group level, data is lost with 100% probability whichever drive other than the currently failed drive fails. After 5D+1P recovery, for the 3D+1P parity group level, data is not lost unless one of the two drives involved in 3D+1P among the five drives other than the failed drive fails, so the data loss probability can be said to be 40%.

【０１６９】さらに、３Ｄ＋１Ｐの回復を行った後で
は、二重化のパリティグループレベルについては、障害
ドライブ以外の５台のドライブのうち二重化に関与する
正常な１台が障害を起こさない限りデータ消失はしない
から、データ消失確率は２０％といえる。二重化の回復
を行った後は、回復処理がすべて完了し、データ消失確
率は０％となる。Furthermore, after the recovery of 3D + 1P, with respect to the parity group level of duplication, no data is lost unless one of the five drives other than the faulty drive that is involved in duplication fails. Therefore, it can be said that the data loss probability is 20%. After the duplex recovery is completed, the recovery process is completed and the data loss probability becomes 0%.
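The percentages above can be reproduced with a small Python calculation. The counts of critical drives (5, 2, and 1 out of 5 surviving drives) are taken directly from the text; the function name and the rule of reporting the most exposed unrecovered level are assumptions that follow the text's per-level accounting. A calculation over the union of critical drives could give a higher figure if the levels depend on different drives.

```python
from fractions import Fraction

SURVIVING_DRIVES = 5  # drives in the logical group other than the failed one

# Surviving drives whose failure would lose data, per unrecovered level
# (figures as stated in the description).
CRITICAL_DRIVES = {"5D+1P": 5, "3D+1P": 2, "duplex": 1}


def loss_probability(unrecovered_levels):
    """Probability that a second drive failure loses data."""
    if not unrecovered_levels:
        return Fraction(0)
    worst = max(CRITICAL_DRIVES[level] for level in unrecovered_levels)
    return Fraction(worst, SURVIVING_DRIVES)
```

Calling the function with all three levels unrecovered, then with 5D+1P recovered, then with only the duplexed level remaining, then with nothing remaining gives 1, 2/5, 1/5, and 0, matching the 100%, 40%, 20%, and 0% steps described for FIG. 12(a).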

【０１７０】これに対し、従来は、例えば６台のドライ
ブのすべてを用いて５Ｄ＋１Ｐのパリティグループを設
定するので、障害ドライブ以外の５台のうちどの１台が
障害を起こしてもデータは消失し、したがって障害ドラ
イブの全体を回復するまではデータ消失確率は１００％
である。In contrast, the conventional method sets, for example, a 5D+1P parity group across all six drives, so data is lost whichever one of the five drives other than the failed drive fails. The data loss probability therefore remains 100% until the entire failed drive has been recovered.

【０１７１】図１２（ｂ）は、従来方法の障害回復処理
と本実施例の障害回復処理とで、性能がどのように変化
するかを示したものである。本実施例では、５Ｄ＋１Ｐ
の回復、３Ｄ＋１Ｐの回復、および二重化の回復を行う
ごとに、性能が徐々に回復する。これに対し、従来方法
では、回復がすべて完了するまで、性能は低下したまま
である。FIG. 12(b) shows how performance changes under the failure recovery process of the conventional method and under that of this embodiment. In this embodiment, performance recovers gradually with each of the 5D+1P recovery, the 3D+1P recovery, and the duplexing recovery. In the conventional method, by contrast, performance remains degraded until recovery is entirely complete.

【０１７２】なお、本実施例では、図２に示すような論
理グループ１０内のデータ配置で説明してきたが、論理
グループ１０内のデータ配置は、ユーザの初期設定の際
に自由に設定可能なため、制約はない。また、レベル５
において、パリティグループレベルを４Ｄ＋２Ｐのよう
にパリティグループ内のパリティ数を増加させ、さらに
信頼性（データ消失確率）を向上させたパリティグルー
プの設定も可能である。Although this embodiment has been described using the data arrangement within the logical group 10 shown in FIG. 2, the arrangement is not restricted, because the user can set it freely at initial setup. It is also possible, in level 5, to set parity groups whose parity group level, such as 4D+2P, increases the number of parities in the parity group and thereby further improves reliability (lowers the data loss probability).

【０１７３】また、本実施例では、信頼性（データ消失
確率）と障害時の性能の観点から、データ消失確率が高
く、障害時の性能低下が大きいパリティグループレベル
の順（５Ｄ＋１Ｐ、３Ｄ＋１Ｐ、および二重化の順）に
回復処理を時間を分散して行った。In this embodiment, from the viewpoints of reliability (data loss probability) and performance at the time of failure, recovery was distributed over time and performed in order starting from the parity group level with the highest data loss probability and the largest performance degradation at the time of failure (in the order 5D+1P, 3D+1P, duplexing).

【０１７４】この方法の変形例として以下のような方法
もある。まず、データの重要性が高いほど、消失確率が
低いパリティグループレベル（消失確率は、二重化、３
Ｄ＋１Ｐ、５Ｄ＋１Ｐの順に高くなる）に格納する。ま
た、回復処理は、データの消失確率が低いパリティグル
ープレベルの順（二重化、３Ｄ＋１Ｐ、５Ｄ＋１Ｐの
順）に行う。このように信頼性のみを重視した回復処理
を行うことにより、非常に重要なデータに対するデータ
消失確率を大きく減少させることが可能となる。The following variation of this method is also possible. First, the more important the data, the lower the loss probability of the parity group level in which it is stored (the loss probability increases in the order duplexing, 3D+1P, 5D+1P). Recovery is then performed in order starting from the parity group level with the lowest data loss probability (in the order duplexing, 3D+1P, 5D+1P). By performing recovery that prioritizes reliability alone in this way, the data loss probability for very important data can be greatly reduced.

【０１７５】さらに、以下のようにしてもよい。まず、
頻繁にＣＰＵ１から読み出しまたは書き込み要求が発行
されるデータは、障害時の性能低下が小さくデータ消失
確率の小さい二重化のパリティグループに格納する。ま
た逆に、それ程、読み出しまたは書き込み要求が発行さ
れることがないデータは、障害時の性能低下が大きく、
しかもデータ消失確率の高い５Ｄ＋１Ｐのパリティグル
ープに格納する。また、ドライブ１２に障害が発生した
場合、データ消失確率の高いパリティグループから回復
処理を行うようにする。これにより、障害により性能低
下している期間を最小にすることが可能となる。The following arrangement is also possible. Data for which the CPU 1 frequently issues read or write requests is stored in duplexed parity groups, which suffer little performance degradation at the time of failure and have a low data loss probability. Conversely, data for which read or write requests are seldom issued is stored in 5D+1P parity groups, which suffer large performance degradation at the time of failure and have a high data loss probability. When a drive 12 fails, recovery is performed starting from the parity groups with the highest data loss probability. This makes it possible to minimize the period during which performance is degraded by the failure.

【０１７６】以上のように、本発明では、ユーザの使用
環境により、回復処理方法を自由に設定することも可能
である。さらに、回復処理を行う時間、および回復処理
の開始を判定する際の上位からの読み出し書き込み要求
の状況に対する設定値も、自由に設定することが可能で
ある。回復処理方法の設定は、ＭＰ２０に対し初期設定
の段階で指示する。As described above, according to the present invention, it is possible to freely set the recovery processing method depending on the usage environment of the user. Furthermore, it is possible to freely set the time for performing the recovery process and the set value for the status of the read / write request from the upper layer when determining the start of the recovery process. The setting of the recovery processing method is instructed to the MP 20 at the initial setting stage.
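The selectable recovery orders described in this and the preceding paragraphs can be expressed as a sort over the parity group levels. In the Python sketch below, the policy names `"exposure_first"` and `"important_first"` and the function name are hypothetical labels; only the two orderings themselves come from the description.

```python
# Rank by data loss probability, lowest to highest, as stated in the text.
LOSS_RANK = {"duplex": 0, "3D+1P": 1, "5D+1P": 2}


def recovery_order(levels, policy):
    """Order parity group levels for recovery under a chosen policy.

    "exposure_first":  highest loss probability first (the main embodiment).
    "important_first": lowest loss probability first (the variation in which
                       important data is placed in the most reliable level).
    """
    if policy == "exposure_first":
        descending = True
    elif policy == "important_first":
        descending = False
    else:
        raise ValueError(f"unknown policy: {policy}")
    return sorted(levels, key=lambda level: LOSS_RANK[level], reverse=descending)
```

Such a policy value would be one of the settings given to the MP 20 at initial setup, alongside the IO threshold used to decide when recovery may start.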

【０１７７】さらに、以下に示すように、本発明はバッ
クアップにも利用することが可能である。ドライブ１２
に書き込まれているデータが重要な場合、ドライブ１２
内のデータをＭＴあるいは光ディスク等に格納すること
でバックアップを取る。このようにバックアップを取っ
ておけば、ディスクアレイ内のドライブ１２に障害が発
生し、データ消失しても、このバックアップデータから
消失したデータを回復することが可能である。As described below, the present invention can also be applied to backup. When the data written to the drives 12 is important, it is backed up by storing the data in the drives 12 on magnetic tape (MT), an optical disk, or the like. With such a backup, even if a failure occurs in a drive 12 in the disk array and data is lost, the lost data can be recovered from the backup data.

【０１７８】本発明を適用することにより、パリティグ
ループレベルの特性により、このバックアップ処理を行
う時間を分散することができる。具体的には、消失確率
は二重化、３Ｄ＋１Ｐ、５Ｄ＋１Ｐのパリティグループ
の順に高くなる。そこで、消失確率の高いパリティグル
ープほど頻繁にバックアップを取るようにする。これに
より、データ消失確率の高い危険なデータのみのバック
アップですむため、バックアップ時間を短縮することが
可能となり、また、バックアップを行う時間をパリティ
グループレベルの特性により分散することが可能とな
る。By applying the present invention, the time spent on this backup processing can be spread out according to the characteristics of the parity group levels. Specifically, the data loss probability increases in the order duplexing, 3D + 1P, 5D + 1P. Backups are therefore taken more frequently for parity groups with a higher data loss probability. Because only the at-risk data with a high data loss probability then needs to be backed up at any one time, the backup time can be shortened, and the backup work can be distributed over time according to the characteristics of the parity group levels.
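The idea of backing up riskier parity group levels more often can be illustrated with a small schedule. This is only a sketch: the interval values and the function name are assumptions made for the example, and the text does not prescribe any particular intervals.

```python
# Hypothetical backup intervals in days per parity group level.
# Riskier levels (higher data loss probability) get shorter intervals.
BACKUP_INTERVAL_DAYS = {"5D+1P": 1, "3D+1P": 3, "duplexed": 7}


def levels_due(day):
    """Return the parity group levels whose backup falls on the given day (day 0 = start)."""
    return [lv for lv, every in BACKUP_INTERVAL_DAYS.items() if day % every == 0]
```

On most days only the 5D + 1P areas are backed up, so each backup run is short and the work is spread across the week.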

【０１７９】また、障害回復と同様に、データの重要性
が高いほど、データ消失確率が低いパリティグループレ
ベル（消失確率は、二重化、３Ｄ＋１Ｐ、５Ｄ＋１Ｐの
順に高くなる）に格納し、バックアップはデータの消失
確率が低いパリティグループレベルの順（二重化、３Ｄ
＋１Ｐ、５Ｄ＋１Ｐの順）に行う信頼性のみを重視した
バックアップも可能なことは明らかである。As with failure recovery, a backup that emphasizes reliability alone is clearly also possible: the more important the data, the lower the data loss probability of the parity group level in which it is stored (the loss probability increases in the order duplexing, 3D + 1P, 5D + 1P), and backups are taken starting from the parity group level with the lowest data loss probability (in the order duplexing, 3D + 1P, 5D + 1P).

【０１８０】このバックアップは、障害回復処理と同様
に、ユーザが予め設定した値以下になったらバックアッ
プ処理を開始するようにすることも可能である。バック
アップは、システムからバックアップを促すメッセージ
を出力してユーザに行わせるようにしたり、自動的にパ
ックアップを取るようにしてもよい。As with failure recovery processing, backup processing can be set to start when the monitored value falls to or below a value preset by the user. The backup may be left to the user, with the system outputting a message that prompts the user to take it, or it may be taken automatically.

【０１８１】従来では１台のドライブ１２内のデータの
総てに対しバックアップを取らなければならなかったた
め、バックアップの時間が非常に長くかかり、バックア
ップ中は通常の読み出しおよび書き込み処理を中止しな
けばならない。ファイルサーバのように多数のユーザが
使用するシステムにおいて、長時間連続してサービスを
中止することは、障害回復と同様に大きな問題となる。
本発明をバックアップに適用することにより、バックア
ップを行う時間をパリティグループレベルの特性により
分散することが可能となるので、本発明はファイルサー
バなどに用いて好適である。Conventionally, all of the data in a drive 12 had to be backed up at once, so the backup took a very long time, and normal read and write processing had to be suspended while it ran. In a system used by many users, such as a file server, suspending service continuously for a long time is as serious a problem as it is in failure recovery. By applying the present invention to backup, the time spent on backup can be distributed according to the characteristics of the parity group levels, so the present invention is well suited to file servers and the like.

【０１８２】［実施例２］次に、第２の実施例として、
ＲＡＩＤのレベル３において本発明を適用した例を示
す。本実施例のディスクアレイの構成や処理の手順など
は、上記第１の実施例と同様であるので、以下では第１
の実施例と異なる点を説明する。[Embodiment 2] Next, as a second embodiment, an example in which the present invention is applied to RAID level 3 is described. The disk array configuration and the processing procedures of this embodiment are the same as in the first embodiment, so only the differences from the first embodiment are described below.

【０１８３】図１３は、本実施例での論理グループ１０
内のパリティグループの配置の一例を示す。論理グルー
プ１０において、ＳＣＳＩ内Ａｄｄｒ２７がＤＡＤＲ
１，２，３は５Ｄ＋１Ｐのパリティグループレベルのパ
リティグループで、ＳＣＳＩ内Ａｄｄｒ２７がＤＡＤＲ
４，５，６は２Ｄ＋１Ｐのパリティグループレベルのパ
リティグループである。FIG. 13 shows an example of the arrangement of parity groups in the logical group 10 of this embodiment. In the logical group 10, the areas whose address in the SCSI drive (Addr in SCSI) 27 is DADR 1, 2, or 3 hold parity groups of the 5D + 1P parity group level, and the areas whose address 27 is DADR 4, 5, or 6 hold parity groups of the 2D + 1P parity group level.

【０１８４】本実施例では、データ消失確率は、５Ｄ＋
１Ｐの方が２Ｄ＋１Ｐより高い。また、レベル３では、
５Ｄ＋１Ｐおよび２Ｄ＋１Ｐ共、障害時と正常時では性
能は変わらない。In this embodiment, the data loss probability is higher for 5D + 1P than for 2D + 1P. At level 3, for both 5D + 1P and 2D + 1P, performance is the same during a failure as in normal operation.

【０１８５】５Ｄ＋１Ｐのパリティグループレベルのパ
リティグループについて、ＤＡＤＲ１のＤａｔａ＃１を
例に説明する。５Ｄ＋１Ｐのパリティグループは、上位
から転送されてきた１個のデータであるＤａｔａ＃１を
５個のサブデータ（Ｄａｔａ＃１−１，Ｄａｔａ＃１−
２，Ｄａｔａ＃１−３，Ｄａｔａ＃１−４，Ｄａｔａ＃
１−５）に分割し、このそれぞれのサブデータをＳＤ＃
１，２，３，４，５の５台のドライブ１２にパラレルに
書き込む。このとき、これらのサブデータから実施例１
と同様に図５に示すようにパリティを作成し、データの
書き込みと同時にドライブＳＤ＃６に書き込む。A parity group of the 5D + 1P parity group level is described using Data#1 at DADR1 as an example. In a 5D + 1P parity group, Data#1, a single piece of data transferred from the host, is divided into five pieces of sub-data (Data#1-1, Data#1-2, Data#1-3, Data#1-4, Data#1-5), and these pieces are written in parallel to the five drives 12 SD#1, 2, 3, 4, and 5. At the same time, parity is created from the sub-data as shown in FIG. 5, in the same way as in Embodiment 1, and is written to drive SD#6 simultaneously with the data.
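The 5D + 1P write just described can be sketched as below. Two points are assumptions made for the illustration: the parity of FIG. 5 is taken to be the usual byte-wise exclusive OR, and the data is split into contiguous, zero-padded pieces; the function names are hypothetical.

```python
from functools import reduce


def split_into_subdata(data, n):
    """Split one host data block into n equal-sized pieces, zero-padding the tail."""
    size = -(-len(data) // n)  # ceiling division
    padded = data.ljust(size * n, b"\x00")
    return [padded[i * size:(i + 1) * size] for i in range(n)]


def xor_parity(pieces):
    """Byte-wise XOR of equal-length pieces (assumed parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*pieces))


# Data#1 split into Data#1-1 .. Data#1-5 for drives SD#1 .. SD#5
subdata = split_into_subdata(b"0123456789", 5)
# Parity for drive SD#6, computed from the new sub-data only
parity = xor_parity(subdata)
```

Because every piece of the stripe is written together, the parity is available as soon as the host data arrives, with no prior read from the drives.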

【０１８６】また、２Ｄ＋１Ｐのパリティグループレベ
ルのパリティグループは、上位から転送されてきた１個
のデータを２個のサブデータに分割し、このそれぞれの
サブデータを２台のドライブ１２にパラレルに書き込
む。このとき、これらのサブデータからパリティを作成
し、データの書き込みと同時に書き込む。Similarly, in a parity group of the 2D + 1P parity group level, a single piece of data transferred from the host is divided into two pieces of sub-data, which are written in parallel to two drives 12. Parity is created from these pieces of sub-data and written at the same time as the data.

【０１８７】本実施例において、上記のようにパリティ
グループレベルを設定する方法は、実施例１と同様に、
図１４に示すようなアドレステーブルに対しユーザが初
期設定において行う。また、アドレス変換も実施例１と
同様に図１４に示すアドレステーブルを用いて行う。In this embodiment, as in Embodiment 1, the parity group levels described above are set by the user at initial setup in an address table such as the one shown in FIG. 14. Address translation is also performed using the address table shown in FIG. 14, as in Embodiment 1.

【０１８８】本実施例で用いる図１４のアドレステーブ
ルは、ほぼ図４のアドレステーブルと同じであるが、本
実施例はＲＡＩＤレベル３であるから、サブデータ名の
フィールド２８が加えられている。The address table of FIG. 14 used in this embodiment is almost the same as the address table of FIG. 4, but since this embodiment is RAID level 3, a sub data name field 28 is added.

【０１８９】本実施例では、データを新規または更新に
より書き込む場合にパリティを更新する処理が実施例１
とは異なる。実施例１では、レベル５のパリティ更新処
理を行うが、本実施例ではレベル３のパリティ更新処理
を行う。レベル３のパリティ更新処理では、レベル５の
ように旧データおよび旧パリティの読み出しは必要では
なく、書き込むデータから図５に示すようなパリティの
作成が可能である。それ以外においては、実施例１で示
した手順により、図１４に示すアドレステーブルに従い
書き込み処理を行う。In this embodiment, the parity update performed when data is written, whether as new data or as an update, differs from Embodiment 1. Embodiment 1 performs the level 5 parity update, whereas this embodiment performs the level 3 parity update. Unlike level 5, the level 3 parity update does not need to read the old data and old parity; the parity can be created from the data being written, as shown in FIG. 5. In all other respects, write processing follows the procedure described in Embodiment 1, using the address table shown in FIG. 14.
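The difference between the two parity updates can be made concrete with a short sketch. It assumes XOR parity and the conventional read-modify-write form for level 5; the function names are hypothetical and the code is not taken from the embodiments.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def level3_parity(new_subdata):
    """Level 3: the whole stripe is rewritten, so parity comes from the new data alone."""
    parity = bytes(len(new_subdata[0]))
    for piece in new_subdata:
        parity = xor_bytes(parity, piece)
    return parity


def level5_parity_update(old_data, old_parity, new_data):
    """Level 5: only one block changes, so old data and old parity must be read first."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)
```

Both routes give the same parity for the same final stripe contents; the level 5 route simply pays for two extra reads because it does not have the other blocks of the stripe at hand.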

【０１９０】一方、障害時の書き込みにおいては、実施
例１に示したレベル５では、パリティの更新のみを行っ
たが、本実施例のレベル３ではパリティグループにおい
て、障害が発生しているドライブ以外の正常なドライブ
に対してサブデータおよびパリティを正常時と同様にパ
ラレルに書き込む。For writes during a failure, level 5 in Embodiment 1 only updated the parity, whereas level 3 in this embodiment writes the sub-data and parity in parallel, as in normal operation, to the normal drives of the parity group other than the failed drive.

【０１９１】また、障害時の読み出し処理では、実施例
１のレベル５と同様に、パリティグループにおいて、障
害が発生しているドライブ以外の正常なドライブからサ
ブデータおよびパリティを読み出し、これらから障害ド
ライブに格納されているサブデータを復元する。レベル
３では、このようにして復元したサブデータと正常なド
ライブから読み出したサブデータとを結合して、ＣＰＵ
１へ転送する。In read processing during a failure, as with level 5 in Embodiment 1, the sub-data and parity are read from the normal drives of the parity group other than the failed drive, and the sub-data stored on the failed drive is reconstructed from them. At level 3, the reconstructed sub-data is then combined with the sub-data read from the normal drives and transferred to the CPU 1.
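A read during a failure, as described above, might look like the following sketch. XOR parity is again an assumption, and `degraded_read` and `xor_all` are hypothetical names introduced for the example.

```python
def xor_all(pieces):
    """Byte-wise XOR of a list of equal-length byte strings."""
    out = bytes(len(pieces[0]))
    for p in pieces:
        out = bytes(x ^ y for x, y in zip(out, p))
    return out


def degraded_read(subdata, parity, failed_index):
    """Rebuild the piece on the failed drive and return the recombined data.

    subdata holds one piece per data drive, with None at failed_index.
    """
    survivors = [p for i, p in enumerate(subdata) if i != failed_index]
    rebuilt = xor_all(survivors + [parity])  # missing piece = XOR of all the rest
    restored = list(subdata)
    restored[failed_index] = rebuilt
    return b"".join(restored)
```

The reconstruction uses exactly the pieces that a normal level 3 read would fetch anyway, plus the parity, which is consistent with the statement that level 3 performance is the same during a failure as in normal operation.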

【０１９２】論理グループ１０を構成する任意のドライ
ブ１２に障害が発生した場合の障害回復処理は、実施例
１と同様に図５に示すように、障害が発生したドライブ
１２内に格納されているサブデータを、残りの正常なド
ライブ１２に格納されているデータとパリティから復元
し、交換した正常なドライブ１２または予備のドライブ
１２に格納することで回復処理を行う。回復処理方法
は、実施例１と同様に、パリティグループレベルの低い
（２台目の障害が発生した場合にデータ消失となる確率
（データ消失確率）の高い）パリティグループから順次
回復処理を行なう。When a failure occurs in any drive 12 of the logical group 10, failure recovery is performed as in Embodiment 1 and as shown in FIG. 5: the sub-data stored on the failed drive 12 is reconstructed from the data and parity stored on the remaining normal drives 12 and is stored on the replacement drive 12 or on a spare drive 12. As in Embodiment 1, recovery proceeds sequentially starting from the parity groups with the lowest parity group level, that is, those with the highest probability of data loss if a second drive fails (the data loss probability).
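The staged recovery order for the layout of FIG. 13 can be sketched as follows. The dictionary form of the layout, the numeric ranks, and the function name are assumptions for illustration; only the facts that DADR 1 to 3 are 5D + 1P, DADR 4 to 6 are 2D + 1P, and 5D + 1P has the higher data loss probability come from the text.

```python
# Data loss probability rank per parity group level; a higher rank is rebuilt earlier.
RISK_RANK = {"5D+1P": 2, "2D+1P": 1}

# Hypothetical view of FIG. 13: address in the drive (DADR) -> parity group level.
LAYOUT = {1: "5D+1P", 2: "5D+1P", 3: "5D+1P", 4: "2D+1P", 5: "2D+1P", 6: "2D+1P"}


def rebuild_plan(layout):
    """Return rebuild stages as (level, [DADR, ...]), riskiest parity group level first."""
    stages = {}
    for dadr, level in layout.items():
        stages.setdefault(level, []).append(dadr)
    ordered = sorted(stages, key=lambda lv: RISK_RANK[lv], reverse=True)
    return [(lv, sorted(stages[lv])) for lv in ordered]
```

Each stage can be run separately, with normal read and write processing resumed between stages, which is what allows the recovery work to be spread over time.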

【０１９３】これにより、信頼性と障害時の性能を最小
限に抑さえ、回復処理を集中せずに、時間を分散して行
うことが可能となる。As a result, the loss of reliability and of performance during a failure is kept to a minimum, and recovery processing can be spread over time rather than performed all at once.

【０１９４】また、本実施例の変形例として、実施例１
の変形例と同様に、データの重要性が高いほど、消失確
率が低いパリティグループレベルに格納し、回復処理は
データの消失確率が低いパリティグループレベルの順に
行うようにしてもよい。このように信頼性のみを重視し
た回復処理を行うことにより、非常に重要なデータに対
するデータ消失確率を大きく減少させることが可能とな
る。As a variation of this embodiment, as in the variation of Embodiment 1, more important data may be stored at a parity group level with a lower data loss probability, and recovery processing may be performed starting from the parity group level with the lowest data loss probability. Recovery processing that emphasizes reliability alone in this way can greatly reduce the data loss probability for very important data.

【０１９５】以上述べたように、本実施例と実施例１で
は、論理グループ１０を構成するパリティグループが、
レベル５かレベル３の違いでのみで、実施例１と同様の
効果がある。このことから、本実施例の構成において、
ＲＡＩＤのレベル５を適用することも当然可能である。As described above, this embodiment differs from Embodiment 1 only in whether the parity groups that make up the logical group 10 are level 5 or level 3, and it provides the same effects as Embodiment 1. It follows that RAID level 5 can naturally also be applied in the configuration of this embodiment.

【０１９６】［実施例３］上記第１の実施例ではレベル
５を基に論理グループを構成するパリティグループにお
いて、パリティグループレベルを自由に設定する方法を
示した。また、上記第２の実施例では、レベル３を基に
論理グループを構成するパリティグループにおいて、パ
リティグループレベルを自由に設定する方法を示した。
このようにすることにより、信頼性と障害時の性能を最
小限に抑さえ、回復処理の時間を分散して行うことが可
能となった。[Embodiment 3] The first embodiment showed how to set parity group levels freely for the parity groups that make up a logical group based on level 5. The second embodiment showed how to do the same for parity groups based on level 3. In both cases this kept the loss of reliability and of performance during a failure to a minimum and allowed the recovery processing time to be spread out.

【０１９７】そこで、本実施例では、論理グループを構
成するパリティグループにおいて、異なるＲＡＩＤのレ
ベルを設定する例を説明する。ユーザは、初期設定の段
階で、アドレステーブル上でパリティグループレベルを
設定する際に、異なるＲＡＩＤのレベルを設定すること
ができる。This embodiment therefore describes an example in which different RAID levels are set for the parity groups that make up a logical group. When setting the parity group levels in the address table at the initial setting stage, the user can also set different RAID levels. The differences from Embodiment 1 are described below.

【０１９８】図１５は、本実施例での論理グループ１０
内のパリティグループの配置の一例を示す。論理グルー
プ１０において、ＳＣＳＩ内Ａｄｄｒ２７がＤＡＤＲ１
はＲＡＩＤのレベル３において５Ｄ＋１Ｐのパリティグ
ループレベルのパリティグループで、ＳＣＳＩ内Ａｄｄ
ｒ２７がＤＡＤＲ２，３はＲＡＩＤのレベル５において
５Ｄ＋１Ｐのパリティグループレベルのパリティグルー
プで、ＳＣＳＩ内Ａｄｄｒ２７がＤＡＤＲ４はＲＡＩＤ
のレベル３において３Ｄ＋１Ｐのパリティグループレベ
ルのパリティグループと二重化のパリティグループが混
在しており、ＳＣＳＩ内Ａｄｄｒ２７がＤＡＤＲ５，６
はＲＡＩＤのレベル５において３Ｄ＋１Ｐのパリティグ
ループレベルのパリティグループと二重化のパリティグ
ループが混在している。FIG. 15 shows an example of the arrangement of parity groups in the logical group 10 of this embodiment. In the logical group 10, the area whose address in the SCSI drive (Addr in SCSI) 27 is DADR1 holds a parity group of the 5D + 1P parity group level at RAID level 3; DADR2 and 3 hold parity groups of the 5D + 1P parity group level at RAID level 5; DADR4 holds a mixture of a parity group of the 3D + 1P parity group level at RAID level 3 and a duplexed parity group; and DADR5 and 6 hold a mixture of parity groups of the 3D + 1P parity group level at RAID level 5 and duplexed parity groups.

【０１９９】図１６は、この時のアドレステーブルを示
す。アドレステーブルは、ほぼ図４と同様であるが、本
実施例では異なるＲＡＩＤのレベルが混在するので、Ｒ
ＡＩＤレベルのフィールド２９が設けられている。ま
た、ＲＡＩＤレベル３の場合に用いるためのサブデータ
名のフィールド２８が設けられている。FIG. 16 shows the address table for this arrangement. It is almost the same as the table of FIG. 4, but because different RAID levels coexist in this embodiment, a RAID level field 29 is provided. A sub-data name field 28 is also provided for use with RAID level 3.
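One way to picture an address table entry with the added fields is the record below. The field names follow the reference numerals 21 to 29 in the explanation of symbols; the types, the use of `None`, and the `lookup` helper are assumptions for illustration and do not reproduce the actual layout of FIG. 16.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AddressTableEntry:
    parity_group_level: str        # field 21, e.g. "5D+1P"
    data_name: str                 # field 22
    cache_address: Optional[int]   # field 23, None when the data is not in cache
    data_drive_no: int             # field 24 (DDrive No.)
    failure_flag: bool             # field 25
    parity_drive_no: int           # field 26 (PDrive No.)
    scsi_addr: int                 # field 27 (address in SCSI drive, DADR)
    subdata_name: Optional[str]    # field 28, used only for RAID level 3
    raid_level: int                # field 29


def lookup(table, data_name):
    """Address translation sketch: return every entry recorded for a data name."""
    return [e for e in table if e.data_name == data_name]
```

For RAID level 3 data a lookup returns one entry per piece of sub-data, so the controller knows which drives to access in parallel; for RAID level 5 data it returns a single entry.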

【０２００】本実施例において、上記のようにＲＡＩＤ
のレベルおよびパリティグループレベルを設定する方法
は、実施例１および２と同様に、図１６に示すようなア
ドレステーブルに対しユーザが初期設定において行う。
また、アドレス変換も実施例１および２と同様に、図１
６に示すアドレステーブルを用いて行う。In this embodiment, as in Embodiments 1 and 2, the RAID levels and parity group levels described above are set by the user at initial setup in an address table such as the one shown in FIG. 16. Address translation is also performed using the address table shown in FIG. 16, as in Embodiments 1 and 2.

【０２０１】ＤＡＤＲ１のＲＡＩＤのレベル３における
５Ｄ＋１Ｐのパリティグループレベルのパリティグルー
プおよびＤＡＤＲ４の３Ｄ＋１Ｐのパリティグループレ
ベルのパリティグループにおけるデータの書き込み方法
（パリティの作成方法）と読み出し方法は、正常時、障
害時共に、実施例２と同様である。For the parity group of the 5D + 1P parity group level at RAID level 3 in DADR1 and the parity group of the 3D + 1P parity group level in DADR4, the data writing method (parity creation method) and reading method are the same as in Embodiment 2, both in normal operation and during a failure.

【０２０２】一方、ＤＡＤＲ２，３のＲＡＩＤのレベル
５における５Ｄ＋１Ｐのパリティグループレベルのパリ
ティグループおよびＤＡＤＲ５，６の３Ｄ＋１Ｐのパリ
ティグループレベルのパリティグループにおけるデータ
の書き込み方法（パリティの作成方法）と読み出し方法
は、正常時、障害時共に、実施例１と同様である。For the parity groups of the 5D + 1P parity group level at RAID level 5 in DADR2 and 3 and the parity groups of the 3D + 1P parity group level in DADR5 and 6, the data writing method (parity creation method) and reading method are the same as in Embodiment 1, both in normal operation and during a failure.

【０２０３】ＤＡＤＲ２，５，６の二重化のパリティグ
ループレベルのパリティグループにおけるデータの書き
込み方法（パリティの作成方法）と読み出し方法は、正
常時、障害時共に、実施例１と同様である。For the duplexed parity groups in DADR2, 5, and 6, the data writing method (parity creation method) and reading method are the same as in Embodiment 1, both in normal operation and during a failure.

【０２０４】本実施例において、論理グループ１０を構
成する任意のドライブ１２に障害が発生した場合の障害
回復処理は、ＲＡＩＤのレベル３では実施例２、ＲＡＩ
Ｄのレベル５では実施例１と同様となり、両者共パリテ
ィグループレベルの低い（２台目の障害が発生した場合
にデータ消失となる確率（データ消失確率）の高い）パ
リティグループから順次回復処理を行なう。これによ
り、本実施例でも、実施例１、２と同様に、信頼性と障
害時の性能を最小限に抑さえ、回復処理を集中してでは
なく、時間を分散して行うことが可能となる。In this embodiment, when a failure occurs in any drive 12 of the logical group 10, failure recovery is performed as in Embodiment 2 for RAID level 3 and as in Embodiment 1 for RAID level 5. In both cases, recovery proceeds sequentially starting from the parity groups with the lowest parity group level, that is, those with the highest probability of data loss if a second drive fails (the data loss probability). As in Embodiments 1 and 2, this keeps the loss of reliability and of performance during a failure to a minimum and allows recovery processing to be spread over time rather than performed all at once.

【０２０５】本実施例では、一つの論理グループ内にお
いて、複数のＲＡＩＤのレベルと各ＲＡＩＤのレベルに
おいて、複数のパリティグループレベルが存在する。ユ
ーザは、自分のデータ量が大きく、しかもＣＰＵ１とデ
ィスクアレイ間で高速転送を行い、データの書き込みお
よび読み出し時間を短縮したい場合は、ＲＡＩＤのレベ
ル３の領域に書き込むように指示する。この時、データ
量および要求する転送速度、信頼性によりパリティグル
ープレベルを５Ｄ＋１Ｐにするか３Ｄ＋１Ｐにするかを
選択する。即ち、データ量が大きく、高速転送を必要と
する場合はパリティグループレベルを５Ｄ＋１Ｐを指定
し、比較的信頼性を要求する場合はパリティグループレ
ベルを３Ｄ＋１Ｐにする。In this embodiment, a single logical group contains multiple RAID levels and, within each RAID level, multiple parity group levels. A user who has a large amount of data and wants high-speed transfer between the CPU 1 and the disk array, so as to shorten data write and read times, specifies that the data be written to the RAID level 3 area. The parity group level, 5D + 1P or 3D + 1P, is then chosen according to the amount of data, the required transfer rate, and the required reliability: 5D + 1P is specified when the amount of data is large and high-speed transfer is needed, and 3D + 1P when relatively high reliability is required.

【０２０６】また、ユーザは、自分のデータ量が小さい
ため、ＲＡＩＤのレベル３で高速転送を行っても効果が
無い場合は、ＲＡＩＤのレベル５の領域に書き込むよう
に指示する。さらに、ユーザは、自分のデータ量が小さ
く、頻繁に書き込み要求が発生し、しかも、重要なデー
タである場合は、二重化の領域に書き込むように指示す
る。A user whose data is too small to benefit from high-speed transfer at RAID level 3 specifies that the data be written to the RAID level 5 area. A user whose data is small, is written frequently, and is also important specifies that the data be written to the duplexed area.

【０２０７】このように、ユーザからＣＰＵ１を介し指
示された要求は、ＡＤＣ２のＭＰ２０では、アドレステ
ーブルを調べ、要求された書き込み領域に対しＲＡＩＤ
のレベル３、レベル５、二重化の空き領域または更新す
る領域を認識し、その領域に書き込むように制御する。When the user issues such a request via the CPU 1, the MP 20 of the ADC 2 consults the address table, identifies a free area or the area to be updated within the RAID level 3, RAID level 5, or duplexed region requested for the write, and controls the write to that area.
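The selection guidance of the preceding paragraphs, followed by the controller's search for a free area, can be sketched as below. Everything about the representation is an assumption for illustration: the table rows, the convention that a row with no data name is free, and the function names are hypothetical. Treating duplexing as RAID level 1 follows the description of this embodiment.

```python
# Hypothetical address-table rows; a row whose data_name is None is treated as free.
TABLE = [
    {"dadr": 1, "raid_level": 3, "parity_group_level": "5D+1P", "data_name": "Data#1"},
    {"dadr": 2, "raid_level": 5, "parity_group_level": "5D+1P", "data_name": None},
    {"dadr": 4, "raid_level": 3, "parity_group_level": "3D+1P", "data_name": None},
    {"dadr": 4, "raid_level": 1, "parity_group_level": "duplexed", "data_name": None},
]


def choose_region(large, frequent_writes, important, need_reliability=False):
    """Map the user's situation to (RAID level, parity group level or None for any)."""
    if large:
        # Large data that needs fast transfer goes to RAID level 3.
        return 3, ("3D+1P" if need_reliability else "5D+1P")
    if frequent_writes and important:
        return 1, "duplexed"
    return 5, None  # small data: RAID level 5 area, level left open


def find_free_area(table, raid_level, parity_group_level=None):
    """Return the first free row in the requested region, or None if it is full."""
    for row in table:
        if row["data_name"] is not None or row["raid_level"] != raid_level:
            continue
        if parity_group_level is None or row["parity_group_level"] == parity_group_level:
            return row
    return None
```

In this sketch the user supplies only the characteristics of the data; the lookup against the table stands in for the role the text assigns to the MP 20.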

【０２０８】以上述べたように、本実施例と実施例１、
２とでは、論理グループ１０を構成するパリティグルー
プにおいて、ＲＡＩＤのレベル５とレベル３と二重化
（レベル１）を混在させたのみで、実施例１、２の効果
を得ることができる。As described above, this embodiment differs from Embodiments 1 and 2 only in that RAID level 5, RAID level 3, and duplexing (level 1) are mixed among the parity groups that make up the logical group 10, and it provides the same effects as Embodiments 1 and 2.

【０２０９】[0209]

【発明の効果】従来のディスクアレイでは、論理グルー
プを構成する全てのパリティグループレベルは、図３に
示すように、全て５Ｄ＋１Ｐのように統一されていた。
このため、データ消失確率が均一となり、信頼性の確保
から障害回復のために長時間停止しなければならなかっ
た。もし、従来のような単一のパリティグループレベル
で論理グループを構成し、障害が発生したドライブの回
復処理を、上述の実施例のように分割して行った場合、
回復処理が全て完了するまでデータ消失確率および障害
時の性能は変わらず低いままであり、回復処理が完了す
るまでの時間が長ければ長いほどデータ消失する確率が
高い。In a conventional disk array, all the parity groups that make up a logical group had one uniform parity group level, such as 5D + 1P, as shown in FIG. 3. The data loss probability was therefore uniform, and to secure reliability the array had to be stopped for a long time for failure recovery. If a logical group were built from a single parity group level, as in the conventional case, and the recovery of the failed drive were divided up as in the embodiments above, the data loss probability and the performance during the failure would remain at their degraded levels until all recovery processing had finished, and the longer recovery took to complete, the higher the probability of data loss would be.

【０２１０】しかし、本発明では、信頼性および性能の
異なる複数のパリティグループレベルのパリティグルー
プで論理グループを構成し、障害が発生したドライブの
データの回復処理を、パリティグループレベルの低い順
（データ消失確率および障害時の性能の低い順）に順次
行なう。これにより、パリティグループレベル単位で回
復処理を行う毎に、信頼性が向上するため、回復処理を
完了する時間を長くしてもデータ消失する確率は低い。
しかも、パリティグループレベル単位で回復処理を行う
毎に、障害時の性能が向上していく。In the present invention, by contrast, a logical group is made up of parity groups of several parity group levels that differ in reliability and performance, and the data of the failed drive is recovered sequentially in ascending order of parity group level (that is, starting with the level that has the lowest reliability and the lowest performance during a failure). Reliability therefore improves each time the recovery of one parity group level completes, so the probability of data loss stays low even if the time taken to complete recovery is extended. Performance during the failure likewise improves each time the recovery of one parity group level completes.
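The benefit of recovering the riskiest level first can be illustrated with a deliberately simple model. The model is an assumption, not part of the specification: each stage is taken to last one time unit, drives are assumed to fail independently at equal rates, and the exposure of a not-yet-rebuilt group is counted as the number of other drives whose failure would lose its data (n for nD + 1P, 1 for a duplexed pair).

```python
# Number of other drives whose failure would lose data in a group awaiting rebuild.
EXPOSED_DRIVES = {"5D+1P": 5, "3D+1P": 3, "duplexed": 1}


def cumulative_exposure(order, stage_time=1.0):
    """Sum, over all stages, of the exposed-drive count of groups not yet rebuilt."""
    total = 0.0
    pending = list(order)
    while pending:
        # During a stage, the group being rebuilt and all later groups are still exposed.
        total += stage_time * sum(EXPOSED_DRIVES[lv] for lv in pending)
        pending.pop(0)
    return total


risk_first = cumulative_exposure(["5D+1P", "3D+1P", "duplexed"])
risk_last = cumulative_exposure(["duplexed", "3D+1P", "5D+1P"])
```

Under these assumptions the riskiest-first order accumulates an exposure of 14 drive-stages against 22 for the reverse order, which matches the qualitative claim that reliability improves fastest when the lowest parity group level is recovered first.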

【０２１１】以上のことから、本発明によれば、回復処
理を行う時間を分散することが可能となり、しかも、効
率良く回復処理を行なうことが可能となる。From the above, according to the present invention, it is possible to disperse the time for performing the recovery process, and it is possible to perform the recovery process efficiently.

【図面の簡単な説明】[Brief description of drawings]

【図１】第１の実施例の全体構成図FIG. 1 is an overall configuration diagram of a first embodiment.

【図２】第１の実施例の論理グループ内のデータ配置図FIG. 2 is a data layout diagram in a logical group according to the first embodiment.

【図３】従来のディスクアレイの論理グループ内のデー
タ配置図FIG. 3 is a data layout diagram in a logical group of a conventional disk array

【図４】第１の実施例のアドレステーブル説明図FIG. 4 is an explanatory diagram of an address table according to the first embodiment.

【図５】パリティ作成説明図FIG. 5 is an explanatory diagram of parity creation.

【図６】新規データの書き込み処理フローチャート図FIG. 6 is a flowchart of writing processing of new data.

【図７】更新の書き込み処理フローチャート図FIG. 7 is a flowchart of update writing processing.

【図８】障害時の更新の書き込み処理フローチャート図FIG. 8 is a flowchart of update writing processing at the time of failure.

【図９】読み出し処理フローチャート図FIG. 9 is a flowchart of a reading process.

【図１０】障害時の読み出し処理フローチャート図FIG. 10 is a flowchart of read processing at the time of failure.

【図１１】障害回復処理フローチャート図FIG. 11 is a flowchart of failure recovery processing.

【図１２】従来方法と本発明の障害回復処理の説明図FIG. 12 is an explanatory diagram of a conventional method and failure recovery processing of the present invention.

【図１３】第２の実施例の論理グループ内のデータ配置
図FIG. 13 is a data layout diagram in a logical group according to the second embodiment.

【図１４】第２の実施例のアドレステーブル説明図FIG. 14 is an explanatory diagram of an address table according to the second embodiment.

【図１５】第３の実施例の論理グループ内のデータ配置
図FIG. 15 is a data layout diagram in the logical group according to the third embodiment.

【図１６】第３の実施例のアドレステーブル説明図FIG. 16 is an explanatory diagram of an address table according to the third embodiment.

【図１７】ドライブ内部説明図FIG. 17 is an explanatory diagram of the inside of a drive.

【図１８】パリティグループの割り当て説明図FIG. 18 is an explanatory diagram of allocation of parity groups.

【符号の説明】[Explanation of symbols]

１：ＣＰＵ、２：アレイディスクコントローラ（ＡＤ
Ｃ）、３：チャネルパスディレクタ、４：外部インター
フェースパス、５：インターフェースアダプタ、６：チ
ャネルパススイッチ、７：クラスタ、８：チャネルパ
ス、９：ドライブパス、１０：論理グループ、１２：ド
ライブ、１３：チャネルインターフェース（ＣＨＩ
Ｆ）回路タ、１４：データ制御回路（ＤＣＣ）、１５：
チャネル側キャッシュアダプタ（ＣＡｄｐ）、１６：
キャッシュメモリ、１７：ドライブ側キャッシュアダプ
タ（ＣＡｄｐ）、１８：ドライブインターフェース回
路（ＤｒｉｖｅＩＦ）、１９：パリティ生成回路（Ｐ
Ｇ）、２０：マイクロプロセッサ（ＭＰ）、２１：パリ
ティグループレベル、２２：データ名、２３：キャッシ
ュアドレス、２４：データドライブ番号（ＤＤｒｉｖｅ
Ｎｏ．）、２５：障害フラグ、２６：パリティドライ
ブ番号（ＰＤｒｉｖｅＮｏ．）、２７：ＳＣＳＩドラ
イブ内アドレス（ＳＣＳＩ内Ａｄｄｒ）、２８：サブデ
ータ名、２９：ＲＡＩＤレベル。1: CPU, 2: array disk controller (ADC), 3: channel path director, 4: external interface path, 5: interface adapter, 6: channel path switch, 7: cluster, 8: channel path, 9: drive path, 10: logical group, 12: drive, 13: channel interface (CH IF) circuit, 14: data control circuit (DCC), 15: channel-side cache adapter (C Adp), 16: cache memory, 17: drive-side cache adapter (C Adp), 18: drive interface circuit (Drive IF), 19: parity generation circuit (PG), 20: microprocessor (MP), 21: parity group level, 22: data name, 23: cache address, 24: data drive number (DDrive No.), 25: failure flag, 26: parity drive number (PDrive No.), 27: address in SCSI drive (Addr in SCSI), 28: sub-data name, 29: RAID level.

Claims

【特許請求の範囲】[Claims]

【請求項１】論理グループを構成する複数台のドライブ
を含むディスク装置と、該ディスク装置を管理する制御
装置とを備えたディスクアレイ装置において、ｉ（ｉはｉ≧１の整数）個のデータと該データから作成
したｊ（ｊはｊ≧１の整数）個のエラー訂正用データと
から構成される第１のパリティグループを格納するため
の領域と、上記第１のパリティグループとは異なる構成
の第２のパリティグループを格納するための領域とが、
前記論理グループ中に混在していることを特徴とするデ
ィスクアレイ装置。1. A disk array device comprising a disk device including a plurality of drives forming a logical group, and a control device for managing the disk device, wherein an area for storing a first parity group composed of i pieces of data (i being an integer, i ≥ 1) and j pieces of error correction data (j being an integer, j ≥ 1) created from the data, and an area for storing a second parity group whose configuration differs from that of the first parity group, are mixed within the logical group.

【請求項２】論理グループを構成するｎ台のドライブを
含むディスク装置と、該ディスク装置を管理する制御装
置とを備えたディスクアレイ装置において、前記論理グループ内の各ドライブをパーティションで区
切り、ｎ台以下のｍ台の任意のドライブの任意のパーテ
ィションを任意の数選択し、該選択したパーティション
によりパーティショングループを設定し、前記論理グループは、互いにｍが異なる複数のパーティ
ショングループにより構成されることを特徴とするディ
スクアレイ装置。2. A disk array device comprising a disk device including n drives forming a logical group, and a control device for managing the disk device, wherein each drive in the logical group is divided into partitions, any number of arbitrary partitions are selected from an arbitrary m of the drives (m being n or less), a partition group is set from the selected partitions, and the logical group is composed of a plurality of partition groups that differ from one another in m.

【請求項３】上位装置からのデ−タの入出力要求に対す
る、当該デ−タを格納してある、または格納するディス
ク装置と、該ディスク装置を管理する制御装置とからな
るディスクアレイ装置において、前記ディスク装置を多数のドライブにより構成し、これ
らのドライブを２台以上のｎ台のドライブの論理グルー
プにグループ分けし、各論理グループ内の各ドライブを
パーティションで区切るとともに、ｎ台以下のｍ台の任意のドライブの任意のパーティショ
ンを任意の数選択し、該選択したパーティションにより
パーティショングループを設定し、前記論理グループは、互いにｍが異なる複数のパーティ
ショングループにより構成されることを特徴とするディ
スクアレイ装置。3. A disk array device comprising a disk device that stores, or is to store, data subject to data input/output requests from a host device, and a control device for managing the disk device, wherein the disk device is composed of a large number of drives, the drives are grouped into logical groups of n drives each (n being 2 or more), each drive in each logical group is divided into partitions, any number of arbitrary partitions are selected from an arbitrary m of the drives (m being n or less), a partition group is set from the selected partitions, and the logical group is composed of a plurality of partition groups that differ from one another in m.

【請求項４】前記論理グループ内に前記パーティション
グループを設定する際、選択されたパーティションの各
ドライブ内でのアドレスを同一とすることを特徴とする
請求項２または３に記載のディスクアレイ装置。4. The disk array device according to claim 2 or 3, wherein, when the partition group is set in the logical group, the selected partitions have the same address within each drive.

【請求項５】前記論理グループ内に前記パーティション
グループを設定する際、ユーザの指定に応じた種類およ
び大きさのパーティショングループが設定できることを
特徴とする請求項２または３に記載のディスクアレイ装
置。5. The disk array device according to claim 2 or 3, wherein, when the partition group is set in the logical group, a partition group of the type and size specified by the user can be set.

【請求項６】前記論理グループ内に前記パーティション
グループを設定する際、各ドライブから選択されるパー
ティションをシリンダの集合とすることを特徴とする請
求項２または３に記載のディスクアレイ装置。6. The disk array device according to claim 2 or 3, wherein, when the partition group is set in the logical group, the partition selected from each drive is a set of cylinders.

【請求項７】前記論理グループ内に前記パーティション
グループを設定する際、各ドライブから選択されるパー
ティションをトラックまたはレコード単位とすることを
特徴とする請求項２または３に記載のディスクアレイ装
置。7. The disk array device according to claim 2 or 3, wherein, when the partition group is set in the logical group, the partition selected from each drive is in units of tracks or records.

【請求項８】前記論理グループ内に設定されたパーティ
ショングループは、前記制御装置内のテーブルにより管
理されることを特徴とする請求項２または３に記載のデ
ィスクアレイ装置。8. The disk array device according to claim 2 or 3, wherein the partition groups set in the logical group are managed by a table in the control device.

【請求項９】前記論理グループ内に設定されたパーティ
ショングループ毎に、ＲＡＩＤを設定することを特徴と
する請求項２または３に記載のディスクアレイ装置。9. The disk array device according to claim 2 or 3, wherein RAID is set for each partition group set in the logical group.

【請求項１０】前記論理グループ内に設定された少なく
とも２個のパーティショングループを互いに異なるＲＡ
ＩＤのレベルで構成し、各パーティショングループに
は、複数のデータと該データから作成した１つ以上のエ
ラー訂正用データとからなるパリティグループのデータ
を格納することを特徴とする請求項９に記載のディスク
アレイ装置。10. The disk array device according to claim 9, wherein at least two partition groups set in the logical group are configured with mutually different RAID levels, and each partition group stores the data of a parity group consisting of a plurality of data and one or more pieces of error correction data created from the data.

【請求項１１】前記論理グループ内に設定された少なく
とも２個のパーティショングループを、前記エラー訂正
用データの数が異なる同一のＲＡＩＤのレベルで構成す
ることを特徴とする請求項９に記載のディスクアレイ装
置。11. The disk array device according to claim 9, wherein at least two partition groups set in the logical group are configured with the same RAID level but with different numbers of pieces of the error correction data.

【請求項１２】前記論理グループ内に設定された少なく
とも２個のパーティショングループを、性能の異なる同
一のＲＡＩＤのレベルで構成することを特徴とする請求
項９に記載のディスクアレイ装置。12. The disk array device according to claim 9, wherein at least two partition groups set in the logical group are configured with the same RAID level having different performances.

【請求項１３】前記論理グループ内に２個以上のパーテ
ィショングループを設定する際、各パーティショングル
ープに対して設定された異なるＲＡＩＤのレベルをパリ
ティグループレベルとして分類し、各パリティグループ
レベルに対応したテーブルにより各パーティショングル
ープを管理することを特徴とする請求項９に記載のディ
スクアレイ装置。13. The disk array device according to claim 9, wherein, when two or more partition groups are set in the logical group, the different RAID levels set for the partition groups are classified as parity group levels, and each partition group is managed by a table corresponding to each parity group level.

【請求項１４】前記論理グループ内の任意のドライブに
障害が発生し、この障害が発生したドライブ内のデータ
を、論理グループ内の正常なドライブに格納されている
データおよびエラー訂正用データから回復する障害回復
処理の際、前記パリティグループレベル毎に障害回復処
理を行うことを特徴とする請求項１３に記載のディスク
アレイ装置。14. The disk array device according to claim 13, wherein, in failure recovery processing in which a failure has occurred in any drive in the logical group and the data in the failed drive is recovered from the data and error correction data stored in the normal drives in the logical group, the failure recovery processing is performed for each parity group level.

【請求項１５】前記パリティグループレベル毎に障害回
復処理を行う際に、あるパリティグループの障害回復処
理が完了した後、ある時間通常の読み出しおよび書き込
み処理を行い、その後、次のパリティグループの障害回
復処理を行うことを繰り返して、障害が発生したドライ
ブ内のデータを回復する障害回復処理を行うことを特徴
とする請求項１４に記載のディスクアレイ装置。15. The disk array device according to claim 14, wherein, when failure recovery processing is performed for each parity group level, the data in the failed drive is recovered by repeating the following: after the failure recovery processing of one parity group is completed, normal read and write processing is performed for a certain time, and then the failure recovery processing of the next parity group is performed.

【請求項１６】前記論理グループ内の任意のドライブに
障害が発生し、この障害が発生したドライブ内のデータ
を、論理グループ内の正常なドライブに格納されている
データおよびエラー訂正用データから回復する障害回復
処理の際、信頼性およびデータ消失確率の低いパリティ
グループレベルから順次障害回復処理を行うことを特徴
とする請求項１４または１５に記載のディスクアレイ装
置。16. The disk array device according to claim 14 or 15, wherein, in failure recovery processing in which a failure has occurred in any drive in the logical group and the data in the failed drive is recovered from the data and error correction data stored in the normal drives in the logical group, the failure recovery processing is performed sequentially starting from the parity group level with low reliability and low data loss probability.

【請求項１７】前記論理グループ内の任意のドライブに
障害が発生し、この障害が発生したドライブ内のデータ
を、論理グループ内の正常なドライブに格納されている
データおよびエラー訂正用データから回復する障害回復
処理の際、信頼性およびデータ消失確率の高いパリティ
グループレベルから順次障害回復処理を行うことを特徴
とする請求項１４または１５に記載のディスクアレイ装
置。17. The disk array device according to claim 14 or 15, wherein, in failure recovery processing in which a failure has occurred in any drive in the logical group and the data in the failed drive is recovered from the data and error correction data stored in the normal drives in the logical group, the failure recovery processing is performed sequentially starting from the parity group level with high reliability and high data loss probability.

【請求項１８】高信頼性が要求されるデータを、信頼性
およびデータ消失確率の高いＲＡＩＤのレベルが設定さ
れたパリティグループに格納することを特徴とする請求
項１０または１７に記載のディスクアレイ装置。18. The disk array device according to claim 10 or 17, wherein data requiring high reliability is stored in a parity group for which a RAID level with high reliability and high data loss probability is set.

【請求項１９】前記論理グループ内の任意のドライブに
障害が発生し、この障害が発生したドライブ内のデータ
を、論理グループ内の正常なドライブに格納されている
データおよびエラー訂正用データから回復する障害回復
処理の際、障害時の性能低下が大きいパリティグループ
レベルから障害回復処理を行うことを特徴とする請求項
１４または１５に記載のディスクアレイ装置。19. The disk array device according to claim 14 or 15, wherein, when any drive in the logical group fails and the data in the failed drive is recovered from the data and error correction data stored in the normal drives of the logical group, failure recovery processing is performed starting from the parity group level whose performance degrades most at the time of failure.

【請求項２０】前記論理グループ内の任意のドライブに
障害が発生し、この障害が発生したドライブ内のデータ
を、論理グループ内の正常なドライブに格納されている
データおよびエラー訂正用データから回復する障害回復
処理の際、障害時の性能低下が小さいパリティグループ
レベルから障害回復処理を行うことを特徴とする請求項
１４または１５に記載のディスクアレイ装置。20. The disk array device according to claim 14 or 15, wherein, when any drive in the logical group fails and the data in the failed drive is recovered from the data and error correction data stored in the normal drives of the logical group, failure recovery processing is performed starting from the parity group level whose performance degrades least at the time of failure.
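
The recovery ordering described in claims 15 to 20 can be illustrated with a minimal sketch. It is not the patented implementation: the names ParityGroup, loss_probability, degraded_slowdown, rebuild_one and serve_normal_io are hypothetical, and the numeric values a caller supplies are illustrative only. The sort key and direction are parameters because the claims cover ordering by data loss probability or by performance degradation, in either direction.

```python
from dataclasses import dataclass


@dataclass
class ParityGroup:
    name: str
    loss_probability: float   # hypothetical: chance of data loss if a second drive fails
    degraded_slowdown: float  # hypothetical: relative performance drop while degraded


def recovery_order(groups, key, descending):
    """Return the parity groups in the order in which they would be rebuilt."""
    return sorted(groups, key=key, reverse=descending)


def rebuild(groups, key, descending, rebuild_one, serve_normal_io):
    """Rebuild one parity group at a time, with normal I/O served in between.

    rebuild_one(group) and serve_normal_io() are caller-supplied callbacks
    (assumptions of this sketch); the pause between groups mirrors claim 15.
    """
    ordered = recovery_order(groups, key, descending)
    for position, group in enumerate(ordered):
        rebuild_one(group)
        if position < len(ordered) - 1:
            serve_normal_io()  # normal read/write processing for some period
    return [group.name for group in ordered]
```

Passing key=lambda g: g.loss_probability orders by data loss probability, and key=lambda g: g.degraded_slowdown orders by performance degradation at the time of failure; descending selects which end of the ordering is rebuilt first.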

【請求項２１】頻繁に読み出しまたは書き込み要求が上
位装置から発行されるデータを、障害時の性能低下が大
きいＲＡＩＤのレベルが設定されたパリティグループに
格納し、障害発生時にはすぐに障害回復処理を行い正常
状態に復帰させることを特徴とする請求項１０または１
９に記載のディスクアレイ装置。21. The disk array device according to claim 10 or 19, wherein data for which read or write requests are frequently issued by the host device is stored in a parity group in which a RAID level with large performance degradation at the time of failure is set, and, when a failure occurs, failure recovery processing is performed immediately to restore the normal state.

【請求項２２】頻繁に読み出しまたは書き込み要求が上
位装置から発行されるデータを、障害時の性能低下が小
さいＲＡＩＤのレベルが設定されたパリティグループに
格納し、障害発生時にはすぐに障害回復処理を行い正常
状態に復帰させることを特徴とする請求項１０または２
０に記載のディスクアレイ装置。22. The disk array device according to claim 10 or 20, wherein data for which read or write requests are frequently issued by the host device is stored in a parity group in which a RAID level with small performance degradation at the time of failure is set, and, when a failure occurs, failure recovery processing is performed immediately to restore the normal state.

【請求項２３】前記論理グループ内のデータのバックア
ップを行う際、前記パリティグループレベル毎にバック
アップ処理を行うことを特徴とする請求項１３に記載の
ディスクアレイ装置。23. The disk array device according to claim 13, wherein when the data in the logical group is backed up, a backup process is performed for each parity group level.

【請求項２４】前記論理グループ内のデータに対してパ
リティグループレベル毎にバックアップを行う際、ある
パリティグループのバックアップ処理が完了した後、あ
る時間通常の読み出しおよび書き込み処理を行い、その
後、次のパリティグループのバックアップ処理を行うこ
とを繰り返して、論理グループを構成する全ドライブ内
のデータのバックアップを行うことを特徴とする請求項
２３に記載のディスクアレイ装置。24. The disk array device according to claim 23, wherein, when the data in the logical group is backed up for each parity group level, the data in all the drives forming the logical group is backed up by repeating a cycle in which, after backup processing of one parity group is completed, normal read and write processing is performed for a certain period of time and backup processing of the next parity group is then performed.

【請求項２５】前記論理グループ内のデータに対してパ
リティグループレベル毎にバックアップを行う際、信頼
性およびデータ消失確率の低いパリティグループレベル
から順次バックアップ処理を行うことを特徴とする請求
項２３または２４に記載のディスクアレイ装置。25. The disk array device according to claim 23 or 24, wherein, when the data in the logical group is backed up for each parity group level, backup processing is performed sequentially starting from the parity group level with low reliability and low data loss probability.

【請求項２６】前記論理グループ内のデータに対してパ
リティグループレベル毎にバックアップを行う際、信頼
性およびデータ消失確率の高いパリティグループレベル
から順次バックアップ処理を行うことを特徴とする請求
項２３または２４に記載のディスクアレイ装置。26. The disk array device according to claim 23 or 24, wherein, when the data in the logical group is backed up for each parity group level, backup processing is performed sequentially starting from the parity group level with high reliability and high data loss probability.

【請求項２７】上位装置からの読み出しまたは書き込み
要求数が、ユーザが予め設定した値より小さくなった
ら、当該パリティグループの障害回復処理を開始するこ
とを特徴とする請求項１４または１５に記載のディスク
アレイ装置。27. The disk array device according to claim 14 or 15, wherein failure recovery processing of the parity group is started when the number of read or write requests from the host device falls below a value preset by the user.

【請求項２８】上位装置からの読み出しまたは書き込み
要求数が、ユーザが予め設定した値より小さくなった
ら、当該パリティグループのバックアップ処理を開始す
ることを特徴とする請求項２３または２４に記載のディ
スクアレイ装置。28. The disk array device according to claim 23 or 24, wherein backup processing of the parity group is started when the number of read or write requests from the host device falls below a value preset by the user.

【請求項２９】前記論理グループ内のデータに対してパ
リティグループレベル毎にバックアップを行う際、デー
タ消失確率の高いパリティグループほど頻繁にバックア
ップ処理を行うことを特徴とする請求項２３または２４
に記載のディスクアレイ装置。29. The disk array device according to claim 23 or 24, wherein, when the data in the logical group is backed up for each parity group level, backup processing is performed more frequently for parity groups with a higher data loss probability.
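
The start condition of claims 27 and 28 and the frequency rule of claim 29 can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the function names are hypothetical, and the proportional scaling of the backup interval is one possible rule chosen for this sketch, since the claim only requires that groups with a higher data loss probability be backed up more often.

```python
def may_start_background_work(pending_requests, threshold):
    """True when the host request count has fallen below the user-set threshold."""
    return pending_requests < threshold


def backup_intervals(loss_probability, base_interval_hours):
    """Map each parity group name to a backup interval in hours.

    Assumption of this sketch: the group with the lowest (non-zero) loss
    probability uses the base interval, and every other group is backed up
    proportionally more often.
    """
    lowest = min(loss_probability.values())
    return {
        name: base_interval_hours * (lowest / p)
        for name, p in loss_probability.items()
    }
```

With this rule a group whose loss probability is five times the lowest one is backed up five times as often; any monotonic rule would satisfy the wording of claim 29 equally well.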

【請求項３０】論理グループを構成する複数台のドライ
ブを含むディスク装置と、該ディスク装置を管理する制
御装置とを備えたディスクアレイ装置における前記ディ
スク装置の領域の区分け方法であって、ｉ（ｉはｉ≧１の整数）個のデータと該データから作成
したｊ（ｊはｊ≧１の整数）個のエラー訂正用データと
から構成される第１のパリティグループを格納するため
の領域と、上記第１のパリティグループとは異なる構成
の第２のパリティグループを格納するための領域とが、
前記論理グループ中に混在するように、前記ディスク装
置の領域を区分けすることを特徴とするディスクアレイ
の区分け方法。30. A method of partitioning the area of a disk device in a disk array device comprising a disk device including a plurality of drives forming a logical group and a control device for managing the disk device, wherein the area of the disk device is partitioned so that an area for storing a first parity group, composed of i data items (i is an integer, i ≥ 1) and j error correction data items (j is an integer, j ≥ 1) created from that data, and an area for storing a second parity group having a configuration different from that of the first parity group are mixed within the logical group.

【請求項３１】論理グループを構成するｎ台のドライブ
を含むディスク装置と、該ディスク装置を管理する制御
装置とを備えたディスクアレイ装置における前記ディス
ク装置の領域の区分け方法であって、前記論理グループ内の各ドライブをパーティションで区
切り、ｎ台以下のｍ台の任意のドライブの任意のパーテ
ィションを任意の数選択し、該選択したパーティション
によりパーティショングループを設定し、前記論理グル
ープが互いにｍが異なる複数のパーティショングループ
により構成されるように区分けすることを特徴とするデ
ィスクアレイの区分け方法。31. A method of partitioning the area of a disk device in a disk array device comprising a disk device including n drives forming a logical group and a control device for managing the disk device, wherein each drive in the logical group is divided into partitions, an arbitrary number of arbitrary partitions are selected from arbitrary m drives (m being n or less), a partition group is set from the selected partitions, and the logical group is partitioned so as to be composed of a plurality of partition groups that differ from one another in m.
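
The partitioning of claims 30 and 31 can be pictured with a small data model. The sketch below is an assumption-laden illustration rather than the patented method: Partition, PartitionGroup and check_layout are hypothetical names, and the only checks performed are that each partition lies on one of the n drives and is used by at most one group. It returns the distinct values of m so that a caller can confirm that the groups differ in width.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    drive: int  # drive number within the logical group, 0 .. n-1
    index: int  # partition number on that drive


@dataclass
class PartitionGroup:
    name: str
    partitions: list

    @property
    def width(self):
        """m: the number of distinct drives this group spans."""
        return len({p.drive for p in self.partitions})


def check_layout(n_drives, groups):
    """Validate a layout and return the sorted distinct widths (values of m)."""
    seen = set()
    for group in groups:
        for p in group.partitions:
            if not 0 <= p.drive < n_drives:
                raise ValueError(f"{group.name}: drive {p.drive} is out of range")
            if p in seen:
                raise ValueError(f"{group.name}: partition {p} is already in use")
            seen.add(p)
    return sorted({group.width for group in groups})
```

For example, on six drives a 5D+1P group could take partition 0 of every drive, a 3D+1P group partition 1 of drives 0 to 3, and a mirrored pair partition 1 of drives 4 and 5, giving widths 6, 4 and 2 within one logical group.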