JP2007527572A5

Info

Publication number: JP2007527572A5
Application number: JP2006534090A
Authority: JP
Filing date: 2004-09-30
Publication date: 2007-12-13

Description

Emulated Storage System Supporting Instant Volume Recovery

Field of the Invention

The present invention relates to data storage. More particularly, it relates to an apparatus and method that emulate a tape storage system and use an existing full backup together with subsequent incremental backups to provide the equivalent of a full backup, so that end users can restore data from those backups.

Description of Related Art

Most computer systems include one or more host computers and one or more data storage systems that store data used by the host computers. The host computers and storage systems are typically networked together using a Fibre Channel network, an Ethernet network, or another form of communication network. Fibre Channel is a standard that combines the speed of channel-based transmission schemes with the flexibility of network-based transmission schemes, allowing multiple initiators to communicate with multiple targets over a network, where an initiator or a target may be any device coupled to the network. Because Fibre Channel is typically implemented over a fast transmission medium such as optical fiber cable, it is widely chosen for storage system networks that transfer large volumes of data.

FIG. 1 illustrates an example of a typical networked computing environment that includes various host computers and backup storage systems. One or more application servers 102 are coupled to a plurality of user computers 104 through a local area network (LAN) 103. Both the application servers 102 and the user computers 104 may be considered "host computers." The application servers 102 are coupled to one or more primary storage devices 106 through a storage area network (SAN) 108. The primary storage devices 106 may be, for example, disk arrays such as those available from EMC Corporation, IBM Corporation, and others. Alternatively, a bus (not shown) or another network link may provide the interconnection between the application servers and the primary storage devices 106. The bus and/or Fibre Channel network connection may operate using a protocol, such as the Small Computer System Interface (SCSI) protocol, that dictates the format of packets transferred between the host computers (e.g., the application servers 102) and the storage devices 106.

The networked computing environment illustrated in FIG. 1 is typical of a large system such as may be used by, for example, a large financial institution or a large corporation. Most networked computing environments need not include all of the elements illustrated in FIG. 1. For example, a smaller networked computing environment may simply include host computers connected to a storage system directly or through a LAN. In addition, although FIG. 1 shows the user computers 104, the application servers 102, and the media servers separately, these functions may be combined into one or more computers.

In addition to the primary storage devices 106, most networked computing environments include one or more secondary, or backup, storage systems 110. The backup storage system 110 is typically a tape library, although other large-capacity, reliable secondary storage systems may be used. Secondary storage systems are generally slower than the primary storage devices, but they include some form of removable media (e.g., tapes, magnetic disks, or optical disks) that can be removed and stored off-site.

In the illustrated example, the application servers 102 may communicate directly with the backup storage system 110 through, for example, an Ethernet or other communication link 112. Such a connection, however, is relatively slow and may consume resources such as processor time and network bandwidth. Therefore, a system such as the one illustrated may include one or more media servers 114 that provide a communication link, using for example Fibre Channel, between the SAN 108 and the backup storage system 110.

The media servers 114 may run software that includes a backup/restore application, which controls the transfer of data among host computers (such as the user computers 104, the media servers 114, and/or the application servers 102), the primary storage devices 106, and the backup storage system 110. Examples of backup/restore applications include products from Veritas, Legato, and others. For data protection, data from the various host computers and/or the primary storage devices in a networked computing environment may be periodically backed up to the backup storage system 110 using a known backup/restore application.

Of course, as noted above, most networked computing environments may be smaller than the exemplary environment illustrated in FIG. 1 and may include fewer components. It should therefore be appreciated that the media servers 114 may in practice be combined with the application servers 102 in a single host computer, and that the backup/restore application may run on any host computer that is coupled, directly or indirectly through a network, to the backup storage system 110.

One example of a typical backup storage system is a tape library that includes a number of tape cartridges, one or more tape drives, and a robotic mechanism that controls the loading and unloading of cartridges into and out of the tape drives. The backup/restore application instructs the robotic mechanism to locate a particular tape cartridge, for example tape number 0001, and to load it into a tape drive so that data can be written onto the tape. The backup/restore application also controls the format in which data is written onto the tape. Typically, the backup/restore application uses SCSI commands or other standardized commands to instruct the robotic mechanism and to control the tape drives, both to write data onto tape and to restore previously written data from tape.

Conventional tape library backup systems suffer from a number of problems, including speed, reliability, and fixed capacity. Most large enterprises need to back up terabytes of data every week. Yet even expensive high-end tape drives can typically read and write data only at about 30 to 40 megabytes per second (MB/s), which translates to about 50 gigabytes per hour (GB/hr). Backing up one or two terabytes of data to a tape backup system can therefore take at least 10 to 20 hours of continuous data transfer.

In addition, most tape manufacturers will not guarantee that data can be stored on, or restored from, a tape that has been dropped (which can happen relatively frequently in a typical tape library, since either a person or the robotic mechanism may drop a tape while carrying or loading it) or that has been exposed to non-ideal environmental conditions such as extreme temperature or humidity. Considerable care is therefore required to store tapes in a controlled environment. Moreover, tape libraries, with their complex construction (including the robotic mechanism), are expensive to maintain, and individual tape cartridges are relatively expensive and have a limited lifetime.

Summary of the Invention

Embodiments of the present invention alleviate or overcome some or all of the problems of conventional tape library systems and provide a backup storage system that is more reliable than a conventional tape library system.

In broad overview, embodiments of the present invention provide a random-access-based storage system that emulates a conventional tape backup storage system, such that a backup/restore application sees the same devices and media as it would with a physical tape library. The storage system of the invention uses software and hardware to emulate physical tape media and to replace them with one or more random-access disk arrays, translating tape-format, linear, sequential data into data suitable for storage on disk. In addition, applications implemented in hardware and/or software are provided for restoring data stored on the backup storage system.

According to various embodiments of the present invention, a mechanism is provided for converting sequential tape-formatted data into a format suitable for random-access I/O. In one embodiment, a mechanism is provided for mounting a converted representation of the tape-formatted data on a host computer as a Network File System (NFS) or Common Internet File System (CIFS) mounted volume.
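As a rough illustration of making sequential tape-formatted data randomly accessible, the sketch below keeps tape records in a byte store and maintains an offset index, so that any record can be read without passing over the records before it. The class `EmulatedTape`, its method names, and the use of an in-memory `io.BytesIO` in place of a disk array are assumptions made for illustration only; the source does not describe its conversion mechanism at this level of detail.

```python
import io


class EmulatedTape:
    """Hypothetical sketch: tape records kept in a random-access store plus an index."""

    def __init__(self):
        self._store = io.BytesIO()  # stands in for a file on a disk array
        self._index = []            # one (offset, length) pair per tape record

    def write_record(self, payload: bytes) -> int:
        """Append a record, as a tape drive would, and return its record number."""
        offset = self._store.seek(0, io.SEEK_END)
        self._store.write(payload)
        self._index.append((offset, len(payload)))
        return len(self._index) - 1

    def read_record(self, number: int) -> bytes:
        """Read any record directly by looking up its offset in the index."""
        offset, length = self._index[number]
        self._store.seek(offset)
        return self._store.read(length)


tape = EmulatedTape()
for block in (b"header", b"file-A-data", b"file-B-data"):
    tape.write_record(block)

print(tape.read_record(2))  # b'file-B-data'
```

Writes stay append-only, as on tape, while reads become index lookups; a mounted-volume view such as the NFS or CIFS volume mentioned above would need a further mapping from file names to records, which this sketch omits.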

According to another embodiment of the present invention, a mechanism is provided for redirecting writes made to the mounted file system to safe storage, so that the original data remains unaltered. In one embodiment, a mechanism is provided for tracking real-time changes to the original data so as to allow random-access I/O. In another embodiment, a mechanism is provided for converting newly written data back into tape-formatted data suitable for sequential, tape-specific I/O.
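One common way to redirect writes while leaving the original data untouched is an overlay: reads consult the overlay first and fall back to the original, and writes only ever go to the overlay. The sketch below illustrates that general technique; the class `RedirectedVolume`, the block-level granularity, and the in-memory dictionary standing in for "safe storage" are all assumptions for illustration, not details taken from the source.

```python
class RedirectedVolume:
    """Hypothetical sketch: a mounted view whose writes never touch the backup data."""

    def __init__(self, original_blocks):
        self._original = tuple(original_blocks)  # backed-up data, treated as read-only
        self._overlay = {}                       # block number -> new contents

    def read(self, block_no):
        """Return the redirected block if one exists, otherwise the original."""
        if block_no in self._overlay:
            return self._overlay[block_no]
        return self._original[block_no]

    def write(self, block_no, data):
        """Redirect the write to the overlay; the original block is not modified."""
        self._overlay[block_no] = data

    def changed_blocks(self):
        """Track which blocks differ from the original backup."""
        return sorted(self._overlay)


backup = [b"aaaa", b"bbbb", b"cccc"]
volume = RedirectedVolume(backup)
volume.write(1, b"BBBB")

print(volume.read(1))           # b'BBBB'
print(backup[1])                # b'bbbb'
print(volume.changed_blocks())  # [1]
```

The list returned by `changed_blocks` is the kind of change record that could later be converted back into tape-formatted data.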

In one embodiment, a method includes mounting on a host computer a data volume that contains one or more data files corresponding to the most recently backed-up version of the one or more data files stored on a backup storage system, and storing on the backup storage system data corresponding to a second version of the one or more data files that is more recent than the most recently backed-up version, while preserving the most recently backed-up version. The method may also include linking the most recently backed-up version of the one or more data files with the second version. In one example, the method may include creating a data structure that identifies both the most recently backed-up version and the second version of the one or more data files. In another example, the second version of the one or more data files may be a modified version of the most recently backed-up version.

In another embodiment, a backup storage system includes a backup storage medium for storing backup data sets and a controller that includes one or more processors configured to execute a set of instructions implementing the method described above.

According to another embodiment, a computer-readable medium having a data structure stored on it is provided. The data structure includes a first identifier that uniquely identifies a system file corresponding to a backup data set containing one or more data files, and one or more second identifiers, each of which identifies the storage location on the storage medium where the most recent version of one of the data files in the backup data set is stored.
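A data structure of this shape, with one identifier for the system file and one location identifier per data file, might be sketched as follows. The field names (`system_file_id`, `medium_id`, `offset`, `length`) and the choice of a dictionary keyed by file name are illustrative assumptions; the source specifies only the two kinds of identifier.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StorageLocation:
    """Hypothetical 'second identifier': where one file's latest version is stored."""
    medium_id: str
    offset: int
    length: int


@dataclass
class SystemFile:
    """Hypothetical record tying a backup data set to its files' latest locations."""
    system_file_id: str                            # the 'first identifier'
    locations: dict = field(default_factory=dict)  # data file name -> StorageLocation

    def record_latest(self, name: str, location: StorageLocation) -> None:
        """Point a data file at the location of its most recent version."""
        self.locations[name] = location


system_file = SystemFile("backup-set-42")
system_file.record_latest("report.doc", StorageLocation("disk-array-1", 0, 4096))
# A newer version of the same file replaces the earlier location.
system_file.record_latest("report.doc", StorageLocation("disk-array-1", 8192, 5120))

print(system_file.locations["report.doc"].offset)  # 8192
```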

The accompanying drawings are not drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in the various figures is represented by the same reference numeral. For clarity, not every component is labeled in every drawing.

Detailed Description

Various embodiments are described in more detail below with reference to the accompanying drawings. The invention is not limited in its application to the details of construction and the arrangement of components set forth in the following description or illustrated in the drawings. The invention can be practiced and carried out in various ways and forms. Also, the phraseology and terminology used herein are for the purpose of description and should not be regarded as limiting. Expressions such as "including," "having," "comprising," and "consisting of" are meant to encompass the items listed thereafter and their equivalents as well as additional items.

As used herein, the term "host computer" refers to any computer having one or more processors, such as a personal computer, a workstation, a mainframe, a networked client, or a server, that can communicate with a storage system or with other host computers. Host computers may include user computers (which may be user workstations, PCs, mainframes, and the like) as well as media servers and application servers (as described above with reference to FIG. 1). Also as used herein, the term "networked computing environment" includes any computing environment in which a plurality of host computers are coupled to one or more shared storage systems in such a way that the storage systems can communicate with each of the host computers. Fibre Channel is one example of a communication network that may be used with embodiments of the present invention. It should be understood, however, that the network is not limited to Fibre Channel, and that the various network components may communicate with one another over any network, such as Token Ring or Ethernet, instead of or in addition to Fibre Channel, or over a combination of different network connections. Embodiments of the invention may also be used in bus topologies such as SCSI or parallel SCSI.

According to various embodiments of the present invention, a virtual removable-media library backup storage system is provided that can use one or more disk arrays to emulate a removable-media-based storage system. With embodiments of the invention, data can be backed up to the disk arrays using the same backup/restore application that is used to back up data to removable media (tape, magnetic disk, optical disk, and so on), without the user having to modify or adjust existing backup procedures or purchase a new backup/restore application. In one such embodiment, the emulated removable medium is tape, and the backup storage system of the invention emulates a tape library system, including the tapes and the robotic mechanism used to handle tapes in a conventional tape library system.

A storage system according to embodiments of the present invention includes hardware and software that together interface a host computer (running the backup/restore application) with the backup storage media. The storage system may be designed to emulate tape or other forms of removable storage media, so that the backup/restore application sees the same devices and media as it would with a physical tape library, and to translate linear, sequential, tape-format data into data suitable for storage on random-access disks. In this way, the storage system of the invention can provide enhanced functionality (such as allowing users to retrieve individual backed-up user files, as described below) without requiring new backup/restore application software or policies.

FIG. 2 is a block diagram of one embodiment of a networked computing environment that includes a backup storage system 170 according to embodiments of the present invention. As illustrated, a host computer 120 is coupled to the storage system 170 through a network connection 121. The network connection 121 may be, for example, a Fibre Channel connection that allows high-speed data transfer between the host computer 120 and the storage system 170. It should be appreciated that the host computer 120 may be, or may include, one or more application servers 102 (FIG. 1) and/or media servers 114 (FIG. 1), and may enable backup of data from any computer in the networked computing environment or from the primary storage devices 106 (FIG. 1). In addition, one or more user computers 136 may be coupled to the storage system 170 through another network connection 138, such as an Ethernet connection. As described below, the storage system may allow users of the user computers 136 to view user files backed up on the storage system and to restore them selectively.

The storage system includes backup storage media 126, which may be, for example, one or more disk arrays, as described in more detail below. The backup storage media 126 provide the actual storage space for data backed up from the host computer 120. However, the storage system 170 may also include additional hardware and software that emulate a removable-media storage system, such as a tape library, so that to the backup/restore application running on the host computer 120 the data appears to be backed up onto conventional removable storage media. Thus, as illustrated in FIG. 2, the storage system 170 includes "emulated media" 134, which represent virtual, or emulated, removable storage media such as tapes. The emulated media 134 are presented to the host computer by the storage system software and/or hardware and appear to the host computer as physical storage media. Interfacing between the actual backup storage media 126 and the emulated media 134 may be performed by a switching network 132 and a storage system controller (not shown), which receive data from the host computer 120 and store it on the backup storage media 126, as described in detail below. In this manner, the storage system emulates a conventional tape storage system to the host computer 120.
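To make the idea of emulated media concrete, the following sketch models a tape library whose cartridges are byte buffers instead of physical tapes. It exposes the operations a backup application expects from a library (load a cartridge into a drive, write, unload) while the data lands in ordinary memory standing in for disk. The class `VirtualTapeLibrary`, its method names, and the barcode labels are hypothetical; a real emulation would answer SCSI commands, which this sketch does not attempt.

```python
class VirtualTapeLibrary:
    """Hypothetical emulated library: cartridges are byte buffers, not tapes."""

    def __init__(self, barcodes, drive_count):
        self.slots = {barcode: bytearray() for barcode in barcodes}
        self.drives = [None] * drive_count  # barcode loaded in each drive, if any

    def load(self, barcode, drive):
        """Emulate the robot moving a cartridge into a drive."""
        if barcode not in self.slots:
            raise KeyError(f"no cartridge {barcode}")
        if self.drives[drive] is not None:
            raise RuntimeError(f"drive {drive} is occupied")
        if barcode in self.drives:
            raise RuntimeError(f"cartridge {barcode} is already loaded")
        self.drives[drive] = barcode

    def unload(self, drive):
        """Emulate returning the cartridge to its slot."""
        barcode, self.drives[drive] = self.drives[drive], None
        return barcode

    def write(self, drive, data):
        """Append data to the loaded cartridge, as a sequential tape write would."""
        barcode = self.drives[drive]
        if barcode is None:
            raise RuntimeError(f"drive {drive} is empty")
        self.slots[barcode].extend(data)


library = VirtualTapeLibrary(["0001", "0002"], drive_count=1)
library.load("0001", drive=0)
library.write(0, b"backup data")
library.unload(0)

print(bytes(library.slots["0001"]))  # b'backup data'
```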

According to one embodiment, the storage system may include a "logical metadata cache" 242 that stores metadata relating to user data backed up from the host computer 120 onto the storage system 170. As used herein, the term "metadata" refers to information about user data, that is, data describing the characteristics of the actual user data. The logical metadata cache 242 is a searchable collection of data that enables users and/or software applications to randomly locate backed-up user files, compare user files with one another, and otherwise access and manipulate backed-up user files. Two examples of software applications that may use the data stored in the logical metadata cache 242 are an end-user restore application 300 and a synthetic full backup application 240, both of which are described in more detail below.
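A searchable metadata collection of this kind could be sketched as an in-memory index keyed by file path, as below. The record fields (`path`, `size`, `modified`, `backup_set`, `offset`) and the class `MetadataCache` are assumptions chosen for illustration; the source does not list which attributes the logical metadata cache 242 holds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMetadata:
    """Hypothetical metadata describing one backed-up version of a user file."""
    path: str
    size: int
    modified: int    # modification time, seconds since the epoch
    backup_set: str  # which backup data set holds this version
    offset: int      # where the file's data starts within that set


class MetadataCache:
    """Hypothetical searchable index over backed-up file versions."""

    def __init__(self):
        self._by_path = {}

    def add(self, entry: FileMetadata) -> None:
        self._by_path.setdefault(entry.path, []).append(entry)

    def versions(self, path: str):
        """All known versions of a file, oldest first."""
        return sorted(self._by_path.get(path, []), key=lambda e: e.modified)

    def latest(self, path: str):
        """The most recently modified version, or None if the file is unknown."""
        found = self.versions(path)
        return found[-1] if found else None


cache = MetadataCache()
cache.add(FileMetadata("/home/u/report.doc", 4096, 1000, "full-1", 0))
cache.add(FileMetadata("/home/u/report.doc", 5120, 2000, "incr-1", 512))

print(cache.latest("/home/u/report.doc").backup_set)  # incr-1
```

Because each entry records where the file's data lives, a lookup can go straight to one file without scanning the backup data set sequentially.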

In brief, the synthetic full backup application 240 can create a synthetic full backup data set from one or more existing full backup data sets and one or more incremental backup data sets. A synthetic full backup removes the need to perform periodic (e.g., weekly) full backups, which can save considerable time and network resources. The synthetic full backup application 240 is described in detail below. The end-user restore application 300 enables end users (e.g., operators of the user computers 136) to browse, locate, view, and/or restore previously backed-up user files from the storage system 170. It is also described in detail below.
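The core of a synthetic full backup can be sketched as a merge in which each later incremental overrides earlier versions of the same file. The function below is a minimal sketch under simplifying assumptions: each backup data set is modeled as a dictionary from file path to file contents, incrementals are supplied oldest first, and file deletions are ignored. The name `synthesize_full` is hypothetical, and the source's own procedure is described later in the specification.

```python
def synthesize_full(full, incrementals):
    """Merge a full backup with incrementals (oldest first) into a new full set.

    Hypothetical sketch: each data set maps a file path to that file's contents,
    and a later data set overrides earlier versions of the same path.
    """
    synthetic = dict(full)  # copy, so the existing full backup is left unchanged
    for incremental in incrementals:
        synthetic.update(incremental)
    return synthetic


full_backup = {"a.txt": "a-v1", "b.txt": "b-v1"}
monday = {"a.txt": "a-v2"}
tuesday = {"a.txt": "a-v3", "c.txt": "c-v1"}

result = synthesize_full(full_backup, [monday, tuesday])
print(result)  # {'a.txt': 'a-v3', 'b.txt': 'b-v1', 'c.txt': 'c-v1'}
```

No data is read from the host computers to build the result, which is where the savings in time and network resources come from.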

As discussed above, the storage system 170 includes hardware and software that interface the host computer 120 with the backup storage media 126. The hardware and software of embodiments of the invention emulate a conventional tape library backup system, so that from the point of view of the host computer 120 the data appears to be backed up onto tape, whereas it is actually backed up onto other storage media, such as a plurality of disk arrays.

FIG. 3 is a block diagram of one embodiment of the storage system 170 according to embodiments of the present invention. In one embodiment, the hardware of the storage system 170 includes a storage system controller 122 and a switching network 132 that couples the storage system controller 122 to the backup storage media 126. The storage system controller 122 includes a processor 127 (which may be a single processor or multiple processors) that can run all or part of the storage system software, and a memory 129 (RAM, ROM, PROM, EEPROM, flash memory, or a combination of these). The memory 129 may also be used to store metadata relating to the data stored on the backup storage media 126. Software, including the programming code that implements embodiments of the invention, is generally stored on a computer-readable and/or writable non-volatile recording medium, such as RAM, ROM, an optical disk, a magnetic disk, or tape, and is then copied into the memory 129, from which it can be executed by the processor 127. Such programming code may be written in any of a number of programming languages, for example Java, Visual Basic, C, C#, C++, Fortran, Pascal, Eiffel, Basic, or COBOL, or in a combination of them; the invention is not limited to a particular programming language. In general, in operation, the processor 127 causes data, such as the code that implements embodiments of the invention, to be read from the non-volatile recording medium into another form of memory, such as RAM, that the processor can access faster than it can access the non-volatile recording medium.

As shown in FIG. 3, the controller 122 includes a number of port adapters 124a, 124b, 124c that couple the controller 122 to the host computer 120 and to the switching network 132. As illustrated, the host computer 120 is coupled to the storage system through a port adapter 124a, which may be, for example, a Fibre Channel port adapter. Through the storage system controller 122, the host computer 120 can back up data onto the backup storage media 126 and restore data from the backup storage media 126.

In the illustrated example, the switching network 132 may include one or more Fibre Channel switches 128a, 128b. The storage system controller 122 includes a plurality of Fibre Channel port adapters 124b, 124c that couple the storage system controller to the Fibre Channel switches 128a, 128b. Through the Fibre Channel switches 128a, 128b, the storage system controller 122 allows data to be backed up onto the backup storage media 126. As illustrated in FIG. 3, the switching network 132 may further include one or more Ethernet switches 130a, 130b coupled to the storage system controller 122 through Ethernet port adapters 125a, 125b. In one example, the storage system controller 122 further includes another Ethernet port adapter 125c that may be coupled to, for example, the LAN 103, allowing the storage system 170 to communicate with host computers (e.g., user computers), as described below.

In the example illustrated in FIG. 3, the storage system controller 122 is connected to the backup storage medium 126 through a switching network that includes two Fibre Channel switches and two Ethernet switches. Providing at least two switches of each type within the storage system 170 eliminates any single point of failure in the system. That is, if one switch (e.g., Fibre Channel switch 128a) fails, the storage system controller 122 can still communicate with the backup storage medium 126 through the other switch. Such an arrangement is advantageous in terms of both reliability and speed. As noted above, reliability improves because redundant components are provided and single points of failure are eliminated. In addition, in some embodiments the storage system controller can back up data onto the backup storage medium 126 using some or all of the Fibre Channel switches in parallel, which increases the overall backup speed. However, the system need not include two or more switches of each type, nor must the switching network include both Fibre Channel and Ethernet switches. Furthermore, in examples where the backup storage medium 126 comprises a single disk array, no switches are needed at all.

As described above, in one embodiment the backup storage medium 126 may include one or more disk arrays. In one preferred embodiment, the backup storage medium 126 includes a plurality of ATA or SATA disks. Such disks are readily available off-the-shelf products and are relatively inexpensive compared with conventional storage array products from manufacturers such as EMC and IBM. Moreover, once the cost of removable media (e.g., tape) and the limited lifetime of such media are taken into account, such disks are comparable in cost to conventional tape-based backup storage systems. Such disks can also read and write data much faster than tape. For example, over a single Fibre Channel connection, data can be backed up onto disk at a rate of at least about 150 MB/s, which corresponds to about 540 GB/hr and is significantly faster (e.g., by roughly a factor of ten) than tape backup speeds. In addition, several Fibre Channel connections can be implemented in parallel, increasing the speed even further. According to embodiments of the present invention, the backup storage medium can be organized to implement any of several RAID (Redundant Array of Independent Disks) schemes. For example, in one embodiment the backup storage medium can be configured as a RAID-5 implementation.
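The throughput figures quoted above can be checked with simple arithmetic. The following is a minimal sketch that assumes decimal units (1 GB = 1000 MB); the function name and the two-link scenario are illustrative assumptions, not part of the described system, and real parallel links would not necessarily scale ideally.

```python
def mb_per_s_to_gb_per_hr(mb_per_s: float) -> float:
    """Convert a rate in MB/s to GB/hr, assuming 1 GB = 1000 MB."""
    seconds_per_hour = 3600
    return mb_per_s * seconds_per_hour / 1000


# Rate over a single Fibre Channel connection, as stated in the text.
single_link = mb_per_s_to_gb_per_hr(150)

# Hypothetical: two connections in parallel with ideal (linear) scaling.
two_links = 2 * single_link

print(f"single link: {single_link:.0f} GB/hr")
print(f"two links (ideal scaling): {two_links:.0f} GB/hr")
```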

As described above, embodiments of the present invention provide a "virtual tape library" by emulating a conventional tape library backup system, using disk arrays in place of tape cartridges as the physical backup storage medium. The physical tape cartridges that would be present in a conventional tape library are replaced by what are termed here "virtual cartridges." The term "virtual tape library" should be understood to mean an emulated tape library that can be implemented in software and/or physical hardware as, for example, one or more disk arrays. Although this description refers mainly to emulated tape, the storage system can also emulate other storage media such as CD-ROM or DVD-ROM, and the term "virtual cartridge" should be understood to refer generally to an emulated storage medium, such as an emulated tape or an emulated CD. In one embodiment, a virtual cartridge actually corresponds to one or more hard disks.

Thus, in one embodiment, a software interface is provided to emulate the tape library, so that to the backup/recovery application the data appears to be backed up onto tape. The actual tape library, however, is replaced by one or more disk arrays, and the data is in fact backed up onto these disk arrays. The various forms, features, and operation of the software included in the storage system 170 are described below.

Although the software may be described as being "included" in the storage system 170, and as being executed by the processor 127 of the storage system controller 122 (FIG. 3), not all of the software needs to execute on the storage system controller 122. Software programs such as the synthetic full backup application and the end-user recovery application can run on a host computer and/or a user computer, and portions of them can be distributed across some or all of the storage system controller, the host computers, and the user computers. The storage system controller therefore need not be a self-contained physical entity such as a computer. The storage system 170 can communicate with software residing on a host computer such as the media server 114 or the application server 102. The storage system may also include several software applications that reside on, or run on, the same or different host computers. Although the storage system 170 can be implemented as a discrete device in some embodiments, it is not limited to a discrete device. In one example, the storage system 170 can be provided as a self-contained unit that acts as a "plug and play" replacement for a conventional tape library backup system (that is, existing backup procedures and policies need not be modified). Such a storage system unit can also be used in a networked computing environment that includes a conventional backup system, to provide redundancy or additional storage capacity.

As described above, according to one embodiment the host computer 120 (which may be, for example, the application server 102 or the media server 114; see FIG. 1) can back up data onto the backup storage medium 126 through a network link (e.g., a Fibre Channel link) 121 that connects the host computer 120 to the storage system 170. Although the following description refers mainly to backing up data onto the emulated medium, it should be understood that the same principles apply to recovering backed-up data from the emulated medium. As described above, the flow of data between the host computer 120 and the emulated medium 134 can be controlled by the backup/recovery application. From the point of view of the backup/recovery application, the data may appear to have actually been backed up onto a physical version of the emulated medium.

As shown in FIG. 4, the storage system software 150 includes one or more logical abstraction layers that represent the emulated medium and provide an interface between the backup/recovery application 140 residing on the host computer 120 and the backup storage medium 126. The software 150 receives tape-format data from the backup/recovery application 140 and converts it into data suitable for storage on random-access disks (e.g., hard disks, optical disks, etc.). In one example, the software 150 executes on the processor 127 of the storage system controller 122 and can be stored in the memory 129 (FIG. 3).

According to one embodiment, the software 150 may include a layer, referred to as the virtual tape library (VTL) layer 142, that can provide SCSI emulation of tapes, tape drives, and the robotic mechanisms used to move tapes to and from the tape drives. The backup/recovery application 140 can communicate with the VTL 142 (e.g., to back up or write data to the emulated medium) using, for example, SCSI commands, as indicated by arrow 144. The VTL thus forms a software interface between the backup/recovery application and the other storage system software and hardware. It presents the emulated storage medium 134 to the backup/recovery application so that the emulated medium appears to that application as conventional removable backup storage media.

A second software layer, referred to as the file system layer 146, can provide an interface between the emulated storage medium (represented in the VTL) and the physical backup storage medium 126. In one example, the file system 146 acts as a small operating system and communicates with the backup storage medium 126 using, for example, SCSI commands, as indicated by arrow 148, in order to read data from and write data to the backup storage medium 126.

In one embodiment, the VTL provides generic tape library support and can support any SCSI media changer. Emulated tape devices may include, but are not limited to, IBM LTO-1 and LTO-2 tape devices, the Quantum SuperDLT320 tape device, the Quantum P3000 tape library system, and the StorageTek L180 tape library system. Each virtual cartridge in the VTL is a file that can grow dynamically as data is stored. This is quite different from a conventional tape cartridge, which has a fixed size. One or more virtual cartridges can be stored in a system file, described below with reference to FIG. 5.

FIG. 5 illustrates an example of a data structure within the file system software 146 that represents a system file 200 according to an embodiment of the present invention. In this embodiment, the system file 200 includes a header 202 and data 204. The header 202 may include information identifying each virtual cartridge stored in the system file. The header 202 may also include information such as whether a virtual cartridge is write-protected and the dates on which it was created and modified. In one example, the header 202 includes information that uniquely identifies each virtual cartridge and distinguishes it from the other virtual cartridges stored in the storage system. For example, this information may include a name and an identification number for the virtual cartridge (corresponding to the barcode typically affixed to a physical tape so that the robotic mechanism can identify the tape). The header 202 may also contain additional information such as the capacity of each virtual cartridge and the date of its last modification.
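The per-cartridge information described for the header can be pictured as a small keyed catalog. The sketch below is only an illustration under assumptions: the class names, field names, and the choice of the barcode as the lookup key are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict


@dataclass
class CartridgeEntry:
    """Hypothetical header record for one virtual cartridge."""
    name: str
    barcode: str            # identification number, analogous to a tape barcode
    write_protected: bool
    created: datetime
    modified: datetime
    capacity_bytes: int


@dataclass
class SystemFileHeader:
    """Hypothetical in-memory view of the header: entries keyed by barcode."""
    cartridges: Dict[str, CartridgeEntry] = field(default_factory=dict)

    def add(self, entry: CartridgeEntry) -> None:
        # Each cartridge must be uniquely identifiable within the storage system.
        if entry.barcode in self.cartridges:
            raise ValueError(f"duplicate cartridge barcode: {entry.barcode}")
        self.cartridges[entry.barcode] = entry


header = SystemFileHeader()
now = datetime(2004, 9, 30)
header.add(CartridgeEntry("weekly-full", "VC0001", False, now, now, 0))
```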

According to one embodiment of the present invention, the size of the header 202 can be optimized to reflect the type of data being stored (e.g., virtual cartridges representing data backed up from one or more host computer systems) and the number of distinct sets of such data that the system can track. For example, data that is backed up to a tape storage system is typically characterized by large data sets representing numerous system and user files. Because the data sets are large, the number of discrete data files that must be tracked can be correspondingly small. Accordingly, in one embodiment, the size of the header 202 can be chosen as a compromise between storing too much data to track efficiently (i.e., the header being too large) and lacking the space to store a sufficient number of cartridge identifiers (i.e., the header being too small). In one exemplary embodiment, the header 202 occupies the first 32 MB of the system file 200. However, it should be understood that the header 202 can have different sizes depending on the needs and capacity of the system, and that many different sizes may be selected for the header 202.

From the point of view of the backup/recovery application, a virtual cartridge appears as a physical tape cartridge with all the same attributes and characteristics. That is, to the backup/recovery application the virtual cartridge appears as a sequentially written tape. In one preferred embodiment, however, the data stored in a virtual cartridge is not stored on the backup storage medium 126 in a sequential format. Rather, the data that appears to be written to the virtual cartridge is in fact stored in the storage system's files as randomly accessible, disk-format data. Metadata is used to link the stored data to the virtual cartridges so that the backup/recovery application can read and write data in cartridge format.

Thus, to summarize one preferred embodiment, user and/or system data (referred to here as "file data") is received by the storage system 170 from the host computer 120 and stored on the disk arrays that make up the backup storage medium 126. As described below, the software 150 (FIG. 4) and/or hardware of the storage system writes this file data to the backup storage medium 126 in the form of system files. Metadata is extracted from the backed-up file data by the storage system controller in order to track attributes of the backed-up user and/or system files. For example, for each file this metadata may include the file name, the date the file was created or last modified, encryption information for the file, and other information. In addition, metadata that links each file to a virtual cartridge can be generated by the storage system for each file. Using such metadata, the software presents an emulation of tape cartridges to the host computer, although the file data is not actually stored in tape format but is instead stored in system files, as described below. Storing the data in system files rather than in a sequential cartridge format has the advantage of allowing fast, efficient random access to individual files without having to scan through sequential data to find a particular file.

As described above, according to one embodiment file data (i.e., user and/or system data) is stored on the backup storage medium as system files, each of which includes a header and data, the data being the actual user and/or system files. The header 202 of each system file 200 includes a tape directory 206 that contains metadata linking the user and/or system files to virtual cartridges. The term "metadata" refers not to the user and/or system file data itself but to data that describes attributes of the actual user and/or system data. According to one example, the tape directory can define the layout of data on a virtual cartridge down to the byte level. In one embodiment, the tape directory 206 has a table structure, as shown in FIG. 6. The table includes a column 220 for the type of information stored (e.g., data, file marker (FM), etc.), a column 222 for the size in bytes of the disk blocks used, and a column 224 giving the number of disk blocks in which the file data is stored. The tape directory thus allows the controller to access any data file stored on the backup storage medium 126 randomly (as opposed to sequentially). For example, as shown in FIG. 6, the tape directory indicates that the file data 226 begins one block from the start of the system file 200, so the data file 226 can be located quickly on the virtual tape. That one block corresponds to a file marker (FM) and therefore has no size. File markers are not stored in the system file; that is, a file marker corresponds to zero data. The tape directory nevertheless includes file markers because they are used by conventional tapes and by backup/recovery applications: the application writes file markers along with its tape files and expects to see them when it examines the virtual cartridge. File markers are therefore tracked in the tape directory. Because a file marker does not represent any data, however, it is not stored in the data section of the system file. The file data 226 begins at the start of the data section of the system file, indicated by arrow 205, and is 1024 bytes long (i.e., one disk block of 1024 bytes). It should be understood that other file data can be stored with block sizes other than 1024 bytes, depending on the amount of data, i.e., the size of the data file. For example, larger data files can be stored using larger block sizes for efficiency.
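The way a tape directory of this kind supports random access can be sketched as follows. This is a minimal illustration under assumptions: the `DirEntry` type, the `"FM"`/`"DATA"` labels, and the second data entry (three 4096-byte blocks) are hypothetical, and only the first data entry (one 1024-byte block preceded by a file marker) mirrors the example in the text. File markers are tracked as directory rows but contribute no bytes to the data section.

```python
from typing import List, NamedTuple


class DirEntry(NamedTuple):
    kind: str         # "DATA" or "FM" (file marker)
    block_size: int   # size in bytes of each disk block (0 for a file marker)
    block_count: int  # number of disk blocks


def data_offset(directory: List[DirEntry], index: int) -> int:
    """Byte offset, within the data section, of the entry at `index`.

    Only DATA entries before `index` occupy space; file markers are
    tracked in the directory but store nothing in the data section.
    """
    offset = 0
    for entry in directory[:index]:
        if entry.kind == "DATA":
            offset += entry.block_size * entry.block_count
    return offset


def data_length(entry: DirEntry) -> int:
    """Number of bytes an entry occupies in the data section."""
    return entry.block_size * entry.block_count if entry.kind == "DATA" else 0


directory = [
    DirEntry("FM", 0, 1),       # file marker: zero data
    DirEntry("DATA", 1024, 1),  # first file: one 1024-byte block
    DirEntry("FM", 0, 1),
    DirEntry("DATA", 4096, 3),  # hypothetical larger file with larger blocks
]

first_file_at = data_offset(directory, 1)
second_file_at = data_offset(directory, 3)
```

Because offsets are computed from the directory alone, a reader can seek directly to a given file instead of scanning the preceding data, which is the benefit the text attributes to the tape directory.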

In one example, the tape directory may be contained in a "file descriptor" associated with each data file backed up to the storage system. The file descriptor contains metadata about the data files 204 stored in the storage system. In one embodiment, the file descriptor can be implemented in a standard format such as the tape archive (tar) format used by most Unix-based computer systems. Each file descriptor may include information such as the name of the corresponding user file, the date the user file was created or modified, the size of the user file, and any access restrictions on the user file. Additional information stored in the file descriptor may include information describing the directory structure from which the data was copied. The file descriptor can therefore include searchable metadata about the corresponding data file, as described below.
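Since the text names tar as one possible descriptor format, the kind of metadata such a descriptor carries can be illustrated with Python's standard `tarfile` module. This is a sketch only: the helper names `build_archive` and `list_descriptors`, the sample path, and the sample values are assumptions for illustration and say nothing about how the described storage system actually parses its data.

```python
import io
import tarfile
from typing import Dict, List, Tuple


def build_archive(files: Dict[str, Tuple[bytes, int]]) -> bytes:
    """Build an in-memory tar archive from {path: (content, mtime)}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path, (content, mtime) in files.items():
            info = tarfile.TarInfo(name=path)  # path also records directory structure
            info.size = len(content)
            info.mtime = mtime                 # modification time (seconds since epoch)
            info.mode = 0o640                  # access permissions
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


def list_descriptors(archive: bytes) -> List[dict]:
    """Extract searchable per-file metadata from the tar headers."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        return [
            {"name": m.name, "size": m.size, "mtime": m.mtime, "mode": m.mode}
            for m in tar.getmembers()
        ]


archive = build_archive({"home/user/report.txt": (b"hello", 1096502400)})
descriptors = list_descriptors(archive)
```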

From the point of view of the backup/recovery application, any virtual cartridge may contain a plurality of data files with corresponding file descriptors. From the point of view of the storage system software, the data files are stored in system files that can be linked to a particular backup job. For example, a backup executed by one host computer at a particular time can produce one system file that corresponds to one or more virtual cartridges. A virtual cartridge can thus be of any size and can grow dynamically as more user files are stored on it.

Referring again to FIG. 3, the storage system 170 may include a synthetic full backup software application 240. In one embodiment, the host computer 120 backs up data onto the emulated medium 134, forming one or more virtual cartridges. In some computing environments, a "full backup," i.e., a backup copy of all data stored on the primary storage system in the network (FIG. 1), is performed periodically (e.g., weekly). This process is generally very time consuming because of the large amount of data to be copied. Therefore, in most computing environments, additional backups, called incremental backups, are performed between successive full backups, for example daily. An incremental backup is a process in which only the data that has changed since the last backup, whether incremental or full, is backed up. Typically, changed data is backed up on a per-file basis, even though much of the data within a file may not have changed. Incremental backups are therefore smaller than full backups and complete faster. Although most environments typically perform a full backup once a week and incremental backups daily, it should be understood that these time frames need not be used. For example, some environments require incremental backups several times a day. The principles of the invention apply to any environment that uses full backups (and, optionally, incremental backups), regardless of how often they are performed.

While a full backup procedure is executed, the host computer can create one or more virtual cartridges containing the backed-up data, which consists of a plurality of data files. For clarity, the following description assumes that a full backup creates only one virtual cartridge. However, it should be understood that a full backup may create more than one virtual cartridge and that the principles of the invention are not limited to any particular number of virtual cartridges.

According to one embodiment, a method is provided for creating a synthetic full backup data set from one existing full backup data set and one or more incremental backup data sets. Because this method can remove the need to perform periodic (e.g., weekly) full backups, it can save the user considerable time and network resources. Furthermore, as will be apparent to those skilled in the art, restoring data from a full backup plus one or more incremental backups can be a time-consuming process: if, for example, the latest version of a file exists in an incremental backup, the backup/recovery application typically restores the file from the last full backup and then applies all of the changes from the incremental backups. Providing a synthetic full backup therefore has the additional advantage of allowing the backup/recovery application to restore data files more quickly from the synthetic full backup alone, without having to perform successive restores from a full backup and one or more incremental backups. The term "latest version" should be understood to refer generally to the most recent copy of a data file (i.e., the most recent time the data file was saved), regardless of whether the file has a new version number. The term "version" refers to a copy of the same file that may have been modified in some way or saved multiple times.

FIG. 7 schematically illustrates a synthetic full backup procedure. The host computer 120 can perform a full backup 230 at a first time, for example on a weekend. The host computer 120 can then perform successive incremental backups 232a, 232b, 232c, 232d, 232e, for example daily during the week. The storage system 170 can then create a synthetic full backup data set 234, as described below.

According to one embodiment, the storage system 170 may include a software application referred to as the synthetic full backup application 240 (FIG. 3). The synthetic full backup application 240 can run on the storage system controller 122 (FIG. 2) or on the host computer 120. The synthetic full backup application 240 includes the software commands and interfaces needed to create the synthetic full backup data set 234. In one example, the synthetic full backup application performs a logical merge of the metadata representations of the full backup data set 230 and each of the incremental backup data sets 232 to create a new virtual cartridge containing the synthetic full backup data set 234.

For example, as shown in FIG. 8, the existing full backup data set may include user files F1, F2, F3, and F4. A first incremental backup data set 232a may include F2′, a modified version of user file F2, and F3′, a modified version of F3. A second incremental backup data set 232b may include F1′, a modified version of user file F1; F2″, a further modified version of F2; and F5, a new user file. The synthetic full backup data set 234 is therefore formed from a logical merge of the full backup data set 230 and the two incremental data sets 232a, 232b, and contains the latest version of each of the user files F1, F2, F3, F4, and F5. Thus, as shown in FIG. 8, the synthetic full backup data set contains user files F1′, F2″, F3′, F4, and F5.
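The merge in this example can be sketched as a latest-version-wins overlay of per-backup catalogs. The following is a simplified illustration, not the patented procedure itself: the function name and the dictionary representation (file identifier mapped to a version label) are assumptions, and the sketch ignores cases such as file deletions that a real merge would need to consider.

```python
from typing import Dict, List


def synthesize_full(full: Dict[str, str],
                    incrementals: List[Dict[str, str]]) -> Dict[str, str]:
    """Merge a full backup catalog with incrementals, oldest first.

    Each catalog maps a file identifier to the version it holds. Later
    incrementals override earlier entries, so the result holds the latest
    version of every file seen in any of the data sets.
    """
    merged = dict(full)
    for incremental in incrementals:
        merged.update(incremental)
    return merged


full_230 = {"F1": "F1", "F2": "F2", "F3": "F3", "F4": "F4"}
incr_232a = {"F2": "F2'", "F3": "F3'"}
incr_232b = {"F1": "F1'", "F2": "F2''", "F5": "F5"}

synthetic_234 = synthesize_full(full_230, [incr_232a, incr_232b])
print(sorted(synthetic_234.values()))
```

Note that the merge operates only on catalog entries; in the spirit of the text, the file data itself would not need to be copied to perform it.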

As illustrated in FIGS. 3 and 4, the file system software 146 may create a logical metadata cache 242 that stores metadata relating to each user file stored on the emulated medium 134. The logical metadata cache need not be a physical data cache; it may instead be a searchable collection of data stored on the storage medium 126. In another example, the logical metadata cache 242 may be implemented as a database. Where the metadata is stored in a database, conventional database commands (for example, SQL commands) may be used to perform the logical merge of the full backup data set and one or more incremental backup data sets to generate the synthetic full backup data set.
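Where the metadata is held in a database, the logical merge reduces to a "newest row per file" query. The sketch below uses Python's built-in `sqlite3` module; the table name `file_meta`, its columns, and the labels `full`, `inc_a` and `inc_b` are assumptions made for illustration and do not come from the specification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_meta (name TEXT, backup_seq INTEGER, backup_set TEXT, version TEXT)"
)
rows = [
    # backup_seq 0 = full backup, 1 and 2 = first and second incremental backups
    ("F1", 0, "full", "F1"), ("F2", 0, "full", "F2"),
    ("F3", 0, "full", "F3"), ("F4", 0, "full", "F4"),
    ("F2", 1, "inc_a", "F2'"), ("F3", 1, "inc_a", "F3'"),
    ("F1", 2, "inc_b", "F1'"), ("F2", 2, "inc_b", "F2''"), ("F5", 2, "inc_b", "F5"),
]
conn.executemany("INSERT INTO file_meta VALUES (?, ?, ?, ?)", rows)

# For each file name, keep only the row from the most recent backup that contains it.
query = """
SELECT m.name, m.version, m.backup_set
FROM file_meta AS m
JOIN (SELECT name, MAX(backup_seq) AS seq FROM file_meta GROUP BY name) AS latest
  ON m.name = latest.name AND m.backup_seq = latest.seq
ORDER BY m.name
"""
latest_versions = conn.execute(query).fetchall()
for row in latest_versions:
    print(row)
```

Only metadata rows are read here; no file data is copied, which is the point of performing the merge logically.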

As discussed above, each data file stored on the emulated medium 134 may include a file descriptor that contains metadata relating to the data file, including the location of the file on the backup storage medium 126. In one embodiment, the backup/recovery application running on the host computer 120 stores data in a streaming tape format on the emulated medium 134. FIG. 9 illustrates an example of a data structure 250 representing this tape format. As discussed above, the system file data structure includes a header that carries information about the data file, such as the file descriptor for the data file, the dates of creation and/or modification of the file, security information, the directory structure of the host system from which the file originated, and other information linking the file to a virtual cartridge. The header is associated with the data 254, which is the actual user and system files backed up (copied) from the host computer, the primary storage system, and so on. The system file data structure may optionally also include a pad 256 that allows the next header to be aligned to a block boundary.

As illustrated in FIG. 9, in one embodiment the header data is placed in the logical metadata cache 242 to enable fast searching of, and random access to, the otherwise sequential tape data format. Use of the logical metadata cache, implemented using the file system software 148 on the storage system controller 122, allows the linear, sequential tape data format stored on the emulated medium 134 to be translated into a random-access data format stored on the physical disks that make up the backup storage medium 126. The logical metadata cache 242 stores the header 252, which includes the file descriptor for the data file; security information that may be used to control access to the data file; and, as discussed below, pointers 256 to the actual locations of the data file on the virtual cartridges and on the backup storage medium 126. In one embodiment, the logical metadata cache stores data relating to all the data files backed up in the full backup data set 230 and in each incremental data set 232.

According to one embodiment, the synthetic full backup application software 240 uses the information stored in the logical metadata cache to generate the synthetic full backup data set. This synthetic full backup data set is then linked to a synthetic virtual cartridge created by the synthetic full backup application 240. To the backup/recovery application, the synthetic full backup data set appears to be stored on this synthetic virtual cartridge. As discussed above, the synthetic full backup data set may be generated by performing a logical merge of the existing full backup data set and the incremental backup data sets. This logical merge may include comparing each of the data files included in the existing full backup data set and the incremental backup data sets, and creating a composite of the most recently modified version of each user file, as described with reference to FIG. 8.

According to one embodiment, as illustrated in FIG. 10, the synthetic virtual cartridge 260 includes pointers to the locations of data files on other virtual cartridges, specifically the virtual cartridges that contain the existing full backup data set and the incremental backup data sets. Considering the example given with respect to FIG. 8 above, the synthetic virtual cartridge 260 includes pointers 266 that point (indicated by arrows 268) to the location of user file F4 in the existing full backup data set on virtual cartridge 262 (because the existing full backup data set contains the latest version of F4) and, for example, to the location of user file F3′ in the incremental data set 232a on virtual cartridge 264.

The synthetic virtual cartridge also includes a list 270 containing the identifying numbers of all the virtual cartridges that hold data to which the pointers 266 point. This dependent cartridge list 270 may be important for keeping track of where the actual data is located and for ensuring that the dependent virtual cartridges are not deleted. In this embodiment, the synthetic full backup data set does not contain any actual user files; it contains a set of pointers that indicate the locations of the user files on the backup storage medium 126. It is therefore desirable to prevent the actual user files (stored on other virtual cartridges) from being deleted. This may be accomplished in part by maintaining a record (the dependent cartridge list 270) of the virtual cartridges that contain the data, and by protecting each of those virtual cartridges from being overwritten or deleted. The synthetic virtual cartridge may also include cartridge data 272, such as the size of the synthetic virtual cartridge and its location on the backup storage medium 126. In addition, the synthetic virtual cartridge may have an identifying number and/or name 274.
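One way to picture the pointer set and the dependent cartridge list is the following Python sketch. All class and field names, and the cartridge labels such as `VC262`, are hypothetical; the specification does not prescribe any particular data layout, and the deletion guard shown is only one possible way of protecting dependent cartridges.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Pointer:
    cartridge_id: str  # virtual cartridge that actually holds the file data
    offset: int        # assumed byte offset of the file on that cartridge
    length: int        # assumed length of the file in bytes


@dataclass
class SyntheticCartridge:
    name: str
    pointers: dict = field(default_factory=dict)  # file name -> Pointer

    @property
    def dependent_cartridges(self):
        """Cartridges this synthetic cartridge points into (cf. list 270)."""
        return sorted({p.cartridge_id for p in self.pointers.values()})


class CartridgeStore:
    def __init__(self, cartridge_ids):
        self.cartridges = set(cartridge_ids)
        self.synthetics = []

    def register(self, synthetic):
        self.synthetics.append(synthetic)

    def delete(self, cartridge_id):
        # Refuse to delete a cartridge that a synthetic cartridge still depends on.
        for syn in self.synthetics:
            if cartridge_id in syn.dependent_cartridges:
                raise PermissionError(f"{cartridge_id} is referenced by {syn.name}")
        self.cartridges.discard(cartridge_id)


store = CartridgeStore({"VC262", "VC264", "VC999"})
synthetic = SyntheticCartridge(
    "SYN260",
    {"F4": Pointer("VC262", 0, 100), "F3'": Pointer("VC264", 0, 80)},
)
store.register(synthetic)
print(synthetic.dependent_cartridges)
```

Deleting `VC999` succeeds because nothing points into it, while an attempt to delete `VC262` or `VC264` raises `PermissionError`.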

According to another embodiment, the synthetic virtual cartridge may include a combination of pointers and actually stored user files. As illustrated in FIG. 11, in one example the synthetic virtual cartridge includes pointers 266 that point to the locations of data files (the latest versions, as discussed with reference to FIG. 9) in the existing full backup data set 230 on virtual cartridge 262. The synthetic virtual cartridge may also include data 278 comprising actual data files copied from the incremental data sets 232, as indicated by arrow 280. In this manner, the incremental backup data sets may be deleted after the synthetic full backup data set 276 has been generated, thereby saving storage space. Such synthetic virtual cartridges remain relatively small because they consist wholly or partly of pointers rather than copies of all the user files.

It should be appreciated that a synthetic full backup may include any combination of pointers and stored file data and is not limited to the examples given above. For example, a synthetic full backup may include pointers to data files for some files stored in certain incremental and/or full backups, and may include stored file data copied from other existing full and/or incremental backups. Alternatively, a synthetic full backup may be generated from a previous full backup and all relevant incremental backups without any pointers at all, containing instead the latest version of the actual file data copied from the appropriate full and/or incremental backups.

In one embodiment, the synthetic full backup application software may include a differencing algorithm that compares the user and system file metadata for the existing full backup data set and each incremental backup data set to determine where the latest version of each data file is located. For example, the differencing algorithm may compare the dates of creation and/or modification of the different versions of the same data file in the different backup sets in order to select the latest version. However, users often open a user file and save it (thereby changing its modification date) without actually changing any of the data inside it. The system may therefore implement a more advanced differencing algorithm that analyzes the data inside the system or user files to determine whether the data has in fact changed. Variations of such differencing algorithms, and other types of comparison algorithms, will be apparent to those skilled in the art. In addition, as discussed above, where the metadata is stored in a database format, database commands such as SQL commands may be used to perform the logical merge. The invention may use any algorithm that allows the most recent version of each user file to be selected from among all the compared existing backup sets, so that an accurate synthetic full backup data set can be generated.
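A content-aware differencing step of the kind described above might look like the following Python sketch. It is an assumption-laden illustration: the specification does not say how file contents are compared, so a SHA-256 digest is used here as one plausible choice, and the tuple layout `(backup_seq, mtime, data)` is hypothetical.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Fingerprint of the file contents; SHA-256 is an illustrative choice."""
    return hashlib.sha256(data).hexdigest()


def pick_version(versions):
    """Choose which stored copy of a file the synthetic backup should reference.

    `versions` is a list of (backup_seq, mtime, data) tuples, ordered oldest first.
    A newer copy replaces the current choice only if its contents differ, so a
    file that was merely re-saved (new mtime, same bytes) keeps the older copy.
    """
    chosen = versions[0]
    for candidate in versions[1:]:
        if content_digest(candidate[2]) != content_digest(chosen[2]):
            chosen = candidate
    return chosen


history = [
    (0, 100, b"draft"),   # full backup
    (1, 200, b"draft"),   # re-saved without changes: newer mtime, same contents
    (2, 300, b"final"),   # genuinely modified
    (3, 400, b"final"),   # re-saved again without changes
]
seq, mtime, data = pick_version(history)
print(seq, mtime, data)
```

A purely date-based comparison would select the copy from backup 3; the content comparison selects backup 2, whose data is identical, so the later copy never needs to be referenced.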

As will be appreciated by those skilled in the art, the synthetic full backup application allows a full backup data set to be generated and made available without requiring the host computer to perform a physical full backup. This not only relieves the host computer of the processor overhead of transferring the data to the backup storage system but, in embodiments where the synthetic full backup application runs on the storage system, also significantly reduces the use of network bandwidth. As illustrated in FIG. 7, additional synthetic full backup data sets may be generated using the first synthetic full backup data set 234 and subsequent incremental backup data sets 236. This may provide a significant time advantage, because files or objects that are not frequently modified need not be frequently copied. Instead, the synthetic full backup data sets can simply maintain pointers to files that were copied only once.

As discussed above with reference to FIG. 3, the storage system may also include a software application referred to as the end-user recovery application 300. Thus, according to another embodiment, methods are provided by which end users can locate and recover their backed-up data without intervention by IT staff and without requiring any change to existing backup/recovery procedures and/or policies. In a typical backup storage system, the backup/recovery application running on the host computer 120 is controlled by IT staff, and it may be impossible or very difficult for an end user to access backed-up data without intervention by the IT staff. According to embodiments of the invention, storage system software allows end users to locate and recover their own files through, for example, a web-based or other interface to the backup storage medium 126.

It should be appreciated that, like the synthetic full backup application 240, the end-user recovery application 300 may run on the storage system controller 122 or on the host computer 120. The end-user recovery application includes the software instructions and interfaces necessary to allow an authorized user to search the logical metadata cache, locate backed-up files, and optionally recover them from the backup storage medium 126.

According to one embodiment, software including a user interface is provided that is installed and/or executed on the user computer 136. The user interface may be any type of interface that allows a user to locate files on the backup storage medium. For example, the user interface may be a graphical user interface, a web-based interface or a text interface. The user computer is coupled to the storage system 170 through a network connection 138, for example an Ethernet connection. Through this network connection 138, an operator of the user computer 136 can access the data stored on the storage system 170.

In one example, the end-user recovery application 300 includes user authentication and/or authorization features. For example, a user may be asked to log in through the user interface on the user computer using a user name and password. The user computer may transmit the user name and password to the storage system (for example, to the end-user recovery application), which may use an appropriate user authentication mechanism to determine whether the user has access to the storage system. Examples of user authentication mechanisms include, but are not limited to, a Microsoft Active Directory server, a Unix “yellow pages” server, or the Lightweight Directory Access Protocol. The login/user authentication mechanism may communicate with the end-user recovery application to convey the user's privileges. For example, some users may be allowed to search only those files that they created, or for which they hold certain privileges or are identified as the owner. Other users, such as system operators or administrators, may be allowed access to all backed-up files.

According to one embodiment, the end-user recovery application uses the logical metadata cache to obtain information about all the data files backed up on the backup storage medium. Through the user interface, the end-user recovery application presents the user with a hierarchical directory structure of the user's files, sorted by, for example, backup time or date, user name, the original directory structure of the user's computer (captured when the files were backed up), or other file characteristics. In one example, the directory structure presented to the user may vary according to the privileges granted to that user. The end-user recovery application may accept browse requests (that is, the user browses the directory structure through the user interface to find a desired file), or the user may search for a file by name, date, and so on.
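The privilege-dependent directory view can be illustrated with a short Python sketch. The record layout `(path, owner, backup_date)`, the function name `directory_view`, and the simple "owner or administrator" rule are all assumptions for illustration; an actual system would take the privileges from its authentication mechanism.

```python
def directory_view(entries, user, is_admin=False):
    """Build a nested dict of directories and files visible to one user.

    `entries` is an iterable of (path, owner, backup_date) tuples taken from the
    metadata. Administrators see every entry; other users see only their own files.
    """
    tree = {}
    for path, owner, backup_date in entries:
        if not is_admin and owner != user:
            continue
        *dirs, filename = path.strip("/").split("/")
        node = tree
        for directory in dirs:
            node = node.setdefault(directory, {})
        node[filename] = backup_date
    return tree


entries = [
    ("/home/alice/report.doc", "alice", "2004-09-01"),
    ("/home/alice/notes.txt", "alice", "2004-09-02"),
    ("/home/bob/plan.xls", "bob", "2004-09-02"),
]
alice_view = directory_view(entries, "alice")
admin_view = directory_view(entries, "operator", is_admin=True)
print(alice_view)
print(admin_view)
```

Here the user `alice` sees only her own two files, while the administrator view also contains the directory for `bob`.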

According to one embodiment, the user can recover backed-up files from the storage system. For example, once the user has located a desired file as described above, the user may download the file from the storage system over the network connection 138. In one example, this download procedure may be implemented in a manner comparable to any web-based download, as known to those skilled in the art.

By allowing end users to access those files for which they have permission to view and download, and by enabling such access through a user interface, the end-user recovery application lets users search for and recover their own files without any change to backup policies or procedures.

According to another embodiment, methods and mechanisms are provided by which a user can “mount” a network-attached view of a backup data set stored on the backup storage medium 126. This allows the user to view and access the data in the mounted data set in the same way as data on any other local or network drive coupled to the user's computer. Thus, for example, the user can effectively recover data to an application server [for example, if the system's primary storage device 106 (FIG. 1) has failed] without running a recovery procedure through the media server 144 (FIG. 1). Recovering data to an application server using such a mount procedure may be tens of times faster than a typical volume recovery carried out through the media server. The term “mount” should be understood to mean making a network component, such as a network drive, or a data volume available to the operating system of a host computer. A data volume may include, for example, a single data file or system file, multiple files, or a directory structure containing multiple files. Common mounting protocols include NFS (Network File System) and CIFS (Common Internet File System) sharing. These protocols allow a host computer to access resources on another computer over a network connection through an interface that makes the remote resource appear to be provided locally on the host computer.

FIG. 12 is a flow diagram illustrating a method of performing a volume mount according to one embodiment of the invention. In a first step 290, the user selects a data volume to mount and sends a volume mount request to the backup storage system controller 122 (FIG. 3). In general, a user will wish to recover data from a full backup data set (rather than from an incremental backup data set) so as to capture a complete and accurate representation of the backed-up information. If no current full backup data set exists (for example, because the network manager runs a full backup weekly and the user wishes to recover data mid-week, when no current full backup is available), a synthetic full backup may be generated and used to recover the selected data.

According to one embodiment, the backup storage system 170 may include a software application referred to as the volume recovery application 310 (FIG. 13), which may control and implement the methods of performing the data volume mount and recovery procedures. Like the synthetic full backup and end-user recovery applications, the volume recovery application 310 may run on the host computer and/or on the user computer, and portions of it may be distributed across all or some of the storage system controller, the host computer and the user computer.

Referring again to FIG. 12, after a volume mount has been requested, the volume recovery application may query whether a current full backup data set is available (step 292). If not, the volume recovery application may communicate with the synthetic full backup application 240 to perform the synthetic full backup process (see FIG. 1) and generate a current backup data set (step 294). The volume recovery application may then export either the regular full backup data set or the synthetic full backup data set so that the requested volume can be mounted as an NFS or CIFS share. In particular, the volume recovery application queries the logical metadata cache 242 to locate the appropriate metadata representing the full backup volume identified and selected in step 290.
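The decision made in steps 292 and 294 can be sketched as follows. This is a simplified illustration: the catalog layout, the `full_is_current` flag and both function names are hypothetical, and the `synthesize` callable merely stands in for the synthetic full backup application 240.

```python
def select_backup_for_mount(volume, catalog, synthesize):
    """Return the data set to export for a mount request (cf. steps 290 to 294).

    `catalog` maps a volume name to a record with the keys "full_is_current",
    "full" and "incrementals". `synthesize` builds a synthetic full backup from
    a full backup and a list of incrementals ordered oldest first.
    """
    record = catalog[volume]              # step 290: mount request for this volume
    if record["full_is_current"]:         # step 292: is a current full backup available?
        return record["full"]
    return synthesize(record["full"], record["incrementals"])  # step 294


def simple_synthesize(full, incrementals):
    merged = dict(full)
    for inc in incrementals:
        merged.update(inc)
    return merged


catalog = {
    "vol1": {
        "full_is_current": False,
        "full": {"a.txt": "v1", "b.txt": "v1"},
        "incrementals": [{"a.txt": "v2"}],
    },
    "vol2": {"full_is_current": True, "full": {"c.txt": "v1"}, "incrementals": []},
}
print(select_backup_for_mount("vol1", catalog, simple_synthesize))
print(select_backup_for_mount("vol2", catalog, simple_synthesize))
```

For `vol1` a synthetic data set is built before export; for `vol2` the existing full backup is returned unchanged.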

According to one embodiment, the mount request (step 290) causes the volume recovery application to create one or more file descriptor structures that facilitate exporting the volume for mounting as an NFS or CIFS share (step 296). FIG. 14 illustrates one embodiment of a file descriptor structure 320 that may be created by the volume recovery application; the file descriptor 320 corresponds to a system file in the tape format (for example, system file 332; see FIG. 15). As discussed above, the file descriptors contain searchable metadata corresponding to the system files and data files stored on the storage system. The file descriptor 320 may include a number of fields containing information such as the file name 322 and the file permissions (access controls) 324 for a data file included in the volume to be mounted. The file descriptor also includes one or more pointers 326 to the location of the source data of the data file (that is, identifying where the data file is stored on the storage medium 126), the length 328 of the data file, and a pointer 330 to the next entry (for example, the next data file) in the linked list of file descriptor structures. If the “next” field, indicated by reference numeral 331, is null, the data file is the last data file known to the system file represented by the file descriptor 320 (that is, it is the last entry in the linked list). Each system file included in the data volume to be mounted is represented by a file descriptor structure such as that illustrated in FIG. 14. Once a file descriptor 320 has been created for each system file in the requested volume, the file descriptors may be used to locate and export the associated data files in response to NFS or CIFS requests.
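A linked list of file descriptors with the fields described for FIG. 14 can be modeled as in the Python sketch below. The field names and types are assumptions; only the roles of the fields (name, permissions, pointer to the data, length, and a "next" link that is null at the end of the list) follow the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileDescriptor:
    name: str          # cf. file name 322
    permissions: int   # cf. file permissions 324
    data_offset: int   # cf. pointer 326: where the data sits on the storage medium
    length: int        # cf. length 328
    next: Optional["FileDescriptor"] = None  # cf. pointer 330; None marks the last entry


def build_chain(records):
    """Link (name, permissions, offset, length) records into a list, in order."""
    head = None
    for name, permissions, offset, length in reversed(records):
        head = FileDescriptor(name, permissions, offset, length, head)
    return head


def list_names(head):
    """Walk the chain until the null "next" link is reached."""
    names = []
    node = head
    while node is not None:
        names.append(node.name)
        node = node.next
    return names


chain = build_chain([
    ("a.txt", 0o644, 512, 5),
    ("b.txt", 0o600, 1536, 11),
])
print(list_names(chain))
```

The second descriptor has `next` set to `None`, which plays the role of the null "next" field marking the last entry in the list.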

As discussed above, in one embodiment the file descriptors may be implemented in a standardized format, such as the tape archive (tar) format used by most Unix-based computer systems. FIG. 15 illustrates an exemplary system file 332 written in tape format as a segment of a tape (for example, tar) data stream. FIG. 16 illustrates the corresponding file descriptor 340 for the system file 332. As illustrated in FIG. 15, a file written in tape format includes a header 336 and the actual data 338 stored in the system file 332. The data 338 may correspond to one or more data files. In the illustrated example the system file 332 is 1032 bytes long, but a file may have any length depending on its size and the format in which it was written.
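To show how headers in a tar stream can be turned into searchable, random-access metadata, the sketch below uses Python's standard `tarfile` module. It is an illustration only: the index layout and function names are hypothetical, and the specification does not state that the storage system parses tar streams in this way.

```python
import io
import tarfile


def make_sample_tar():
    """Create a small in-memory tar stream standing in for tape-format backup data."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in [("a.txt", b"hello"), ("b.txt", b"backup data")]:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def build_index(tar_bytes):
    """Read every header once and record where each file's data starts."""
    index = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        for member in tar:
            if member.isfile():
                index[member.name] = {
                    "mode": member.mode,
                    "mtime": member.mtime,
                    "offset": member.offset_data,  # byte offset of the file data
                    "length": member.size,
                }
    return index


def read_via_index(tar_bytes, index, name):
    """Fetch one file directly by offset, without scanning the stream again."""
    entry = index[name]
    return tar_bytes[entry["offset"]:entry["offset"] + entry["length"]]


stream = make_sample_tar()
index = build_index(stream)
print(read_via_index(stream, index, "b.txt"))
```

After the single indexing pass, any file can be read by offset and length, which is the kind of random access over an otherwise sequential format that the metadata cache is described as enabling.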

The file descriptor 340 for the file 332 is contained in the header 336. As illustrated in FIG. 16, and as in the general example illustrated in FIG. 14, the file descriptor 340 includes a file name 341, security information 344, a pointer 342 to the stored data of each data file known to the system file, the length 346 of the corresponding data file, and a “next” entry identifying the next data file known to the system file, which in the illustrated example is null 348.

Referring again to FIG. 12, once all the file descriptors for the files in the data volume to be mounted have been created, the volume recovery application exports the file system built from those file descriptors as an NFS or CIFS share at a mount point specified by the user (step 298). At this point the mount is complete (step 299), and the mounted data volume is available for the user to read and/or write data, as discussed below.

According to one embodiment, NFS or CIFS read operations [that is, the user wishes to view data in the mounted data volume] are serviced by searching through the file descriptors for a match to the file specification. It should be appreciated that, according to one embodiment, the user need not actually search the file descriptors directly. Instead, the volume recovery application may include a user interface that presents the data to the user in, for example, a typical directory structure format. The volume recovery application may include software that translates a user request for a particular file into a search command that accesses the logical metadata cache to locate the file descriptor 320 for the matching system file. Once the file has been located, the data transfer to the user computer is accomplished by following the linked list [that is, following the pointers stored in the file descriptor to locate the actual data] and building a buffer of file data that can be sent to the requesting user.
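Servicing a read by following descriptor pointers and assembling a buffer might be sketched as below. The `Descriptor` layout, the use of `(offset, length)` extents, and the byte string standing in for the storage medium are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Descriptor:
    name: str
    extents: List[Tuple[int, int]]  # (offset, length) pairs on the storage medium
    next: Optional["Descriptor"] = None


def read_file(head, name, medium):
    """Find the descriptor matching `name`, then gather its data into one buffer."""
    node = head
    while node is not None and node.name != name:
        node = node.next                   # follow the linked list of descriptors
    if node is None:
        raise FileNotFoundError(name)
    buffer = bytearray()
    for offset, length in node.extents:    # follow each pointer to the stored data
        buffer += medium[offset:offset + length]
    return bytes(buffer)


# A byte string stands in for the backup storage medium.
medium = b"....HELLO....WORLD.."
second = Descriptor("other.txt", [(0, 4)])
first = Descriptor("greeting.txt", [(4, 5), (13, 5)], next=second)
print(read_file(first, "greeting.txt", medium))
```

The two extents of `greeting.txt` are read from their separate locations and concatenated into a single buffer that could then be returned to the requesting user.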

According to another embodiment, a mechanism may also be provided for the user to write new data to the mounted volume. As described above, the mounted volume data may appear to the user as an ordinary network drive or other network-stored data. In reality, however, the original mounted volume data is actual backup data that generally needs to be protected, at least until another backup data set has been generated. It may therefore be undesirable to allow the user to actually modify the original backup data. So that the user can modify the data corresponding to the mounted volume while modification of the backup data is prevented, a mechanism is provided that redirects writes to another location on the storage medium, as described below.

FIG. 17 is a flowchart illustrating a method for processing a write request according to an embodiment of the present invention. In an initial step 350, the user requests an NFS or CIFS write operation (typically by selecting a "Save" option while editing or viewing a data file). The volume recovery application executes the write request by locating available storage space, writing the data to that space, and updating the appropriate file descriptor to reference the newly written data.

According to one embodiment, the volume recovery application queries whether storage space for the write data has already been allocated (step 352) and, if not, allocates storage space (step 354). The storage space may be allocated on the backup storage medium 126 (FIG. 13). The allocated storage space may be specifically designated to hold only write data (associated metadata is optional).
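By way of illustration only, the check-then-set-aside behavior of steps 352 and 354 might be sketched as follows. The class name `WriteRedirector`, the in-memory `io.BytesIO` standing in for space on the backup storage medium, and the record layout are assumptions made for this sketch and are not taken from the described embodiment.

```python
import io


class WriteRedirector:
    """Hypothetical helper that keeps user writes apart from the backup data."""

    def __init__(self):
        self.write_area = None  # space reserved for write data only
        self.records = []       # (original_offset, write_offset, length)

    def handle_write(self, original_offset, data):
        # Step 352: has storage space for write data been set aside yet?
        if self.write_area is None:
            # Step 354: set it aside on first use (in memory, for this sketch).
            self.write_area = io.BytesIO()
        # Append the new bytes; the original backup data is never touched.
        write_offset = self.write_area.seek(0, io.SEEK_END)
        self.write_area.write(data)
        # Record where the bytes logically belong relative to the original file.
        self.records.append((original_offset, write_offset, len(data)))
        return write_offset
```

Appending every write to a single area keeps the original data read-only and leaves the records as the only link between the two, which is the role the description later assigns to header metadata.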

FIG. 18 illustrates an example of NFS or CIFS write data stored on the backup storage medium 126. The write data 360 includes, for example, two written portions, w1 362 and w2 364, corresponding to data stored as a result of write commands serviced by the volume recovery application. For example, w1 and w2 may correspond to modified data files contained in the mounted data volume. Although two write requests are illustrated, it should be appreciated that the principles of the invention apply to any number of write requests, and that the file may be modified accordingly. The write data 360 also includes a header containing metadata that describes the relationship between the original data (e.g., file 332) and the newly written data 360. In particular, as described below with further reference to FIG. 19, the header may include offset information indicating where the written data portions w1 and w2 logically reside relative to the original data.

FIG. 19 illustrates an example of a system file layout after two write requests have been serviced. The original system file 332 is stored on the backup storage medium 126 (FIG. 13) and is presented to the user through the mounting procedure described above. The system file 332 illustrated in FIG. 19 is in a data format, and its data portion 338 may contain a plurality of data files (e.g., user files). The data begins at offset zero bytes (point 370) and ends later at point 372. The written file 360 holds the data written to file 332 in response to user requests. For example, the user may modify two data files contained in the system file 332, with the result that the written file 360 contains w1 and w2. As described above, the written file 360 may be stored on the storage medium separately from file 332 so that the original backup data is not altered. A logically modified system file 380 is also illustrated, representing file 332 together with the changes made by the user through write requests (i.e., the written file 360). That is, in the modified system file 380, w1 and w2 (the user-modified data files) may be used in place of the original data files contained in the data portion of the original system file 332, without removing the backed-up data.

As illustrated in FIG. 19, the modified system file corresponds to a logical merging of the original system file 332 and the written file 360. As shown, the original system file data 338 begins at offset zero within the original file. At offset 64 (reference numeral 384), the first portion of modified data (W1) begins; it ends 9 bytes later, at offset 73 (reference numeral 386). Accordingly, W1, a user-modified data file resulting from a user write request, may be used in place of the original data file located at offset 64 in the original system file 332. W1 resides in the written file 360 from offset 0 (390) to offset 9 (392), so the length of W1 is 9 bytes. The starting position of W1 within the modified file (offset 64 in the illustrated example) is determined by information stored in the header 366, namely the relative relationship between the written file 360 and the original file 332. The W2 portion is also included in the modified file 380; it begins at offset 1032 (the original end of the file, reference numeral 372) and logically extends the file by 100 bytes. The length of W2 is likewise determined from information located in the header 366. The new end point of the file is indicated by reference numeral 388.
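The offsets in this example can be checked with a short sketch. The function name `logical_layout` and the tuple layout are hypothetical, and the sketch assumes that writes are sorted by offset, do not overlap, and each replace exactly as many original bytes as they contain (consistent with the original data resuming at offset 73).

```python
def logical_layout(original_length, writes):
    """Return (source, source_offset, length) segments of the modified file.

    writes: list of (original_offset, write_offset, length) tuples, i.e. the
    kind of offset information the description places in the header.
    """
    segments = []
    position = 0  # how far into the original file we have accounted for
    for original_offset, write_offset, length in writes:
        if original_offset > position:
            # Untouched original data that precedes this write.
            segments.append(("original", position, original_offset - position))
        segments.append(("written", write_offset, length))
        # Skip the replaced bytes; a write at the end of the file only appends.
        position = max(position, min(original_offset + length, original_length))
    if position < original_length:
        # Untouched original data after the last write.
        segments.append(("original", position, original_length - position))
    return segments


# Values from the example: a 1032-byte original, W1 (9 bytes at offset 64)
# and W2 (100 bytes appended at offset 1032).
layout = logical_layout(1032, [(64, 0, 9), (1032, 9, 100)])
new_length = sum(length for _, _, length in layout)
```

With these inputs the segments are 64 bytes of original data, the 9 bytes of W1, 959 bytes of original data starting at offset 73, and the 100 bytes of W2, for a logical length of 64 + 9 + 959 + 100 = 1132 bytes.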

Although the modified file is generated logically and presented as a user-modified version of the original file, the newly written data represented by file 360 is not actually stored as part of the original file 332. Instead, as described above, the newly written data is stored at a specific location on the storage medium identified for write data. In this manner, the user can apparently write to the mounted volume just as to an ordinary local or network drive, while the integrity of the original backup data is maintained.

The modified file 380 includes a header 382 containing a file descriptor that represents the modified file. FIG. 20 illustrates an example of such a file descriptor 400. The file descriptor 400 includes a name field 402 that identifies the file name of the modified file 380 and a security field 404 that identifies the permission attributes of the modified file 380. The file descriptor 400 also includes a plurality of data fields, including pointers to the original file 332 and pointers to the written file 360, used to retrieve the data stored in each of those files. Following the linked list of pointers given in the file descriptor 400 in sequence yields a representation of the modified file 380.

An example of a file descriptor for the modified file is described with reference to FIGS. 19 and 20. The first data field 406 holds a pointer to the location of the first data file in the modified file 380, at offset zero bytes, identified by reference numeral 408 in FIG. 19. The following field 410 indicates the length of the data file located by pointer 406. In the illustrated example, as seen in FIG. 19, the length is 64 bytes (the data extends between the zero offset point 408 and the 64-byte offset 384). The next field 412 indicates that the next data file in the modified file 380 is W1, as illustrated in FIG. 19. Accordingly, pointer 414 indicates that the data corresponding to W1 is stored in the newly written file 360 at the zero offset point (reference numeral 390, FIG. 19). The length field 416 indicates that the length of W1 is 9 bytes; as seen in FIG. 19, W1 extends between offset 64 (384) and offset 73 (386) in the modified file 380. The next field 418 indicates that the next data file in the modified file 380 is a data file from the original system file 332. The pointer in field 420 indicates that this data file is located at offset 73 (reference numeral 386 in FIG. 19) in the modified file 380. Field 422 indicates that the length of the data file is 959 bytes, as illustrated in FIG. 19. The next field 424 indicates that the following data file is W2. The pointer in field 426 indicates the location of W2, namely offset 9 of the newly written file 360, as illustrated in FIG. 19. Field 428 indicates that the length of W2 is 100 bytes, and the next field 430 contains a null, indicating that W2 is the last data file in the modified file 380, as illustrated in FIG. 19. The file descriptor 400 thus contains a "roadmap" showing the structure of the modified file 380 and the locations of the data it contains.
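The "roadmap" can be pictured as a linked list of extents. The sketch below is illustrative only: the `Extent` class, the `read_modified` function, and the in-memory byte strings standing in for files 332 and 360 are assumptions, not the structure disclosed for file descriptor 400.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Extent:
    source: str                      # "original" (file 332) or "written" (file 360)
    offset: int                      # where the data starts within that source
    length: int                      # number of bytes in this extent
    next: Optional["Extent"] = None  # None marks the last extent (the null field)


def read_modified(head, sources):
    """Follow the linked list and assemble a buffer holding the modified file."""
    buffer = bytearray()
    node = head
    while node is not None:
        data = sources[node.source]
        buffer += data[node.offset:node.offset + node.length]
        node = node.next
    return bytes(buffer)


# Stand-in contents: a 1032-byte original and a 109-byte written file (W1 + W2).
sources = {"original": b"o" * 1032, "written": b"1" * 9 + b"2" * 100}

# Extents mirroring the example values: lengths 64, 9, 959 and 100.
w2 = Extent("written", 9, 100)
rest = Extent("original", 73, 959, w2)
w1 = Extent("written", 0, 9, rest)
head = Extent("original", 0, 64, w1)

modified = read_modified(head, sources)
```

Reading never copies the written data into the original file; the modified view exists only in the buffer assembled for the requesting user.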

The volume recovery application and methods described above present sequential tape-format data in a form compatible with a random access I/O system such as NFS or CIFS. A linked-list file descriptor such as file descriptor 400 can be used to convert sequential tape-format data into randomly accessible data by recording, for example, the position of each data file within a particular tar stream relative to the other data files in that tar stream, together with the location of each data file on the storage medium. In addition, according to one embodiment, the volume recovery application may include a provision for presenting the changed (i.e., written) data back in tape (e.g., tar) format, so that the backup/recovery application can access the data in the usual manner described above. According to one embodiment, the instant recovery application includes a facility for generating a virtual cartridge properly formatted with tape headers, pads, data, and file markers in the manner described above with respect to the file system software. In another embodiment, the volume recovery application may interface with the file system software to generate such a virtual cartridge containing the newly written and modified files.
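As a rough illustration of recording positions within a tar stream, the sketch below uses Python's standard `tarfile` module to build a name-to-location index and then reads one file by seeking directly to its data. The function names `build_tar_index` and `read_member`, the sample file names, and the in-memory archive are assumptions for this sketch; the described system's own metadata format is not shown here.

```python
import io
import tarfile


def build_tar_index(stream):
    """Scan a tar stream once; map each regular file to (data offset, size)."""
    index = {}
    with tarfile.open(fileobj=stream, mode="r:") as archive:
        for member in archive:
            if member.isfile():
                # offset_data is where the file's bytes begin in the stream.
                index[member.name] = (member.offset_data, member.size)
    return index


def read_member(stream, index, name):
    """Random access: seek straight to the recorded offset and read the file."""
    offset, size = index[name]
    stream.seek(offset)
    return stream.read(size)


# Build a small uncompressed tar stream in memory to stand in for tape data.
raw = io.BytesIO()
with tarfile.open(fileobj=raw, mode="w") as archive:
    for name, payload in [("a.txt", b"alpha"), ("b.txt", b"bravo!")]:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
raw.seek(0)

index = build_tar_index(raw)
second = read_member(raw, index, "b.txt")
```

The sequential scan happens once, when the index is built; afterwards any file can be read without walking the stream from its beginning.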

Although the invention has been described primarily in terms of software, such as the synthetic full backup application, the end-user recovery application, and the volume recovery application, it should be appreciated that it may alternatively be implemented in software, hardware, or firmware, or any combination thereof. Accordingly, embodiments of the invention may include any computer-readable medium (e.g., computer memory, a floppy disk, a compact disk, a tape, etc.) encoded with a computer program that, when executed at least in part on a processor of the storage system, performs the functions of the synthetic full backup application and/or the end-user recovery application described above.

In summary, embodiments of the invention include storage systems and methods that emulate a conventional tape backup system while providing enhanced functionality, such as allowing end users to view or recover backed-up files and allowing synthetic backups to be generated. However, various aspects of the invention may be used for purposes other than backing up computer data. Because a storage system according to the invention can be used to economically store large amounts of data that can be accessed randomly, rather than sequentially, at hard disk access times, embodiments of the invention may be implemented outside conventional backup storage systems. For example, embodiments of the invention may be used to store video or audio data, enabling video and/or audio on demand with a wide selection of movies and music.

Having described in detail several aspects of one or more embodiments of the invention, it should be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art. Such alterations, modifications, and improvements are intended to be part of this disclosure and to fall within the spirit of the invention. Accordingly, the foregoing description and drawings are by way of example only.

FIG. 1 is a block diagram illustrating an example of a large networked computing environment that includes a backup storage system.
FIG. 2 is a block diagram of one embodiment of a networked computing environment including a storage system according to the present invention.
FIG. 3 is a block diagram of one embodiment of a storage system according to the present invention.
FIG. 4 is a block diagram illustrating a virtual layout of one embodiment of a storage system according to the present invention.
FIG. 5 is a schematic layout of an example of a system file according to an embodiment of the present invention.
FIG. 6 illustrates an example of a tape directory structure according to an embodiment of the present invention.
FIG. 7 illustrates an example of a method for generating a synthetic full backup according to an embodiment of the present invention.
FIG. 8 is a schematic illustration of an example of a series of backup data sets including a synthetic full backup according to an embodiment of the present invention.
FIG. 9 illustrates an example of a metadata cache structure.
FIG. 10 illustrates an example of a virtual cartridge storing a synthetic full backup data set.
FIG. 11 illustrates another example of a virtual cartridge storing a synthetic full backup data set.
FIG. 12 is a flowchart of one embodiment of a method for recovering data from a backup storage system according to an embodiment of the present invention.
FIG. 13 is a block diagram of another embodiment of a networked computing environment including a backup storage system according to an embodiment of the present invention.
FIG. 14 illustrates an example of a file descriptor structure according to an embodiment of the present invention.
FIG. 15 illustrates an example of how file data may be stored in tape format.
FIG. 16 illustrates a file descriptor for the file illustrated in FIG. 15.
FIG. 17 is a flowchart of a method for writing data to a mounted data volume according to an embodiment of the present invention.
FIG. 18 illustrates an example of a newly written file.
FIG. 19 illustrates an example of the relationship among an original file, a newly written file, and the resulting modified file according to an embodiment of the present invention.
FIG. 20 illustrates an example of a file descriptor representing the modified file illustrated in FIG. 19.

Claims

A method comprising:
receiving, from a backup/recovery application, data formatted for storage on a sequential format storage medium;
converting the data into backup data having a format suitable for storage on, and retrieval from, a random access storage medium;
storing the backup data on the random access storage medium;
mounting a data volume on a host computer, the data volume including mounted data corresponding to at least a portion of the backup data; and
permitting changes to be made to the mounted data without modifying the backup data.

The method of claim 1, further comprising authenticating access privileges of an operator of the host computer.

The method of claim 2, further comprising restricting access to the mounted data based on the access privileges of the operator.

The method of any of claims 1 to 3, further comprising storing, on the host computer, the changes made to the mounted data.

The method of any of claims 1 to 4, further comprising downloading the mounted data onto the host computer.

A system comprising:
a backup storage medium including at least one random access disk array;
means for receiving sequential format data from a backup/recovery application and converting the sequential format data into backup data having a format suitable for storage on the backup storage medium, the format permitting non-sequential access to the backup data; and
a user interface configured to provide access to the backup data.

The system of claim 6, wherein the backup data includes a plurality of files, and wherein the user interface is configured to present a hierarchical directory structure of the plurality of files.

The system of claim 6 or 7, wherein the backup data includes at least one backup data file, and wherein the user interface is configured to accept new write data corresponding to a modified version of the at least one backup data file while preserving the backup data.

The system of claim 8, further comprising a second storage medium, wherein the new write data is stored on the second storage medium.

The system of claim 8, wherein the new write data includes metadata describing a relationship between the new write data and the at least one backup data file.

The system of any of claims 6 to 10, further comprising a logical metadata cache containing a searchable collection of data, wherein the user interface is configured to provide access to the searchable collection of data so as to permit an operator to locate individual backup files that form part of the backup data.

A method of recovering a data volume from a backup storage system, the method comprising:
determining that a current full backup data set is not available;
generating a synthetic full backup data set corresponding to the current full backup data set; and
recovering the data volume to a host computer by exporting the synthetic full backup data set as a requested volume mount.

The method of claim 12, wherein an existing full backup data set including a first plurality of data files and an incremental full backup data set including a second plurality of data files are stored on the backup storage system, and wherein generating the synthetic full backup data set includes:
determining a most recent copy of each data file included in the first and second pluralities of data files;
storing a set of indicators, each indicating a storage location of the most recent copy of the respective data file in one of the existing full backup data set and the incremental full backup data set; and
generating, based on the set of indicators, the synthetic full backup data set corresponding to the most recent copy of each data file.