CN1997972A - Emulated storage system supporting instant volume restore - Google Patents

Emulated storage system supporting instant volume restore

Info

Publication number
CN1997972A
Authority
CN
China
Legal status
Pending
Application number
CN 200480030746
Other languages
Chinese (zh)
Inventor
米克洛斯·桑多菲
Current Assignee
Sepaton Inc
Original Assignee
Sepaton Inc
Application filed by Sepaton Inc
Publication of CN1997972A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

In a back-up storage system, an apparatus and methods for mounting a data volume corresponding to a back-up data set to a host computer. In one example, a method includes mounting a data volume on a host computer, the data volume comprising at least one data file, the data file corresponding to a most recently backed-up version of the at least one data file stored on a backup storage system, and storing, on the backup storage system, data corresponding to a second version of the at least one data file that is more recent than the most recently backed-up version of the at least one data file stored on the backup storage system while preserving the most recently backed-up version of the at least one data file.

Description

Emulated storage system supporting instant volume restore
Technical field
Aspects of the present invention relate to data storage and, more particularly, to apparatus and methods for emulating a tape storage system so as to provide the equivalent of a full backup from an existing full backup and subsequent incremental backups, and to enable end users to restore data from such backups.
Background
Many computer systems include one or more host computers and one or more data storage systems that store data used by the hosts. These hosts and storage systems are typically networked together over a Fibre Channel network, an Ethernet network, or another type of communication network. Fibre Channel is a standard that combines the speed of channel-based transmission schemes with the flexibility of network-based transmission schemes. It allows multiple initiators to communicate with multiple targets over a network, where the initiators and targets may be any devices coupled to the network. Fibre Channel is typically implemented over a fast transmission medium, such as optical fiber cable, and is therefore a common choice for storage networks that transfer large amounts of data.
One example of a typical networked computing environment, including several hosts and a backup storage system, is shown in Figure 1. One or more application servers 102 are coupled via a local area network (LAN) 103 to a number of user computers 104. Both the application servers 102 and the user computers 104 may be considered "hosts". The application servers 102 are coupled to one or more primary storage devices 106 via a storage area network (SAN) 108. The primary storage devices 106 may be, for example, disk arrays such as those available from EMC Corporation, IBM Corporation, and similar companies. Alternatively, a data bus (not shown) or other network link may provide the interconnection between the application servers and the primary storage devices 106. The bus and/or Fibre Channel network connection may operate using a protocol such as the Small Computer System Interface (SCSI) protocol, which dictates the format of the packets transferred between the hosts (e.g., the application servers 102) and the storage devices 106.
It will be appreciated that the networked computing environment illustrated in Figure 1 is typical of a large system, such as one used by a large financial institution or corporation. Many networked computing environments do not include all of the elements shown in Figure 1. For example, a smaller environment may simply include hosts connected to a storage system directly or via a LAN. In addition, although Figure 1 illustrates separate user computers 104, application servers 102, and media servers 114, these functions may be combined in one or more computers.
In addition to the primary storage devices 106, many networked computing environments include at least one secondary or backup storage system 110. The backup storage system 110 is typically a tape library, although other high-capacity, reliable secondary storage systems may be used. Typically, these secondary storage systems are slower than the primary storage devices, but they include some type of removable media (e.g., tapes, magnetic disks, or optical disks) that can be removed and stored off-site.
In the illustrated example, the application servers 102 may be able to communicate directly with the backup storage system 110 via, for example, an Ethernet or other communication link 112. However, such a connection may be relatively slow and may consume resources such as processor time or network bandwidth. Therefore, a system such as the one illustrated may include one or more media servers 114 that provide a communication link, for example over Fibre Channel, between the SAN 108 and the backup storage system 110.
The media servers 114 may run software that includes a backup/restore application, which controls the transfer of data between the hosts (such as the user computers 104, the media servers 114, and/or the application servers 102), the primary storage devices 106, and the backup storage system 110. Backup/restore applications are available from companies such as Veritas, Legato, and others. For data protection, data from the various hosts and/or primary storage devices in a networked computing environment may be backed up periodically to the backup storage system 110 by a backup/restore application, as is known in the art.
Of course, as discussed above, many networked computing environments are smaller and include fewer components than the exemplary environment shown in Figure 1. It will therefore be appreciated that the media server 114 may in fact be combined with the application server 102 in a single host, and that the backup/restore application may run on any host that is coupled, directly or indirectly (for example, through a network), to the backup storage system 110.
One example of a typical backup storage system is a tape library that includes a number of tape cartridges, at least one tape drive, and a robotic mechanism that controls the loading and unloading of tape cartridges into and out of the tape drives. The backup/restore application instructs the robotic mechanism to locate a particular tape cartridge, for example tape number 0001, and load it into a tape drive so that data can be written to the tape. The backup/restore application also controls the format in which data is written to the tape. Typically, the backup/restore application uses SCSI commands, or other standardized commands, to instruct the robotic mechanism and to control the tape drives, both to write data to the tapes and to recover previously written data from them.
Conventional tape library backup systems suffer from a number of problems, including speed, reliability, and fixed capacity. Many large companies need to back up terabytes of data each week. However, even expensive, high-end tapes can usually read and write data only at speeds of 30-40 megabytes per second (MB/s), which translates to about 50 gigabytes per hour (GB/hr). Backing up one or two terabytes of data to a tape backup system may therefore take at least 10 to 20 hours of continuous data transfer time.
In addition, most tape manufacturers will not guarantee that data can be stored to, or recovered from, a tape that has been dropped (which may happen relatively frequently in a typical tape library, through either human or robotic handling, while tapes are being moved or loaded) or that has been exposed to non-ideal environmental conditions, such as extremes of temperature or humidity. Considerable effort is therefore needed to keep tapes in a controlled environment. Furthermore, the complex machinery of a tape library (including the robotic mechanism) is expensive to maintain, and individual tape cartridges are relatively expensive and have limited lifespans.
Summary of the invention
Embodiments of the present invention provide a backup storage system that overcomes or alleviates some or all of the problems of conventional tape-based systems and that may offer greater flexibility than such systems.
In broad overview, aspects and embodiments of the present invention provide a random-access-based storage system that emulates a conventional tape backup storage system, so that a backup/restore application sees the same view of devices and media as it would with a physical tape library. The storage system uses software and hardware to emulate physical tape media, replaces those media with one or more arrays of random-access disks, and translates tape-format, linear, sequential data into data suitable for storage on disk. In addition, applications implemented in hardware and/or software are provided for recovering the data stored on the backup storage system.
According to some aspects and embodiments of the invention, means are provided for converting sequential tape-format data into a form suitable for random-access I/O. In one embodiment, the means include provisions for mounting a volume on a host as an NFS (Network File System) or CIFS (Common Internet File System) mount backed by a translated representation of the tape-format data.
According to other aspects and embodiments of the invention, means are provided for redirecting writes made to the mounted file system into "safe storage", so that the original data remains unchanged. In one embodiment, means are provided for tracking the translation of the original data in real time so that random-access I/O is possible. In another embodiment, means are provided for converting newly written data back into tape-format data suitable for sequential, tape-style I/O.
In one embodiment, a method includes mounting a data volume on a host, the data volume comprising at least one data file that corresponds to the most recently backed-up version of the at least one data file stored on a backup storage system, and storing, on the backup storage system, data corresponding to a second version of the at least one data file that is more recent than the most recently backed-up version, while preserving the most recently backed-up version. The method may also include linking the most recently backed-up version of the at least one data file with the second version. In one embodiment, the method may also include creating a data structure that identifies both the most recently backed-up version and the second version of the at least one data file. In another embodiment, the second version of the at least one data file may be a modified version of the most recently backed-up version.
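The method above can be sketched as a write overlay on top of read-only backup data. The following is a minimal illustration only, not the patent's implementation: the class name `MountedVolume`, its in-memory dictionaries, and the method names are all hypothetical stand-ins for the mounted volume and the backup storage system.

```python
class MountedVolume:
    """Hypothetical sketch: a mounted view of a backup that never overwrites backed-up data."""

    def __init__(self, backed_up_files):
        # Most recently backed-up version of each file, treated as read-only.
        self._backup = dict(backed_up_files)
        # Second (newer) versions written through the mount, kept separately.
        self._new_versions = {}

    def read(self, name):
        # Prefer the newer version if one was written; otherwise use the backup.
        if name in self._new_versions:
            return self._new_versions[name]
        return self._backup[name]

    def write(self, name, data):
        # Store the new version alongside the backup instead of replacing it.
        self._new_versions[name] = bytes(data)

    def versions(self, name):
        # A simple structure identifying both versions of a file (None if absent).
        return {
            "backed_up": self._backup.get(name),
            "second": self._new_versions.get(name),
        }


volume = MountedVolume({"report.txt": b"v1"})
volume.write("report.txt", b"v2")
print(volume.read("report.txt"))
print(volume.versions("report.txt"))
```

Reads through the mount see the newer data, while the backed-up version stays available unchanged, which is the property the method requires.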
According to another embodiment, a backup storage system includes a backup storage medium for storing a backup data set, and a controller that includes at least one processor configured to execute a series of instructions that perform the method described above.
According to yet another embodiment, a computer-readable medium is provided that stores a data structure. The data structure includes a first identifier that uniquely identifies a system file corresponding to a backup data set comprising at least one data file, and at least one second identifier that identifies the respective storage location, on a storage medium, of the most recent version of each data file in the backup data set.
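As a rough illustration of such a data structure, the sketch below pairs one identifier for the system file with a per-file map of storage locations. The field names (`system_file_id`, `cartridge`, `offset`, `length`) are assumptions chosen for the example; the text does not specify how the identifiers are encoded.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StorageLocation:
    """Hypothetical second identifier: where the latest version of a file lives."""
    cartridge: str  # label of an (assumed) virtual cartridge
    offset: int     # byte offset within that cartridge
    length: int     # number of bytes stored


@dataclass
class BackupSetIndex:
    """Hypothetical structure: first identifier plus one location per data file."""
    system_file_id: str                         # first identifier (the system file)
    latest: dict = field(default_factory=dict)  # file name -> StorageLocation

    def record(self, name, location):
        # Point the entry at the most recent version of the file.
        self.latest[name] = location

    def locate(self, name):
        return self.latest[name]


index = BackupSetIndex(system_file_id="backup-set-A")
index.record("report.txt", StorageLocation("0001", offset=0, length=2048))
index.record("report.txt", StorageLocation("0002", offset=512, length=4096))
print(index.locate("report.txt"))
```

Recording a file a second time replaces its location, so a lookup always resolves to the most recent version.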
Brief description of the drawings
The drawings are not drawn to scale. In the drawings, each identical or nearly identical component illustrated in the various figures is represented by the same numeral. For clarity, not every component is labeled in every drawing. In the drawings:
Figure 1 is a block diagram of one example of a large-scale networked computing environment that includes a backup storage system;
Figure 2 is a block diagram of one embodiment of a networked computing environment that includes a storage system according to aspects of the invention;
Figure 3 is a block diagram of one embodiment of a storage system according to aspects of the invention;
Figure 4 is a block diagram illustrating the virtual layout of one embodiment of a storage system according to aspects of the invention;
Figure 5 is a schematic diagram of one example of a system file according to aspects of the invention;
Figure 6 is one example of a machine-readable catalog according to aspects of the invention;
Figure 7 is a diagram illustrating one example of a method of creating a synthetic full backup according to aspects of the invention;
Figure 8 is a schematic diagram of one example of a series of backup data sets that includes a synthetic full backup, according to aspects of the invention;
Figure 9 is a diagram of one example of a metadata cache structure;
Figure 10 is a diagram of one example of a virtual cartridge storing a synthetic full backup data set;
Figure 11 is a diagram of another example of a virtual cartridge storing a synthetic full backup data set;
Figure 12 is a flow diagram of one embodiment of a method for restoring data from the backup storage system according to aspects of the invention;
Figure 13 is a block diagram of another example of a networked computing environment that includes a backup storage system according to aspects of the invention;
Figure 14 is a diagram of one example of a file descriptor structure according to aspects of the invention;
Figure 15 is a diagram illustrating one example of how file data is stored in tape format;
Figure 16 is a diagram illustrating a file descriptor for the file depicted in Figure 15;
Figure 17 is a flow diagram of one embodiment of a method of writing data to a mounted data volume according to the invention;
Figure 18 is a schematic diagram of a newly written file;
Figure 19 is a schematic diagram of one example of the relationship between original data, a newly written file, and the resulting modified file, according to aspects of the invention; and
Figure 20 is a diagram of one example of a file descriptor for the modified file of Figure 19.
Detailed description
Various embodiments and aspects are described below in more detail with reference to the accompanying figures. It will be appreciated that the invention is not limited in its application to the details of construction and the arrangement of components set forth in the following description or illustrated in the drawings. The invention is capable of other embodiments and of being practiced or carried out in various ways. Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of "including", "comprising", "having", "containing", "involving", and variations thereof is meant to encompass the items listed thereafter and equivalents thereof, as well as additional items.
As used herein, the term "host" refers to any computer that has at least one processor, such as a personal computer, a workstation, a mainframe, a networked client, or a server, and that is capable of communicating with other devices, such as a storage system or other hosts. Hosts may include media servers and application servers (as described above with reference to Figure 1) as well as user computers (which may be user workstations, personal computers, mainframes, etc.). In addition, within this disclosure, the term "networked computer environment" includes any computing environment in which a plurality of hosts are connected to one or more shared storage systems in such a manner that the storage systems can communicate with each of the hosts. Fibre Channel is one example of a communication network that may be used with embodiments of the invention. However, the networks described herein are not limited to Fibre Channel; the various network components may communicate with one another over any network connection, such as Token Ring or Ethernet, instead of or in addition to Fibre Channel, or over a combination of different network connections. Moreover, aspects of the invention may also be used in bus topologies, such as SCSI or parallel SCSI.
According to various embodiments and aspects of the invention, a virtual removable-media library backup storage system is provided that may use one or more disk arrays to emulate a removable-media-based storage system. Using embodiments of the invention, data can be backed up to the disk arrays with the same backup/restore application that would have been used to back up the data to removable media (such as tapes, magnetic disks, or optical disks), so that the user need not modify or adjust existing backup procedures or purchase a new backup/restore application. In one embodiment, described in detail herein, the emulated removable media are tapes, and the backup storage system emulates a tape library system, including the tapes and the robotic mechanism used to handle tapes in a conventional tape library.
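To make the idea of emulating both the tapes and the robotic mechanism concrete, the sketch below models a library as a set of cartridge labels in slots plus a list of drives, with load and unload operations of the kind a backup/restore application would request. The class `VirtualTapeLibrary` and its methods are hypothetical and purely illustrative; a real emulation would answer SCSI commands rather than Python calls.

```python
class VirtualTapeLibrary:
    """Hypothetical model of an emulated library: cartridges in slots, plus drives."""

    def __init__(self, labels, drive_count):
        self.slots = set(labels)            # cartridges currently resting in slots
        self.drives = [None] * drive_count  # label loaded in each drive, or None

    def load(self, label, drive):
        # Emulates the robot moving a cartridge from its slot into a drive.
        if label not in self.slots:
            raise KeyError(f"cartridge {label} is not in a slot")
        if self.drives[drive] is not None:
            raise RuntimeError(f"drive {drive} is already occupied")
        self.slots.remove(label)
        self.drives[drive] = label

    def unload(self, drive):
        # Emulates the robot returning the cartridge to a slot.
        label = self.drives[drive]
        if label is None:
            raise RuntimeError(f"drive {drive} is empty")
        self.drives[drive] = None
        self.slots.add(label)
        return label


library = VirtualTapeLibrary(["0001", "0002"], drive_count=1)
library.load("0001", drive=0)
print(library.drives)
print(library.unload(0))
```

Because no physical movement is involved, a "load" is only a bookkeeping change, yet the calling application observes the same sequence of states it would see with a physical library.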
A storage system according to aspects of the invention includes hardware and software that together interface with a host (running the backup/restore application) and with a backup storage medium. The storage system may be designed to emulate tapes, or other types of removable storage media, so that the backup/restore application sees the same view of devices and media as it would with a physical tape library, and to translate linear, sequential, tape-format data into data suitable for storage on random-access disks. In this manner, the storage system may provide enhanced functionality (for example, allowing users to search for individual backed-up user files, as discussed below) without requiring new backup/restore application software or policies.
Referring to Figure 2, one embodiment of a networked computing environment that includes a backup storage system 170 according to aspects of the invention is illustrated in block diagram form. As illustrated, a host 120 is coupled to the storage system 170 via a network connection 121. The network connection 121 may be, for example, a Fibre Channel connection that allows high-speed transfer of data between the host 120 and the storage system 170. It will be appreciated that the host 120 may be, or may include, one or more application servers 102 (see Figure 1) and/or media servers 114 (see Figure 1), and may back up data from any computer present in the networked computing environment or from a primary storage device 106 (see Figure 1). In addition, one or more user computers 136 may be coupled to the storage system 170 via another network connection 138, such as an Ethernet connection. As discussed in detail below, the storage system may enable users of the user computers 136 to view and optionally restore backed-up user files from the storage system.
The storage system includes a backup storage medium 126, which may be, for example, one or more disk arrays, as explained in more detail below. The backup storage medium 126 provides the actual storage space for the data backed up from the host 120. However, the storage system 170 may also include software and additional hardware that emulate a removable-media storage system, such as a tape library, so that to the backup/restore application running on the host 120 the data appears to be backed up onto conventional removable storage media. Thus, as illustrated in Figure 2, the storage system 170 may include "emulated media" 134, which represent, for example, virtual or emulated removable storage media (e.g., tapes). The emulated media 134 are presented to the host by the storage system software and/or hardware and appear to the host 120 as physical storage media. A storage system controller (not shown) and a switching network 132 may further interface between the emulated media 134 and the actual backup storage medium 126; they receive data from the host 120 and store it on the backup storage medium 126, as discussed in more detail below. In this manner, the storage system "emulates" a conventional tape storage system to the host 120.
According to one embodiment, the storage system may include a "logical metadata cache" 242 that stores metadata relating to the user data backed up from the host 120 to the storage system 170. As used herein, the term "metadata" refers to data that represents information about user data and describes attributes of the actual user data. The logical metadata cache 242 is a searchable collection of data that enables users and/or software applications to randomly locate backed-up user files, compare user files with one another, and otherwise access and manipulate backed-up user files. Two examples of software applications that may use the data stored in the logical metadata cache 242 are a synthetic full backup application 240 and an end-user restore application 300, both of which are discussed more fully below.
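A searchable metadata collection of this kind might look roughly like the following sketch. The class `MetadataCache`, the attributes it records (backup identifier, size, SHA-256 digest), and the comparison by digest are all assumptions made for illustration; the text does not say which attributes the logical metadata cache 242 holds.

```python
import hashlib


class MetadataCache:
    """Hypothetical searchable index of metadata about backed-up user files."""

    def __init__(self):
        self._entries = {}  # file path -> list of metadata records, oldest first

    def add(self, path, backup_id, data):
        # Record attributes that describe the user data, not the data itself.
        record = {
            "backup_id": backup_id,
            "size": len(data),
            "digest": hashlib.sha256(data).hexdigest(),
        }
        self._entries.setdefault(path, []).append(record)

    def locate(self, path):
        # Which backups contain this file? Empty list if it was never backed up.
        return [r["backup_id"] for r in self._entries.get(path, [])]

    def same_content(self, path_a, path_b):
        # Compare the latest recorded versions of two files by digest.
        latest_a = self._entries[path_a][-1]
        latest_b = self._entries[path_b][-1]
        return latest_a["digest"] == latest_b["digest"]


cache = MetadataCache()
cache.add("/home/a/notes.txt", "full-01", b"hello")
cache.add("/home/a/notes.txt", "incr-02", b"hello again")
cache.add("/home/b/copy.txt", "incr-02", b"hello again")
print(cache.locate("/home/a/notes.txt"))
print(cache.same_content("/home/a/notes.txt", "/home/b/copy.txt"))
```

Keeping only attributes in the cache means lookups and comparisons can be answered without reading the backed-up data itself.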
In broad overview, the synthetic full backup application 240 is capable of creating a synthetic full backup data set from one existing full backup data set and one or more incremental backup data sets. Synthetic full backups can eliminate the need to perform periodic (e.g., weekly) full backups, thereby saving considerable time and network resources. Details of the synthetic full backup application 240 are discussed further below. The end-user restore application 300, also discussed further below, enables end users (e.g., operators of the user computers 136) to browse, search, view, and/or restore previously backed-up user files from the storage system 170.
As discussed above, the storage system 170 includes hardware and software that interface with the host 120 and the backup storage medium 126. Together, the hardware and software of embodiments of the invention can emulate a conventional tape library backup system, so that from the point of view of the host 120 the data appears to be backed up onto tape, when it is in fact backed up onto another storage medium, such as a plurality of disk arrays.
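The core idea of a synthetic full backup can be sketched as a merge in which newer versions of a file replace older ones. The function below is a simplified illustration under stated assumptions: backup sets are modeled as plain dictionaries from file name to contents, incrementals are supplied oldest first, and file deletions are ignored. It is not the synthetic full backup application 240 itself.

```python
def synthesize_full(full, incrementals):
    """Build a synthetic full backup (hypothetical sketch).

    full         -- dict mapping file name to contents, from the last full backup
    incrementals -- list of such dicts, ordered oldest to newest
    """
    synthetic = dict(full)      # start from the existing full backup
    for incremental in incrementals:
        # Files changed or added since the previous backup replace older versions.
        synthetic.update(incremental)
    return synthetic


full_backup = {"a.txt": b"a1", "b.txt": b"b1"}
monday = {"a.txt": b"a2"}
tuesday = {"c.txt": b"c1", "a.txt": b"a3"}
print(synthesize_full(full_backup, [monday, tuesday]))
```

The result holds the latest version of every file without rereading any data from the hosts, which is where the savings in time and network resources come from.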
Referring to Figure 3, one embodiment of a storage system 170 according to aspects of the invention is illustrated in block diagram form. In one embodiment, the hardware of the storage system 170 includes a storage system controller 122 and a switching network 132 that connects the storage system controller 122 to the backup storage medium 126. The storage system controller 122 includes a processor 127 (which may be a single processor or multiple processors) and a memory 129 (such as RAM, ROM, PROM, EEPROM, flash memory, or a combination thereof) and may run all or part of the storage system software. The memory 129 may also be used to store metadata relating to the data stored on the backup storage medium 126. The software (including the program code that implements embodiments of the invention) is generally stored on a computer-readable and/or writable nonvolatile recording medium, such as RAM, ROM, an optical or magnetic disk, or tape, and is then copied into the memory 129, from which it is executed by the processor 127. The program code may be written in any of a number of programming languages, such as Java, Visual Basic, C, C#, C++, Fortran, Pascal, Eiffel, Basic, or COBOL, or any combination thereof, as the invention is not limited to a particular programming language. Typically, in operation, the processor 127 causes data (for example, the code that implements embodiments of the invention) to be read from the nonvolatile recording medium into another form of memory, such as RAM, that allows the processor to access the information faster than the nonvolatile recording medium does.
As shown in Figure 3, the controller 122 also includes a number of interface adapters 124a, 124b, and 124c that connect the controller 122 to the host 120 and to the switching network 132. As illustrated, the host 120 is coupled to the storage system via the interface adapter 124a, which may be, for example, a Fibre Channel interface adapter. Through the storage system controller 122, the host 120 backs up data to the backup storage medium 126 and can recover backed-up data from the backup storage medium 126.
In the illustrated example, the switching network 132 may include one or more Fibre Channel switches 128a, 128b. The storage system controller 122 includes a plurality of Fibre Channel interface adapters 124b and 124c that couple the storage system controller to the Fibre Channel switches 128a, 128b. Through the Fibre Channel switches 128a, 128b, the storage system controller 122 allows data to be backed up to the backup storage medium 126. As illustrated in Figure 3, the switching network 132 may further include one or more Ethernet switches 130a, 130b that are coupled to the storage system controller 122 via Ethernet interface adapters 125a, 125b. In one embodiment, the storage system controller 122 further includes another Ethernet interface adapter 125c that may be coupled to, for example, the LAN 103 to enable the storage system 170 to communicate with hosts (e.g., user computers), as discussed below.
In the example illustrated in Figure 3, the storage system controller 122 is coupled to the backup storage medium 126 via a switching network that includes two Fibre Channel switches and two Ethernet switches. Providing at least two switches of each type in the storage system 170 eliminates any single point of failure in the system. In other words, even if one switch (for example, the Fibre Channel switch 128a) were to fail, the storage system controller 122 would still be able to communicate with the backup storage medium 126 via another switch. Such an arrangement is advantageous in terms of both reliability and speed. For example, as discussed above, reliability is improved by providing redundant components and eliminating single points of failure. In addition, in some embodiments, the storage system controller can back up data to the backup storage medium 126 using some or all of the Fibre Channel switches in parallel, thereby increasing the overall backup speed. However, it will be appreciated that the system is not required to include two or more switches of each type, nor is the switching network required to include both Fibre Channel and Ethernet switches. Furthermore, in embodiments in which the backup storage medium 126 comprises a single disk array, no switch is needed.
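The redundancy argument can be illustrated with a small failover loop: try each switch path in turn and use the first that works. This is a sketch under assumptions only; the `SwitchDown` exception, the callable "paths", and `send_with_failover` are hypothetical names, and real path failover would be handled by the adapters and drivers rather than application code.

```python
class SwitchDown(Exception):
    """Raised by a hypothetical switch path that cannot carry traffic."""


def send_with_failover(paths, payload):
    """Try each redundant path in order; return the first successful result."""
    last_error = None
    for path in paths:
        try:
            return path(payload)
        except SwitchDown as err:
            last_error = err  # remember the failure and try the next switch
    raise SwitchDown("all switch paths failed") from last_error


def via_128a(payload):
    # Simulated failure of the first Fibre Channel switch.
    raise SwitchDown("switch 128a is not responding")


def via_128b(payload):
    # The second switch still reaches the backup storage medium.
    return ("128b", payload)


print(send_with_failover([via_128a, via_128b], b"backup-block"))
```

With a single switch the same failure would end the transfer, which is the single point of failure the redundant arrangement removes.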
As discussed above, in one embodiment the backup storage medium 126 may include one or more disk arrays. In a preferred embodiment, the backup storage medium 126 includes a plurality of ATA or SATA disks. Such disks are off-the-shelf products and are relatively inexpensive compared with conventional storage array products from manufacturers such as EMC and IBM. Moreover, once the cost of removable media (e.g., tapes) and their limited lifetime are taken into account, such disks are comparable in cost to conventional tape-based backup storage systems. In addition, such disks can read and write data substantially faster than tapes. For example, over a single Fibre Channel connection, data can be backed up to disk at a rate of at least about 150 MB/s, which translates to about 540 GB/hr, significantly faster (e.g., by an order of magnitude) than tape backup rates. Several Fibre Channel connections may also be used in parallel, further increasing the speed. According to embodiments of the invention, the backup storage medium may be organized to implement any one of a number of RAID (redundant array of inexpensive disks) schemes. For example, in one embodiment, the backup storage medium may implement RAID-5.
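The rate conversion quoted above is simple arithmetic and can be checked with a short calculation. The helper names below are illustrative, decimal units are assumed (1 GB = 1000 MB), and the `links` parameter models several connections running in parallel under the idealized assumption that throughput scales linearly.

```python
def gb_per_hour(mb_per_s, links=1):
    """Convert a per-link rate in MB/s to an aggregate rate in GB/hr."""
    return mb_per_s * links * 3600 / 1000


def hours_to_transfer(total_gb, mb_per_s, links=1):
    """Continuous transfer time, in hours, for a given amount of data."""
    return total_gb / gb_per_hour(mb_per_s, links)


print(gb_per_hour(150))                      # one connection at 150 MB/s
print(gb_per_hour(150, links=2))             # two connections in parallel
print(round(hours_to_transfer(1000, 150), 2))  # time to move 1000 GB
```

At 150 MB/s a single connection moves 540 GB in an hour, so a terabyte takes a little under two hours of continuous transfer.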
Just as discussed above, by using disk array to replace tape cassete to realize, therefore provide " VTL " during the traditional tape library backup system of embodiment of the present invention emulation as real backup storage medium.Real tape cassete appears in traditional tape library and is substituted by term used herein " virtual cartridge ".It will be appreciated that for purposes of this disclosure, term " VTL " is meant the emulate tapes storehouse that can carry out in software and/or physical hardware, for instance, as one or more disk arrays.People will further figure out, and relate generally to emulate tapes although discuss, storage system also can emulation other storage medium, for instance, CD-ROM or DVD-ROM, and also term " virtual cartridge " generally is meant the storage medium of emulation, for example, emulate tapes or simulated CD.In one embodiment, virtual cartridge is in fact corresponding to one or more hard disks.
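One way to picture a virtual cartridge is as an append-only store with an index that maps each sequentially written tape block to its position on random-access storage. The sketch below is an assumption-laden illustration: the class `VirtualCartridge` is hypothetical, and an in-memory `io.BytesIO` buffer stands in for space on a disk array.

```python
import io


class VirtualCartridge:
    """Hypothetical emulated tape: sequential writes, random-access reads."""

    def __init__(self, label):
        self.label = label
        self._store = io.BytesIO()  # stands in for space on a disk array
        self._index = []            # (offset, length) of each block, in write order

    def write_block(self, data):
        # Tape-style write: always append at the current end of the medium.
        offset = self._store.seek(0, io.SEEK_END)
        self._store.write(data)
        self._index.append((offset, len(data)))
        return len(self._index) - 1  # block number assigned to this write

    def read_block(self, block_number):
        # Disk-style read: jump straight to the block without winding past others.
        offset, length = self._index[block_number]
        self._store.seek(offset)
        return self._store.read(length)


cartridge = VirtualCartridge("0001")
cartridge.write_block(b"header")
cartridge.write_block(b"file-data")
print(cartridge.read_block(1))
```

The writer still sees tape semantics, while a reader can fetch any block directly, which is what makes operations such as searching individual backed-up files practical.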
Therefore, in one embodiment, a software interface is provided that emulates the tape library so that, to the backup/restore application, it appears that data is being backed up to tape. However, the physical tape library is replaced by one or more disk arrays, so that the data is in fact backed up to those disk arrays. It is to be appreciated that other types of removable-media storage systems may be emulated, and the invention is not limited to the emulation of tape library storage systems. The following discussion explains various aspects, features and operation of the software included in the storage system 170.
It is to be appreciated that, although the software may be described as being "included" in the storage system 170 and may be executed by the processor 127 of the storage system controller 122 (see FIG. 3), there is no requirement that all of the software run on the storage system controller 122. Software programs such as the synthetic full backup application and the end-user restore application may run on the host computers and/or user computers, and portions of them may be distributed across all or some of the storage system controller, the host computers and the user computers. Thus, there is no requirement that the storage system controller be a self-contained physical entity such as a computer. The storage system 170 may communicate with software resident on a host computer, for example, the media server 114 or the application server 102. In addition, the storage system may comprise several software applications that run on, or are resident on, the same or different host computers. Moreover, the storage system 170 is not limited to a discrete piece of equipment, although in some embodiments it may be embodied as one. In one embodiment, the storage system 170 may be provided as a self-contained unit that acts as a "plug and play" replacement for a conventional tape library backup system (that is, no modification need be made to existing backup procedures and policies). Such a storage system unit may also be used in a networked computing environment that includes a conventional backup system, to provide redundancy or additional storage capacity.
As discussed above, according to one embodiment, the host computer 120 (which may be, for example, the application server 102 or the media server 114 of FIG. 1) may back up data to the backup storage media 126 over the network link 121 (for example, a Fibre Channel link) that couples the host computer 120 to the storage system 170. It is to be appreciated that, although the following discussion refers primarily to backing up data to the emulated media, the principles apply equally to restoring backed-up data from the emulated media. The flow of data between the host computer 120 and the emulated media 134 may be controlled by the backup/restore application, as discussed above. From the point of view of the backup/restore application, it appears that the data is actually being backed up to a physical version of the emulated media.
Referring to FIG. 4, the storage system software 150 may include one or more logical abstraction layers that represent the emulated media and provide an interface between the backup/restore application 140 resident on the host computer 120 and the backup storage media 126. The software 150 accepts tape-format data from the backup/restore application 140 and translates it into data suitable for storage on random-access disks (for example, hard disks, optical disks and the like). In one embodiment, the software 150 runs on the processor 127 of the storage system controller 122 and may be stored in the memory 129 (see FIG. 3).
According to one embodiment, the software 150 may include a layer, referred to herein as the virtual tape library (VTL) layer 142, that provides SCSI emulation of tapes, tape drives and the robotic mechanisms used to transfer tapes to and from the tape drives. The backup/restore application 140 may communicate with the VTL 142 (for example, to back up or write data to the emulated media) using, for example, SCSI commands, represented by arrow 144. Thus, the VTL may form a software interface between the other storage system software and hardware and the backup/restore application, presenting the emulated storage media 134 (FIG. 2) to the backup/restore application and allowing the emulated media to appear to the backup/restore application as conventional removable backup storage media.
A second software layer, referred to herein as the file system layer 146, may provide an interface between the emulated storage media (as represented in the VTL) and the physical backup storage media 126. In one embodiment, the file system 146 acts as a mini operating system that communicates with the backup storage media 126 using, for example, SCSI commands (represented by arrow 148) to read data from and write data to the backup storage media 126.
In one embodiment, the VTL provides generic tape library support and may support any SCSI media changer. Emulated tape devices may include, but are not limited to, IBM LTO-1 and LTO-2 tape devices, a Quantum SuperDLT320 tape device, a Quantum P3000 tape library system, or a StorageTek L180 tape library system. Within the VTL, each virtual cartridge is a file that may grow dynamically as data is stored. This is in contrast to conventional tape cartridges, which have a fixed size. One or more virtual cartridges may be stored in a system file, as described further below with reference to FIG. 5.
FIG. 5 illustrates one example of a data structure within the file system software 146, showing a system file 200 according to one embodiment of the invention. In this embodiment, the system file 200 includes a header 202 and data 204. The header 202 may include information that identifies each of the virtual cartridges stored in the system file. The header may also contain information such as whether a virtual cartridge is write-protected, the dates on which the virtual cartridges were created or modified, and so on. In one embodiment, the header 202 includes information that uniquely identifies each virtual cartridge and distinguishes it from the other virtual cartridges stored in the storage system. For example, this information may include a name and an identifying number of the virtual cartridge (corresponding to the barcode that would typically be present on a physical tape so that the tape can be identified by the robotic mechanism). The header 202 may also contain additional information, for example, the capacity of each virtual cartridge, the date of last modification, and so on.
According to one embodiment of the invention, the size of the header 202 may be optimized to reflect the type of data being stored (for example, virtual cartridges representing data backed up from one or more host computer systems) and the number of distinct sets of such data (for example, virtual cartridges) that the system can track. For example, data backed up to a tape storage system is typically characterized by large data sets representing numerous system and user files. Because the data sets are so large, the number of discrete data files to be tracked may be correspondingly small. Accordingly, in one embodiment, the size of the header 202 is selected as a compromise between storing too much data to track efficiently (that is, the header being too big) and not having space to store a sufficient number of cartridge identifiers (that is, the header being too small). In one exemplary embodiment, the header 202 uses the first 32 MB of the system file 200. However, it is to be appreciated that the header 202 may have a different size based on system needs and characteristics and that, depending on the needs and capacity of the system, one may select a different size for the header 202.
It is to be appreciated that, from the point of view of the backup/restore application, the virtual cartridges appear as physical tape cartridges with all the same attributes and features. In other words, to the backup/restore application, the virtual cartridges appear as sequentially written tapes. However, in one preferred embodiment, the data stored in the virtual cartridges is not stored in sequential format on the backup storage media 126. Rather, the data that appears to be written to the virtual cartridges is in fact stored in files of the storage system as randomly accessible, disk-format data. Metadata is used to link the stored data to the virtual cartridges so that the backup/restore application can read and write data in cartridge format.
Thus, in broad overview of one preferred embodiment, user and/or system data (referred to herein as "file data") is received by the storage system 170 from the host computer 120 and is stored on the disk arrays that make up the backup storage media 126. The software 150 (see FIG. 4) and/or hardware of the storage system writes this file data to the backup storage media 126 in the form of system files, as described in more detail below. Metadata extracted from the backed-up file data by the storage system controller is used to keep track of the attributes of the user and/or system files that are backed up. For example, such metadata for each file may include the file name, the date of creation or last modification of the file, any encryption information relating to the file, and other information about the file. In addition, the storage system creates, for each file, metadata that links the file to a virtual cartridge. Using such metadata, the software provides the host computer with an emulation of tape cartridges; however, the file data is in fact not stored in tape format but in the system files, as described below. Storing the data in system files, rather than in sequential cartridge format, may be advantageous in that it allows fast, efficient and random access to individual files without the need to scan through sequential data to find a particular file.
As discussed above, according to one embodiment, file data (that is, user and/or system data) is stored on the backup storage media as system files, each system file including a header and data, the data being the actual user and/or system files. The header 202 of each system file 200 includes a tape directory 206 that contains the metadata linking the user and/or system files to virtual cartridges. The term "metadata" as used herein refers not to user or system file data, but to data that describes attributes of the actual user and/or system data. According to one embodiment, the tape directory may define, down to the byte level, the layout of data on the virtual cartridges. In one embodiment, the tape directory 206 has a table structure, as shown in FIG. 6. The table includes a column 220 for the type of information stored (for example, data, a file marker (FM), and so on), a column 222 for the size, in bytes, of the disk blocks used, and a column 224 for the number of disk blocks in which the file data is stored. The tape directory thus allows the controller to have random (as opposed to sequential) access to any data file stored on the backup storage media 126. For example, referring to FIG. 6, the data file 226 can be located quickly on the virtual cartridge because the tape directory indicates that the data of file 226 begins one block from the beginning of the system file 200. That one block has no size because it corresponds to a file marker (FM). File markers are not stored in the system file; that is, a file marker corresponds to zero data. The tape directory includes file markers because conventional tapes use them: the backup/restore application writes file markers along with the data files and expects to see them when it views a virtual cartridge. File markers are therefore tracked in the tape directory. However, file markers do not represent any data and are therefore not stored in the data section of the system file. Thus, the data of file 226 begins at the start of the data section of the system file (indicated by arrow 205) and is 1024 bytes long (that is, one disk block of 1024 bytes). It is to be appreciated that other file data may be stored in blocks of a size other than 1024 bytes, depending on the total amount of data, that is, the size of the data file. For example, larger data files may be stored using larger disk blocks for efficiency.
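The way the directory 206 of FIG. 6 permits random access can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the class and function names are hypothetical, and only the first two rows (a zero-size file marker followed by one 1024-byte block) mirror the example in the text; the remaining rows are invented.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DirectoryEntry:
    kind: str         # "FM" for a file marker, "DATA" for file data (column 220)
    block_size: int   # bytes per disk block, 0 for a file marker (column 222)
    block_count: int  # number of disk blocks the entry occupies (column 224)


def locate(directory: List[DirectoryEntry], index: int) -> Tuple[int, int]:
    """Return (offset, length) in bytes of entry `index` in the data section.

    File markers contribute zero bytes, so they never shift the offset.
    """
    offset = sum(e.block_size * e.block_count for e in directory[:index])
    entry = directory[index]
    return offset, entry.block_size * entry.block_count


# Hypothetical directory contents (rows 3 and 4 are invented for illustration).
directory = [
    DirectoryEntry("FM", 0, 1),
    DirectoryEntry("DATA", 1024, 1),
    DirectoryEntry("FM", 0, 1),
    DirectoryEntry("DATA", 4096, 3),
]

print(locate(directory, 1))  # first data file: starts at the data section
print(locate(directory, 3))  # second data file: found without scanning data
```

Because each offset is computed from the table alone, a file can be reached without reading through the data that precedes it, which is the contrast with sequential tape access drawn above.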
In one embodiment, the tape directory 206 may be contained in a "file descriptor" that is associated with each data file backed up to the storage system. The file descriptor contains metadata relating to the data files 204 stored on the storage system. In one embodiment, the file descriptor may be implemented according to a standardized format, for example, the tape archive (tar) format used by most Unix-based systems. Each file descriptor may include information such as the name of the corresponding user file, the date on which the user file was created or modified, the size of the user file, any access restrictions on the user file, and so on. Additional information stored in the file descriptor may further include information describing the directory structure from which the data was copied. The file descriptor may thus contain searchable metadata about the corresponding data file, as discussed in more detail below.
From the point of view of the backup/restore application, any virtual cartridge may contain a plurality of data files and corresponding file descriptors. From the point of view of the storage system software, the data files are stored in system files, and a system file may be linked to, for example, a particular backup job. For example, a backup executed by one host computer at a particular time may generate one system file that corresponds to one or more virtual cartridges. Virtual cartridges may therefore be of any size and may grow dynamically as more user files are stored on them.
Referring again to FIG. 3, as discussed above, the storage system 170 may include a synthetic full backup software application 240. In one embodiment, the host computer 120 backs up data to the emulated media 134, forming one or more virtual cartridges. In some computing environments, a "full backup", that is, a backup copy of all the data stored on the primary storage system of the network (see FIG. 1), may be performed periodically (for example, weekly). This process is typically very lengthy because of the large amount of data to be copied. Therefore, in many computing environments, additional backups, termed incremental backups, may be performed between consecutive full backups (for example, daily). An incremental backup is a process in which only the data that has changed since the last backup was executed (whether incremental or full) is backed up. Typically, the changed data is backed up on a per-file basis, even though much of the data in a file frequently has not changed. Incremental backups are therefore typically much smaller, and thus much faster to complete, than full backups. It is to be appreciated that, although many environments execute a full backup once a week and incremental backups daily during the week, there is no requirement that such a schedule be used. For example, certain environments may require incremental backups several times a day. The principles of the invention apply to any environment that uses full backups (and, optionally, incremental backups), regardless of how often they are executed.
During a full backup procedure, the host computer may create one or more virtual cartridges containing backup data that comprises a plurality of data files. For clarity, the following discussion assumes that the full backup generates only one virtual cartridge. However, it is to be appreciated that a full backup may generate more than one virtual cartridge, and that the principles of the invention apply to any number of virtual cartridges.
According to one embodiment, there is provided a method for creating a synthetic full backup data set from one existing full backup data set and one or more incremental backup data sets. This method may obviate the need to perform periodic (for example, weekly) full backups, thereby saving the user considerable time and network resources. Furthermore, as is known to those of ordinary skill in the art, restoring data from a full backup and one or more incremental backups can be a time-consuming process. For example, if the most recent version of a file exists in an incremental backup, the backup/restore application will typically restore the file from the last full backup and then apply any changes from the incremental backups. Providing a synthetic full backup therefore has the additional advantage of allowing the backup/restore application to restore data files more quickly from the synthetic full backup alone, without the need to perform multiple restores from a full backup and one or more incremental backups. It is to be appreciated that the phrase "most recent version" as used herein refers generally to the most recent copy of a data file (that is, the copy from the most recent time the data file was saved), whether or not the file has a new version number. The term "version" as used herein refers generally to copies of the same file that may have been modified in some way or may have been saved multiple times.
FIG. 7 is a schematic representation of a synthetic full backup procedure. The host computer 120 may execute a full backup 230 at a first moment in time, for example, on a weekend. The host computer 120 may then execute subsequent incremental backups 232a, 232b, 232c, 232d and 232e, for example, on each day of the week. The storage system 170 may then create a synthetic full backup data set 234, as described below.
According to one embodiment, the storage system 170 may include a software application referred to herein as the synthetic full backup application 240 (see FIG. 3). The synthetic full backup application 240 may run on the storage system controller 122 (see FIG. 2) or on the host computer 120. The synthetic full backup application includes the software commands and interfaces necessary to create the synthetic full backup data set 234. In one embodiment, the synthetic full backup application may perform a logical merge of the metadata representations of the full backup data set 230 and each of the incremental backup data sets 232 to generate a new virtual cartridge that contains the synthetic full backup data set 234.
For example, referring to FIG. 8, the existing full backup data set may include user files F1, F2, F3 and F4. A first incremental backup data set 232a may include user file F2', a modified version of F2, and F3', a modified version of F3. A second incremental backup data set 232b may include user file F1', a modified version of F1, F2'', a further modified version of F2, and a new user file F5. Therefore, the synthetic full backup data set 234, formed from a logical merge of the full backup data set 230 and the two incremental data sets 232a and 232b, contains the latest version of each of user files F1, F2, F3, F4 and F5. As shown in FIG. 8, the synthetic full backup data set therefore contains user files F1', F2'', F3', F4 and F5.
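The FIG. 8 example can be reproduced with a small sketch of a logical merge. The dictionary representation and the function name are assumptions made for illustration; the disclosure does not prescribe a data structure. Each data set maps a file's base name to the version it holds, and later incremental sets override earlier ones.

```python
from typing import Dict, Iterable


def logical_merge(full: Dict[str, str],
                  incrementals: Iterable[Dict[str, str]]) -> Dict[str, str]:
    """Return the latest version of every file.

    `incrementals` must be ordered oldest to newest so that later
    backups override earlier ones.
    """
    latest = dict(full)
    for incremental in incrementals:
        latest.update(incremental)
    return latest


# Data sets from the FIG. 8 example (keys are base file names).
full_230 = {"F1": "F1", "F2": "F2", "F3": "F3", "F4": "F4"}
incr_232a = {"F2": "F2'", "F3": "F3'"}
incr_232b = {"F1": "F1'", "F2": "F2''", "F5": "F5"}

synthetic_234 = logical_merge(full_230, [incr_232a, incr_232b])
print(synthetic_234)
```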
Referring again to FIGS. 3 and 4, the file system software 146 may create a logical metadata cache 242 that stores metadata relating to each user file stored on the emulated media 134. It is to be appreciated that the logical metadata cache is not required to be a physical data cache, but may instead be a searchable collection of data stored on the storage media 126. In another example, the logical metadata cache 242 may be implemented as a database. Where the metadata is stored in a database, conventional database commands (for example, SQL commands) can be used to perform the logical merge of the full backup data set and the one or more incremental backup data sets to create the synthetic full backup data set.
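Where the metadata is held in a database, the logical merge can be phrased as a query. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table name, column names, sequence numbers and cartridge labels are all hypothetical; the disclosure says only that SQL commands may be used, not which schema or query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per backed-up copy of a file.
conn.execute(
    "CREATE TABLE file_meta ("
    " name TEXT, backup_seq INTEGER, version TEXT, cartridge TEXT)"
)
rows = [
    # full backup 230 (sequence 0)
    ("F1", 0, "F1", "VC-230"), ("F2", 0, "F2", "VC-230"),
    ("F3", 0, "F3", "VC-230"), ("F4", 0, "F4", "VC-230"),
    # incremental backup 232a (sequence 1)
    ("F2", 1, "F2'", "VC-232a"), ("F3", 1, "F3'", "VC-232a"),
    # incremental backup 232b (sequence 2)
    ("F1", 2, "F1'", "VC-232b"), ("F2", 2, "F2''", "VC-232b"),
    ("F5", 2, "F5", "VC-232b"),
]
conn.executemany("INSERT INTO file_meta VALUES (?, ?, ?, ?)", rows)

# For each file name, keep the row from the most recent backup holding it.
query = """
SELECT m.name, m.version, m.cartridge
FROM file_meta AS m
JOIN (SELECT name, MAX(backup_seq) AS seq
      FROM file_meta GROUP BY name) AS latest
  ON m.name = latest.name AND m.backup_seq = latest.seq
ORDER BY m.name
"""
synthetic = conn.execute(query).fetchall()
for row in synthetic:
    print(row)
```

The query returns metadata rows only; no file data is read or copied, which is what makes the merge "logical".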
As discussed above, each data file stored on the emulated media 134 may include a file descriptor that contains metadata relating to the data file, including the location of the file on the backup storage media 126. In one embodiment, the backup/restore application running on the host computer 120 stores data on the emulated media 134 in a streaming tape format. An example of a data structure 250 representing this tape format is illustrated in FIG. 9. As discussed above, the system file data structure includes headers, which may contain information about the data files, for example, the file descriptors of the data files, the dates on which the files were created and/or modified, security information, the directory structure of the host system from which the files came, and other information linking the files to a virtual cartridge. The headers are associated with the data 254, which is the actual user and system files backed up from the host computer, the primary storage system, and so on. The system file data structure may also optionally include pads 256, which appropriately align the next header to a block boundary.
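The role of the pads 256 can be illustrated with the usual alignment arithmetic. This is a sketch under assumptions: the 512-byte block size and the function names are chosen for illustration only, and the disclosure does not state how pad lengths are computed.

```python
BLOCK_SIZE = 512  # hypothetical block size in bytes


def pad_length(record_length: int, block_size: int = BLOCK_SIZE) -> int:
    """Bytes of padding needed so the next header starts on a block boundary."""
    return -record_length % block_size


def next_header_offset(offset: int, header_len: int, data_len: int,
                       block_size: int = BLOCK_SIZE) -> int:
    """Offset of the following header, given one header + data record."""
    end = offset + header_len + data_len
    return end + pad_length(end, block_size)


print(pad_length(1300))                  # a 1300-byte record needs padding
print(pad_length(1024))                  # already aligned: no pad needed
print(next_header_offset(0, 512, 788))   # record of 1300 bytes starting at 0
```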
As shown in FIG. 9, in one embodiment, the header data is placed in the logical metadata cache 242 to permit rapid searching of, and random access to, the otherwise sequential tape data format. The use of the logical metadata cache, implemented by the file system software 148 on the storage system controller 122, allows the linear, sequential tape data format stored on the emulated media 134 to be translated into the random-access data format stored on the physical disks that make up the backup storage media 126. The logical metadata cache 242 stores the headers 252, which include the file descriptors of the data files, security information that may be used to control access to the data files (as discussed in more detail below), and pointers 256 to the actual locations of the data files on the virtual cartridges and the backup storage media 126. In one embodiment, the logical metadata cache stores data relating to all the data files in the full backup data set 230 and in each of the incremental data sets 232.
According to one embodiment, the synthetic full backup application software 240 uses the information stored in the logical metadata cache to create a synthetic full backup data set. This synthetic full backup data set is then linked to a synthetic virtual cartridge created by the synthetic full backup application 240. To the backup/restore application, the synthetic full backup data set appears to be stored on this synthetic virtual cartridge. As discussed above, the synthetic full backup data set may be created by performing a logical merge of the existing full backup data set and the incremental backup data sets. This logical merge may include comparing each of the data files included in the existing full backup data set and in each of the incremental backup data sets, and creating a composite of the most recently modified version of each user file, as discussed above with reference to FIG. 8.
According to one embodiment, the synthetic virtual cartridge 260 includes pointers that point to the locations of data files on other virtual cartridges, specifically, the virtual cartridges that contain the existing full backup data set and the incremental backup data sets, as shown in FIG. 10. Considering the example given above with reference to FIG. 8, the synthetic virtual cartridge 260 includes pointers 266 that point (as indicated by arrows 268) to the location of user file F4 in the existing full backup data set on virtual cartridge 262 (because the existing full backup data set contains the latest version of F4) and, for example, to the location of user file F3' in incremental data set 232a on virtual cartridge 264.
The synthetic virtual cartridge may also include a list 270 that contains the identifying numbers (and, optionally, the names) of all the virtual cartridges that contain data to which the pointers 266 point. This dependent cartridge list 270 is important for keeping track of where the actual data resides and for preventing the dependent virtual cartridges from being erased. In this embodiment, the synthetic full backup data set does not contain any actual user files, but rather a set of pointers that indicate the locations of the user files on the backup storage media 126. It is therefore desirable to prevent the actual user files (stored on other virtual cartridges) from being deleted. This may be accomplished in part by keeping a record (the dependent cartridge list 270) of the virtual cartridges that contain the data and by protecting each of those virtual cartridges from being overwritten or deleted. The synthetic virtual cartridge may also include cartridge data 272, for example, the size of the synthetic virtual cartridge, its location on the backup storage media 126, and so on. In addition, the synthetic virtual cartridge has an identifying number and/or name 274.
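A synthetic virtual cartridge that holds only location references 266 and a list 270 of the cartridges it depends on might be modelled as below. All class names, function names, cartridge labels and byte offsets are hypothetical; the sketch only illustrates how a dependency list can be derived from the references and consulted before a cartridge is erased.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Pointer:
    file_name: str
    cartridge_id: str  # virtual cartridge that holds the real data
    offset: int        # hypothetical byte offset on that cartridge


@dataclass(frozen=True)
class SyntheticCartridge:
    cartridge_id: str
    pointers: Tuple[Pointer, ...]
    dependent_cartridges: Tuple[str, ...]  # plays the role of list 270


def build_synthetic(cartridge_id: str,
                    latest: Dict[str, Tuple[str, int]]) -> SyntheticCartridge:
    """Build a pointer-only cartridge from {file: (cartridge, offset)}."""
    pointers = tuple(Pointer(name, cart, off)
                     for name, (cart, off) in sorted(latest.items()))
    dependents = tuple(sorted({p.cartridge_id for p in pointers}))
    return SyntheticCartridge(cartridge_id, pointers, dependents)


def may_erase(cartridge_id: str, synthetics: List[SyntheticCartridge]) -> bool:
    """A cartridge may be erased only if no synthetic cartridge depends on it."""
    return all(cartridge_id not in s.dependent_cartridges for s in synthetics)


latest_locations = {"F4": ("VC-262", 3072), "F3'": ("VC-264", 1024)}
synthetic = build_synthetic("VC-260", latest_locations)
print(synthetic.dependent_cartridges)
print(may_erase("VC-262", [synthetic]), may_erase("VC-999", [synthetic]))
```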
According to another embodiment, the synthetic virtual cartridge may include a combination of pointers and actual stored user files. Referring to FIG. 11, in one embodiment, the synthetic virtual cartridge includes pointers 266 that point to the locations of data files (the latest versions, as discussed above with reference to FIG. 9) in the existing full backup data set 230 on virtual cartridge 262. The synthetic virtual cartridge may also include data 278 containing actual data files copied from the incremental data sets 232, as indicated by arrows 280. In this way, the incremental backup data sets can be deleted after the synthetic full backup data set 276 has been created, thereby saving storage space. Because they contain, in whole or in part, pointers rather than copies of all the user files, the synthetic virtual cartridges are relatively small.
It is to be appreciated that a synthetic full backup may include any combination of pointers and stored file data and is not limited to the examples given above. For example, a synthetic full backup may include pointers to data files for some files stored in certain incremental and/or full backups, and may include stored file data copied from other existing full and/or incremental backups. Alternatively, a synthetic full backup may be created from a prior full backup and any relevant incremental backups such that it includes no pointers at all, but rather the most up-to-date version of the actual file data copied from the appropriate full and/or incremental backups.
In one embodiment, the synthetic full backup application software may include a differencing algorithm that compares the user and system file metadata of the existing full backup data set and of each incremental data set to determine where the latest version of each data file is located. For example, a differencing algorithm could compare the dates of creation and/or modification, the version numbers (if applicable), and so on, of different versions of the same data file in the different backup sets in order to select the most recent version of the data file. However, a user may open a user file and save it (thereby changing its date of modification) without actually changing any of the data in the file. The system may therefore implement a more advanced differencing algorithm that analyzes the data inside the system or user files to determine whether the data has in fact changed. Variations of such differencing algorithms, and other types of comparison algorithms, are known to those skilled in the art. In addition, as discussed above, where the metadata is stored in a database format, database commands such as SQL commands can also be used to perform the logical merge. The invention may use any such algorithm to ensure that the most recent version of each user file is selected from all of the existing backup sets compared, so as to properly create the synthetic full backup data set.
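One way to distinguish a genuine change from a mere re-save is to compare content digests rather than modification dates. The sketch below is an assumption-laden illustration: the use of SHA-256 via Python's `hashlib`, the `Version` record and the selection rule are not taken from the disclosure, which leaves the choice of differencing algorithm open.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Version:
    backup_set: str
    modified: int   # modification time; any monotonically increasing value
    data: bytes


def digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def latest_distinct_version(versions: List[Version]) -> Version:
    """Return the earliest copy that already holds the newest content.

    Walking from oldest to newest, a later copy replaces the current
    choice only when its content differs, so a file that was re-saved
    without changes does not count as a new version.
    """
    ordered = sorted(versions, key=lambda v: v.modified)
    chosen = ordered[0]
    for candidate in ordered[1:]:
        if digest(candidate.data) != digest(chosen.data):
            chosen = candidate
    return chosen


versions = [
    Version("full-230", 100, b"report v1"),
    Version("incr-232a", 200, b"report v2"),
    Version("incr-232b", 300, b"report v2"),  # re-saved, content unchanged
]
print(latest_distinct_version(versions).backup_set)
```

Selecting the earlier of two identical copies means a later incremental set that only re-saved the file need not be retained on that file's account.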
As will be appreciated by those skilled in the art, the synthetic full backup application enables full backup data sets to be created and made available without requiring the host computer to execute a physical full backup. Not only does this avoid burdening the host computer with the processor overhead of transferring the data to the backup storage system, but, in embodiments in which the synthetic full backup application runs on the storage system, it also significantly reduces the use of network bandwidth. As shown in FIG. 7, further synthetic full backup data sets may be created using the first synthetic full backup data set 234 and subsequent incremental backup data sets 236. This may provide a significant time advantage, in that files or objects that are not frequently modified need not be frequently copied. Instead, the synthetic full backup data sets may maintain pointers to files that have been copied only once.
As discussed above with reference to FIG. 3, the storage system may also include a software application referred to as the end-user restore application 300. Thus, according to another embodiment, there is provided a method by which end users can locate and restore backed-up data without intervention by IT staff and without requiring any change to existing backup/restore procedures and/or policies. In a typical backup storage system, the backup/restore application running on the host computer 120 is controlled by IT staff, and it can be impossible or very difficult for an end user to access backed-up data without IT staff intervention. According to aspects and embodiments of the invention, storage system software is provided that allows end users to locate and restore their files through, for example, a web-based or other interface to the backup storage media 126.
It is to be appreciated that, as with the synthetic full backup application 240, the end-user restore application 300 may run on the storage system controller 122 (see FIG. 2) or on the host computer 120. The end-user restore application includes the software commands and interfaces necessary to allow an authorized user to search the logical metadata cache to locate, and optionally restore, backed-up files from the backup storage media 126.
According to one embodiment, software is provided that includes a user interface installed and/or executed on the user computer 136. The user interface may be any type of interface that allows a user to locate files on the backup storage media. For example, the user interface may be a graphical user interface, may be web-based, or may be a text interface. The user computer is coupled to the storage system 170 by a network connection 138, which may be, for example, an Ethernet connection. Through the network connection 138, an operator of the user computer 136 can access the data stored on the storage system 170.
In one embodiment, end-user recovery comprises the application program 300 of subscriber authorisation card and/or authorisation features.For instance, the user can use username and password to register by the user interface on the subscriber computer.Subscriber computer can use preferred user authorization means with the username and password storage system (for instance, to end-user restore application) of communicating by letter, and whether visits storage system with the decision user.The embodiment of some user authorization means can include, but are not limited to, MicrosoftActive Directory server, Unix " Yellow Page " server or lightweight directory access protocol.Registration/user authorization means can be communicated by letter with the privilege of exchangeing subscriber with end-user restore application.For instance, the certain user can be allowed to inquire about self newly-built file, or has file some privilege or that be taken as everyone sign.Other user, for instance, system operators or the people who is authorized to can visit all backup files or the like.
According to one embodiment, the end-user restore application uses the logical metadata cache to obtain information about all the data files backed up to the backup storage medium. Through the user interface, the end-user restore application presents to the user a hierarchical directory structure of the user's files, sorted by, for example, back-up time/date, username, original user computer directory structure (obtained when the files were backed up), or other file characteristics. In one embodiment, the directory structure presented to the user may vary according to the user's privileges. The end-user restore application may accept browse requests (for example, through the user interface, the user may browse the directory structure to locate a desired file), or the user may search for a file by name, date, and so on.
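The privilege-aware lookup described above can be sketched as follows. This is a minimal illustration only: the `BackedUpFile` record, the `CATALOG` list and the `search` function are hypothetical names, and a plain in-memory list stands in for the logical metadata cache, whose real layout the text does not specify.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class BackedUpFile:
    path: str        # original directory path on the user computer
    owner: str       # username recorded at back-up time
    backed_up: date  # back-up date

def visible_files(catalog, user, is_admin=False):
    """Administrators see every entry; other users see only their own files."""
    return [f for f in catalog if is_admin or f.owner == user]

def search(catalog, user, name_contains="", on_or_after=None, is_admin=False):
    """Search the visible entries by file-name fragment and earliest back-up date."""
    hits = []
    for f in visible_files(catalog, user, is_admin):
        file_name = f.path.rsplit("/", 1)[-1]
        if name_contains not in file_name:
            continue
        if on_or_after is not None and f.backed_up < on_or_after:
            continue
        hits.append(f)
    # Sort by back-up date, then path, as one possible presentation order.
    return sorted(hits, key=lambda f: (f.backed_up, f.path))

# Hypothetical catalog contents, for illustration only.
CATALOG = [
    BackedUpFile("/home/alice/report.doc", "alice", date(2004, 8, 2)),
    BackedUpFile("/home/alice/notes.txt", "alice", date(2004, 8, 5)),
    BackedUpFile("/home/bob/report.doc", "bob", date(2004, 8, 5)),
]
```

Filtering by privilege before searching keeps a user from learning, even through a failed search, that another user's file exists.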
According to one embodiment, the user may restore backed-up files from the storage system. For example, once the user has located a desired file as discussed above, the user may download the file from the storage system over the network connection 138. In one embodiment, the download may be carried out in a manner comparable to any web-based download, as known to those of ordinary skill in the art.
By allowing end users to access those files they are permitted to view/download, and by providing that access through a user interface (for example, a web-based front end), the end-user restore application lets users search for and restore their own files without any change to backup policies or procedures.
According to another embodiment, apparatus and methods are provided by which a user may mount a network-attached view of a backup data set stored on the backup storage medium 126. This allows the user to browse and access the data in the mounted data set just as the user would access data on any local or network drive coupled to the user's computer. Thus, for example, the user may restore the availability of application server data (for example, when the primary storage 106 fails, see FIG. 1) without performing a restore procedure through the media server 114 (see FIG. 1). Restoring application server data with the mount procedure described herein may be several orders of magnitude faster than a typical media-server restore of a volume. It will be appreciated that the term "mount" as used herein refers to making a data volume or network component, such as a network drive, available to the host operating system. A data volume may comprise, for example, a single data file or system file, a plurality of files, or a directory structure containing a plurality of files. Common mount protocols include NFS (Network File System) and CIFS (Common Internet File System) shares. These protocols allow a host to access resources on another host over a network connection through an interface that makes the remote resource appear as though it were local to the host.
Referring to FIG. 12, there is illustrated a flow diagram of one embodiment of a method for performing a volume mount according to aspects of the invention. In a first step 290, the user selects the data volume to be mounted and sends a volume mount request to the controller 122 of the backup storage system (see FIG. 3). In general, the user restores data from a full back-up data set (and not merely an incremental back-up data set) so as to obtain a complete and correct representation of the backed-up information. If no current full back-up data set exists (for example, the network administrator performs a full back-up weekly, so that a current full back-up may not be available if the user wishes to restore data mid-week), a synthetic full back-up may be created (as described above) and used to restore the selected data.
According to one embodiment, the backup storage system 170 may include a software application, referred to herein as the volume restore application 310 (see FIG. 13), that controls and implements the methods used to perform the data volume mount and restore procedures. The volume restore application 310, like the synthetic full back-up and end-user restore applications, may run on the storage system controller, the host computer and/or the user computer, or portions of the application may be distributed across all or some of the storage system controller, the host computer and the user computer.
Referring again to FIG. 12, after a volume mount has been requested, the volume restore application may query whether a current full back-up is available (step 292). If not, the volume restore application may communicate with the synthetic full back-up application 240 (see FIG. 2) to perform the synthetic full back-up procedure and create a current back-up data set (step 294). The volume restore application may export either the conventional full back-up data set or the synthetic full back-up data set to carry out the volume mount request, which may be for either an NFS or a CIFS share. Specifically, the volume restore application queries the logical metadata cache 242 to locate the appropriate metadata representing the full back-up volume selected in step 290.
According to one embodiment, the mount request (step 290) may cause the volume restore application to create one or more file descriptor structures to facilitate exporting the mounted volume as an NFS or CIFS share (step 296). Referring to FIG. 14, there is illustrated one embodiment of a file descriptor structure 320 that may be created by the volume restore application and that corresponds to a system file in tape format (for example, system file 332, see FIG. 15). As described above, file descriptors contain searchable metadata corresponding to the data files and system files stored in the storage system. The file descriptor 320 may include a number of fields containing information such as, for example, a filename 322 and permissions (an access control list) 324 for the data files included in the volume to be mounted. In addition, the file descriptor includes one or more pointers 326 to the location of the source data of a data file (that is, where the data file is stored on the storage medium 126), the length 328 of the data file, and a pointer 330 to the next entry (for example, the next data file) in the linked-list file descriptor structure. If the "next" field is null, as indicated for example by reference numeral 331, the data file is the last data file in the system file represented by the file descriptor 320 (that is, the last linked-list entry). As illustrated in FIG. 14, each system file included in the data volume to be mounted is represented by a file descriptor structure. Once a file descriptor 320 has been created for each system file in the requested volume, the file descriptors can be used to locate data files in response to NFS or CIFS requests on the exported share.
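A linked-list file descriptor of the kind described above might be modeled as in the following sketch. The class and field names are assumptions made for illustration; only the reference numerals in the comments come from the text, and the single 1032-byte entry is borrowed from the example of FIG. 15.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Entry:
    source: str                     # stored object holding the bytes (pointer 326)
    offset: int                     # where the data file starts inside that object
    length: int                     # length of the data file (328)
    next: Optional["Entry"] = None  # pointer 330; None plays the role of null 331

@dataclass
class FileDescriptor:
    filename: str                   # filename 322
    acl: list                       # permissions / access control list 324
    first: Optional[Entry] = None   # head of the linked list of data files

    def entries(self):
        """Yield each entry in order by following the `next` pointers."""
        node = self.first
        while node is not None:
            yield node
            node = node.next

# Hypothetical descriptor for a system file holding one 1032-byte data file.
fd = FileDescriptor("system_file_332", ["owner:rw"], Entry("backup_media", 0, 1032))
```

A null `next` pointer marks the end of the list, so no separate entry count needs to be stored in the descriptor.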
As discussed above, in one embodiment file descriptors may be implemented according to a standardized format, for example the tape archive (tar) format used by most UNIX-based systems. FIG. 15 illustrates an example of a typical system file 332 as it would be written in tape format (for example, tar format) as part of a tape (for example, tar) data stream. FIG. 16 illustrates a file descriptor 340 corresponding to the system file 332. As illustrated in FIG. 15, a file written in tape format includes a header 336 and the actual data 338 stored in the system file 332. The data 338 may correspond to one or more data files. In the illustrated example, the system file 332 is 1032 bytes long; however, it will be appreciated that the file may have any length, depending on the size and formatting of the file written.
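The header-then-data layout of a tar-formatted file can be observed with Python's standard `tarfile` module, as in the sketch below. It illustrates the generic ustar layout only, not the storage system's own on-disk format; the member name is hypothetical and the 1032-byte payload is borrowed from the example above.

```python
import io
import tarfile

payload = b"x" * 1032  # stand-in for the actual data 338

# Write one member into an in-memory tar stream using the classic ustar format.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
    info = tarfile.TarInfo(name="system_file_332")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

# Read the stream back and inspect where the header and the data begin.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    member = tar.getmembers()[0]
    header_start = member.offset     # byte offset of the member's tar header
    data_start = member.offset_data  # byte offset of the member's data
    data_length = member.size        # length of the data in bytes
```

Recording `offset_data` and `size` for each member is enough to locate that member's bytes later without scanning the stream again.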
The file descriptor 340 of the file 332 is included in the header 336. As illustrated in FIG. 16, and following the general example given in FIG. 14, the file descriptor 340 includes a filename 341, security information 334, a pointer 342 to where the data is stored for each data file known to the system file, the length 346 of the corresponding data file, and a "next" entry indicating the next data file known to the system file. In the illustrated example, the "next" entry is null 348.
Referring again to FIG. 12, once file descriptors have been created for all the files in the data volume to be mounted, the volume restore application exports the file system, on the basis of the created file descriptors, to the user-specified mount point as an NFS or CIFS share (step 298). At that mount point the mount is completed (step 299), and the mounted data volume is available for the user to read and/or write data, as described in detail below.
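The sequence of steps 292 to 299 can be summarized in a short control-flow sketch. All of the callables passed in are hypothetical stand-ins for the synthetic full back-up application, the descriptor builder and the NFS/CIFS export; the sketch models only the ordering of the steps, not any real export mechanism.

```python
def mount_volume(volume, full_backups, synthesize, build_descriptors, export_share):
    """Walk the mount flow and return (mount point, list of steps taken)."""
    steps = []
    backup = full_backups.get(volume)                # step 292: current full back-up?
    steps.append(292)
    if backup is None:
        backup = synthesize(volume)                  # step 294: build a synthetic full
        steps.append(294)
    descriptors = build_descriptors(backup)          # step 296: create file descriptors
    steps.append(296)
    mount_point = export_share(volume, descriptors)  # step 298: export as a share
    steps.append(298)
    steps.append(299)                                # step 299: mount complete
    return mount_point, steps

def demo(full_backups):
    """Run the flow with trivial stand-in callables (illustration only)."""
    return mount_volume(
        "vol1",
        full_backups,
        synthesize=lambda v: f"synthetic-full({v})",
        build_descriptors=lambda b: [f"descriptor-for-{b}"],
        export_share=lambda v, d: f"/mnt/{v}",
    )
```

Calling `demo({})` takes the synthetic path through step 294, while `demo({"vol1": "weekly-full"})` skips it because a current full back-up already exists.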
According to one embodiment, an NFS or CIFS read operation (for example, the user wishes to view data in the mounted data volume) is serviced by searching the file descriptors 320 for a matching file description. It will be appreciated that, according to one embodiment, the user need not search the file descriptors personally. Rather, the volume restore application may include a user interface that presents the data to the user in, for example, a typical directory structure format. The volume restore application may include software that translates the user's request for a particular file into search commands that access the logical metadata cache and search the file descriptors 320 for the matching system file. Once the file is located, the data transfer to the user computer can be accomplished by following the linked list (that is, following the pointers stored in the file descriptor to locate the actual data) and building a buffer of the file data, which is then sent to the requesting user.
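Following the linked list to build a read buffer can be sketched as below. The dictionary-based nodes, the `stores` mapping and the sample bytes are assumptions made for illustration; the text does not specify how descriptors or stored data are represented in memory.

```python
def read_file(first_entry, stores):
    """Follow the linked list and concatenate each referenced extent into one buffer."""
    buf = bytearray()
    node = first_entry
    while node is not None:
        data = stores[node["source"]]
        buf += data[node["offset"]: node["offset"] + node["length"]]
        node = node["next"]  # None marks the last entry
    return bytes(buf)

# Hypothetical stored object: an 8-byte header followed by two data files.
stores = {"original": b"HEADER--hello world"}
second = {"source": "original", "offset": 14, "length": 5, "next": None}
first = {"source": "original", "offset": 8, "length": 5, "next": second}

result = read_file(first, stores)
```

Because each entry carries its own source, offset and length, the reader never needs to parse the tape-format stream itself.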
According to another embodiment, a mechanism is provided by which the user may also write new data to the mounted volume. As discussed above, the mounted volume data may appear to the user as an ordinary network drive or other network-stored data. In fact, however, the originally mounted volume data is the actual backed-up data, which generally needs to be preserved, at least until another back-up data set is created. Allowing the user actually to modify the original back-up data may therefore be undesirable. To avoid modifying the original back-up data while still allowing the user to modify data corresponding to the mounted volume, a mechanism is provided (which may be transparent to the user) that redirects writes to another location in the storage system, as discussed below.
Referring to FIG. 17, there is illustrated a flow diagram of one embodiment of a method of handling write requests according to aspects of the invention. In a first step 350, the user requests an NFS or CIFS write operation (typically by selecting a "save" option while editing or viewing a data file). The volume restore application then carries out the write request by locating available storage space, writing the data to that space, and updating the appropriate file descriptor to reference the newly written data.
According to one embodiment, the volume restore application queries whether space has already been allocated for written data (step 352) and, if not, allocates storage space (step 354). The storage space may be allocated on the backup storage medium 126 (see FIG. 13). The allocated storage space may be designated specifically to hold only written data (and, optionally, associated metadata).
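The redirect-on-write behavior of steps 352 and 354 can be sketched as follows. The `WriteArea` class and its index of tuples are hypothetical; the offsets and lengths used in the demonstration are borrowed from the two-write example of FIG. 19.

```python
class WriteArea:
    """Hypothetical stand-in for storage space set aside for redirected writes."""

    def __init__(self):
        self.data = None  # nothing is allocated until the first write arrives
        self.index = []   # (logical offset in the file, offset in write area, length)

    def write(self, logical_offset, payload):
        if self.data is None:       # step 352: has space been allocated?
            self.data = bytearray() # step 354: allocate it on demand
        start = len(self.data)
        self.data += payload        # append; earlier writes are never overwritten
        self.index.append((logical_offset, start, len(payload)))
        return start

original = bytes(1032)              # stand-in for the backed-up data, never modified
area = WriteArea()
area.write(64, b"W" * 9)            # W1: 9 bytes written at logical offset 64
area.write(1032, b"Z" * 100)        # W2: 100 bytes appended at the end of the file
```

Keeping the index separate from the written bytes mirrors the idea of metadata that relates the newly written data to the original data.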
Referring to FIG. 18, there is illustrated one embodiment of NFS or CIFS written data stored on the backup storage medium 126. The written data 360 includes, for example, two written portions, W1 362 and W2 364, corresponding to data stored as a result of write commands carried out by the volume restore application. For example, W1 and W2 correspond to modified data files included in the mounted data volume. It will be appreciated that although the illustrated example corresponds to two write requests, the principles of the invention apply to any number of write requests, and the file may be adapted to reflect any appropriate number of write requests. The written data 360 may also include a header 366 containing metadata that forms a self-describing relationship between the original data (for example, file 332) and the newly written data 360. In particular, the header may contain offset information identifying where the written data portions W1 and W2 logically exist relative to the original data, as described further with reference to FIG. 19.
Referring to FIG. 19, there is illustrated one embodiment of a system file layout after two write requests have been made. The original data file 332 is stored on the backup storage medium 126 (see FIG. 13) and is presented to the user through the mount procedure described above. As shown in FIG. 19, the system file 332 is written in tape format, and the data portion 338 may contain a plurality of data files (for example, user files). This data begins at offset zero bytes (point 370) and ends 1032 bytes later at point 372. The written file 360 corresponds to the user's requests to write data to the file 332. For example, the user may have modified two data files contained in the system file 332, resulting in the written file 360 containing W1 and W2. As described above, the written file 360 may be stored on the storage medium separately from the file 332, so that the original back-up data need not be changed. A logical modified system file 380 is shown for illustration; it represents the file 332 together with the changes made by the user through the write requests (that is, the written file 360). In other words, in the modified system file 380, W1 and W2 (the data files modified by the user) may replace original data files contained in the data portion of the original data file 332 without any back-up data being deleted.
As shown in FIG. 19, the modified system file corresponds to the logical combination of the original system file 332 and the written file 360. As shown, the original system file data 338 begins at offset zero of the original file. At offset 64 (indicated by reference numeral 384), the first portion of modified data, W1, begins; it ends 9 bytes later at offset 73, indicated by reference numeral 386. Thus W1, a data file modified by a user write request, may replace the original data file located at offset 64 in the original system file 332. As shown, W1 is 9 bytes long, because it begins at offset zero (390) of the written file 360 and ends at offset 9 (392) of the written file 360. The starting position of W1 in the modified file (offset 64 in the illustrated example) is determined from the information stored in the header 366, that is, from the relationship between the written file 360 and the original file 332. The portion W2 is also included in the modified file 380; it begins at offset 1032 (the natural end of the file, indicated by reference numeral 372) and logically extends the file by 100 bytes. Again, the length of W2 is determined from the positioning information in the header 366. The new end point of the file is indicated by reference numeral 388.
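The offsets in this example can be checked with a short sketch that overlays the two writes on the 1032-byte original and lists the resulting segments of the logical modified file. The `overlay` function and its tuple layout are assumptions made for illustration, and it assumes the writes are sorted by logical offset and do not overlap one another.

```python
def overlay(original_length, writes):
    """Return (source, offset in source, length) segments of the modified file.

    `writes` is a list of (logical_offset, length) pairs, sorted by offset and
    non-overlapping, stored back to back in the written file in that order.
    """
    segments, cursor, written_pos = [], 0, 0
    for offset, length in writes:
        if offset > cursor:  # untouched original data before this write
            segments.append(("original", cursor, offset - cursor))
        segments.append(("written", written_pos, length))
        written_pos += length
        cursor = offset + length
    if cursor < original_length:  # untouched tail of the original, if any
        segments.append(("original", cursor, original_length - cursor))
    return segments

# W1: 9 bytes at offset 64; W2: 100 bytes at offset 1032 (the natural end of file).
SEGMENTS = overlay(1032, [(64, 9), (1032, 100)])
TOTAL_LENGTH = sum(length for _, _, length in SEGMENTS)
```

The four segments are 64, 9, 959 and 100 bytes long, so the modified file is logically 1132 bytes, 100 more than the original.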
It will be appreciated that although to be logics newly-built and represent the user-modified version of source document for revised file 380, the up-to-date data of representing with file 360 that write not are as a part of actual stored of source document 332.But, discussion just as mentioned in the above, the up-to-date data storing that writes writes the privileged site of data in the sign of storage medium.In this manner, the integrality of original backup data file is maintained, and allows the user to write significantly simultaneously volume is installed, because they may be initial position or network drive.
The file of revising 380 comprises 382, and this 382 comprises the filec descriptor of representing revised file.With reference to the accompanying drawings 20, explain the embodiment that understands filec descriptor 400.Filec descriptor 400 comprises the title part 402 of the filename that indicates revised file 380 and indicates the security 404 of the allowance attribute of revised file 380.Filec descriptor 400 also comprises the designator that has comprised source document 332 and writes numerous data divisions of the designator of file 360, is stored in each part source document and the data that write in the file to obtain.By the connection tabulation of designator given in the Continuous Tracking filec descriptor 400, draw the expression formula of revised file 380.
Referring to FIG. 19 and FIG. 20, there is illustrated one specific example of a file descriptor for the modified file. A first data field 406 contains a pointer to the location of the first data file in the modified file 380, which is located at offset zero bytes, as indicated by reference numeral 408 in FIG. 19. A subsequent field 410 indicates the length of the data file located by the pointer 406. In the illustrated example, the length is 64 bytes, as can be seen from FIG. 19 (the data lies between offset zero, reference numeral 408, and offset 64 bytes, reference numeral 384). A next field 412 indicates that the next data file in the modified file 380 is W1, as shown in FIG. 19. Accordingly, a pointer 414 indicates that the data corresponding to W1 is stored at offset zero of the newly written file 360 (reference numeral 390 in FIG. 19). A length field 416 indicates that W1 is 9 bytes long, as can also be seen in FIG. 19: W1 lies between offset 64 (reference numeral 384) and offset 73 (reference numeral 386) in the modified file 380. A next field 418 indicates that the next data file in the modified file 380 is a data file from the original system file 332. The pointer in field 420 indicates that this data file is located at offset 73 in the modified file 380, as shown by reference numeral 386 in FIG. 19. Field 422 indicates that the length of the data file is 959 bytes, as can also be seen from FIG. 19. A next field 424 indicates that the following data file is W2. Again, the pointer in field 426 indicates the location of W2, namely offset 9 of the newly written file 360, as can be seen from FIG. 19. A length field 428 indicates that W2 is 100 bytes long, and a next field 430 contains a null indicating that W2 is the last data file in the modified file 380, as shown in FIG. 19. Thus, the file descriptor 400 contains a "roadmap" indicating the structure of the modified file 380 and the location of the data included in the modified file 380.
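The "roadmap" lends itself to random-access reads: a request for any byte range of the modified file can be translated into reads from the original file and the written file. The sketch below assumes a flat list of extents in place of the linked list, and the byte values in `stores` are placeholders chosen so that the source of each byte is visible.

```python
# (source, offset in source, length), mirroring fields 406 through 430.
ROADMAP = [
    ("original", 0, 64),
    ("written", 0, 9),     # W1
    ("original", 73, 959),
    ("written", 9, 100),   # W2
]

def read_range(roadmap, stores, start, size):
    """Return `size` bytes of the logical modified file beginning at `start`."""
    out = bytearray()
    pos = 0  # logical offset at which the current extent begins
    for source, src_off, length in roadmap:
        seg_end = pos + length
        lo, hi = max(start, pos), min(start + size, seg_end)
        if lo < hi:  # the requested range overlaps this extent
            begin = src_off + (lo - pos)
            out += stores[source][begin: begin + (hi - lo)]
        pos = seg_end
    return bytes(out)

# Placeholder contents: "o" for original bytes, "1" for W1, "2" for W2.
stores = {"original": b"o" * 1032, "written": b"1" * 9 + b"2" * 100}
```

For example, `read_range(ROADMAP, stores, 60, 20)` spans the end of the first original extent, all of W1 and the start of the second original extent, without ever touching the tape-format stream sequentially.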
Volume restore application and method have been described the form random access I/O system of above-mentioned representative continuous tape formatted data to be fit to, for example NFS or CIFS.Connect the listing file descriptor, for example filec descriptor 400, can be used for continuous magnetic tape format data are converted to the random access data, this is that data file by writing down the special tar of each part source is in the position of storage medium, for example, and the position in the relevant tar source of each one data file other data files in the tar source finish.In addition, according to an embodiment, volume restore application may be included as the preparation that the expression that data become (for example, writing) magnetic tape format (for example tar) is done, so that the general fashion visit data that can above describe of backup/restore application.According to an embodiment, the instant recovery application program comprises the instrument that produces virtual cartridge, and with the file system software relevant mode of this instrument above to describe come appropriate formatization with leader tape head, pad, data and file mark.In another embodiment, volume restore application is connected with file system software with the newly-built middle as mentioned virtual cartridge of discussing, and described virtual cartridge comprises the up-to-date file that writes and revise.
It will be appreciated that although aspects of the invention, such as the synthetic full back-up application, the end-user restore application and the volume restore application, are described herein primarily in terms of software, these and other aspects may alternatively be implemented in software, hardware or firmware, or in any combination thereof. Thus, for example, embodiments of the invention may comprise any computer-readable medium (for example, a computer memory, a floppy disk, a compact disk, a tape and so on) encoded with a computer program (that is, a plurality of instructions) which, when executed at least in part on a processor in the storage system, performs the functions of the synthetic full back-up application and/or the end-user restore application, as described in detail above.
In general, embodiments and aspects of the invention include a storage system and methods that emulate a conventional tape back-up system but can provide enhanced functionality, for example the ability to create synthetic back-ups and to allow end users to browse and restore backed-up files. It will be appreciated, however, that aspects of the invention may be used for purposes other than the back-up of computer data. Because the storage system of the invention can store vast amounts of data economically, and the stored data can be accessed randomly rather than sequentially, at hard-disk access times, embodiments of the invention may find application outside traditional back-up storage systems. For example, embodiments of the invention may be used to store video or audio data representing a wide selection of movies and music, and to provide video and/or audio on demand.
Having thus described several aspects of at least one embodiment of the invention, it will be appreciated that various alterations, modifications and improvements will readily occur to those of ordinary skill in the art. Such alterations, modifications and improvements are intended to be part of this disclosure and to fall within the scope of the invention. Accordingly, the foregoing description and drawings are by way of example only.

Claims (11)

1. A method comprising the steps of:
mounting a data volume on a host computer, the data volume comprising at least one data file, the data file corresponding to a most recently backed-up version of the at least one data file stored on a backup storage system; and
storing, on the backup storage system, data corresponding to a second version of the at least one data file that is more recent than the most recently backed-up version of the at least one data file stored on the backup storage system, while preserving the most recently backed-up version of the at least one data file.
2. The method of claim 1, further comprising the step of:
linking the most recently backed-up version of the at least one data file and the second version of the at least one data file.
3. The method of claim 1, further comprising the step of:
creating a data structure that identifies both the most recently backed-up version of the at least one data file and the second version of the at least one data file.
4. The method of claim 3, wherein the second version of the at least one data file is a modified version of the most recently backed-up version of the at least one data file.
5. The method of claim 1, wherein the step of mounting the data volume comprises performing one of an NFS mount and a CIFS mount.
6. The method of claim 1, wherein the step of mounting the data volume comprises creating a file descriptor containing metadata relating to the most recently backed-up version of the at least one data file, the metadata including a pointer identifying the storage location, on a backup storage medium, of the most recently backed-up version of the at least one data file.
7. A backup storage system comprising:
a backup storage medium for storing a backup data set; and
a controller comprising at least one processor configured to execute a series of instructions that perform the method of claim 1.
8. The backup storage system of claim 7, wherein the backup data set is a synthetic full back-up data set.
9. A computer-readable medium encoded with a plurality of instructions that, when executed on at least one processor, perform the method of claim 1.
10. The computer-readable medium of claim 9, wherein the processor is included in a backup storage system.
11. A computer-readable medium encoded with a plurality of instructions and having a data structure stored thereon, the data structure comprising:
a first identifier that uniquely identifies a system file corresponding to a backup data set comprising at least one data file; and at least one second identifier that identifies a respective storage location, on a storage medium, at which the most recent version of each of the at least one data file in the backup data set is stored.
CN 200480030746 2003-09-30 2004-09-30 Emulated storage system supporting instant volume restore Pending CN1997972A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US50732903P 2003-09-30 2003-09-30
US60/507,329 2003-09-30
US10/911,987 2004-08-05

Publications (1)

Publication Number Publication Date
CN1997972A true CN1997972A (en) 2007-07-11

Family

ID=38252236

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200480030746 Pending CN1997972A (en) 2003-09-30 2004-09-30 Emulated storage system supporting instant volume restore

Country Status (1)

Country Link
CN (1) CN1997972A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102023903B (en) * 2009-09-10 2012-12-19 联想(北京)有限公司 Version management method and device for data backup
CN110955564A (en) * 2019-12-30 2020-04-03 深圳探科技术有限公司 Data disaster recovery system based on block chain technology
CN113391960A (en) * 2021-08-18 2021-09-14 深圳市中科鼎创科技股份有限公司 Technical method and system for rapidly realizing recovery of operating system

Similar Documents

Publication Publication Date Title
CN100483365C (en) Emulated storage system
US20050108486A1 (en) Emulated storage system supporting instant volume restore
US8386733B1 (en) Method and apparatus for performing file-level restoration from a block-based backup file stored on a sequential storage device
US7596713B2 (en) Fast backup storage and fast recovery of data (FBSRD)
US6880051B2 (en) Method, system, and program for maintaining backup copies of files in a backup storage device
US7308463B2 (en) Providing requested file mapping information for a file on a storage device
CN101258493B (en) System and method for performing a search operation within a sequential access data storage subsystem
CN1489737B (en) Virtual tape storage system and method
EP1734451B1 (en) File name generation apparatus
US20080027998A1 (en) Method and apparatus of continuous data protection for NAS
US20070214384A1 (en) Method for backing up data in a clustered file system
CN101784996A (en) Emulated storage system
JP2005031716A (en) Method and device for data backup
CN101939737A (en) Scalable de-duplication mechanism
US7073038B2 (en) Apparatus and method for implementing dynamic structure level pointers
KR20060080239A (en) Emulated storage system supporting instant volume restore
JP2007527572A5 (en)
CN1997972A (en) Emulated storage system supporting instant volume restore
US20100325116A1 (en) Data library optimization
JP2000227868A (en) Backup acquiring method and computer system
US20230393948A1 (en) Storage system and method of restoring storage system
JP3288856B2 (en) Electronic information filing equipment
JP2002169728A (en) Backup method and computer system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1110128

Country of ref document: HK

C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Open date: 20070711

REG Reference to a national code

Ref country code: HK

Ref legal event code: WD

Ref document number: 1110128

Country of ref document: HK