CN109407975A - Data writing method, compute node, and distributed storage system

Info

Publication number: CN109407975A
Application number: CN201811095443.8A
Authority: CN
Inventors: 何春雄
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2018-09-19
Filing date: 2018-09-19
Publication date: 2019-03-01
Anticipated expiration: 2038-09-19
Also published as: CN109407975B

Abstract

In a distributed storage system, a compute node receives a write command that carries target data and queries, according to a label in the write command, the target partition corresponding to the write command. The compute node then queries the multiple hard disk IDs corresponding to the target partition; the hard disks corresponding to the target partition include hard disks in the selected state and hard disks in the alternative state. The target data is stored to the hard disks in the selected state and is not stored to the hard disks in the alternative state.

Description

Data writing method, compute node, and distributed storage system

Technical field

The present invention relates to the storage field, and in particular to distributed storage technology.

Background

In the prior art, a distributed storage system is divided into multiple fault domains (usually at the granularity of a server or a cabinet). Only one copy can be stored in the same fault domain, which prevents the failure of a single fault domain from making multiple copies inaccessible. The system therefore requires the number of fault domains to be greater than or equal to the number of copies. For example, a three-copy scenario needs at least three fault domains to guarantee that two copies are never placed in the same fault domain. When a fault domain fails, the data in that fault domain can be rebuilt on other fault domains to keep the number of copies complete.

In the prior art, once a storage pool has been created, it is difficult to adjust the number of copies of the storage pool afterwards. For example, a three-copy distributed storage system can only store data as three copies; the number of copies can be neither increased nor reduced.

To force a reduction in the number of copies, fault domains have to be removed one by one until the number of fault domains equals the number of copies, and then one more fault domain is forcibly removed. Because of the mutual-exclusion rule between copies, the data cannot be rebuilt, and the number of copies is thereby reduced. Taking a reduction from three copies to two copies as an example: first, the total number of fault domains of the distributed storage system is reduced to three; data is then stored as three copies; one fault domain is then forcibly removed so that only two fault domains remain. Each of the two remaining fault domains holds one copy, so the number of copies has been reduced from three to two. As can be seen, this scheme is complicated to operate, requires forcibly removing a fault domain, and is highly restrictive.

Summary of the invention

In a first aspect, the present invention provides an embodiment of a data writing method for storing data in a distributed storage system. The distributed storage system includes a compute node and multiple storage nodes, and each storage node includes hard disks. The method includes: the compute node receives a write command that carries target data and a target label, the target label being used to identify the target data; the compute node queries the target partition corresponding to the target label; the compute node queries the multiple hard disk IDs corresponding to the target partition, where the multiple hard disks corresponding to the target partition include hard disks in the selected state and hard disks in the alternative state, the hard disks in the selected state can be used to store copies of the target data and include a primary hard disk and secondary hard disks, and the hard disks in the alternative state are not used to store copies of the target data; the compute node sends the target data and a selected hard disk list to the storage node where the primary hard disk is located, where the selected hard disk list includes the secondary hard disk IDs and does not include the alternative hard disk IDs; the storage node where the primary hard disk is located stores the target data in the primary hard disk and, according to the secondary hard disk IDs, sends the target data to the storage nodes where the secondary hard disks are located; and the storage node where each secondary hard disk is located stores the target data in its local secondary hard disk.

Based on this scheme, the number of stored copies of the target data no longer has to equal the number of hard disks corresponding to the target partition, which makes the system more flexible.

In a first possible implementation of the first aspect, after the storage node where a secondary hard disk is located stores the target data in its local secondary hard disk, the method further includes: updating the state of an alternative hard disk corresponding to the target partition to selected; and obtaining the target data from the primary hard disk or a secondary hard disk that has stored the target data, and storing the obtained target data to the newly selected hard disk.

Based on this scheme, the ratio of selected hard disks to alternative hard disks can be adjusted, and the number of copies is adjusted accordingly.

In a second possible implementation of the first aspect, the target data may be a data block, and the target label is the combination of a logical unit number identifier (LUN ID) and a logical block address (LBA).

This scheme provides a specific example of the target data and the target label.

In a third possible implementation of the first aspect, the target data may be a key-value pair, and the target label is the key.

In a fourth possible implementation of the first aspect, querying the target partition corresponding to the target label specifically includes: calculating the hash value of the target label, and querying the target partition corresponding to the target label according to the correspondence between hash values and partitions.

This scheme provides a specific way of querying the target partition.

In a fifth possible implementation of the first aspect, the states of the multiple hard disks corresponding to the target partition are relative to the target partition.

This scheme makes clear that the state of a hard disk is relative to a partition. The same hard disk may have different states for two different partitions.

In a second aspect, an embodiment of a data writing method is provided for writing data to a distributed storage system. The distributed storage system includes a compute node and multiple storage nodes, and each storage node includes hard disks. The method includes:

The compute node receives a write command that carries target data and a target label, the target label being used to identify the target data; the compute node queries the target partition corresponding to the target label; the compute node queries the multiple hard disk IDs corresponding to the target partition, where the multiple hard disks corresponding to the target partition include hard disks in the selected state and hard disks in the alternative state, the hard disks in the selected state can be used to store copies of the target data, and the hard disks in the alternative state are not used to store copies of the target data; and the compute node, according to the IDs of the hard disks in the selected state among the multiple hard disk IDs corresponding to the target partition, sends the target data to the storage nodes where the hard disks in the selected state are located.

Based on this scheme, the number of stored copies of the target data no longer has to equal the number of hard disks corresponding to the target partition, which makes the system more flexible. In this scheme, the compute node sending the target data to the storage nodes where the hard disks in the selected state are located covers two cases. The first case is similar to the scheme of the first aspect: the compute server sends the target data to the storage server where the primary hard disk is located, that storage server sends the target data to the storage servers where the secondary hard disks are located, and those servers then store it. The second case does not distinguish between primary and secondary hard disks: according to the selected hard disks corresponding to the target partition, the compute server sends the target data directly to the storage servers where the selected hard disks are located for storage.

In a first possible implementation of the second aspect, after the storage node where a secondary hard disk is located stores the target data in its local secondary hard disk, the method further includes: updating the state of an alternative hard disk corresponding to the target partition to selected; and obtaining the target data from the primary hard disk or a secondary hard disk that has stored the target data, and storing the obtained target data to the newly selected hard disk.

This scheme can update the ratio of selected hard disks to alternative hard disks.

The present invention also provides embodiments of a distributed storage system and of a compute node, which have the effects of the corresponding methods above.

Brief description of the drawings

Fig. 1 is a topology diagram of an embodiment of the distributed storage system;

Fig. 2 is a schematic diagram of an embodiment of the correspondence between partitions and hard disk states;

Fig. 3 is a schematic diagram of another embodiment of the correspondence between partitions and hard disk states;

Fig. 4 is a schematic diagram of another embodiment of the correspondence between partitions and hard disk states;

Fig. 5 is a flow chart of an embodiment of writing data to the distributed storage system;

Fig. 6 is a schematic diagram of another embodiment of the correspondence between partitions and hard disk states.

Detailed description of embodiments

In a distributed storage system, the same data is stored repeatedly on different storage servers. The data on each storage server is called a copy, and this way of protecting data is called multi-copy, also known as mirroring (mirror). The data here is, for example, a data block or an object. A data block is the data unit of block storage, for example in a storage area network (SAN); an object is the data unit of object storage, for example cloud object storage or key-value (KV) storage.

Referring to Fig. 1, host 11 communicates with the distributed storage system. The distributed storage system includes storage server 12, storage server 13, storage server 14, and storage server 15, and also includes compute server 16, compute server 17, and metadata server 18; there may be more of each kind of server (not shown). A compute server receives the data sent by host 11 and calculates the partition corresponding to the data. The hard disks corresponding to each partition are stored in metadata server 18. In embodiments of the present invention, metadata server 18 additionally stores the state of the hard disks corresponding to each partition: only hard disks in the selected state are used to store data, and hard disks in the alternative state are not used to store data. The metadata server manages the partition-to-hard-disk mapping table and the hard disk states. By querying metadata server 18, compute server 16 can learn which storage servers the data needs to be sent to (the storage servers where the selected hard disks are located). A storage server directly or indirectly receives the data sent by the compute server and stores it in a local selected hard disk. The data stored on each selected hard disk is called a copy of the data sent by the host.

Both compute server 16 and compute server 17 may receive data blocks from the host. After a compute server receives a data block sent by host 11, it queries the IDs of the hard disks used to store the data block. According to the query result, the data block is stored in hard disks of storage server 12, storage server 13, and storage server 14. The data blocks stored on these storage servers are identical, and each can be called a copy (copy 121, copy 131, and copy 141). Because the total number of copies is 3, this storage mode is called 3-copy.

It should be noted that, in other embodiments, two or all three of the storage server, compute server, and metadata server may be integrated; for example, the same server may have the functions of both a storage server and a compute server. Because the essence of the technology does not change, embodiments of the present invention do not describe this case separately.

In the multi-copy data protection mode, a single storage server can be used as the minimum unit for storing a copy, as in Fig. 1; that is, each storage server is called a fault domain. For the same data block, each storage server stores at most one copy, and the failure of any one storage server affects only the copy stored on that storage server, not the copies on other storage servers. Besides a storage server, a hard disk, rack, equipment room, or data center can also be used as the minimum unit for storing a copy.
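The rule that each fault domain holds at most one copy of a data block can be illustrated with a minimal Python sketch. The helper name `place_copies` and the server names are assumptions made for illustration only; the patent does not prescribe how fault domains are chosen.

```python
def place_copies(fault_domains, copy_count):
    """Pick one distinct fault domain per copy (hypothetical helper)."""
    # Each fault domain may hold at most one copy of the same data block,
    # so there must be at least as many fault domains as copies.
    if copy_count > len(fault_domains):
        raise ValueError("need at least as many fault domains as copies")
    return list(fault_domains[:copy_count])


# Three copies spread over four storage servers used as fault domains.
placement = place_copies(["server12", "server13", "server14", "server15"], 3)
```

Taking the first domains in list order is only the simplest possible choice; a real system would balance load across domains.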

Embodiments of the present invention reduce the number of copies according to current business demand, so that the capacity after the reduction satisfies current business demand. The reduction does not degrade performance through data reconstruction, and data reconstruction is efficient. A scheme for increasing the number of data copies is also provided. Specifically, embodiments of the present invention establish, through partitions, the correspondence between the data to be stored and the hard disks that store the data. In this embodiment, a maximum number of copies exists, and the hard disks corresponding to a partition are given different states by marking or other means; the states of a hard disk are selected and alternative. Within the range of the maximum number of copies, the number of copies of the data can be increased or reduced by changing the ratio between selected hard disks and alternative hard disks. When all the hard disks corresponding to a partition are selected hard disks (there is no alternative hard disk), the maximum number of copies of the partition equals the number of selected hard disks of the partition.

Specifically, the distributed storage system has a maximum number of copies per partition. A mapping between each partition and its hard disks is established and recorded in a mapping table, which is also called the partition routing table. The mapping table is stored in metadata server 18 in multi-copy form. In embodiments of the present invention, hard disk states can be marked; the states are selected and alternative. Copies can be stored on a hard disk in the selected state and cannot be stored on a hard disk in the alternative state. The total number of hard disks in the selected state plus the total number of hard disks in the alternative state equals the maximum number of copies. Hard disk states can be recorded in the mapping table together with the mapping, or recorded separately.
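A minimal sketch of one mapping-table row and the counting rule stated above follows. The class name `PartitionEntry`, its field names, and the state strings are illustrative assumptions; the patent does not specify a data layout.

```python
from dataclasses import dataclass, field

SELECTED = "selected"
ALTERNATIVE = "alternative"


@dataclass
class PartitionEntry:
    """One row of the partition routing table (illustrative layout)."""
    partition_id: int
    # Hard disk ID -> state of that hard disk for this partition.
    disk_states: dict = field(default_factory=dict)

    def selected_count(self):
        return sum(1 for s in self.disk_states.values() if s == SELECTED)

    def alternative_count(self):
        return sum(1 for s in self.disk_states.values() if s == ALTERNATIVE)


MAX_COPIES = 4
entry = PartitionEntry(
    1, {211: SELECTED, 221: SELECTED, 231: SELECTED, 241: ALTERNATIVE}
)
# Rule from the text: selected + alternative = maximum number of copies.
invariant_holds = entry.selected_count() + entry.alternative_count() == MAX_COPIES
```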

For the multiple hard disks corresponding to the same partition, an order among the hard disks can be set, and the order in which hard disk states are changed is determined by that order. For example, the order of the hard disks corresponding to the same partition can be recorded as a linked list. The hard disk at the head of the linked list is the primary hard disk, and the remaining hard disks are secondary hard disks. Taking Fig. 2 as an example, each partition corresponds to 4 hard disks, so when all hard disks are in the selected state (not shown), each piece of data can be stored as 4 copies. For example, partition 1 corresponds to hard disk 211, hard disk 221, hard disk 231, and hard disk 241; the arrows in Fig. 2 describe the positions of these 4 hard disks in the linked list. For partition 1, hard disk 211 is at the head of the linked list and hard disk 241 is at the tail; hard disk 211 is the primary hard disk, and hard disk 221, hard disk 231, and hard disk 241 are secondary hard disks. Similarly, partition 2 corresponds to hard disk 251, hard disk 261, hard disk 271, and hard disk 211, and partition 3 corresponds to hard disk 291, hard disk 301, hard disk 311, and hard disk 211. Fig. 2 also shows that hard disk 211 is associated with all three partitions: for partition 1, hard disk 211 is in the selected state, whereas for partition 2 and partition 3 it is in the alternative state. In other words, the hard disk state is not an attribute of the hard disk itself but a parameter relative to a partition. The hard disk state describes whether the hard disk is in the selected state or the alternative state for a particular partition.
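The point that a state belongs to a (partition, hard disk) pair rather than to the hard disk alone can be sketched as below. The dictionary layout and the helper names `primary_of` and `state_of` are assumptions for illustration; Python lists stand in for the linked list, with the head of the list as the primary hard disk.

```python
# Ordered hard disk lists per partition (head = primary), following Fig. 2.
partition_disks = {
    1: [211, 221, 231, 241],
    2: [251, 261, 271, 211],
    3: [291, 301, 311, 211],
}

# State is keyed by (partition ID, hard disk ID), not by hard disk ID alone.
disk_state = {
    (p, d): "selected" for p, disks in partition_disks.items() for d in disks
}
disk_state[(2, 211)] = "alternative"
disk_state[(3, 211)] = "alternative"


def primary_of(partition_id):
    """The hard disk at the head of the list is the primary hard disk."""
    return partition_disks[partition_id][0]


def state_of(partition_id, disk_id):
    """Look up the state of a hard disk relative to one partition."""
    return disk_state[(partition_id, disk_id)]
```

Here hard disk 211 is selected for partition 1 and alternative for partitions 2 and 3, matching the example in the text.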

As shown in Fig. 2, when the number of copies of the data needs to be reduced to 3, then for partition 1 the hard disk 241 at the tail of the linked list, located on server 24, can be set to the alternative state, and the remaining hard disks are set to the selected state; for partition 2, the hard disk 211 at the tail of the linked list can be set to the alternative state; for partition 3, the hard disk 211 at the tail of the linked list can be set to the alternative state. Further, to increase the number of copies, for example from 3 copies to 4 copies, one hard disk in the alternative state can be updated to the selected state. For example, if hard disk 241 of partition 1 in Fig. 2 is changed from the alternative state to the selected state, partition 1 can support 4-copy storage. In a distributed system, changing the hard disk states for all partitions keeps the number of copies of every partition consistent.
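Changing the number of copies by flipping states at the tail of the ordered list might look like the following sketch. The function name `set_copy_count` and the dictionary-based state store are assumptions, not part of the patent.

```python
def set_copy_count(disks, states, target):
    """Keep the first `target` disks selected and mark the rest alternative.

    `disks` is the ordered list for one partition (head = primary);
    `states` maps hard disk ID to "selected" or "alternative" for that partition.
    """
    if not 1 <= target <= len(disks):
        raise ValueError("target must be between 1 and the maximum number of copies")
    for position, disk_id in enumerate(disks):
        states[disk_id] = "selected" if position < target else "alternative"
    return states


disks = [211, 221, 231, 241]
states = {d: "selected" for d in disks}
set_copy_count(disks, states, 3)  # 4 copies -> 3: hard disk 241 becomes alternative
after_shrink = dict(states)
set_copy_count(disks, states, 4)  # 3 copies -> 4: hard disk 241 becomes selected
```

This sketch only changes states. When the number of copies is increased, existing data additionally has to be copied to the newly selected hard disk, as step S37 describes.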

With the technical solution provided by this embodiment, the number of copies supported by the distributed storage system can be adjusted flexibly by changing hard disk states. The fewer the copies, the higher the storage resource utilization; the more the copies, the higher the data reliability. Embodiments of the present invention therefore make it possible to find a better balance between storage resource utilization and data reliability.

With reference to Fig. 3, a specific data writing method is described below. The method can be applied to the distributed storage system shown in Fig. 1.

Step S31: the operating system (OS) of host 11 sends a write request (also called a write I/O request) to any compute server of the distributed storage system through the small computer system interface (SCSI) or the Internet small computer system interface (iSCSI). That compute server is called the first compute server (target compute server). For ease of description, the write request issued in this step is called the first write request or target write request. The first write request carries the data to be written to the distributed storage system, called the first data (target data). The first write request also carries the label of the first data; labels distinguish different data. It should be noted that different data normally correspond to different labels, although in a few cases different data are allowed to have the same label.

For object storage, the label can be the name of the object. In key-value storage, the first data is the value, and the key can be used as the label.

For block storage, the write address can be used as the label. The write address can be: logical unit number identifier (LUN ID) + logical block address (LBA). The LUN ID describes which LUN the first data needs to be written to; the LBA describes the specific position in the first LUN (target LUN) at which the first data is written, also called the offset. For ease of description, unless otherwise stated, the present invention is described using block storage as an example, and key-value storage is not described separately.

In this step, host 11 can be a device outside the distributed storage system or a device inside it. For example, it can be integrated with any one of the storage servers shown in Fig. 1; in that scenario, host 11 can be a fusion of a physical host and a storage server, or a storage server with virtual machine functions. Furthermore, if the host and the storage server are integrated, the storage server can generate the first data and the write address itself, and the action of sending the first write request to the distributed storage system can be omitted. Correspondingly, the "receive SCSI command" operation in step S32 below is no longer needed, and the storage server can assemble a key directly from the LUN ID and the LBA.

Step S32: the first compute server receives the SCSI write request by running virtual block store (VBS) management software. (As mentioned above, the first compute server may be integrated into a storage server; in that case the device that receives the SCSI command can be called a storage server, but its role is equivalent to a combination of compute server and storage server.) The first compute server assembles a key from the LUN ID and the LBA of the first data in the received SCSI command, and then performs a hash operation on the key to obtain a hash value. A hash function is a one-way function that turns an input of arbitrary length into an output of fixed length, and for which it is infeasible in practice to find two different inputs that give the same output. Virtual block store is sometimes also rendered as virtual block storage, and is sometimes called virtual block system (virtual storage system, VBS).

The combination of LUN ID and LBA uniquely determines a data block, so the two can be concatenated to form the label of the data block. One way to concatenate them is to put the LUN ID first and the LBA after it to form a key. For key-value storage, the key can be used directly as the label.
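A minimal sketch of the concatenation follows. The function name `make_block_label` and the fixed widths (4 bytes for the LUN ID, 8 bytes for the LBA, big-endian) are assumptions chosen for illustration; the patent only states that the LUN ID comes first and the LBA second.

```python
def make_block_label(lun_id: int, lba: int) -> bytes:
    """Concatenate LUN ID (first) and LBA (second) into one key.

    The 4-byte and 8-byte big-endian widths are illustrative assumptions.
    """
    return lun_id.to_bytes(4, "big") + lba.to_bytes(8, "big")


label = make_block_label(lun_id=7, lba=4096)
```

Fixed-width fields keep the concatenation unambiguous: two different (LUN ID, LBA) pairs can never produce the same byte string.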

The VBS software module can perform volume metadata management. VBS provides a distributed storage access point service through the SCSI or iSCSI interface, so that the first compute server can access distributed storage resources through VBS. VBS communicates point-to-point with the storage servers, which allows VBS to access the hard disks of these storage servers concurrently. One VBS process can be deployed on each storage server, and the VBS processes on multiple nodes form a VBS cluster. Deploying multiple VBS processes on a storage server can also improve I/O performance.

Step S33: the first compute server determines the partition ID corresponding to the first data according to the correspondence between the result of the hash operation and the partitions. For ease of description, the partition ID corresponding to the first data (target data) is called the target partition ID.

Based on a distributed hash table (DHT), the distributed storage system divides the hash space (0 to 2^32) into N parts (for example, N equal parts). Each part is one partition, each partition owns multiple hash values, and each partition has a unique ID (partition ID). Each partition corresponds to multiple hard disks. In other words, the hash function yields the correspondence: hash value of the key -> partition -> multiple hard disks. The correspondence between partitions and hard disks is kept in the mapping table, which can be stored in the metadata server. The multiple hard disks can be divided into a primary hard disk and secondary hard disks; alternatively, there may be no primary/secondary distinction, and the hard disks are peers.
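The mapping from a label to a partition can be sketched as follows. The patent does not name a hash function or a value of N; SHA-256 truncated to 32 bits and N = 1024 are assumptions made only so that the example runs.

```python
import hashlib

HASH_SPACE = 2 ** 32     # size of the hash space given in the text
PARTITION_COUNT = 1024   # N, an assumed value


def partition_for(label: bytes) -> int:
    """Map a label to a partition ID by splitting the hash space into N equal parts."""
    digest = hashlib.sha256(label).digest()
    # Truncate to 32 bits so the hash value falls in [0, 2^32).
    hash_value = int.from_bytes(digest[:4], "big")
    # Each partition owns a contiguous range of HASH_SPACE / N hash values.
    return hash_value // (HASH_SPACE // PARTITION_COUNT)
```

Because the partition is derived from the label alone, any compute server can locate the partition for a piece of data without coordination; only the partition-to-hard-disk step needs the mapping table.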

Step S34: using the target partition ID, the first compute server queries the mapping table for the IDs of the hard disks corresponding to the target partition ID and the states of those hard disks. The mapping table records the correspondence between each partition ID and hard disk IDs, as well as the hard disk states. The states are: selected and alternative. Hard disk states can be set by an administrator through a program. A hard disk in the selected state is called a selected hard disk, and a hard disk in the alternative state is called an alternative hard disk. Data can be written to a selected hard disk and cannot be written to an alternative hard disk. It should be noted that the hard disk state here is not a physical state of the hard disk but a logical setting that marks whether a copy of the data may be written to the hard disk when data is written. Hard disk IDs correspond to storage server addresses, so once the hard disk IDs of the target partition are obtained, the corresponding storage server addresses can be obtained. The hard disk states are kept in the metadata server.
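The lookup in step S34 can be sketched with in-memory tables. `MAPPING_TABLE`, `DISK_TO_SERVER`, the server names, and `lookup_write_targets` are hypothetical stand-ins for the metadata server's tables and query interface, which the patent does not specify.

```python
# Hypothetical in-memory stand-in for the mapping table on the metadata server.
MAPPING_TABLE = {
    1: [(211, "selected"), (221, "selected"), (231, "selected"), (241, "alternative")],
}
# Hypothetical table from hard disk ID to storage server address.
DISK_TO_SERVER = {211: "server21", 221: "server22", 231: "server23", 241: "server24"}


def lookup_write_targets(partition_id):
    """Return (hard disk ID, server address) pairs for disks that may receive a copy."""
    return [
        (disk_id, DISK_TO_SERVER[disk_id])
        for disk_id, state in MAPPING_TABLE[partition_id]
        if state == "selected"
    ]


targets = lookup_write_targets(1)
```

The alternative hard disk 241 stays in the mapping table but is filtered out of the write targets.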

Step S35: the VBS module of the first compute server sends, to the storage server where the primary hard disk is located, the first data and the IDs of the hard disks in the selected state (these may include the primary hard disk ID and the secondary hard disk IDs, or only the secondary hard disk IDs). After receiving the first data, the storage server where the primary hard disk is located writes the first data to the primary hard disk according to the primary hard disk ID, and sends the first data, according to the secondary hard disk IDs, to the storage servers where the selected secondary hard disks are located. It should be noted that the first compute server does not send the first data to the storage servers where hard disks in the alternative state are located.

Taking partition 1 of Fig. 3 as an example, storage server 21, where primary hard disk 211 is located, performs the following operations: it writes the first data to primary hard disk 211; it sends the first data to storage server 22, where hard disk 221 is located, and the request carries the address of storage server 22 and the ID of hard disk 221; and it sends the first data to storage server 23, where hard disk 231 is located, and the request carries the address of storage server 23 and the ID of hard disk 231. Although hard disk 241 is also a hard disk corresponding to partition 1, it is in the alternative state, so storage server 21 does not send the first data to storage server 24, where hard disk 241 is located.

Step S36: the storage server where each secondary hard disk is located obtains the first data and the secondary hard disk ID from the server where the primary hard disk is located, and stores the first data in its secondary hard disk according to that ID. That is, storage server 22 writes the first data to hard disk 221, and storage server 23 writes the first data to hard disk 231. Storage server 24 does not receive the first data and does not write the first data to hard disk 241.

After receiving the first data, storage server 22 and storage server 23 send write-success response messages to storage server 21. Storage server 21 then sends a write-success response message to the host, so that the host learns that the first data has been written successfully.
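Steps S35 and S36 together with the success responses can be sketched as below. The `StorageServer` class, the `primary_write` function, the server names, and the use of direct method calls in place of network requests are all simplifying assumptions for illustration.

```python
class StorageServer:
    """Toy storage server holding local hard disks as dictionaries (illustrative only)."""

    def __init__(self, name, disk_ids):
        self.name = name
        self.disks = {disk_id: {} for disk_id in disk_ids}

    def write_local(self, disk_id, label, data):
        self.disks[disk_id][label] = data
        return True  # stands in for the write-success response


def primary_write(servers, disk_to_server, selected_disks, label, data):
    """Write to the primary (first ID), then forward to each selected secondary."""
    primary_id, *secondary_ids = selected_disks
    acks = [servers[disk_to_server[primary_id]].write_local(primary_id, label, data)]
    for disk_id in secondary_ids:
        acks.append(servers[disk_to_server[disk_id]].write_local(disk_id, label, data))
    return all(acks)  # success is reported to the host only if every copy was written


disk_to_server = {211: "s21", 221: "s22", 231: "s23", 241: "s24"}
servers = {name: StorageServer(name, [d]) for d, name in disk_to_server.items()}
# Hard disk 241 is alternative for partition 1, so it is absent from the selected list.
ok = primary_write(servers, disk_to_server, [211, 221, 231], "lun1:lba0", b"first data")
```

After the call, hard disks 211, 221, and 231 each hold one copy, and hard disk 241 remains empty.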

Step S37: the metadata server sets one or more of the alternative hard disks corresponding to the target partition to selected hard disks. The metadata server instructs that the copy be copied from a server that already stores it to the hard disk whose state has been updated from alternative to selected. Specifically, the metadata server instructs the server where the copy is located to send the copy to the server where the newly selected hard disk is located, and that server stores the copy in the newly selected hard disk; alternatively, the metadata server instructs the server where the newly selected hard disk is located to obtain the copy from a server that stores the target data and to store it in the local newly selected hard disk.

Referring to Fig. 4, hard disk 241 of partition 1 changes from the alternative state to the selected state. The copy of the first data then needs to be copied from hard disk 211, hard disk 221, or hard disk 231 to hard disk 241, so that the first data is kept in the distributed storage system as four copies.
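The state change and the copy of existing data in step S37 can be sketched as follows. The function name `promote_and_backfill` and the dictionaries standing in for hard disk contents are assumptions; a real system would transfer the copy between storage servers over the network.

```python
def promote_and_backfill(states, disk_contents, disk_id):
    """Mark `disk_id` selected, then copy existing data from a disk that already holds it."""
    # Pick the source before changing the state so that it is a disk
    # that already stores the copies.
    source_id = next(d for d, s in states.items() if s == "selected")
    states[disk_id] = "selected"
    # Copy every stored item so the newly selected disk holds a full copy.
    disk_contents[disk_id].update(disk_contents[source_id])


states = {211: "selected", 221: "selected", 231: "selected", 241: "alternative"}
disk_contents = {211: {"k": b"v"}, 221: {"k": b"v"}, 231: {"k": b"v"}, 241: {}}
promote_and_backfill(states, disk_contents, 241)
```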

Setting an alternative hard disk of a partition to a selected hard disk, or setting a selected hard disk of a partition to an alternative hard disk, can be configured manually in the metadata server by an administrator, or executed automatically by the metadata server according to a policy. For example, when it is detected that the data on the storage servers of the distributed system has become hotter, alternative hard disks are set to selected hard disks to increase the number of copies; conversely, the number of copies is reduced.
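One possible shape for such a policy is sketched below. The function name `desired_copy_count`, the access-rate metric, both thresholds, and the floor of one copy are all assumptions; the patent only says that hotter data leads to more copies and cooler data to fewer.

```python
def desired_copy_count(current, access_rate, max_copies, hot=1000.0, cold=10.0):
    """Suggest a copy count from an access-rate metric (thresholds are assumed)."""
    if access_rate >= hot:
        return min(current + 1, max_copies)  # hot data: promote one alternative disk
    if access_rate <= cold:
        return max(current - 1, 1)           # cold data: demote one selected disk
    return current
```

The result stays within the range from 1 to the maximum number of copies, so it can always be realized by changing hard disk states alone.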

Steps S38-S43: after one or more of the alternative hard disks corresponding to the target partition have been set to selected hard disks, the host sends a second write request to the distributed storage system. The second write request carries second data and a second label, and the second label is used to identify the second data. A second compute server of the distributed storage system receives the second write request and stores the second data. Steps S38-S43 are similar to steps S31-S36 and are therefore not described in detail; refer to steps S31-S36. Steps S38-S43 differ from steps S31-S36 in the following ways: the write request is different, so the stored data is also different; in addition, after step S37 there are fewer alternative hard disks (possibly none) and more selected hard disks, so the number of copies stored for the data increases accordingly. For example, after hard disk 241 of partition 1 in Fig. 4 changes from the alternative state to the selected state, storage server 21 not only writes the second data to primary hard disk 211 but also sends the second data and the ID of hard disk 221 to storage server 22, sends the second data and the ID of hard disk 231 to storage server 23, and sends the second data and the ID of hard disk 241 to storage server 24. The second data is finally stored in hard disk 211, hard disk 221, hard disk 231, and hard disk 241. In addition, the compute server involved in steps S38-S43 and the one involved in steps S31-S36 can be the same compute server or different compute servers. To distinguish the roles, the compute server in steps S31-S36 is called the first compute server, and the compute server in steps S38-S43 is called the second compute server.

In other embodiments, there may be a scheme that replaces steps S38-S43: the metadata server sets a selected hard disk of the target partition to the alternative state. Correspondingly, after one or more of the selected hard disks of the target partition have been set as alternative hard disks, the server where the hard disk that has become alternative is located is notified to delete its local copy. Referring to Fig. 5, compared with Fig. 2, the hard disk 231 of partition 1 changes from the selected state to the alternative state. The first copy in hard disk 231 can then be deleted. After the selected hard disk has been set to the alternative state, a new write command for partition 1 is received; to distinguish it from the first write command and the second write command, this new write command may be called the third write command. The third write command carries third data and a third label. The number of copies stored for the third data corresponds to the number of selected hard disks, that is, the third data is stored in hard disk 211 and hard disk 221. Because the storage process of the third data is similar to steps S31-S36, it is not repeated here.
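The alternative scheme above, in which a selected hard disk becomes alternative and its local copy is deleted, can be sketched as follows. The function names `demote_and_clean` and `copy_targets` are hypothetical, `disk_contents` is assumed to hold only the copies belonging to this partition, and the initial states (hard disks 211, 221 and 231 selected, hard disk 241 alternative) are an assumption about Fig. 2.

```python
SELECTED, ALTERNATIVE = "selected", "alternative"


def demote_and_clean(states, primary, disk_id, disk_contents):
    """Set a selected slave hard disk to alternative and delete its local copies."""
    if disk_id == primary:
        raise ValueError("the primary hard disk is not demoted in this sketch")
    if states.get(disk_id) != SELECTED:
        raise ValueError("hard disk is not in the selected state")
    states[disk_id] = ALTERNATIVE
    # Stands in for notifying the hosting server to delete its local copy.
    disk_contents[disk_id].clear()


def copy_targets(states):
    """Hard disks that receive a copy of newly written data: selected disks only."""
    return [d for d, s in states.items() if s == SELECTED]


states = {211: SELECTED, 221: SELECTED, 231: SELECTED, 241: ALTERNATIVE}
disk_contents = {211: {"label-1": b"first data"},
                 221: {"label-1": b"first data"},
                 231: {"label-1": b"first data"},
                 241: {}}

demote_and_clean(states, primary=211, disk_id=231, disk_contents=disk_contents)
targets_for_third_data = copy_targets(states)
```

After the call, hard disk 231 holds no copy, and the third data would be written only to hard disks 211 and 221.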

For all partitions in the same distributed storage system, the number of copies may be kept consistent, that is, the ratio of selected hard disks to alternative hard disks is the same for every partition.

Claims

1. A data writing method for writing data to storage nodes of a distributed storage system, the distributed storage system comprising a compute node and the storage nodes, each storage node comprising a hard disk, the method comprising:

The compute node receives a write command, the write command carries target data and a target label, and the target label is used to identify the target data;

The compute node queries the target partition corresponding to the target label;

The compute node queries the multiple hard disk IDs corresponding to the target partition, wherein the multiple hard disks corresponding to the target partition comprise hard disks in a selected state and a hard disk in an alternative state, the hard disks in the selected state are available for storing copies of the target data and comprise a primary hard disk and a slave hard disk, and the hard disk in the alternative state is not used for storing a copy of the target data;

The compute node sends the target data and a selected hard disk list to the storage node where the primary hard disk is located, wherein the selected hard disk list includes the slave hard disk ID and does not include the alternative hard disk ID;

The storage node where the primary hard disk is located stores the target data in the primary hard disk and, according to the slave hard disk ID, sends the target data to the storage node where the slave hard disk is located;

The storage node where the slave hard disk is located stores the target data in its local slave hard disk.

2. The method according to claim 1, wherein after the storage node where the slave hard disk is located stores the target data in its local slave hard disk, the method further comprises:

Updating the state of the hard disk in the alternative state corresponding to the target partition to the selected state;

Obtaining the target data from the primary hard disk or the slave hard disk that has stored the target data, and storing the obtained target data in the newly selected hard disk.

3. The method according to claim 1, wherein:

The target data is a data block;

The target label is a combination of a logical unit number identifier (LUN ID) and a logical block address (LBA).

4. The method according to claim 1, wherein querying the target partition corresponding to the target label specifically comprises:

Calculating a hash value of the target label and, according to the correspondence between the hash value and the target partition, querying the target partition corresponding to the target label.

5. The method according to claim 1, wherein:

The state of each of the multiple hard disks corresponding to the target partition is specific to the target partition.

6. A data writing method for writing data to a distributed storage system, the distributed storage system comprising a compute node and multiple storage nodes, each storage node comprising a hard disk, the method comprising:

The compute node queries the multiple hard disk IDs corresponding to the target partition, wherein the multiple hard disks corresponding to the target partition comprise a hard disk in a selected state and a hard disk in an alternative state, the hard disk in the selected state is available for storing a copy of the target data, and the hard disk in the alternative state is not used for storing a copy of the target data;

The compute node, according to the IDs of the hard disks in the selected state among the multiple hard disk IDs corresponding to the target partition, sends the target data to the storage node where the hard disk in the selected state is located.

7. The method according to claim 6, wherein after the storage node where the slave hard disk is located stores the target data in its local slave hard disk, the method further comprises:

Updating the state of the alternative hard disk corresponding to the target partition to the selected state;

8. A distributed storage system, the distributed storage system comprising a compute node, a metadata node and multiple storage nodes, each storage node comprising a hard disk, wherein:

The compute node is configured to:

Receive a write command, wherein the write command carries target data and a target label, and the target label is used to identify the target data;

Query the target partition corresponding to the target label;

Query, from the metadata node, the multiple hard disk IDs corresponding to the target partition, wherein the multiple hard disks corresponding to the target partition comprise hard disks in a selected state and a hard disk in an alternative state, the hard disks in the selected state are available for storing copies of the target data and comprise a primary hard disk and a slave hard disk, and the hard disk in the alternative state is not used for storing a copy of the target data;

Send the target data and a selected hard disk list to the storage node where the primary hard disk is located, wherein the selected hard disk list includes the slave hard disk ID and does not include the alternative hard disk ID;

The storage node where the primary hard disk is located is configured to:

Store the target data in the primary hard disk and, according to the slave hard disk ID, send the target data to the storage node where the slave hard disk is located;

The storage node where the slave hard disk is located is configured to:

Store the target data in its local slave hard disk.

9. The distributed storage system according to claim 8, wherein the metadata node is further configured to:

Update the state of the alternative hard disk to the selected state, so as to increase the number of selected hard disks corresponding to the target partition;

Instruct that the target data be copied, from the primary hard disk or the slave hard disk that has stored the target data, to the newly selected hard disk.

10. The distributed storage system according to claim 8, wherein:

The target data is a data block;

11. The distributed storage system according to claim 8, wherein:

12. A compute node, wherein the compute node and multiple storage nodes belong to a distributed storage system, each storage node comprises a hard disk, the compute node comprises a processor, a memory and an interface, a computer program is stored in the memory, and the compute node, by running the computer program, performs the following:

Receiving a write command through the interface, wherein the write command carries target data and a target label, and the target label is used to identify the target data;

Querying the target partition corresponding to the target label;

Querying the multiple hard disk IDs corresponding to the target partition, wherein the multiple hard disks corresponding to the target partition comprise a hard disk in a selected state and a hard disk in an alternative state, the hard disk in the selected state is available for storing a copy of the target data, and the hard disk in the alternative state is not used for storing a copy of the target data;

According to the IDs of the hard disks in the selected state among the multiple hard disk IDs corresponding to the target partition, sending the target data through the interface to the storage node where the hard disk in the selected state is located.