CN103067519A

CN103067519A - Method and device of data distribution storage under heterogeneous platform

Info

Publication number: CN103067519A
Application number: CN2013100012682A
Authority: CN
Inventors: 张伟
Original assignee: Shenzhen Guangdao High Technology Co Ltd
Current assignee: Shenzhen Guangdao Digital Technology Co.,Ltd.
Priority date: 2013-01-04
Filing date: 2013-01-04
Publication date: 2013-04-24
Anticipated expiration: 2033-01-04
Also published as: CN103067519B

Abstract

The invention relates to a method and a device for distributed data storage under a heterogeneous platform. The method comprises the following steps: a client sends a request to a server side, and after the server side verifies the request, metadata is sent to a first server; the first server creates or updates meta-information in a local database according to the metadata; the client uploads first data, and the first server, after receiving the first data, splits it into a plurality of data blocks that are saved to a plurality of local servers; the data blocks are read, and each data block is split into a plurality of second data according to a set number of bytes; and each byte of the second data is split into a plurality of third data according to a set number of bits, the third data at corresponding positions of the bytes are concatenated in order to obtain a plurality of fourth data, and the fourth data are stored in a plurality of storage devices respectively. The method and the device offer safer data, simple computation and fast processing.

Description

Method and device for distributed data storage under a heterogeneous platform

Technical field

The present invention relates to the field of data storage, and more particularly to a method and a device for distributed data storage under a heterogeneous platform.

Background technology

With the rapid development of computer technology and the steady deepening of informatization, the sharp growth of data has driven rapid development of the storage industry. More and more storage devices of different architectures need to work together, and more and more storage devices belonging to different departments need to share data, yet realizing cross-department distributed data storage and management remains difficult. Data protection here means saving data on a multi-department heterogeneous storage platform. At present, however, locally stored data can be seen by every department regardless of whether it has permission, so the data may be tampered with or leaked, which makes it insecure.

To meet rapidly growing data processing demands, Google proposed the Google File System (GFS), whose architecture is shown in Fig. 1. GFS shares many design goals with traditional distributed file systems, such as performance, scalability, reliability and availability. In addition, GFS treats component failure as the norm rather than the exception. A GFS deployment comprises hundreds or even thousands of storage machines assembled from inexpensive commodity parts and is accessed by a considerable number of clients at the same time. Given the quantity and quality of these components, at any given time some components may fail to work, and some may be unable to recover from their current failure state. Causes include application bugs, operating system bugs, human error, and failures of disks, memory, connectors, networks and power supplies. Mechanisms for continuous monitoring, error detection, fault tolerance and automatic recovery must therefore be integrated into GFS.

By common standards, Google's files are huge; files of several GB are very common. Each file typically contains many application objects, such as web documents. When a rapidly growing data set of several TB made up of hundreds of millions of objects has to be processed, managing hundreds of millions of KB-sized small files is unwise, even though some file systems support it. Design assumptions and parameters, such as I/O operations and block size, therefore all need to be reconsidered.

Most file modifications in GFS append data to the end of the file rather than overwriting existing data; random writes to a file hardly exist. Once written, a file is only read, and usually read sequentially. A large amount of data has these characteristics, for example: very large data sets scanned by data analysis programs; continuous data streams generated by running applications; archived data; and intermediate data generated by one machine and processed by another, either simultaneously or later. For this access pattern on massive files, caching data blocks on the client is pointless, and the append operation is the main consideration for performance optimization and atomicity guarantees.

Co-designing the applications and the file system API (Application Programming Interface) increases the flexibility of the whole system. For example, GFS relaxes the requirements of its consistency model, which eases the demands that the file system places on applications and greatly simplifies the design of GFS. GFS also introduces an atomic record append operation, so that multiple clients can append concurrently without extra synchronization to guarantee data consistency.

GFS stores data in a distributed manner across a large number of inexpensive machines, but under the GFS architecture the data is visible to all machines without any distinction of permissions, which makes it insecure. This scheme therefore cannot satisfy the data sharing requirements between different departments.

Bluesky mainly explores a way of bridging the cloud and local applications as a network file service. Its focus is on replacing a traditional network file service with cloud services: a proxy-based solution converts requests into the corresponding cloud storage API calls over the Internet. A cloud-based storage service can thus provide local file system functionality while gaining the scalability and cost benefits of a third-party service. The architecture of Bluesky is shown in Fig. 2. As can be seen from Fig. 2, the front end offers clients the standard NFS (Network File System) and CIFS (Common Internet File System) interfaces, through which clients upload and download files. The local disk serves as a cache that holds part of the data and swaps data in and out according to some policy. Data is then stored by calling the interface of a public cloud storage platform, but it must be encrypted before upload to keep it secure. The Bluesky system contains a Cleaner whose main role is garbage collection; it can act on the local disk used as a cache and also on the cloud storage platform. Because the data on the cloud platform is encrypted, however, the Cleaner cannot tell which data should be removed, so either only part of the data can be encrypted or the Cleaner must work with the key, which compromises data security.

Bluesky solves the long delays caused by limited bandwidth when a public cloud storage platform is used directly, and to some extent remedies the security problems of public cloud storage platforms. Its main shortcomings are as follows. First, because the local cache holds only part of the data, Bluesky cannot perform data deduplication, so the volume of data uploaded to the public cloud storage platform is larger and upload time increases. Second, because data is managed and stored locally with a log-structured file system, data management is inconvenient and the data volume is larger. Finally, protecting data by encryption greatly increases computational complexity and preprocessing time.

Summary of the invention

The technical problem to be solved by the present invention is to address the above defects of the prior art, namely insecure data, complex computation and slow processing, by providing a method and a device for distributed data storage under a heterogeneous platform that offer safer data, simpler computation and faster processing.

The technical solution adopted by the present invention to solve this technical problem is to construct a method for distributed data storage under a heterogeneous platform, comprising the following steps:

A) a client sends a request to a server side, and after verification by the server side, metadata is sent to a first server;

B) the first server creates or updates meta-information in a local database according to the metadata;

C) the client uploads first data, and the first server, after receiving the first data, splits it into a plurality of data blocks and saves them respectively to a plurality of local servers;

D) the plurality of data blocks are read, and each data block is split into a plurality of second data according to a set number of bytes;

E) the data of each byte of the second data is split into a plurality of third data according to a set number of bits, the third data at corresponding positions of the bytes are concatenated in order to obtain a plurality of fourth data, and the plurality of fourth data are stored respectively in a plurality of storage devices.

In the method for distributed data storage under a heterogeneous platform of the present invention, step D) further comprises:

D1) obtaining the first address of the data block and judging whether the data block is readable; if so, executing step D2); otherwise, exiting the operation;

D2) reading the data block sequentially from the first address, and splitting each data block into a plurality of second data according to the set number of bytes.

In the method for distributed data storage, described setting byte number is four bytes under heterogeneous platform of the present invention.

In the method for distributed data storage under a heterogeneous platform of the present invention, step E) further comprises:

E1) splitting the data of each byte of the second data into a plurality of third data according to the set number of bits;

E2) in turn, keeping the third data at the corresponding position in each byte unchanged and clearing the third data at the remaining positions to zero;

E3) shifting the bytes holding the third data at the corresponding position left or right by the corresponding number of bits and superposing them to obtain a plurality of fourth data;

E4) storing the plurality of fourth data respectively in a plurality of predefined storage devices.

In the method for distributed data storage, described setting figure place is one or two or four under heterogeneous platform of the present invention.

The invention further relates to a device for implementing the above method for distributed data storage under a heterogeneous platform, comprising:

Request and data transmission module: used to make a client send a request to a server side and, after verification by the server side, send metadata to a first server;

Information creation and update module: used to make the first server create or update meta-information in a local database according to the metadata;

Data upload and saving module: used to make the client upload first data and to make the first server, after receiving the first data, split it into a plurality of data blocks and save them respectively to a plurality of local servers;

Data reading and splitting module: used to read the plurality of data blocks and split each data block into a plurality of second data by bytes;

Data splitting and storage module: used to split the data of each byte of the second data into a plurality of third data according to a set number of bits, concatenate the third data at corresponding positions of the bytes in order to obtain a plurality of fourth data, and store the plurality of fourth data respectively in a plurality of storage devices.

The device of the method for distributed data storage under the above-mentioned heterogeneous platform of realization of the present invention, described data read and split module and further comprise:

Address acquisition and judging unit: be used for obtaining the first address of described blocks of data, and judge whether described blocks of data is readable, and when not readable, withdraw from this operation;

Blocks of data reads and split cells: be used for beginning to read successively described blocks of data from described first address, and described each blocks of data is split into a plurality of the second data by described setting byte number.

The device of the method for distributed data storage under the above-mentioned heterogeneous platform of realization of the present invention, described setting byte number are four bytes.

The device of the method for distributed data storage under the above-mentioned heterogeneous platform of realization of the present invention, described Data Division and memory module further comprise:

The position split cells: the data that are used for respectively each byte that will described the second data split into a plurality of the 3rd data by the setting figure place;

Data zero clearing unit: be used for respectively successively the 3rd data of described each byte correspondence position are remained unchanged, with the 3rd data zero clearing of all the other positions;

Data displacements superpositing unit: be used for that respectively the byte data at the 3rd data place of described correspondence position carried out moving to left of corresponding figure place or move to right after and superpose and obtain a plurality of the 4th data;

Distributed store unit: be used for storing respectively described a plurality of the 4th data into predefined a plurality of memory device.

The device of the method for distributed data storage under the above-mentioned heterogeneous platform of realization of the present invention, described setting figure place are one or two or four.

Implementing the method and device for distributed data storage under a heterogeneous platform of the present invention has the following beneficial effects. After receiving the first data, the first server splits it into a plurality of data blocks and saves them respectively to a plurality of local servers; the data blocks are then read and each is split into a plurality of second data according to the set number of bytes; the data of each byte of the second data is then split into a plurality of third data according to the set number of bits, and the third data at corresponding positions of the bytes are concatenated in order to obtain a plurality of fourth data. No computationally complex encryption or decryption is needed, so the computation is simple and processing is fast. Because the plurality of fourth data are stored respectively in a plurality of storage devices (storage devices under a plurality of heterogeneous platforms), each storage device holds only part of the data (data that has been bit-split and recombined) and cannot restore the original data on its own, so the data is safer.

Description of drawings

Fig. 1 is an architecture diagram of GFS in the background art;

Fig. 2 is an architecture diagram of Bluesky in the background art;

Fig. 3 is a flow chart of the method in an embodiment of the method and device for distributed data storage under a heterogeneous platform of the present invention;

Fig. 4 is a detailed flow chart of splitting a data block into second data in the embodiment;

Fig. 5 is a detailed flow chart of splitting and recombining the second data and storing the results separately in the embodiment;

Fig. 6 is a schematic diagram of splitting the second data by four bits in the embodiment;

Fig. 7 is a schematic diagram of splitting the second data by two bits in the embodiment;

Fig. 8 is a structural diagram of the device in the embodiment;

Fig. 9 is an overall system architecture diagram of distributed data storage under a heterogeneous platform in the embodiment;

Fig. 10 compares test results for the encryption time and the bit-splitting time of the present invention for files of different sizes.

Embodiment

To help those of ordinary skill in the art understand and implement the present invention, embodiments of the invention are further described below with reference to the accompanying drawings.

In this embodiment of the method and device for distributed data storage under a heterogeneous platform, the flow chart of the method is shown in Fig. 3. In Fig. 3, the method comprises:

Step S01: the client sends a request to the server side, and after verification by the server side, metadata is sent to the first server. In this step, after the client sends a request (to upload or download data) to the server side and the server side has verified the client's identity, the metadata is sent to the first server. In this embodiment, the first server is a metadata server. Metadata here is data that describes data and its environment; in other words, it is the attribute or configuration information of the data, such as how many parts the data is divided into, the size of the data and where the data is stored. The server side (the local server side) includes the first server.
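The kind of metadata described in step S01 can be pictured as a small record. The following is only a sketch: the class name `FileMetadata` and all field names are hypothetical and do not come from the patent, which only says that metadata covers the number of parts, the size and the storage location of the data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileMetadata:
    """Hypothetical metadata record; every field name is an illustrative assumption."""
    file_name: str
    size_bytes: int                 # size of the data
    block_count: int                # how many parts the data is divided into
    block_locations: List[str] = field(default_factory=list)  # where each part is stored


# Example record that a client might send to the metadata server after verification.
meta = FileMetadata(
    file_name="report.bin",
    size_bytes=4096,
    block_count=4,
    block_locations=["srv-a", "srv-b", "srv-c", "srv-d"],
)
```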

Step S02: the first server creates or updates meta-information in the local database according to the metadata. In this step, the first server creates or updates the meta-information in the local database according to the above metadata, that is, it updates the attribute information of the data, such as the size or file name of the file to be processed.
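The create-or-update behaviour of step S02 can be sketched with an in-memory SQLite database. The patent does not name a database or a schema, so the table `meta_info`, its columns and the function `create_or_update_meta` are assumptions made for illustration.

```python
import sqlite3


def create_or_update_meta(conn, file_name, size_bytes):
    """Insert a row for file_name, or overwrite the existing row (assumed schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS meta_info ("
        "file_name TEXT PRIMARY KEY, size_bytes INTEGER)"
    )
    # INSERT OR REPLACE creates the row when it is missing and replaces it otherwise.
    conn.execute(
        "INSERT OR REPLACE INTO meta_info (file_name, size_bytes) VALUES (?, ?)",
        (file_name, size_bytes),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
create_or_update_meta(conn, "report.bin", 1024)   # first call creates the entry
create_or_update_meta(conn, "report.bin", 2048)   # second call updates it
rows = conn.execute("SELECT file_name, size_bytes FROM meta_info").fetchall()
```

After both calls the table holds a single row for `report.bin` with the updated size.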

Step S03: the client uploads the first data, and the first server, after receiving the first data, splits it into a plurality of data blocks and saves them respectively to a plurality of local servers. In this step, the client uploads the first data, and the first server receives and processes it: it splits the first data into a plurality of data blocks and saves these data blocks respectively to a plurality of local servers (specifically, to the storage units of the local server cluster). The first data in this embodiment is large. Storing it on a single server would place a heavy load on that server, and the storage space of one server may even be insufficient, so the need for large storage space could not be met. To reduce the system load, in this embodiment the first data is stored in a distributed manner across a plurality of servers (that is, in the local server cluster).
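Step S03 can be sketched as cutting the uploaded data into blocks and spreading them over the local servers. The patent specifies neither the block size nor the placement policy, so the fixed `block_size` and the round-robin assignment below are assumptions, as are the function and server names.

```python
def split_into_blocks(first_data: bytes, block_size: int):
    """Cut the first data into consecutive blocks of at most block_size bytes."""
    return [first_data[i:i + block_size] for i in range(0, len(first_data), block_size)]


def assign_blocks(blocks, servers):
    """Assign blocks to servers round-robin (assumed policy), keeping each block's index."""
    placement = {name: [] for name in servers}
    for index, block in enumerate(blocks):
        placement[servers[index % len(servers)]].append((index, block))
    return placement


blocks = split_into_blocks(b"abcdefghij", block_size=4)
placement = assign_blocks(blocks, ["server-1", "server-2"])
```

Keeping the block index alongside each block is one simple way to let the blocks be put back in order later; the metadata of step S01 could equally hold that information.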

Step S04: the plurality of data blocks are read, and each data block is split into a plurality of second data according to the set number of bytes. Because a data block is large, and to make the subsequent data splitting easier, in this step the data blocks are read from the above storage units and each is split into a plurality of second data according to the set number of bytes. Splitting a large data block into smaller second data keeps the computation simple and reduces processing time. In this embodiment, the set number of bytes is four, which makes the calculation more convenient and the processing time shorter. In other cases of this embodiment, the set number of bytes can be adjusted accordingly, for example to two bytes or another number of bytes.
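Splitting a block into four-byte second data might look like the sketch below. The patent does not say what happens when the block length is not a multiple of the set number of bytes; zero-padding the last group is an assumption made here, and `split_block` is a hypothetical name.

```python
def split_block(block: bytes, group_size: int = 4):
    """Split a block into groups of group_size bytes (the 'second data')."""
    remainder = len(block) % group_size
    if remainder:
        # Assumption: pad the final group with zero bytes so every group is full.
        block += b"\x00" * (group_size - remainder)
    return [block[i:i + group_size] for i in range(0, len(block), group_size)]


second_data = split_block(b"\x01\x02\x03\x04\x05")
```

With padding, the original length has to be recorded somewhere (for example in the metadata) so the padding can be dropped when the data is restored.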

Step S05: the data of each byte of the second data is split into a plurality of third data according to the set number of bits, the third data at corresponding positions of the bytes are concatenated in order to obtain a plurality of fourth data, and the plurality of fourth data are stored respectively in a plurality of storage devices. In this step, the data of each of the four bytes of the second data is split into a plurality of third data according to the set number of bits, which is one, two or four. For example, when the set number of bits is one, each byte is split into eight third data; when it is two, each byte is split into four third data; when it is four, each byte is split into two third data. The third data at the corresponding position in each byte of the second data are then concatenated in the order of the bytes they come from to obtain a plurality of fourth data, and the fourth data are stored respectively in storage devices under a plurality of heterogeneous platforms. Each storage device holds only partial data, and that data is the original data after splitting; the real data cannot be seen without authorization, which ensures data security. In addition, the method needs no encryption or decryption, which removes their complexity, keeps the computation simple, reduces processing time and makes processing fast.
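The relationship between the set number of bits and the number of third data per byte can be sketched as follows. The function name `split_byte` and the choice to list pieces starting from the most significant bits are assumptions made for this illustration.

```python
def split_byte(value: int, bits: int):
    """Split one byte into 8 // bits pieces of `bits` bits each, highest bits first."""
    if bits not in (1, 2, 4):
        raise ValueError("the set number of bits is one, two or four")
    mask = (1 << bits) - 1
    # Walk from the top of the byte down to bit 0 in steps of `bits`.
    return [(value >> shift) & mask for shift in range(8 - bits, -1, -bits)]


pieces_1 = split_byte(0x5C, 1)   # eight one-bit pieces
pieces_2 = split_byte(0x5C, 2)   # four two-bit pieces
pieces_4 = split_byte(0x5C, 4)   # two four-bit pieces
```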

For this embodiment, the above step S04 can be further refined; its detailed flow chart is shown in Fig. 4. In Fig. 4, step S04 further comprises:

Step S401: obtain the first address of the data block and judge whether the data block is readable. In this step, the first addresses of the data blocks are obtained, that is, the first addresses of the storage units that hold the data blocks, and it is judged whether each data block is readable, namely whether it is ready to be read. In practice, this takes the form of obtaining the data handle of the data to be bit-split (a data handle is a four-byte value) and judging whether the handle is valid. If the result is yes, step S403 is executed; if the result is no, step S402 is executed.

Step S402: exit the operation. This step is executed if the result of step S401 is no; the operation is exited.

Step S403: read the data block sequentially from the first address and split each data block into a plurality of second data according to the set number of bytes. This step is executed if the result of step S401 is yes. The data block is read sequentially from the first address, that is, starting from the first storage unit that holds the data block (equivalently, characters are read starting from the data handle), and the data block is split into a plurality of second data according to the set number of bytes. While the data block is being read, the method also keeps checking whether reading is complete, that is, whether the data in all storage units holding the data block has been read, or in other words whether the last storage unit has been reached. In this embodiment, the data block is split by four bytes, so each second datum is four bytes long. When the data block is split by four bytes, the subsequent bit splitting and recombination (merging) are fast and the computation is simple. In other cases of this embodiment, each data block can also be split by another number of bytes.
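Steps S401 to S403 can be sketched as a readability check followed by a read loop. Modelling the data handle as a Python binary stream is an assumption, as is the name `read_second_data`; the patent only describes a four-byte handle and a check that it is valid.

```python
import io


def read_second_data(stream, group_size=4):
    """Return the block as a list of group_size-byte groups, or None if unreadable."""
    # S401: check that the handle is usable before reading.
    if stream is None or not stream.readable():
        return None  # S402: exit the operation
    groups = []
    # S403: read sequentially from the start until no data is left.
    while True:
        chunk = stream.read(group_size)
        if not chunk:          # end of the block reached
            break
        groups.append(chunk)   # the last chunk may be shorter than group_size
    return groups


groups = read_second_data(io.BytesIO(b"abcdefgh"))
missing = read_second_data(None)
```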

For this embodiment, the above step S05 can be further refined; its detailed flow chart is shown in Fig. 5. In Fig. 5, step S05 further comprises:

Step S501: the data of each byte of the second data is split into a plurality of third data according to the set number of bits. In this step, the data of each of the four bytes of the second data is split into a plurality of third data according to the set number of bits, which is one, two or four. For example, when the set number of bits is one, each byte is split into eight third data; when it is two, each byte is split into four third data; when it is four, each byte is split into two third data. In other cases of this embodiment, the set number of bits can be adjusted according to actual needs.

Step S502: the third data at the corresponding position in each byte is kept unchanged in turn, and the third data at the remaining positions is cleared to zero. In this embodiment, the third data at corresponding positions of the bytes are to be spliced; that is, after each of the four bytes of the second data has been split, the third data at the same position in each byte are spliced together. When the data of each byte of the second data is split by four bits, each byte yields a first third datum and a second third datum. The first third data of the first, second, third and fourth bytes of the second data are connected in their original order, and likewise the second third data of the first, second, third and fourth bytes are connected in their original order. This splicing or connecting can also be regarded as a merge operation, and merging yields two 16-bit data. In this embodiment, the merge is carried out by shifting and superposition. In this step, the third data at the corresponding position in each byte is kept unchanged in turn and the third data at the remaining positions is cleared. For example, to connect the first third data of the four bytes in their original order, the first third datum in each of the four bytes is kept unchanged and the bits occupied by the second third datum in each byte are cleared; to connect the second third data of the four bytes in their original order, the second third datum in each of the four bytes is kept unchanged and the bits occupied by the first third datum in each byte are cleared.
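For the four-bit case, the keep-and-clear operation of step S502 amounts to a bitwise AND with a mask. The sketch below assumes the first third datum is the upper four bits of a byte and the second is the lower four bits; the names `HIGH_MASK`, `LOW_MASK` and `keep_position` are hypothetical.

```python
HIGH_MASK = 0xF0  # keeps the first third datum (upper four bits), clears the rest
LOW_MASK = 0x0F   # keeps the second third datum (lower four bits), clears the rest


def keep_position(group: bytes, mask: int) -> bytes:
    """Apply the same mask to every byte of a four-byte group."""
    return bytes(b & mask for b in group)


group = bytes([0x12, 0x34, 0x56, 0x78])
high_only = keep_position(group, HIGH_MASK)
low_only = keep_position(group, LOW_MASK)
```

For the sample group, `high_only` is the bytes 0x10, 0x30, 0x50, 0x70 and `low_only` is 0x02, 0x04, 0x06, 0x08.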

Step S503: the bytes holding the third data at the corresponding position are shifted left or right by the appropriate number of bits and superposed to obtain a plurality of fourth data. For ease of understanding, this step is described using the above split by four bits as an example. The data of the second byte (with its first third datum unchanged and the other bits cleared) is shifted right by four bits and superposed with the data of the first byte to obtain the first part; the data of the fourth byte (first third datum unchanged, other bits cleared) is shifted right by four bits and superposed with the data of the third byte to obtain the second part. The first part and the second part form the first fourth datum, which here is 16 bits long. The data of the first byte (with its second third datum unchanged and the other bits cleared) is shifted left by four bits and superposed with the data of the second byte to obtain the third part; the data of the third byte (second third datum unchanged, other bits cleared) is shifted left by four bits and superposed with the data of the fourth byte to obtain the fourth part. The third part and the fourth part form the second fourth datum. Fig. 6 is a schematic diagram of splitting each byte of the second data by four bits and splicing (merging or recombining) the results in this embodiment.
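The shift-and-superpose procedure for the four-bit split can be traced with a short sketch. It assumes that the byte being superposed onto has been masked in the same way as the shifted byte, and that the first part occupies the upper eight bits of each 16-bit fourth datum; the function name is hypothetical. Bitwise OR is used for the superposition, which gives the same result as addition here because the masked bits never overlap.

```python
def recombine_by_four_bits(group: bytes):
    """Turn a four-byte group into two 16-bit fourth data (four-bit split)."""
    b0, b1, b2, b3 = group
    # First third data (upper four bits): shift the second byte of each pair right by four.
    part1 = (b0 & 0xF0) | ((b1 & 0xF0) >> 4)
    part2 = (b2 & 0xF0) | ((b3 & 0xF0) >> 4)
    # Second third data (lower four bits): shift the first byte of each pair left by four.
    part3 = ((b0 & 0x0F) << 4) | (b1 & 0x0F)
    part4 = ((b2 & 0x0F) << 4) | (b3 & 0x0F)
    first_fourth = (part1 << 8) | part2    # assumed layout: part1 in the upper byte
    second_fourth = (part3 << 8) | part4
    return first_fourth, second_fourth


first_fourth, second_fourth = recombine_by_four_bits(bytes([0x12, 0x34, 0x56, 0x78]))
```

For the sample bytes 0x12, 0x34, 0x56, 0x78, the upper nibbles 1, 3, 5, 7 form 0x1357 and the lower nibbles 2, 4, 6, 8 form 0x2468.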

Step S504: the plurality of fourth data are stored respectively in a plurality of predefined storage devices. In this step, for example, the first fourth datum is stored in one storage device and the second fourth datum in another. That is, when the data of each byte is split by four bits, the first fourth datum, formed from the first four bits of every byte in the data, is saved on one storage platform, and the second fourth datum, formed from the last four bits, is saved on another storage platform. The storage devices in this embodiment are storage devices under heterogeneous platforms. A user can see only partial data, and what is seen is the data after splitting rather than the real data; without authorization the real data cannot be seen, which ensures data security.
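Step S504 can be pictured as appending each fourth datum to the store of its own device. In this sketch the devices are modelled as in-memory byte arrays, and the device names, the function `distribute` and the big-endian two-byte encoding of each 16-bit fourth datum are all assumptions; the patent does not describe the on-device format.

```python
def distribute(fourth_data_per_group, devices):
    """Append the i-th fourth datum of every group to the i-th device's store."""
    stores = {name: bytearray() for name in devices}
    for fourth_data in fourth_data_per_group:
        for name, value in zip(devices, fourth_data):
            stores[name] += value.to_bytes(2, "big")  # assumed 16-bit big-endian layout
    return stores


# Two groups, each already recombined into two 16-bit fourth data (four-bit split).
stores = distribute([(0x1357, 0x2468), (0xABCD, 0x0123)], ["platform-a", "platform-b"])
```

Each store then contains only one of the two interleaved halves of every original byte.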

Fig. 7 is a schematic diagram of splitting each byte of the second data by two bits and splicing (merging or recombining) the results in this embodiment. Take splicing the first third data of the first byte (byte 0), second byte (byte 1), third byte (byte 2) and fourth byte (byte 3) of the second data as an example. The first third data of these four bytes, "01", "00", "00" and "10", are kept unchanged and the remaining bits of each byte are cleared, so that the four bytes become "0100 0000", "0000 0000", "0000 0000" and "1000 0000" respectively. The last three are then shifted right by two, four and six bits respectively and added to the data of the first byte, giving the first fourth datum "01 00 00 10". The others are shifted and added on the same principle, giving the second fourth datum "01 11 10 01", the third fourth datum "11 11 11 10" and the fourth fourth datum "00 01 10 11". The four fourth data are then stored in different storage devices. In other words, each byte is split by two bits: the first two bits of every byte in the data are combined into the first fourth datum and saved on one storage platform, and the next two bits, the two after those, and the last two bits are each combined into a fourth datum and saved on different storage platforms. The number of bits used for splitting can of course be chosen according to the actual situation. This both realizes distributed storage of the data and ensures that the data seen by each department is not the real data, thereby protecting the privacy of user data.
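The Fig. 7 example can be reproduced with a short sketch. The four fourth data quoted above imply the input bytes 0x5C, 0x3D, 0x2E and 0x9B (the concatenation of each byte's four two-bit pieces). The function names `split_group` and `merge_group` are hypothetical, and the merge direction is an assumed inverse that the text does not spell out.

```python
def split_group(group: bytes, bits: int):
    """Build the fourth data: piece j of every byte, concatenated in byte order."""
    mask = (1 << bits) - 1
    fourth = []
    for pos in range(8 // bits):
        shift = 8 - bits * (pos + 1)      # where piece `pos` sits inside a byte
        value = 0
        for byte in group:
            value = (value << bits) | ((byte >> shift) & mask)
        fourth.append(value)
    return fourth


def merge_group(fourth, bits: int, group_len: int = 4) -> bytes:
    """Assumed inverse: rebuild the original bytes from all fourth data."""
    mask = (1 << bits) - 1
    out = [0] * group_len
    for pos, value in enumerate(fourth):
        shift = 8 - bits * (pos + 1)
        for i in range(group_len):
            piece = (value >> (bits * (group_len - 1 - i))) & mask
            out[i] |= piece << shift
    return bytes(out)


original = bytes([0x5C, 0x3D, 0x2E, 0x9B])
fourth_data = split_group(original, bits=2)
restored = merge_group(fourth_data, bits=2)
```

Here `fourth_data` equals 0x42, 0x79, 0xFE, 0x1B, which are the bit strings "01 00 00 10", "01 11 10 01", "11 11 11 10" and "00 01 10 11" from the example, and merging all four restores the original bytes. A single fourth datum carries only two of the eight bits of each original byte.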

This embodiment also relates to a device for implementing the above method; its structure is shown schematically in Fig. 8. In Fig. 8, the device comprises a request and data sending module 1, an information creation and update module 2, a data upload and saving module 3, a data reading and splitting module 4, and a data splitting and storage module 5. The request and data sending module 1 is used to have the client send a request to the server side and, after verification by the server side, send metadata to the first server. The information creation and update module 2 is used to have the first server create or update the meta-information in the local database according to the metadata. The data upload and saving module 3 is used so that, after the client uploads the first data, the first server receives the first data, splits them into a plurality of block data and saves the block data respectively in a plurality of local servers. The data reading and splitting module 4 is used to read the plurality of block data and split each block datum by bytes into a plurality of second data. The data splitting and storage module 5 is used to split the data of each byte of the second data into a plurality of third data according to the set number of bits, to connect the third data at corresponding positions of the bytes in sequence to obtain a plurality of fourth data, and to store the fourth data respectively in a plurality of storage devices. The data reading and splitting module 4 further comprises an address acquisition and judging unit 41 and a block data reading and splitting unit 42. The address acquisition and judging unit 41 is used to obtain the first address of the block data, judge whether the block data are readable, and quit the operation when they are not. The block data reading and splitting unit 42 is used to read the block data in sequence starting from the first address and split each block datum into a plurality of second data according to the set number of bytes. In this embodiment, the set number of bytes is four.
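The behaviour attributed to units 41 and 42 can be sketched in Python as follows. This is only an illustration under assumptions: the names `read_groups` and `GROUP_SIZE` are hypothetical, an in-memory `io.BytesIO` object stands in for a stored block, and the text does not say how a tail shorter than four bytes is handled, so the sketch simply yields it unchanged.

```python
import io

GROUP_SIZE = 4  # the set number of bytes named in this embodiment


def read_groups(stream, group_size=GROUP_SIZE):
    """Yield consecutive group_size-byte pieces ("second data") of one block."""
    if not stream.readable():
        return  # role of unit 41: quit when the block cannot be read
    stream.seek(0)  # start from the first address of the block
    while True:
        group = stream.read(group_size)  # role of unit 42: read the block in order
        if not group:
            break
        yield group  # a short final piece is passed through as-is (assumption)


block = io.BytesIO(b"\x5c\x3d\x2e\x9b\x01\x02")
groups = list(read_groups(block))
print(groups)
```

Reading lazily with a generator keeps memory use independent of block size, which is one reasonable choice for large blocks; the patent text does not prescribe it.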

In this embodiment, the data splitting and storage module 5 further comprises a bit splitting unit 51, a data clearing unit 52, a data shift and superposition unit 53, and a distributed storage unit 54. The bit splitting unit 51 is used to split the data of each byte of the second data into a plurality of third data according to the set number of bits. The data clearing unit 52 is used, for each position in turn, to keep the third data at the corresponding position of each byte unchanged and clear the third data at the remaining positions. The data shift and superposition unit 53 is used to shift the byte data containing the third data at the corresponding position left or right by the corresponding number of bits and superpose the results to obtain a plurality of fourth data. The distributed storage unit 54 is used to store the plurality of fourth data respectively in a plurality of predefined storage devices. In this embodiment, the set number of bits is one, two or four.
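A generalisation to the three set bit widths can be sketched in Python as follows. It is an illustrative sketch, not the device itself: `split_by_width` and the device labels are hypothetical names, and the code isolates each piece with a shift and a mask instead of the literal clear-then-shift order of units 52 and 53, which should give the same shares. Each share here is an integer holding `width` bits from every byte of the group.

```python
def split_by_width(group: bytes, width: int) -> list:
    """Return 8 // width shares; share k collects the k-th width-bit piece of every byte."""
    if width not in (1, 2, 4):
        raise ValueError("the embodiment names set widths of 1, 2 or 4 bits")
    mask = (1 << width) - 1
    shares = []
    for pos in range(8 // width):
        shift = 8 - width * (pos + 1)  # distance of this piece from the lowest bit
        share = 0
        for byte in group:
            piece = (byte >> shift) & mask    # roles of units 51/52: isolate one piece
            share = (share << width) | piece  # role of unit 53: shift and superpose
        shares.append(share)
    return shares


group = bytes([0x5C, 0x3D, 0x2E, 0x9B])
for width in (1, 2, 4):
    shares = split_by_width(group, width)
    # Role of unit 54: one share per storage device (labels are hypothetical).
    placement = {f"device_{k}": share for k, share in enumerate(shares)}
    print(width, placement)
```

A smaller width spreads each byte over more devices (eight for one bit, two for four bits), so the choice trades the number of storage platforms needed against how little of each byte any one platform holds.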

Fig. 9 is the overall system architecture diagram for distributed data storage under a heterogeneous platform in this embodiment; the client can interact with the server side, that is, upload or download data. Fig. 10 compares test results for the encryption time and the bit-splitting time of the present invention on files of different sizes. The encryption time and the bit-splitting time were measured separately for each file size, each reported time being the average of ten runs on the same data; the seven experimental data sources were 1 MB, 9.89 MB, 99.75 MB, 300 MB, 511 MB, 794 MB and 1035 MB in size. Fig. 10 shows that the performance of the bit-splitting mechanism is approximately ten times that of encryption.

In summary, this embodiment addresses the privacy protection of data on heterogeneous platforms, that is, the wish that no department should see the real data. It provides enterprises with a data management and storage platform based on a multi-department heterogeneous distributed storage architecture. Data that an enterprise stores in the system are distributed across the storage devices of different departments, and no department can see the true data without authorization. Because the data reside within the departments, the enterprise can make full use of the departments' existing storage resources for data storage while also reducing the cost of data storage and management. Specifically, after the client and the server side complete an interaction (the client uploads data, which the server receives, processes and saves in the local cluster; or the client downloads data, which the server obtains locally and sends to the client), the server also needs to monitor locally updated data and first perform bit splitting on them. That is, data in the server that are to be distributed to the heterogeneous platforms of different departments are first split bit by bit and then placed on the storage devices of the different departments. The data are thus stored in a distributed manner across the departments' heterogeneous platforms. Because each department holds only incomplete local data, even a department administrator cannot see the real data, which effectively solves the data privacy protection problem. The bit-splitting mechanism is simple and intuitive and readily achieves high performance. Preprocessing by bit splitting outperforms encryption algorithms, and uploading the data to the storage devices of different departments both exploits the departments' heterogeneous storage platforms, with concurrent access to multiple stores, and protects the privacy of user data.

In the prior art, data are preprocessed (for example, encrypted) before being transferred to the storage devices of different departments. Each department then also cannot see the original data kept locally, so user data privacy is likewise protected; however, encryption and decryption are computationally complex and perform poorly. The bit-splitting method of the present invention protects user data privacy, achieves higher performance, and can at the same time use storage resources under a plurality of different heterogeneous platforms.

It is worth noting that when a client wants to download data, the data on the different storage platforms must be split and merged correspondingly to obtain the true data; that is, the true data are recovered by the inverse of the original bit-splitting process, and the user then downloads the true data.
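The inverse process for the two-bit case can be sketched in Python as follows. This is a sketch under assumptions rather than the patented procedure: `merge_two_bit_shares` is a hypothetical name, and the input values are the four fourth data from the Fig. 7 example, taken to have been fetched back from four storage devices in their original order.

```python
def merge_two_bit_shares(shares) -> bytes:
    """Rebuild four original bytes from four 8-bit shares of a two-bit split."""
    if len(shares) != 4:
        raise ValueError("this sketch expects exactly four shares")
    group = bytearray(4)
    for pos, share in enumerate(shares):  # pos = which two-bit piece of each byte
        for index in range(4):            # index = which original byte
            piece = (share >> (6 - 2 * index)) & 0b11  # pull one piece out of the share
            group[index] |= piece << (6 - 2 * pos)     # put it back at its position
    return bytes(group)


# Fourth data from the Fig. 7 example: 01000010, 01111001, 11111110, 00011011.
fetched = [0b01000010, 0b01111001, 0b11111110, 0b00011011]
restored = merge_two_bit_shares(fetched)
print(restored.hex())
# prints 5c3d2e9b
```

Recovery needs every share and the order in which they were produced, which is consistent with the statement that a single department's storage alone does not reveal the true data.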

The above embodiments express only several implementations of the present invention. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the claims of the present invention. It should be noted that a person of ordinary skill in the art can make several variations and improvements without departing from the concept of the present invention, all of which fall within the protection scope of the present invention. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims

1. the method for distributed data storage under the heterogeneous platform is characterized in that, comprises the steps:

2. the method for distributed data storage under the heterogeneous platform according to claim 1 is characterized in that described step D) further comprise:

3. the method for distributed data storage under the heterogeneous platform according to claim 2 is characterized in that, described setting byte number is four bytes.

4. the method for distributed data storage to the described heterogeneous platform of 3 any one according to claim 1 is characterized in that described step e) further comprise:

5. the method for distributed data storage under the heterogeneous platform according to claim 4 is characterized in that, described setting figure place is one or two or four.

6. a device of realizing the method for distributed data storage under the heterogeneous platform as claimed in claim 1 is characterized in that, comprising:

7. the device of the method for distributed data storage under the above-mentioned heterogeneous platform of realization according to claim 6 is characterized in that described data read and split module and further comprise:

8. the device of the method for distributed data storage under the above-mentioned heterogeneous platform of realization according to claim 7 is characterized in that, described setting byte number is four bytes.

9. the device of the method for distributed data storage to the above-mentioned heterogeneous platform of the described realization of 8 any one according to claim 6 is characterized in that described Data Division and memory module further comprise:

10. the device of the method for distributed data storage under the above-mentioned heterogeneous platform of realization according to claim 9 is characterized in that, described setting figure place is one or two or four.