CN102968356A - Data processing method of cloud storage system - Google Patents

Data processing method of cloud storage system

Info

Publication number
CN102968356A
CN102968356A CN2011104569412A CN201110456941A CN102968356A CN 102968356 A CN102968356 A CN 102968356A CN 2011104569412 A CN2011104569412 A CN 2011104569412A CN 201110456941 A CN201110456941 A CN 201110456941A CN 102968356 A CN102968356 A CN 102968356A
Authority
CN
China
Prior art keywords
data
frame
data block
error correction
cloud storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011104569412A
Other languages
Chinese (zh)
Inventor
刘涛
阮昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Institute of Optics and Fine Mechanics of CAS
Original Assignee
Shanghai Institute of Optics and Fine Mechanics of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Institute of Optics and Fine Mechanics of CAS
Priority to CN2011104569412A
Publication of CN102968356A

Landscapes

  • Detection And Correction Of Errors (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

The invention relates to a data processing method of a cloud storage system. Data storage and data extraction in the cloud storage system are subjected to Reed-Solomon error-correction encoding and Reed-Solomon error-correction decoding, respectively. The disclosed method improves the security and recoverability of data in the cloud storage system, reduces the number of backup copies, saves storage space, and reduces cost.

Description

Data processing method of a cloud storage system
Technical field
The present invention relates to cloud storage systems, and in particular to a data processing method for a cloud storage system based on Reed-Solomon coding.
Background art
In the current era of rapid development, cloud storage has received great attention as both the infrastructure of the cloud and its most widespread application. In a cloud storage system, user data is stored in the cloud, and the storage nodes that make up the cloud are outside the user's control. User data may be mined and compared by unauthorized third parties, or maliciously tampered with.
Meanwhile, when one or more storage nodes are missing or fail (and as the cloud expands, the probability of storage node failure increases), user data in the cloud is very likely to be lost. This shows that the development of cloud storage urgently needs a security mechanism that can fully guarantee the integrity, privacy and reliability of user data.
At present, cloud storage technologies are all based on approaches similar to the HDFS (Hadoop Distributed File System) used in Hadoop (open-source cloud computing software). This technique divides a data file into several blocks of a set size and improves reliability by keeping complete backup copies of each block (for example, HDFS in Hadoop keeps 3 identical copies). Its drawback is that it wastes storage space.
In the Reed-Solomon error-correction coding method, the principle is to compute the remainder obtained when the polynomial of information code symbols is divided by the check-code generator polynomial. The formula is:
$F \bmod D = C$
where F is the raw data, D is the generator polynomial, C is the generated redundant error-correction data, and mod is the remainder operation.
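The remainder formula above can be sketched in code. This is a minimal illustration, not the patent's implementation: the field GF(2^8), the field polynomial 0x11D, the generator element 2, and all function names are assumptions. It also uses the common systematic form, in which F is first shifted by the number of check symbols before the remainder is taken.

```python
# GF(2^8) log/antilog tables. The field polynomial 0x11D and the generator
# element 2 are assumed for illustration; the source names neither.
EXP = [0] * 510
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = EXP[_i + 255] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D


def gf_mul(a, b):
    """Multiply two GF(2^8) elements using the log tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(nparity):
    """D(x) = (x + a^0)(x + a^1)...(x + a^(nparity-1)), highest degree first."""
    g = [1]
    for i in range(nparity):
        nxt = g + [0]                      # g(x) * x
        for j, coef in enumerate(g):       # plus g(x) * a^i
            nxt[j + 1] ^= gf_mul(coef, EXP[i])
        g = nxt
    return g


def rs_parity(data, nparity):
    """C = (F * x^nparity) mod D, computed by polynomial long division."""
    d = generator_poly(nparity)
    rem = list(data) + [0] * nparity
    for i in range(len(data)):
        coef = rem[i]
        if coef:
            for j in range(1, len(d)):
                rem[i + j] ^= gf_mul(d[j], coef)
    return rem[len(data):]


def poly_eval(poly, x):
    """Evaluate a polynomial (highest degree first) at x by Horner's rule."""
    y = 0
    for c in poly:
        y = gf_mul(y, x) ^ c
    return y


F = [0x12, 0x34, 0x56, 0x78]   # four information symbols (made-up values)
C = rs_parity(F, 2)            # two check symbols
codeword = F + C
```

A valid codeword evaluates to zero at every root of D, which gives a quick self-check of the encoder.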
For decoding, suppose for simplicity that the stored original information symbols are $m_3$, $m_2$, $m_1$, $m_0$ with the resulting check symbols $Q_1$, $Q_0$, and that the symbols read back are $m_3'$, $m_2'$, $m_1'$, $m_0'$, $Q_1'$ and $Q_0'$. If the syndromes $s_0$ and $s_1$ calculated from the symbols read back are not both 0, an error is present, and it is corrected by computing the error polynomial and the error value.
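The four-symbol example with two check symbols can be sketched as follows. This is a simplified illustration under assumptions: arithmetic is in GF(2^8) with field polynomial 0x11D and generator element 2, the function names are hypothetical, and only the single-error case is handled, which is all that two check symbols can correct. A general decoder would compute an error-locator polynomial instead of the direct search used here.

```python
def gf_mul(a, b, prim=0x11D):
    """Multiply in GF(2^8) by shift-and-add (field polynomial is an assumption)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= prim
        b >>= 1
    return r


G = [1, 3, 2]  # generator polynomial (x + a^0)(x + a^1) with a = 2


def encode(msg):
    """Append the two check symbols Q1, Q0 to the information symbols."""
    rem = list(msg) + [0, 0]
    for i in range(len(msg)):
        c = rem[i]
        for j in range(1, 3):
            rem[i + j] ^= gf_mul(G[j], c)
    return list(msg) + rem[-2:]


def syndromes(word):
    """s0 and s1: the read-back word evaluated at a^0 and a^1."""
    out = []
    for root in (1, 2):
        y = 0
        for c in word:
            y = gf_mul(y, root) ^ c
        out.append(y)
    return out


def correct_single_error(word):
    """Fix one wrong symbol: s0 is the error value, s1 / s0 = a^position."""
    s0, s1 = syndromes(word)
    if s0 == 0 and s1 == 0:
        return list(word)                  # no error detected
    fixed = list(word)
    a_pow = 1                              # a^p for p = 0, 1, 2, ...
    for p in range(len(word)):
        if gf_mul(s0, a_pow) == s1:
            fixed[len(word) - 1 - p] ^= s0
            return fixed
        a_pow = gf_mul(a_pow, 2)
    raise ValueError("more than one symbol error; not correctable here")


stored = encode([0x12, 0x34, 0x56, 0x78])  # m3, m2, m1, m0, Q1, Q0
read_back = list(stored)
read_back[2] ^= 0x5A                       # simulate one corrupted symbol
repaired = correct_single_error(read_back)
```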
This method is widely used in processing DVD optical disc data. It substantially improves the error-correction capability for the raw data, reducing the random error rate from $2 \times 10^{-2}$ to $1 \times 10^{-15}$. In the present invention, this coding method is applied to data blocks arranged as an array: Reed-Solomon coding is carried out separately in the horizontal and vertical directions, producing horizontal and vertical error-correction redundancy data. The data thus receives dual error-correction protection, which improves error-correction capability, while the redundant data accounts for only 13% of the original data volume.
Although such efficient error correction is possible at low data redundancy, general cloud storage systems do not use this kind of error correction and rely on data backup alone for recovery. A general cloud storage system must keep 3 or more backup copies, which greatly wastes storage space and raises cost.
Summary of the invention
The technical problem to be solved by the present invention is to provide a data processing method for a cloud storage system that improves the security and recoverability of data in the cloud storage system, reduces the number of data backups, greatly saves storage space, and reduces cost.
The technical solution of the present invention is as follows:
A data processing method for a cloud storage system, characterized in that data storage and data extraction in the cloud storage system use Reed-Solomon error-correction encoding and Reed-Solomon error-correction decoding, respectively.
The data storage method comprises the following steps:
1. The raw data to be stored is divided into K data frames, each containing data of the same fixed length of N bits. When the last raw data frame is shorter than N, it is padded with '0' to reach length N, where K is a positive integer greater than 1 and the range of N is 200 < N < 2000;
2. A number (ID) is added to each data frame to obtain a new data frame. The ID is 4 bytes long and increments from 0001, so the length of the new data frame is (N+4);
3. The K new data frames are regrouped into W data blocks. Each data block contains M data frames, forming an M × (N+4) data matrix. When the last data block has fewer than M data frames, it is padded with '0' data frames so that it reaches the fixed count of M frames, where the range of M is 200 < M < 2000 and W = K/M;
4. Each data block is encoded with the Reed-Solomon product code error-correction method: $P_O$ and $P_I$ redundant error-correction data are added to the i-th row and column of the data block, respectively, converting it into a Reed-Solomon data block containing $(M+P_O) \times (N+4+P_I)$ data, where $P_O$ and $P_I$ are the numbers of redundant error-correction data added to a row and to a column of the data block, respectively, and $0 < P_O < M/2$, $0 < P_I < M/2$, $1 < i \le M$;
5. Each Reed-Solomon data block is split by row into $M+P_O$ data slices, and the $M+P_O$ data slices of the same data block are stored across multiple storage devices of the cloud storage system, such that on any one storage device the number of slices of the same data block is $\le P_I/2$;
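Steps 1 to 3 of the storage method (framing, numbering, and grouping into blocks) can be sketched as below. The sketch makes several assumptions the text does not settle: lengths are counted in bytes rather than bits, the 4-byte ID is placed in front of the frame body in big-endian order, and filler frames are entirely zero (so they carry ID 0). The function names are hypothetical.

```python
def make_frames(raw, n):
    """Split raw bytes into n-byte frames, zero-pad the last, prepend a 4-byte ID."""
    frames = []
    for off in range(0, len(raw), n):
        body = raw[off:off + n].ljust(n, b"\x00")   # pad the last frame with 0
        frame_id = len(frames) + 1                  # IDs count up from 1
        frames.append(frame_id.to_bytes(4, "big") + body)
    return frames


def make_blocks(frames, m):
    """Group frames into blocks of m rows, padding the last block with zero frames."""
    width = len(frames[0])
    blocks = []
    for off in range(0, len(frames), m):
        block = frames[off:off + m]                 # slicing copies the list
        block += [bytes(width)] * (m - len(block))  # all-zero filler frames
        blocks.append(block)
    return blocks


# Tiny made-up parameters (N = 5, M = 3) so the result is easy to inspect.
frames = make_frames(b"abcdefghijklmnopq", 5)
blocks = make_blocks(frames, 3)
```

Each block is then an M × (N+4) matrix of bytes, ready for row and column encoding in step 4.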
The data extraction method comprises the following steps:
1. The data slices belonging to the same data block are read out. If $P_O/2$ data read errors occur in the i-th slice, the data errors in that slice are corrected by the Reed-Solomon product code error-correction decoding algorithm, recovering the raw data;
2. After all data slices of the same data block have been read, if fewer than $P_I/2$ slices were found to be erroneous or unreadable, the computer corrects these fewer than $P_I/2$ slices using the Reed-Solomon product code decoding algorithm, recovering the raw data;
3. Steps 1 and 2 are repeated to read and process all data blocks belonging to the same raw data. The error-correction redundancy data is removed, the new data frames of all data blocks are arranged in order of their original ID numbers, and the numbers are then removed to obtain the originally stored data.
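The final reassembly in step 3 can be sketched as follows, assuming decoding has already removed the redundancy data. The frame layout is an assumption (4-byte big-endian ID in front of the body, all-zero filler frames with ID 0), as is the `original_len` parameter used to trim padding: the text does not say how padding in the last frame is removed.

```python
def reassemble(blocks, original_len=None):
    """Order frames by ID, strip the IDs, and join the frame bodies."""
    frames = [
        f for block in blocks for f in block
        if int.from_bytes(f[:4], "big") != 0       # skip all-zero filler frames
    ]
    frames.sort(key=lambda f: int.from_bytes(f[:4], "big"))
    data = b"".join(f[4:] for f in frames)         # drop the 4-byte ID
    # Trimming to a recorded length is an assumption, not stated in the text.
    return data if original_len is None else data[:original_len]


# Frames deliberately out of order, with one filler frame in the last block.
blocks = [
    [b"\x00\x00\x00\x02def", b"\x00\x00\x00\x01abc"],
    [b"\x00\x00\x00\x03gh\x00", bytes(7)],
]
restored = reassemble(blocks, original_len=8)
```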
Technical effects of the present invention:
1. A notable effect of the present invention is that Reed-Solomon error-correction coding is applied to both the rows and the columns of the original data block, giving dual error-correction capability. The encoded data blocks are then split into slices that are stored on different storage devices in the cloud system. As a result, not only can the slice data on each storage device be error-corrected, but when some storage devices in the cloud system fail and some slices cannot be read, the data can still be fully recovered, which greatly improves the reliability of the system.
2. General cloud storage systems need to keep multiple copies of the raw data to keep it safe (3 or more copies in general). The present invention reduces the amount of redundant storage needed while still keeping the stored data safe (only two backup copies, or one, may be needed), which greatly saves storage space, reduces cost, and makes fuller use of the cloud storage system's space.
3. Another notable feature of the present invention is that, because the data is dispersed across multiple storage devices, an outsider who illegally breaks into a single storage device in the cloud system obtains only incomplete data, which improves data security against intrusion from outside the system.
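The storage saving claimed in effect 2 can be checked with simple arithmetic. The sketch below compares the stored volume of one product-coded copy with plain 3-copy replication; the function name is hypothetical, and the sample parameters (M = 1024, N = 1020, $P_O$ = $P_I$ = 100) are the ones used in the embodiment of this document.

```python
def rspc_expansion(m, n, p_o, p_i, id_len=4):
    """Stored size of one coded block divided by the raw payload it carries."""
    stored = (m + p_o) * (n + id_len + p_i)   # (M + P_O) x (N + 4 + P_I)
    payload = m * n                           # raw data per block
    return stored / payload


one_copy = rspc_expansion(1024, 1020, 100, 100)
two_copies = 2 * one_copy      # if one backup of the coded data is also kept
replication = 3.0              # three plain copies, as in the HDFS example

print(f"one coded copy:  {one_copy:.2f}x")
print(f"two coded copies: {two_copies:.2f}x")
print(f"3x replication:  {replication:.2f}x")
```

With these parameters a single coded copy occupies roughly 1.21 times the raw data, and even two coded copies stay below the 3 times needed for triple replication.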
Embodiment
The present invention is further described below with reference to an example, which should not be taken to limit the scope of protection of the invention.
This example stores 100 MB of raw data in the cloud. The implementation steps are as follows:
Step 1: The 100 MB of raw data to be stored is divided into 102401 data frames, each with a fixed length of 1020 bits.
The data to be stored is divided into 102401 data frames of fixed length 1020. When the last raw data frame is shorter than 1020, it is padded with '0' to reach length 1020.
Step 2: A number (ID) is added to each data frame to obtain a new data frame.
An ID is added to each data frame. The ID is 4 bytes long and increments from 1; in this example the ID numbers run from 1 to 102401.
Step 3: The data frames are regrouped into several data blocks, each containing 1024 data frames, giving 100 such data blocks in total.
When the data frames are regrouped into data blocks, the frames form a 1024 × 1024 data matrix, which constitutes one data block. If the last data block has fewer than 1024 data frames, it is padded with '0' data to reach the fixed count of 1024 frames per block.
Step 4: Using Reed-Solomon error-correction coding, the i-th row and column of each data block of 1024 × 1024 data are encoded, converting it into a data block containing $(1024+P_O) \times (1024+P_I)$ data, where $0 < i \le M$. $P_O = P_I = 100$ are the numbers of redundant error-correction data added to a row and to a column of the data block. The formula is:
$F \bmod D = C$
In this embodiment F is 1024 bits of data, D is the generator polynomial, C is the 100 generated redundant error-correction data, and mod is the remainder operation.
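The row-and-column encoding of Step 4 can be sketched on a tiny matrix. This is an illustration under assumptions, not the embodiment's encoder: it works in GF(2^8) with field polynomial 0x11D and generator element 2, and all function names are hypothetical. A GF(2^8) Reed-Solomon codeword is limited to 255 symbols, so the embodiment's row length of 1124 would need a larger field or some interleaving scheme, which the text does not specify.

```python
def gf_mul(a, b, prim=0x11D):
    """Multiply in GF(2^8) by shift-and-add (field polynomial is an assumption)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= prim
        b >>= 1
    return r


def gen_poly(p):
    """Generator polynomial with roots a^0 .. a^(p-1), highest degree first."""
    g, root = [1], 1
    for _ in range(p):
        nxt = g + [0]
        for j, c in enumerate(g):
            nxt[j + 1] ^= gf_mul(c, root)
        g = nxt
        root = gf_mul(root, 2)
    return g


def rs_parity(symbols, p):
    """Remainder of symbols(x) * x^p divided by the generator polynomial."""
    d = gen_poly(p)
    rem = list(symbols) + [0] * p
    for i in range(len(symbols)):
        c = rem[i]
        if c:
            for j in range(1, len(d)):
                rem[i + j] ^= gf_mul(d[j], c)
    return rem[len(symbols):]


def rspc_encode(block, p_o, p_i):
    """Append p_i check symbols to every row, then p_o check rows to every column."""
    rows = [list(r) + rs_parity(r, p_i) for r in block]
    cols = [list(c) + rs_parity(c, p_o) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def is_codeword(word, p):
    """True if the word evaluates to zero at all p roots of the generator."""
    root = 1
    for _ in range(p):
        y = 0
        for c in word:
            y = gf_mul(y, root) ^ c
        if y:
            return False
        root = gf_mul(root, 2)
    return True


block = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]   # M = 3 rows of 4 symbols
coded = rspc_encode(block, p_o=2, p_i=2)                # 5 rows of 6 symbols
```

Because the code is linear, the added check rows are themselves valid row codewords, so every row and every column of the result can be checked and corrected independently.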
Step 5: Each data block obtained above is split by row into 1024+100 slices. The slices of the same data block are stored across the storage devices of several cloud storage systems, and no more than 50 slices of the same data block may be placed on the same storage device.
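One simple way to satisfy the per-device limit in Step 5 is round-robin placement. The sketch below is only one possible policy, chosen for illustration; the text does not prescribe how slices are assigned, and the function name and integer node identifiers are assumptions.

```python
def place_slices(num_slices, num_nodes, cap):
    """Assign slice indices to storage nodes round-robin, at most cap per node."""
    if num_nodes * cap < num_slices:
        raise ValueError("not enough storage nodes for the per-node limit")
    placement = {node: [] for node in range(num_nodes)}
    for s in range(num_slices):
        placement[s % num_nodes].append(s)   # slice s goes to node s mod num_nodes
    return placement


# Embodiment numbers: 1024 + 100 slices per block, at most 50 per storage device.
placement = place_slices(num_slices=1124, num_nodes=23, cap=50)
largest = max(len(slices) for slices in placement.values())
```

With 1124 slices and a limit of 50, at least 23 storage devices are needed, since 22 devices could hold only 1100 slices.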
Step 6: When data is read from the cloud storage system, the data to be extracted in this example is decoded with the RS-PC decoding algorithm.
Reading data from the cloud storage system proceeds as follows:
1) The different slices belonging to the same data block are read out. If fewer than 50 data read errors occur in the i-th slice, the whole slice can be corrected by the Reed-Solomon error-correction decoding algorithm and restored to the raw data;
2) After the different slices of the same data block have been read, if fewer than 50 slices were found to be erroneous or unreadable, these slices can be corrected and recovered by the Reed-Solomon error-correction decoding algorithm.
3) All the data blocks belonging to the same raw data are read and the error-correction redundancy data is removed. The frames of these data blocks are arranged in order by number (1 to 102401), and the numbers are then removed to obtain the originally stored data.
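The condition in item 2) can be expressed as a simple check on which slices are unreadable after a set of storage devices fails. The sketch assumes a placement map from node identifier to slice indices; the function names and the sample placement are hypothetical, and the threshold follows the "fewer than 50" wording above.

```python
def lost_slices(placement, failed_nodes):
    """Slice indices that become unreadable when the given nodes fail."""
    return sorted(s for node in failed_nodes for s in placement.get(node, []))


def block_recoverable(placement, failed_nodes, limit):
    """True if fewer than `limit` slices of the block are lost."""
    return len(lost_slices(placement, failed_nodes)) < limit


# Made-up placement of six slices on three nodes, with a limit of 3 lost slices.
placement = {0: [0, 1, 2], 1: [3, 4], 2: [5]}
ok_after_node1 = block_recoverable(placement, {1}, limit=3)
ok_after_node0 = block_recoverable(placement, {0}, limit=3)
```

Here losing node 1 costs two slices and stays within the limit, while losing node 0 costs three slices and does not. Under a strict "fewer than" limit, a device holding exactly the maximum allowed number of slices would already exceed what the check permits.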

Claims (3)

1. A data processing method of a cloud storage system, characterized in that data storage and data extraction in the cloud storage system use Reed-Solomon error-correction encoding and Reed-Solomon error-correction decoding, respectively.
2. The data processing method of a cloud storage system according to claim 1, characterized in that the data storage method comprises the following steps:
1. The raw data to be stored is divided into K data frames, each containing data of the same fixed length of N bits. When the last raw data frame is shorter than N, it is padded with '0' to reach length N, where K is a positive integer greater than 1 and the range of N is 200 < N < 2000;
2. A number (ID) is added to each data frame to obtain a new data frame. The ID is 4 bytes long and increments from 0001, so the length of the new data frame is (N+4);
3. The K new data frames are regrouped into W data blocks. Each data block contains M data frames, forming an M × (N+4) data matrix. When the last data block has fewer than M data frames, it is padded with '0' data frames so that it reaches the fixed count of M frames, where the range of M is 200 < M < 2000 and W = K/M;
4. Each data block is encoded with the Reed-Solomon product code error-correction method: $P_O$ and $P_I$ redundant error-correction data are added to the i-th row and column of the data block, respectively, converting it into a Reed-Solomon data block containing $(M+P_O) \times (N+4+P_I)$ data, where $P_O$ and $P_I$ are the numbers of redundant error-correction data added to a row and to a column of the data block, respectively, and $0 < P_O < M/2$, $0 < P_I < M/2$, $1 < i \le M$;
5. Each Reed-Solomon data block is split by row into $M+P_O$ data slices, and the $M+P_O$ data slices of the same data block are stored across multiple storage devices of the cloud storage system, such that on any one storage device the number of slices of the same data block is $\le P_I/2$;
3. The data processing method of a cloud storage system according to claim 1, characterized in that the data extraction method comprises the following steps:
1. The data slices belonging to the same data block are read out. If $P_O/2$ data read errors occur in the i-th slice, the data errors in that slice are corrected by the Reed-Solomon product code error-correction decoding algorithm, recovering the raw data;
2. After all data slices of the same data block have been read, if fewer than $P_I/2$ slices were found to be erroneous or unreadable, the computer corrects these fewer than $P_I/2$ slices using the Reed-Solomon product code decoding algorithm, recovering the raw data;
3. Steps 1 and 2 are repeated to read and process all data blocks belonging to the same raw data. The error-correction redundancy data is removed, the new data frames of all data blocks are arranged in order of their original ID numbers, and the numbers are then removed to obtain the originally stored data.
CN2011104569412A 2011-12-30 2011-12-30 Data processing method of cloud storage system Pending CN102968356A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011104569412A CN102968356A (en) 2011-12-30 2011-12-30 Data processing method of cloud storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011104569412A CN102968356A (en) 2011-12-30 2011-12-30 Data processing method of cloud storage system

Publications (1)

Publication Number Publication Date
CN102968356A true CN102968356A (en) 2013-03-13

Family

ID=47798509

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011104569412A Pending CN102968356A (en) 2011-12-30 2011-12-30 Data processing method of cloud storage system

Country Status (1)

Country Link
CN (1) CN102968356A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005069775A2 (en) * 2004-01-15 2005-08-04 Sandbridge Technologies, Inc. A method of reed-solomon encoding and decoding
CN101840377A (en) * 2010-05-13 2010-09-22 上海交通大学 Data storage method based on RS (Reed-Solomon) erasure codes
CN102006088A (en) * 2010-10-08 2011-04-06 清华大学 Interleaving and error-correcting method for reducing bit error rate of volume hologram storage system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘小成等: "图像交织RS码设计及其C语言实现", 《微计算机信息》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104579571A (en) * 2015-01-15 2015-04-29 山东超越数控电子有限公司 Data storage method based on LDPC encoding
CN110061802A (en) * 2018-01-17 2019-07-26 中兴通讯股份有限公司 Multi-user data transfer control method, device and data transmission set
CN108880620A (en) * 2018-08-20 2018-11-23 广东石油化工学院 Electric-power wire communication signal reconstructing method
CN108880620B (en) * 2018-08-20 2021-06-11 广东石油化工学院 Power line communication signal reconstruction method

Similar Documents

Publication Publication Date Title
US9600365B2 (en) Local erasure codes for data storage
CN111149093B (en) Data encoding, decoding and repairing method of distributed storage system
CN109643258B (en) Multi-node repair using high-rate minimal storage erase code
CN100539444C (en) Be used for to add the method and apparatus that error-correction layer is embedded into error correcting code
Mitzenmacher et al. Biff (Bloom filter) codes: Fast error correction for large data sets
WO1993018589A1 (en) Data recovery after error correction failure
CN108228382B (en) Data recovery method for single-disk fault of EVENODD code
CN106874140B (en) Data storage method and device
CN111090540B (en) Data processing method and device based on erasure codes
Sima et al. Optimal codes for the q-ary deletion channel
CN106776129A (en) A kind of restorative procedure of the multinode data file based on minimum memory regeneration code
CN110389848B (en) Partial repetition code construction method based on block construction and fault node repair method
CN116501553B (en) Data recovery method, device, system, electronic equipment and storage medium
CN111782152A (en) Data storage method, data recovery device, server and storage medium
CN101252413A (en) Method for removing small ring of length 4 in fountain code generated matrix and uses thereof
US20150222291A1 (en) Memory controller, storage device and memory control method
CN102968356A (en) Data processing method of cloud storage system
JP5108000B2 (en) Data encoding and decoding method and apparatus, recording medium on which program for implementing the method is recorded, and recording medium driving system
CN107665152B (en) Decoding method of erasure code
CN100539445C (en) The error correction extra play is embedded the method and apparatus of error correcting code
Huang et al. An improved decoding algorithm for generalized RDP codes
WO2017041232A1 (en) Encoding and decoding framework for binary cyclic code
Fu et al. A scheme of data confidentiality and fault-tolerance in cloud storage
Liu et al. Z codes: General systematic erasure codes with optimal repair bandwidth and storage for distributed storage systems
CN105871508B (en) Network coding and decoding method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C05 Deemed withdrawal (patent law before 1993)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130313