CN105930534B - A data fragmentation reduction method based on cloud storage service pricing - Google Patents
- Publication number
- CN105930534B (application CN201610443197.5A)
- Authority
- CN
- China
- Prior art keywords
- data
- fragmentation
- service
- service charge
- section
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/174—Redundancy elimination performed by the file system
- G06F16/1748—De-duplication implemented within the file system, e.g. based on file segments
- G06F16/1752—De-duplication implemented within the file system, e.g. based on file segments based on file chunks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/1724—Details of de-fragmentation performed by the file system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a data fragmentation reduction method based on cloud storage service pricing. The method weighs the storage-space service fee saved by data deduplication against the fragment read service fee that deduplication causes (the fragment read service fee excludes the data transfer fee). It identifies the fragmented data blocks whose data read service fee exceeds the storage-space service fee they save, and keeps those blocks in the cloud storage system without deduplicating them. This reduces the amount of data fragmentation in the cloud storage system and the data read service fee the user has to pay, so that the user can use the data storage service provided by the cloud storage system at the lowest service price.
Description
Technical field
The invention belongs to the field of computer information storage technology, and in particular relates to a method for reducing data fragmentation in a cloud storage system based on cloud storage service pricing.
Background
With the arrival of the information age, data volumes are growing explosively; IDC predicts that the world will generate 44 ZB of data in 2020. Cloud storage systems that offer storage as an external service now commonly use data deduplication to reduce the amount of data stored for users. Although deduplication saves storage-space cost, it introduces a large amount of data fragmentation. The main reason is that, after deduplication, data blocks that are logically contiguous are stored scattered across physical space, so reading the data requires many data read operations and the user incurs a large data read service fee. For example, the N contiguous data blocks that make up a file after deduplication may well be stored in N different storage objects (a storage object is the physical unit in which a cloud storage system stores data, usually several hundred MB or several GB in size), so reading the file requires N data read operations. In a cloud storage service, reads are charged by the number of data read operations, so reading the file once costs the user the fee for N read operations. If the user reads the file frequently, the data read service fee becomes very large.
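The arithmetic behind this example can be sketched as follows. The function name, the per-request price, and the read count are illustrative assumptions, not values from the patent; the sketch only shows how the request fee scales with the number of storage objects a read has to touch.

```python
def read_request_fee(objects_touched: int, reads: int, fee_per_read_op: float) -> float:
    """Request fee for reading a file `reads` times when every read of the
    file needs one read operation per storage object it is spread over."""
    return objects_touched * reads * fee_per_read_op


# Hypothetical numbers: a file spread over 1000 storage objects versus one
# storage object, read 50 times, at an assumed 0.0004 currency units per read.
fragmented = read_request_fee(objects_touched=1000, reads=50, fee_per_read_op=0.0004)
contiguous = read_request_fee(objects_touched=1, reads=50, fee_per_read_op=0.0004)

print(f"fragmented: {fragmented:.4f}, contiguous: {contiguous:.4f}")
```

Under these assumed prices the fragmented layout costs N times as much in request fees as the contiguous one, which is the effect the method aims to limit.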
To reduce the storage service cost that users incur when using a cloud storage service (including the storage-space service fee and the data read operation service fee), the invention provides a data fragmentation reduction method based on cloud storage service pricing. The method weighs the storage-space service fee against the fragment read service fee caused by data deduplication (the fragment read service fee excludes the data transfer fee) and identifies the fragmented data blocks whose data read service fee exceeds the storage-space service fee they save. These blocks are kept in the cloud storage system without deduplication, which reduces the amount of data fragmentation in the cloud storage system and the data read service fee the user has to pay, so that the user can use the data storage service provided by the cloud storage system at the lowest service price.
Summary of the invention
The invention provides a data fragmentation reduction method based on cloud storage service pricing. The method weighs the storage-space service fee against the fragment read service fee caused by data deduplication (the fragment read service fee excludes the data transfer fee). It identifies the fragmented data blocks whose data read service fee exceeds the storage-space service fee they save, and keeps those blocks in the cloud storage system without deduplicating them. This reduces the amount of data fragmentation in the cloud storage system and the data read service fee the user has to pay, so that the user can use the data storage service provided by the cloud storage system at the lowest service price.
The core idea of the invention is to identify the data fragment segments whose data read service fee exceeds the storage-space service fee they save. A data fragment segment is a run of duplicate data blocks stored at contiguous addresses in the same storage object; one storage object may contain several data fragment segments. If the data read service fee generated by a data fragment segment is greater than the storage-space service fee it saves, all fragmented data blocks in the segment are identified as rewrite fragment blocks and are not deduplicated. In Formula 1, the left side of the inequality is the data read service fee generated by a data fragment segment and the right side is the storage-space service fee saved; a segment that satisfies Formula 1 generates a data read service fee greater than the storage-space service fee it saves. On the left side, $C_{\mathrm{get}}$ is the data read service fee for one data read operation; it excludes the data transfer fee and is independent of the total size of the segment. $N_{\mathrm{read}}$ is the number of times the user reads the segment, so $C_{\mathrm{get}} \times N_{\mathrm{read}}$ is the data read service fee the user incurs by reading the segment. On the right side, $\mathrm{DataSize}$ is the total size of the segment in GB and $C_{\mathrm{storage}}$ is the storage-space service fee per GB, so $\mathrm{DataSize} \times C_{\mathrm{storage}}$ is the storage-space service fee the user would pay to store the segment.
$$C_{\mathrm{get}} \times N_{\mathrm{read}} > \mathrm{DataSize} \times C_{\mathrm{storage}} \qquad \text{(Formula 1)}$$
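As a minimal sketch, Formula 1 can be written as a predicate over its four quantities. The function names are hypothetical and the prices in the usage lines are made-up values chosen only for illustration; `break_even_reads` is simply Formula 1 rearranged for the number of reads and does not appear in the patent text.

```python
def read_fee(c_get: float, n_read: int) -> float:
    """Left side of Formula 1: fee for reading one segment n_read times."""
    return c_get * n_read


def storage_fee(data_size_gb: float, c_storage: float) -> float:
    """Right side of Formula 1: fee for storing the segment (size in GB)."""
    return data_size_gb * c_storage


def satisfies_formula_1(c_get: float, n_read: int,
                        data_size_gb: float, c_storage: float) -> bool:
    """True when the segment should be kept (rewritten) rather than deduplicated.
    The comparison is strict, as in Formula 1."""
    return read_fee(c_get, n_read) > storage_fee(data_size_gb, c_storage)


def break_even_reads(data_size_gb: float, c_storage: float, c_get: float) -> float:
    """Formula 1 rearranged: the segment is rewritten once n_read exceeds this."""
    return data_size_gb * c_storage / c_get


# Illustrative, assumed prices: 0.0004 per read operation, 0.02 per GB stored.
print(satisfies_formula_1(c_get=0.0004, n_read=10, data_size_gb=0.004, c_storage=0.02))
print(satisfies_formula_1(c_get=0.0004, n_read=1, data_size_gb=1.0, c_storage=0.02))
```

Because the read fee does not depend on segment size while the storage fee does, small segments that are read often are the ones most likely to satisfy the inequality.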
The main flow of the invention is as follows:
(1) The cloud storage server receives the data block information of one file sent by the user client (including the data block content, data block length, data block fingerprint, and so on).
(2) The cloud storage server checks whether each data block obtained in step (1) is a duplicate of a data block already in the cloud storage system. If so, the block is marked as a duplicate data block; otherwise, it is marked as a non-duplicate data block.
(3) The data fragment segments formed by the duplicate data blocks marked in step (2) are obtained. A data fragment segment consists of duplicate data blocks stored at contiguous addresses in the same storage object; one storage object may contain several data fragment segments.
(4) For each data fragment segment obtained in step (3), the data read service fee and the storage-space service fee are calculated according to Formula 1 (the number of reads is either a fixed value supplied by the system or a value the system predicts from empirical data). If the data read service fee is greater than the storage-space service fee, all data blocks in the segment are marked as rewrite fragment blocks; otherwise, all data blocks in the segment are marked as non-rewrite fragment blocks.
(5) All rewrite fragment blocks marked in step (4) and all non-duplicate data blocks marked in step (2) are written together into storage objects of the cloud storage system, and the non-rewrite fragment blocks marked in step (4) are deleted.
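Steps (2) and (3) can be sketched as below. This is one possible reading, not the patent's implementation: the fingerprint index layout, the use of SHA-256 as the fingerprint, byte offsets as addresses, and the rule that a non-duplicate block or an address gap ends the current segment are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical index: fingerprint -> (storage object id, byte offset) of the stored copy.
FingerprintIndex = Dict[str, Tuple[str, int]]


@dataclass
class Block:
    content: bytes
    fingerprint: str = ""
    stored_at: Optional[Tuple[str, int]] = None  # set only for duplicate blocks

    @property
    def is_duplicate(self) -> bool:
        return self.stored_at is not None


def mark_duplicates(blocks: List[Block], index: FingerprintIndex) -> None:
    """Step (2): a block is a duplicate if its fingerprint is already indexed."""
    for block in blocks:
        block.fingerprint = hashlib.sha256(block.content).hexdigest()
        block.stored_at = index.get(block.fingerprint)


def group_segments(blocks: List[Block]) -> List[List[Block]]:
    """Step (3): group duplicate blocks whose stored copies sit at contiguous
    addresses in the same storage object, scanning the file in order."""
    segments: List[List[Block]] = []
    current: List[Block] = []
    for block in blocks:
        if block.is_duplicate and current:
            obj, offset = current[-1].stored_at
            # A different object or an address gap closes the current segment.
            if block.stored_at != (obj, offset + len(current[-1].content)):
                segments.append(current)
                current = []
        elif not block.is_duplicate and current:
            # A non-duplicate block also closes the current segment.
            segments.append(current)
            current = []
        if block.is_duplicate:
            current.append(block)
    if current:
        segments.append(current)
    return segments
```

A short usage example: with stored copies of the first two blocks adjacent in `obj1` and the fourth block in `obj2`, `group_segments` yields one two-block segment and one single-block segment.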
Brief description of the drawings
Fig. 1 is a flow chart of the invention.
Detailed description
The invention involves a user client and a cloud storage server. The user client sends the file data to be stored to the cloud storage server, and the cloud storage server stores the data it receives from the user in the cloud storage system. Data deduplication can be implemented by the cloud storage server alone, or jointly by the cloud storage server and the user client. The method of the invention can be combined with any implementation of data deduplication.
Fig. 1 is a flow chart of the invention. The specific steps are as follows:
(1) The cloud storage server receives the data block information of one file sent by the user client. The data block information comprises the metadata and the content of each data block. The metadata of a data block includes the data block length, the data block fingerprint that uniquely identifies the block, and so on.
(2) The cloud storage server checks whether each data block obtained in step (1) is a duplicate data block in the cloud storage system, as follows:
(2.1) Read one data block and check whether it has already been stored in the cloud storage system. If so, mark it as a duplicate data block; otherwise, mark it as a non-duplicate data block.
(2.2) If all data blocks from step (1) have been read, go to the next step; otherwise, return to step (2.1).
(3) Obtain the data fragment segments formed by the duplicate data blocks marked in step (2.1). A data fragment segment consists of duplicate data blocks marked in step (2.1) that are stored at contiguous addresses in the same storage object.
(4) Determine whether each data fragment segment obtained in step (3) is a rewrite fragment segment, as follows:
(4.1) Take one data fragment segment and calculate its data read service fee and storage-space service fee according to Formula 1 (the number of reads is either a fixed value supplied by the system or a value the system predicts from empirical data). If the data read service fee is greater than the storage-space service fee, mark all data blocks in the segment as rewrite fragment blocks; otherwise, mark all data blocks in the segment as non-rewrite fragment blocks.
(4.2) If all data fragment segments obtained in step (3) have been evaluated, go to the next step; otherwise, return to step (4.1).
(5) Write all rewrite fragment blocks marked in step (4.1) and all non-duplicate data blocks marked in step (2.1) together into storage objects of the cloud storage system, and delete the non-rewrite fragment blocks marked in step (4.1).
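Steps (4) and (5) can be sketched as follows, taking already-grouped data fragment segments as input. The `Segment` type, the decimal definition of a GB, and the prices in the usage lines are assumptions; the patent only states that the segment size is measured in GB and that the number of reads is a fixed or predicted value supplied by the system.

```python
from dataclasses import dataclass
from typing import List, Tuple

BYTES_PER_GB = 10 ** 9  # assumption: decimal GB; the patent only says "GB"


@dataclass
class Segment:
    block_ids: List[int]      # positions of the segment's blocks within the file
    block_lengths: List[int]  # lengths of those blocks in bytes


def classify(segments: List[Segment], n_read: int,
             c_get: float, c_storage: float) -> Tuple[List[int], List[int]]:
    """Step (4): split block ids into (rewrite, non_rewrite) using Formula 1."""
    rewrite: List[int] = []
    non_rewrite: List[int] = []
    for seg in segments:
        size_gb = sum(seg.block_lengths) / BYTES_PER_GB
        if c_get * n_read > size_gb * c_storage:
            rewrite.extend(seg.block_ids)      # keep a fresh copy, no deduplication
        else:
            non_rewrite.extend(seg.block_ids)  # deduplicate: drop the incoming copy
    return rewrite, non_rewrite


def plan_write(non_duplicate_ids: List[int], rewrite_ids: List[int]) -> List[int]:
    """Step (5): ids of the blocks that are physically written, in file order."""
    return sorted(set(non_duplicate_ids) | set(rewrite_ids))


# Illustrative values: block 2 is non-duplicate, blocks 0-1 form a small
# segment, block 3 is a very large segment; prices and n_read are assumed.
segments = [Segment([0, 1], [4_000_000, 4_000_000]), Segment([3], [500 * 10 ** 9])]
rewrite_ids, non_rewrite_ids = classify(segments, n_read=10, c_get=0.0004, c_storage=0.02)
print(plan_write([2], rewrite_ids), non_rewrite_ids)
```

Writing the rewrite blocks together with the non-duplicate blocks is what places them contiguously in a new storage object, so later reads of the file touch fewer objects.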
Claims (2)
1. A data fragmentation reduction method based on cloud storage service pricing, comprising the following steps:
(1) a cloud storage server receives the data block information of one file sent by a user client;
(2) the cloud storage server checks whether each data block obtained in step (1) is a duplicate data block in the cloud storage system; if so, the corresponding data block is marked as a duplicate data block; otherwise, the corresponding data block is marked as a non-duplicate data block;
(3) the data fragment segments formed by the duplicate data blocks marked in step (2) are obtained, wherein a data fragment segment consists of duplicate data blocks stored at contiguous addresses in the same storage object, and multiple data fragment segments exist in the same storage object;
(4) the data read service fee and the storage-space service fee of each data fragment segment obtained in step (3) are calculated; if the data read service fee is greater than the storage-space service fee, all data blocks in the corresponding data fragment segment are marked as rewrite fragment blocks; otherwise, all data blocks in the corresponding data fragment segment are marked as non-rewrite fragment blocks;
(5) all rewrite fragment blocks marked in step (4) and all non-duplicate data blocks marked in step (2) are written together into storage objects of the cloud storage system, and the non-rewrite fragment blocks marked in step (4) are deleted.
2. The method according to claim 1, wherein in step (4) the data read service fee and the storage-space service fee of each data fragment segment are calculated to find the data fragment segments whose data read service fee is greater than the storage-space service fee, as shown in Formula 1; if Formula 1 is satisfied, the data read service fee is greater than the storage-space service fee, all data blocks in the corresponding data fragment segment are marked as rewrite fragment blocks, and these data blocks are not deduplicated; otherwise, the data read service fee is less than or equal to the storage-space service fee, and all data blocks in the corresponding data fragment segment need to be deleted;
$$C_{\mathrm{get}} \times N_{\mathrm{read}} > \mathrm{DataSize} \times C_{\mathrm{storage}} \qquad \text{(Formula 1)}$$
in Formula 1, $C_{\mathrm{get}}$ is the data read service fee for one data read operation, which excludes the data transfer fee and is independent of the total size of the data fragment segment; $N_{\mathrm{read}}$ is the number of times the user reads the data fragment segment, and $C_{\mathrm{get}} \times N_{\mathrm{read}}$ is the data read service fee the user incurs by reading the data fragment segment; $\mathrm{DataSize}$ is the total size of the data fragment segment in GB, $C_{\mathrm{storage}}$ is the storage-space service fee per GB, and $\mathrm{DataSize} \times C_{\mathrm{storage}}$ is the storage-space service fee the user would pay to store the data fragment segment.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610443197.5A CN105930534B (en) | 2016-06-20 | 2016-06-20 | A data fragmentation reduction method based on cloud storage service pricing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610443197.5A CN105930534B (en) | 2016-06-20 | 2016-06-20 | A data fragmentation reduction method based on cloud storage service pricing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105930534A (en) | 2016-09-07 |
CN105930534B (en) | 2019-05-17 |
Family
ID=56830935
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610443197.5A Active CN105930534B (en) | 2016-06-20 | 2016-06-20 | A data fragmentation reduction method based on cloud storage service pricing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105930534B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106599111B (en) * | 2016-11-30 | 2021-07-02 | 浙江信安数智科技有限公司 | Data management method and storage system |
CN109408288B (en) * | 2018-09-29 | 2020-07-10 | 华中科技大学 | Method for removing duplicate fragments of data in packed file backup process |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7478217B2 (en) * | 2006-04-07 | 2009-01-13 | Mediatek Inc. | Method of storing both large and small files in a data storage device and data storage device thereof |
EP2662782A1 (en) * | 2012-05-10 | 2013-11-13 | Siemens Aktiengesellschaft | Method and system for storing data in a database |
US20140122104A1 (en) * | 2012-10-26 | 2014-05-01 | Koninklijke Philips N.V. | Coaching system that builds coaching messages for physical activity promotion |
CN102999605A (en) * | 2012-11-21 | 2013-03-27 | 重庆大学 | Method and device for optimizing data placement to reduce data fragments |
Also Published As
Publication number | Publication date |
---|---|
CN105930534A (en) | 2016-09-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6373328B2 (en) | Aggregation of reference blocks into a reference set for deduplication in memory management | |
US10642515B2 (en) | Data storage method, electronic device, and computer non-volatile storage medium | |
CN110520857B (en) | Data processing performance enhancement for neural networks using virtualized data iterators | |
CN103959256B (en) | Data duplication based on fingerprint is deleted | |
US8131687B2 (en) | File system with internal deduplication and management of data blocks | |
CN103136243B (en) | File system duplicate removal method based on cloud storage and device | |
TW201841122A (en) | Key-value store tree | |
JP2005267600A5 (en) | ||
CN101178726B (en) | Method to unarchive data file | |
JP2005302038A (en) | Method and system for renaming consecutive key in b-tree | |
JP2012089094A5 (en) | ||
CN105009067A (en) | Managing operations on stored data units | |
CN108255989B (en) | Picture storage method and device, terminal equipment and computer storage medium | |
CN105117351A (en) | Method and apparatus for writing data into cache | |
CN107786638A (en) | A kind of data processing method, apparatus and system | |
CN105493080B (en) | The method and apparatus of data de-duplication based on context-aware | |
CN105930534B (en) | A kind of fragmentation of data reduction method based on cloud storage service price | |
JP5821744B2 (en) | Data presence / absence determination apparatus, data presence / absence determination method, and data presence / absence determination program | |
EP3477462B1 (en) | Tenant aware, variable length, deduplication of stored data | |
CN110352410A (en) | Track the access module and preextraction index node of index node | |
CN105009068A (en) | Managing operations on stored data units | |
CN111427511B (en) | Data storage method and device | |
US7685186B2 (en) | Optimized and robust in-place data transformation | |
CN106484691A (en) | The date storage method of mobile terminal and device | |
CN111078652A (en) | Filing and compressing method and device for logistics box codes |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||