CN106681659A - Data compression method and device - Google Patents

Info

Publication number
CN106681659A
Authority
CN
China
Prior art keywords
data
compression
module
disk
compressed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611167099.XA
Other languages
Chinese (zh)
Inventor
赵鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd filed Critical Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201611167099.XA priority Critical patent/CN106681659A/en
Publication of CN106681659A publication Critical patent/CN106681659A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608Saving storage space on storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0656Data buffering arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • G06F3/0674Disk device
    • G06F3/0676Magnetic disk device

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention relates to the technical field of data processing and discloses a data compression method comprising the following steps: first, data are written into a compression cache; second, a compression engine reads the data from the compression cache in fixed-capacity units; finally, the compression engine compresses the data it has read. The invention further discloses a data compression device comprising a data writing module, a data reading module, and a data compression module. The data writing module writes data into the compression cache; the data reading module lets the compression engine read data from the compression cache in fixed-capacity units; and the data compression module lets the compression engine compress the data read. The method addresses the prior-art problem that writing compressed data to disk produces many disk fragments, which occupy a large amount of disk space and waste it.

Description

Data compression method and device
Technical field
The invention relates to the technical field of data processing, and more particularly to a data compression method and device.
Background
Modern society produces massive amounts of data every day, and much of that data is repetitive. Analyzing these data sensibly and saving them to disk is a very large task. Storing such a huge volume of data requires a great many disks, which greatly increases costs for an enterprise, especially an Internet company. Data therefore need to be compressed before they are stored, which can save a great deal of disk space and improve disk space utilization.
Many compression products exist today, but most of them take their input in fixed-size blocks, and the data no longer have a fixed size after compression. Such data are hard to store on disk in a uniform format, which produces many disk fragments and wastes a great deal of disk space. In addition, these products do not compress in real time: data are first stored on disk, then read back from disk sequentially, and only then compressed. The drawback of this approach is that the data being compressed are grouped by disk position rather than by time, even though I/O within the same time window has a higher degree of correlation.
Summary of the invention
The object of the invention is to provide a data compression method and device that overcome the defect of the prior art, in which data written to disk after compression produce many disk fragments that occupy a large amount of disk space and waste it.
To achieve this object, the invention adopts the following technical solution:
A data compression method, comprising the following steps:
Data are written to a compression cache;
A compression engine reads data from the compression cache in fixed-capacity units;
The compression engine compresses the data read.
Preferably, the compression engine reads data from the compression cache according to a time window and in fixed-capacity units.
Preferably, before data are written to the compression cache, the method further includes: setting the values of the time window and the fixed capacity.
Preferably, data belonging to the same time window are recorded by metadata.
Preferably, after the compression engine reads data from the compression cache in fixed-capacity units, the method further includes: judging whether the amount of data read has reached the preset fixed-capacity value; if so, the compression engine compresses the data read; if not, reading continues.
Preferably, after data are written to the compression cache, the method further includes: returning write-back success to the host.
Preferably, after the compression engine compresses the data read, the method further includes: writing the compressed data to disk.
The invention also provides a data compression device, comprising:
A data writing module, used to write data to the compression cache;
A data reading module, connected to the data writing module and the data compression module respectively, used by the compression engine to read data from the compression cache in fixed-capacity units;
A data compression module, used by the compression engine to compress the data read.
Preferably, the compression engine reads data from the compression cache according to a time window and in fixed-capacity units.
Preferably, the device further includes: an information sending module, used to return write-back success to the host.
Preferably, the device further includes: a judging module, used to judge whether the amount of data has reached the preset fixed-capacity value.
Preferably, the device further includes: a disk writing module, used to write the compressed data to disk.
Preferably, the device further includes: a parameter setting module, used to set the values of the time window and the fixed capacity.
Compared with the prior art, the invention has the following advantages:
1. Most current data compression writes data in varying capacities, and the written data still do not share a common capacity after compression. Such data are therefore hard to store on disk in a uniform format, which enlarges the gaps between data and produces many disk fragments. In the invention, the compression engine reads the data written to the compression cache in fixed-capacity units and compresses them, and the compressed data are written to disk sequentially in fixed-capacity units. The data thus have a uniform format on disk, gaps between compressed data are avoided, disk fragmentation is reduced, and disk space utilization improves.
2. Data are stored on disk in a random manner, but data from the same time window often have a high degree of correlation. In the invention, the compression engine reads data from the compression cache according to the time window and compresses correlated data from the same time window together, which improves the accuracy of read-ahead for compressed data and thereby improves system performance.
3. In the prior art, data are first written to disk, then read back from disk, then compressed, and finally the compressed data are written to disk. In the invention, data are first written to the compression cache, the compression engine reads and compresses them, and finally the compressed data are written to disk. Unlike the prior art, the invention does not need to write data to disk first; compression is already complete before the data reach the disk. This realizes real-time compression and greatly improves data performance and disk utilization.
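The difference between the two flows in advantage 3 can be illustrated by counting the bytes that move to and from disk. The following is a minimal sketch, not part of the disclosure: the function names are hypothetical, the 90% ratio is borrowed from the worked example in the embodiments, and metadata overhead is ignored.

```python
def post_process_disk_traffic(raw_bytes, ratio_percent=90):
    """Prior-art flow: write raw data, read it back, then write the compressed copy."""
    compressed = raw_bytes * ratio_percent // 100
    return {"written": raw_bytes + compressed, "read": raw_bytes}


def inline_disk_traffic(raw_bytes, ratio_percent=90):
    """Flow of the invention: compress in the cache, write only the compressed copy."""
    compressed = raw_bytes * ratio_percent // 100
    return {"written": compressed, "read": 0}


prior_art = post_process_disk_traffic(4096)
invention = inline_disk_traffic(4096)
```

Under these assumptions, a 4096-byte block costs the prior-art flow one raw write, one raw read, and one compressed write, whereas the cache-first flow touches the disk only for the compressed write.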
Description of the drawings
Fig. 1 is a structural diagram of a data compression device of the invention.
Fig. 2 is a flowchart of one example of a data compression method of the invention.
Fig. 3 is a flowchart of the specific example of Fig. 2.
Fig. 4 is another structural diagram of a data compression device of the invention.
Fig. 5 is a flowchart of another example of a data compression method of the invention.
Detailed description
For ease of understanding, some of the terms used in the invention are explained below:
Time window: a specific task is completed within a limited time range; this limited time range is called a time window.
Metadata: information about the organization of data, data fields, and their relationships. In short, metadata is data about data; it includes all the information needed to interact with another module.
Sliding block algorithm: a method of partitioning data that divides a data file into smaller data blocks. A sliding window advances one byte at a time from the start of the data file; when the window matches a preset value, a block boundary is produced. The length of each data block is constrained to a specified range.
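The sliding block algorithm defined above can be sketched as follows. This is an illustrative sketch only: the function name, the window fingerprint (a masked sum of the trailing bytes), and all parameter values are assumptions chosen for brevity, not details taken from the disclosure.

```python
from typing import List


def sliding_block_chunks(data: bytes, window: int = 4, mask: int = 0x0F,
                         target: int = 0, min_len: int = 8,
                         max_len: int = 64) -> List[bytes]:
    """Cut data where a fingerprint of the trailing window equals a preset value."""
    chunks = []
    start = 0
    for i in range(len(data)):
        length = i - start + 1
        if length < min_len:
            continue  # enforce the lower bound of the allowed block length
        # Hypothetical fingerprint: masked sum of the last `window` bytes.
        window_start = max(start, i - window + 1)
        fingerprint = sum(data[window_start:i + 1]) & mask
        # Cut on a match, or when the upper bound of the block length is hit.
        if fingerprint == target or length >= max_len:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder, possibly short
    return chunks
```

Concatenating the returned blocks reproduces the input, and every block except possibly the last has a length between `min_len` and `max_len`, which corresponds to the specified length range in the definition.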
The specific embodiments of the invention are described in further detail below with reference to the drawings and examples:
Embodiment 1: A data compression device of the invention, shown in Fig. 1, comprises a data writing module 12, a data reading module 14, a data compression module 16, a parameter setting module 11, a judging module 15, an information sending module 13, and a disk writing module 17. The parameter setting module 11 is connected in sequence to the data writing module 12, the information sending module 13, the data reading module 14, the judging module 15, the data compression module 16, and the disk writing module 17.
The parameter setting module 11 is used to preset the values of the time window and the fixed capacity. The data writing module 12 is used to write data into the compression cache. The information sending module 13 is used to return write-back success to the host. The data reading module 14 is used by the compression engine to read data from the compression cache in fixed-capacity units, or to read data from the compression cache according to the time window and in fixed-capacity units. The judging module 15 is used to judge whether the amount of data has reached the preset fixed-capacity value; if so, the compression engine compresses the data read; if not, reading continues. The data compression module 16 is used by the compression engine to compress the fixed-capacity data read. The disk writing module 17 is used to write the compressed data to disk.
Embodiment 2: A data compression method of the invention, shown in Fig. 2, comprises the following steps:
Step S201: set the values of the time window and the fixed capacity in advance in a system file.
Step S202: write data into the compression cache.
Step S203: return write-back success to the host, realizing online real-time data compression.
Step S204: the compression engine reads data from the compression cache according to the time window and in fixed-capacity units. Data in the same time window have a higher degree of correlation, so using a time window improves the accuracy of read-ahead for compressed data and thereby improves system performance. Data in the same time window are recorded by metadata, which shortens the time the compression engine needs to read data from the compression cache.
Step S205: judge whether the amount of data has reached the preset fixed-capacity value. If so, go to step S206; if not, go to step S204.
Step S206: the compression engine compresses the fixed-capacity data read.
Step S207: the compressed data are written sequentially to disk in fixed-capacity units, which avoids gaps between data, reduces disk fragmentation, and improves disk space utilization.
In the invention, the compression engine reads the data written to the compression cache in fixed-capacity units, and the compressed data are written sequentially to disk in fixed-capacity units. The data thus have a uniform format on disk, gaps between compressed data are avoided, disk fragmentation is reduced, and disk storage space is saved. The invention uses compression based on time windows rather than on position, which improves the accuracy of read-ahead for compressed data and speeds up reads. The invention completes compression before data are written to disk, realizing real-time compression.
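Steps S204 to S206 can be sketched as follows. This is a minimal sketch under stated assumptions: writes are modeled as (timestamp in ms, payload) pairs, the metadata is reduced to a mapping from window index to payloads, zlib stands in for the unspecified compression engine, and the default values of 10 ms and 4096 bytes are borrowed from the worked example of Fig. 3. All function names are hypothetical.

```python
import zlib
from collections import defaultdict

WINDOW_MS = 10      # time window (assumed default, from the Fig. 3 example)
BLOCK_SIZE = 4096   # fixed capacity (assumed default, from the Fig. 3 example)


def group_by_window(writes, window_ms=WINDOW_MS):
    """Metadata stand-in: map each window index to the payloads written in it."""
    windows = defaultdict(list)
    for timestamp_ms, payload in writes:
        windows[timestamp_ms // window_ms].append(payload)
    return windows


def compress_by_window(writes, block_size=BLOCK_SIZE, window_ms=WINDOW_MS):
    """Read window by window (S204), test the capacity (S205), compress (S206)."""
    windows = group_by_window(writes, window_ms)
    pending = bytearray()
    compressed_blocks = []
    for index in sorted(windows):
        for payload in windows[index]:
            pending += payload
        # Compress only once a full fixed-capacity block is available;
        # anything smaller waits for the next window, as in step S205.
        while len(pending) >= block_size:
            block = bytes(pending[:block_size])
            del pending[:block_size]
            compressed_blocks.append(zlib.compress(block))
    return compressed_blocks, bytes(pending)


writes = [(0, b"a" * 3000), (3, b"b" * 2000), (12, b"c" * 4000), (25, b"d" * 100)]
blocks, leftover = compress_by_window(writes)
```

In this usage example the 9100 bytes written across three windows yield two full 4096-byte input blocks, and the remaining 908 bytes stay pending until later writes fill another block.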
The method is explained in further detail with a specific example. As shown in Fig. 3, this example comprises the following steps:
Step S301: set the time window to 10 ms and the fixed capacity to 4K in advance in a system file. A suitable time window should be chosen: the longer the time window, the higher the compression ratio, but the lower the performance when writing to disk. The time window is usually 10 ms.
Step S302: write data into the compression cache.
Step S303: return write-back success to the host.
Step S304: the data in the compression cache form one data block per 4K in time order (variable-length input, fixed-length output), and the compression engine reads one data block from the compression cache every 10 ms.
Step S305: using the sliding block algorithm, judge whether the capacity of each data block has reached 4K. If so, go to step S306; if not, go to step S304.
Step S306: the compression engine compresses the data in at least one data block of capacity 4K; each 4K data block is compressed, at a compression ratio of 90%, into a data block of capacity 3.6K.
Step S307: the data in the 3.6K data blocks are written to disk.
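The arithmetic of steps S306 and S307 can be checked with a short sketch. The assumptions are that 4K means 4096 bytes, that the 90% ratio is applied uniformly to every block as the example states (a real compressor would not guarantee this), and that sizes are rounded down to whole bytes. The function names are hypothetical.

```python
BLOCK_SIZE = 4 * 1024   # 4K fixed capacity from step S301
RATIO_PERCENT = 90      # compressed size as a percentage of the original (S306)


def compressed_block_size(block_size=BLOCK_SIZE, ratio_percent=RATIO_PERCENT):
    """Size of one compressed block, rounded down to whole bytes."""
    return block_size * ratio_percent // 100


def sequential_offsets(block_count, slot_size):
    """Start offset of each compressed block when blocks are written back to back."""
    return [i * slot_size for i in range(block_count)]


size = compressed_block_size()
offsets = sequential_offsets(4, size)
```

Here `size` is 3686 bytes, which is the 3.6K of step S306 rounded down, and each offset in `offsets` is an exact multiple of `size`, so consecutive blocks leave no gap between them on disk.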
Embodiment 3: Another data compression device of the invention, shown in Fig. 4, comprises a data writing module 41, a data reading module 42, and a data compression module 43. The data reading module 42 is connected to the data writing module 41 and the data compression module 43 respectively.
The data writing module 41 is used to write data to the compression cache. The data reading module 42 is used by the compression engine to read data from the compression cache in fixed-capacity units. The data compression module 43 is used by the compression engine to compress the data read.
Embodiment 4: Another data compression method of the invention, shown in Fig. 5, comprises the following steps:
S501: data are written to the compression cache.
S502: the compression engine reads data from the compression cache in fixed-capacity units.
S503: the compression engine compresses the data read.
The data written to the compression cache in the invention are not of fixed capacity. The compression engine reads data in fixed-capacity units and compresses them, and the compressed data are then written to disk in fixed-capacity form. These data have a uniform format, which reduces disk fragmentation and improves disk space utilization.
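Steps S501 to S503 can be sketched as a small buffer that accepts writes of any size and hands out blocks of one fixed size. This is an illustrative sketch only: the class and function names are hypothetical, zlib stands in for the unspecified compression engine, and the 4096-byte default is an assumption.

```python
import zlib
from typing import List, Optional


class CompressionCache:
    """Accepts variable-size writes (S501) and serves fixed-size reads (S502)."""

    def __init__(self, block_size: int = 4096) -> None:
        self.block_size = block_size
        self._buffer = bytearray()

    def write(self, data: bytes) -> None:
        self._buffer += data

    def pending(self) -> int:
        """Number of buffered bytes not yet read by the engine."""
        return len(self._buffer)

    def read_block(self) -> Optional[bytes]:
        """Return one full block, or None while less than a block is buffered."""
        if len(self._buffer) < self.block_size:
            return None
        block = bytes(self._buffer[:self.block_size])
        del self._buffer[:self.block_size]
        return block


def drain_and_compress(cache: CompressionCache) -> List[bytes]:
    """Compress every full block currently available (S503)."""
    compressed = []
    while (block := cache.read_block()) is not None:
        compressed.append(zlib.compress(block))
    return compressed
```

Keeping a partial block in the buffer rather than compressing it early is one way to ensure that every block handed to the engine has the same size, which is the property the embodiment relies on.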
The above are only preferred embodiments of the invention. It should be noted that a person of ordinary skill in the art may make improvements and modifications without departing from the principles of the invention, and such improvements and modifications shall also be regarded as falling within the scope of protection of the invention.

Claims (10)

1. A data compression method, characterized by comprising the following steps:
Data are written to a compression cache;
A compression engine reads data from the compression cache in fixed-capacity units;
The compression engine compresses the data read.
2. The data compression method according to claim 1, characterized in that the compression engine reads data from the compression cache according to a time window and in fixed-capacity units.
3. The data compression method according to claim 2, characterized in that, before data are written to the compression cache, the method further includes: setting the values of the time window and the fixed capacity.
4. The data compression method according to claim 2 or 3, characterized in that data belonging to the same time window are recorded by metadata.
5. The data compression method according to claim 3, characterized in that, after the compression engine reads data from the compression cache in fixed-capacity units, the method further includes: judging whether the amount of data has reached the preset fixed-capacity value; if so, the compression engine compresses the data read; if not, reading continues.
6. The data compression method according to claim 1, characterized in that, after data are written to the compression cache, the method further includes: returning write-back success to the host.
7. The data compression method according to claim 1, characterized in that, after the compression engine compresses the data read, the method further includes: writing the compressed data to disk.
8. A data compression device, characterized by comprising:
A data writing module, used to write data to a compression cache;
A data reading module, connected to the data writing module and the data compression module respectively, used by a compression engine to read data from the compression cache in fixed-capacity units;
A data compression module, used by the compression engine to compress the data read.
9. The data compression device according to claim 8, characterized in that the compression engine reads data from the compression cache according to a time window and in fixed-capacity units;
Preferably, the device further includes: a parameter setting module, used to set the values of the time window and the fixed capacity.
10. The data compression device according to claim 8, characterized by further comprising:
An information sending module, used to return write-back success to the host;
Preferably, the device further includes: a judging module, used to judge whether the amount of data has reached the preset fixed-capacity value;
Preferably, the device further includes: a disk writing module, used to write the compressed data to disk.
CN201611167099.XA 2016-12-16 2016-12-16 Data compression method and device Pending CN106681659A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611167099.XA CN106681659A (en) 2016-12-16 2016-12-16 Data compression method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611167099.XA CN106681659A (en) 2016-12-16 2016-12-16 Data compression method and device

Publications (1)

Publication Number Publication Date
CN106681659A true CN106681659A (en) 2017-05-17

Family

ID=58870998

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611167099.XA Pending CN106681659A (en) 2016-12-16 2016-12-16 Data compression method and device

Country Status (1)

Country Link
CN (1) CN106681659A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107247562A (en) * 2017-06-30 2017-10-13 郑州云海信息技术有限公司 A kind of compression optimization method and its device
CN107392838A (en) * 2017-07-27 2017-11-24 郑州云海信息技术有限公司 WebP compression parallel acceleration methods and device based on OpenCL
CN107947799A (en) * 2017-11-28 2018-04-20 郑州云海信息技术有限公司 A kind of data compression method and apparatus
CN111124259A (en) * 2018-10-31 2020-05-08 深信服科技股份有限公司 Data compression method and system based on full flash memory array
CN113760192A (en) * 2021-08-31 2021-12-07 荣耀终端有限公司 Data reading method, data reading apparatus, storage medium, and program product

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6542644B1 (en) * 1996-09-02 2003-04-01 Fujitsu Limited Statistical data compression/decompression method
CN102611454A (en) * 2012-01-29 2012-07-25 上海锅炉厂有限公司 Dynamic lossless compressing method for real-time historical data
CN103136109A (en) * 2013-02-07 2013-06-05 中国科学院苏州纳米技术与纳米仿生研究所 Writing-in and reading method of solid-state memory system flash translation layer (FTL) with compression function
CN105808151A (en) * 2014-12-29 2016-07-27 华为技术有限公司 Solid-state disk storage device and data access method of solid-state disk storage device

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107247562A (en) * 2017-06-30 2017-10-13 郑州云海信息技术有限公司 A kind of compression optimization method and its device
CN107247562B (en) * 2017-06-30 2020-03-06 郑州云海信息技术有限公司 Compression optimization method and device
CN107392838A (en) * 2017-07-27 2017-11-24 郑州云海信息技术有限公司 WebP compression parallel acceleration methods and device based on OpenCL
CN107392838B (en) * 2017-07-27 2020-11-27 苏州浪潮智能科技有限公司 WebP compression parallel acceleration method and device based on OpenCL
CN107947799A (en) * 2017-11-28 2018-04-20 郑州云海信息技术有限公司 A kind of data compression method and apparatus
CN107947799B (en) * 2017-11-28 2021-06-29 郑州云海信息技术有限公司 Data compression method and device
CN111124259A (en) * 2018-10-31 2020-05-08 深信服科技股份有限公司 Data compression method and system based on full flash memory array
CN113760192A (en) * 2021-08-31 2021-12-07 荣耀终端有限公司 Data reading method, data reading apparatus, storage medium, and program product
CN113760192B (en) * 2021-08-31 2022-09-02 荣耀终端有限公司 Data reading method, data reading apparatus, storage medium, and program product

Similar Documents

Publication Publication Date Title
CN106681659A (en) Data compression method and device
CN102609360B (en) Data processing method, data processing device and data processing system
CN104750571B (en) Method for error correction, memory device and controller of memory device
US20130124796A1 (en) Storage method and apparatus which are based on data content identification
WO2018033035A1 (en) Solid-state drive control device and solid-state drive data access method based on learning
CN101916227B (en) RLDRAM SIO storage access control method and device
US9411519B2 (en) Implementing enhanced performance flash memory devices
US20160092361A1 (en) Caching technologies employing data compression
US11010056B2 (en) Data operating method, device, and system
CN103559027A (en) Design method of separate-storage type key-value storage system
US20180300250A1 (en) Method and apparatus for storing data
CN106776759A (en) The small documents pre-head method and system of distributed file system
CN109582598B (en) Preprocessing method for realizing efficient hash table searching based on external storage
CN106648955A (en) Compression method and relevant device
CN107391544A (en) Processing method, device, equipment and the computer storage media of column data storage
WO2023000536A1 (en) Data processing method and system, device, and medium
WO2023197507A1 (en) Video data processing method, system, and apparatus, and computer readable storage medium
US9619400B2 (en) Efficient management of computer memory using memory page associations and memory compression
CN104239231B (en) A kind of method and device for accelerating L2 cache preheating
US20140258247A1 (en) Electronic apparatus for data access and data access method therefor
CN107577614B (en) Data writing method and memory system
CN107423425A (en) A kind of data quick storage and querying method to K/V forms
CN102722456B (en) Flash memory device and data writing method thereof
CN108170376A (en) The method and system that storage card is read and write
CN102360381B (en) Device and method for performing lossless compression on embedded program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170517