WO2006063057A2 - Applying multiple compression algorithms in a database system - Google Patents
- Publication number
- WO2006063057A2 (application PCT/US2005/044275)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- page
- data
- compression
- storage
- sub
- Prior art date
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
Definitions
- This invention relates to systems, methods, and computer program products for compressing data in a database system.
- Many database systems use B-Trees or some other similar "page"-based structure to store collections of structured data.
- B-Tree systems generally provide efficient methods to store and access large amounts of dynamic data on slow media, such as tape or hard disk ("sub-storage"). Such data is typically more than would ordinarily fit in Random Access Memory ("RAM").
- B-Tree systems make no assumption about what type of data is being stored, allowing the B-Tree systems to be flexible enough for most kinds of data.
- B-Tree systems generally limit the data to "tables" where each item is stored as a row, with its elements stored in columns (the set of columns being the same for all items in the table).
- Each column is defined to contain a fixed size number or a string (either of a fixed size or of variable size).
- compression algorithms remove redundancy in data, thus making the data smaller. This is generally desirable since storing the original version of the data on disk often takes longer than it takes to both compress the data and store the smaller or compressed version of the data on disk.
- a number of different types of compression have been implemented to remove redundancy in data to provide such storage efficiencies.
- Compression can shrink the column data before it is put into the columns (i.e., "intra-row" compression), provided the table system supports the type of data being put in and is still able to sort the rows.
- Alternatively, compression is utilized to shrink the size of the resulting pages (i.e., "inter-row" compression).
- intra-row compression involves applying compression before the values are entered into columns, since the compression works within a single row.
- the storage savings from intra-row compression are minimal since most of the data redundancy in a page-based database is between the rows in a table, not within the rows.
- Inter-row compression, or compression of several rows in a table, results in much better compression, but produces chunks of data of different sizes (i.e., each page started out the same size, but the compression works differently on each one, resulting in different sizes). Since "inter-row" compression results in chunks of data that vary in size, inter-row compression is generally used with a sub-storage that supports storage and retrieval of variable-sized data.
- variable-sized data chunks such as with inter-row compression
- inter-row compression can result in significant performance degradation.
- much of the space savings offered by inter-row compression is wasted by the sub-storage system as the sub-storage tries to compensate for having to support variable-sized chunks.
- One example of a conventional inter-row compression system in a database is a database that uses a "symbol table".
- a database system such as this looks for common values for each column, and only stores one version of that value in the symbol table, which is also stored in the same page.
- the symbol table refers back to that value whenever the value occurs again in columns stored in the same page.
- this type of compression is an example of inter-row compression, since the compression works by looking at values common in more than one row in the table.
- the problem of variable-sized chunks of data is solved by applying the compression as the items are placed into the pages.
- An example of intra-row compression includes one type of a full-text indexing system.
- a full-text indexing system that uses gamma encoding may assume that smaller numbers are used much more frequently than large ones. The system then stores numbers with a variable number of bytes, where small numbers only take a small number of bytes, and large numbers take more bytes (even more than their corresponding normal fixed-width representation). Where the smaller numbers are represented more frequently in the indexing system, the gamma encoding can provide measurable space savings.
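The byte-oriented behaviour described above can be sketched as follows. This is a minimal illustration, not the encoding used by the patent: it swaps in a LEB128-style variable-byte code (7 payload bits per byte) as a stand-in for the "gamma encoding" the text names, and the function names `encode_varint` and `decode_varint` are hypothetical.

```python
from __future__ import annotations


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer using 7 payload bits per byte."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: this is the last byte
            return bytes(out)


def decode_varint(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Return (value, next_position) for the varint starting at data[pos]."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos
```

With this stand-in, a value such as 10 takes one byte and 5789 takes two, while the largest 32-bit value takes five bytes, one more than its fixed-width form, which matches the trade-off the paragraph describes.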
- Another type of compression is delta (i.e., difference) compression.
- This type of compression is sometimes used to store databases, such as one used to store a dictionary in as small a space as possible, where many of the data terms have at least some similarity.
- a delta compression algorithm takes advantage of the fact that words in the dictionary, when stored in order, frequently start with a sequence of letters identical to the previous word in the list. For example, after "rabbi", the next word in the dictionary might be "rabbit". The word "rabbit" could be stored as "5t", indicating that the first 5 letters of this word are the same as in the previous word, followed by the added letter "t".
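The "rabbi" / "rabbit" example can be sketched in a few lines. This is an illustrative sketch only; the function names and the (shared-prefix length, suffix) tuple representation are assumptions, chosen to mirror the "5t" notation in the text.

```python
from __future__ import annotations


def front_encode(words: list[str]) -> list[tuple[int, str]]:
    """Store each word as (letters shared with the previous word, remaining suffix)."""
    encoded = []
    previous = ""
    for word in words:
        shared = 0
        limit = min(len(word), len(previous))
        while shared < limit and word[shared] == previous[shared]:
            shared += 1
        encoded.append((shared, word[shared:]))
        previous = word
    return encoded


def front_decode(encoded: list[tuple[int, str]]) -> list[str]:
    """Rebuild the word list by reusing the prefix of the previous word."""
    words = []
    previous = ""
    for shared, suffix in encoded:
        word = previous[:shared] + suffix
        words.append(word)
        previous = word
    return words


example = front_encode(["rabbi", "rabbit"])  # [(0, "rabbi"), (5, "t")]
```

The scheme only pays off when the words are kept in sorted order, since that is what makes consecutive entries share long prefixes.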
- a database index using a B-Tree system might store the word "zoo", but then use a separate (non-page-based) data stream to store the corresponding list of rows in which the word "zoo" occurs, using delta compression (storing only the difference between numbers in an increasing sequence) and gamma compression (storing smaller numbers using fewer bytes).
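Delta compression of an increasing row list can be sketched as below. The row numbers are made-up sample values, and `delta_encode` / `delta_decode` are hypothetical names; the point is only that the stored gaps are much smaller than the original numbers, which is what lets a small-number-friendly encoding save space afterwards.

```python
from __future__ import annotations

from itertools import accumulate


def delta_encode(increasing: list[int]) -> list[int]:
    """Replace each number by its difference from the previous one."""
    gaps = []
    previous = 0
    for value in increasing:
        if value < previous:
            raise ValueError("input must be a non-decreasing sequence")
        gaps.append(value - previous)
        previous = value
    return gaps


def delta_decode(gaps: list[int]) -> list[int]:
    """Running sum of the gaps restores the original sequence."""
    return list(accumulate(gaps))


rows_containing_zoo = [1000, 1003, 1010, 1042]   # hypothetical row numbers
gaps = delta_encode(rows_containing_zoo)          # [1000, 3, 7, 32]
```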
- an advantage in the art can be realized with systems, methods, and computer program products that efficiently combine the benefits of several compression algorithms into a single database system, while retaining the system's ability to efficiently make incremental changes to the data.
- the present invention solves one or more of the foregoing problems in the prior art with database systems and methods that provide for the efficient use of multiple compression algorithms in a way that data can be compressed for significant space savings, and can be easily retrieved and read when needed.
- implementations of the present invention provide for the efficient use of both intra-row and inter-row compression techniques in a database system using a page-based structure and a compression plug-in which facilitates access to data from the page-based structure and writing of new data into sub-storage in an efficient manner.
- a request is received to access (i.e., add, delete, modify) data contained within a database page.
- a compression plug-in retrieves the database page from sub-storage, allocates a page buffer based on a stored value indicating the page size when inter-row decompressed, and then inter-row decompresses the page into that page buffer.
- the page data remains in intra-row compressed form within the page buffer; and any data added to the page buffer is added using intra-row compression techniques, such as gamma encoding.
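A rough sketch of the read path described above, under stated assumptions: the on-disk layout (an 8-byte header holding the decompressed and compressed sizes), the 4096-byte page size, and the use of `zlib` as a stand-in for inter-row compression are all hypothetical and not taken from the patent.

```python
import struct
import zlib

PAGE_SIZE = 4096                 # hypothetical fixed page size in sub-storage
HEADER = struct.Struct(">II")    # (decompressed size, compressed size)


def store_page(buffer_bytes: bytes) -> bytes:
    """Compress a page buffer and pad it to one fixed-size page."""
    payload = zlib.compress(buffer_bytes)   # stand-in for inter-row compression
    page = HEADER.pack(len(buffer_bytes), len(payload)) + payload
    if len(page) > PAGE_SIZE:
        raise ValueError("compressed data does not fit in one page")
    return page.ljust(PAGE_SIZE, b"\x00")


def load_page(page: bytes) -> bytearray:
    """Allocate a buffer from the stored size, then decompress into it."""
    raw_size, packed_size = HEADER.unpack_from(page, 0)
    page_buffer = bytearray(raw_size)       # sized from the stored value
    start = HEADER.size
    data = zlib.decompress(page[start:start + packed_size])
    if len(data) != raw_size:
        raise ValueError("stored size does not match decompressed data")
    page_buffer[:] = data
    return page_buffer
```

Storing the decompressed size alongside the page means the buffer can be allocated once, at the right size, before any decompression work starts.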
- the compression plug-in begins by compressing the data in the page buffer using inter-row compression.
- the compression plug-in identifies if there is sufficient space in the page in sub-storage to store the data in the page buffer. If there is sufficient space to store the intra-row and inter-row compressed data from the page buffer to the page in sub-storage, the compressed data from the page buffer is saved into the page in the sub-storage. If there is too much data to fit into the page in the sub-storage, the page buffer is split into one or more additional page buffers, as appropriate, and one or more corresponding fixed-size pages are also created in the sub-storage.
- the compression plug-in then inter-row compresses each page buffer and writes the compressed data into the corresponding fixed-size pages in the sub-storage.
- the compression plug-in is utilized to allocate the page buffer, access data from sub-storage, manage compression of data to and from sub-storage, allocate new page buffers and pages in sub-storage as required, and inform the B-Tree or other row management system of the addition of new pages as a result of a page buffer split.
- Utilizing the compression plug-in for such functionality provides a number of benefits.
- the compression format can be changed, altered, or dynamically customized according to the type of underlying data to be stored without affecting the underlying storage format or row management system.
- the compression plug-in facilitates the determination of the need to create additional pages in sub-storage without first attempting to write the data into sub-storage.
- the use of a compression plug-in also allows an underlying B-Tree or other data storage structure to maintain the data in fixed-size pages in sub-storage. By utilizing fixed-size pages in sub-storage, optimal efficiency of the underlying storage format is maintained as new pages in sub-storage are created to accommodate additional data being written from page buffers.
- implementations such as these in accordance with the present invention provide the ability to custom-tailor multiple types of compression for each data type being stored, while retaining fixed-size pages in sub-storage. Furthermore, implementations in accordance with the present invention provide these advantages without necessarily requiring any changes to the B-Tree (or other row management) system. Such implementations also provide the ability to maintain an acceptable level of accessibility and modifiability in the database system.
- Figure 1 is a block diagram of an illustrative system utilizing a compression plug-in to control access and storage of new data into sub-storage according to one embodiment of the present invention.
- Figure 2 illustrates the manner in which the compression plug-in of Figure 1 is utilized to access data from sub-storage and utilize a page buffer to add new data to the page data.
- Figure 3 illustrates the manner in which data is transferred from the page buffer to sub-storage utilizing the compression plug-in of Figure 1.
- Figure 4 is a flow diagram illustrating the manner in which a compression plug-in transfers new data for storage in sub-storage according to one embodiment of the present invention.
- Figure 5 is a flow diagram illustrating the manner in which the compression plug-in determines whether to create additional pages within sub-storage for transferring data to sub-storage.
- Figure 6 is a block diagram illustrating the manner in which the compression plug-in utilizes additional page buffers to more efficiently write data to additional pages in sub-storage.
- the present invention extends to systems and methods that provide for the efficient use of multiple compression algorithms in a way that data can be compressed for significant space savings, and can be easily retrieved and read when needed.
- implementations of the present invention provide for the efficient use of both intra-row and inter-row compression techniques in a database system using a page-based structure and a compression plug-in which facilitates access to data from the page-based structure and writing of new data into sub-storage in an efficient manner.
- the present invention can separate compression from both the sub-storage and the row management system, and thus balance saving space with accessibility and modifiability.
- the compression plug-in facilitates the determination of the need to create additional pages in sub-storage without first attempting to write the data into sub-storage. Due to the inherent inefficiencies of transferring data to sub-storage, utilizing a compression plug-in to determine whether there is sufficient space in the page before transferring data to sub-storage results in substantial performance efficiencies.
- Figure 1 is a block diagram of an illustrative system utilizing a compression plug-in to control access and transfer of data to and from sub-storage according to one embodiment of the present invention.
- a system 10 is provided between sub-storage 12 and a buffer 14.
- system 10 is operably linked to sub-storage 12 and buffer 14 such that page data can be accessed from sub-storage 12 for the addition of data into the page in sub-storage 12.
- information in a page in sub-storage 12 corresponds with information stored in a database. In the event that additional data needs to be added to the database, the page corresponding with the information to be stored is accessed from sub-storage 12.
- the page data in sub-storage 12 is compressed for efficient storage of the page data in the underlying storage format (i.e. B-Tree data structures).
- the page data accessed from sub-storage 12 is at least partially decompressed and sent to buffer 14.
- the data is stored in buffer 14 allowing the new data to be added to the page data as appropriate.
- a compression plug-in 16 is provided in connection with system 10.
- Compression plug-in 16 provides compression and decompression of data.
- Compression plug-in 16 controls access of page data from sub-storage 12, including providing decompression of page data being accessed from sub-storage 12. Additionally, compression plug-in 16 allocates buffer 14 for data transferred from sub-storage 12, including providing transmission of decompressed data to page buffer 14.
- Compression plug-in 16 also facilitates management of system 10, including the row manager, allowing for compression of new data being added to buffer 14.
- decompressed page data 18 accessed from sub-storage 12 is provided to buffer 14 utilizing compression plug-in 16. Subsequent to the addition of new data from system 10 to buffer 14, compression plug-in 16 facilitates compression of the data in buffer 14 for storage in sub-storage 12. Compression plug-in 16 then transmits compressed page data 20 from buffer 14 into sub-storage 12.
- By applying compression using compression plug-in 16 while the data is being added to the pages used by the row management system, the present invention can separate the compression from both sub-storage 12 and the row management system of system 10, and thus balance saving space with accessibility and modifiability. The balance needed for each data type may be different.
- compression plug-in 16 allows for changing of the compression algorithm without changing the underlying row management or sub-storage systems, providing the ideal balance between compression and modifiability for each type of data that the system stores, without necessarily requiring modification of the underlying systems.
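One way to picture this separation is a small codec interface that the storage layer calls without knowing which algorithm sits behind it. The sketch below is an assumption-laden illustration: `PageCodec`, `ZlibCodec`, `IdentityCodec`, and `PageStore` are hypothetical names, and `zlib` merely stands in for whatever compression a real plug-in would apply.

```python
import zlib
from typing import Dict, Protocol


class PageCodec(Protocol):
    """Hypothetical plug-in interface: any object with these two methods fits."""

    def compress(self, data: bytes) -> bytes: ...

    def decompress(self, data: bytes) -> bytes: ...


class ZlibCodec:
    def compress(self, data: bytes) -> bytes:
        return zlib.compress(data)

    def decompress(self, data: bytes) -> bytes:
        return zlib.decompress(data)


class IdentityCodec:
    """No compression at all; useful for data that must stay cheap to modify."""

    def compress(self, data: bytes) -> bytes:
        return data

    def decompress(self, data: bytes) -> bytes:
        return data


class PageStore:
    """Stand-in for sub-storage; it never inspects how pages are compressed."""

    def __init__(self, codec: PageCodec) -> None:
        self.codec = codec
        self.pages: Dict[int, bytes] = {}

    def write(self, page_id: int, page_buffer: bytes) -> None:
        self.pages[page_id] = self.codec.compress(page_buffer)

    def read(self, page_id: int) -> bytes:
        return self.codec.decompress(self.pages[page_id])
```

Swapping `ZlibCodec()` for `IdentityCodec()` when constructing `PageStore` changes the stored bytes but not a single line of the store itself, which is the kind of independence the paragraph describes.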
- compression plug-in 16 can be configured for use with any traditional B-Trees, B+Trees, B*Trees, Binary Trees, N-way Trees, Database Tables, Hash-Trees, or any other page-based storage system, with little modification, and without affecting the system's ability to decide on what pages data should be stored.
- Figure 2 illustrates the manner in which compression plug-in 16 is utilized to access data from sub-storage 12 and utilize Page A Buffer 14a to add new data.
- system 10 loads a page (i.e. Page A) and the corresponding data (i.e. Page A Data) from the sub-storage 12.
- Compression plug-in 16 creates a corresponding Page A Buffer 14a that is larger than the data loaded from sub-storage 12.
- the data in sub-storage 12 i.e. Page A Data
- the compression plug-in 16 provides inter-row decompression of the data from sub-storage 12 while leaving the data in intra-row compression.
- the larger size of Page A Buffer 14a is used to store a version of the intra-row compressed data (i.e. Page A data in Page A Buffer 14a).
- New data 22 to be added to Page A data in Page A Buffer 14a is provided in connection with compression plug-in 16.
- compression plug-in 16 applies intra-row compression to the new data, resulting in intra-row compressed new data 24.
- Compression plug-in 16 operates in connection with a row manager to determine the placement of the new data 24 relative to the existing page data in Page A Buffer 14a.
- Compression plug-in 16 then inserts the new data into Page A Buffer 14a (expanding Page A Buffer 14a if needed).
- the compression plug-in operates in connection with the row manager before intra-row compression of the data.
- the compression plug-in compresses the page data independent of the row manager and subsequently the row manager adds the intra-row compressed data in the page buffer without the use of the compression plug-in.
- the data from sub-storage is completely decompressed before addition to the page buffer.
- the data in sub-storage is compressed with a single compression algorithm (such as index compression) and the compression plug-in is utilized to control the addition of new data into the sub-storage in the single compression format.
- Figure 3 illustrates the manner in which data is written from page buffer 14 to sub-storage 12 utilizing compression plug-in 16.
- page buffer 14 contains the original data inter-row decompressed from sub-storage 12 (i.e. Page A Data) plus the new data intra-row compressed from compression plug-in 16 (i.e. New Page A Data).
- compression plug-in 16 identifies that the data in page buffer 14 is ready to be sent to sub-storage 12.
- compression plug-in 16 then applies inter-row compression to both the Page A Data and the New Page A Data.
- inter-row compression can include delta-row compression or other known inter-row compression algorithms. Because the data in page buffer 14 was stored in intra-row compression, the additional inter-row compression provided by compression plug-in 16 results in both intra-row and inter-row compression of the data from page buffer 14. The compressed page data from page buffer 14, including the new page data, is then sent to a page in sub-storage 12 corresponding with page buffer 14. Subsequent to transmission of the data from compression plug-in 16 to sub-storage 12, the data is stored in sub-storage 12 in both an intra-row compressed and inter-row compressed format. This provides the compression benefits of inter-row compression while maintaining fixed-size pieces of data, which allows for optimized accessibility, modifiability, and overall system performance.
- index data can also benefit greatly.
- traditional SQL databases use B-Trees to index data stored in tables.
- One of the most complicated (and space-consuming) indexes in a database is a full-text index.
- every word from every document in the table is indexed so that, by looking up the word in the B-Tree, one can quickly find which documents contain that particular word.
- this data is enormous since each entry in the index stores the word, a document identifier, and a position within the document where that word occurs.
- a typical full-text index might be represented as follows:
- Each of the entries in this table accounts for the fact that there might be many documents and some documents may be very long. As such, the fields used to store the document identifier and the position information must be large enough to indicate the last possible word in the last possible document in the system. Thus, in the example given above, 32 bits would be needed to store the document identifier, and 16 bits would be needed to store the position (though this would limit the documents to 65536 words). As such, a total of (at least) 17 bytes would be needed for each row of data (11 bytes for the string "zoological" and a terminator or length indicator, 4 bytes for the document identifier, and 2 bytes for the position information), for a total of 136 bytes across the eight rows.
- Any values used more than once in the page (i.e., the string "zoological" and the document identifiers 5789 and 88764) could be reduced to a single instance, plus one byte (or more) per instance. This would reduce the total size to 17 (first row) + 8 (each instance of "zoological") + 4 (5789) + 4 (each instance of 5789) + 4 (88764) + 3 (each instance of 88764) + 4 (9947852) + 8 * 2 (positions), for a total of 60 bytes.
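The two byte totals quoted above can be checked with a few lines of arithmetic. The per-term values are copied from the text; the row count of eight is inferred from the "8 * 2 (positions)" term and from 136 / 17, since the example table itself is not reproduced here.

```python
# Fixed-width layout: every row repeats the word, the document id, and the position.
WORD_BYTES = 11        # "zoological" plus a terminator or length indicator
DOC_ID_BYTES = 4       # 32-bit document identifier
POSITION_BYTES = 2     # 16-bit position
ROW_COUNT = 8          # inferred from the totals given in the text

row_bytes = WORD_BYTES + DOC_ID_BYTES + POSITION_BYTES
uncompressed_total = ROW_COUNT * row_bytes

# Symbol-table layout: the terms exactly as listed in the text.
symbol_table_terms = [
    17,      # first row
    8,       # each instance of "zoological"
    4,       # 5789
    4,       # each instance of 5789
    4,       # 88764
    3,       # each instance of 88764
    4,       # 9947852
    8 * 2,   # positions
]
symbol_table_total = sum(symbol_table_terms)

print(row_bytes, uncompressed_total, symbol_table_total)
```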
- implementations of the present invention provide for storage of all of the page data in the B-Tree database system while using three compression algorithms.
- "zoological" is only stored once
- each unique document identifier is only stored once
- both the document identifiers and the positions of the page data are stored using only the increase from the previous item.
- the present system uses gamma encoding to store small numbers with fewer bytes.
- this sample data might require only 11 ("zoological") + 2 (5789 gamma encoded) + 2 (2652 gamma encoded) + 1 ("zoological" and 5789 repeat indicator) + 1 (2752 - 2725 gamma encoded) + 1 ("zoological" and 5789 repeat indicator) + 1 (2731 - 2652 gamma encoded) + 1 ("zoological" and 5789 repeat indicator) + 1 (2788 - 2731 gamma encoded) + 1 ("zoological" repeat indicator) + 3 (88764 - 5789 gamma encoded) + 1 (10 gamma encoded) + 1 ("zoological" and 88764 repeat indicator) + 1 (66 - 10 gamma encoded) + 1 ("zoological" and 88764 repeat indicator) + 1 (82 - 66 gamma encoded) + 4 (9947852 - 88764 gamm
- Figure 4 is a flow diagram illustrating the manner in which a compression plug-in is utilized to insert new data into sub-storage according to one embodiment of the present invention.
- new data is received in step 26.
- a page in sub-storage having data corresponding with the new data is identified in step 26.
- the page and corresponding data is then accessed from sub-storage in step 28.
- the page data is decompressed using inter-row decompression and sent to a page buffer corresponding with the page in step 32.
- the new data is compressed using intra-row compression in step 34.
- the new data compressed using intra-row compression is then added to the inter-row decompressed data in the page buffer in step 36.
- the data in the page buffer is compressed using inter-row compression in step 38. It is then determined whether the compressed data can be stored in the corresponding page in sub-storage in step 40. In the event that it is determined that the compressed data can be stored in the corresponding page in sub-storage, the inter-row compressed data is then stored in the corresponding page in sub-storage in step 42.
- the compression plug-in is configured to determine, before attempting to write the data from the page buffer to the page in the sub-storage corresponding with the page buffer, whether there is sufficient space in the page in sub-storage to accommodate the data from the page buffer. In the event that there is sufficient space in the page in sub-storage corresponding with the page buffer, the data is stored in the page in sub-storage. In the event that there is insufficient space in the page in sub-storage, additional space is allocated to store the information in sub-storage before attempting to store the data in sub-storage.
- Figure 5 illustrates a method utilized to allocate additional space for the storage of the data from the page buffer before attempting to store the data in sub-storage according to one embodiment of the present invention.
- a request is received to enter data from a page buffer in which additional data has been added into a page in sub-storage in step 44.
- the data from the page buffer is compressed using inter-row compression in step 46.
- the amount of space provided by the page in sub-storage is then determined in step 48. Once the amount of space provided by the page in sub-storage is determined, the size of the intra-row and inter-row compressed data is determined in step 50.
- It is then identified whether compressed data from the page buffer will fit into the corresponding page in sub-storage in step 52. If there is sufficient space in the page in sub-storage corresponding with the page buffer, data is saved in a page of sub-storage in step 60. If there is insufficient space in the page in sub-storage corresponding with the page buffer, additional buffers and pages in sub-storage are created to accommodate the amount of compressed data in step 54. The compressed data is decompressed using inter-row decompression and then allocated to the page buffers in step 56. Once the data has been allocated to the additional page buffers, the data from each individual page buffer is compressed using inter-row compression and sent to the respective pages in sub-storage such that each page receives inter-row compressed page data from its respective page buffer in step 58.
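The compress, measure, then split-if-needed flow of Figure 5 can be sketched as below. This is a sketch under assumptions, not the patented method: `zlib` stands in for inter-row compression, the 512-byte page size is arbitrary, splitting in half is only one possible allocation policy, and all function names are hypothetical.

```python
from __future__ import annotations

import hashlib
import zlib

PAGE_SIZE = 512  # hypothetical fixed page size, in bytes


def pack_rows(rows: list[bytes]) -> bytes:
    """Stand-in for inter-row compression: length-prefix each row, then zlib."""
    joined = b"".join(len(row).to_bytes(2, "big") + row for row in rows)
    return zlib.compress(joined)


def unpack_rows(page: bytes) -> list[bytes]:
    """Inverse of pack_rows."""
    joined = zlib.decompress(page)
    rows, pos = [], 0
    while pos < len(joined):
        size = int.from_bytes(joined[pos:pos + 2], "big")
        rows.append(joined[pos + 2:pos + 2 + size])
        pos += 2 + size
    return rows


def split_into_pages(rows: list[bytes]) -> list[bytes]:
    """Compress first, compare against the page size, split only on overflow."""
    packed = pack_rows(rows)
    if len(packed) <= PAGE_SIZE:
        return [packed]                      # fits: a single page is enough
    if len(rows) < 2:
        raise ValueError("a single row does not fit in one page")
    middle = len(rows) // 2                  # split the buffer into two buffers
    return split_into_pages(rows[:middle]) + split_into_pages(rows[middle:])


# Poorly compressible sample rows, so that a split is actually required.
sample_rows = [hashlib.sha256(str(i).encode()).digest() for i in range(40)]
pages = split_into_pages(sample_rows)
```

Because the size check happens on the in-memory compressed bytes, no write to sub-storage is attempted until every resulting chunk is known to fit.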
- the determination of the sufficiency of space on the page(s) in sub-storage performed by the compression plug-in provides significant performance savings in the data storage system.
- the attempt to write that data to the page in sub-storage results in significant consumption of system operating time.
- the data is retrieved from sub-storage, decompressed, split into additional page buffers, and then re-written to storage.
- the compression plug-in utilizes the row-management system to allocate data into multiple page buffers once it is determined that there is insufficient space in the page(s) in sub-storage to accommodate the data in a particular page buffer.
- the compressed data is not decompressed when additional page buffers are allocated and the data is inserted into the individual page buffers.
- the page data is completely decompressed before being allocated to individual page buffers.
- the size of the pages in the sub-storage is fixed, and the compression plug-in determines whether the size of the compressed data is larger than the size of the fixed-size pages.
- Figure 6 is a block diagram illustrating the manner in which compression plug-in 16 utilizes additional page buffers to more efficiently transfer data to additional pages in sub-storage.
- compression plug-in has identified that the size of the Page A 66 in sub-storage was insufficient to accommodate the data originally retrieved from Page A 66 in combination with the new data added to the data retrieved from Page A 66.
- compression plug-in 16 has allocated an additional page buffer 64 in addition to page buffer 14.
- An additional page (i.e., Page B 68) has been allocated, which corresponds with page buffer 64.
- Page B 68 and Page A 66 provide sufficient space for the compressed data which needs to be stored.
- Once page buffer 64 has been allocated, the data is allocated to Page A Buffer 14a and Page B Buffer 64 using row manager 62. Utilizing row manager 62 allows for the organized and efficient storage of the data in individual page buffers (i.e., Page A Buffer 14a and Page B Buffer 64). Once the data has been allocated to Page A Buffer 14a and Page B Buffer 64, the data is individually retrieved from each of Page A Buffer 14a and Page B Buffer 64, compressed using inter-row compression, and sent for storage to Page A 66 and Page B 68. For example, according to one embodiment of the present invention, subsequent to allocation of the inter-row decompressed data to Page A Buffer 14a and Page B Buffer 64, compression plug-in 16 accesses data from Page A Buffer 14a.
- Compression plug-in 16 then compresses the data from Page A Buffer 14a utilizing inter-row compression. Once the data from Page A Buffer 14a is intra-row and inter-row compressed, compression plug-in 16 confirms that there is sufficient space in Page A 66 to store the compressed data. The compressed data from Page A Buffer 14a is then sent to Page A 66 in sub-storage. Compression plug-in 16 then accesses the data from Page B Buffer 64, compresses the data using inter-row compression, confirms that there is sufficient storage space in Page B 68, and sends the compressed data to Page B 68 in sub-storage.
- if the compression plug-in cannot fit the data from the page buffer into the corresponding page in sub-storage, the compression plug-in indicates the condition to the row manager.
- the row manager system handles the condition by assigning one or more additional page buffers in the sub-storage, updating the relevant information in the row manager system, and then telling the compression plug-in to "split" the data in the page buffer into multiple page buffers.
- the compression plug-in may try to balance the data relatively equally in each page buffer, as appropriate. Notwithstanding the allocation system used to store data, each page buffer contains the rows of assigned data having intra-row compression applied thereto.
- more than one additional page buffer and/or page in sub-storage is allocated based on the size of the compressed data that needs to be stored.
- only a single additional page buffer and sub-storage page set is initially provided. After splitting the compressed data into the page buffers and recompressing the data from the individual pages, it is then determined whether additional page buffers and pages in sub-storage are needed.
- the manner in which data is allocated to individual page buffers is tailored to the type of data to be stored.
- systems in accordance with the present invention can provide benefits to many commercial database systems.
- one benefit provided by the present invention allows the user of those systems to more specifically identify what type of data is being stored so that the database system could compress the rows more effectively.
- Another benefit is allowing the user to directly specify the compression format to use when storing the rows.
- some frequently used data types can be tailored by the database system itself, and can greatly improve performance and storage requirements for indexes, for example full-text indexes, while retaining their flexibility for storing large amounts of dynamic data.
- the page buffer may be split only into two page buffers to accommodate extra data, and may also be split more flexibly into additional page buffers, as appropriate.
- data can be allocated relatively unevenly, into each of the one, two, or three (etc.) additional buffers.
- the compression plug-in can distribute the items in the specified page buffers into the corresponding specified pages in the proportions specified, such that 15% of the data is allocated to the first page, 70% of the data is allocated in the next page, and 15% of the data is allocated in the last page.
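An uneven split such as the 15% / 70% / 15% example can be sketched as follows. The function name, the use of whole-number percentages, and the choice to measure shares by item count rather than by byte size are all assumptions made for this illustration.

```python
from __future__ import annotations


def split_by_percentages(items: list, percentages: list[int]) -> list[list]:
    """Distribute items across buffers in the given whole-number percentages."""
    if sum(percentages) != 100:
        raise ValueError("percentages must add up to 100")
    total = len(items)
    buffers = []
    start = 0
    cumulative = 0
    for share in percentages[:-1]:
        cumulative += share
        end = total * cumulative // 100   # integer math avoids float rounding
        buffers.append(items[start:end])
        start = end
    buffers.append(items[start:])         # last buffer takes whatever remains
    return buffers


rows = list(range(100))                    # hypothetical rows in one page buffer
first, middle, last = split_by_percentages(rows, [15, 70, 15])
```

Working from cumulative boundaries rather than per-buffer counts guarantees that every item lands in exactly one buffer and that the original order is preserved.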
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US63385904P | 2004-12-06 | 2004-12-06 | |
US60/633,859 | 2004-12-06 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2006063057A2 (fr) | 2006-06-15 |
WO2006063057A3 (fr) | 2007-04-26 |
Family ID: 36578522
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2005/044275 WO2006063057A2 (fr) | 2004-12-06 | 2005-12-06 | Application of multiple compression algorithms in a database system |
Country Status (2)
Country | Link |
---|---|
US (1) | US7769728B2 (fr) |
WO (1) | WO2006063057A2 (fr) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7200603B1 (en) * | 2004-01-08 | 2007-04-03 | Network Appliance, Inc. | In a data storage server, for each subsets which does not contain compressed data after the compression, a predetermined value is stored in the corresponding entry of the corresponding compression group to indicate that corresponding data is compressed |
US7769728B2 (en) * | 2004-12-06 | 2010-08-03 | Ivie James R | Method and system for intra-row, inter-row compression and decompression of data items in a database using a page-based structure where allocating a page-buffer based on a stored value indicating the page size |
US10348897B2 (en) | 2017-06-27 | 2019-07-09 | Avaya Inc. | System and method for reducing storage space in a contact center |
Families Citing this family (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2007005829A2 (fr) * | 2005-07-01 | 2007-01-11 | Nec Laboratories America, Inc. | Operating-system-based memory compression for embedded systems |
US7496589B1 (en) * | 2005-07-09 | 2009-02-24 | Google Inc. | Highly compressed randomly accessed storage of large tables with arbitrary columns |
US7548928B1 (en) | 2005-08-05 | 2009-06-16 | Google Inc. | Data compression of large scale data stored in sparse tables |
US7668846B1 (en) | 2005-08-05 | 2010-02-23 | Google Inc. | Data reconstruction from shared update log |
US8077059B2 (en) * | 2006-07-21 | 2011-12-13 | Eric John Davies | Database adapter for relational datasets |
US9195695B2 (en) * | 2006-09-15 | 2015-11-24 | Ibm International Group B.V. | Technique for compressing columns of data |
US8266147B2 (en) * | 2006-09-18 | 2012-09-11 | Infobright, Inc. | Methods and systems for database organization |
WO2008034213A1 (fr) * | 2006-09-18 | 2008-03-27 | Infobright Inc. | Method and system for data compression in a relational database |
US8386444B2 (en) * | 2006-12-29 | 2013-02-26 | Teradata Us, Inc. | Techniques for selective compression of database information |
US7962638B2 (en) * | 2007-03-26 | 2011-06-14 | International Business Machines Corporation | Data stream filters and plug-ins for storage managers |
US20090043792A1 (en) * | 2007-08-07 | 2009-02-12 | Eric Lawrence Barsness | Partial Compression of a Database Table Based on Historical Information |
US8805799B2 (en) * | 2007-08-07 | 2014-08-12 | International Business Machines Corporation | Dynamic partial uncompression of a database table |
US7747585B2 (en) * | 2007-08-07 | 2010-06-29 | International Business Machines Corporation | Parallel uncompression of a partially compressed database table determines a count of uncompression tasks that satisfies the query |
US20090204967A1 (en) * | 2008-02-08 | 2009-08-13 | Unisys Corporation | Reporting of information pertaining to queuing of requests |
US20090282064A1 (en) * | 2008-05-07 | 2009-11-12 | Veeramanikandan Raju | On the fly compression and storage device, system and method |
US20090287986A1 (en) * | 2008-05-14 | 2009-11-19 | Ab Initio Software Corporation | Managing storage of individually accessible data units |
CN102239472B (zh) | 2008-09-05 | 2017-04-12 | Hewlett-Packard Development Company, L.P. | Efficiently storing log data while supporting querying |
US8484351B1 (en) | 2008-10-08 | 2013-07-09 | Google Inc. | Associating application-specific methods with tables used for data storage |
US8285691B2 (en) * | 2010-03-30 | 2012-10-09 | Ca, Inc. | Binary method for locating data rows in a compressed data block |
US8521748B2 (en) | 2010-06-14 | 2013-08-27 | Infobright Inc. | System and method for managing metadata in a relational database |
US8417727B2 (en) | 2010-06-14 | 2013-04-09 | Infobright Inc. | System and method for storing data in a relational database |
US8327070B2 (en) * | 2010-06-24 | 2012-12-04 | International Business Machines Corporation | Method for optimizing sequential data fetches in a computer system |
GB2483282B (en) * | 2010-09-03 | 2017-09-13 | Advanced Risc Mach Ltd | Data compression and decompression using relative and absolute delta values |
US8694474B2 (en) * | 2011-07-06 | 2014-04-08 | Microsoft Corporation | Block entropy encoding for word compression |
US8988444B2 (en) * | 2011-12-16 | 2015-03-24 | Institute For Information Industry | System and method for configuring graphics register data and recording medium |
US20130179409A1 (en) * | 2012-01-06 | 2013-07-11 | International Business Machines Corporation | Separation of data chunks into multiple streams for compression |
US8838577B2 (en) * | 2012-07-24 | 2014-09-16 | International Business Machines Corporation | Accelerated row decompression |
US10841405B1 (en) * | 2013-03-15 | 2020-11-17 | Teradata Us, Inc. | Data compression of table rows |
US9069660B2 (en) * | 2013-03-15 | 2015-06-30 | Apple Inc. | Systems and methods for writing to high-capacity memory |
US9569441B2 (en) | 2013-10-09 | 2017-02-14 | Sap Se | Archival of objects and dynamic search |
US9606769B2 (en) * | 2014-04-05 | 2017-03-28 | Qualcomm Incorporated | System and method for adaptive compression mode selection for buffers in a portable computing device |
US9952771B1 (en) * | 2016-03-31 | 2018-04-24 | EMC IP Holding Company LLC | Method and system for choosing an optimal compression algorithm |
US11288257B2 (en) * | 2016-05-30 | 2022-03-29 | Sap Se | Memory optimization using data aging in full text indexes |
US10432484B2 (en) * | 2016-06-13 | 2019-10-01 | Silver Peak Systems, Inc. | Aggregating select network traffic statistics |
CN106980541B (zh) * | 2017-03-10 | 2019-11-19 | Zhejiang University | Huge-page memory compression and reclamation system and method |
US20230325101A1 (en) * | 2022-04-12 | 2023-10-12 | Samsung Electronics Co., Ltd. | Systems and methods for hybrid storage |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5918225A (en) * | 1993-04-16 | 1999-06-29 | Sybase, Inc. | SQL-based database system with improved indexing methodology |
US6202136B1 (en) * | 1994-12-15 | 2001-03-13 | Bmc Software, Inc. | Method of creating an internally consistent copy of an actively updated data set without specialized caching hardware |
Family Cites Families (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2987206B2 (ja) * | 1991-12-13 | 1999-12-06 | Avid Technology, Inc. | Buffer and frame indexing |
US6092070A (en) * | 1992-02-11 | 2000-07-18 | Telcordia Technologies, Inc. | Method and system for lossless date compression and fast recursive expansion |
US5794228A (en) * | 1993-04-16 | 1998-08-11 | Sybase, Inc. | Database system with buffer manager providing per page native data compression and decompression |
US5794229A (en) * | 1993-04-16 | 1998-08-11 | Sybase, Inc. | Database system with methodology for storing a database table by vertically partitioning all columns of the table |
US5668897A (en) * | 1994-03-15 | 1997-09-16 | Stolfo; Salvatore J. | Method and apparatus for imaging, image processing and data compression merge/purge techniques for document image databases |
US5805086A (en) * | 1995-10-10 | 1998-09-08 | International Business Machines Corporation | Method and system for compressing data that facilitates high-speed data decompression |
US5696927A (en) * | 1995-12-21 | 1997-12-09 | Advanced Micro Devices, Inc. | Memory paging system and method including compressed page mapping hierarchy |
US6618728B1 (en) * | 1996-01-31 | 2003-09-09 | Electronic Data Systems Corporation | Multi-process compression |
US6301394B1 (en) * | 1998-09-25 | 2001-10-09 | Anzus, Inc. | Method and apparatus for compressing data |
JP2000305822A (ja) * | 1999-04-26 | 2000-11-02 | Denso Corp | Database management device, database record extraction device, database management method, and database record extraction method |
US6886098B1 (en) * | 1999-08-13 | 2005-04-26 | Microsoft Corporation | Systems and methods for compression of key sets having multiple keys |
US6411295B1 (en) * | 1999-11-29 | 2002-06-25 | S3 Graphics Co., Ltd. | Apparatus and method for Z-buffer compression |
US6523102B1 (en) * | 2000-04-14 | 2003-02-18 | Interactive Silicon, Inc. | Parallel compression/decompression system and method for implementation of in-memory compressed cache improving storage density and access speed for industry standard memory subsystems and in-line memory modules |
US6782136B1 (en) * | 2001-04-12 | 2004-08-24 | Kt-Tech, Inc. | Method and apparatus for encoding and decoding subband decompositions of signals |
US6857045B2 (en) * | 2002-01-25 | 2005-02-15 | International Business Machines Corporation | Method and system for updating data in a compressed read cache |
US6694323B2 (en) * | 2002-04-25 | 2004-02-17 | Sybase, Inc. | System and methodology for providing compact B-Tree |
US7171427B2 (en) * | 2002-04-26 | 2007-01-30 | Oracle International Corporation | Methods of navigating a cube that is implemented as a relational object |
US9195699B2 (en) * | 2003-08-08 | 2015-11-24 | Oracle International Corporation | Method and apparatus for storage and retrieval of information in compressed cubes |
US20060005047A1 (en) * | 2004-06-16 | 2006-01-05 | Nec Laboratories America, Inc. | Memory encryption architecture |
US7769728B2 (en) * | 2004-12-06 | 2010-08-03 | Ivie James R | Method and system for intra-row, inter-row compression and decompression of data items in a database using a page-based structure where allocating a page-buffer based on a stored value indicating the page size |
EP1958072A4 (fr) * | 2005-12-08 | 2012-05-02 | Intel Corp | Header compression/decompression software |
2005
- 2005-12-06: US application US11/294,943, patent US7769728B2 (en); status: not active, Expired - Fee Related
- 2005-12-06: WO application PCT/US2005/044275, patent WO2006063057A2 (fr); status: active, Application Filing
Also Published As
Publication number | Publication date |
---|---|
US20060123035A1 (en) | 2006-06-08 |
US7769728B2 (en) | 2010-08-03 |
WO2006063057A3 (fr) | 2007-04-26 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US7769728B2 (en) | Method and system for intra-row, inter-row compression and decompression of data items in a database using a page-based structure where allocating a page-buffer based on a stored value indicating the page size | |
US6725223B2 (en) | Storage format for encoded vector indexes | |
US7840774B2 (en) | Compressibility checking avoidance | |
US8255398B2 (en) | Compression of sorted value indexes using common prefixes | |
AU2009246432B2 (en) | Managing storage of individually accessible data units | |
US6349372B1 (en) | Virtual uncompressed cache for compressed main memory | |
US8538936B2 (en) | System and method for data compression using compression hardware | |
US11520743B2 (en) | Storing compression units in relational tables | |
US5761536A (en) | System and method for reducing memory fragmentation by assigning remainders to share memory blocks on a best fit basis | |
US7103608B1 (en) | Method and mechanism for storing and accessing data | |
EP1866776B1 (fr) | Method for detecting the presence of subblocks in a reduced-redundancy storage system | |
Zezula et al. | Dynamic partitioning of signature files | |
US5603022A (en) | Data compression system and method representing records as differences between sorted domain ordinals representing field values | |
US5678043A (en) | Data compression and encryption system and method representing records as differences between sorted domain ordinals that represent field values | |
US6654868B2 (en) | Information storage and retrieval system | |
US5999936A (en) | Method and apparatus for compressing and decompressing sequential records in a computer system | |
EP1265160A2 (fr) | Data structure | |
JP2001511563A (ja) | Structure for a database | |
CN101916228A (zh) | Flash translation layer with data compression function and implementation method | |
EP1934700A2 (fr) | Database heap management system with variable page format and fixed instruction-set address resolution | |
US11886401B2 (en) | Database key compression | |
US5815096A (en) | Method for compressing sequential data into compression symbols using double-indirect indexing into a dictionary data structure | |
US6965897B1 (en) | Data compression method and apparatus | |
CN1287316C (zh) | Method and system for compressing variable-length columns during index high key generation | |
Zobel et al. | Storage Management for Files of Dynamic Records. |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A2. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KN KP KR KZ LC LK LR LS LT LU LV LY MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A2. Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | EP: PCT application non-entry in European phase | Ref document number: 05853242. Country of ref document: EP. Kind code of ref document: A2 |