GB2493243A

GB2493243A - Determining hot data in a storage system using counting bloom filters

Info

Publication number: GB2493243A
Application number: GB1210250.5A
Authority: GB
Inventors: Xiao-Yu Hu; Ioannis Koltsidas; Roman Pletka; Robert Haas
Original assignee: International Business Machines Corp
Current assignee: International Business Machines Corp
Priority date: 2011-07-26
Filing date: 2012-06-11
Publication date: 2013-01-30
Anticipated expiration: 2032-06-11
Also published as: GB2493243B; DE102012212183A1; CN103150245A; GB201210250D0; DE102012212183B4; CN103150245B

Abstract

Determining a characteristic of a data entity based on a frequency of access to said data entity in a storage system using a counting bloom filter (CBF1) comprising a set (S1) of counters (C1); and a data structure having a set of elements each corresponding to a counter. To avoid counter overflow, the counting bloom filter is operated for an interval in time, wherein the set of counters is reset at the start of the interval. Each time said data entity is accessed during the interval, a value of at least one counter (C1) to which said data entity is mapped in the counting bloom filter is increased. At the end of the interval, the value of each element in the data structure is updated based on the current value of that element and the value of the counter to which it is assigned. The interval in time may be a predefined number of accesses. A plurality of counting bloom filters can be used. The method may produce a heat map which is used for selectively populating a cache with "hot" data or controlling data placement of "hot" data in the fastest storage tier of a tiered storage system.

Description

METHOD AND STORAGE CONTROLLER FOR DETERMINING AN ACCESS CHARACTERISTIC OF A DATA ENTITY

FIELD OF THE INVENTION

The present invention relates to methods and storage controllers for determining a characteristic of a data entity which characteristic is based on a frequency of access to said data entity in a storage system.

BACKGROUND

In the following, a characteristic of a data entity representing the frequency at which the data entity is accessed on a relative basis is also denoted as the temperature of such data entity.

Determining the temperature of a particular data entity, including in particular its logical address, is a long-standing challenge in storage systems. The temperature of a particular data entity refers to its frequency of references, which may include read or write accesses, relative to its peers in the same storage system. A collection of temperature information for the whole storage system is also referred to as a heat map. A data entity is often called "hot" if it is frequently accessed, or "cold" if it is infrequently accessed. The temperature may measure quantitatively how frequently and how recently a data entity is accessed.

A simple and straightforward way to determine the temperature of data entities is to use a counter for each data entity to keep track of the number of references. However, this may be memory inefficient for large-capacity storage systems. In order to shrink the memory footprint of the heat map, a popular solution is to use a counter for a group of contiguous data entities, that is, track the temperature of data at a coarser granularity.

BRIEF SUMMARY OF THE INVENTION

According to a first aspect of the invention, a method is provided for determining a characteristic of a data entity which characteristic is based on a frequency of access to said data entity in a storage system. A counting bloom filter is provided for being operated for an interval in time, which counting bloom filter comprises a set of counters. A data structure is provided, the data structure comprising a set of elements, wherein each element of the set of elements is assigned to a counter of the set of counters. The characteristic of said data entity is determined subject to a value of at least one element of the set of elements.

For each individual interval in time for which the counting bloom filter is operated:
- the counters of the set of counters are reset prior to or at a beginning of the individual interval in time,
- a value of at least one counter of a subset of counters, to which subset of counters said data entity is mapped in the counting bloom filter, is increased each time said data entity is accessed during the individual interval in time, and
- the value of each individual element of the set of elements is updated at or after an end of the individual interval in time, wherein the value of the individual element is updated subject to a value the counter assigned to the individual element holds at the end of the individual interval in time and subject to a present value of the individual element.

In embodiments, this method may comprise one or more of the following features:
- the counting bloom filter is operated multiple times for consecutive intervals in time;
- the value of the individual element is updated subject to a weighted value the counter assigned to the individual element holds at the end of the individual interval in time and subject to a weighted present value of the individual element;
- the value of the individual element is updated by the value the counter assigned to the individual element holds at the end of the individual interval in time, which value is weighted by a factor a, plus the present value of the individual element, which present value is weighted by a factor 1-a;
- the factor a has a value between 0.75 and 0.95;
- said data entity is mapped to the subset of counters by means of one or more hash functions;
- the subset of counters comprises multiple counters to which said data entity is mapped in the counting bloom filter, and wherein only the value of a single counter in the subset is increased, which single counter is the counter in the subset that presently shows a lowest value amongst the multiple counters in the subset;
- each element of the set of elements is assigned to a single counter of the set of counters, and wherein each counter of the set of counters is assigned to a single element of the set of elements;
- the subset of counters comprises multiple counters to which said data entity is mapped in the counting bloom filter, a subset of elements contains elements which are assigned to the counters of the subset of counters, and the characteristic of said data entity is determined subject to the value of one or more elements of the subset of elements;
- the characteristic of said data entity is determined subject to the value of the element that shows the lowest value amongst the multiple elements in the subset of elements;
- accessing said data entity includes at least one of reading said data entity and updating said data entity;
- said data entity represents data addressed by a single logical block address;
- subject to the determined characteristic of said data entity, said data entity is selected for being cached;
- subject to the determined characteristic of said data entity, said data entity is selected for being stored in a dedicated tier in a tiered storage system.

According to a second aspect of the present invention, a method is provided for determining a characteristic of a data entity which characteristic is based on a frequency of access to said data entity in a storage system. A first counting bloom filter is provided for being active for a first interval in time, which first counting bloom filter comprises a set of first counters. Each time said data entity is accessed during the first interval in time, a value of at least one first counter of a subset of first counters, to which subset of first counters said data entity is mapped in the first counting bloom filter, is increased. A second counting bloom filter is provided for being active for a second interval in time, which second counting bloom filter comprises a set of second counters. Each time the data entity is accessed during the second interval in time, a value of at least one second counter of a subset of second counters, to which subset of second counters said data entity is mapped in the second counting bloom filter, is increased. The characteristic of the data entity is determined subject to a value of at least one first counter of the subset of first counters at the end of the first interval in time and subject to a value of at least one second counter of the subset of second counters at the end of the second interval in time.

In embodiments, this method may comprise one or more of the following features:
- overall n counting bloom filters are provided, each of which n counting bloom filters being active for an associated interval in time, which associated intervals in time follow each other; each of the n counting bloom filters is operated according to the first or second counting bloom filter each time said data entity is accessed during the associated interval in time; and the characteristic of said data entity is determined subject to, for each of the n counting bloom filters, a value of at least one counter of a subset of counters associated with said data entity in the respective counting bloom filter at the end of the associated interval in time;
- the characteristic of said data entity is determined based on an average of the counter values selected from the n counting bloom filters;
- said data entity is mapped to the subset of first counters by means of one or more hash functions, and said data entity is mapped to the subset of second counters by means of the same one or more hash functions;
- the subset of first counters comprises multiple first counters to which said data entity is mapped in the first counting bloom filter; only the value of a single first counter in the subset is increased, which single first counter is the first counter in the subset that presently shows a lowest value amongst the multiple first counters in the subset; and the subset of second counters comprises multiple second counters to which said entity is mapped in the second counting bloom filter; only the value of a single second counter in the subset is increased, which single second counter is the second counter that presently shows a lowest value amongst the multiple second counters in the subset;
- the subset of first counters comprises multiple first counters to which said data entity is mapped in the first counting bloom filter, the subset of second counters comprises multiple second counters to which said entity is mapped in the second counting bloom filter; the characteristic of said data entity is determined subject to a value of a dedicated first counter of the subset of first counters, which dedicated first counter is the first counter that shows the lowest value amongst the multiple first counters in the subset at the end of the first interval in time, and subject to a value of a dedicated second counter of the subset of second counters, which dedicated second counter is the second counter that shows the lowest value amongst the multiple second counters in the subset at the end of the second interval in time;
- accessing said data entity includes at least one of reading said data entity and updating said data entity;
- said data entity represents data addressed by a single logical block address;
- subject to the determined characteristic of said data entity, said data entity is selected for being cached;
- subject to the determined characteristic of said data entity, said data entity is selected for being stored in a dedicated tier in a tiered storage system.

A further aspect of the invention refers to a computer program product comprising a computer readable medium having computer readable program code embodied therewith, the computer readable program code comprising computer readable program code configured to perform a method according to any one of the preceding aspects or embodiments.

A further aspect of the invention refers to a storage controller for determining a characteristic of a data entity which characteristic is based on a frequency of access to said data entity in a storage system, comprising a control unit adapted to execute a method according to any one of the preceding aspects or embodiments.

It is understood that method steps may be executed in a different order than listed in a method claim. Such different order shall also be included in the scope of such claim, as is the order of steps as presently listed.

Embodiments described in relation to the aspect of an apparatus shall also be considered as embodiments disclosed in connection with any of the other categories such as the method, the computer program product, etc.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention and its embodiments will be more fully appreciated by reference to the following detailed description of presently preferred but nonetheless illustrative embodiments in accordance with the present invention when taken in conjunction with the accompanying drawings.

The figures illustrate:
FIG. 1 a diagram of a timing sequence of counting bloom filters applied according to an embodiment of the present invention;
FIG. 2 a diagram of a first counting bloom filter applied according to an embodiment of the present invention;
FIG. 3 a diagram of a second counting bloom filter applied according to an embodiment of the present invention;
FIG. 4 a diagram of a tiered storage system;
FIG. 5 a flow chart of a method according to an embodiment of the present invention; and
FIG. 6 a flow chart of a method according to another embodiment of the present invention.

DETAILED DESCRIPTION OF THE EMBODIMENTS

As an introduction to the following description, general aspects of the invention are first pointed out, concerning methods and controllers for determining a characteristic of a data entity which characteristic is based on a frequency of access to the data entity. Such methods and storage controllers make use of one or more bloom filters specifically adapted to the present application, and specifically make use of one or more counting bloom filters.

A bloom filter may be regarded as a simple space-efficient randomized data structure for representing a set in order to support membership queries. Bloom filters achieve space savings at the cost of allowing false positives; the probability of a false positive, however, can be bounded to a sufficiently low value. Bloom filters were introduced by Burton Bloom in the 1970s, and since then have found widespread adoption in database applications as well as networking. A bloom filter may be regarded as a method for representing a set $S = \{s_1, s_2, \ldots, s_n\}$ of n elements from a universe U, by using a bit vector V of $m = O(n)$ bits. All the bits in the vector V are initially set to 0. The bloom filter may use k hash functions $h_1, h_2, \ldots, h_k$ for mapping elements from U to the range $\{1, 2, \ldots, m\}$. For each element s in S, the bits at positions $h_1(s), h_2(s), \ldots, h_k(s)$ in V are set to 1. To query for an element, i.e. test whether the element is in the set, the element preferably is fed to each one of the k hash functions to get k bit positions. If any of the bits at these positions is 0, the element is not in the set; if it were, then all the bits would have been set to 1 when it was inserted. If all the identified bit positions are 1, then either the element is in the set, or the bits have been set to 1 during the insertion of other elements; the latter case is called a false positive. The probability for an error due to a false positive depends on the selection of the parameters m, k. This probability is minimized for $k = \ln 2 \cdot (m/n)$. The bloom filter may be considered as highly effective even for $m = cn$ using a small constant c. For c=8, for example, the false positive error rate is slightly higher than 2%.
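The insert and query steps described above can be sketched in a few lines of Python. This is only an illustrative sketch: the class name, the use of salted BLAKE2b digests as the k hash functions, and the 0-based bit positions are assumptions made for the example and are not taken from the description. The helper `false_positive_rate` uses the standard approximation $(1 - e^{-kn/m})^k$.

```python
import hashlib
import math


class BloomFilter:
    """Bit-vector Bloom filter: m bits, k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = [0] * m  # all bits are initially 0

    def _positions(self, item: str) -> list[int]:
        # Assumption: the k hash functions are emulated by salting BLAKE2b
        # with the function index i. Positions are 0-based here.
        positions = []
        for i in range(self.k):
            digest = hashlib.blake2b(f"{i}:{item}".encode(), digest_size=8).digest()
            positions.append(int.from_bytes(digest, "big") % self.m)
        return positions

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item: str) -> bool:
        # False means "definitely absent"; True may be a false positive.
        return all(self.bits[p] for p in self._positions(item))


def false_positive_rate(m: int, n: int, k: int) -> float:
    """Standard approximation (1 - e^(-k*n/m))^k of the false positive rate."""
    return (1.0 - math.exp(-k * n / m)) ** k
```

For m = 8n, rounding $\ln 2 \cdot 8 \approx 5.5$ to k = 6 gives `false_positive_rate(8000, 1000, 6)` of roughly 0.0216, in line with the "slightly higher than 2%" figure quoted above.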

Inserting a new element into a bloom filter, i.e. inserting a new element into the set of elements, is accomplished by the following steps: hash the new element k times by means of the k hash functions and set the bits resulting from this hashing to 1. However, a deletion of an element from the set may not be achieved by reversing the process. If the element to be deleted is hashed and the corresponding bits are set to 0, a bit position may be set to 0 that is hashed to by some other element in the set. To avoid this problem, the idea of a counting bloom filter was developed in the art. In a counting bloom filter, each bit position in the bloom filter is not represented by a single bit but rather by a counter. When a new element is inserted into the set, the corresponding counters are incremented; when an element is deleted from the set, the corresponding counters are decremented. In order to avoid counter overflow, the counters are designed to be sufficiently large. For example, four bits per counter may suffice for most applications.
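A minimal Python sketch of such a counting bloom filter follows. The class name and the salted-hash scheme are assumptions for illustration. Clamping each counter at its maximum value (15 for four bits) is also an assumption: the text only states that counters should be large enough to avoid overflow, and does not say what happens when one fills up.

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter whose bit positions are replaced by small counters."""

    def __init__(self, m: int, k: int, bits_per_counter: int = 4) -> None:
        self.m = m
        self.k = k
        self.max_value = (1 << bits_per_counter) - 1  # 15 for 4-bit counters
        self.counters = [0] * m

    def _positions(self, item: str) -> set[int]:
        # Assumption: k hash functions emulated by salting BLAKE2b.
        # A set is returned so colliding hashes touch a counter only once.
        return {
            int.from_bytes(
                hashlib.blake2b(f"{i}:{item}".encode(), digest_size=8).digest(), "big"
            ) % self.m
            for i in range(self.k)
        }

    def insert(self, item: str) -> None:
        for p in self._positions(item):
            if self.counters[p] < self.max_value:  # assumed saturation policy
                self.counters[p] += 1

    def delete(self, item: str) -> None:
        for p in self._positions(item):
            if self.counters[p] > 0:
                self.counters[p] -= 1

    def __contains__(self, item: str) -> bool:
        return all(self.counters[p] > 0 for p in self._positions(item))
```

Because deletion only decrements, removing one element leaves a shared counter above zero when another element also hashes to it, so the remaining element is still reported as present.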

In present storage applications, counting bloom filters may not be suited to be used directly for generating heat maps because counting bloom filters are inherently short-term. As more and more data entities are requested, i.e. added to the storage system, their corresponding counters are incremented, which counters are of limited size and may eventually overflow.

Hence, memory-and computation-efficient methods are proposed to estimate a characteristic related to the frequency of access of any data entity in a storage system.

According to the first aspect of a method, preferably a single counting bloom filter comprising a set of counters is repetitively applied to a sequence of individual intervals in time to capture the frequency of access of a, and preferably any, data entity in each of these intervals in time. Preferably, the sequence of individual intervals in time forms a continuous period in time. Hence, for a specific data entity, each time said data entity is accessed during the individual interval in time, a value of at least one counter is increased, which counter is part of a subset of counters to which subset of counters said data entity is mapped, preferably by means of one or more hash functions.

In addition, a data structure is provided which data structure comprises a set of elements.

Preferably, each element out of the set of elements is associated with a dedicated, single counter out of the set of counters and, preferably, each counter out of the set of counters is assigned only to a dedicated, single element.

When the method is started, the counting bloom filter is started for the first time to operate for a first interval in time. At such point in time or prior to that, all the counters of the set of counters and all the elements of the set of elements are preferably reset, i.e. in a specific embodiment are set to value zero. Accordingly, zero values represent the present values of elements and counters at the beginning of the first interval in time. However, during the first interval in time, the counters may be increased subject to data entities accessed, such that at the end of the first interval in time the counter values may represent indicators of how often various data entities were accessed during this first interval in time. In contrast, element values typically do not change during an interval in time.

At, after, or in response to the end of the first interval in time, one or more, and preferably all, of the present values of elements in the set of elements are updated. Such update includes for an individual element to have a new value assigned, wherein the new value is dependent on a value of the counter assigned to the individual element and on the present value of the individual element. For determining the characteristic of a specific data entity at any point in time, the set of elements preferably is queried. Hash functions assigned to the subject data entity are applied and result in a specific subset of counters and/or a specific subset of elements respectively.

The characteristic may be derived from the values of the elements of the subset of elements at the given point in time.

In this embodiment, the counting bloom filter may be applied to relatively short intervals in time wherein the counting bloom filter may not run the risk of being blocked by a counter overflow. The data structure including the set of elements is used for determining an average of counter values over multiple intervals in time. Hence, in a preferred embodiment, the present value of an element of the data structure represents an average of previous counter values at the end of intervals in time of the associated counter. Upon expiration of another interval in time, the counter value reached at the end of such interval in time preferably is set into relation to the present average value. In a preferred embodiment, this is achieved by weighting the present element value by a factor close to 1, and by weighting the new counter value by a factor close to zero, and by adding both weighted values. By such means, only a single counting bloom filter is needed together with a data structure holding long-term averaged counter values.

In this respect, the data structure may also be interpreted as a "long-term counting bloom filter" since it holds element values representing the time average of associated counter values of the counting bloom filter, which counters of the counting bloom filter are limited in size. Once the long-term counting bloom filter is updated, the short-term counting bloom filter preferably is reset by initializing all counters of the set to zero, and a subsequent interval of time starts. The characteristic of a data entity may preferably be determined by reading out a minimal element value among those elements indexed by hash values of an LBA of said data entity.

According to the second aspect of a method, a first counting bloom filter is applied only for a limited interval in time before another counting bloom filter is applied to address a subsequent interval in time. Out of the two or more bloom filters, each reflecting access patterns to corresponding data entities during the associated intervals, an averaging routine may preferably average the counting bloom filter results achieved at the end of each interval, i.e. average counter values representing such a multitude of counting bloom filter results over time. It is further noted that the results of the counting bloom filters are averaged by selecting, from each counting bloom filter, the counter values that correspond to the data entity whose access frequency is to be determined, which counter values may preferably be averaged.

With respect to all aspects, it is noted that an increase of a counter or counter value may also include any other modification of the counter or counter value that may allow for an estimate of the number/frequency of accesses to the corresponding data entity.

The counting bloom filter in the first aspect may preferably use for each interval in time the same set of k independent hash functions for populating the counters that are determined as a result of hashed data entities. The counting bloom filters in the second aspect, and specifically the first counting bloom filter and the second counting bloom filter, may use the same set of k independent hash functions for populating the counters that are determined as a result of hashed data entities.

Preferably, a counting bloom filter is maintained over a span of requests which number of requests defines the interval in time the counting bloom filter is active.

The long-term counting bloom filter in both aspects is preferably represented by a smoothed or exponentially moving average of a number of or all past short-term counting bloom filters, which long-term counting bloom filter may be used as a heat map. The temperature of a particular data entity is obtained by querying the long-term counting bloom filter. In this respect, a temperature of a data entity again denotes its frequency of references, which may include read or write accesses, relative to its peers in the same storage system, which temperature may be one of the characteristics of interest to be determined in a storage system.

In particular, the entire temperature information for an entire storage system may also be referred to as a heat map. A data entity is often called "hot" if it is frequently accessed, or "cold" if it is infrequently accessed or updated. The temperature measures quantitatively how frequently and how recently a data entity is accessed. However, a characteristic based on access frequency to/of a data entity may in another embodiment refer to its absolute access frequency/numbers.

A sample data entity may preferably be a data chunk that is addressed by a logical block address (LBA).

In Figure 1, a timing sequence of counting bloom filters CBF1 to CBF4 applied according to an embodiment of the present invention is illustrated. A first counting bloom filter CBF1 is applied during a first interval in time t1 - t0, a second counting bloom filter CBF2 is applied during a second interval in time t2 - t1, a third counting bloom filter CBF3 is applied during a third interval in time t3 - t2, and a fourth counting bloom filter CBF4 is applied during a fourth interval in time t4 - t3. Overall, n counting bloom filters CBF may be applied, each of which is active during an associated time interval. According to the first aspect of the present invention, all counting bloom filters CBF1 to CBF4 may physically be represented by a single counting bloom filter CBF being reused and restarted right at the end of each interval in time, which restart may preferably include a prior reset of its counters.

Preferably, the time intervals do not overlap and a subsequent time interval follows the preceding time interval without a gap in between. Each time interval may be of defined limited length, which defined length, for example, may be represented by a pre-defined number of accesses during such interval in time. As a result, the various time intervals may not necessarily be of equal length. The pre-defined number of accesses may be chosen to be the largest possible before a majority of the counters C of the corresponding counting bloom filter CBF have overflowed. Furthermore, even the number of accesses for individual time intervals may be unequal.

Hence, a multiple use of a single counting bloom filter or, alternatively, a single use of multiple counting bloom filters is considered, each being active during a specific time interval, as shown in Fig. 1. For the latter aspect, at the beginning of each time interval, a new counting bloom filter CBFx is initialized with all counters of such counting bloom filter CBF being set to zero. For the first aspect, the single counting bloom filter is initialized with all counters of such counting bloom filter CBF being set to zero at the beginning of each new interval in time.

A first counting bloom filter CBF1 is depicted in Figure 2. A number m of first counters $C^1_0$ to $C^1_{m-1}$ build a set $S^1$ of first counters assigned to the first counting bloom filter CBF1. An input value, which in the present case may be a logical block address LBA representing a data entity, is mapped by preferably multiple hash functions h1(LBA), h2(LBA), ..., hk(LBA), with k=2 in the present example, to k first counters $C^1$ out of the set $S^1$ of m first counters $C^1$.

This means that two different hash functions are applied to each LBA in the present case once such LBA is accessed by a host, the storage system itself or any other entity. In the present example, the LBA of value 1 is hashed to first counters $C^1_0$ and $C^1_1$. The LBA of value 4 is hashed to first counters $C^1_1$ and $C^1_4$. The LBA of value 5 is hashed to first counters $C^1_3$ and $C^1_5$. Hence, a subset of two first counters $C^1$ out of the set $S^1$ of first counters $C^1$ is assigned to each data entity represented by an LBA. With each access of an LBA, the corresponding first counters $C^1$ of its subset are incremented. If k hash functions are applied for building the first counting bloom filter CBF1, i.e. for mapping each data entity to k first counters $C^1$, a subset of first counters $C^1$ typically consists of k first counters $C^1$. In another embodiment, only a single first counter $C^1$ out of the subset of first counters $C^1$ is incremented for each access to the corresponding data entity, which preferably is the first counter $C^1$ out of the subset of first counters $C^1$ with the lowest value. The rationale for such embodiment is to accommodate more accesses in this short-term CBF without overflowing its counters and to increase the accuracy of the frequency estimation.
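The two increment policies can be compared with a short Python sketch. The table `SUBSETS` reuses the LBA-to-counter assignment of the Figure 2 example (LBA 1 to counters 0 and 1, LBA 4 to counters 1 and 4, LBA 5 to counters 3 and 5) in place of real hash functions, which the text does not specify. The function names, the choice m = 8, and the tie-breaking rule (first counter wins) are assumptions for illustration.

```python
# Subsets taken from the Figure 2 example (k = 2); the table stands in
# for the hash functions h1 and h2, which the text does not specify.
SUBSETS = {1: (0, 1), 4: (1, 4), 5: (3, 5)}


def record_access(counters: list[int], lba: int, min_only: bool = False) -> None:
    """Update the short-term counters for one access to `lba`."""
    subset = SUBSETS[lba]
    if min_only:
        # Alternative embodiment: increment only the counter that currently
        # holds the lowest value (min() picks the first one on ties).
        lowest = min(subset, key=lambda i: counters[i])
        counters[lowest] += 1
    else:
        # Default: increment every counter of the subset.
        for i in subset:
            counters[i] += 1


def run(accesses: list[int], m: int = 8, min_only: bool = False) -> list[int]:
    """Replay a sequence of LBA accesses against m zeroed counters."""
    counters = [0] * m
    for lba in accesses:
        record_access(counters, lba, min_only)
    return counters


print(run([1, 1, 4]))                 # [2, 3, 0, 0, 1, 0, 0, 0]
print(run([1, 1, 4], min_only=True))  # [1, 1, 0, 0, 1, 0, 0, 0]
```

With the minimum-only policy each access adds one to the total across all counters rather than k, which is why more accesses fit into the interval before any counter overflows.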

The first counting bloom filter according to Figure 2 may be used in the single counting bloom filter application repetitively.

A second counting bloom filter CBF2, as may be used in the multiple counting bloom filter application, is depicted in Figure 3. Basically, the second counting bloom filter is identical to the first counting bloom filter CBF1 in its structure. A set $S^2$ of m second counters $C^2$ contains second counters $C^2_0$ to $C^2_{m-1}$, which are assigned to the second counting bloom filter CBF2.

The input value, which again is a logical block address LBA accessed during a second interval in time during which interval the second counting bloom filter CBF2 is active, is mapped by the same k hash functions as used in the first counting bloom filter CBF1, i.e. hash functions h1(LBA), h2(LBA), ..., hk(LBA), with k=2, to second counters $C^2$ out of the set $S^2$ of m second counters $C^2$. Two different hash functions are applied to each LBA in the present case once such LBA is accessed by a host, the storage system itself or any other entity. In the present example, the LBA of value 1 is hashed to second counters $C^2_0$ and $C^2_1$. The LBA of value 4 is hashed to second counters $C^2_1$ and $C^2_4$. The LBA of value 5 is hashed to second counters $C^2_3$ and $C^2_5$. Hence, a subset of two second counters $C^2$ out of the set $S^2$ of m second counters $C^2$ is assigned to each data entity represented by an LBA. With each access of an LBA, the corresponding second counters $C^2$ of its subset are incremented. If k hash functions are applied for building the second counting bloom filter CBF2, i.e. for mapping each data entity to k second counters $C^2$, a subset of second counters $C^2$ typically consists of k second counters $C^2$. In another embodiment, only a single second counter $C^2$ out of the subset of second counters $C^2$ is incremented for each access to the corresponding data entity, which preferably is the second counter $C^2$ out of the subset of second counters $C^2$ with the lowest value. The rationale for such embodiment is to accommodate more accesses in this short-term CBF without overflowing its counters and to increase the accuracy of the frequency estimation.

In the present example, the second interval in time in which the second counting bloom filter is applied is defined by allowing the same given number of data entity accesses as is used for defining the length of the first interval in time.

In the same way, n short-term counting bloom filters CBF may be applied for covering a large time interval according to Figure 1. Preferably, the counter values of each counter of a counting bloom filter CBFj at the end of its associated time interval are stored. Let $C_i^j$ be the value of the i-th counter in the j-th counting bloom filter CBFj; then the value of the i-th counter $C_i$ of the long-term counting bloom filter CBF can be obtained by averaging $C_i^j$ over all short-term counting bloom filters CBF1 to CBFn, namely,

$$C_i = \frac{1}{n} \sum_{j=1}^{n} C_i^j$$

The resulting counter value $C_i$ may then be used as a temperature of a related data entity. By determining all counter values $C_0$ to $C_{m-1}$, a heat map of the underlying storage system can be achieved. Such counter value $C_i$ may also be denoted more generally as an element of a data structure, which data structure supports the averaging of the individual counter values. The temperature of a specific data entity can be determined by hashing its LBA k times, resulting in a subset of k counter values out of $C_0$ to $C_{m-1}$, the set of which is also denoted as long-term counting bloom filter, and taking the minimum value out of the subset of k counter values as the estimated temperature of the corresponding data entity.
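As a rough illustration of this averaging, the Python sketch below takes the stored counter values of n short-term filters, averages them per counter position, and reads a temperature as the minimum over an entity's k positions. The function names and the sample snapshots are hypothetical. The counter subsets (LBA 1 to positions 0 and 1, LBA 4 to 1 and 4, LBA 5 to 3 and 5) follow the figure example.

```python
def long_term_counters(snapshots: list[list[int]]) -> list[float]:
    """Average counter i over the n stored short-term filters."""
    n = len(snapshots)
    return [sum(column) / n for column in zip(*snapshots)]


def temperature(long_term: list[float], subset: tuple[int, ...]) -> float:
    """Estimated temperature: minimum over the entity's k long-term values."""
    return min(long_term[i] for i in subset)


# Hypothetical end-of-interval counter values for n = 3 intervals, m = 6.
snapshots = [
    [2, 3, 0, 0, 1, 0],  # interval 1: LBA 1 twice, LBA 4 once
    [4, 5, 0, 2, 1, 2],  # interval 2: LBA 1 four times, LBA 4 once, LBA 5 twice
    [0, 1, 0, 1, 1, 1],  # interval 3: LBA 4 once, LBA 5 once
]
heat = long_term_counters(snapshots)
print(heat)                       # [2.0, 3.0, 0.0, 1.0, 1.0, 1.0]
print(temperature(heat, (0, 1)))  # LBA 1 -> 2.0
print(temperature(heat, (1, 4)))  # LBA 4 -> 1.0
```

Taking the minimum limits the influence of collisions: position 1 is shared by LBA 1 and LBA 4 and therefore overstates both, while positions 0 and 4 are not shared in this example.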

In another preferred embodiment of implementing a long-term counting bloom filter, a smoothed or exponential moving average of all past short-term counting bloom filter values is used. As a result, only a single short-term counting bloom filter CBF needs to be tracked. The single short-term counting bloom filter CBF is reused for each new epoch, i.e. each new interval in time, and is initialized to zero at the beginning of that epoch, i.e. its counters C are set to zero. Again, $C_i^j$ denotes the value of the i-th counter of the set of counters reached at the end of the most recent interval in time j, which interval in time j may just have been terminated. Note that counter values of earlier intervals in time are no longer accessible since only a single counting bloom filter is used. The updated value of the i-th element $C_i$ of the set of elements can be obtained by weighting the assigned counter value $C_i^j$ and by adding the weighted present value of the i-th element $C_i$, for example by using one of the following rules:

$$C_i = \alpha\, C_i^j + (1-\alpha)\, C_i$$

$$C_i = \frac{C_i^j + (j-1)\, C_i}{j}$$

where $\alpha$ is a weighting factor, typically set between 0.75 and 0.95, and j is the index of the most recent interval in time. This operation preferably is performed for all elements $C_i$ out of the set of elements $C_0$ to $C_{m-1}$, resulting in m element values. Once the current short-term counting bloom filter CBF is merged into the associated set of elements, all its counters are reset to zero. Hence, only a single counting bloom filter may be used for covering data entity accesses for the current interval in time. Upon expiry of the interval in time, the associated data structure is updated by applying the counter values to the assigned element values. Then, the counting bloom filter is reset by initializing all its counters to zero and a new interval in time is started for which the counting bloom filter is operated anew.

In this way, only a single short-term counting filter and a data structure are needed and thus the RAM requirement is drastically reduced.
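The end-of-interval merge may be sketched as follows, purely by way of example. The first function assumes the exponentially weighted rule in which the counter value is weighted by a factor alpha and the present element value by 1-alpha; the second assumes that the alternative rule is a running mean over j intervals. Function names and sample values are assumptions of this sketch.

```python
def merge_weighted(elements, counters, alpha=0.75):
    """New element = alpha * counter value + (1 - alpha) * present element value."""
    return [alpha * c + (1 - alpha) * e for e, c in zip(elements, counters)]


def merge_running_mean(elements, counters, j):
    """Assumed alternative: running mean after the j-th interval (j >= 1)."""
    return [(c + (j - 1) * e) / j for e, c in zip(elements, counters)]


elements = [0.0, 4.0]   # present element values (made-up)
counters = [8, 0]       # counter values at the end of the interval (made-up)

print(merge_weighted(elements, counters, alpha=0.75))  # [6.0, 1.0]
print(merge_running_mean(elements, counters, j=2))     # [4.0, 2.0]

# After the merge, the single short-term filter is reset for the next interval.
counters = [0] * len(counters)
```

With alpha close to 1 the element values follow the most recent interval closely, whereas the running mean gives every past interval the same weight.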

The advantages of the periodically-updated data structure are twofold. Firstly, it requires a main memory size of only two stored counting bloom filters CBF, thus drastically reducing the memory requirement. Secondly, the proposed long-term counting bloom filter CBF can adapt to the changing dynamics of workloads thanks to the use of an exponential moving average.

An accurate estimation of the temperature of a given data entity can help improve the performance and/or cost efficiency of storage systems. This information can be incorporated into one or more of a cache, a tiered storage system, or a Flash memory based device. For example, "hot" data, once identified, can be inserted into a cache to improve a cache hit rate and thus performance. A hierarchical, i.e. a tiered, storage system comprises at least two storage media: one is typically expensive but fast, while the other is typically inexpensive but slower. "Hot" data, once identified, can be stored on the expensive but fast storage medium in a first tier of the tiered storage system while "cold" data can be stored on the larger-capacity, inexpensive but slower storage medium in a second tier of the tiered storage system, aiming at a high performance at a lower cost. When a flash memory device is used as the storage medium, data of similar updating frequency may preferably be stored in the same flash erase unit in order to minimize write amplification.

The present idea may be applicable to any system that may benefit from tracking the value of a metric/characteristic for a very large population of data entities over a long time, while using a very small amount of memory.

In a preferred embodiment of the present invention, the present method may be applied for selectively populating a cache, and may preferably also be applied for deciding on block evictions from the cache. A cache typically is a portion of memory space that holds data entities frequently accessed in order to reduce access latency by avoiding multiple accesses to the underlying storage medium. A cache may be implemented as a read cache, a write cache, or a combined read and write cache.

Especially when a cache is implemented on flash memory, filtering the data entities that populate the cache is crucial: populating the cache with "cold" data entities not only pollutes the cache and may force potentially "hot" data entities out of the cache, but also may result in a large number of flash writes, typically random ones. The latter results in a much lower cache performance, as it severely decreases the throughput of the cache and increases the latency of other read and write requests executed in parallel. Moreover, a high rate of writes to the flash cache results in the flash chips wearing out sooner and, therefore, in a shorter lifetime of the device.

The present method can be used to efficiently maintain a cache by using a long-term counting bloom filter CBF, i.e. an averaging means over counter values stemming from counting bloom filters applied to limited periods in time. A corresponding storage controller may maintain a long-term CBF over the whole storage system address space at block granularity, i.e. with a data entity representing a data block, so that the temperature of all blocks in the system is tracked. On each and every access to a block, the system updates its temperature in the short-term CBF. At the same time, the storage controller may preferably keep track of the lowest temperature found among the data blocks in the cache.

In response to a request for accessing a data block, if such data block is found in the cache, it is served from the cache. Assuming that a data block is requested for access and is not found in the cache, the system reads the block from the underlying storage medium, which may in one embodiment be an HDD array. Subsequently, the storage controller uses the current short-term counting bloom filter CBF and the long-term counting bloom filter CBF to get a measure of the temperature of the block. If that temperature is higher than the minimum temperature in the cache, the block is admitted to the cache, i.e. a copy of the block is written to the cache, and specifically to flash memory in case the cache is embodied as a flash cache. Otherwise the block is served to the user, but is not stored in the cache.
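A minimal sketch of this admission decision is given below for illustration. The description does not state how the short-term and the long-term filter are combined into one temperature, so the sketch takes a hypothetical `temperature_of` callable; the dictionary-based cache, the `read_from_backend` callable and the rule of admitting into an empty cache are likewise assumptions. Capacity handling is left out here.

```python
def on_read(lba, cache, temperature_of, read_from_backend):
    """Serve a read; admit the block only if it is hotter than the coldest cached block."""
    if lba in cache:
        return cache[lba], "hit"

    data = read_from_backend(lba)
    # Lowest temperature currently in the cache (recomputed here for brevity;
    # a controller would rather keep track of it incrementally).
    min_temp = min((temperature_of(b) for b in cache), default=float("-inf"))
    if temperature_of(lba) > min_temp:
        cache[lba] = data          # a copy of the block is written to the cache
        return data, "admitted"
    return data, "bypassed"        # served to the user, but not cached


temps = {1: 5.0, 2: 1.0, 3: 3.0, 4: 0.5}   # made-up temperatures per LBA
cache = {1: b"one", 2: b"two"}
backend = lambda lba: b"block-%d" % lba

print(on_read(3, cache, temps.get, backend)[1])  # admitted (3.0 > 1.0)
print(on_read(4, cache, temps.get, backend)[1])  # bypassed (0.5 <= 1.0)
print(on_read(1, cache, temps.get, backend)[1])  # hit
```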

When admitting a block to the cache, it may be the case that the cache is full, that is, a cached block needs to be evicted before the new block can be written into the cache. Then, the system may or may not use the counting bloom filter CBF to select a block to be removed from the cache. In the former case, the block with the lowest temperature in the cache is selected for removal. In the latter case, the system may use any other existing page replacement policy to select a block for removal. That policy can be based on one or more of recency of accesses, frequency of accesses or any other arbitrary criterion the designer finds suitable. An advantage of this approach is that the internals of the cache need not be modified.
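The two eviction alternatives may be illustrated as follows. The function signature, the `capacity` parameter and the first-in-first-out fallback are hypothetical choices for this sketch; any other replacement policy could be passed in instead.

```python
def admit_with_eviction(lba, data, cache, capacity, temperature_of, fallback_victim=None):
    """Insert a block, evicting one block first if the cache is full.

    Returns the evicted LBA, or None if nothing had to be evicted.
    """
    victim = None
    if len(cache) >= capacity:
        if fallback_victim is None:
            # Former case: remove the block with the lowest temperature.
            victim = min(cache, key=temperature_of)
        else:
            # Latter case: defer to any other replacement policy.
            victim = fallback_victim(cache)
        del cache[victim]
    cache[lba] = data
    return victim


temps = {1: 5.0, 2: 1.0, 3: 3.0}           # made-up temperatures per LBA
cache = {1: b"one", 2: b"two"}
print(admit_with_eviction(3, b"three", cache, 2, temps.get))  # 2, the coldest block

fifo = lambda c: next(iter(c))              # assumed FIFO policy: oldest inserted key
cache = {1: b"one", 2: b"two"}
print(admit_with_eviction(3, b"three", cache, 2, temps.get, fifo))  # 1
```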

In another embodiment, a storage system may comprise a storage controller and tiered storage media. Such a system is also denoted as a tiered storage system. Storage systems comprising multiple tiers of persistent storage with respect to performance and capacity can also benefit from the present approach. In a typical tiered storage system, there is an ordering of storage media according to performance characteristics. Naturally, the more high-performing a storage medium is, the more expensive it is per unit of storage and, consequently, the less its capacity is expected to be. Such a system is shown in the diagram of Figure 4. In this example, the system includes four tiers T0-T3, with a tape storage medium being the slowest medium with the most capacity in the lowest tier T0, while a flash storage medium is the fastest storage medium with the least capacity amongst the present storage media, residing in the premium tier T3. In between the two extremes there are two tiers T2 and T1 comprising magnetic disks; the second highest tier T2 comprises SAS disks configured in RAID 5, for example, while the second lowest tier T1 comprises SATA disks, configured in a RAID 6 array, for example. Traversing the hierarchy from tier T0 to tier T3, performance improves both in terms of latency and throughput, while the capacity shrinks.

Typically, in tiered storage systems the total capacity of the storage system is equal to the aggregate capacity of the individual tiers. This effectively means that all tiers are utilized by the system as persistent storage and no block is found in more than one of the tiers at any given time. Specifically, none of the tiers is used as a cache in the hierarchy. Of course, any entity of data can migrate from one tier to some other tier. To achieve a maximum performance, the storage controller of such a tiered storage system aims to store data blocks with the hottest temperature on the fastest tiers, while data blocks with the coldest temperature are pushed down to the less premium tiers.

In such a tiered storage system, the present approach of determining temperatures of data entities can be applied by determining the temperature over the whole storage system address space at data block granularity, i.e., the temperature of all data blocks in the storage system is tracked by means of counting bloom filters. On each and every access to a data block, the system updates its temperature in the short-term counting bloom filter CBF. At the same time, the system may keep track of the highest and the lowest temperature(s) found in each tier of the system.

On each access to a block currently stored on tier j, the system may use the current short-term counting bloom filter CBF and the long-term counting bloom filter CBF to get a measure of the temperature of the block. If that temperature is higher than the lowest temperature in tier j+1, then a migration is triggered for that block from tier j to tier j+1. At the same time, the block with the lowest temperature from tier j+1 is demoted to tier j, assuming that tier j+1 is full, i.e., all its blocks have been allocated. Note that as an alternative the block can be moved to any tier j' > j+1, if its temperature is found to be higher than the lowest temperature of tier j'. A block is demoted to a lower tier preferably when it is replaced by some other block, i.e., it is found to be the coldest block in its current tier. Initially, when a new block is allocated, it is placed in the highest tier that is not full yet.
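The promotion and demotion rule between adjacent tiers may be sketched as follows, without limitation. Tiers are modelled as a list of sets with index 0 being the lowest tier; this representation, the `capacity` list, the `temperature_of` callable and the choice to promote unconditionally into an empty upper tier are assumptions of the sketch.

```python
def on_access(lba, tiers, capacity, temperature_of):
    """Promote a block by one tier if it is hotter than the coldest block above.

    Returns the index of the tier that holds the block afterwards.
    """
    j = next(t for t, blocks in enumerate(tiers) if lba in blocks)
    if j + 1 >= len(tiers):
        return j                                  # already in the highest tier

    upper = tiers[j + 1]
    coldest = min(upper, key=temperature_of, default=None)
    if coldest is not None and temperature_of(lba) <= temperature_of(coldest):
        return j                                  # not hot enough, no migration

    tiers[j].remove(lba)
    if coldest is not None and len(upper) >= capacity[j + 1]:
        # Upper tier is full: its coldest block is demoted to tier j.
        upper.remove(coldest)
        tiers[j].add(coldest)
    upper.add(lba)
    return j + 1


temps = {10: 5.0, 11: 0.1, 20: 1.0, 21: 9.0}      # made-up temperatures per LBA
tiers = [{10, 11}, {20, 21}]                       # tier 0 (slow), tier 1 (fast)
capacity = [4, 2]

print(on_access(10, tiers, capacity, temps.get))   # 1: block 10 swaps places with block 20
print(on_access(11, tiers, capacity, temps.get))   # 0: block 11 stays in tier 0
```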

Figure 5 illustrates a flow chart representing a method according to an embodiment of the present invention. In step S0, the method is started by setting a counting bloom filter index i to 1. In step S1, a first counting bloom filter, according to the index i=1, is initiated by setting all counters of the first counting bloom filter to 0. In step S2, a new request is received for accessing a data entity of the present storage medium. In step S3, it is verified whether a first interval in time associated with the first counting bloom filter has expired. If the first interval has not expired (N), the data entity or its identifier respectively, such as the LBA, is fed into the first counting bloom filter, and the subset of corresponding counters identified by hashing the present LBA by means of k hash functions is incremented in step S4. In step S5, the request for access may be served, and optionally, in step S6 the counters of the subset are analyzed in comparison with a lowest temperature value of a data entity in a cache of the storage system.

Then, the storage system continues with step S2 and waits for/receives a new request for data access.

If the first interval is expired/terminated (Y) in step S3, the counter values of the first counting bloom filter are stored in step S7, and in step S8 new average counter values are determined from all previous counter values. In the next step S9, the counting bloom filter index is incremented, and in step S1 a next counting bloom filter, i.e. the second counting bloom filter according to the index, is initialized.

Figure 6 illustrates a flow chart representing another method according to an embodiment of the present invention. In step S0, the method is started and x elements of a data structure corresponding to x counters of a counting bloom filter are set to zero. In step S1, the counting bloom filter is initiated by setting all x counters of the counting bloom filter to zero, and the counting bloom filter is started to operate for a defined interval in time, which interval in time is started in step S1. In step S2, a new request is received for accessing a data entity of the present storage medium. In step S3, it is verified whether the interval in time the counting bloom filter is expected to operate has already expired. If the interval in time has not expired (N), the data entity, or its respective identifier such as the LBA, is fed into the counting bloom filter, and counters of a subset of counters identified by hashing the present LBA by means of k hash functions are incremented in step S4. In step S5, the request for access may be served, and optionally, in step S6 the counters of the subset are analyzed in comparison with a lowest temperature value of a data entity in a cache of the storage system. Then, the storage system continues with step S2 and waits for/receives a new request for data access.

If the interval in time is expired/terminated (Y) in step S3, which may be determined, for example, by having reached a defined number of data entity accesses, in step S7 new values of the elements of the data structure are determined based on the present counter values of the counting bloom filter and based on the present element values. Preferably, a new value is determined for each element in the data structure given that each element corresponds to a counter of the counting bloom filter. The new element values are stored in step S8. In the following step S1, the counter values are reset, and a new interval in time is started. The pending request for access may have been temporarily stored and may be executed during the new interval in time.
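The flow of Figure 6 may be sketched, by way of example only, as a small tracker that operates a single counting bloom filter per interval and merges it into the element values when the interval expires. The class name, the salted BLAKE2b hash family, the access-count interval of four accesses and the weighted update with alpha=0.75 are assumptions of this sketch; the step labels in the comments refer to the steps described above.

```python
import hashlib


class HeatTracker:
    def __init__(self, m, k=2, accesses_per_interval=4, alpha=0.75):
        self.m, self.k = m, k
        self.limit = accesses_per_interval
        self.alpha = alpha
        self.elements = [0.0] * m          # S0: element values set to zero
        self._start_interval()             # S1

    def _start_interval(self):
        self.counters = [0] * self.m       # S1: counters reset, new interval starts
        self.seen = 0

    def _indices(self, lba):
        # Assumed hash family; duplicates are removed so a counter is hit once per access.
        raw = (
            hashlib.blake2b(lba.to_bytes(8, "little"), digest_size=8, salt=bytes([h])).digest()
            for h in range(self.k)
        )
        return sorted({int.from_bytes(d, "little") % self.m for d in raw})

    def access(self, lba):                 # S2: new request received
        if self.seen >= self.limit:        # S3: interval expired?
            # S7/S8: determine and store the new element values.
            self.elements = [
                self.alpha * c + (1 - self.alpha) * e
                for c, e in zip(self.counters, self.elements)
            ]
            self._start_interval()         # back to S1; pending request handled below
        for i in self._indices(lba):       # S4: increment the subset of counters
            self.counters[i] += 1
        self.seen += 1

    def temperature(self, lba):
        return min(self.elements[i] for i in self._indices(lba))


tracker = HeatTracker(m=64)
for _ in range(5):                         # the fifth access triggers the merge
    tracker.access(7)
print(tracker.temperature(7))              # 3.0, i.e. 0.75 * 4 accesses
```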

As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention, in particular in form of the controller, may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a "circuit," "module" or "system." Furthermore, aspects of the present invention, such as the methods, may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.

Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).

Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions.

These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Claims

<claim-text>1. A computer-implemented method for determining a characteristic of a data entity which characteristic is based on a frequency of access to said data entity in a storage system, comprising providing a counting bloom filter (CBF1) for being operated for an interval in time which counting bloom filter (CBF1) comprises a set (S1) of counters (C1), providing a data structure comprising a set of elements wherein each element of the set of elements is assigned to a counter of the set of counters, determining the characteristic of said data entity subject to a value of at least one element of the set of elements, wherein for each individual interval in time the counting bloom filter is operated -the counters of the set of counters are reset prior to or at a beginning of the individual interval in time, -a value of at least one counter (C1) of a subset of counters (C1) to which subset of counters (C1) said data entity is mapped in the counting bloom filter (CBF1) is increased each time said data entity is accessed during the individual interval in time, -the value of each individual element of the set of elements is updated at or after an end of the individual interval in time, wherein the value of the individual element is updated subject to a value the counter assigned to the individual element holds at the end of the individual interval in time and subject to a present value of the individual element.</claim-text> <claim-text>2. The method according to claim 1, wherein the counting bloom filter is operated multiple times for consecutive intervals in time.</claim-text> <claim-text>3. The method according to claim 1 or claim 2, wherein the value of the individual element is updated subject to a weighted value the counter assigned to the individual element holds at the end of the individual interval in time and subject to a weighted present value of the individual element.</claim-text> <claim-text>4.
The method according to claim 3, wherein the value of the individual element is updated by the value the counter assigned to the individual element holds at the end of the individual interval in time which value is weighted by a factor a, plus the present value of the individual element which present value is weighted by a factor 1-a.</claim-text> <claim-text>5. The method according to claim 4, wherein the factor a has a value between 0.75 and 0.95.</claim-text> <claim-text>6. The method according to any one of the preceding claims, wherein said data entity is mapped to the subset of counters (C1) by means of one or more hash functions (h).</claim-text> <claim-text>7. The method according to any one of the preceding claims, wherein the subset of counters (C1) comprises multiple counters (C1) to which said data entity is mapped in the counting bloom filter (CBF1), and wherein only the value of a single counter (C1) in the subset is increased, which single counter (C1) is the counter (C1) in the subset that presently shows a lowest value amongst the multiple counters (C1) in the subset.</claim-text> <claim-text>8. The method according to any one of the preceding claims, wherein each element of the set of elements is assigned to a single counter of the set of counters, and wherein each counter of the set of counters is assigned to a single element of the set of elements.</claim-text> <claim-text>9. The method according to any one of the preceding claims, wherein the subset of counters (C1) comprises multiple counters (C1) to which said data entity is mapped in the counting bloom filter (CBF1), wherein a subset of elements contains elements which are assigned to the counters of the subset of counters, and wherein the characteristic of said data entity is determined subject to the value of one or more elements of the subset of elements.</claim-text> <claim-text>10.
The method according to claim 9, wherein the characteristic of said data entity is determined subject to the value of the element that shows the lowest value amongst the multiple elements in the subset of elements.</claim-text> <claim-text>11. A computer-implemented method for determining a characteristic of a data entity which characteristic is based on a frequency of access to said data entity in a storage system, comprising providing a first counting bloom filter (CBF1) being active for a first interval in time, which first counting bloom filter (CBF1) comprises a set (S1) of first counters (C1), each time said data entity is accessed during the first interval in time increasing a value of at least one first counter (C1) of a subset of first counters (C1) to which subset of first counters (C1) said data entity is mapped in the first counting bloom filter (CBF1), providing a second counting bloom filter (CBF2) being active for a second interval in time, which second counting bloom filter (CBF2) comprises a set (S2) of second counters (C2), each time the data entity is accessed during the second interval in time increasing a value of at least one second counter (C2) of a subset of second counters (C2) to which subset of second counters (C2) said data entity is mapped in the second counting bloom filter (CBF2), determining the characteristic of the data entity subject to a value of at least one first counter (C1) of the subset of first counters (C1) at the end of the first interval in time, and subject to a value of at least one second counter (C2) of the subset of the second counters (C2) at the end of the second interval in time (CBF2).</claim-text> <claim-text>12.
The method according to claim 11, wherein overall n counting bloom filters (CBF) are provided each of which n counting bloom filters (CBF) being active for an associated interval in time, which associated intervals in time follow each other, wherein each of the n counting bloom filters (CBF) is operated according to the first or second counting bloom filter (CBF1, CBF2) each time said data entity is accessed during the associated interval in time, and wherein the characteristic of said data entity is determined subject to, for each of the n counting bloom filters (CBF), a value of at least one counter (C) of a subset of counters (C) associated with said data entity in the respective counting bloom filter (CBF) at the end of the associated interval in time.</claim-text> <claim-text>13. The method according to claim 12, wherein the characteristic of said data entity is determined based on an average of the counter values selected from the n counting bloom filters (CBF).</claim-text> <claim-text>14. The method according to any one of the preceding claims 11 to 13, wherein said data entity is mapped to the subset of first counters (C1) by means of one or more hash functions (h), and wherein said data entity is mapped to the subset of second counters (C2) by means of the same one or more hash functions (h).</claim-text> <claim-text>15.
The method according to any one of the preceding claims 11 to 14, wherein the subset of first counters (C1) comprises multiple first counters (C1) to which said data entity is mapped in the first counting bloom filter (CBF1), and wherein only the value of a single first counter (C1) in the subset is increased, which single first counter (C1) is the first counter (C1) in the subset that presently shows a lowest value amongst the multiple first counters (C1) in the subset, and wherein the subset of second counters (C2) comprises multiple second counters (C2) to which said entity is mapped in the second counting bloom filter (CBF2), and wherein only the value of a single second counter (C2) in the subset is increased, which single second counter (C2) is the second counter (C2) that presently shows a lowest value amongst the multiple second counters (C2) in the subset.</claim-text> <claim-text>16. The method according to any one of the preceding claims 11 to 15, wherein the subset of first counters (C1) comprises multiple first counters (C1) to which said data entity is mapped in the first counting bloom filter (CBF1), wherein the subset of second counters (C2) comprises multiple second counters (C2) to which said entity is mapped in the second counting bloom filter (CBF2), and wherein the characteristic of said data entity is determined subject to a value of a dedicated first counter (C1) of the subset of first counters (C1) which dedicated first counter (C1) is the first counter (C1) that shows the lowest value amongst the multiple first counters (C1) in the subset at the end of the first interval in time, and subject to a value of a dedicated second counter (C2) of the subset of second counters (C2) which dedicated second counter (C2) is the second counter (C2) that shows the lowest value amongst the multiple second counters (C2) in the subset at the end of the second interval in time.</claim-text> <claim-text>17.
The method according to any one of the preceding claims, wherein accessing said data entity includes at least one of reading said data entity and updating said data entity.</claim-text> <claim-text>18. The method according to any one of the preceding claims, wherein said data entity represents data addressed by a single logical block address (LBA).</claim-text> <claim-text>19. The method according to any one of the preceding claims, wherein subject to the determined characteristic of said data entity, said data entity is selected for being cached.</claim-text> <claim-text>20. The method according to any one of the preceding claims, wherein subject to the determined characteristic of said data entity, said data entity is selected for being stored in a dedicated tier (T) in a tiered storage system.</claim-text> <claim-text>21. A computer program product comprising a computer readable medium having computer readable program code embodied therewith, the computer readable program code comprising computer readable program code configured to perform a method according to any one of the preceding claims.</claim-text> <claim-text>22. A storage controller for determining a characteristic of a data entity which characteristic is based on a frequency of access to said data entity in a storage system, comprising a control unit adapted to execute a method according to any one of the preceding claims 1 to 19.</claim-text> <claim-text>23. A computer-implemented method for determining a characteristic of a data entity substantially as hereinbefore described, with reference to Figures 1-3 and 5-6 of the accompanying drawings.</claim-text> <claim-text>24. A storage controller substantially as hereinbefore described, with reference to Figures 1-3 and 5-6 of the accompanying drawings.</claim-text> <claim-text>25. A computer program substantially as hereinbefore described, with reference to Figures 1-3 and 5-6 of the accompanying drawings.</claim-text>