CN115391239A - Local algorithm-based cache dynamic data prefetching method, system, equipment and storage medium - Google Patents


Info

Publication number
CN115391239A
CN115391239A
Authority
CN
China
Prior art keywords: cache, prefetching, type, bit, address
Prior art date
Legal status
Pending
Application number
CN202211077779.8A
Other languages
Chinese (zh)
Inventor
周莉
王肖丛
孙田弋
贾思敏
牟进正
薛立晓
Current Assignee
Shandong University
Original Assignee
Shandong University
Priority date
Filing date
Publication date
Application filed by Shandong University
Priority to CN202211077779.8A
Publication of CN115391239A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F 12/0862 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F 2212/60 Details of cache memory
    • G06F 2212/602 Details relating to cache prefetching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention relates to a cache dynamic data prefetching method, system, device and storage medium based on a local area algorithm. The prefetching method comprises the following steps: setting up a history table; reading the history table and determining the cache miss type, where the cache miss types are transition, repeat and jump; calculating the prefetch addresses and the number of memory lines; initiating a prefetch request on the bus; and adjusting the history table according to the hit rate of the prefetch data set. The data prefetching strategy of the invention can effectively prefetch data and backfill a multi-level cache, does not interfere with existing prefetching strategies, and improves prefetching accuracy.

Description

Local algorithm-based cache dynamic data prefetching method, system, equipment and storage medium
Technical Field
The invention relates to a method, system, device and storage medium for prefetching dynamic cache data based on a local area algorithm, and belongs to the field of integrated circuits.
Background
A cache is a fast storage unit near the top of the computer memory hierarchy and acts as a bridge between main memory and the central processing unit (CPU); cache technology largely mitigates the "memory wall" limitation on processor performance. Modern processors greatly reduce the miss rate through a hierarchy of multiple cache levels, but the access-speed gap between the caches and main memory remains large, so when a given cache level misses, the CPU still pays a substantial cost to fetch a cache line from main memory.
Therefore, most modern processors adopt data prefetching to predict upcoming memory accesses and issue a prefetch request for the corresponding memory line before the CPU formally accesses the cache, fetching the memory-line data into the cache in advance and thereby improving the cache hit rate.
Data prefetching exploits the temporal and spatial locality of programs to increase the cache hit rate. Data prefetched from a memory line replaces an existing cache line; once a prefetch address is mispredicted, the prefetched data is not only useless but may also evict a useful cache line, causing a performance loss. Most existing prefetching strategies do not distinguish the type of cache miss and simply prefetch the memory line following the missed address. Although this strategy can raise the hit rate, its misprediction probability is high.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a cache dynamic data prefetching method based on a local area algorithm: cache misses are divided into three types by the local area algorithm, data prefetch addresses are calculated according to the miss type, and a high-accuracy data prefetching strategy is provided that improves the hit rate of prefetched data.
Interpretation of terms:
1. Cache: a memory located between the CPU and main memory; it is faster than main memory but has a relatively small capacity.
2. Cache hit: when the CPU issues an access request, the cache is searched first; if the required data is found there, this is called a cache hit.
3. Cache miss (cache invalidation): if the CPU issues an access request but the corresponding data is not found in the cache, this is called a cache miss.
4. Cache line: data is stored in memory in blocks, or lines; such a block is called a cache line.
5. Prefetching: on a cache miss the CPU must request the data from the next storage level, and the farther a level is from the CPU, the slower its access; prefetching fetches the data likely to be needed into a cache line in advance, ready for the CPU.
6. Temporal locality: if a program accesses the memory data of a cache line, that data is likely to be accessed again, possibly several times, in the near future.
7. Spatial locality: if a program accesses the memory data of a cache line, it is likely to access an adjacent cache line next.
8. Local area algorithm: a method for calculating the data prefetch target address according to the cache miss type.
9. Stride (step length): the difference between the current cache miss address and the previous cache miss address.
The technical scheme of the invention is as follows:
A cache dynamic data prefetching method based on a local area algorithm is applied to a cache and specifically comprises the following steps:
S1, constructing a history table, the history table being used to record the instruction addresses of cache misses;
S2, reading the history table, calculating the prefetch stride, and determining the cache miss type;
S3, calculating the prefetch data addresses with the local area algorithm, storing the prefetch addresses in a local area address table, and initiating a prefetch request to main memory over the bus;
S4, adjusting the history table according to the hit rate of the prefetch data set, controlling the replacement rule of the history table, and controlling when prefetching stops.
According to the present invention, preferably, the cache is a multi-level cache in which each level is a multi-way set-associative structure; each way contains several cache lines, each cache line comprises at least a tag field, a data field and flag bits, and the flag bits include at least a valid bit and a dirty bit.
Preferably, in step S1, when a cache miss occurs, the missed instruction address is stored in the history table, thereby building up the table; the history table contains n entries, each comprising a 1-bit valid bit, a 2-bit type field, an x-bit label field and the missed instruction address.
If the valid bit is 1 the entry is valid and may be selected and read in step S2; if the valid bit is 0 the entry is invalid and waits to be replaced by another instruction address. The valid bits of the three initialized cache miss types are all 1; when a prefetch data miss occurs, the valid bit changes from 1 to 0 and the entry waits to be replaced.
The type field takes the values 01, 10 and 11: 01 denotes the transition type, 10 the jump type, and 11 the repeat type.
The label width x is related to the number of entries n and satisfies 2^(x-1) < n ≤ 2^x, with n ≥ 3 and x ≥ 2; the x label bits form a label C. For example, if x = 2, the label width is 2 and the label C ranges over 00, 01, 10, 11.
Preferably, in step S1, a counter records the label of each entry; the labels represent the temporal order in which miss addresses were stored in the history table, and a count value of 0 indicates the instruction address recorded earliest. When the history table is full, a history instruction address is replaced according to the replacement rule.
The replacement rule is as follows: when an invalid entry exists, the entry with valid bit 0 and the smallest label value is replaced; assuming the label of the entry to be replaced is r, the labels are updated in sequence as r = n + 1, and t = t - 1 for every other entry with r < t ≤ (n + 1).
If all entries are valid, the entry that recorded its instruction address earliest is replaced; if its label is r, the labels are updated in sequence as r = n + 1, and t = t - 1 for every other entry with 0 < t ≤ (n + 1). This rule replaces invalid entries promptly and protects the entries most recently written into the history table from replacement.
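The replacement rule above can be sketched as a small Python model. This is an interpretation, not the patent's implementation: the entry layout, the label range 0..n-1 (0 = oldest), and the class and method names are illustrative assumptions, since the patent gives only the condensed update formulas; the victim entry here receives the newest label and the labels above it shift down by one.

```python
class HistoryTable:
    """Minimal model of the history table's label-based replacement rule
    (assumed labels 0..n-1, where 0 marks the oldest entry)."""

    def __init__(self, n):
        self.n = n
        # each entry: [valid_bit, label, miss_address]
        self.entries = [[False, i, None] for i in range(n)]

    def replace(self, addr):
        # prefer an invalid entry (valid bit 0) with the smallest label;
        # otherwise evict the oldest valid entry
        invalid = [e for e in self.entries if not e[0]]
        victim = min(invalid or self.entries, key=lambda e: e[1])
        r = victim[1]
        for e in self.entries:
            if e[1] > r:
                e[1] -= 1            # shift younger labels down by one
        victim[0], victim[1], victim[2] = True, self.n - 1, addr  # victim becomes newest
        return victim
```

After any number of replacements the labels remain a permutation of 0..n-1, so recently written entries (large labels) are protected from replacement, as the rule intends.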
Preferably, in step S2, the history table is read, the prefetch stride is calculated, and the cache miss type is determined. The specific process is as follows:
S2-1, read the history table and check whether the missed address hits the history table; if it hits, jump to step S4 and return the corresponding data from the prefetch data set; if it misses, or the hit entry is invalid, jump to step S2-2;
S2-2, read the three miss addresses most recently written into the history table and calculate the change in stride, judging the relationship between the three addresses from the strides;
S2-3, let the three most recent instruction addresses written into the history table be, from oldest to newest, a_(i-2), a_(i-1) and a_i;
S2-4, let stride k1 = a_i - a_(i-1) and stride k0 = a_(i-1) - a_(i-2), and compare k1 with k0:
if k1 and k0 are both 0, the cache miss type is judged to be the repeat type;
if k1 and k0 are equal and non-zero, the cache miss type is judged to be the jump type;
if k1 and k0 are unequal but one of them is 0, the cache miss type is judged to be the transition type;
if k1 and k0 are unequal and both non-zero, no data is prefetched, and the valid bit of the entry holding a_i is set to 0;
S2-5, on a repeat-type miss, one cache line is prefetched over the bus at a time;
S2-6, on a jump-type miss, three cache lines are prefetched over the bus at a time;
S2-7, on a transition-type miss, two cache lines are prefetched over the bus at a time;
S2-8, after the cache miss type is judged, the type is recorded in the type field of the history table;
S2-9, the stride and the cache miss type are stored and passed to step S3 in preparation for calculating the prefetch addresses with the local area algorithm.
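The stride comparison of steps S2-3 and S2-4, together with the per-type prefetch counts of S2-5 to S2-7, can be sketched compactly. A minimal illustration; the function name and the returned (type, line-count) tuple are not from the patent:

```python
def classify_miss(a2, a1, a0):
    """Classify a cache miss from the three most recent miss addresses,
    ordered oldest (a2) to newest (a0), per steps S2-3/S2-4.
    Returns the miss type and the number of cache lines to prefetch."""
    k0 = a1 - a2              # older stride
    k1 = a0 - a1              # newer stride
    if k1 == 0 and k0 == 0:
        return "repeat", 1    # S2-5: prefetch one cache line
    if k1 == k0:              # equal and (by the test above) non-zero
        return "jump", 3      # S2-6: prefetch three cache lines
    if k1 == 0 or k0 == 0:    # unequal, one stride zero
        return "transition", 2  # S2-7: prefetch two cache lines
    return None, 0            # unequal non-zero strides: no prefetch
```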
Preferably, in step S3, the local area algorithm calculates the prefetch data addresses, the addresses are stored in the local area address table, and a prefetch request is then initiated to main memory over the bus. The specific process is as follows:
S3-1, obtain the miss-type signal and the stride;
S3-2, when the stride does not span a full cache line, the prefetched data would be the data of the currently missed cache line; in this case the stride is set equal to its most significant bit plus 1;
S3-3, if the miss type is the repeat type, the memory address of the data to be prefetched is a_i, i = C; the instruction is marked as prefetched, and prefetching continues until the confidence reaches 0;
S3-4, if the miss type is the jump type, the addresses of the three memory lines to be prefetched are a_i + j*k1, i = C, j = 1, 2, 3; the three instructions are marked as prefetched, and prefetching continues until the confidence reaches 0;
S3-5, if the miss type is the transition type, the addresses of the two memory lines to be prefetched are a_i + j*(k1 + k0), i = C, j = 1, 2; the two instructions are marked as prefetched, and prefetching continues until the confidence reaches 0;
S3-6, once the addresses have been obtained, they are stored in the local area address table and a data prefetch request is initiated to main memory over the bus.
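The address rules of S3-3 to S3-5 can be sketched in the same spirit (illustrative only; note that for the transition type one of k1, k0 is zero, so k1 + k0 is simply the non-zero stride):

```python
def prefetch_addresses(miss_type, a_i, k1, k0):
    """Memory-line addresses to prefetch for a given miss type
    (steps S3-3 to S3-5), given the miss address a_i and strides k1, k0."""
    if miss_type == "repeat":
        return [a_i]                               # re-fetch the missed line
    if miss_type == "jump":
        return [a_i + j * k1 for j in (1, 2, 3)]   # continue the constant stride
    if miss_type == "transition":
        k = k1 + k0                                # one of k1, k0 is zero
        return [a_i + j * k for j in (1, 2)]
    return []                                      # irregular strides: no prefetch
```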
Preferably, in step S4, the history table is adjusted according to the hit rate of the prefetch data set, the replacement rule of the history table is controlled, and prefetching is stopped when appropriate. The specific process is as follows:
S4-1, the confidence of every type is initially set to 10; the confidence takes the values 00, 01, 10 and 11, ordered from low to high. The minimum value 00 indicates that data previously prefetched for cache misses can no longer hit, i.e. the data that used to be needed after a miss is no longer needed; the corresponding entry in the history table may then be cleared;
S4-2, wait for the prefetch data from main memory to be placed into the prefetch data set;
S4-3, if the instruction address hits the prefetch data set, a prefetch-hit signal is generated, the cache is backfilled from the prefetch data set, the confidence of the corresponding type is incremented by 1, and the method returns to step S3 to calculate the next prefetch data address and continue prefetching;
S4-4, if the instruction address misses the prefetch data set, the confidence is decremented by 1, and the method returns to step S3 to calculate the next prefetch instruction address; when the confidence falls to 00, prefetching stops and the history table is notified to set the valid bit of the instruction address entry to 0.
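The confidence mechanism of steps S4-1 to S4-4 behaves as a 2-bit saturating counter per miss type, initialised to binary 10. A sketch; the class and method names are illustrative, not from the patent:

```python
class Confidence:
    """2-bit saturating confidence counter per miss type,
    initialised to 0b10 as in step S4-1."""

    def __init__(self):
        self.value = 0b10

    def hit(self):
        # prefetch-data-set hit (S4-3): count up, saturating at 0b11
        self.value = min(self.value + 1, 0b11)

    def miss(self):
        # prefetch-data-set miss (S4-4): count down, saturating at 0b00
        self.value = max(self.value - 1, 0b00)

    @property
    def stop(self):
        # at 0b00 prefetching stops and the history entry is invalidated
        return self.value == 0b00
```

Starting from 0b10, two consecutive misses reach 0b00 and stop prefetching, while hits push the counter toward 0b11, letting a useful prefetch stream survive an occasional miss.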
A cache dynamic data prefetching system based on a local area algorithm comprises a history table construction module, a prefetch type discrimination module, a prefetch address storage module and a prefetch control module, connected in sequence:
the history table construction module implements step S1; the prefetch type discrimination module implements step S2; the prefetch address storage module implements step S3; and the prefetch control module implements step S4.
A computer device comprising a memory storing a computer program and a processor implementing the steps of a local algorithm based cache dynamic data pre-fetching method when executing the computer program.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of a method for cache dynamic data prefetching based on a local area algorithm.
The invention has the following beneficial effects:
1. The invention adopts a cache dynamic data prefetching method based on a local area algorithm, i.e. the prefetch address is calculated by the local area algorithm; the algorithm makes better use of the spatial and temporal locality of the program's data accesses and improves the hit rate.
2. The invention has a wide application range and supports data prefetching for multi-level caches.
3. The local area algorithm has low complexity and low hardware and software implementation cost.
4. The data prefetching strategy of the invention can effectively prefetch data and backfill the multi-level cache, does not interfere with existing prefetching strategies, and improves prefetching accuracy.
Drawings
FIG. 1 is a schematic diagram of a logical structure of a local area algorithm-based cache dynamic data prefetching method according to the present invention.
FIG. 2 is a schematic diagram illustrating steps of a local algorithm-based cache dynamic data prefetching method according to the present invention.
Detailed Description
The invention is further described below, but not limited thereto, with reference to the following examples and the accompanying drawings.
Example 1
A cache dynamic data prefetching method based on a local area algorithm, as shown in FIG. 1 and FIG. 2, comprises the following steps:
S1, constructing a history table, the history table being used to record the instruction addresses of cache misses;
The cache is a multi-level cache in which each level is a multi-way set-associative structure; each way contains several cache lines, each cache line comprises at least a tag field, a data field and flag bits, and the flag bits include at least a valid bit and a dirty bit.
In step S1, when a cache miss occurs, the missed instruction address is stored in the history table, thereby building up the table; the history table contains n entries, each comprising a 1-bit valid bit, a 2-bit type field, an x-bit label field and the missed instruction address.
If the valid bit is 1 the entry is valid and may be selected and read in step S2; if the valid bit is 0 the entry is invalid and waits to be replaced by another instruction address. The valid bits of the three initialized cache miss types are all 1; when a prefetch data miss occurs, the valid bit changes from 1 to 0 and the entry waits to be replaced.
The type field takes the values 01, 10 and 11: 01 denotes the transition type, 10 the jump type, and 11 the repeat type.
The label width x is related to the number of entries n and satisfies 2^(x-1) < n ≤ 2^x, with n ≥ 3 and x ≥ 2; the x label bits form a label C. For example, if x = 2, the label width is 2 and the label C ranges over 00, 01, 10, 11.
In step S1, a counter records the label of each entry; the labels represent the temporal order in which miss addresses were stored in the history table, and a count value of 0 indicates the instruction address recorded earliest. When the history table is full, a history instruction address is replaced according to the replacement rule.
The replacement rule is as follows: when an invalid entry exists, the entry with valid bit 0 and the smallest label value is replaced; assuming the label of the entry to be replaced is r, the labels are updated in sequence as r = n + 1, and t = t - 1 for every other entry with r < t ≤ (n + 1).
If all entries are valid, the entry that recorded its instruction address earliest is replaced; if its label is r, the labels are updated in sequence as r = n + 1, and t = t - 1 for every other entry with 0 < t ≤ (n + 1). This rule replaces invalid entries promptly and protects the entries most recently written into the history table from replacement.
S2, reading the history table, calculating the prefetch stride, and determining the cache miss type; the specific process is as follows:
S2-1, read the history table and check whether the missed address hits the history table; if it hits, jump to step S4 and return the corresponding data from the prefetch data set; if it misses, or the hit entry is invalid, jump to step S2-2;
S2-2, read the three miss addresses most recently written into the history table and calculate the change in stride, judging the relationship between the three addresses from the strides;
S2-3, let the three most recent instruction addresses written into the history table be, from oldest to newest, a_(i-2), a_(i-1) and a_i;
S2-4, let stride k1 = a_i - a_(i-1) and stride k0 = a_(i-1) - a_(i-2), and compare k1 with k0:
if k1 and k0 are both 0, the cache miss type is judged to be the repeat type;
if k1 and k0 are equal and non-zero, the cache miss type is judged to be the jump type;
if k1 and k0 are unequal but one of them is 0, the cache miss type is judged to be the transition type;
if k1 and k0 are unequal and both non-zero, no data is prefetched, and the valid bit of the entry holding a_i is set to 0;
S2-5, on a repeat-type miss, one cache line is prefetched over the bus at a time;
S2-6, on a jump-type miss, three cache lines are prefetched over the bus at a time;
S2-7, on a transition-type miss, two cache lines are prefetched over the bus at a time;
S2-8, after the cache miss type is judged, the type is recorded in the type field of the history table;
S2-9, the stride and the cache miss type are stored and passed to step S3 in preparation for calculating the prefetch addresses with the local area algorithm.
S3, calculating the prefetch data addresses with the local area algorithm, storing the prefetch addresses in the local area address table, and initiating a prefetch request to main memory over the bus; the specific process is as follows:
S3-1, obtain the miss-type signal and the stride;
S3-2, when the stride does not span a full cache line, the prefetched data would be the data of the currently missed cache line; in this case the stride is set equal to its most significant bit plus 1;
S3-3, if the miss type is the repeat type, the memory address of the data to be prefetched is a_i, i = C; the instruction is marked as prefetched, and prefetching continues until the confidence reaches 0;
S3-4, if the miss type is the jump type, the addresses of the three memory lines to be prefetched are a_i + j*k1, i = C, j = 1, 2, 3; the three instructions are marked as prefetched, and prefetching continues until the confidence reaches 0;
S3-5, if the miss type is the transition type, the addresses of the two memory lines to be prefetched are a_i + j*(k1 + k0), i = C, j = 1, 2; the two instructions are marked as prefetched, and prefetching continues until the confidence reaches 0;
S3-6, once the addresses have been obtained, they are stored in the local area address table and a data prefetch request is initiated to main memory over the bus.
S4, adjusting the history table according to the hit rate of the prefetch data set, controlling the replacement rule of the history table, and controlling when prefetching stops; the specific process is as follows:
S4-1, the confidence of every type is initially set to 10; the confidence takes the values 00, 01, 10 and 11, ordered from low to high. The minimum value 00 indicates that data previously prefetched for cache misses can no longer hit, i.e. the data that used to be needed after a miss is no longer needed; the corresponding entry in the history table may then be cleared;
S4-2, wait for the prefetch data from main memory to be placed into the prefetch data set;
S4-3, if the instruction address hits the prefetch data set, a prefetch-hit signal is generated, the cache is backfilled from the prefetch data set, the confidence of the corresponding type is incremented by 1, and the method returns to step S3 to calculate the next prefetch data address and continue prefetching;
S4-4, if the instruction address misses the prefetch data set, the confidence is decremented by 1, and the method returns to step S3 to calculate the next prefetch instruction address; when the confidence falls to 00, prefetching stops and the history table is notified to set the valid bit of the instruction address entry to 0.
Example 2
A cache dynamic data prefetching system based on a local area algorithm, for implementing the cache dynamic data prefetching method based on a local area algorithm provided in Embodiment 1, comprises a history table construction module, a prefetch type discrimination module, a prefetch address storage module and a prefetch control module, connected in sequence:
the history table construction module implements step S1; the prefetch type discrimination module implements step S2; the prefetch address storage module implements step S3; and the prefetch control module implements step S4.
Example 3
A computer device comprising a memory storing a computer program and a processor implementing the steps of the local algorithm based cache dynamic data prefetching method provided in embodiment 1 when the processor executes the computer program.
Example 4
A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the local algorithm based cache dynamic data prefetching method provided in embodiment 1.

Claims (10)

1. A cache dynamic data prefetching method based on a local area algorithm, characterized in that the method is applied to a cache and specifically comprises the following steps:
S1, constructing a history table, the history table being used to record the instruction addresses of cache misses;
S2, reading the history table, calculating the prefetch stride, and determining the cache miss type;
S3, calculating the prefetch data addresses with the local area algorithm, storing the prefetch addresses in a local area address table, and initiating a prefetch request to main memory over the bus;
S4, adjusting the history table according to the hit rate of the prefetch data set, controlling the replacement rule of the history table, and controlling when prefetching stops.
2. The cache dynamic data prefetching method based on a local area algorithm according to claim 1, wherein the cache is a multi-level cache in which each level is a multi-way set-associative structure; each way contains several cache lines, each cache line comprises at least a tag field, a data field and flag bits, and the flag bits include at least a valid bit and a dirty bit.
3. The cache dynamic data prefetching method based on a local area algorithm according to claim 1, wherein in step S1, when a cache miss occurs, the missed instruction address is stored in the history table, thereby building up the table; the history table contains n entries, each comprising a 1-bit valid bit, a 2-bit type field, an x-bit label field and the missed instruction address;
if the valid bit is 1 the entry is valid and may be selected and read in step S2; if the valid bit is 0 the entry is invalid and waits to be replaced by another instruction address; the valid bits of the three initialized cache miss types are all 1, and when a prefetch data miss occurs, the valid bit changes from 1 to 0 and the entry waits to be replaced;
the type field takes the values 01, 10 and 11: 01 denotes the transition type, 10 the jump type, and 11 the repeat type;
the label width x satisfies 2^(x-1) < n ≤ 2^x, with n ≥ 3 and x ≥ 2; the x label bits form a label C.
4. The cache dynamic data prefetching method based on a local area algorithm according to claim 1, wherein in step S1, a counter records the label of each entry; the labels represent the temporal order in which miss addresses were stored in the history table, and a count value of 0 indicates the instruction address recorded earliest; when the history table is full, a history instruction address is replaced according to the replacement rule;
the replacement rule is as follows: when an invalid entry exists, the entry with valid bit 0 and the smallest label value is replaced; if all entries are valid, the entry that recorded its instruction address earliest is replaced.
5. The local area algorithm-based cache dynamic data prefetching method according to claim 1, wherein in step S2, a history table is read, a prefetching step is calculated, and a cache invalidation type is determined; the specific process is as follows:
s2-1, reading the history table, judging whether the address with cache failure hits the history table, if so, jumping to the step S4, and returning corresponding data of the pre-fetched data set; if the table entry is not hit or is invalid after the hit, jumping to the step S2-2;
s2-2, reading three cache failure addresses of the latest written history table, calculating a change value of a step length, and judging the relation of the three cache failure addresses by calculating the step length;
S2-3, let the three instruction addresses most recently written into the history table, ordered from oldest to newest, be a_{i-2}, a_{i-1} and a_i;
S2-4, let step length k1 = a_i - a_{i-1} and step length k0 = a_{i-1} - a_{i-2}, and judge the relationship between the two step lengths k1 and k0:
if both k1 and k0 are 0, the cache failure type is judged to be the repetition type;
if k1 and k0 are equal and nonzero, the cache failure type is judged to be the jump type;
if k1 and k0 are unequal but one of the two step lengths is 0, the cache failure type is judged to be the transition type;
if k1 and k0 are unequal and both nonzero, no data prefetching is performed, and the valid bit of the entry containing a_i is set to 0;
S2-5, on a repetition-type cache failure, prefetch one cache line over the bus at a time;
S2-6, on a jump-type cache failure, prefetch three cache lines over the bus at a time;
S2-7, on a transition-type cache failure, prefetch two cache lines over the bus at a time;
S2-8, after the cache failure type is judged, record the type in the type bits of the history table;
S2-9, store the step length and the cache failure type and send them to step S3, in preparation for calculating the prefetch address with the local area algorithm.
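As an illustration only (not part of the claims), the classification of steps S2-3 to S2-7 — two strides computed from the three most recent cache failure addresses, mapped to a failure type and a prefetch count — can be sketched in Python; the function and type-name strings are ours:

```python
def classify_miss(a_prev2, a_prev1, a_curr):
    """Classify a cache failure from the three most recent miss addresses
    a_{i-2}, a_{i-1}, a_i. Returns (type, cache lines to prefetch)."""
    k0 = a_prev1 - a_prev2  # older step length
    k1 = a_curr - a_prev1   # newer step length
    if k1 == 0 and k0 == 0:
        return "repetition", 1   # S2-5: prefetch one cache line
    if k1 == k0 and k1 != 0:
        return "jump", 3         # S2-6: prefetch three cache lines
    if k1 != k0 and (k1 == 0 or k0 == 0):
        return "transition", 2   # S2-7: prefetch two cache lines
    return None, 0               # irregular pattern: no prefetching
```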
6. The local area algorithm-based cache dynamic data prefetching method according to claim 1, wherein in step S3, the local area algorithm is used to calculate the address of the prefetched data, the prefetched address is stored in the local area address table, and then a prefetching request is sent to the main memory through the bus; the specific process is as follows:
S3-1, acquire the failure type signal and the step length;
S3-2, when the step length does not span one cache line, set the step length equal to its most significant bit plus 1;
S3-3, if the failure type is the repetition type, the address in memory of the data to be prefetched is a_i, i = C; the instruction is marked as prefetched, and prefetching continues until the confidence is 0, at which point it stops;
S3-4, if the failure type is the jump type, the addresses of the three memory lines to be prefetched are a_i + j*k1, i = C, j = 1, 2, 3; the three instructions are marked as prefetched, and prefetching continues until the confidence is 0, at which point it stops;
S3-5, if the cache failure type is the transition type, the addresses of the two memory lines to be prefetched are a_i + j*(k1 + k0), i = C, j = 1, 2; the two instructions are marked as prefetched, and prefetching continues until the confidence is 0, at which point it stops;
S3-6, after the addresses are obtained, store them in the local area address table and initiate a data prefetch request to the main memory through the bus.
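As an illustration only (not part of the claims), the address generation of steps S3-3 to S3-5 can be sketched in Python; the function name and type-name strings are ours, and a_i, k1, k0 are the quantities defined in claim 5:

```python
def prefetch_addresses(miss_type, a_i, k1, k0):
    """Return the memory addresses to prefetch for a given failure type,
    following steps S3-3 to S3-5."""
    if miss_type == "repetition":
        return [a_i]                                  # S3-3: one line
    if miss_type == "jump":
        return [a_i + j * k1 for j in (1, 2, 3)]      # S3-4: three lines
    if miss_type == "transition":
        return [a_i + j * (k1 + k0) for j in (1, 2)]  # S3-5: two lines
    return []                                         # irregular: nothing
```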
7. The local area algorithm-based cache dynamic data prefetching method according to claim 1, wherein in step S4, the history table is adjusted according to the hit rate of the prefetched data set, the replacement rule of the history table is controlled, and prefetching is stopped; the specific process is as follows:
S4-1, initially set the confidence of every type to 10, the confidence taking values in the range 00, 01, 10, 11;
S4-2, wait for the prefetched data from the main memory to be placed into the prefetched data set;
S4-3, if the instruction address hits the prefetched data set, generate a prefetch hit signal, backfill the cache from the prefetched data set, add 1 to the confidence of the corresponding type, return to step S3 to calculate the address of the data to be prefetched, and continue prefetching;
S4-4, if the instruction address misses the prefetched data set, subtract 1 from the confidence, then return to step S3 to calculate the instruction address of the data to be prefetched; when the confidence falls to 00, stop prefetching and notify the history table to set the valid bit of the entry holding the instruction address to 0.
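As an illustration only (not part of the claims), the per-type confidence of step S4 behaves as a 2-bit saturating counter: initialised to 10 (binary), incremented on a prefetch hit, decremented on a miss, with prefetching disabled at 00. A Python sketch (the class name and `active` flag are ours):

```python
class Confidence:
    """2-bit saturating confidence counter: values 0..3 (binary 00..11),
    initialised to 2 (binary 10) as in step S4-1."""

    def __init__(self):
        self.value = 2

    def hit(self):
        # prefetch hit: increment, saturating at 3 (binary 11)
        self.value = min(self.value + 1, 3)

    def miss(self):
        # prefetch miss: decrement, saturating at 0 (binary 00)
        self.value = max(self.value - 1, 0)

    @property
    def active(self):
        # prefetching for this type stops once the counter reaches 00
        return self.value > 0
```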
8. A local area algorithm-based cache dynamic data prefetching system, characterized in that it is used for implementing the local area algorithm-based cache dynamic data prefetching method according to any one of claims 1-7, and comprises a history table construction module, a prefetch type discrimination module, a prefetch address storage module and a prefetch control module which are connected in sequence;
the history table construction module is used for implementing step S1; the prefetch type discrimination module is used for implementing step S2; the prefetch address storage module is used for implementing step S3; and the prefetch control module is used for implementing step S4.
9. A computer device comprising a memory storing a computer program and a processor implementing the steps of the local area algorithm based cache dynamic data prefetch method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the local algorithm based cache dynamic data prefetching method according to any one of claims 1 to 7.
CN202211077779.8A 2022-09-05 2022-09-05 Local algorithm-based cache dynamic data prefetching method, system, equipment and storage medium Pending CN115391239A (en)

Publications (1)

Publication Number Publication Date
CN115391239A true CN115391239A (en) 2022-11-25



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination