CN115357196A - Dynamically expandable set-associative cache method, apparatus, device and medium


Info

Publication number: CN115357196A
Application number: CN202211068093.2A
Authority: CN (China)
Prior art keywords: cache, data, block, temporary, group
Legal status: Pending (assumed; no legal analysis has been performed)
Other languages: Chinese (zh)
Inventors: 高峰, 吴喜广, 张凡
Current assignee: Peng Cheng Laboratory
Original assignee: Peng Cheng Laboratory
Priority date: 2022-08-31
Filing date: 2022-08-31
Publication date: 2022-11-18
Application filed by Peng Cheng Laboratory
Priority to CN202211068093.2A

Classifications

    • G06F 3/0608: Saving storage space on storage systems
    • G06F 12/0806: Multiuser, multiprocessor or multiprocessing cache systems
    • G06F 3/061: Improving I/O performance
    • G06F 3/064: Management of blocks
    • G06F 3/067: Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]


Abstract

The application discloses a dynamically expandable set-associative cache method, apparatus, device and storage medium, belonging to the field of data processing. The method comprises the following steps: determining a first cache set from the data cache according to the write address of the data to be cached, wherein the data cache further comprises at least one shared cache block; judging whether all cache blocks corresponding to the first cache set have remaining storage space; if no storage space remains, screening out at least one temporary cache block from the at least one shared cache block, the temporary cache block being a shared cache block that stores no data; associating the temporary cache block with the first cache set and returning the address information of the temporary cache block; and writing the data to be cached into the temporary cache block according to the address information. Resource utilization is thereby improved.

Description

Dynamically expandable set-associative cache method, apparatus, device and medium
Technical Field
The present application relates to the field of data processing, and in particular to a dynamically expandable set-associative cache method, apparatus, device and storage medium.
Background
In the related art, the cache is organized as a set-associative structure: the cache is divided into a number of cache sets, each containing several cache blocks. The number of cache blocks in each set is the associativity of the cache. When data is written, the set in which the data should be placed is determined from the access address issued by the processor.
However, the addresses accessed by the processor may not be evenly distributed across the cache sets while a program runs. The cache blocks of some sets may fill up early; if many further data blocks must then be placed into those sets, frequent replacement operations occur, i.e. some cache block is flushed to the next level of storage according to a replacement policy so that a new data block can be written in its place. Meanwhile, other cache sets may go unaccessed for a long time, their cache blocks remaining idle. That is, the prior art suffers from low utilization of cache set resources.
Summary of the application
The present application mainly aims to provide a dynamically expandable set-associative cache method, apparatus, device and storage medium, so as to solve the technical problem of low resource utilization in existing cache sets.
To achieve the above object, the present application provides a dynamically expandable set-associative cache method, the method comprising:
determining a first cache set from the data cache according to the write address of the data to be cached, wherein the data cache further comprises at least one shared cache block;
judging whether all cache blocks corresponding to the first cache set have remaining storage space;
if no storage space remains, screening out at least one temporary cache block from the at least one shared cache block, the temporary cache block being a shared cache block that stores no data;
associating the temporary cache block with the first cache set and returning the address information of the temporary cache block;
and writing the data to be cached into the temporary cache block according to the address information.
In an embodiment, screening out at least one temporary cache block from the at least one shared cache block if there is no remaining storage space includes:
if no storage space remains, sequentially determining the data storage condition of the at least one shared cache block;
and taking the first shared cache block that stores no data as the temporary cache block.
In an embodiment, after the writing of the data to be cached into the temporary cache block according to the address information, the method further includes:
if the data written in the temporary cache block is taken out, releasing the association between the temporary cache block and the cache set.
In an embodiment, the method further comprises:
determining a second cache set, in the data cache, of the data to be accessed according to the access address of a received access request;
searching for the data to be accessed corresponding to the access request in all cache blocks corresponding to the second cache set;
if the data is not hit there, searching for the data to be accessed in a second temporary cache block corresponding to the second cache set;
and if a hit occurs in the second temporary cache block, returning the data stored in the second temporary cache block.
In an embodiment, searching for the data to be accessed in the second temporary cache block corresponding to the second cache set if the data is not hit includes:
if the data is not hit, judging whether the second cache set is associated with a second temporary cache block;
and if it is associated, searching for the data to be accessed in the second temporary cache block.
In one embodiment, a shared cache block indication table is preset in the data cache;
the judging whether the second cache set is associated with the second temporary cache block includes:
querying, in the shared cache block indication table, the temporary cache state corresponding to the second cache set;
if the temporary cache state is the valid state, determining that the second cache set is associated with a second temporary cache block;
if the temporary cache state is the invalid state, determining that the second cache set is not associated with a second temporary cache block;
the searching for the data to be accessed in the second temporary cache block includes:
searching for the data to be accessed in the second temporary cache block according to second address information corresponding to the second cache set in the shared cache block indication table.
In an embodiment, after determining, according to the access address of the received access request, the second cache set of the data to be accessed in the data cache, the method further includes:
searching in parallel for the data to be accessed corresponding to the access request in all cache blocks corresponding to the second cache set and in the second temporary cache block;
and if the data is hit, returning the hit stored data.
In a second aspect, the present application further provides a dynamically expandable set-associative cache apparatus, the apparatus comprising:
a cache set determining module, configured to determine a first cache set from the data cache according to the write address of the data to be cached, wherein the data cache further comprises at least one shared cache block;
a space judgment module, configured to judge whether all cache blocks corresponding to the first cache set have remaining storage space;
a temporary cache determining module, configured to screen out at least one temporary cache block from the at least one shared cache block if no storage space remains, the temporary cache block being a shared cache block that stores no data;
a cache association module, configured to associate the temporary cache block with the first cache set and return the address information of the temporary cache block;
and a data writing module, configured to write the data to be cached into the temporary cache block according to the address information.
In a third aspect, the present application further provides a data caching device, including: a processor, a memory and a data caching program stored in the memory, wherein the data caching program, when executed by the processor, implements the steps of the dynamically expandable set-associative cache method according to any one of the first aspect above.
In a fourth aspect, the present application further provides a computer-readable storage medium having stored thereon a data caching program, which when executed by a processor implements the dynamically scalable set-associative cache method of any one of the first aspect.
The embodiment of the present application provides a dynamically expandable set-associative cache method, the method comprising: determining a first cache set from the data cache according to the write address of the data to be cached, wherein the data cache further comprises at least one shared cache block; judging whether all cache blocks corresponding to the first cache set have remaining storage space; if no storage space remains, screening out at least one temporary cache block from the at least one shared cache block, the temporary cache block being a shared cache block that stores no data; associating the temporary cache block with the first cache set and returning the address information of the temporary cache block; and writing the data to be cached into the temporary cache block according to the address information.
In this way, part of the storage space in the data cache serves as shared cache blocks. When a cache set is used frequently and the storage space of all its cache blocks has been used up, at least one shared cache block can be associated with that cache set, i.e. allocated to it for use. This increases the number of cache blocks in the set and expands its storage space, which reduces replacement and improves resource utilization.
Drawings
FIG. 1 is a schematic structural diagram of a data caching device involved in the dynamically expandable set-associative cache method of the present application;
FIG. 2 is a flowchart of a first embodiment of the dynamically expandable set-associative cache method of the present application;
FIG. 3 is a flowchart of a second embodiment of the dynamically expandable set-associative cache method of the present application;
FIG. 4 is a flowchart of a third embodiment of the dynamically expandable set-associative cache method of the present application;
FIG. 5 is a flowchart of a fourth embodiment of the dynamically expandable set-associative cache method of the present application;
FIG. 6 is a flowchart of a fifth embodiment of the dynamically expandable set-associative cache method of the present application;
FIG. 7 is a functional block diagram of an embodiment of the dynamically expandable set-associative cache apparatus of the present application.
The implementation, functional features and advantages of the objectives of the present application will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit it.
In the related art, the running speed of the processor in a computer or chip is often greater than the read/write speed of main memory, and a cache is introduced to speed up the processor's access to main memory. A cache is a small-capacity but high-speed memory located between the processor and the main memory DRAM (Dynamic Random Access Memory), and is generally built from SRAM (Static Random Access Memory). A system may have multiple levels of cache. Because the cache capacity is small, the data it stores is a subset of the main memory data, and it improves storage access performance by exploiting the temporal and spatial locality of programs. If the data needed by the processor is found in the current level of cache, this is called a hit; on a hit, the slower next level of storage need not be accessed and the data is returned directly from the cache, which greatly improves performance because the cache is much faster than main memory. If the needed data is not found in the current level of cache, this is called a miss, and the next level of storage is accessed to retrieve the data.
The minimum unit of data exchange between the cache and the next level of storage is the cache block (or cache line); that is, the cache and the next level of storage transfer data in units of cache blocks. A common cache organization is the set-associative structure: the cache is divided into a number of cache sets, each containing several cache blocks. The number of cache blocks in each set is the associativity of the cache. When data is written, the cache set is determined from the memory address issued by the processor, but which cache block within the set is not fixed; any cache block of the set may hold the data. For example, a 256-set 4-way cache structure contains 256 sets of 4 cache blocks each; the set in which a given data block is stored is fixed, but its position among the 4 blocks of that set is not.
One type of miss in a set-associative cache is the conflict miss: because a cache set contains a limited number of cache blocks, if many data blocks with different addresses map to the same cache set, some data blocks will be evicted and must be fetched again the next time they are needed. Increasing the associativity can reduce conflict misses.
In current common implementations, the number of sets and the associativity of the cache are fixed: the number of sets does not change, the associativity does not change, and the capacity of the cache does not change. But the addresses accessed by the processor may not be evenly distributed across the cache sets while a program runs. The cache blocks of some sets may fill up early while many further data blocks still need to be placed into them, causing frequent replacement operations: a cache block is flushed to the next level of storage according to the replacement policy so that a new data block can be written in its place. Meanwhile, other cache sets may go unaccessed for a long time, their cache blocks remaining idle. Suppose the cache contains 1024 sets of 4 cache blocks each, i.e. a 4-way set-associative cache, and sets 0 and 1 are entirely idle, set 2 holds valid data in all 4 blocks, and set 3 has 3 valid blocks. If newly accessed data must now be placed into set 2, a conflict miss occurs: the data block must be fetched from the lower-level storage, one cache block of set 2 must be written back to the lower-level storage, and the new data is placed in its position. Although the other sets have many free positions, they cannot be used, while set 2 is full and must undergo replacement.
That is, in the related art, the set to which a cache block belongs is fixed; even if the cache blocks of some sets are free while those of other sets are frequently replaced, the busy sets cannot use the free sets' cache blocks.
The present application therefore provides a solution that increases the number of cache blocks in particular cache sets in a targeted manner, according to how the blocks in each cache set are being used. This reduces how often replacement occurs, lowers the miss rate and raises the hit rate, thereby improving access performance. In the present application, increasing the number of cache blocks in certain cache sets is achieved by allocating shared cache blocks: when a cache set is used frequently and the storage space of the cache blocks it contains is used up, at least one shared cache block can be associated with that set, i.e. allocated to it for use, increasing the number of cache blocks in the set and expanding its storage space, which reduces replacement and improves resource utilization.
The inventive concept of the present application is further illustrated below with reference to some specific embodiments.
Referring to FIG. 1, FIG. 1 is a schematic structural diagram of the data caching device, in a hardware operating environment, according to an embodiment of the present application.
As shown in FIG. 1, the data caching device may include: a processor 1001, such as a Central Processing Unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard), and optionally may also include standard wired and wireless interfaces. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless Fidelity (Wi-Fi) interface). The memory 1005 may be a high-speed Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as a disk memory. The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in FIG. 1 does not constitute a limitation of the data caching device, which may include more or fewer components than shown, combine some components, or arrange components differently.
As shown in fig. 1, the memory 1005, which is a storage medium, may include therein an operating system, a data storage module, a network communication module, a user interface module, and a data caching program.
In the data caching device shown in FIG. 1, the network interface 1004 is mainly used for data communication with a network server, and the user interface 1003 is mainly used for data interaction with a user. The data caching device calls, through the processor 1001, the data caching program stored in the memory 1005 and executes the dynamically expandable set-associative cache method provided by the embodiments of the present application.
Based on, but not limited to, the above hardware structure, a first embodiment of the dynamically expandable set-associative cache method of the present application is provided. Referring to FIG. 2, FIG. 2 is a flowchart of the first embodiment of the dynamically expandable set-associative cache method of the present application.
It should be noted that, although a logical order is shown in the flow chart, in some cases, the steps shown or described may be performed in an order different than that shown or described herein.
In this embodiment, the method includes:
s101, determining a first cache group from a data cache according to a write address of data to be cached; wherein the data buffer further comprises at least one shared buffer block.
Specifically, in this embodiment, the processor issues a cache instruction when processing the data to be cached, and the cache instruction carries a corresponding access address, i.e. the write address. The access address points to a certain cache set in the data cache. From the description above, the data cache is a small-capacity high-speed memory located between the processor and the main memory DRAM (Dynamic Random Access Memory), generally built from SRAM (Static Random Access Memory). A system may have multiple levels of cache, in which case the data cache in this embodiment may be the level-one cache. Furthermore, as described above, the processor's write addresses may not be evenly distributed across the cache sets while a program runs; the write address may therefore point to a frequently used cache set.
In this embodiment, the whole cache space of the data cache consists of a fixed storage space and a shared storage space. The fixed storage space has the same architecture as a prior-art set-associative cache, i.e. which cache set each cache block belongs to is fixed and unchanged. The shared storage space consists of at least one shared cache block, a shared cache block being a cache block that does not belong permanently to any particular cache set.
In one example, the address bit width of the data cache is 32 bits and the cache block width is 4 bytes, so the Offset field is 2 bits wide (a 4-byte block needs 2 offset bits), the index field is 8 bits wide (256 sets), and the Tag field is 22 bits wide. The fixed storage space comprises 256 cache sets of 2 cache blocks (ways) each. The shared storage space comprises 128 cache blocks. To keep control costs low, 2 consecutive shared cache blocks may be allocated as one unit, so the whole shared storage space can serve up to 64 cache sets at a time. Alternatively, each shared cache block may be allocated individually, so the shared storage space can serve up to 128 cache sets. The number of shared cache blocks allocated together as a unit may be configured through a register.
Cache blocks in the fixed storage space and the shared storage space have the same structure: each comprises a 1-bit valid (V) field, a 22-bit Tag field and a 32-bit data field.
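For illustration only, this example layout can be modelled in C++ roughly as follows; the struct and function names are assumptions of this sketch, not terms from the patent:

    #include <cstdint>

    // Sketch of the example block layout: 1-bit V, 22-bit Tag, 32-bit data.
    struct CacheBlock {
        uint32_t valid : 1;   // V field: 1 when the block holds valid data
        uint32_t tag   : 22;  // Tag field, compared with the address tag
        uint32_t data;        // 32-bit data field (one 4-byte block)
    };

    constexpr uint32_t kOffsetBits = 2;   // 4-byte blocks need 2 offset bits
    constexpr uint32_t kIndexBits  = 8;   // 256 cache sets need 8 index bits

    // Field extraction for the 32-bit address: Tag | index | Offset.
    inline uint32_t addr_offset(uint32_t a) { return a & ((1u << kOffsetBits) - 1); }
    inline uint32_t addr_index(uint32_t a)  { return (a >> kOffsetBits) & ((1u << kIndexBits) - 1); }
    inline uint32_t addr_tag(uint32_t a)    { return a >> (kOffsetBits + kIndexBits); }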
In some embodiments, the data cache is provided with a shared storage space management module, which records the total size of the shared storage space, its grouping into units and the utilization of each unit, and handles the allocation and reclamation of shared cache blocks.
In this step, the cache set into which the data must be written is determined from the issued write address.
Step S102, judging whether all cache blocks corresponding to the first cache set have remaining storage space.
Step S103, if no storage space remains, screening out at least one temporary cache block from the at least one shared cache block; the temporary cache block is a shared cache block that stores no data.
Step S104, associating the temporary cache block with the first cache set, and returning the address information of the temporary cache block.
Specifically, it is judged whether any of the cache blocks corresponding to the first cache set in the fixed storage space is free. If there is a free block, the data to be cached is written into it, and the data caching process ends.
If none of the cache blocks corresponding to the first cache set in the fixed storage space is free, the storage space of the first cache set in the fixed storage space is used up and no storage space remains.
At this point, the storage space of the first cache set can be expanded: at least one temporary cache block is screened out from the at least one shared cache block. The temporary cache block is a blank shared cache block, i.e. one not in use by any other cache set and therefore storing no data.
After a blank temporary cache block is screened out, it is associated with the first cache set; that is, this shared cache block of the shared storage space is temporarily allocated to the first cache set. In other words, while the association exists, the cache blocks of the first cache set consist of two parts: the cache blocks that belong to it permanently in the fixed storage space, and the cache blocks temporarily allocated to it from the shared storage space. The storage space of the first cache set is thus expanded.
It will be appreciated that once the temporary cache block is associated with the first cache set, and until the association is released, the temporary cache block can be regarded as a fixed cache block of the first cache set.
The temporary cache block is associated with the first cache set and its address information is returned, so that the processor can accurately determine the physical location of the temporary cache block.
It will be appreciated that, since the temporary cache block may comprise 2 consecutive shared cache blocks, the returned address information may be just the starting address of the temporary cache block, to save internal resources.
And step S105, writing the data to be cached into the temporary cache block according to the address information.
In this step, the processor writes the data to be cached into the temporary cache block.
In an example, data needs to be written into cache set 2 of the fixed storage space, but both of the cache blocks (2 ways) it contains hold valid data; placing the new data into set 2 would therefore cause a conflict miss. An expansion request can then be sent to the shared storage space management module, which, upon receiving it, screens out a unit of 2 consecutive blank shared cache blocks from the shared cache blocks it manages and allocates it to set 2. Set 2 then contains 4 cache blocks in total: its 2 original cache blocks plus the 2 temporary cache blocks, and the data to be written can be placed into one of the 2 temporary cache blocks.
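A minimal C++ sketch of this write path follows, assuming the example configuration above (2 fixed ways per set, 128 shared blocks handed out in 2-block units). All names here (CacheSet, try_allocate_shared_unit, write_data) are hypothetical and only illustrate the allocate-on-full behaviour, not the patent's actual implementation:

    #include <array>
    #include <cstdint>
    #include <optional>

    struct CacheBlock {        // same layout as in the earlier sketch
        uint32_t valid : 1;
        uint32_t tag   : 22;
        uint32_t data;
    };

    struct CacheSet {
        std::array<CacheBlock, 2> fixed{};  // the 2 fixed ways of this set
        bool temp_valid = false;            // temporary blocks attached?
        uint32_t temp_base = 0;             // start of its 2-block unit
    };

    std::array<CacheBlock, 128> shared;     // shared storage space
    std::array<bool, 64> unit_in_use{};     // 64 units of 2 consecutive blocks

    // First-fit allocation of one 2-block unit (hypothetical manager logic).
    std::optional<uint32_t> try_allocate_shared_unit() {
        for (uint32_t u = 0; u < unit_in_use.size(); ++u)
            if (!unit_in_use[u]) { unit_in_use[u] = true; return u * 2; }
        return std::nullopt;                // all shared space is in use
    }

    // Write path: use a free fixed way if any; otherwise expand the set
    // with a shared unit and write into a temporary block. Returns false
    // when only a replacement (not covered here) could make room.
    bool write_data(CacheSet& set, uint32_t tag, uint32_t value) {
        for (CacheBlock& way : set.fixed)
            if (!way.valid) {
                way.valid = 1; way.tag = tag; way.data = value;
                return true;
            }
        if (!set.temp_valid) {              // set is full: request expansion
            auto base = try_allocate_shared_unit();
            if (!base) return false;        // shared space exhausted
            set.temp_valid = true;
            set.temp_base = *base;
        }
        for (uint32_t i = 0; i < 2; ++i) {
            CacheBlock& blk = shared[set.temp_base + i];
            if (!blk.valid) {
                blk.valid = 1; blk.tag = tag; blk.data = value;
                return true;
            }
        }
        return false;                       // temporary blocks also full
    }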
It can be seen that, in this embodiment, part of the storage space in the data cache serves as shared cache blocks. When a cache set is used frequently and the storage space of all its cache blocks is used up, at least one shared cache block can be associated with the set, i.e. allocated to it for use, increasing the number of cache blocks in the set and expanding its storage space, which reduces replacement and improves resource utilization.
As an embodiment, step S103 specifically includes:
Step S1031, if no storage space remains, sequentially determining the data storage condition of the at least one shared cache block.
Step S1032, taking the first shared cache block that stores no data as the temporary cache block.
Specifically, in this embodiment the data storage condition of the at least one shared cache block is determined sequentially, and the first shared cache block found that is not allocated to any cache set of the fixed storage space is used as the temporary cache block. It is worth noting that this first unallocated shared cache block may historically have been allocated to other cache sets; what matters is that, at the present moment, it is not associated with any cache set.
In this embodiment, searching the shared storage space sequentially also improves search efficiency.
Based on the above embodiments, a second embodiment of the dynamically expandable set-associative cache method of the present application is proposed. Referring to FIG. 3, FIG. 3 is a flowchart of the second embodiment of the dynamically expandable set-associative cache method of the present application.
In this embodiment, after step S105, the method further includes:
step S106, if the data written in the temporary cache block is taken out, the association relation between the temporary cache block and the cache group is released.
In this embodiment, if the data written in the temporary cache block is taken out, the temporary data block is changed to a blank cache block again, and at this time, the association relationship between the temporary cache block and the cache group is released, and the temporary cache block is changed back to the shared cache block. Thus, the temporary cache block may be reallocated for use if other cache sets now request expansion of storage.
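Reusing the hypothetical types from the sketch in the first embodiment, this release step might look as follows (again an illustrative assumption, not the patent's actual logic):

    // Once every temporary block of the set is empty again, return the
    // 2-block unit to the shared pool and release the association.
    void release_if_empty(CacheSet& set) {
        if (!set.temp_valid) return;              // nothing attached
        for (uint32_t i = 0; i < 2; ++i)
            if (shared[set.temp_base + i].valid)  // still holds data
                return;
        unit_in_use[set.temp_base / 2] = false;   // unit becomes reusable
        set.temp_valid = false;                   // association released
    }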
Based on the above embodiments, a third embodiment of the dynamically expandable set-associative cache method of the present application is provided. Referring to FIG. 4, FIG. 4 is a flowchart of the third embodiment of the dynamically expandable set-associative cache method of the present application.
It should be noted that, although a logical order is shown in the flow chart, in some cases, the steps shown or described may be performed in an order different than that shown or described herein.
In this embodiment, the method includes:
step S201, determining a second cache group of the data to be accessed in the data cache according to the access address of the received access request;
step S202, searching data to be accessed corresponding to the access request from all cache blocks corresponding to the second cache group;
step S203, if the data is not hit, searching the data to be accessed from a second temporary cache block corresponding to the second cache group;
in step S204, if the hit occurs in the second temporary cache block, the storage data in the second temporary cache block is returned.
It is understood that, in the present embodiment, steps S201 to S204 may be performed after, before, or even in parallel with steps S101 to S105, and this is not limited herein.
Specifically, when the processor needs to access data, it likewise issues a corresponding access address. The corresponding second cache set in the data cache is determined from the access address, and all cache blocks corresponding to the second cache set are then searched to determine whether a cache hit occurs.
Specifically, the cache hit condition is: the valid bit of a cache block in the second cache set, whether it is a cache block of the fixed storage space or a temporary cache block allocated from the shared storage space, is 1, and the Tag field in the cache block equals the Tag field of the access address. The data inside the hit cache block is the required data.
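As a one-line sketch, reusing the CacheBlock type from the earlier sketches, this hit test is simply:

    // Hit: valid bit set and stored tag equal to the tag of the address.
    bool is_hit(const CacheBlock& blk, uint32_t addr_tag_bits) {
        return blk.valid && blk.tag == addr_tag_bits;
    }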
In this embodiment, the cache blocks of the second cache set in the fixed storage space may be accessed first to check for a hit, and then the temporary cache block of the second cache set in the shared storage space may be accessed to check for a hit.
In one embodiment, step S203 includes:
Step S2031, if the data is not hit, judging whether the second cache set is associated with a second temporary cache block;
Step S2032, if it is associated, searching for the data to be accessed in the second temporary cache block.
Specifically, when the shared storage space has allocated temporary storage, i.e. a temporary cache block, to the second cache set, the data to be accessed is searched for in the second temporary cache block.
If no temporary storage space has been allocated to the second cache set in the shared storage space, a cache miss occurs, and the processor reads the data from the next level of storage.
Likewise, if there is no hit in the shared storage space either, a cache miss occurs, and the processor reads the data from the next level of storage.
Based on the above embodiments, a fourth embodiment of the dynamically expandable set-associative cache method of the present application is proposed.
Referring to FIG. 5, in this embodiment, a shared cache block indication table is preset in the data cache. The method includes:
Step S301, determining, according to the access address of a received access request, a second cache set of the data to be accessed in the data cache;
Step S302, searching for the data to be accessed corresponding to the access request in all cache blocks corresponding to the second cache set;
Step S303, if the data is not hit, querying, in the shared cache block indication table, the temporary cache state corresponding to the second cache set;
Step S304, if the temporary cache state is the valid state, determining that the second cache set is associated with a second temporary cache block;
Step S305, if the temporary cache state is the invalid state, determining that the second cache set is not associated with a second temporary cache block;
Step S306, if it is associated, searching for the data to be accessed in the second temporary cache block according to second address information corresponding to the second cache set in the shared cache block indication table;
Step S307, if a hit occurs in the second temporary cache block, returning the data stored in the second temporary cache block.
It is understood that, in the present embodiment, steps S301 to S307 may be performed after, before, or even in parallel with steps S101 to S105, and this is not limited herein.
Specifically, in this embodiment the shared cache block indication table indicates, for each cache set, whether it has cache blocks allocated in the shared storage space. In one example, the number of table entries equals the number of cache sets, i.e. 256 entries, each corresponding to a cache set in order (entry 0 to set 0, entry 1 to set 1, and so on through entry 255 to set 255). Each entry of the shared cache block indication table has a 1-bit valid bit (V field) indicating whether the cache set has a temporary cache block in the shared storage space, i.e. its temporary cache state: a valid bit of 1 means the cache set has an allocated temporary cache block in the shared storage space (the valid state); a valid bit of 0 means no temporary cache block is allocated (the invalid state).
If the valid bit is 1, the Baddr field of the entry (7 bits wide, since there are 128 shared blocks in total) gives the starting address, in the shared storage space, of the 2 cache blocks allocated to the cache set. The set's data can thus be found in the shared space via the Baddr field.
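For illustration, the indication table could be modelled as follows; the field names V and Baddr follow the example above, but the table type itself and its helper are assumptions of this sketch:

    #include <array>
    #include <cstdint>

    // One entry per cache set: 1-bit valid (V) and 7-bit Baddr, enough to
    // address the 128 shared blocks (64 two-block units).
    struct IndicationEntry {
        uint8_t valid : 1;  // 1: the set has temporary blocks allocated
        uint8_t baddr : 7;  // start of its 2 blocks in the shared space
    };

    std::array<IndicationEntry, 256> indication_table{};  // entry k <-> set k

    // Returns where set `index` keeps its temporary blocks, or -1 if none.
    int temp_block_start(uint32_t index) {
        const IndicationEntry& e = indication_table[index];
        return e.valid ? static_cast<int>(e.baddr) : -1;
    }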
It can be understood that, when data is written and a cache set needs to expand its space, a request is sent to the shared storage space management module, which searches sequentially from unit 0 for the first unallocated shared cache block unit. If the search succeeds, it returns an allocation-success flag and the starting address of the allocated space; if all shared space is used up, the request fails. The data cache then updates the shared cache block indication table according to the result and the returned address: on success, the valid bit of the set's entry is set to 1 and Baddr is written with the returned address; otherwise the indication table is not modified.
It can be understood that, if a cache block is released, the shared storage space management module must also be notified. The module judges whether all data blocks in the temporary cache blocks allocated to that cache set are invalid; if so, it sets the valid bit of the set's entry in the indication table to 0, i.e. releases the association between the temporary cache blocks and the cache set.
Alternatively, in a fifth embodiment of the present application, referring to FIG. 6, the method includes:
Step S401, determining, according to the access address of a received access request, a second cache set of the data to be accessed in the data cache;
Step S402, searching for the data to be accessed corresponding to the access request in parallel in all cache blocks corresponding to the second cache set and in the second temporary cache block;
Step S403, if the data is hit, returning the hit stored data.
That is, in this embodiment, to improve efficiency, the data to be accessed corresponding to the access request may be searched for in parallel in all cache blocks corresponding to the second cache set and in the second temporary cache block, as sketched below.
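A sketch of this variant, reusing the hypothetical CacheSet and shared pool from the earlier sketches: in hardware the tag comparators of the fixed ways and the temporary blocks would evaluate in the same cycle, which software can only approximate with one combined pass.

    // Combined lookup over the fixed ways and the temporary blocks of a set.
    // In hardware, all of these tag comparisons happen in parallel.
    const CacheBlock* lookup_all(const CacheSet& set, uint32_t tag) {
        for (const CacheBlock& way : set.fixed)
            if (way.valid && way.tag == tag) return &way;
        if (set.temp_valid)
            for (uint32_t i = 0; i < 2; ++i) {
                const CacheBlock& blk = shared[set.temp_base + i];
                if (blk.valid && blk.tag == tag) return &blk;
            }
        return nullptr;  // miss in both the fixed and the temporary blocks
    }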
To help those skilled in the art better understand the claims, the technical solutions recited in the claims of the present application are explained below through implementation examples in specific application scenarios. It should be understood that the following examples only explain the present application and do not limit the scope of protection of its claims.
In a specific example, the address bit width of the data cache is 32 bits and the cache block width is 4 bytes, so the Offset field is 2 bits wide (a 4-byte block needs 2 offset bits), the index field is 8 bits wide (256 sets), and the Tag field is 22 bits wide.
Its storage structure comprises a fixed storage space of 256 sets of 2-way cache blocks, plus a certain amount of shared storage space, e.g. 128 shared cache blocks, allocated in units of 2 consecutive shared cache blocks, so the whole shared storage space forms 64 allocation units. Each cache set thus has the opportunity to gain 2 temporary cache blocks at a time, i.e. to grow from 2 to 4 cache blocks, depending on the register configuration.
Cache blocks in the fixed storage space and the shared storage space have the same structure, comprising a 1-bit valid (V) field, a 22-bit Tag field and a 32-bit data field.
When reading data, the cache set corresponding to the access address is first found according to the value of the index field of the address.
All cache blocks of that cache set in the fixed storage space are searched. If some cache block has valid set to 1 and its tag equals the tag of the address to be accessed, it is a hit; otherwise the fixed storage space misses. On a hit, the required data is returned according to the offset in the address.
On a fixed storage space miss, the valid bit corresponding to the cache set (also selected by the index field) is looked up in the shared cache block indication table. A value of 1 means space has been allocated to the cache set in the shared storage space; a value of 0 means no temporary cache block has been allocated to it there.
If the set's entry in the shared cache block indication table is valid, the starting address of the set's temporary cache blocks in the shared storage space is found from the Baddr of the entry, all temporary cache blocks of the cache set in the shared storage space are fetched, and the hit test is applied to them. On a hit, the required data is returned according to the offset in the address. Otherwise, the required data is not in the cache, i.e. a cache miss occurs, and the data must be read from the next-level memory.
When a cache miss occurs and data read from the next-level memory must be written into the cache, the Valid bits are used to judge whether the cache set has a free cache block in the fixed storage space. If so, the newly read data block is written into the free position of the fixed storage space and the data is returned according to the offset.
If the cache set has no free cache block in the fixed storage space, it is judged whether the set's valid bit in the shared cache block indication table is 1.
If it is 1, the cache set already has temporary cache blocks in the shared storage space; the data is written into a free block of the set in the shared storage space and returned according to the offset.
If the valid bit is 0, a request is sent to the shared storage space management module, which searches sequentially from unit 0 for the first unallocated shared cache block unit. If the search succeeds, it returns an allocation-success flag and the starting address of the allocated space; if all shared space is used up, the request fails. The data cache updates the shared cache block indication table according to the result and the returned address: on success, the valid bit of the set's entry is set to 1 and Baddr is written with the returned address; otherwise the indication table is not modified.
It is worth mentioning that if the shared storage space has free units, blocks can be allocated to the cache set. If the shared storage space has no free unit, no new shared cache block can be allocated to the cache set; likewise, if the space already allocated to the cache set has no free block, a block is selected for replacement according to the replacement policy (the old data block is written back to the next-level memory and the new block is written into its position), and the data is returned according to the offset.
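Pulling the read path above together into one sketch, reusing the hypothetical helpers from the earlier sketches: fetch_from_next_level is an assumed stand-in for the next-level memory access, and the replacement policy is simplified to "evict way 0".

    // Assumed helper: fetch one 4-byte block from the next-level memory.
    uint32_t fetch_from_next_level(uint32_t addr);

    uint32_t read_data(std::array<CacheSet, 256>& sets, uint32_t addr) {
        const uint32_t idx = addr_index(addr);
        const uint32_t t   = addr_tag(addr);
        CacheSet& set = sets[idx];
        // 1) Search the fixed ways of the set.
        for (CacheBlock& way : set.fixed)
            if (way.valid && way.tag == t) return way.data;      // hit
        // 2) Fixed-space miss: check the temporary blocks. temp_valid and
        //    temp_base play the role of the indication table's V and Baddr.
        if (set.temp_valid)
            for (uint32_t i = 0; i < 2; ++i) {
                CacheBlock& blk = shared[set.temp_base + i];
                if (blk.valid && blk.tag == t) return blk.data;  // hit
            }
        // 3) Cache miss: read from the next level and fill the cache,
        //    expanding the set with shared blocks when possible.
        const uint32_t value = fetch_from_next_level(addr);
        if (!write_data(set, t, value)) {
            // 4) No free block anywhere: replace per policy (simplified to
            //    way 0 here; the evicted block would first be written back).
            set.fixed[0].valid = 1;
            set.fixed[0].tag = t;
            set.fixed[0].data = value;
        }
        return value;
    }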
It can be seen that, in this embodiment, the conventional set-associative cache structure is divided into a fixed storage space and a shared storage space, so that after the fixed storage space of a cache set is used up, additional space can be obtained from the shared storage space. This increases the number of cache blocks in the set and expands its storage space, reducing replacement and improving resource utilization.
Based on the same inventive concept, referring to FIG. 7, the present application further provides a dynamically expandable set-associative cache apparatus, comprising:
a cache set determining module, configured to determine a first cache set from the data cache according to the write address of the data to be cached, wherein the data cache further comprises at least one shared cache block;
a space judgment module, configured to judge whether all cache blocks corresponding to the first cache set have remaining storage space;
a temporary cache determining module, configured to screen out at least one temporary cache block from the at least one shared cache block if no storage space remains, the temporary cache block being a shared cache block that stores no data;
a cache association module, configured to associate the temporary cache block with the first cache set and return the address information of the temporary cache block;
and a data writing module, configured to write the data to be cached into the temporary cache block according to the address information.
It should be noted that, for the embodiments of the dynamically expandable set-associative cache apparatus in this embodiment and the technical effects achieved by them, reference may be made to the embodiments of the dynamically expandable set-associative cache method above; they are not repeated here.
In addition, an embodiment of the present application further provides a computer storage medium in which a data caching program is stored; the data caching program, when executed by a processor, implements the steps of the dynamically expandable set-associative cache method described above, so a detailed description is omitted and the shared beneficial effects are not repeated. For technical details not disclosed in the embodiments of the computer-readable storage medium of the present application, reference is made to the description of the method embodiments. As an example, the program instructions may be deployed to execute on one computing device, or on multiple computing devices at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
It should be noted that the above-described embodiments of the apparatus are merely schematic, where units illustrated as separate components may or may not be physically separate, and components illustrated as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiments of the apparatus provided in the present application, the connection relationship between the modules indicates that there is a communication connection therebetween, which may be specifically implemented as one or more communication buses or signal lines. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present application can be implemented by software plus necessary general-purpose hardware, and certainly can also be implemented by special-purpose hardware including special-purpose integrated circuits, special-purpose CPUs, special-purpose memories, special-purpose components and the like. Generally, functions performed by computer programs can be easily implemented by corresponding hardware, and specific hardware structures for implementing the same functions may be various, such as analog circuits, digital circuits, or dedicated circuits. However, for the present application, the implementation of a software program is more preferable. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, where the computer software product is stored in a readable storage medium, such as a floppy disk, a usb disk, a removable hard disk, a Read-only memory (ROM), a random-access memory (RAM), a magnetic disk or an optical disk of a computer, and includes instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods of the embodiments of the present application.
The above description is only a preferred embodiment of the present application, and not intended to limit the scope of the present application, and all the equivalent structures or equivalent processes that can be directly or indirectly applied to other related technical fields by using the contents of the specification and the drawings of the present application are also included in the scope of the present application.

Claims (10)

1. A dynamically scalable set associative cache method, said method comprising:
determining a first cache group from the data cache according to the write address of the data to be cached; wherein the data buffer further comprises at least one shared buffer block;
judging whether all cache blocks corresponding to the first cache group have residual storage space or not;
if no storage space remains, screening out at least one temporary cache block from at least one shared cache block; the temporary cache block is a shared cache block which does not store data;
associating the temporary cache block to the first cache group, and returning address information of the temporary cache block;
and writing the data to be cached into the temporary cache block according to the address information.
2. The dynamically scalable set associative cache method according to claim 1, wherein said screening out at least one temporary cache block from at least one shared cache block if there is no remaining storage space comprises:
if no storage space remains, sequentially determining the data storage condition of at least one shared cache block;
and taking the first shared cache block which does not store data as the temporary cache block.
3. The dynamically scalable set associative cache method according to claim 2, wherein after writing the data to be cached into the temporary cache block according to the address information, the method further comprises:
and if the data written in the temporary cache block is taken out, removing the association relation between the temporary cache block and the cache group.
4. The dynamically scalable set-associative cache method according to claim 1, further comprising:
determining a second cache group of the data to be accessed in the data cache according to the access address of the received access request;
searching data to be accessed corresponding to the access request from all cache blocks corresponding to the second cache group;
if the data is not hit, searching the data to be accessed from a second temporary cache block corresponding to the second cache group;
and if the hit occurs in the second temporary cache block, returning the storage data in the second temporary cache block.
5. The dynamically scalable set associative cache method according to claim 4, wherein searching for the data to be accessed from a second temporary cache block corresponding to the second cache set if the data is not hit comprises:
if the data is not hit, judging whether the second cache set is associated with the second temporary cache block;
and if the data to be accessed is associated, searching the data to be accessed from the second temporary cache block.
6. The dynamically scalable set associative cache method according to claim 5, wherein a shared cache block indicator table is pre-configured in the data cache;
the determining whether the second cache set is associated with the second temporary cache block includes:
inquiring a temporary cache state corresponding to the second cache group in the shared cache block indication table;
if the temporary cache state is a valid state, determining that the second cache group is associated with the second temporary cache block;
if the temporary cache state is an invalid state, determining that the second cache group is not associated with the second temporary cache block;
the searching the data to be accessed from the second temporary cache block comprises:
and searching the data to be accessed from a second temporary cache block according to second address information corresponding to the second cache group in the shared cache block indication table.
7. The dynamically scalable set associative cache method according to claim 6, wherein said determining a second cache set of data to be accessed in the data cache according to the access address of the received access request further comprises:
searching data to be accessed corresponding to the access request in parallel from all cache blocks and a second temporary cache block corresponding to the second cache group;
and if the data is hit, returning the hit storage data.
8. A dynamically expandable set associative cache apparatus, the apparatus comprising:
the cache group determining module is used for determining a first cache group from the data cache according to the write address of the data to be cached; wherein the data buffer further comprises at least one shared buffer block;
the space judgment module is used for judging whether all cache blocks corresponding to the first cache group have residual storage space or not;
the temporary cache determining module is used for screening out at least one temporary cache block from at least one shared cache block if no storage space remains; the temporary cache block is a shared cache block which does not store data;
the cache association module is used for associating the temporary cache block to the first cache group and returning the address information of the temporary cache block;
and the data writing module is used for writing the data to be cached into the temporary caching block according to the address information.
9. A data caching apparatus, comprising: a processor, a memory, and a data caching program stored in the memory, the data caching program, when executed by the processor, implementing the steps of the dynamically scalable set-associative cache method according to any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon a data caching program, which when executed by a processor implements the dynamically scalable set-associative caching method according to any one of claims 1 to 7.
Priority Applications (1)

Application Number: CN202211068093.2A; Priority Date: 2022-08-31; Filing Date: 2022-08-31; Status: Pending; Title: Dynamically expandable set-associative cache method, apparatus, device and medium

Publications (1)

Publication Number: CN115357196A; Publication Date: 2022-11-18

Family ID: 84005895

Country Status (1)

Country: CN; Document: CN115357196A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116010109A (en) * 2023-02-23 2023-04-25 摩尔线程智能科技(北京)有限责任公司 Cache resource allocation method and device, electronic equipment and storage medium
CN117149781A (en) * 2023-11-01 2023-12-01 中电科申泰信息科技有限公司 Group-associative self-adaptive expansion cache architecture and access processing method thereof
CN117149781B (en) * 2023-11-01 2024-02-13 中电科申泰信息科技有限公司 Group-associative self-adaptive expansion cache architecture and access processing method thereof


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination