CN117389914B - Cache system, cache write-back method, system on chip and electronic equipment - Google Patents

Cache system, cache write-back method, system on chip and electronic equipment

Info

Publication number
CN117389914B
CN117389914B
Authority
CN
China
Prior art keywords
main memory
cache
cache line
bank
information
Prior art date
Legal status
Active
Application number
CN202311695046.5A
Other languages
Chinese (zh)
Other versions
CN117389914A (en)
Inventor
王克行
李健
Current Assignee
Beijing Xiangdixian Computing Technology Co Ltd
Original Assignee
Beijing Xiangdixian Computing Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Xiangdixian Computing Technology Co Ltd
Priority to CN202311695046.5A
Publication of CN117389914A
Application granted
Publication of CN117389914B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F 12/0806 Multiuser, multiprocessor or multiprocessing cache systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F 12/0877 Cache access modes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 15/00 Digital computers in general; Data processing equipment in general
    • G06F 15/76 Architectures of general purpose stored program computers
    • G06F 15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F 15/7807 System on chip, i.e. computer system on a single chip; System in package, i.e. computer system on one or more chips in a single package

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Microelectronics & Electronic Packaging (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The disclosure provides a cache system, a cache write-back method, a system on chip, an electronic component, and an electronic device. In the method, for a read-write request that misses a cache line, a cache controller acquires the current opening and closing information of each bank in main memory, and, combining the LRU principle with the current opening and closing information, selects from a linked list a first target cache line that does not cause a page conflict and writes it back to main memory; the main memory is located in a downstream device of the cache system. This scheme reduces the probability of page conflicts during cache write-back and thereby improves main-memory access efficiency.

Description

Cache system, cache write-back method, system on chip and electronic equipment
Technical Field
The present disclosure relates to the technical field of caches, and in particular to a cache system, a cache write-back method, a system on chip, an electronic component, and an electronic device.
Background
A chip typically contains upstream masters (host side, or upstream devices) and a downstream main memory. An upstream master may be a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a DMA (Direct Memory Access) engine, or a similar component. The downstream main memory is a shared storage unit in the chip, and every upstream master with access rights can issue read and write access requests to it.
In the prior art, dynamic random access memory (DRAM) is generally used as the main memory of a system-on-chip (SOC). Common DRAM types include DDR (Double Data Rate memory) and GDDR (Graphics Double Data Rate memory). To bridge the speed gap between the individual masters and the main memory they access, one or more levels of cache are typically placed between the masters and the main memory. When an upstream master needs to access main memory, it first looks up the cache. On a hit, the operation is performed directly on the cache: a read returns data from the cache, and a write stores data into the cache. On a miss, a new cacheline (cache line) is allocated to the current read/write operation, which involves a cacheline replacement, i.e., writing the original contents of a cacheline back to main memory and assigning the freed cacheline to the current read/write operation.
The prior art generally replaces cachelines based on a least recently used (LRU) policy, i.e., it writes back the data of the least recently accessed cacheline (the cacheline pointed to by the tail of the linked list). Although the LRU policy helps improve the cache hit rate, the addresses written back are effectively random from the main memory's point of view, so the address sent to main memory this time and the address sent last time may target different rows of the same bank, causing a page conflict. Therefore, simply replacing cachelines to main memory according to the LRU principle may cause a large number of page conflicts and degrade main-memory access efficiency.
Disclosure of Invention
The purpose of the present disclosure is to provide a cache system, a cache write-back method, a system on chip, an electronic component, and an electronic device, which help reduce the probability of page conflicts during cache write-back and thereby improve main-memory access efficiency.
According to one aspect of the present disclosure, there is provided a cache system including a cache controller and a cache, the cache including a plurality of cache lines; the cache system stores the historical access order of the cache lines through a linked list and updates that order based on the least recently used (LRU) principle.
For a read-write request that misses a cache line, the cache controller is configured to: acquire the current opening and closing information of each bank in main memory, and, combining the LRU principle with the current opening and closing information, select from the linked list a first target cache line that does not cause a page conflict and write it back to main memory; the main memory is located in a downstream device of the cache system.
In a possible implementation of the present disclosure, where the downstream device is capable of tracking the current opening and closing information of each bank in main memory, the cache controller is specifically configured to: acquire the current opening and closing information of each bank in main memory pulled from the downstream device.
In a possible implementation of the present disclosure, the cache system further includes a monitoring module configured to: monitor the read-write requests sent to main memory by the upstream devices of the cache system, and generate and maintain the opening and closing information of each bank in main memory according to the address information carried by those requests and a preconfigured mapping rule between address information and main memory granules.
In this case, to acquire the current opening and closing information of each bank in main memory, the cache controller is specifically configured to: acquire the opening and closing information of each bank in main memory generated and maintained by the monitoring module.
In a possible implementation of the present disclosure, a mapping rule between the address information carried by read-write requests sent to main memory by upstream devices of the cache system and the main memory granules is preconfigured in the cache system; the opening and closing information includes the open/closed state of each bank in main memory and, for each open bank, the row that is currently open. When combining the LRU principle with the current opening and closing information to select from the linked list a first target cache line that does not cause a page conflict, the cache controller is specifically configured to: starting from the tail of the linked list, select as the first target cache line the cache line that first satisfies a first requirement or a second requirement.
The first requirement is: the address information carried by the read-write request corresponding to the cache line maps, according to the mapping rule, to a main memory granule (the candidate granule), and the current opening and closing information shows that both the bank and the row containing the candidate granule are open. The second requirement is: the address information carried by the read-write request corresponding to the cache line maps, according to the mapping rule, to a candidate granule, and the current opening and closing information shows that the bank containing the candidate granule is closed.
In a possible implementation of the present disclosure, the first requirement has a higher priority than the second requirement.
In a possible implementation of the present disclosure, when selecting, starting from the tail of the linked list, the first target cache line that first satisfies the first requirement or the second requirement, the cache controller is specifically configured to: starting from the tail of the linked list, determine whether a cache line satisfying the first requirement exists; if so, select as the first target cache line the cache line that first satisfies the first requirement; if not, starting from the tail of the linked list, select as the first target cache line the cache line that first satisfies the second requirement.
In a possible implementation of the present disclosure, the mapping rule between address information and main memory granules includes: using a first preset bit field and a second preset bit field in the address information to indicate, respectively, the row information and the bank information in main memory of the main memory granule to which the address information maps.
In a possible implementation of the present disclosure, the mapping rule between address information and main memory granules further includes: using a third preset bit field, a fourth preset bit field, and a fifth preset bit field in the address information to indicate, respectively, the column information of the mapped main memory granule, the interleaving mode of the main memory, and the position at which the bank information of the mapped granule is inserted into the row information or the column information; and/or using a sixth preset bit field in the address information to indicate whether the main memory enables the bank group rotation (alternating bank access) mode.
In a possible implementation of the present disclosure, the linked list includes a plurality of entries, and when selecting as the first target cache line the cache line that first satisfies the first requirement or the second requirement, the cache controller is specifically configured to: select the first target cache line within a preset entry window range.
In a possible implementation of the present disclosure, the cache controller is further configured to: if it is determined that no first target cache line exists, select a second target cache line to write back to main memory, the second target cache line being the cache line currently at the tail of the linked list.
According to another aspect of the present disclosure, a cache write-back method is provided and applied to a cache system, where the cache system includes a cache controller and a cache, and the cache includes a plurality of cache lines; the cache system stores the historical access order of the cache lines through a linked list and updates that order based on the least recently used (LRU) principle. The method includes the following steps:
For a read-write request that misses a cache line, the cache controller acquires the current opening and closing information of each bank in main memory, and, combining the LRU principle with the current opening and closing information, selects from the linked list a first target cache line that does not cause a page conflict and writes it back to main memory; the main memory is located in a downstream device of the cache system.
In a possible implementation of the present disclosure, the acquiring, by the cache controller, of the current opening and closing information of each bank in main memory includes: where the downstream device is capable of tracking the current opening and closing information of each bank in main memory, the cache controller acquires the current opening and closing information of each bank in main memory pulled from the downstream device.
In a possible implementation of the present disclosure, the cache system further includes a monitoring module, and the method further includes: the monitoring module monitors the read-write requests sent to main memory by the upstream devices of the cache system, and generates and maintains the opening and closing information of each bank in main memory according to the address information carried by those requests and a preconfigured mapping rule between address information and main memory granules.
Correspondingly, the acquiring of the current opening and closing information of each bank in main memory includes: the cache controller acquires the opening and closing information of each bank in main memory generated and maintained by the monitoring module.
In a possible implementation of the present disclosure, a mapping rule between the address information carried by read-write requests sent to main memory by upstream devices of the cache system and the main memory granules is preconfigured in the cache system; the opening and closing information includes the open/closed state of each bank in main memory and, for each open bank, the row that is currently open. The selecting, by combining the LRU principle with the current opening and closing information, of a first target cache line from the linked list that does not cause a page conflict includes: starting from the tail of the linked list, selecting as the first target cache line the cache line that first satisfies a first requirement or a second requirement.
The first requirement is: the address information carried by the read-write request corresponding to the cache line maps, according to the mapping rule, to a main memory granule (the candidate granule), and the current opening and closing information shows that both the bank and the row containing the candidate granule are open. The second requirement is: the address information carried by the read-write request corresponding to the cache line maps, according to the mapping rule, to a candidate granule, and the current opening and closing information shows that the bank containing the candidate granule is closed.
In a possible implementation of the present disclosure, the first requirement has a higher priority than the second requirement.
In a possible implementation of the present disclosure, the selecting, starting from the tail of the linked list, of the first target cache line that first satisfies the first requirement or the second requirement includes: starting from the tail of the linked list, determining whether a cache line satisfying the first requirement exists; if so, selecting as the first target cache line the cache line that first satisfies the first requirement; if not, starting from the tail of the linked list, selecting as the first target cache line the cache line that first satisfies the second requirement.
In a possible implementation of the present disclosure, the mapping rule between address information and main memory granules includes: using a first preset bit field and a second preset bit field in the address information to indicate, respectively, the row information and the bank information in main memory of the main memory granule to which the address information maps.
In a possible implementation of the present disclosure, the mapping rule between address information and main memory granules further includes: using a third preset bit field, a fourth preset bit field, and a fifth preset bit field in the address information to indicate, respectively, the column information of the mapped main memory granule, the interleaving mode of the main memory, and the position at which the bank information of the mapped granule is inserted into the row information or the column information; and/or using a sixth preset bit field in the address information to indicate whether the main memory enables the bank group rotation (alternating bank access) mode.
In a possible implementation of the present disclosure, the linked list includes a plurality of entries, and the selecting as the first target cache line of the cache line that first satisfies the first requirement or the second requirement includes: selecting the first target cache line within a preset entry window range.
In a possible implementation of the present disclosure, the method further includes: where it is determined that no first target cache line exists, the cache controller selects a second target cache line to write back to main memory, the second target cache line being the cache line currently at the tail of the linked list.
According to another aspect of the present disclosure, there is also provided a system on chip (SOC) including the above cache system. In some use cases, the SOC takes the form of a GPU (Graphics Processing Unit) SOC; in other use cases, the SOC takes the form of a CPU (Central Processing Unit) SOC.
According to another aspect of the present disclosure, there is also provided an electronic component including the SOC described in any of the above embodiments. In some use cases, the electronic component takes the form of a graphics card; in other use cases, it takes the form of a CPU motherboard.
According to another aspect of the present disclosure, there is also provided an electronic device including the above electronic component. In some use cases, the electronic device is a portable electronic device such as a smartphone, a tablet computer, or a VR device; in other use cases, it is a personal computer, a game console, or the like.
Drawings
FIG. 1 is a schematic diagram of a linked list of one embodiment of the present disclosure;
FIG. 2 is a first schematic structural diagram of a cache system according to an embodiment of the present disclosure;
FIG. 3 is a second schematic structural diagram of a cache system according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of the mapping between address information and a main memory granule address according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a preset entry window range of one embodiment of the present disclosure;
FIG. 6 is a flow chart illustrating a method of cache write back according to one embodiment of the disclosure.
Detailed Description
Before describing embodiments of the present disclosure, it should be noted that:
Some embodiments of the present disclosure are described as process flows; although the operational steps of a flow may be numbered sequentially, they may also be performed in parallel, concurrently, or simultaneously.
The terms "first," "second," and the like may be used in embodiments of the present disclosure to describe various features, but these features should not be limited by these terms. These terms are only used to distinguish one feature from another.
The term "and/or," "and/or" may be used in embodiments of the present disclosure to include any and all combinations of one or more of the associated features listed.
It will be understood that when two elements are described as connected or in communication, unless a direct connection or direct communication is explicitly stated, the connection or communication may be either direct or indirect via intermediate elements.
To make the technical solutions and advantages of the embodiments of the present disclosure clearer, exemplary embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present disclosure. It should be noted that, where there is no conflict, the embodiments of the present disclosure and the features therein may be combined with one another.
The purpose of the present disclosure is to provide a cache write-back scheme that reduces the probability of page conflicts when the cache writes back to main memory, thereby improving main-memory access efficiency and overall chip performance. The disclosure is described below.
First, a brief description will be given of some concepts related to the present disclosure.
The upstream device of the cache system may be a master device such as a CPU core or a DMA engine. The downstream device of the cache system may be a shared storage unit located in the chip, in which the main memory resides. Each upstream device with access rights can initiate read-write access requests to the downstream main memory.
The cache system may be connected to the upstream devices and the downstream device through buses; accordingly, the address information generated during their interaction is an address conforming to the corresponding bus protocol. For example, if an AXI bus is used, the address information generated during the interaction is an AXI address. It can be understood that the cache system may also be connected to the upstream and downstream devices through other buses, in which case the address information generated during their interaction conforms to the corresponding other bus protocol.
For the downstream device, DRAM is typically used as the main memory, which may be DDR, GDDR, LPDDR, or the like. The memory array of a DRAM is divided into rows (row), columns (col), and banks (bank). Commands (e.g., AXI commands) accessing a DRAM (e.g., GDDR) are address-mapped so as to access different banks and different rows of the GDDR.
Address mapping means that the address information used by an upstream device to access the downstream DRAM, which conforms to the bus protocol, is mapped by the main memory, based on an address mapping rule, to the address of a memory granule inside the DRAM.
Since a row corresponds to a page at the software level, when the current command and the subsequent command access the same row of the same bank of the GDDR, this is also called a page hit, and access efficiency is highest; when the two commands access different rows of the same bank, this is called a page conflict, and access efficiency is lowest. For this reason, page conflicts should be avoided as much as possible when accessing the main memory of the downstream device.
A cache system typically includes a cache controller (Cache Controller) and a cache (Cache). The cache is divided into multiple cache sets, and each cache set contains multiple cache lines (cacheline). It should be noted that, in the embodiments of the present disclosure, the cache may be a single-level cache or a multi-level cache.
In the embodiments of the present disclosure, when an upstream device needs to read or write data in the main memory of the downstream device, it initiates a read/write request carrying address information.
As in the prior art, the read/write request is first received by the cache system. The cache controller determines the cache set corresponding to the request based on preset low-order bits of the carried address information (the index in cache addressing), and then searches the cache lines of that set based on preset high-order bits of the address information (the tag in cache addressing). If the tags match, the cache line is hit and the read/write operation can be performed directly on it; otherwise, the read/write request misses the cache line.
For a read/write request that misses the cache line, the cache controller performs a cache write-back operation: it selects at least one cache line from the cache set determined by the address information carried by the read/write request (for convenience of description, this address information is referred to as the first address information, which may be, for example, an AXI address), writes the contents of the selected cache line back to main memory, and then allocates the freed cache line to the read/write request.
Writing the contents of the selected cache line back to main memory means reverse-addressing the offset, tag, and index of the cache line within its cache set to obtain the first address information corresponding to the cache line. Addressing with this first address information means mapping it, based on the mapping rule between the address information carried by read-write requests sent by upstream devices to the downstream main memory and the main memory granules, to the main memory granule corresponding to a row of a certain bank in main memory, and then storing the current contents of the selected cache line in that granule.
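As a concrete illustration, the sketch below reconstructs a write-back address from a line's tag and its set index under an assumed tag/index/offset split; the field widths and the function name are illustrative assumptions, not values taken from the disclosure.

```cpp
#include <cstdint>

// Assumed field widths for illustration only: a 64-byte cache line and 16 sets.
constexpr unsigned kOffsetBits = 6;
constexpr unsigned kIndexBits  = 4;

// Reverse addressing: tag + set index -> base address (first address information)
// of the line's data in main memory; the offset within the line is zero here.
uint64_t rebuild_writeback_address(uint64_t tag, uint64_t set_index) {
    return (tag << (kIndexBits + kOffsetBits)) | (set_index << kOffsetBits);
}
```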
Of course, the above addressing of cache sets and cache lines based on address information, and the addressing of main memory granules based on address information, are conventional techniques and are described here only as examples; the specific addressing processes are not detailed. In addition, other cache-set grouping and/or addressing methods in the prior art are also applicable to the present disclosure, as are other addressing methods for main memory granules.
In the above process, a corresponding linked list is generally maintained for each cache set to record the historical access order of the cache lines in that set. Specifically, the linked list includes a plurality of entries, whose head and tail dynamically indicate specific cache lines: the cache line indicated by the head entry is the most recently used of all cache lines tracked by the linked list, and the cache line indicated by the tail entry is the least recently used. As shown in FIG. 1, assume the cache set includes 16 cachelines; correspondingly, the linked list has 16 entries numbered entry0 to entry15 from head to tail (shown as 0 to 15 in the figure). Based on the LRU principle, the most recently accessed cache line is always at the head entry (entry0); if a cache line at a non-head entry is accessed, the cache lines originally in front of it each move back by one entry, e.g., the cacheline at the head moves from entry0 to entry1, the cacheline at entry1 moves to entry2, and so on.
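The behavior of this per-set linked list can be sketched in software as follows; this is only an analogy of the hardware structure, with illustrative names (LruList, touch) and 16 ways assumed as in FIG. 1.

```cpp
#include <cstddef>
#include <list>

// Software analogy of the per-set LRU linked list: the front of the list plays the
// role of entry0 (most recently used), the back plays the role of the tail entry.
class LruList {
public:
    explicit LruList(std::size_t ways) {
        for (std::size_t i = 0; i < ways; ++i) order_.push_back(i);
    }
    // Accessing a line moves it to the head; the lines that were ahead of it each
    // shift one entry toward the tail, as described for entry0/entry1 above.
    void touch(std::size_t line) {
        order_.remove(line);
        order_.push_front(line);
    }
    // The line currently recorded by the tail entry, i.e. the LRU victim candidate.
    std::size_t lru() const { return order_.back(); }
private:
    std::list<std::size_t> order_;
};
```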
When a cache line needs to be selected from a cache set for write-back to main memory, the prior art generally replaces a cacheline based on the LRU policy, i.e., it selects at least one cache line from the set and writes back the data of the least recently accessed cacheline (the cacheline pointed to by the tail of the linked list). Although the LRU policy helps improve the cache hit rate, the address information written back is effectively random from the main memory's point of view, so the address sent to main memory this time and the address sent last time may well target different rows of the same bank and cause a page conflict. Therefore, simply replacing cachelines to main memory according to the LRU principle may cause a large number of page conflicts and degrade main-memory access efficiency.
To solve the above problems, referring to FIG. 2, an embodiment of the present disclosure proposes a cache system that includes a cache controller (Cache Controller), a cache (Cache), and, where necessary, other components such as registers. The cache system is connected to the upstream devices and to the main memory of the downstream device, for example via an AXI bus.
The historical access records of the cache lines belonging to the same cache set are kept in the linked list corresponding to that set, and the cache system updates the cache line recorded by each entry of the linked list based on the LRU principle, thereby updating the historical access order of the cache lines.
In the embodiments of the present disclosure, after a read-write request from upstream misses the cache line, in order to avoid subsequent page conflicts as much as possible, the cache controller is configured to: acquire the current opening and closing information of each bank in the main memory of the downstream device, and, combining the LRU principle with the current opening and closing information of each bank, select from the linked list corresponding to the cache set addressed by the request's address information a first target cache line that does not cause a page conflict, and write it back to main memory.
In the above process, because the LRU principle is still considered when selecting the cacheline to write back, the system's access efficiency is at least no lower than in the prior art. In addition, because the current opening and closing information of each bank is also considered, a first target cache line that does not cause a page conflict can be selected whenever possible, i.e., no other row of the bank containing the main memory granule mapped by that cache line is currently open, so page conflicts can be avoided and access efficiency improved.
As for the manner of obtaining the current opening and closing information of each bank in main memory: in some embodiments, the downstream device itself may track this information, in which case the cache controller can pull the current opening and closing information of each bank provided by the downstream device; in hardware, the corresponding signals are routed to the cache system. Alternatively, the cache controller may pull the current opening and closing information of each bank provided by the downstream device directly through a bus interface, such as an AXI bus.
In other embodiments, if the downstream device cannot track the current opening and closing information of each bank in main memory, or the cache system is provided with a dedicated module that monitors this information, for example a monitoring module dram_status_monitor, the monitoring module can provide the opening and closing information of each bank in the downstream main memory to the cache controller.
For example, referring to FIG. 3, the cache system further includes a monitoring module.
The monitoring module is configured to: monitor the read-write requests sent to main memory by the upstream devices, and generate and maintain the opening and closing information of each bank in main memory according to the address information carried by each request and the preconfigured mapping rule between address information and main memory granules.
The read-write requests monitored by the monitoring module may be the historical read-write requests issued within a preset time window ending at the current time, or a preset number of the most recent historical read-write requests.
As mentioned above, a read-write request carries address information. In the embodiments of the present disclosure, the mapping rule between the address information carried by read-write requests and the main memory granules needs to be preconfigured in the cache system, for example by programming registers.
The mapping rule must comply with the communication protocol of the bus connecting the upstream and downstream devices and with the address-mapping configuration of the main memory. For example, the mapping rule between address information and main memory granules may include a Param_row_bits field and a Param_bank_bits field.
The Param_row_bits field indicates that a first preset bit field of the upstream address information carries the row information, e.g., the row address, of the main memory granule to which the address maps; the Param_bank_bits field indicates that a second preset bit field of the upstream address information carries the bank information, e.g., the bank number, of the mapped granule in main memory.
That is, through the correspondence described by these fields, a piece of address information can be mapped to the address of a main memory granule, thereby mapping the address information to the corresponding granule.
Further, in some embodiments, the main memory may use an interleaving mode such as row interleaving or column interleaving. Row interleaving means that the bank information of the main memory granule is inserted into the granule's row address; column interleaving means that the bank information is inserted into the granule's column address. In this case, the mapping rule between address information and main memory granules may further include a Param_col_bits field, a Param_interleave_mode field, and a Param_bank_start_bit field.
The Param_col_bits field indicates that a third preset bit field of the upstream address information carries the column information, e.g., the column address, of the mapped main memory granule; the Param_interleave_mode field indicates the interleaving mode of the main memory, which may be row interleaving, column interleaving, or no interleaving; the Param_bank_start_bit field indicates the exact position within the row-address or column-address field at which the bank information of the mapped granule is inserted.
In addition, in some embodiments, the main memory may enable a bank-alternating access mode, i.e., bank group rotation mode, in which all banks are divided into different bank groups; when the mode is enabled, two consecutive addresses access different bank groups, reducing access time. Correspondingly, the mapping rule may further include a sixth preset bit field in the address information indicating whether the main memory enables the bank-alternating access mode.
Illustratively, assume the address information is an AXI address; FIG. 4 shows the mapping between the AXI address and the main memory granule address in non-interleaved mode.
If row interleaving is enabled, the original bank address is inserted into the row address; if column interleaving is enabled, the original bank address is inserted into the column address. The exact position within the row or column address is determined by Param_bank_start_bit; for example, if the parameter is 2, the 4 bank bits may be inserted at bits two through five of the original row address.
Of course, it should be noted that the mapping rules shown above are merely examples; other mapping rules consistent with the implementation requirements of the present disclosure are equally applicable.
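For illustration, a sketch of decoding an upstream address into (bank, row, column) coordinates under register-configured field widths is given below; the non-interleaved bit layout {row | bank | column | byte offset}, the offset_bits member, and the structure and function names are assumptions. With row or column interleaving enabled, the bank bits would instead be extracted from inside the row or column field at the position given by Param_bank_start_bit.

```cpp
#include <cstdint>

// Register-configured mapping rule; names mirror the fields described above.
struct MappingRule {
    unsigned col_bits;     // Param_col_bits
    unsigned bank_bits;    // Param_bank_bits
    unsigned row_bits;     // Param_row_bits
    unsigned offset_bits;  // assumed byte-offset width below the column field
};

struct DramCoord {
    uint32_t bank;
    uint32_t row;
    uint32_t col;
};

// Decode assuming the non-interleaved layout {row | bank | column | offset}.
DramCoord map_address(uint64_t addr, const MappingRule& r) {
    DramCoord c{};
    uint64_t a = addr >> r.offset_bits;
    c.col  = static_cast<uint32_t>(a & ((1ull << r.col_bits)  - 1));  a >>= r.col_bits;
    c.bank = static_cast<uint32_t>(a & ((1ull << r.bank_bits) - 1));  a >>= r.bank_bits;
    c.row  = static_cast<uint32_t>(a & ((1ull << r.row_bits)  - 1));
    return c;
}
```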
Based on the mapping rule preconfigured in any of the above implementations, the monitoring module can map the address information carried by the monitored read-write requests to the corresponding main memory granules, and thus determine which rows of which banks of main memory those requests access. Note that when a main memory granule is accessed, the bank and row containing it are opened; correspondingly, after the monitoring module determines which rows of which banks are accessed by the monitored requests, it can record the state of each corresponding bank and thereby generate the opening and closing information of each bank in main memory.
Optionally, the opening and closing information of each bank in main memory may include: the open/closed state of each bank and, for each open bank, the information (e.g., the row address) of its currently open row.
Alternatively, in some embodiments, the monitoring module may maintain the opening and closing information of the banks in the form of a table. Assuming the main memory includes 16 banks, the monitoring module may generate a table with one entry per bank, of the form illustrated below.
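The table can be pictured as one record per bank; the structure below is an illustrative sketch (field and function names are assumptions), showing how each monitored request, once mapped to a (bank, row) pair, would update the record for that bank.

```cpp
#include <array>
#include <cstdint>

// One record per bank: whether a row is open, and which row it is.
struct BankState {
    bool     open     = false;  // open/closed state of the bank
    uint32_t open_row = 0;      // currently open row; meaningful only when open == true
};

// Illustrative table for a main memory with 16 banks.
std::array<BankState, 16> bank_table;

// Called for every monitored read/write request after its address has been mapped
// to a (bank, row) coordinate with the configured mapping rule.
void on_monitored_access(uint32_t bank, uint32_t row) {
    bank_table[bank].open     = true;  // the access opens (or keeps open) this bank
    bank_table[bank].open_row = row;   // a new row replaces any previously open row
}
```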
Of course, after generating the opening and closing information of each bank in main memory, the monitoring module can also maintain and update it according to the read-write requests monitored subsequently, so that the recorded information stays as consistent as possible with the actual state of each bank in main memory.
Accordingly, where the monitoring module generates and maintains the opening and closing information of each bank in main memory, the cache controller may be specifically configured to: acquire the opening and closing information generated and maintained by the monitoring module, so that it can subsequently combine the LRU principle with the current opening and closing information of each bank and select from the linked list a first target cache line that does not cause a page conflict to write back to main memory.
The following describes how the cache controller, combining the LRU principle with the current opening and closing information of each bank, selects from the linked list a first target cache line that does not cause a page conflict.
In general, starting from the tail of the linked list corresponding to the cache set addressed by the address information of the current missing read-write request, the cache controller selects as the first target cache line the cache line that first satisfies the first requirement or the second requirement; a cache line satisfying either requirement is one whose write-back to main memory will not cause a page conflict.
The first requirement is: the address information carried by the read-write request corresponding to the cache line maps, according to the mapping rule, to a main memory granule (the candidate granule), and the current opening and closing information of each bank shows that both the bank and the row containing the candidate granule are open. The second requirement is: the address information maps, according to the mapping rule, to a candidate granule, and the current opening and closing information shows that the bank containing the candidate granule is closed.
A cache line satisfies the first requirement when, after its offset, tag, and index are reverse-addressed to obtain the address information carried by the read-write request for that line (for brevity, the second address), the second address is mapped to the corresponding main memory granule (the candidate granule) according to the mapping rule described above, and the current opening and closing information of each bank shows that both the bank and the row containing the candidate granule are open.
A cache line satisfies the second requirement when, after its offset, tag, and index are reverse-addressed to obtain the second address, the second address is mapped to the candidate granule according to the mapping rule, and the current opening and closing information of each bank shows that the bank containing the candidate granule is closed.
In some embodiments, the first requirement may have a higher priority than the second requirement.
On the premise that the first requirement has a higher priority than the second requirement, in some embodiments the cache controller, when selecting from the tail of the linked list the first target cache line that first satisfies the first requirement or the second requirement, is specifically configured to: starting from the tail of the linked list, for each cache line, determine whether it satisfies the first requirement; if not, determine whether it satisfies the second requirement, and then move on to the next cache line.
In addition, as is well known to those skilled in the art, when main memory is accessed, the access efficiency when the bank and row containing the target granule are both open is higher than the access efficiency when the bank containing the target granule is closed, which in turn is higher than the access efficiency when the bank is open but a different row is open. To preserve access efficiency as much as possible, in some embodiments, on the premise that the first requirement has a higher priority than the second requirement, the cache controller preferentially selects a cache line whose mapped bank and row are both open, and only considers a cache line satisfying the second requirement when no cache line satisfies the first requirement.
Based on this, when selecting from the tail of the linked list the first target cache line that first satisfies the first requirement or the second requirement, the cache controller is specifically configured to: starting from the tail entry of the linked list, determine whether a cache line satisfying the first requirement exists; if so, select as the first target cache line the cache line that first satisfies the first requirement; if not, starting from the tail of the linked list, select as the first target cache line the cache line that first satisfies the second requirement.
To further preserve access efficiency, the selection range used by the cache controller when choosing the first target cache line may be limited, so that an overly large search range does not itself degrade main-memory access efficiency.
Based on this, when the cache controller selects the first target cache line using any of the above implementations, in some embodiments it may be specifically configured to: starting from the tail entry of the linked list, select within a preset entry window range the cache line that first satisfies the first requirement or the second requirement as the first target cache line. For example, as shown in FIG. 5, the preset entry window range may be 3, i.e., the cache controller always chooses within the window from entry15 to entry13 when selecting the first target cache line.
Of course, in some embodiments no cache line may satisfy the first requirement or the second requirement; in that case a page conflict cannot be avoided, and the cache controller simply falls back to the original LRU principle and selects the cache line currently at the tail of the linked list as a second target cache line to write back to main memory.
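Putting the selection rules together, the sketch below shows one possible victim search, reusing the DramCoord and bank_table types from the earlier sketches; the window size, names, and exact fallback are illustrative assumptions, not a definitive implementation of the claims.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

constexpr std::size_t kWindowSize = 3;  // e.g. entry15..entry13 as in FIG. 5

struct Candidate {
    std::size_t line;   // index of the cache line within the set
    DramCoord   coord;  // (bank, row, col) of its reconstructed write-back address
};

// `from_tail` lists candidates starting at the tail entry and moving toward the head.
std::size_t select_victim(const std::vector<Candidate>& from_tail) {
    const std::size_t n = std::min(kWindowSize, from_tail.size());
    // First requirement: the mapped bank and row are already open (page hit).
    for (std::size_t i = 0; i < n; ++i) {
        const auto& c = from_tail[i];
        const auto& b = bank_table[c.coord.bank];
        if (b.open && b.open_row == c.coord.row) return c.line;
    }
    // Second requirement: the mapped bank is closed (no page conflict possible).
    for (std::size_t i = 0; i < n; ++i) {
        const auto& c = from_tail[i];
        if (!bank_table[c.coord.bank].open) return c.line;
    }
    // Neither requirement is met: fall back to the plain LRU tail (second target line).
    return from_tail.front().line;
}
```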
In addition, based on a similar inventive concept, an embodiment of the present disclosure further provides a cache write-back method applied to a cache system. The cache system includes a cache controller and a cache, and the cache includes a plurality of cache lines; the cache system stores the historical access order of the cache lines through a linked list and updates that order based on the least recently used (LRU) principle.
Referring to FIG. 6, the cache write-back method includes the following steps.
S110: for the read-write request of the missed cache line, the cache controller acquires current opening and closing information of each storage bank in the main memory, and selects a first target cache line which does not cause page conflict from the linked list to write back to the main memory by combining the LRU principle and the current opening and closing information.
The main memory is located in a downstream device of the cache system.
In a possible implementation of the present disclosure, the acquiring, by the cache controller, of the current opening and closing information of each bank in main memory includes: where the downstream device is capable of tracking the current opening and closing information of each bank in main memory, the cache controller acquires the current opening and closing information of each bank in main memory pulled from the downstream device.
In a possible implementation of the present disclosure, the cache system further includes a monitoring module, and the method further includes: the monitoring module monitors the read-write requests sent to main memory by the upstream devices of the cache system, and generates and maintains the opening and closing information of each bank in main memory according to the address information carried by those requests and a preconfigured mapping rule between address information and main memory granules.
Correspondingly, the acquiring of the current opening and closing information of each bank in main memory includes: the cache controller acquires the opening and closing information of each bank in main memory generated and maintained by the monitoring module.
In a possible implementation of the present disclosure, a mapping rule between the address information carried by read-write requests sent to main memory by upstream devices of the cache system and the main memory granules is preconfigured in the cache system; the opening and closing information includes the open/closed state of each bank in main memory and, for each open bank, the row that is currently open. The selecting, by combining the LRU principle with the current opening and closing information, of a first target cache line from the linked list that does not cause a page conflict includes: starting from the tail of the linked list, selecting as the first target cache line the cache line that first satisfies a first requirement or a second requirement.
The first requirement is: the address information carried by the read-write request corresponding to the cache line maps, according to the mapping rule, to a main memory granule (the candidate granule), and the current opening and closing information shows that both the bank and the row containing the candidate granule are open. The second requirement is: the address information carried by the read-write request corresponding to the cache line maps, according to the mapping rule, to a candidate granule, and the current opening and closing information shows that the bank containing the candidate granule is closed.
In a possible implementation of the present disclosure, the first requirement has a higher priority than the second requirement.
In a possible implementation of the present disclosure, the selecting, starting from the tail of the linked list, of the first target cache line that first satisfies the first requirement or the second requirement includes: starting from the tail of the linked list, determining whether a cache line satisfying the first requirement exists; if so, selecting as the first target cache line the cache line that first satisfies the first requirement; if not, starting from the tail of the linked list, selecting as the first target cache line the cache line that first satisfies the second requirement.
In a possible implementation of the present disclosure, the mapping rule between address information and main memory granules includes: using a first preset bit field and a second preset bit field in the address information to indicate, respectively, the row information and the bank information in main memory of the main memory granule to which the address information maps.
In a possible implementation of the present disclosure, the mapping rule between address information and main memory granules further includes: using a third preset bit field, a fourth preset bit field, and a fifth preset bit field in the address information to indicate, respectively, the column information of the mapped main memory granule, the interleaving mode of the main memory, and the position at which the bank information of the mapped granule is inserted into the row information or the column information; and/or using a sixth preset bit field in the address information to indicate whether the main memory enables the bank group rotation (alternating bank access) mode.
In a possible implementation of the present disclosure, the linked list includes a plurality of entries, and the selecting as the first target cache line of the cache line that first satisfies the first requirement or the second requirement includes: selecting the first target cache line within a preset entry window range.
In a possible implementation of the present disclosure, the method further includes: where it is determined that no first target cache line exists, the cache controller selects a second target cache line to write back to main memory, the second target cache line being the cache line currently at the tail of the linked list.
In addition, an embodiment of the present disclosure further provides an SOC including the cache system of any of the above embodiments. In some use cases, the SOC takes the form of a GPU (Graphics Processing Unit) SOC; in other use cases, the SOC takes the form of a CPU (Central Processing Unit) SOC.
In addition, an embodiment of the present disclosure further provides an electronic component including the SOC described in any of the above embodiments. In some use cases, the electronic component takes the form of a graphics card; in other use cases, it takes the form of a CPU motherboard.
In addition, an embodiment of the present disclosure further provides an electronic device including the above electronic component. In some use cases, the electronic device is a portable electronic device such as a smartphone, a tablet computer, or a VR device; in other use cases, it is a personal computer, a game console, a workstation, a server, or the like.
While the preferred embodiments of the present disclosure have been described, those skilled in the art may make additional changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all such changes and modifications that fall within the scope of the disclosure.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present disclosure without departing from the spirit or scope of the disclosure. Thus, the present disclosure is intended to include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (21)

1. A cache system comprising a cache controller and a cache, the cache comprising a plurality of cache lines; the cache system stores a historical access sequence of the cache lines through a linked list, and updates the historical access sequence based on the least recently used (LRU) principle;
for a read-write request that misses a cache line, the cache controller is configured to: acquire current opening and closing information of each bank in a main memory, and select, from the linked list, a first target cache line that does not cause a page conflict to write back to the main memory, by combining the LRU principle with the current opening and closing information; the main memory is located in a downstream device of the cache system;
a mapping rule between the main memory granule and the address information carried by read-write requests sent to the main memory by an upstream device is preconfigured in the cache system; the opening and closing information comprises the open or closed state of each bank in the main memory and, for each bank in the open state, the corresponding opened row; when selecting the first target cache line, the cache controller is specifically configured to: starting from the tail of the linked list, select the cache line that first meets a first requirement or a second requirement as the first target cache line;
the first requirement includes: the address information carried by the read-write request corresponding to a cache line maps, according to the mapping rule, to a corresponding main memory granule as a candidate granule, and the current opening and closing information indicates that both the bank and the row where the candidate granule is located in the main memory are in the open state; the second requirement includes: the address information carried by the read-write request corresponding to a cache line maps, according to the mapping rule, to a corresponding main memory granule as a candidate granule, and the current opening and closing information indicates that the bank where the candidate granule is located in the main memory is in the closed state.
2. The cache system according to claim 1, wherein, in the case that the downstream device has the capability of counting the current opening and closing information of each bank in the main memory, the cache controller is specifically configured to: acquire the current opening and closing information of each bank in the main memory pulled from the downstream device.
3. The cache system of claim 1, further comprising a monitoring module configured to: monitor the read-write requests sent to the main memory by an upstream device of the cache system, and generate and maintain the opening and closing information of each bank in the main memory according to the address information carried by the read-write requests and the preconfigured mapping rule between the address information and the main memory granule;
when acquiring the current opening and closing information of each bank in the main memory, the cache controller is specifically configured to: acquire the opening and closing information of each bank in the main memory generated and maintained by the monitoring module.
4. The cache system of claim 1, the first requirement having a higher priority than the second requirement.
5. The cache system of claim 4, wherein, when selecting, starting from the tail of the linked list, the first target cache line that first meets the first requirement or the second requirement, the cache controller is specifically configured to:
starting from the tail of the linked list, judge whether any cache line meeting the first requirement exists, and if so, select the cache line that first meets the first requirement as the first target cache line; if no such cache line exists, starting again from the tail of the linked list, select the cache line that first meets the second requirement as the first target cache line.
6. The cache system of claim 1, wherein the mapping rule between the address information and the main memory granule comprises: using a first preset bit and a second preset bit in the address information to respectively indicate the row information and the bank information, in the main memory, of the main memory granule to which the address information maps.
7. The cache system of claim 6, wherein the mapping rule between the address information and the main memory granule further comprises: using a third preset bit, a fourth preset bit and a fifth preset bit in the address information to respectively indicate the column information, in the main memory, of the main memory granule to which the address information maps, the interleaving mode of the main memory, and the position at which the bank information of that granule is inserted into the row information or the column information; and/or using a sixth preset bit in the address information to indicate whether the main memory enables a bank group rotation mode in which banks are accessed alternately.
8. The cache system according to any of claims 4-7, wherein the linked list comprises a plurality of entries, and when selecting the first target cache line as the cache line that first meets the first requirement or the second requirement, the cache controller is specifically configured to: select, within the range of a preset entry window, the cache line that first meets the first requirement or the second requirement as the first target cache line.
9. The cache system of any of claims 1-7, wherein the cache controller is further configured to: if it is determined that no first target cache line exists, select a second target cache line to write back to the main memory, wherein the second target cache line is the cache line currently located at the tail of the linked list.
10. A cache write-back method applied to a cache system, wherein the cache system comprises a cache controller and a cache, and the cache comprises a plurality of cache lines; the cache system stores a historical access sequence of the cache lines through a linked list, and updates the historical access sequence based on the least recently used (LRU) principle; the method comprises the following steps:
for a read-write request that misses a cache line, the cache controller acquires current opening and closing information of each bank in a main memory, and selects, from the linked list, a first target cache line that does not cause a page conflict to write back to the main memory, by combining the LRU principle with the current opening and closing information; the main memory is located in a downstream device of the cache system;
a mapping rule between the main memory granule and the address information carried by read-write requests sent to the main memory by an upstream device of the cache system is preconfigured in the cache system; the opening and closing information comprises the open or closed state of each bank in the main memory and, for each bank in the open state, the corresponding opened row;
the selecting, from the linked list, a first target cache line that does not cause a page conflict by combining the LRU principle with the current opening and closing information includes: starting from the tail of the linked list, selecting the cache line that first meets a first requirement or a second requirement as the first target cache line;
the first requirement includes: the address information carried by the read-write request corresponding to a cache line maps, according to the mapping rule, to a corresponding main memory granule as a candidate granule, and the current opening and closing information indicates that both the bank and the row where the candidate granule is located in the main memory are in the open state; the second requirement includes: the address information carried by the read-write request corresponding to a cache line maps, according to the mapping rule, to a corresponding main memory granule as a candidate granule, and the current opening and closing information indicates that the bank where the candidate granule is located in the main memory is in the closed state.
11. The method as claimed in claim 10, wherein the cache controller acquiring the current opening and closing information of each bank in the main memory includes:
in the case that the downstream device has the capability of counting the current opening and closing information of each bank in the main memory, the cache controller acquires the current opening and closing information of each bank in the main memory pulled from the downstream device.
12. The method of claim 10, wherein the cache system further comprises a monitoring module, and the method further comprises:
the monitoring module monitors the read-write requests sent to the main memory by an upstream device of the cache system, and generates and maintains the opening and closing information of each bank in the main memory according to the address information carried by the read-write requests and the preconfigured mapping rule between the address information and the main memory granule;
correspondingly, the acquiring the current opening and closing information of each bank in the main memory includes: the cache controller acquires the opening and closing information of each bank in the main memory generated and maintained by the monitoring module.
13. The method of claim 10, the first requirement having a higher priority than the second requirement.
14. The method of claim 13, wherein the selecting, starting from the tail of the linked list, the first target cache line that first meets the first requirement or the second requirement comprises:
starting from the tail of the linked list, judging whether any cache line meeting the first requirement exists, and if so, selecting the cache line that first meets the first requirement as the first target cache line; if no such cache line exists, starting again from the tail of the linked list, selecting the cache line that first meets the second requirement as the first target cache line.
15. The method of claim 10, wherein the mapping rule between the address information and the main memory granule comprises: using a first preset bit and a second preset bit in the address information to respectively indicate the row information and the bank information, in the main memory, of the main memory granule to which the address information maps.
16. The method of claim 15, wherein the mapping rule between the address information and the main memory granule further comprises: using a third preset bit, a fourth preset bit and a fifth preset bit in the address information to respectively indicate the column information, in the main memory, of the main memory granule to which the address information maps, the interleaving mode of the main memory, and the position at which the bank information of that granule is inserted into the row information or the column information; and/or using a sixth preset bit in the address information to indicate whether the main memory enables a bank group rotation mode in which banks are accessed alternately.
17. The method of any of claims 13-16, wherein the linked list comprises a plurality of entries, and the selecting the first target cache line that first meets the first requirement or the second requirement comprises: selecting, within the range of a preset entry window, the cache line that first meets the first requirement or the second requirement as the first target cache line.
18. The method of any one of claims 10-16, further comprising:
the cache controller selects a second target cache line to write back to the main memory when it is determined that no first target cache line exists, wherein the second target cache line is the cache line currently located at the tail of the linked list.
19. A system on chip comprising a cache system according to any of claims 1-9.
20. An electronic assembly comprising the system-on-chip of claim 19.
21. An electronic device comprising the electronic assembly of claim 20.
CN202311695046.5A 2023-12-12 2023-12-12 Cache system, cache write-back method, system on chip and electronic equipment Active CN117389914B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311695046.5A CN117389914B (en) 2023-12-12 2023-12-12 Cache system, cache write-back method, system on chip and electronic equipment

Publications (2)

Publication Number Publication Date
CN117389914A (en) 2024-01-12
CN117389914B (en) 2024-04-16

Family

ID=89467075


Country Status (1)

Country Link
CN (1) CN117389914B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117608498A (en) * 2024-01-22 2024-02-27 北京象帝先计算技术有限公司 DRAM access processing method, cache control module and DRAM controller

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10706147B1 (en) * 2017-05-19 2020-07-07 Amazon Technologies, Inc. Mitigating side-channel attacks via shared cache
CN112306911A (en) * 2019-07-24 2021-02-02 中移(苏州)软件技术有限公司 Cache replacement method and device and computer readable storage medium
CN113641596A (en) * 2021-10-18 2021-11-12 北京壁仞科技开发有限公司 Cache management method, cache management device and processor
CN115617712A (en) * 2022-10-14 2023-01-17 无锡先进技术研究院 LRU replacement algorithm based on set associative Cache



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant