CN115237585A - Memory controller, data reading method and memory system


Info

Publication number
CN115237585A
CN115237585A (application CN202111082943.XA)
Authority
CN
China
Prior art keywords
page
cache
data
read address
memory controller
Prior art date
Legal status
Pending
Application number
CN202111082943.XA
Other languages
Chinese (zh)
Inventor
周轶刚
朱晓明
Current Assignee
XFusion Digital Technologies Co Ltd
Original Assignee
XFusion Digital Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by XFusion Digital Technologies Co Ltd
Priority to PCT/CN2021/120427 (published as WO2022222377A1)
Publication of CN115237585A
Legal status: Pending

Classifications

    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5016 Allocation of resources to service a request, the resource being the memory
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/023 Free address space management
    • G06F12/0811 Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
    • G06F12/0882 Cache access modes: page mode
    • G06F12/1027 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/061 Improving I/O performance
    • G06F3/0631 Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • G06F3/0656 Data buffering arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

Embodiments of this application disclose a memory controller, a data reading method, and a memory system, for improving data reading efficiency. The memory controller of the embodiments includes a host-side interface, a first-level cache, a page table buffer, and a second-level cache, where the page table buffer stores cache entries corresponding to the second-level cache, each entry indicating a data page stored in the second-level cache. The memory controller receives a read instruction carrying a read address through the host-side interface; when it determines a first-level cache miss according to the read address, it queries the page table buffer according to the read address and, when it determines that the data page corresponding to the read address is cached in the second-level cache, reads the data corresponding to the read address from the second-level cache.

Description

Memory controller, data reading method and memory system
Technical Field
Embodiments of this application relate to the field of computers, and in particular to a memory controller, a data reading method, and a memory system.
Background
Memory is an indispensable component of a server, accounting for roughly 30%-40% of the cost of the whole system. Reducing memory cost without reducing performance, or while reducing it only slightly, has therefore become an important means of lowering the system's Total Cost of Ownership (TCO), and memory technology is a research focus of major server vendors and cloud operators. Compressing memory data with a hardware compression engine, or replacing traditional memory with a newer medium (e.g., non-volatile memory) that has higher latency but lower cost, can significantly reduce memory cost, but the resulting increase in memory access latency hurts application performance.
Disclosure of Invention
Embodiments of this application provide a memory controller, a data reading method, and a memory system, to mitigate the increased latency of memory access.
In a first aspect, an embodiment of this application provides a memory controller that includes a host-side interface, a first-level cache, a page table buffer, and a second-level cache. The page table buffer stores cache entries corresponding to the second-level cache, each entry indicating a data page stored in the second-level cache. The memory controller is configured to receive a read instruction through the host-side interface, determine a first-level cache miss according to the read address carried by the read instruction, then query the page table buffer according to the read address, and read the data corresponding to the read address from the second-level cache when it determines that the data page corresponding to the read address is cached in the second-level cache.
The memory controller defined in this embodiment includes a second-level cache and a page table buffer, and the cache entries in the page table buffer record which data pages are cached in the second-level cache. Data in the second-level cache is thereby used effectively, reads from memory are reduced, the long latency of reading data from memory is largely avoided, and data access efficiency is improved.
Further, when it is determined that the data page corresponding to the read address is not cached in the second-level cache, the memory controller is configured to read the data corresponding to the read address from memory through the memory interface.
In a possible embodiment, the memory controller is configured to cache the data page in the second-level cache after reading the data page corresponding to the read address from memory through the memory interface, and to add a cache entry corresponding to the read address to the page table buffer. Adding missed data pages to the second-level cache and adding cache entries to the page table buffer increases the amount of cached page data, further increasing the likelihood of data access hits.
The memory controller is further configured to evict a target cache entry according to an eviction policy and write the data page corresponding to the target cache entry back to memory. The eviction policy may be LRU or a minimum-cached-pages rule, under which the huge page with the fewest cached data pages is evicted first.
In another possible embodiment, the second-level cache stores decompressed data of data pages while the memory stores their compressed data; the second-level cache is configured to prefetch compressed data from memory and cache the corresponding decompressed data.
An embodiment of this application further provides a format for the read address and a format for the cache entry. Illustratively, the read address includes a page tag, a page index, and an in-page offset, and the cache entry includes a page tag, a cache tag, and a huge page index, where the huge page index indicates the address of a huge page in the second-level cache and the cache tag indicates whether a data page is cached in the huge page.
The memory controller is further configured to query the page table buffer according to the page tag in the read address and determine whether a cache entry corresponding to the page tag exists in the page table buffer. If it does, the controller further checks the cache tag corresponding to the page index of the read address to determine whether the data page corresponding to the read address is cached in the second-level cache, and if so, reads the data corresponding to the read address from the second-level cache.
In another possible implementation, after the data page corresponding to the read address has been cached in the second-level cache, the memory controller is further configured to construct a cache address corresponding to the read address, where the cache address BPA is:
BPA = Huge Page Index * M + Page Index * N + Page offset
where Huge Page Index is the huge page index, Page Index is the page index, Page offset is the in-page offset, M is the huge page size, and N is the data page size.
The memory controller is specifically configured to read the data corresponding to the read address from the second-level cache according to the cache address.
The number of bits of the page index is x, and the number of bits of the cache tag is 2^x, where x is an integer greater than 0.
Illustratively, the page tag is 19 bits, the page index is 9 bits, and the in-page offset is 12 bits; the cache tag is then 2^9 bits (512 bits).
In one possible implementation, the first level cache is an SRAM and the second level cache is a DRAM.
In one possible implementation, the page table buffer comprises a first-level page table buffer and a second-level page table buffer, where the number of cache entries in the first-level page table buffer is smaller than that in the second-level page table buffer. Illustratively, the first-level page table buffer holds 128 cache entries and the second-level page table buffer holds 16K.
In a second aspect, an embodiment of this application provides a data reading method for a memory controller, where the memory controller includes a host-side interface, a first-level cache, a page table buffer, and a second-level cache, the page table buffer stores cache entries corresponding to the second-level cache, and each cache entry indicates a data page stored in the second-level cache.
the method comprises the following steps:
the memory controller receives a read instruction through the host side interface, wherein the read instruction carries a read address;
the memory controller determines a first-level cache miss according to the read address;
The memory controller queries the page table buffer according to the read address and, when it determines that the data page corresponding to the read address is cached in the second-level cache, reads the data corresponding to the read address from the second-level cache.
In one possible embodiment, the method further comprises:
When it is determined that the data page corresponding to the read address is not cached in the second-level cache, the memory controller reads the data corresponding to the read address from memory through a memory interface.
Further, the method may further include:
After reading the data page corresponding to the read address from memory through the memory interface, the memory controller caches the data page in the second-level cache and adds a cache entry corresponding to the read address to the page table buffer.
Further, the method may further include:
The memory controller evicts a target cache entry according to an eviction policy and writes the data page corresponding to the target cache entry back to memory.
In a possible implementation, the second-level cache stores decompressed data of data pages while the memory stores their compressed data; the second-level cache is configured to prefetch compressed data from memory and cache the corresponding decompressed data.
Illustratively, the read address includes a page tag, a page index, and an in-page offset, and the cache entry includes a page tag, a cache tag, and a huge page index, where the huge page index indicates the address of a huge page in the second-level cache and the cache tag indicates whether a data page is cached in the huge page.
At this time, the memory controller querying the page table buffer according to the read address and reading the data corresponding to the read address from the second-level cache when it determines that the corresponding data page is cached there includes:
the memory controller queries the page table buffer according to the page tag in the read address and determines whether a cache entry corresponding to the page tag exists in the page table buffer; if so, it further checks the cache tag corresponding to the page index of the read address to determine whether the data page corresponding to the read address is cached in the second-level cache, and if so, reads the data corresponding to the read address from the second-level cache.
In another possible implementation, after it is determined that the data page corresponding to the read address is cached in the second-level cache, the method further includes:
the memory controller constructs a cache address corresponding to the read address, where the cache address BPA is:
BPA = Huge Page Index * M + Page Index * N + Page offset
where Huge Page Index is the huge page index, Page Index is the page index, Page offset is the in-page offset, M is the huge page size, and N is the data page size;
the memory controller reads the data corresponding to the read address from the second-level cache according to the cache address.
In a third aspect, an embodiment of the present application further provides a memory system, which includes a memory and the memory controller according to the first aspect.
In a fourth aspect, an embodiment of this application further provides a chip, which includes a storage medium and hardware processing logic, where the storage medium stores instructions and the hardware processing logic is configured to execute the instructions in the storage medium to implement the method steps described in the second aspect or any one of its possible implementations.
In a fifth aspect, an embodiment of the present application further provides a server, including a processor and the memory system according to the third aspect.
In a sixth aspect, embodiments of the present application provide a computer-readable storage medium, in which a computer program or instructions are stored, and when the computer program or instructions are executed by a processor in a server, the computer program or instructions are used to implement the operation steps of the method described in the second aspect or any one of the possible implementation manners of the second aspect.
In a seventh aspect, an embodiment of the present application provides a computer program product, where the computer program product includes instructions that, when executed on a server or a terminal, cause the server or the terminal to execute the instructions to implement the operation steps of the method described in the second aspect or any one of the possible implementation manners of the second aspect.
On the basis of the implementations provided by the above aspects, this application may further combine them to provide additional implementations.
Drawings
FIG. 1 is a schematic structural diagram of a memory controller according to an embodiment of this application;
FIG. 2 is a schematic structural diagram of a memory system according to an embodiment of this application;
FIG. 3 is a schematic diagram of the format of a read address according to an embodiment of this application;
FIG. 4 is a schematic diagram of a data cache structure according to an embodiment of this application;
FIG. 5 is a schematic diagram of another data cache structure according to an embodiment of this application;
FIG. 6 is a flowchart of a method for a memory controller to read data according to an embodiment of this application.
Detailed Description
The terms "comprises" and "comprising," and any variations thereof, in the description and claims of this application and the above-described drawings, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In a memory-centric computing architecture, real-time compression and decompression of data in a large-capacity memory has become an important means of reducing memory cost; for example, compressing in units of 4K pages reduces cost but greatly increases the latency of CPU memory access. Data can be cached in a large-capacity Cache, but this is itself costly, and its effect is limited in large-capacity random access scenarios.
In embodiments of this application, a second-level cache of a certain capacity based on a conventional memory medium (e.g., Low Power Double Data Rate SDRAM, LPDDR) is introduced into a serial memory chip to cache decompressed data, and a small Static Random-Access Memory (SRAM) is introduced to support a huge-page TLB (Translation Lookaside Buffer) index, reducing the multiple memory accesses caused by a conventional page table walk and improving data reading efficiency. Specifically, the embodiments combine a small-capacity cache (SRAM), a medium-capacity uncompressed memory, and a large-capacity compressed memory or PCM-medium memory on the external serial memory chip, reducing memory cost while avoiding the increased access latency the CPU would otherwise incur when accessing a large-capacity compressed memory.
FIG. 1 is a schematic structural diagram of a memory controller according to an embodiment of this application. The memory controller 100 includes a host-side interface 101, a first-level cache 102, a page table buffer 103, and a second-level cache 104, where the page table buffer 103 stores cache entries corresponding to the second-level cache 104, each entry indicating a data page stored in the second-level cache 104. The memory controller 100 is configured to receive a read instruction through the host-side interface 101, determine that the first-level cache 102 misses according to the read address carried by the read instruction, then query the page table buffer 103 according to the read address, and, when it determines that the data page corresponding to the read address is cached in the second-level cache 104, read the data corresponding to the read address from the second-level cache 104.
The memory controller further includes a memory interface 105.
The first-level cache 102 stores cachelines and their corresponding tags. The memory controller 100 first looks up the data to be read in the first-level cache 102 according to the read address; on a hit, the matching cacheline is read directly from the first-level cache 102.
The memory controller further includes serial-to-parallel conversion logic.
The memory controller defined in this embodiment includes a second-level cache and a page table buffer, and the cache entries in the page table buffer record which data pages are cached in the second-level cache. Data in the second-level cache is thereby used effectively, reads from memory are reduced, the long latency of reading data from memory is largely avoided, and data access efficiency is improved.
FIG. 2 is a schematic structural diagram of a memory system according to an embodiment of this application. The memory system 200 includes the memory controller 100 and a memory 205.
When it is determined that the data page corresponding to the read address is not cached in the second-level cache 104, the memory controller 100 is further configured to read the data corresponding to the read address from the memory 205 through the memory interface 105.
Specifically, the data read from the memory 205 may be compressed; after reading the compressed data, the memory controller 100 stores the corresponding decompressed data in the second-level cache 104. That is, after reading the data page corresponding to the read address from the memory 205 through the memory interface 105, the memory controller 100 caches the data page in the second-level cache 104 and adds a cache entry corresponding to the read address to the page table buffer 103.
The memory controller 100 is further configured to evict a target cache entry from the page table buffer 103 according to an eviction policy and write the data page corresponding to the target cache entry back to the memory 205. The eviction policy may be Least Recently Used (LRU) or a minimum-cached-pages rule, under which the huge page with the fewest cached data pages is evicted first.
In another possible implementation, the second-level cache 104 stores decompressed data of data pages while the memory 205 stores their compressed data; the second-level cache 104 is configured to prefetch compressed data from the memory 205 and cache the corresponding decompressed data.
FIG. 3 shows the format of a read address according to an embodiment of this application: the read address consists of a Page Tag, a Page Index, and an in-page Offset. Illustratively, the page tag is 19 bits, the page index is 9 bits, and the in-page offset is 12 bits.
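For illustration, this field split can be expressed as a minimal C sketch; the 19/9/12-bit widths are the exemplary values just given, and the helper names are ours rather than part of the embodiment:

#include <stdint.h>

/* Exemplary split of a 40-bit physical address:
 * 19-bit page tag | 9-bit page index | 12-bit in-page offset. */
#define PAGE_OFFSET_BITS 12
#define PAGE_INDEX_BITS   9
#define PAGE_TAG_BITS    19

static inline uint32_t page_offset(uint64_t pa) {
    return (uint32_t)(pa & ((1u << PAGE_OFFSET_BITS) - 1));
}

static inline uint32_t page_index(uint64_t pa) {
    return (uint32_t)((pa >> PAGE_OFFSET_BITS) & ((1u << PAGE_INDEX_BITS) - 1));
}

static inline uint32_t page_tag(uint64_t pa) {
    return (uint32_t)((pa >> (PAGE_OFFSET_BITS + PAGE_INDEX_BITS)) &
                      ((1u << PAGE_TAG_BITS) - 1));
}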
FIG. 4 shows a data cache structure according to an embodiment of this application. The first-level cache records cachelines (CL) and CL Tags, and the page table buffer records cache entries. Illustratively, a cache entry contains a Page Tag, a cache tag (Buffered Flag), and a Huge Page Index; one cache entry corresponds to one huge page in the second-level cache, and each data page within a huge page corresponds to one compressed page in memory. Illustratively, as shown in FIG. 5, the first-level cache is 64M, and the cache tag in a page table buffer entry is 2^9 bits (512 bits); each bit of the cache tag corresponds to one data page in the huge page, and when that data page has cached the page prefetched from memory, the bit may be set to 1, and otherwise to 0.
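A cache entry of this shape can be sketched in C as follows; the struct layout and helper names are illustrative assumptions, not the claimed entry format:

#include <stdbool.h>
#include <stdint.h>

/* One page table buffer entry: page tag, 512-bit buffered-flag bitmap
 * (one bit per 4K data page of the 2M huge page), and the huge page
 * index locating the huge page in the second-level cache. */
typedef struct {
    uint32_t page_tag;          /* compared against bits [39:21] of the address */
    uint64_t buffered_flag[8];  /* 8 x 64 = 512 bits */
    uint32_t huge_page_index;   /* which 2M slot of the second-level cache */
    bool     valid;
} tlb_entry_t;

/* Is the 4K page at idx already decompressed into the second-level cache? */
static inline bool page_is_buffered(const tlb_entry_t *e, uint32_t idx) {
    return (e->buffered_flag[idx >> 6] >> (idx & 63)) & 1u;
}

/* Set the bit once the page has been fetched from memory and decompressed. */
static inline void mark_page_buffered(tlb_entry_t *e, uint32_t idx) {
    e->buffered_flag[idx >> 6] |= 1ull << (idx & 63);
}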
The memory controller is further configured to query the page table buffer according to the page tag in the read address and determine whether a cache entry corresponding to the page tag exists in the page table buffer. If it does, the controller further checks the cache tag corresponding to the page index of the read address to determine whether the data page corresponding to the read address is cached in the second-level cache, and if so, reads the data corresponding to the read address from the second-level cache.
In another possible implementation, after the data page corresponding to the read address has been cached in the second-level cache, the memory controller is further configured to construct a cache address corresponding to the read address, where the cache address BPA is:
BPA = Huge Page Index * M + Page Index * N + Page offset
where Huge Page Index is the huge page index, Page Index is the page index, Page offset is the in-page offset, M is the huge page size, and N is the data page size.
With the parameter sizes described in FIG. 5 above, M is 2M and N is 4K. The second-level cache then has a capacity of 32G and contains 16K huge pages of 2M each. In a specific implementation, the size and number of huge pages can be adjusted flexibly according to the actual physical memory size; details are not repeated in this embodiment.
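As a minimal sketch of this address construction, assuming the exemplary M = 2M and N = 4K sizes above:

#include <stdint.h>

#define HUGE_PAGE_SIZE (2ull * 1024 * 1024)  /* M = 2M */
#define DATA_PAGE_SIZE (4ull * 1024)         /* N = 4K */

/* BPA = Huge Page Index * M + Page Index * N + Page offset */
static inline uint64_t buffered_physical_address(uint32_t hp_idx,
                                                 uint32_t pg_idx,
                                                 uint32_t pg_off) {
    return hp_idx * HUGE_PAGE_SIZE + pg_idx * DATA_PAGE_SIZE + pg_off;
}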
The first-level cache in the memory controller may use a relatively fast Static Random-Access Memory (SRAM) and store data in the form of cache lines, indexed by CL tags. The first-level cache is queried according to the physical address carried by the read instruction; if the corresponding tag is found, the first-level cache hits and the matching cacheline data is read out.
The second-level cache in the memory controller may use DRAM. Illustratively, in conjunction with FIG. 5, data in the second-level cache is stored in 2M huge pages, each consisting of 512 pages of 4K.
The memory controller uses the page table buffer to index the second-level cache. Illustratively, a two-level page table buffer may be used, i.e., a first-level page table buffer and a second-level page table buffer, where the first-level page table buffer holds fewer cache entries than the second-level page table buffer. Illustratively, the first-level page table buffer holds 128 cache entries and the second-level page table buffer holds 16K.
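The lookup order can be sketched as follows, reusing the tlb_entry_t sketch above; the linear search is purely for clarity (real hardware would use associative lookup), and the 128/16K sizes are the exemplary values:

#include <stddef.h>

#define L1_TLB_ENTRIES 128
#define L2_TLB_ENTRIES (16 * 1024)

static tlb_entry_t l1_tlb[L1_TLB_ENTRIES];   /* first-level page table buffer */
static tlb_entry_t l2_tlb[L2_TLB_ENTRIES];   /* second-level page table buffer */

static tlb_entry_t *tlb_find(tlb_entry_t *tbl, size_t n, uint32_t tag) {
    for (size_t i = 0; i < n; i++)
        if (tbl[i].valid && tbl[i].page_tag == tag)
            return &tbl[i];
    return NULL;
}

/* Consult the two levels in sequence: the small first level, then the larger second level. */
static tlb_entry_t *tlb_lookup(uint32_t tag) {
    tlb_entry_t *e = tlb_find(l1_tlb, L1_TLB_ENTRIES, tag);
    return e ? e : tlb_find(l2_tlb, L2_TLB_ENTRIES, tag);
}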
The page tag in the cache entry corresponds to the page tag of the physical address carried by the read instruction.
In connection with the example of FIG. 5, the cache tag may be 512 bits, each bit corresponding to one 4K page of the 2M huge page and indicating whether that 4K page has been decompressed and cached in the second-level cache (illustratively, 0: not cached, 1: cached). The Huge Page Index in the cache entry indicates the actual address of the current 2M huge page in the second-level cache, so the memory controller can determine the address of the corresponding huge page through the Huge Page Index. Correspondingly, if the host-side physical address hits in the second-level cache, the Buffered Physical Address (BPA) of the data in the second-level cache can be computed as:
BPA = Huge Page Index * 2M + Page Index * 4K + Page offset
With a 40-bit memory address, 1T of memory space can be addressed.
In the read address format of FIG. 3, the upper 19 bits of the memory physical address form the page tag, used to look up a 2M huge page in the page table buffer; the middle 9 bits give the address of the specific 4K page to be accessed within the 2M huge page; and the lower 12 bits (the page offset) give the final address within the 4K page.
FIG. 6 is a flowchart of a method for a memory controller to read data according to an embodiment of this application. The method includes:
601: The memory controller receives a physical address, sent by the host side, for accessing memory space. It first queries the first-level cache to check whether the physical address hits; if not, step 602 is performed; if it hits, the cacheline data is returned;
602: The memory controller queries the page table buffer (TLB) based on the page tag in the physical address and determines the cache entry corresponding to the page tag. If the cache tag bit corresponding to the physical address in that entry is 1 (illustratively, 1 indicates the corresponding data page is cached in the second-level cache), the controller takes the huge page index contained in the entry, computes the address of the data to be accessed in the second-level cache, and performs step 603; if the bit is 0 (illustratively, 0 indicates the data page is not cached in the second-level cache), step 604 is performed;
Specifically, when the page table buffer contains two levels of TLB, the two levels are consulted in sequence.
603: The memory controller reads the data from the second-level cache according to the computed address;
604: The memory controller reads the data from memory according to the physical address.
With reference to FIG. 5, the memory controller may read the 2K compressed data from memory according to the physical address, decompress it, store the resulting 4K data in the corresponding 4K data page of the huge page in the second-level cache, and set the cache tag bit of that data page in the cache entry to 1.
Further, if no matching cache entry is found in the page table buffer, the actions of step 604 are likewise performed; the full read path is sketched below.
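Steps 601-604 can be summarized in one hypothetical read routine that reuses the sketches above; l1_cache_read(), l2_cache_read(), mem_read_compressed(), decompress_2k_to_4k(), tlb_install() and l2_cache_fill() are stand-in names for controller internals, declared here only as prototypes:

#include <string.h>

/* Hypothetical controller primitives (prototypes only in this sketch). */
int  l1_cache_read(uint64_t pa, void *buf, size_t len);      /* 0 on hit */
int  l2_cache_read(uint64_t bpa, void *buf, size_t len);
void mem_read_compressed(uint64_t pa, void *dst, size_t len);
void decompress_2k_to_4k(const void *src, void *dst);
tlb_entry_t *tlb_install(uint32_t tag);
void l2_cache_fill(uint32_t hp_idx, uint32_t pg_idx, const void *page);

int controller_read(uint64_t pa, void *buf, size_t len) {
    if (l1_cache_read(pa, buf, len) == 0)               /* 601: first-level cache hit */
        return 0;

    tlb_entry_t *e = tlb_lookup(page_tag(pa));          /* 602: query the page table buffer */
    if (e && page_is_buffered(e, page_index(pa))) {
        uint64_t bpa = buffered_physical_address(e->huge_page_index,
                                                 page_index(pa),
                                                 page_offset(pa));
        return l2_cache_read(bpa, buf, len);            /* 603: read from the second-level cache */
    }

    /* 604: fetch the 2K compressed page, decompress to 4K, install it in the
     * second-level cache, and record it in the page table buffer. */
    uint8_t compressed[2048], page[4096];
    mem_read_compressed(pa, compressed, sizeof compressed);
    decompress_2k_to_4k(compressed, page);
    if (!e)
        e = tlb_install(page_tag(pa));                  /* add a new cache entry */
    l2_cache_fill(e->huge_page_index, page_index(pa), page);
    mark_page_buffered(e, page_index(pa));
    memcpy(buf, page + page_offset(pa), len);           /* assumes the read fits in one page */
    return 0;
}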
Furthermore, the memory controller may evict a target cache entry according to an eviction policy and write the data page corresponding to the target cache entry back to memory. The eviction policy may be LRU or the minimum-cached-pages rule, under which the huge page with the fewest cached data pages is evicted first.
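Under the minimum-cached-pages rule, a victim can be chosen with a population count over the 512-bit cache tag, as in this sketch (the GCC/Clang __builtin_popcountll builtin is assumed):

/* Count how many 4K pages of the entry's huge page are currently cached. */
static int buffered_page_count(const tlb_entry_t *e) {
    int n = 0;
    for (int i = 0; i < 8; i++)
        n += __builtin_popcountll(e->buffered_flag[i]);
    return n;
}

/* Pick the valid entry whose huge page caches the fewest data pages; its
 * data pages are written back to memory before the entry is reused. */
static tlb_entry_t *pick_eviction_victim(tlb_entry_t *tbl, size_t entries) {
    tlb_entry_t *victim = NULL;
    int best = 513;                       /* above any possible count (0..512) */
    for (size_t i = 0; i < entries; i++) {
        if (!tbl[i].valid)
            continue;
        int n = buffered_page_count(&tbl[i]);
        if (n < best) {
            best = n;
            victim = &tbl[i];
        }
    }
    return victim;
}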
Embodiments of this application provide a serial memory controller with two levels of cache and, by caching decompressed data inside the memory controller, a low-cost, low-latency memory access scheme.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, for the specific working processes of the memory controller, the method, and the memory system described above, reference may be made to the corresponding processes in the foregoing embodiments; details are not repeated here.

Claims (22)

1. A memory controller comprising a host-side interface, a first level cache, a page table buffer, and a second level cache, wherein the page table buffer stores a cache entry corresponding to the second level cache, the cache entry indicating a data page stored in the second level cache,
the memory controller is used for receiving a read instruction through the host side interface, and the read instruction carries a read address;
the memory controller is further configured to determine a first level cache miss according to the read address;
the memory controller is further configured to query the page table buffer according to the read address, and read data corresponding to the read address from the secondary cache when it is determined that the data page corresponding to the read address is cached in the secondary cache.
2. The memory controller of claim 1, wherein when it is determined that the data page corresponding to the read address is not cached in the second-level cache, the memory controller is further configured to read the data corresponding to the read address from the memory through a memory interface.
3. The memory controller of claim 2,
wherein the memory controller is configured to cache the data page in the second-level cache after reading the data page corresponding to the read address from the memory through a memory interface, and to add a cache entry corresponding to the read address to the page table buffer.
4. The memory controller of claim 3,
the memory controller is further configured to evict a target cache entry according to an eviction policy, and to write the data page corresponding to the target cache entry back to the memory.
5. The memory controller of any of claims 1-4,
the second-level cache is used for prefetching the compressed data in the memory and caching the decompressed data corresponding to the compressed data.
6. The memory controller of any of claims 1-5,
the read address comprises a page tag, a page index, and an in-page offset, and the cache entry comprises a page tag, a cache tag, and a huge page index, wherein the huge page index is used for indicating the address of a huge page in the second-level cache, and the cache tag is used for indicating whether a data page is cached in the huge page.
7. The memory controller of claim 6,
the memory controller is further configured to query the page table buffer according to a page tag in a read address, determine whether a cache entry corresponding to the page tag exists in the page table buffer, if so, further query a cache tag corresponding to a page index of the read address, determine whether a data page corresponding to the read address is cached in the second-level cache, and if so, read data corresponding to the read address from the second-level cache.
8. The memory controller of claim 7,
after the data page corresponding to the read address is cached in the second-level cache, the memory controller is further configured to construct a cache address corresponding to the read address, where the cache address BPA is
BPA=Huge Page Index*M+Page Index*N+Page offset
wherein Huge Page Index is the huge page index, Page Index is the page index, Page offset is the in-page offset, M is the huge page size, and N is the data page size, and
the memory controller is specifically configured to read data corresponding to the read address from the second-level cache according to the cache address.
9. The memory controller of claim 7,
wherein if the number of bits of the page index is x, the number of bits of the cache tag is 2^x.
10. The memory controller of any of claims 1-9,
the first-level cache is an SRAM, and the second-level cache is a DRAM.
11. The memory controller of any one of claims 1-10,
the page table buffer comprises a first-level page table buffer and a second-level page table buffer, wherein the number of cache entries in the first-level page table buffer is smaller than the number of cache entries in the second-level page table buffer.
12. A data reading method of a memory controller is characterized in that the memory controller comprises a host side interface, a first level cache, a page table buffer and a second level cache, wherein the page table buffer stores cache entries corresponding to the second level cache, the cache entries are used for indicating data pages stored in the second level cache,
the method comprises the following steps:
the memory controller receives a read instruction through the host side interface, wherein the read instruction carries a read address;
the memory controller determines a first-level cache miss according to the read address;
the memory controller queries the page table buffer according to the read address, and reads the data corresponding to the read address from the second-level cache when determining that the data page corresponding to the read address is cached in the second-level cache.
13. The method of claim 12, wherein the method further comprises:
when it is determined that the data page corresponding to the read address is not cached in the second-level cache, the memory controller reads the data corresponding to the read address from the memory through the memory interface.
14. The method of claim 13, wherein the method further comprises:
after reading the data page corresponding to the read address from the memory through the memory interface, the memory controller caches the data page in the second-level cache and adds a cache entry corresponding to the read address to the page table buffer.
15. The method of claim 14, wherein the method further comprises:
the memory controller evicts a target cache entry according to an eviction policy and writes the data page corresponding to the target cache entry back to the memory.
16. The method of any one of claims 12-15,
the second-level cache is used for prefetching the compressed data in the memory and caching the decompressed data corresponding to the compressed data.
17. The method of any one of claims 12-16,
the read address comprises a page tag, a page index, and an in-page offset, and the cache entry comprises a page tag, a cache tag, and a huge page index, wherein the huge page index is used for indicating the address of a huge page in the second-level cache, and the cache tag is used for indicating whether a data page is cached in the huge page.
18. The method of claim 17, wherein the memory controller querying the page table buffer according to the read address, and reading the data corresponding to the read address from the second-level cache when it is determined that the data page corresponding to the read address is already cached in the second-level cache, comprises:
the memory controller queries the page table buffer according to the page tag in the read address and determines whether a cache entry corresponding to the page tag exists in the page table buffer; if so, it further queries the cache tag corresponding to the page index of the read address to determine whether the data page corresponding to the read address is cached in the second-level cache, and if so, reads the data corresponding to the read address from the second-level cache.
19. The method of claim 18, wherein after determining that the data page corresponding to the read address is cached in the second-level cache, the method further comprises:
the memory controller constructs a cache address corresponding to the read address, wherein the cache address BPA is as follows:
BPA=Huge Page Index*M+Page Index*N+Page offset
wherein Huge Page Index is the huge page index, Page Index is the page index, Page offset is the in-page offset, M is the huge page size, and N is the data page size, and
the memory controller reads the data corresponding to the read address from the second-level cache according to the cache address.
20. A memory system comprising a memory and a memory controller as claimed in any one of claims 1 to 11.
21. A chip comprising a storage medium having instructions stored therein and hardware processing logic to execute the instructions in the storage medium to implement the method of any of claims 12-19.
22. A server comprising a processor and the memory system of claim 20.
CN202111082943.XA 2021-04-23 2021-09-15 Memory controller, data reading method and memory system Pending CN115237585A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/120427 WO2022222377A1 (en) 2021-04-23 2021-09-24 Memory controller, data reading method, and memory system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110441816 2021-04-23
CN2021104418168 2021-04-23

Publications (1)

Publication Number Publication Date
CN115237585A (en) 2022-10-25

Family

ID=83666458

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111082943.XA Pending CN115237585A (en) 2021-04-23 2021-09-15 Memory controller, data reading method and memory system

Country Status (2)

Country Link
CN (1) CN115237585A (en)
WO (1) WO2022222377A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116501696B (en) * 2023-06-30 2023-09-01 之江实验室 Method and device suitable for distributed deep learning training prefetching cache management

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090006803A1 (en) * 2007-06-28 2009-01-01 David Arnold Luick L2 Cache/Nest Address Translation
GB2565069B (en) * 2017-07-31 2021-01-06 Advanced Risc Mach Ltd Address translation cache
CN109582600B (en) * 2017-09-25 2020-12-01 华为技术有限公司 Data processing method and device
CN112631962A (en) * 2019-09-24 2021-04-09 阿里巴巴集团控股有限公司 Storage management device, storage management method, processor and computer system
CN112527395B (en) * 2020-11-20 2023-03-07 海光信息技术股份有限公司 Data prefetching method and data processing apparatus

Also Published As

Publication number Publication date
WO2022222377A1 (en) 2022-10-27


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination