CN115237585A - Memory controller, data reading method and memory system


Info

Publication number
CN115237585A
CN115237585A (application CN202111082943.XA)
Authority
CN
China
Prior art keywords
page
cache
data
read address
memory controller
Prior art date
Legal status
Pending
Application number
CN202111082943.XA
Other languages
Chinese (zh)
Inventor
周轶刚
朱晓明
Current Assignee
XFusion Digital Technologies Co Ltd
Original Assignee
XFusion Digital Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by XFusion Digital Technologies Co Ltd
Priority to PCT/CN2021/120427 (published as WO2022222377A1)
Publication of CN115237585A
Legal status: Pending

Classifications

    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5016 Allocation of resources to service a request, the resource being the memory
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/023 Free address space management
    • G06F12/0811 Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
    • G06F12/0882 Cache access modes: page mode
    • G06F12/1027 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/061 Improving I/O performance
    • G06F3/0631 Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • G06F3/0656 Data buffering arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

Embodiments of this application disclose a memory controller, a data reading method, and a memory system, for improving data reading efficiency. The memory controller of the embodiments includes a host-side interface, a first-level cache, a page table buffer, and a second-level cache, where the page table buffer stores cache entries corresponding to the second-level cache, each entry indicating a data page stored in the second-level cache. The memory controller receives a read instruction carrying a read address through the host-side interface; when it determines a first-level cache miss according to the read address, it queries the page table buffer according to the read address and, when it determines that the data page corresponding to the read address is cached in the second-level cache, reads the data corresponding to the read address from the second-level cache.

Description

Memory controller, data reading method and memory system
Technical Field
Embodiments of this application relate to the field of computers, and in particular to a memory controller, a data reading method, and a memory system.
Background
Memory is an indispensable component of a server, accounting for roughly 30%-40% of the cost of the whole system. Reducing memory cost without reducing performance, or while reducing it only slightly, has therefore become an important means of lowering the system's Total Cost of Ownership (TCO), and memory technology is a research focus of major server vendors and cloud operators. Compressing memory data with a hardware compression engine, or replacing traditional memory with a newer medium (e.g., non-volatile memory) that has higher latency but lower cost, can significantly reduce memory cost, but the resulting increase in memory access latency hurts application performance.
Disclosure of Invention
Embodiments of this application provide a memory controller, a data reading method, and a memory system, to mitigate the increased latency of memory access.
In a first aspect, an embodiment of this application provides a memory controller that includes a host-side interface, a first-level cache, a page table buffer, and a second-level cache. The page table buffer stores cache entries corresponding to the second-level cache, each entry indicating a data page stored in the second-level cache. The memory controller is configured to receive a read instruction through the host-side interface, determine a first-level cache miss according to the read address carried by the read instruction, then query the page table buffer according to the read address, and read the data corresponding to the read address from the second-level cache when it determines that the data page corresponding to the read address is cached in the second-level cache.
The memory controller defined in this embodiment includes a second-level cache and a page table buffer, and the cache entries in the page table buffer record which data pages are cached in the second-level cache. Data in the second-level cache is thereby used effectively, reads from memory are reduced, the long latency of reading data from memory is largely avoided, and data access efficiency is improved.
Further, when it is determined that the data page corresponding to the read address is not cached in the second-level cache, the memory controller is configured to read the data corresponding to the read address from memory through the memory interface.
In a possible embodiment, the memory controller is configured to cache the data page in the second-level cache after reading the data page corresponding to the read address from memory through the memory interface, and to add a cache entry corresponding to the read address to the page table buffer. Adding missed data pages to the second-level cache and adding cache entries to the page table buffer increases the amount of cached page data, further increasing the likelihood of data access hits.
The memory controller is further configured to evict a target cache entry according to an eviction policy and write the data page corresponding to the target cache entry back to memory. The eviction policy may be LRU or a minimum-cached-pages rule, under which the huge page with the fewest cached data pages is evicted first.
In another possible embodiment, the second-level cache stores decompressed data of data pages while the memory stores their compressed data; the second-level cache is configured to prefetch compressed data from memory and cache the corresponding decompressed data.
An embodiment of this application further provides a format for the read address and a format for the cache entry. Illustratively, the read address includes a page tag, a page index, and an in-page offset, and the cache entry includes a page tag, a cache tag, and a huge page index, where the huge page index indicates the address of a huge page in the second-level cache and the cache tag indicates whether a data page is cached in the huge page.
The memory controller is further configured to query the page table buffer according to the page tag in the read address and determine whether a cache entry corresponding to the page tag exists in the page table buffer. If it does, the controller further checks the cache tag corresponding to the page index of the read address to determine whether the data page corresponding to the read address is cached in the second-level cache, and if so, reads the data corresponding to the read address from the second-level cache.
In another possible implementation, after the data page corresponding to the read address has been cached in the second-level cache, the memory controller is further configured to construct a cache address corresponding to the read address, where the cache address BPA is:
BPA = Huge Page Index * M + Page Index * N + Page offset
where Huge Page Index is the huge page index, Page Index is the page index, Page offset is the in-page offset, M is the huge page size, and N is the data page size.
The memory controller is specifically configured to read the data corresponding to the read address from the second-level cache according to the cache address.
The number of bits of the page index is x, and the number of bits of the cache tag is 2^x, where x is an integer greater than 0.
Illustratively, the page tag is 19 bits, the page index is 9 bits, and the in-page offset is 12 bits; the cache tag is then 2^9 bits (512 bits).
In one possible implementation, the first level cache is an SRAM and the second level cache is a DRAM.
In one possible implementation, the page table buffer comprises a first-level page table buffer and a second-level page table buffer, where the number of cache entries in the first-level page table buffer is smaller than that in the second-level page table buffer. Illustratively, the first-level page table buffer holds 128 cache entries and the second-level page table buffer holds 16K.
In a second aspect, an embodiment of this application provides a data reading method for a memory controller, where the memory controller includes a host-side interface, a first-level cache, a page table buffer, and a second-level cache, the page table buffer stores cache entries corresponding to the second-level cache, and each cache entry indicates a data page stored in the second-level cache.
the method comprises the following steps:
the memory controller receives a read instruction through the host side interface, wherein the read instruction carries a read address;
the memory controller determines a first-level cache miss according to the read address;
The memory controller queries the page table buffer according to the read address and, when it determines that the data page corresponding to the read address is cached in the second-level cache, reads the data corresponding to the read address from the second-level cache.
In one possible embodiment, the method further comprises:
When it is determined that the data page corresponding to the read address is not cached in the second-level cache, the memory controller reads the data corresponding to the read address from memory through a memory interface.
Further, the method may further include:
After reading the data page corresponding to the read address from memory through the memory interface, the memory controller caches the data page in the second-level cache and adds a cache entry corresponding to the read address to the page table buffer.
Further, the method may further include:
The memory controller evicts a target cache entry according to an eviction policy and writes the data page corresponding to the target cache entry back to memory.
In a possible implementation, the second-level cache stores decompressed data of data pages while the memory stores their compressed data; the second-level cache is configured to prefetch compressed data from memory and cache the corresponding decompressed data.
Illustratively, the read address includes a page tag, a page index, and an in-page offset, and the cache entry includes a page tag, a cache tag, and a huge page index, where the huge page index indicates the address of a huge page in the second-level cache and the cache tag indicates whether a data page is cached in the huge page.
At this time, the memory controller querying the page table buffer according to the read address and reading the data corresponding to the read address from the second-level cache when it determines that the corresponding data page is cached there includes:
the memory controller queries the page table buffer according to the page tag in the read address and determines whether a cache entry corresponding to the page tag exists in the page table buffer; if so, it further checks the cache tag corresponding to the page index of the read address to determine whether the data page corresponding to the read address is cached in the second-level cache, and if so, reads the data corresponding to the read address from the second-level cache.
In another possible implementation, after it is determined that the data page corresponding to the read address is cached in the second-level cache, the method further includes:
the memory controller constructs a cache address corresponding to the read address, where the cache address BPA is:
BPA = Huge Page Index * M + Page Index * N + Page offset
where Huge Page Index is the huge page index, Page Index is the page index, Page offset is the in-page offset, M is the huge page size, and N is the data page size;
the memory controller reads the data corresponding to the read address from the second-level cache according to the cache address.
In a third aspect, an embodiment of the present application further provides a memory system, which includes a memory and the memory controller according to the first aspect.
In a fourth aspect, an embodiment of this application further provides a chip, which includes a storage medium and hardware processing logic, where the storage medium stores instructions and the hardware processing logic is configured to execute the instructions in the storage medium to implement the method steps described in the second aspect or any one of its possible implementations.
In a fifth aspect, an embodiment of the present application further provides a server, including a processor and the memory system according to the third aspect.
In a sixth aspect, embodiments of the present application provide a computer-readable storage medium, in which a computer program or instructions are stored, and when the computer program or instructions are executed by a processor in a server, the computer program or instructions are used to implement the operation steps of the method described in the second aspect or any one of the possible implementation manners of the second aspect.
In a seventh aspect, an embodiment of the present application provides a computer program product, where the computer program product includes instructions that, when executed on a server or a terminal, cause the server or the terminal to execute the instructions to implement the operation steps of the method described in the second aspect or any one of the possible implementation manners of the second aspect.
On the basis of the implementations provided by the above aspects, this application may further combine them to provide additional implementations.
Drawings
FIG. 1 is a schematic structural diagram of a memory controller according to an embodiment of this application;
FIG. 2 is a schematic structural diagram of a memory system according to an embodiment of this application;
FIG. 3 is a schematic diagram of the format of a read address according to an embodiment of this application;
FIG. 4 is a schematic diagram of a data cache structure according to an embodiment of this application;
FIG. 5 is a schematic diagram of another data cache structure according to an embodiment of this application;
FIG. 6 is a flowchart of a method for a memory controller to read data according to an embodiment of this application.
Detailed Description
The terms "comprises" and "comprising," and any variations thereof, in the description and claims of this application and the above-described drawings, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In a memory-centric computing architecture, real-time compression and decompression of data in a large-capacity memory has become an important means of reducing memory cost; for example, compressing in units of 4K pages reduces cost but greatly increases the latency of CPU memory access. Data can be cached in a large-capacity Cache, but this is itself costly, and its effect is limited in large-capacity random access scenarios.
In embodiments of this application, a second-level cache of a certain capacity based on a conventional memory medium (e.g., Low Power Double Data Rate SDRAM, LPDDR) is introduced into a serial memory chip to cache decompressed data, and a small Static Random-Access Memory (SRAM) is introduced to support a huge-page TLB (Translation Lookaside Buffer) index, reducing the multiple memory accesses caused by a conventional page table walk and improving data reading efficiency. Specifically, the embodiments combine a small-capacity cache (SRAM), a medium-capacity uncompressed memory, and a large-capacity compressed memory or PCM-medium memory on the external serial memory chip, reducing memory cost while avoiding the increased access latency the CPU would otherwise incur when accessing a large-capacity compressed memory.
FIG. 1 is a schematic structural diagram of a memory controller according to an embodiment of this application. The memory controller 100 includes a host-side interface 101, a first-level cache 102, a page table buffer 103, and a second-level cache 104, where the page table buffer 103 stores cache entries corresponding to the second-level cache 104, each entry indicating a data page stored in the second-level cache 104. The memory controller 100 is configured to receive a read instruction through the host-side interface 101, determine that the first-level cache 102 misses according to the read address carried by the read instruction, then query the page table buffer 103 according to the read address, and, when it determines that the data page corresponding to the read address is cached in the second-level cache 104, read the data corresponding to the read address from the second-level cache 104.
The memory controller further includes a memory interface 105.
The first-level cache 102 stores cachelines and their corresponding tags. The memory controller 100 first looks up the data to be read in the first-level cache 102 according to the read address; on a hit, the matching cacheline is read directly from the first-level cache 102.
The memory controller further includes serial-to-parallel conversion logic.
The memory controller defined in this embodiment includes a second-level cache and a page table buffer, and the cache entries in the page table buffer record which data pages are cached in the second-level cache. Data in the second-level cache is thereby used effectively, reads from memory are reduced, the long latency of reading data from memory is largely avoided, and data access efficiency is improved.
FIG. 2 is a schematic structural diagram of a memory system according to an embodiment of this application. The memory system 200 includes the memory controller 100 and a memory 205.
When it is determined that the data page corresponding to the read address is not cached in the second-level cache 104, the memory controller 100 is further configured to read the data corresponding to the read address from the memory 205 through the memory interface 105.
Specifically, the data read from the memory 205 may be compressed; after reading the compressed data, the memory controller 100 stores the corresponding decompressed data in the second-level cache 104. That is, after reading the data page corresponding to the read address from the memory 205 through the memory interface 105, the memory controller 100 caches the data page in the second-level cache 104 and adds a cache entry corresponding to the read address to the page table buffer 103.
The memory controller 100 is further configured to evict a target cache entry from the page table buffer 103 according to an eviction policy and write the data page corresponding to the target cache entry back to the memory 205. The eviction policy may be Least Recently Used (LRU) or a minimum-cached-pages rule, under which the huge page with the fewest cached data pages is evicted first.
In another possible implementation, the second-level cache 104 stores decompressed data of data pages while the memory 205 stores their compressed data; the second-level cache 104 is configured to prefetch compressed data from the memory 205 and cache the corresponding decompressed data.
FIG. 3 shows the format of a read address according to an embodiment of this application: the read address consists of a Page Tag, a Page Index, and an in-page Offset. Illustratively, the page tag is 19 bits, the page index is 9 bits, and the in-page offset is 12 bits.
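For illustration, this field split can be expressed as a minimal C sketch; the 19/9/12-bit widths are the exemplary values just given, and the helper names are ours rather than part of the embodiment:

#include <stdint.h>

/* Exemplary split of a 40-bit physical address:
 * 19-bit page tag | 9-bit page index | 12-bit in-page offset. */
#define PAGE_OFFSET_BITS 12
#define PAGE_INDEX_BITS   9
#define PAGE_TAG_BITS    19

static inline uint32_t page_offset(uint64_t pa) {
    return (uint32_t)(pa & ((1u << PAGE_OFFSET_BITS) - 1));
}

static inline uint32_t page_index(uint64_t pa) {
    return (uint32_t)((pa >> PAGE_OFFSET_BITS) & ((1u << PAGE_INDEX_BITS) - 1));
}

static inline uint32_t page_tag(uint64_t pa) {
    return (uint32_t)((pa >> (PAGE_OFFSET_BITS + PAGE_INDEX_BITS)) &
                      ((1u << PAGE_TAG_BITS) - 1));
}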
FIG. 4 shows a data cache structure according to an embodiment of this application. The first-level cache records cachelines (CL) and CL Tags, and the page table buffer records cache entries. Illustratively, a cache entry contains a Page Tag, a cache tag (Buffered Flag), and a Huge Page Index; one cache entry corresponds to one huge page in the second-level cache, and each data page within a huge page corresponds to one compressed page in memory. Illustratively, as shown in FIG. 5, the first-level cache is 64M, and the cache tag in a page table buffer entry is 2^9 bits (512 bits); each bit of the cache tag corresponds to one data page in the huge page, and when that data page has cached the page prefetched from memory, the bit may be set to 1, and otherwise to 0.
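A cache entry of this shape can be sketched in C as follows; the struct layout and helper names are illustrative assumptions, not the claimed entry format:

#include <stdbool.h>
#include <stdint.h>

/* One page table buffer entry: page tag, 512-bit buffered-flag bitmap
 * (one bit per 4K data page of the 2M huge page), and the huge page
 * index locating the huge page in the second-level cache. */
typedef struct {
    uint32_t page_tag;          /* compared against bits [39:21] of the address */
    uint64_t buffered_flag[8];  /* 8 x 64 = 512 bits */
    uint32_t huge_page_index;   /* which 2M slot of the second-level cache */
    bool     valid;
} tlb_entry_t;

/* Is the 4K page at idx already decompressed into the second-level cache? */
static inline bool page_is_buffered(const tlb_entry_t *e, uint32_t idx) {
    return (e->buffered_flag[idx >> 6] >> (idx & 63)) & 1u;
}

/* Set the bit once the page has been fetched from memory and decompressed. */
static inline void mark_page_buffered(tlb_entry_t *e, uint32_t idx) {
    e->buffered_flag[idx >> 6] |= 1ull << (idx & 63);
}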
The memory controller is further configured to query the page table buffer according to the page tag in the read address and determine whether a cache entry corresponding to the page tag exists in the page table buffer. If it does, the controller further checks the cache tag corresponding to the page index of the read address to determine whether the data page corresponding to the read address is cached in the second-level cache, and if so, reads the data corresponding to the read address from the second-level cache.
In another possible implementation, after the data page corresponding to the read address has been cached in the second-level cache, the memory controller is further configured to construct a cache address corresponding to the read address, where the cache address BPA is:
BPA = Huge Page Index * M + Page Index * N + Page offset
where Huge Page Index is the huge page index, Page Index is the page index, Page offset is the in-page offset, M is the huge page size, and N is the data page size.
With the parameter sizes described in FIG. 5 above, M is 2M and N is 4K. The second-level cache then has a capacity of 32G and contains 16K huge pages of 2M each. In a specific implementation, the size and number of huge pages can be adjusted flexibly according to the actual physical memory size; details are not repeated in this embodiment.
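As a minimal sketch of this address construction, assuming the exemplary M = 2M and N = 4K sizes above:

#include <stdint.h>

#define HUGE_PAGE_SIZE (2ull * 1024 * 1024)  /* M = 2M */
#define DATA_PAGE_SIZE (4ull * 1024)         /* N = 4K */

/* BPA = Huge Page Index * M + Page Index * N + Page offset */
static inline uint64_t buffered_physical_address(uint32_t hp_idx,
                                                 uint32_t pg_idx,
                                                 uint32_t pg_off) {
    return hp_idx * HUGE_PAGE_SIZE + pg_idx * DATA_PAGE_SIZE + pg_off;
}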
The first-level cache in the memory controller may use a relatively fast Static Random-Access Memory (SRAM) and store data in the form of cache lines, indexed by CL tags. The first-level cache is queried according to the physical address carried by the read instruction; if the corresponding tag is found, the first-level cache hits and the matching cacheline data is read out.
The second-level cache in the memory controller may use DRAM. Illustratively, in conjunction with FIG. 5, data in the second-level cache is stored in 2M huge pages, each consisting of 512 pages of 4K.
The memory controller uses the page table buffer to index the second-level cache. Illustratively, a two-level page table buffer may be used, i.e., a first-level page table buffer and a second-level page table buffer, where the first-level page table buffer holds fewer cache entries than the second-level page table buffer. Illustratively, the first-level page table buffer holds 128 cache entries and the second-level page table buffer holds 16K.
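The lookup order can be sketched as follows, reusing the tlb_entry_t sketch above; the linear search is purely for clarity (real hardware would use associative lookup), and the 128/16K sizes are the exemplary values:

#include <stddef.h>

#define L1_TLB_ENTRIES 128
#define L2_TLB_ENTRIES (16 * 1024)

static tlb_entry_t l1_tlb[L1_TLB_ENTRIES];   /* first-level page table buffer */
static tlb_entry_t l2_tlb[L2_TLB_ENTRIES];   /* second-level page table buffer */

static tlb_entry_t *tlb_find(tlb_entry_t *tbl, size_t n, uint32_t tag) {
    for (size_t i = 0; i < n; i++)
        if (tbl[i].valid && tbl[i].page_tag == tag)
            return &tbl[i];
    return NULL;
}

/* Consult the two levels in sequence: the small first level, then the larger second level. */
static tlb_entry_t *tlb_lookup(uint32_t tag) {
    tlb_entry_t *e = tlb_find(l1_tlb, L1_TLB_ENTRIES, tag);
    return e ? e : tlb_find(l2_tlb, L2_TLB_ENTRIES, tag);
}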
The page tag in the cache entry corresponds to the page tag of the physical address carried by the read instruction.
In connection with the example of FIG. 5, the cache tag may be 512 bits, each bit corresponding to one 4K page of the 2M huge page and indicating whether that 4K page has been decompressed and cached in the second-level cache (illustratively, 0: not cached, 1: cached). The Huge Page Index in the cache entry indicates the actual address of the current 2M huge page in the second-level cache, so the memory controller can determine the address of the corresponding huge page through the Huge Page Index. Correspondingly, if the host-side physical address hits in the second-level cache, the Buffered Physical Address (BPA) of the data in the second-level cache can be computed as:
BPA = Huge Page Index * 2M + Page Index * 4K + Page offset
With a 40-bit memory address, 1T of memory space can be addressed.
In the read address format of FIG. 3, the upper 19 bits of the memory physical address form the page tag, used to look up a 2M huge page in the page table buffer; the middle 9 bits give the address of the specific 4K page to be accessed within the 2M huge page; and the lower 12 bits (the page offset) give the final address within the 4K page.
FIG. 6 is a flowchart of a method for a memory controller to read data according to an embodiment of this application. The method includes:
601: The memory controller receives a physical address, sent by the host side, for accessing memory space. It first queries the first-level cache to check whether the physical address hits; if not, step 602 is performed; if it hits, the cacheline data is returned;
602: The memory controller queries the page table buffer (TLB) based on the page tag in the physical address and determines the cache entry corresponding to the page tag. If the cache tag bit corresponding to the physical address in that entry is 1 (illustratively, 1 indicates the corresponding data page is cached in the second-level cache), the controller takes the huge page index contained in the entry, computes the address of the data to be accessed in the second-level cache, and performs step 603; if the bit is 0 (illustratively, 0 indicates the data page is not cached in the second-level cache), step 604 is performed;
Specifically, when the page table buffer contains two levels of TLB, the two levels are consulted in sequence.
603: The memory controller reads the data from the second-level cache according to the computed address;
604: The memory controller reads the data from memory according to the physical address.
With reference to FIG. 5, the memory controller may read the 2K compressed data from memory according to the physical address, decompress it, store the resulting 4K data in the corresponding 4K data page of the huge page in the second-level cache, and set the cache tag bit of that data page in the cache entry to 1.
Further, if no matching cache entry is found in the page table buffer, the actions of step 604 are likewise performed; the full read path is sketched below.
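Steps 601-604 can be summarized in one hypothetical read routine that reuses the sketches above; l1_cache_read(), l2_cache_read(), mem_read_compressed(), decompress_2k_to_4k(), tlb_install() and l2_cache_fill() are stand-in names for controller internals, declared here only as prototypes:

#include <string.h>

/* Hypothetical controller primitives (prototypes only in this sketch). */
int  l1_cache_read(uint64_t pa, void *buf, size_t len);      /* 0 on hit */
int  l2_cache_read(uint64_t bpa, void *buf, size_t len);
void mem_read_compressed(uint64_t pa, void *dst, size_t len);
void decompress_2k_to_4k(const void *src, void *dst);
tlb_entry_t *tlb_install(uint32_t tag);
void l2_cache_fill(uint32_t hp_idx, uint32_t pg_idx, const void *page);

int controller_read(uint64_t pa, void *buf, size_t len) {
    if (l1_cache_read(pa, buf, len) == 0)               /* 601: first-level cache hit */
        return 0;

    tlb_entry_t *e = tlb_lookup(page_tag(pa));          /* 602: query the page table buffer */
    if (e && page_is_buffered(e, page_index(pa))) {
        uint64_t bpa = buffered_physical_address(e->huge_page_index,
                                                 page_index(pa),
                                                 page_offset(pa));
        return l2_cache_read(bpa, buf, len);            /* 603: read from the second-level cache */
    }

    /* 604: fetch the 2K compressed page, decompress to 4K, install it in the
     * second-level cache, and record it in the page table buffer. */
    uint8_t compressed[2048], page[4096];
    mem_read_compressed(pa, compressed, sizeof compressed);
    decompress_2k_to_4k(compressed, page);
    if (!e)
        e = tlb_install(page_tag(pa));                  /* add a new cache entry */
    l2_cache_fill(e->huge_page_index, page_index(pa), page);
    mark_page_buffered(e, page_index(pa));
    memcpy(buf, page + page_offset(pa), len);           /* assumes the read fits in one page */
    return 0;
}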
Furthermore, the memory controller may evict a target cache entry according to an eviction policy and write the data page corresponding to the target cache entry back to memory. The eviction policy may be LRU or the minimum-cached-pages rule, under which the huge page with the fewest cached data pages is evicted first.
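Under the minimum-cached-pages rule, a victim can be chosen with a population count over the 512-bit cache tag, as in this sketch (the GCC/Clang __builtin_popcountll builtin is assumed):

/* Count how many 4K pages of the entry's huge page are currently cached. */
static int buffered_page_count(const tlb_entry_t *e) {
    int n = 0;
    for (int i = 0; i < 8; i++)
        n += __builtin_popcountll(e->buffered_flag[i]);
    return n;
}

/* Pick the valid entry whose huge page caches the fewest data pages; its
 * data pages are written back to memory before the entry is reused. */
static tlb_entry_t *pick_eviction_victim(tlb_entry_t *tbl, size_t entries) {
    tlb_entry_t *victim = NULL;
    int best = 513;                       /* above any possible count (0..512) */
    for (size_t i = 0; i < entries; i++) {
        if (!tbl[i].valid)
            continue;
        int n = buffered_page_count(&tbl[i]);
        if (n < best) {
            best = n;
            victim = &tbl[i];
        }
    }
    return victim;
}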
Embodiments of this application provide a serial memory controller with two levels of cache and, by caching decompressed data inside the memory controller, a low-cost, low-latency memory access scheme.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, for the specific working processes of the memory controller, the method, and the memory system described above, reference may be made to the corresponding processes in the foregoing embodiments; details are not repeated here.

Claims (22)

1. A memory controller comprising a host-side interface, a first level cache, a page table buffer, and a second level cache, wherein the page table buffer stores a cache entry corresponding to the second level cache, the cache entry indicating a data page stored in the second level cache,
the memory controller is used for receiving a read instruction through the host side interface, and the read instruction carries a read address;
the memory controller is further configured to determine a first level cache miss according to the read address;
the memory controller is further configured to query the page table buffer according to the read address, and read data corresponding to the read address from the secondary cache when it is determined that the data page corresponding to the read address is cached in the secondary cache.
2. The memory controller of claim 1, wherein when it is determined that the data page corresponding to the read address is not cached in the second-level cache, the memory controller is further configured to read the data corresponding to the read address from the memory through a memory interface.
3. The memory controller of claim 2,
wherein the memory controller is configured to cache the data page in the second-level cache after reading the data page corresponding to the read address from the memory through a memory interface, and to add a cache entry corresponding to the read address to the page table buffer.
4. The memory controller of claim 3,
the memory controller is further configured to evict a target cache entry according to an eviction policy, and to write the data page corresponding to the target cache entry back to the memory.
5. The memory controller of any of claims 1-4,
the second-level cache is used for prefetching the compressed data in the memory and caching the decompressed data corresponding to the compressed data.
6. The memory controller of any of claims 1-5,
the read address comprises a page tag, a page index, and an in-page offset, and the cache entry comprises a page tag, a cache tag, and a huge page index, wherein the huge page index is used for indicating the address of a huge page in the second-level cache, and the cache tag is used for indicating whether a data page is cached in the huge page.
7. The memory controller of claim 6,
the memory controller is further configured to query the page table buffer according to a page tag in a read address, determine whether a cache entry corresponding to the page tag exists in the page table buffer, if so, further query a cache tag corresponding to a page index of the read address, determine whether a data page corresponding to the read address is cached in the second-level cache, and if so, read data corresponding to the read address from the second-level cache.
8. The memory controller of claim 7,
after the data page corresponding to the read address is cached in the second-level cache, the memory controller is further configured to construct a cache address corresponding to the read address, where the cache address BPA is
BPA=Huge Page Index*M+Page Index*N+Page offset
wherein Huge Page Index is the huge page index, Page Index is the page index, Page offset is the in-page offset, M is the huge page size, and N is the data page size, and
the memory controller is specifically configured to read data corresponding to the read address from the second-level cache according to the cache address.
9. The memory controller of claim 7,
wherein if the number of bits of the page index is x, the number of bits of the cache tag is 2^x.
10. The memory controller of any of claims 1-9,
the first-level cache is an SRAM, and the second-level cache is a DRAM.
11. The memory controller of any one of claims 1-10,
the page table buffer comprises a first-level page table buffer and a second-level page table buffer, wherein the number of cache entries in the first-level page table buffer is smaller than the number of cache entries in the second-level page table buffer.
12. A data reading method of a memory controller is characterized in that the memory controller comprises a host side interface, a first level cache, a page table buffer and a second level cache, wherein the page table buffer stores cache entries corresponding to the second level cache, the cache entries are used for indicating data pages stored in the second level cache,
the method comprises the following steps:
the memory controller receives a read instruction through the host side interface, wherein the read instruction carries a read address;
the memory controller determines a first-level cache miss according to the read address;
the memory controller queries the page table buffer according to the read address, and reads the data corresponding to the read address from the second-level cache when determining that the data page corresponding to the read address is cached in the second-level cache.
13. The method of claim 12, wherein the method further comprises:
when it is determined that the data page corresponding to the read address is not cached in the second-level cache, the memory controller reads the data corresponding to the read address from the memory through the memory interface.
14. The method of claim 13, wherein the method further comprises:
after reading the data page corresponding to the read address from the memory through the memory interface, the memory controller caches the data page in the second-level cache and adds a cache entry corresponding to the read address to the page table buffer.
15. The method of claim 14, wherein the method further comprises:
the memory controller evicts a target cache entry according to an eviction policy and writes the data page corresponding to the target cache entry back to the memory.
16. The method of any one of claims 12-15,
the second-level cache is used for prefetching the compressed data in the memory and caching the decompressed data corresponding to the compressed data.
17. The method of any one of claims 12-16,
the read address comprises a page tag, a page index, and an in-page offset, and the cache entry comprises a page tag, a cache tag, and a huge page index, wherein the huge page index is used for indicating the address of a huge page in the second-level cache, and the cache tag is used for indicating whether a data page is cached in the huge page.
18. The method of claim 17, wherein the memory controller querying the page table buffer according to the read address, and reading the data corresponding to the read address from the second-level cache when it is determined that the data page corresponding to the read address is already cached in the second-level cache, comprises:
the memory controller queries the page table buffer according to the page tag in the read address and determines whether a cache entry corresponding to the page tag exists in the page table buffer; if so, it further queries the cache tag corresponding to the page index of the read address to determine whether the data page corresponding to the read address is cached in the second-level cache, and if so, reads the data corresponding to the read address from the second-level cache.
19. The method of claim 18, wherein after determining that the data page corresponding to the read address is cached in the second-level cache, the method further comprises:
the memory controller constructs a cache address corresponding to the read address, wherein the cache address BPA is as follows:
BPA=Huge Page Index*M+Page Index*N+Page offset
wherein Huge Page Index is the huge page index, Page Index is the page index, Page offset is the in-page offset, M is the huge page size, and N is the data page size, and
the memory controller reads the data corresponding to the read address from the second-level cache according to the cache address.
20. A memory system comprising a memory and a memory controller as claimed in any one of claims 1 to 11.
21. A chip comprising a storage medium having instructions stored therein and hardware processing logic to execute the instructions in the storage medium to implement the method of any of claims 12-19.
22. A server comprising a processor and the memory system of claim 20.
CN202111082943.XA 2021-04-23 2021-09-15 Memory controller, data reading method and memory system Pending CN115237585A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/120427 WO2022222377A1 (en) 2021-04-23 2021-09-24 Memory controller, data reading method, and memory system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110441816 2021-04-23
CN2021104418168 2021-04-23

Publications (1)

Publication Number Publication Date
CN115237585A (en) 2022-10-25

Family

ID=83666458

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111082943.XA Pending CN115237585A (en) 2021-04-23 2021-09-15 Memory controller, data reading method and memory system

Country Status (2)

Country Link
CN (1) CN115237585A (en)
WO (1) WO2022222377A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116501696B (en) * 2023-06-30 2023-09-01 之江实验室 Method and device suitable for distributed deep learning training prefetching cache management

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090006803A1 (en) * 2007-06-28 2009-01-01 David Arnold Luick L2 Cache/Nest Address Translation
GB2565069B (en) * 2017-07-31 2021-01-06 Advanced Risc Mach Ltd Address translation cache
CN109582600B (en) * 2017-09-25 2020-12-01 华为技术有限公司 Data processing method and device
CN112631962A (en) * 2019-09-24 2021-04-09 阿里巴巴集团控股有限公司 Storage management device, storage management method, processor and computer system
CN112527395B (en) * 2020-11-20 2023-03-07 海光信息技术股份有限公司 Data prefetching method and data processing apparatus

Also Published As

Publication number Publication date
WO2022222377A1 (en) 2022-10-27


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination