CN105279113B - Method, device and system for reducing DRAM Cache miss accesses - Google Patents

Method, device and system for reducing DRAM Cache miss accesses

Info

Publication number
CN105279113B
CN105279113B CN201410315469.4A
Authority
CN
China
Prior art keywords
cache
physical page
dram
request information
access
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410315469.4A
Other languages
Chinese (zh)
Other versions
CN105279113A (en)
Inventor
王琪
李佳芮
王东辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Acoustics CAS
Original Assignee
Institute of Acoustics CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Acoustics CAS filed Critical Institute of Acoustics CAS
Priority to CN201410315469.4A priority Critical patent/CN105279113B/en
Publication of CN105279113A publication Critical patent/CN105279113A/en
Application granted granted Critical
Publication of CN105279113B publication Critical patent/CN105279113B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The present invention relates to a method for reducing accesses on DRAM Cache misses. The method includes: when the global prediction result is a DRAM Cache hit, a MissMap receives request information sent by a processor, where the request information includes the physical page and Cache block information corresponding to the access address of an L2 Cache miss; the MissMap queries whether the physical page is equal to a locally prestored physical page, and when they are equal, queries the state of the bit vector corresponding to the Cache block and processes the request information according to that state, where the prestored physical pages are the physical pages most frequently accessed recently. With a small hardware overhead, the present invention reduces accesses to the DRAM Cache on misses: on the basis of global prediction, the MissMap makes a further prediction, reducing the cases in which a miss is mispredicted as a hit and thus reducing accesses to the DRAM Cache on misses. The present invention can therefore improve performance and reduce energy consumption.

Description

Method, device and system for reducing DRAM Cache miss accesses
Technical field
The present invention relates to the field of computer storage technology, and in particular to a method, device and system for reducing accesses on DRAM Cache misses.
Background technology
With the development of 3D-stacked technology, traditional dynamic random access memory (Dynamic Random Access Memory, DRAM) can be integrated into the processor. The most promising application is to use DRAM as a cache memory (Cache), placed between the SRAM-based (Static Random Access Memory) Cache and main memory, as shown in Fig. 1, which is a schematic diagram of the memory hierarchy. Compared with SRAM, a DRAM Cache can provide larger capacity and bandwidth, thereby improving system performance. Because the access latency of the DRAM Cache is smaller than that of off-chip main memory, a DRAM Cache hit reduces memory access latency; on a DRAM Cache miss, however, the access latency is longer than in a system without a DRAM Cache. Therefore, reducing accesses to the DRAM Cache on misses can improve system performance and also reduce system energy consumption.
A Cache stores two kinds of information: tags and data. In a traditional SRAM-based Cache, tags and data are stored in two arrays of different structure. For a DRAM Cache, one option is likewise to place the tags in an SRAM-based array, as shown in Fig. 2a. If a DRAM row is 2KB in size, it can hold 32 data blocks of 64B each; storing the 32 corresponding tags in SRAM would require 48MB of SRAM for a 500MB DRAM Cache, which is impractical. Recent research has therefore proposed storing tags and data together in DRAM, as shown in Fig. 2b: 3 blocks of a DRAM row hold 29 tags, and the remaining blocks hold the 29 data blocks. Although this method solves the problem of a large-capacity SRAM tag structure, it introduces two new problems. First, it increases the access latency of the DRAM Cache: on a DRAM Cache hit, the DRAM Cache must be accessed twice, once to read the tag and once to read the data. Second, compared with a system without a DRAM Cache, a DRAM Cache miss adds the tag-read access and thus increases the total memory access latency.
The traditional Cache access model is the serial access model (Serial Access Model, SAM), shown in Fig. 3a: the Cache access and the memory access are serial; the Cache is accessed first, and memory is accessed only if the Cache misses. Fig. 3b shows another Cache access model, the parallel access model (Parallel Access Model, PAM), in which the Cache access and the memory access proceed in parallel, with the Cache having higher priority than memory. In the PAM model, memory is accessed regardless of whether the Cache hits.
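For illustration, a behavioral C sketch of the two models; dram_cache_access and memory_access are assumed stand-ins for the hardware datapath (they are not from the patent), and PAM's parallelism is modeled sequentially:

```c
#include <stdint.h>

typedef uint64_t Addr;
typedef struct { int hit; uint64_t value; } CacheResult;

/* Assumed primitives standing in for the hardware datapath. */
extern CacheResult dram_cache_access(Addr a);
extern uint64_t    memory_access(Addr a);

/* SAM: serial - the DRAM Cache is accessed first, memory only on a miss. */
uint64_t sam_access(Addr a) {
    CacheResult r = dram_cache_access(a);
    return r.hit ? r.value : memory_access(a);
}

/* PAM: parallel - both requests are issued together (modeled here
 * sequentially); memory is accessed whether or not the Cache hits,
 * which hides miss latency but costs memory bandwidth. */
uint64_t pam_access(Addr a) {
    CacheResult r = dram_cache_access(a);
    uint64_t    m = memory_access(a);
    return r.hit ? r.value : m;
}
```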
To reduce accesses to the DRAM Cache on misses, on the basis of the structure of Fig. 2b, a first prior art proposed a hardware data structure called the MissMap, shown in Fig. 4, which is a schematic diagram of the MissMap structure. Each MissMap entry consists of two parts: a tag (Page tag) storing a physical page number, and a bit vector (Bit vector) recording whether each Cache block of the current page is present. To track the contents of the DRAM Cache exactly, the MissMap stores the physical page numbers of all pages in the DRAM Cache and the corresponding Cache block information of each page. Each time a new Cache block is inserted into the DRAM Cache, the corresponding bit of the corresponding physical page in the MissMap is set; conversely, each time a Cache block is evicted from the DRAM Cache, the corresponding bit of the corresponding physical page in the MissMap is cleared. In this way the MissMap keeps a record consistent with the current contents of the DRAM Cache. By looking up the bit value corresponding to a Cache block in the MissMap, the processor can determine whether that block is in the DRAM Cache: if the bit corresponding to the block to be accessed is 0, the block is not in the DRAM Cache, and the access to memory is issued directly without accessing the DRAM Cache. The MissMap structure thus lets the processor correctly determine whether an access to the DRAM Cache would miss or hit without accessing the DRAM Cache itself.
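As an illustration of this prior-art bookkeeping, a minimal C sketch of one MissMap entry and its maintenance on insertion and eviction, assuming the example geometry above (32 blocks of 64B per page, hence a 32-bit presence vector); the names are illustrative, not the patent's:

```c
#include <stdint.h>
#include <stdbool.h>

/* One MissMap entry: a page tag plus one presence bit per Cache block.
 * With 32 blocks of 64B per page, the vector is 32 bits. */
typedef struct {
    uint64_t page_tag;    /* physical page number */
    uint32_t bit_vector;  /* bit i == 1: block i of this page is in the DRAM Cache */
    bool     valid;
} MissMapEntry;

/* Called when a block is inserted into the DRAM Cache. */
static void missmap_on_insert(MissMapEntry *e, unsigned block) {
    e->bit_vector |= (1u << block);
}

/* Called when a block is evicted from the DRAM Cache. */
static void missmap_on_evict(MissMapEntry *e, unsigned block) {
    e->bit_vector &= ~(1u << block);
}
```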
Although the MissMap structure lets the processor correctly determine, without accessing the DRAM Cache, whether an access to the DRAM Cache would miss or hit, the drawback of this method is that its area overhead is too large. The MissMap is similar to a Cache structure; to track the contents of the DRAM Cache exactly, it must store the page information of every physical page in the DRAM Cache. For example, a 500MB DRAM Cache requires a 2MB MissMap.
Compared with the SAM model, the advantage of the PAM model is that it reduces the latency of accessing the Cache on a Cache miss; but because the PAM model accesses memory regardless of whether the Cache hits, it increases bandwidth consumption. A second prior art therefore proposed the dynamic access model (Dynamic Access Model, DAM), shown in Fig. 5. DAM combines the SAM and PAM models, using the SAM model to save bandwidth and the PAM model to reduce latency. This technique uses a 3-bit saturating counter as a memory-access predictor and dynamically selects between the two models according to the prediction result. The counter records whether each L2 cache (L2 Cache) miss causes a memory access or a DRAM Cache hit: it is decremented by one on a hit and incremented by one on a miss. When the most significant bit (Most Significant Bit, MSB) of the counter is 1, the PAM model is used on an L2 Cache miss; when the MSB is 0, the SAM model is used on an L2 Cache miss.
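A minimal C sketch of such a 3-bit saturating counter, assuming the counter starts mid-range; the function names are illustrative:

```c
#include <stdint.h>
#include <stdbool.h>

#define CTR_BITS 3
#define CTR_MAX  ((1u << CTR_BITS) - 1)    /* 7 */

static uint8_t ctr = 1u << (CTR_BITS - 1); /* assumed initial value: 4 */

/* Update on the outcome of each L2 Cache miss. */
static void predictor_update(bool dram_cache_hit) {
    if (dram_cache_hit) {
        if (ctr > 0) ctr--;        /* DRAM Cache hit: decrement toward SAM */
    } else {
        if (ctr < CTR_MAX) ctr++;  /* memory access (miss): increment toward PAM */
    }
}

/* MSB == 1 predicts a DRAM Cache miss, so PAM is used on an L2 Cache
 * miss; MSB == 0 predicts a hit, so SAM is used. */
static bool predict_use_pam(void) {
    return (ctr >> (CTR_BITS - 1)) & 1u;
}
```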
The second prior art uses history-based global prediction: a single saturating counter counts recent DRAM Cache accesses, and the count is used to predict subsequent DRAM Cache accesses. With only one global counter, however, prediction accuracy is relatively low. If a DRAM Cache hit is mispredicted as a miss, the PAM model is used and the impact on performance is small; but if a DRAM Cache miss is mispredicted as a hit, the latency of accessing the DRAM Cache on the miss is added, reducing system performance.
Therefore, in the prior art, handling DRAM Cache miss accesses suffers from excessive area overhead and relatively low prediction accuracy.
The content of the invention
The present invention aims to solve the problems of excessive area overhead and relatively low prediction accuracy when handling DRAM Cache miss accesses.
In a first aspect, an embodiment of the present invention provides a method for reducing DRAM Cache miss accesses, the method including:
when the global prediction result is a DRAM Cache hit, a MissMap receives request information sent by a processor, where the request information includes the physical page and Cache block information corresponding to the access address of an L2 Cache miss, and the Cache block information points to a Cache block;
the MissMap queries whether the physical page is equal to a locally prestored physical page; when the physical page is equal to the prestored physical page, the MissMap queries the state of the bit vector corresponding to the Cache block and processes the request information according to that state, where the prestored physical pages are the physical pages most frequently accessed recently.
Preferably, before the above steps, the method further includes:
a saturating counter evaluates the request information sent by the processor: when the type of the request information is an L2 Cache miss causing a memory access, the saturating counter is incremented by 1; when the type of the request information is an L2 Cache miss causing a DRAM Cache hit, the saturating counter is decremented by 1; when the MSB of the saturating counter is 1, the processor accesses memory using the PAM model.
Preferably, the state in the bit vector is represented by bits, and processing the request information according to the state of the bit vector corresponding to the Cache block specifically includes:
if the MissMap finds that the bit corresponding to the Cache block is 1, determining that the Cache block is in the DRAM Cache, in which case the processor accesses the DRAM Cache using the SAM model; or
if the MissMap finds that the bit corresponding to the Cache block is 0, determining that the Cache block is not in the DRAM Cache, in which case the processor accesses the DRAM Cache using the PAM model, where Cache blocks and bits correspond one to one.
Preferably, the MissMap uses the Least Recently Used (LRU) algorithm to obtain the physical pages most frequently accessed recently.
In a second aspect, an embodiment of the present invention provides a device for reducing DRAM Cache miss accesses, the device including: a receiving unit and a processing unit;
the receiving unit is configured to receive, when the global prediction result is a DRAM Cache hit, the request information sent by the processor, and to send the request information to the processing unit, where the request information includes the physical page and Cache block information corresponding to the access address of an L2 Cache miss, and the Cache block information points to a Cache block;
the processing unit is configured to receive the request information sent by the receiving unit, query whether the physical page is equal to a locally prestored physical page, and, when the physical page is equal to the prestored physical page, query the state of the bit vector corresponding to the Cache block and process the request information according to that state, where the prestored physical pages are the physical pages most frequently accessed recently.
Preferably, the state in the bit vector is represented by bits, and the processing unit is specifically configured to:
if the bit corresponding to the Cache block is found to be 1, determine that the Cache block is in the DRAM Cache, in which case the processor accesses the DRAM Cache using the SAM model; or
if the bit corresponding to the Cache block is found to be 0, determine that the Cache block is not in the DRAM Cache, in which case the processor accesses the DRAM Cache using the PAM model, where Cache blocks and bits correspond one to one.
In a third aspect, an embodiment of the present invention provides a system for reducing DRAM Cache miss accesses, the system including the device described in any of the above and a global predictor.
Preferably, the saturating counter is a 3-bit saturating counter.
With the method for reducing DRAM Cache miss accesses of the present invention, when the global prediction result is a DRAM Cache hit, the MissMap receives the request information sent by the processor, where the request information includes the physical page and Cache block information corresponding to the access address of an L2 Cache miss and the Cache block information points to a Cache block; the MissMap queries whether the physical page is equal to a locally prestored physical page, and when they are equal, queries the state of the bit vector corresponding to the Cache block and processes the request information according to that state, where the prestored physical pages are the physical pages most frequently accessed recently. The present invention reduces accesses to the DRAM Cache on misses with a small hardware overhead: on the basis of global prediction, the MissMap makes a further prediction of whether the access will hit, reducing the cases in which a miss is mispredicted as a hit and thus reducing accesses to the DRAM Cache on misses. The present invention can therefore improve performance and reduce energy consumption.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a schematic diagram of a memory hierarchy in the prior art;
Fig. 2a is a schematic diagram of SRAM-based Cache storage in the prior art;
Fig. 2b is a schematic diagram of DRAM-based Cache storage in the prior art;
Fig. 3a is a schematic diagram of the SAM access model in the prior art;
Fig. 3b is a schematic diagram of the PAM access model in the prior art;
Fig. 4 is a schematic diagram of the MissMap structure in the prior art;
Fig. 5 is a schematic diagram of the DAM model in the prior art;
Fig. 6 is a schematic diagram of the structure for reducing DRAM Cache miss accesses provided by Embodiment 1 of the present invention;
Fig. 7 is a flowchart of the method for reducing DRAM Cache miss accesses provided by Embodiment 1 of the present invention;
Fig. 8 is a schematic diagram of the device for reducing DRAM Cache miss accesses provided by Embodiment 2 of the present invention.
Embodiment
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
To facilitate understanding of the present invention, further explanation is given below with specific embodiments in conjunction with the accompanying drawings; the embodiments do not constitute a limitation on the present invention.
Fig. 7 is a flowchart of the method for reducing DRAM Cache miss accesses provided by Embodiment 1 of the present invention. As shown in Fig. 7, the executing body of this embodiment is the MissMap, and the method for reducing DRAM Cache miss accesses of the present invention includes:
S710: when the global prediction result is a DRAM Cache hit, the MissMap receives the request information sent by the processor, where the request information includes the physical page and Cache block information corresponding to the access address of an L2 Cache miss, and the Cache block information points to a Cache block;
Specifically, Fig. 6 is a schematic diagram of the structure for reducing DRAM Cache miss accesses provided by Embodiment 1 of the present invention. As shown in Fig. 6, the processor sends request information that includes the physical page and Cache block information corresponding to the access address of an L2 Cache miss; the physical page points to a row in the DRAM, and the Cache block information points to the Cache block to be accessed in that row. A saturating counter evaluates the request information; the saturating counter may be a 3-bit saturating counter acting as the global predictor. When the type of the request information is an L2 Cache miss causing a memory access, the saturating counter is incremented by 1; when the type of the request information is an L2 Cache miss causing a DRAM Cache hit, the saturating counter is decremented by 1. When the MSB of the saturating counter is 1, the processor accesses memory using the PAM model.
When the MSB of the saturating counter is 0, the global prediction result is a DRAM Cache hit; at this point the MissMap receives the request information sent by the processor and proceeds to the next step, S720.
S720: the MissMap queries whether the physical page is equal to a locally prestored physical page; when the physical page is equal to the prestored physical page, the MissMap queries the state of the bit vector corresponding to the Cache block and processes the request information according to the query result, where the prestored physical pages are the physical pages most frequently accessed recently.
Specifically, the MissMap may be a hardware data structure that contains the physical pages (page) most frequently accessed recently and the state of the bit vector (Block Presence Vector) corresponding to each physical page. The bit vector has as many bits as the physical page has Cache blocks (block); for example, there may be 32 Cache blocks, each corresponding to one bit. The bit indicates whether the Cache block is present: if the block is present, its bit is 1; if not, its bit is 0. Because the MissMap stores only the physical pages most frequently accessed recently, it holds only part of the physical pages of the DRAM Cache and the bit-vector states of the Cache blocks of those pages. The present invention exploits the spatial locality of memory accesses: only the pages most frequently accessed recently are stored in the MissMap, and only the Cache block information of those pages is recorded, rather than storing and recording all Cache pages of the DRAM. The area overhead is therefore greatly reduced.
Each time a new Cache block is inserted into the DRAM Cache, the corresponding bit of the corresponding physical page in the MissMap is set; conversely, each time a Cache block is evicted from the DRAM Cache, the corresponding bit of the corresponding physical page in the MissMap is cleared. The MissMap may use the Least Recently Used (Least Recently Used, LRU) algorithm to replace the physical pages least recently used, thereby keeping the physical pages most frequently used recently, and a bit V indicates whether a physical page is valid; for example, V=1 may indicate that the physical page is valid and V=0 that it is invalid.
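A C sketch of this replacement scheme for a small, fully associative MissMap, extending the earlier entry with the valid bit V and an LRU timestamp; the table size and the timestamp mechanism are assumptions, since the patent fixes neither:

```c
#include <stdint.h>

#define MISSMAP_ENTRIES 64   /* assumed capacity */

typedef struct {
    uint64_t page_tag;
    uint32_t bit_vector;
    uint64_t last_used;      /* LRU timestamp */
    uint8_t  valid;          /* the valid bit V */
} MissMapEntry;

static MissMapEntry missmap[MISSMAP_ENTRIES];
static uint64_t now;

/* Find the entry for a page, replacing the least recently used
 * (or an invalid) entry when the page is not present. */
static MissMapEntry *missmap_get(uint64_t page) {
    MissMapEntry *victim = &missmap[0];
    for (int i = 0; i < MISSMAP_ENTRIES; i++) {
        if (missmap[i].valid && missmap[i].page_tag == page) {
            missmap[i].last_used = ++now;   /* touch on access */
            return &missmap[i];
        }
        if (!missmap[i].valid || missmap[i].last_used < victim->last_used)
            victim = &missmap[i];
    }
    victim->page_tag   = page;  /* evict: V stays 1, presence bits reset */
    victim->bit_vector = 0;
    victim->valid      = 1;
    victim->last_used  = ++now;
    return victim;
}
```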
When the MissMap receives the request information sent by the processor, the request information carries the physical page and Cache block information corresponding to the access address of the L2 Cache miss, and the Cache block information points to the Cache block to be accessed; the physical page can be distinguished by its physical page number. First, the MissMap determines whether the physical page number is equal to a locally prestored physical page number. If they are equal, the Cache block to be accessed is located to a row in the DRAM; then, within that row, the MissMap determines whether the bit corresponding to the Cache block pointed to by the Cache block information is 1. When the bit is 1, it is further known that the Cache block at the address to be accessed is in the DRAM Cache, and the processor can access the DRAM Cache using the SAM model.
If the bit is 0, the Cache block is not in the DRAM Cache; the processor then accesses the DRAM Cache using the PAM model.
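Putting S710's prediction and S720's lookup together, a sketch of the final SAM/PAM decision, reusing the MissMap table from the preceding sketch. Note that the patent only specifies the behavior when the physical page matches a prestored page, so the fallback when no entry matches is an assumption:

```c
typedef enum { ACCESS_SAM, ACCESS_PAM } AccessMode;

/* Decide the access mode for one L2 Cache miss whose global prediction
 * was a DRAM Cache hit (counter MSB == 0). */
static AccessMode missmap_decide(uint64_t page, unsigned block) {
    for (int i = 0; i < MISSMAP_ENTRIES; i++) {
        if (missmap[i].valid && missmap[i].page_tag == page) {
            missmap[i].last_used = ++now;
            /* Bit set: the block is in the DRAM Cache; access it serially. */
            if ((missmap[i].bit_vector >> block) & 1u)
                return ACCESS_SAM;
            /* Bit clear: the block is absent; access Cache and memory in parallel. */
            return ACCESS_PAM;
        }
    }
    return ACCESS_PAM;  /* no matching page: assumed fallback */
}
```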
With the method for reducing DRAM Cache miss accesses provided by this embodiment of the present invention, when the global prediction result is a DRAM Cache hit, the MissMap receives the request information sent by the processor, where the request information includes the physical page and Cache block information corresponding to the access address of an L2 Cache miss and the Cache block information points to a Cache block. The MissMap queries whether the physical page is equal to a locally prestored physical page; when they are equal, it queries the state of the bit vector corresponding to the Cache block and processes the request information according to that state, where the prestored physical pages are the physical pages most frequently accessed recently. Accesses to the DRAM Cache on misses are reduced with a small hardware overhead: on the basis of a global prediction of a DRAM Cache hit, the MissMap makes a further prediction of whether the access will hit, reducing the cases in which a miss is mispredicted as a hit and thus reducing accesses to the DRAM Cache on misses. The present invention can therefore improve performance and reduce energy consumption.
Fig. 8 shows the device for reducing DRAM Cache miss accesses provided by Embodiment 2 of the present invention. As shown in Fig. 8, in this embodiment the device includes: a receiving unit 810 and a processing unit 820;
the receiving unit 810 is configured to receive, when the global prediction result is a DRAM Cache hit, the request information sent by the processor, and to send the request information to the processing unit, where the request information includes the physical page and Cache block information corresponding to the access address of an L2 Cache miss, and the Cache block information points to a Cache block; the processing unit 820 is configured to receive the request information sent by the receiving unit, query whether the physical page is equal to a locally prestored physical page, and, when the physical page is equal to the prestored physical page, query the state of the bit vector corresponding to the Cache block and process the request information according to that state, where the prestored physical pages are the physical pages most frequently accessed recently.
Optionally, the state in the bit vector is represented by bits, and the processing unit is specifically configured to: if the bit corresponding to the Cache block is found to be 1, determine that the Cache block is in the DRAM Cache, in which case the processor accesses the DRAM Cache using the SAM model; or
if the bit corresponding to the Cache block is found to be 0, determine that the Cache block is not in the DRAM Cache, in which case the processor accesses the DRAM Cache using the PAM model, where Cache blocks and bits correspond one to one.
By applying the device for reducing DRAM Cache miss accesses provided by this embodiment of the present invention, the receiving unit receives, when the global prediction result is a DRAM Cache hit, the request information sent by the processor and sends it to the processing unit, where the request information includes the physical page and Cache block information corresponding to the access address of an L2 Cache miss and the Cache block information points to a Cache block; the processing unit receives the request information sent by the receiving unit, queries whether the physical page is equal to a locally prestored physical page, and, when they are equal, queries the state of the bit vector corresponding to the Cache block and processes the request information according to that state, where the prestored physical pages are the physical pages most frequently accessed recently. Accesses to the DRAM Cache on misses are reduced with a small hardware overhead: on the basis of global prediction, the MissMap makes a further prediction of whether the access will hit, reducing the cases in which a miss is mispredicted as a hit and thus reducing accesses to the DRAM Cache on misses. The present invention can therefore improve performance and reduce energy consumption.
The present invention further includes a system for reducing DRAM Cache miss accesses, the system including the device described with reference to Fig. 8 and a saturating counter.
Those skilled in the art should further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functions in different ways for each particular application, but such implementations should not be considered beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium well known in the art.
The above embodiments further describe in detail the objectives, technical solutions and beneficial effects of the present invention. It should be understood that the above are only embodiments of the present invention and are not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (8)

1. A method for reducing DRAM Cache miss accesses, characterized in that the method includes:
when the global prediction result is a DRAM Cache hit, a MissMap receives request information sent by a processor, where the request information includes the physical page and Cache block information corresponding to the access address of a level-2 cache (L2 Cache) miss, and the Cache block information points to a Cache block;
the MissMap queries whether the physical page is equal to a locally prestored physical page; when the physical page is equal to the prestored physical page, the MissMap queries the state of the bit vector corresponding to the Cache block and processes the request information according to the state of the bit vector corresponding to the Cache block, where the prestored physical pages are the physical pages most frequently accessed recently.
2. The method of claim 1, characterized in that, before the above steps, the method further includes:
a saturating counter evaluates the request information sent by the processor: when the type of the request information is an L2 Cache miss causing a memory access, the saturating counter is incremented by 1; when the type of the request information is an L2 Cache miss causing a DRAM Cache hit, the saturating counter is decremented by 1; when the MSB of the saturating counter is 1, the processor accesses memory using the PAM model.
3. The method of claim 1, characterized in that the state in the bit vector is represented by bits, and processing the request information according to the state of the bit vector corresponding to the Cache block specifically includes:
if the MissMap finds that the bit corresponding to the Cache block is 1, determining that the Cache block is in the DRAM Cache, in which case the processor accesses the DRAM Cache using the SAM model; or
if the MissMap finds that the bit corresponding to the Cache block is 0, determining that the Cache block is not in the DRAM Cache, in which case the processor accesses the DRAM Cache using the PAM model, where Cache blocks and bits correspond one to one.
4. The method of claim 1, characterized in that the MissMap uses the Least Recently Used (LRU) algorithm to obtain the physical pages most frequently accessed recently.
5. A device for reducing DRAM Cache miss accesses, characterized in that the device includes: a receiving unit and a processing unit;
the receiving unit is configured to receive, when the global prediction result is a DRAM Cache hit, the request information sent by the processor, and to send the request information to the processing unit, where the request information includes the physical page and Cache block information corresponding to the access address of an L2 Cache miss, and the Cache block information points to a Cache block;
the processing unit is configured to receive the request information sent by the receiving unit, query whether the physical page is equal to a locally prestored physical page, and, when the physical page is equal to the prestored physical page, query the state of the bit vector corresponding to the Cache block and process the request information according to the state of the bit vector corresponding to the Cache block, where the prestored physical pages are the physical pages most frequently accessed recently.
6. The device of claim 5, characterized in that the state in the bit vector is represented by bits, and the processing unit is specifically configured to:
if the bit corresponding to the Cache block is found to be 1, determine that the Cache block is in the DRAM Cache, in which case the processor accesses the DRAM Cache using the SAM model; or
if the bit corresponding to the Cache block is found to be 0, determine that the Cache block is not in the DRAM Cache, in which case the processor accesses the DRAM Cache using the PAM model, where Cache blocks and bits correspond one to one.
7. A system for reducing DRAM Cache miss accesses, characterized in that the system includes a global predictor and a device as claimed in any one of claims 5 to 6.
8. The system of claim 7, characterized in that the global predictor is specifically a 3-bit saturating counter.
CN201410315469.4A 2014-07-03 2014-07-03 Method, device and system for reducing DRAM Cache miss accesses Expired - Fee Related CN105279113B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410315469.4A CN105279113B (en) 2014-07-03 2014-07-03 Method, device and system for reducing DRAM Cache miss accesses

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410315469.4A CN105279113B (en) 2014-07-03 2014-07-03 Method, device and system for reducing DRAM Cache miss accesses

Publications (2)

Publication Number Publication Date
CN105279113A CN105279113A (en) 2016-01-27
CN105279113B true CN105279113B (en) 2018-01-30

Family

ID=55148151

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410315469.4A Expired - Fee Related CN105279113B (en) 2014-07-03 2014-07-03 Method, device and system for reducing DRAM Cache miss accesses

Country Status (1)

Country Link
CN (1) CN105279113B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103019955A (en) * 2011-09-28 2013-04-03 Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences Memory management method based on application of PCRAM (phase change random access memory) main memory
CN103810113A (en) * 2014-01-28 2014-05-21 Huazhong University of Science and Technology Fusion memory system of nonvolatile memory and dynamic random access memory

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6789169B2 (en) * 2001-10-04 2004-09-07 Micron Technology, Inc. Embedded DRAM cache memory and method having reduced latency

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103019955A (en) * 2011-09-28 2013-04-03 Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences Memory management method based on application of PCRAM (phase change random access memory) main memory
CN103810113A (en) * 2014-01-28 2014-05-21 Huazhong University of Science and Technology Fusion memory system of nonvolatile memory and dynamic random access memory

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
A Mostly-Clean DRAM Cache for Effective Hit Speculation and Self-Balancing Dispatch; Jaewoong Sim et al.; 2012 IEEE/ACM 45th Annual International Symposium on Microarchitecture; 2012-12-05; pp. 247-257 *
Fundamental Latency Trade-offs in Architecting DRAM Caches: Outperforming Impractical SRAM-Tags with a Simple and Practical Design; Moinuddin K. Qureshi et al.; 2012 IEEE/ACM 45th Annual International Symposium on Microarchitecture; 2012-12-05; pp. 235-246 *
SUPPORTING VERY LARGE DRAM; Gabriel H. Loh et al.; IEEE Micro; 2012-06-30; vol. 32, no. 3; pp. 70-78 *
Research on an efficient and scalable hybrid storage with fine-grained cache management; Jiang Guosong; Computer Science; 2013-08-15; vol. 40, no. 8; pp. 79-83 *

Also Published As

Publication number Publication date
CN105279113A (en) 2016-01-27


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180130

Termination date: 20200703
