WO1996033462A1 - Antememoire - Google Patents

Info

Publication number
WO1996033462A1
Authority
WO
WIPO (PCT)
Prior art keywords
memory
cache
cache memory
data sequence
address
Prior art date
Application number
PCT/EP1995/001458
Other languages
German (de)
English (en)
Inventor
Udo Wille
Klaus-Jörg GETZLAFF
Birgit Withelm
Hans-Werner Tast
Original Assignee
International Business Machines Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corporation
Priority to PCT/EP1995/001458
Publication of WO1996033462A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806 Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0811 Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies

Definitions

  • the invention relates to a cache memory according to the preamble of claim 1.
  • the invention also relates to a microprocessor system and a method for operating a cache memory.
  • Cache memories are widely used. They generally serve to minimize the time that, e.g., a microprocessor or other system unit needs to access data. Such cache memories are known, e.g., from EP-A-0 301 921.
  • the cache memory associated with the microprocessor is the first level of cache memory.
  • This first level cache is also referred to as a first order cache, or L1 for short.
  • a second-order cache L2 is provided separately from the microprocessor with the associated first-order cache L1.
  • the cache memory L2 generally has a much larger storage capacity than the cache memory L1.
  • The cache memory L2 is therefore implemented on a separate chip. If the microprocessor needs input data during operation, for example, the system first checks whether the required data are available in the cache memory L1. If this check reveals that the required data are not in the cache memory L1, the corresponding check is carried out for the cache memory L2. If this check is successful for the cache memory L2, the corresponding data are copied into the cache memory L1. Before the required data are copied from the cache memory L2 into the cache memory L1, storage space for the data to be copied must be created in the cache memory L1.
  • a data sequence stored in the cache memory L1 is selected and removed from the cache memory L1.
  • the corresponding data can be stored back, for example, in the main working memory of the computer system.
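The lookup order described above (L1 first, then L2, with space freed in L1 before the copy) can be sketched as follows. This is a minimal illustration, not the patented circuit: the class name `TwoLevelCache`, the capacities, the choice of the oldest entry as the line to remove, and the fallback to a dictionary standing in for main memory are all assumptions made for the example.

```python
from collections import OrderedDict


class TwoLevelCache:
    """Toy L1/L2 hierarchy; capacities and victim choice are assumptions."""

    def __init__(self, l1_lines, l2_lines, main_memory):
        self.l1 = OrderedDict()    # address -> data, oldest entry first
        self.l2 = OrderedDict()
        self.l1_lines = l1_lines
        self.l2_lines = l2_lines
        self.memory = main_memory  # dict standing in for main memory

    def read(self, address):
        if address in self.l1:                 # 1. check L1 first
            self.l1.move_to_end(address)
            return self.l1[address], "L1"
        if address in self.l2:                 # 2. then check L2
            data, source = self.l2[address], "L2"
            self.l2.move_to_end(address)
        else:                                  # 3. assumed fallback: main memory
            data, source = self.memory[address], "memory"
            self._insert(self.l2, self.l2_lines, address, data)
        self._insert(self.l1, self.l1_lines, address, data)
        return data, source

    def _insert(self, cache, capacity, address, data):
        if len(cache) >= capacity:             # create space before copying
            victim, victim_data = cache.popitem(last=False)
            self.memory[victim] = victim_data  # store the removed line back
        cache[address] = data
```

For example, with a two-line L1, reading addresses 1, 2 and 3 in turn pushes address 1 out of L1, so a later read of address 1 is served from L2 and copied back into L1.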
  • the problem of ensuring the integrity of the data in the cache memories occurs in particular in multiprocessor systems.
  • This problem can be solved by implementing a so-called MESI cache protocol.
  • With the MESI cache protocol, each data sequence stored in the cache memories carries coded information indicating whether the data sequence has been modified in comparison with the content of the main memory, whether it is stored exclusively in only one of the cache memories, whether other cache memories also hold this data sequence, or whether the stored data sequence is invalid.
  • This information is generally encoded in three bits: the so-called valid bit (V bit), the change bit (C bit) and the multi-copy bit (MC bit).
  • registers are provided at least in the L2 caches, a triple of V, C and MC bits each indicating information required to maintain cache integrity.
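One way to read a triple of V, C and MC bits as a MESI state is sketched below. The text does not spell out the exact encoding, so the mapping, in particular letting the change bit take precedence over the multi-copy bit, is an assumption, and the names `Mesi` and `mesi_state` are hypothetical.

```python
from enum import Enum


class Mesi(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def mesi_state(v_bit, c_bit, mc_bit):
    """Map (V, C, MC) to a MESI state; the precedence order is an assumption."""
    if not v_bit:
        return Mesi.INVALID        # V = 0: the line is invalid
    if c_bit:
        return Mesi.MODIFIED       # C = 1: differs from main memory
    return Mesi.SHARED if mc_bit else Mesi.EXCLUSIVE
```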
  • This is already known from the journal Microprocessing and Microprogramming 32 (1991), pages 215-220, "Data Consistency in a Multiprocessor System with 'Store In'Cache-Concept" by Gerhard Döttling.
  • The IBM Technical Disclosure Bulletin (Vol. 37, No. 10, 10-94, pages 557-562; Vol. 37, No. 6A, 06-94, pages 241 and 242; Vol. 37, No. 5, 05-94, pages 553 and 554; Vol. 37, No. 4B, 04-94, pages 207-208) likewise discloses processor systems in which the cache integrity is guaranteed by the MESI protocol.
  • a description of the MESI protocol can also be found in the user manual for the Power PC 601, Chapter 4, page 10.
  • the object of the invention is therefore to create an improved cache memory.
  • the invention is also based on the object of providing an improved microprocessor system in which the use of the system bus is particularly effective.
  • The corresponding action must also be carried out in the cache memory L1 in order to ensure the integrity of the data in the cache memories.
  • The status change of the data sequence in the cache L1 must, however, only be made if a copy of this data sequence is also present in the cache memory L1. Otherwise, the status change of the data sequence in the cache L2 is sufficient.
  • Information about the cache memory L1 is stored in the cache memory L2, so that the cache memory L2 does not need to have the cache memory L1 checked for the presence of the corresponding data.
  • This information is stored in the form of validity bits Vi (L1), preferably in registers provided for this purpose in the cache memory L2. If the status of a stored data sequence is now to be changed within the cache L2, it can already be checked at the level of the cache L2 whether the corresponding action is also required in the cache L1. This makes the corresponding interrupt of the microprocessor belonging to the cache memory L1 and the blocking of the system bus unnecessary.
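The decision made at the level of the cache L2 can be sketched as below. This is a simplified model under stated assumptions: `L2Line` and `invalidate` are hypothetical names, and clearing Vi (L1) after the invalidation assumes that the L1 copy is invalidated as part of the same action.

```python
from dataclasses import dataclass


@dataclass
class L2Line:
    valid_l2: bool   # V(L2): the line is valid in L2
    valid_l1: bool   # Vi(L1): a copy is also held in L1


def invalidate(line):
    """Invalidate an L2 line; return True only if L1 must be interrupted too."""
    l1_action_needed = line.valid_l2 and line.valid_l1
    line.valid_l2 = False
    line.valid_l1 = False   # assumption: the L1 copy is invalidated as well
    return l1_action_needed
```

When `invalidate` returns False, the status change stays inside L2 and the processor is not disturbed.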
  • Fig. 1 is a schematic representation of a
  • FIG. 2 shows a microprocessor system with several of the microprocessors shown in FIG. 1 and associated cache memories.
  • the cache memory L2 is integrated on a separate chip 1.
  • The cache memory L2 has a memory area 2 which can be divided into the memory sections Y1, Y2, ... Yn-1, Yn. In the embodiment shown, this division of the memory area 2 takes place line by line, each line of the memory area 2 representing one of the memory sections Yi.
  • the memory sections Yi are the same size in the exemplary embodiment shown.
  • A microprocessor PU, which is integrated on a chip 4, is also shown in FIG. 1.
  • a cache memory L1 of the order X-1 belonging to the microprocessor PU is also integrated on the chip 4.
  • The order X of the cache memory indicates at which memory level, as seen from the microprocessor PU, the corresponding cache memory is located.
  • The cache memory L1 can be divided into memory sections Z1, Z2, ... Zm-1, Zm in a similar way to the cache memory L2. As in the cache L2, the corresponding memory sections Zj in the cache L1 are organized in rows and are of the same memory length. The memory sections Zj thus divide the memory area 5 of the cache memory L1.
  • the cache memory L1 is connected to the cache memory L2 via a data bus 6.
  • the cache memory L2 is in turn connected to the system bus 7.
  • the microprocessor PU is connected to the cache memory L2 via the address bus 8.
  • the cache memory L2 has a directory 9 (DIRECTORY).
  • the memory addresses i of the data sequences stored in the memory sections Yi are stored in the directory 9.
  • the cache memory L1 also has a corresponding directory 10 for storing the addresses j of the data sequences stored in the memory sections Zj.
  • Each address stored in the directory 10 of the cache memory L1 is uniquely assigned a valid bit and a multiple copy bit.
  • a valid bit Vj indicates for the corresponding data sequence j, which is stored in a memory section Zj in the memory area 5 of the cache memory L1, whether the data sequence j is valid. For example, if the processor PU needs the data sequence associated with the address j, the microprocessor PU will make a corresponding request to its cache memory L1.
  • the directory 10 of the cache memory L1 will first be searched for the address j of the data sequence. If the address j is in the directory 10, this means that the corresponding data sequence with the address j is present in a memory section Zj of the memory area 5 in the cache memory L1.
  • the valid bit Vj associated with address j is then checked. If the valid bit Vj is logic 1, this means that the data sequence stored in the memory section Zj is still valid.
  • The data sequence stored in the memory section Zj is then output to the processor PU for processing via an internal data bus. After the microprocessor PU has processed this data sequence j, it can be stored back in the same logical memory section Zj.
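The L1 request sequence just described (search directory 10 for the address, then check the valid bit Vj, then output the data) can be sketched as follows. Modelling the directory, the valid bits and the memory sections as parallel Python lists, and the name `l1_lookup`, are assumptions made for illustration.

```python
def l1_lookup(directory, valid_bits, sections, address):
    """Return the data sequence for `address`, or None on a miss."""
    if address not in directory:     # address is not in the directory
        return None
    j = directory.index(address)     # memory section Zj holding the data
    if not valid_bits[j]:            # Vj is logic 0: the copy is not valid
        return None
    return sections[j]               # Vj is logic 1: output to the processor
```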
  • Each of the addresses j stored in the directory 10 is also uniquely assigned a multiple copy bit (MC bit). If the MC bit is logic 1, for example, this means that copies of the data sequence j stored in the memory section Zj are located in other cache memories of the computer system.
  • The cache memory L2 has corresponding registers 11 and 13, in which the valid bits Vi (L2) and the multiple copy bits MCi of the corresponding memory sections Yi are stored. To implement the MESI cache protocol, the cache memory L2 also has a register 12 in which change bits Ci of the corresponding memory sections Yi are stored. Furthermore, both the cache memory L1 and the cache memory L2 have what are known as least recently used (LRU) registers 14 and 17, which, in a manner known per se, serve to ensure that only the most frequently required data sequences are located in the cache memories.
  • the present invention can be applied to various cache systems, e.g. for the following cache systems:
  • 1. The cache memory L1 is a so-called write-through memory. This means that each storage of a data sequence of address i by the microprocessor PU in the cache memory L1 automatically results in the storage of the same data sequence i in a memory section Yi of the cache memory L2.
  • The cache memory L2 is a so-called store-in cache. This means that the data sequence i just mentioned is not automatically passed on to the main memory via the system bus. Only if the microprocessor requests a new data sequence i from the main memory for which there is no free memory space in the cache memory L2 must the corresponding memory space first be created in the cache memory L2 before the data sequence i is stored.
  • That memory section Yi of the cache memory L2 which contains the LRU data sequence of the address j can be used for this purpose.
  • This memory section Yi can generally be overwritten because a copy of the data sequence j is present in the main memory. Only if the corresponding C bit of the data sequence which is stored in the memory section Yi is logic 1, must the data sequence j located there be stored in the main memory before the new data sequence i is stored in the memory section Yi.
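The store-in replacement rule above (overwrite the LRU section, but write it back first if its C bit is logic 1) can be sketched as follows. The dictionary layout, the explicit `lru_order` list and the function name `replace_lru_line` are assumptions for this example; the LRU registers themselves are not modelled.

```python
def replace_lru_line(l2_lines, lru_order, main_memory, new_address, new_data):
    """Replace the least recently used L2 line.

    l2_lines:  address -> {"data": ..., "c_bit": bool}
    lru_order: list of addresses, least recently used first
    Returns (victim address, whether a write-back to main memory occurred).
    """
    victim = lru_order.pop(0)
    line = l2_lines.pop(victim)
    wrote_back = False
    if line["c_bit"]:                       # changed line: main memory is stale
        main_memory[victim] = line["data"]
        wrote_back = True
    l2_lines[new_address] = {"data": new_data, "c_bit": False}
    lru_order.append(new_address)
    return victim, wrote_back
```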
  • 2. The cache memory L1 is also a store-in cache. In that case, when a data sequence of the logical address i, which is stored in the memory section Zj of the cache memory L1, changes, the corresponding change in the memory section Yi of the cache memory L2, in which this data sequence of the address i is also stored, is no longer carried out automatically, as it is in the example under 1. However, in order to ensure the integrity of the data nevertheless, a change bit must also be provided in the cache memory L1.
  • The cache memory L2 has a further register 3, which is used to store validity bits Vi (L1).
  • A validity bit Vi (L1) is uniquely assigned to each of the memory sections Yi.
  • The validity bit Vi (L1) of a memory section Yi of the cache memory L2 indicates whether the data sequence with the address i stored in the memory section Yi is also stored in a memory section Zj of the cache memory L1.
  • the cache L2 is generally much larger than the cache L1.
  • The cache L2 can, for example, be 256 KB and the cache L1 only 16 KB. The probability that a data sequence i is stored in both cache memories L1 and L2 of a processor is therefore relatively low.
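A rough upper bound makes the size argument concrete. Assuming, purely for illustration, that both caches use the same section length and that every L1 section is also held in L2, at most the fraction L1 size / L2 size of the L2 sections can have Vi (L1) set. The function name below is hypothetical.

```python
def max_l1_overlap_fraction(l1_bytes, l2_bytes):
    """Upper bound on the share of L2 sections that can also be in L1."""
    return min(l1_bytes, l2_bytes) / l2_bytes
```

With the sizes from the text, `max_l1_overlap_fraction(16 * 1024, 256 * 1024)` gives 0.0625, i.e. at most one L2 section in sixteen can also be present in L1 under these assumptions.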
  • FIG. 2 shows a schematic illustration of a computer system according to the invention, only two microprocessors PU1 and PU2, which correspond to the microprocessor PU of FIG. 1, being shown.
  • the cache memories L1 and L2 of the microprocessors PU1 and PU2 are constructed in the same way as the corresponding cache memories of the microprocessor PU in FIG. 1.
  • the microprocessors PU1 and PU2 exchange commands and data indirectly via the system bus 7 via their cache memory L2.
  • the microprocessors PU1 and PU2 are connected to one another via a signal line 18, to which all other microprocessors of the microprocessor system shown in detail in FIG. 2 are preferably also connected.
  • If the microprocessor PU1 is to transmit a command via the system bus 7, it must first be given access to the system bus 7. For this purpose, the microprocessor PU1 must receive access authorization from an arbiter, not shown. Arbiters suitable for this purpose are known, for example, from EP-A-0 575 651. As soon as the system bus 7 is assigned to the microprocessor PU1, the processor PU1 can issue a data request command for a data sequence of the address i on the system bus 7. This command is directed on the one hand at the main memory of the microprocessor system and on the other hand at all "listening" L2 chips.
  • the cache memory L2 according to the invention has an advantageous effect on the overall processing speed of the system in the following situations, for example:
  • The processor PU2 has processed and changed a data sequence of the address i. This data sequence is now to be stored in the cache memories L1 and L2 of the processor PU2, specifically at the logical address i + 1. However, the corresponding data sequence of the address i + 1 is not available in the cache memories L1 and L2 of the processor PU2.
  • Before the processor PU2 can save the changed data sequence of the address i + 1 in its cache L1, the earlier version of this data sequence i + 1 must be stored in the cache memories L1 and L2 of the processor PU2. After this earlier version of the data sequence i + 1 has been stored in the cache memories L1 and L2 of the processor PU2, it can then be overwritten by the changed version of the data sequence i + 1. This process is also called "linefetch due to store".
  • the processor PU2 will first issue a request command on the bus 7, which is directed to the memory of the microprocessor system. With the request command, the microprocessor PU2 signals that it requires the data sequence of the address i + 1. Normally this has the consequence that the memory of the microprocessor system outputs the corresponding data on the bus 7.
  • the microprocessor PU1 has the data sequence of the address i + 1 in its cache memory L2 and the associated change bit (CBit) is logic 1.
  • the data sequence of the address i + 1 in the cache memory L2 of the processor PU1 is changed compared to the memory of the microprocessor system.
  • The valid bit V (L2) of the data sequence of the address i + 1 in the cache memory L2 of the microprocessor PU1 is logic 1, i.e. the corresponding data in the cache memory L2 are valid.
  • The microprocessor PU1 then outputs the data sequence of the address i + 1 on the bus instead of the memory of the microprocessor system. This is ensured by the bus protocol.
  • This process is also called "bus snooping" by the processor.
  • The valid bit V (L2) of the address i + 1 in the cache memory L2 of the processor PU1 must become logic 0, since the request for the data sequence i + 1 (the "linefetch") takes place because of an upcoming store operation into the cache memory L1 of the processor PU2. This means that the data sequence i + 1, as it exists in the cache memory L2 of the processor PU1, will no longer be valid.
  • Such an interrupt of the processor is unnecessary, however, if the valid bit Vi (L1) of the address i + 1 in the cache memory L2 of the processor PU1 is logic 0, which means that the data sequence of the address i + 1 is not present at all in the cache memory L1 of the processor PU1.
  • The saving of interrupt operations means that the "BUSY" signal has to be active for a correspondingly shorter period of time. This has a significant impact on the overall processing speed of the microcomputer system. In this example, this is because no other processor in the system can access the bus 7 while the "linefetch due to store" operation is being executed. Because in most cases no processor interrupt is required, another system participant can be granted access to the bus earlier.
  • a necessary interrupt can take a long time.
  • An interrupt of a microprocessor is not possible at any time, but only in certain states of the microprocessor. If, for example, the microprocessor in question is currently executing an instruction that takes many system cycles, the interrupt cannot be executed during this time. The processor can only be interrupted after this instruction has been processed. However, due to the use of the L2 cache according to the teaching of the invention, a time-consuming interrupt is not necessary in most cases.
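The snooping side of this first situation can be sketched as follows: the L2 cache of the other processor supplies a changed line instead of main memory, clears V (L2), and reports an L1 interrupt only when Vi (L1) is set. The names `L2Entry` and `snoop_linefetch_due_to_store` are hypothetical, and leaving the data transfer to main memory when the C bit is 0 is an assumption.

```python
from dataclasses import dataclass


@dataclass
class L2Entry:
    data: bytes
    v_l2: bool    # valid bit V(L2)
    c_bit: bool   # change bit
    v_l1: bool    # validity bit Vi(L1)


def snoop_linefetch_due_to_store(l2, address):
    """Return (data supplied on the bus or None, whether L1 must be interrupted)."""
    entry = l2.get(address)
    if entry is None or not entry.v_l2:
        return None, False                          # nothing to do for this cache
    supplied = entry.data if entry.c_bit else None  # changed line: cache answers
    interrupt_l1 = entry.v_l1                       # only if a copy is in L1
    entry.v_l2 = False                              # the line becomes invalid
    entry.v_l1 = False
    return supplied, interrupt_l1
```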
  • The microprocessor PU1 has in its L2 cache a data sequence of address i, of which copies exist in other caches. This means that the multiple copy bit (MC bit) of address i (register 13) is logic 1. It is assumed that the data sequence of the address i does not exist in the cache memory L1 of the same processor PU1. The corresponding validity bit Vi (L1) in the cache memory L2 of the microprocessor PU1 is therefore logic 0.
  • the processor PU2 has also stored a valid copy of the data sequence i in its cache memories L1 and L2, so that the respective MC bits of the address i in the cache memories L1 and L2 of the processor PU2 are logic 1.
  • the processor PU2 has processed and changed the data sequence i. Now the changed data sequence i is to be stored in the cache memory L1 of the processor PU2.
  • a corresponding command is output on bus 7 and received by processor PU1 - as well as by other processors that may be affected ("bus snooping").
  • The validity bit V (L2) in the cache memory L2 of the processor PU1 is then invalidated. Since the valid bit Vi (L1) of the address i in the register 3 of the cache memory L2 of the processor PU1 indicates that the data sequence of the address i is not present in the cache memory L1 of the processor PU1, there is no need for an invalidation in the cache memory L1 of the processor PU1. As a result, there is also no need for an interrupt of the processor PU1.
  • the changed data sequence i can then be stored in the cache memories L1 and L2 of the processor PU2. This overwrites the previous version of this data sequence.
  • the MCBit then becomes logical 0 because there are no valid copies in the cache memories of other processors, since these were previously invalidated.
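The second situation, a store to a data sequence of which other caches hold copies, can be sketched as below. The dictionary layout and the name `store_to_shared_line` are assumptions; the function counts how many L1 invalidations (and hence processor interrupts) the snooping caches would need.

```python
def store_to_shared_line(writer, others, address, new_data):
    """Model a store to a line that other L2 caches also hold.

    writer / others: address -> {"data", "v_l2", "v_l1", "mc_bit"}
    Returns the number of snooping processors whose L1 must be invalidated.
    """
    interrupts = 0
    for cache in others:
        line = cache.get(address)
        if line and line["v_l2"]:
            line["v_l2"] = False        # invalidate the copy in L2
            if line["v_l1"]:            # a copy in L1 also needs invalidating
                line["v_l1"] = False
                interrupts += 1
    line = writer[address]
    line["data"] = new_data             # overwrite the previous version
    line["mc_bit"] = False              # no valid copies remain elsewhere
    return interrupts
```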
  • The processor PU2 holds the data sequence i exclusively in its cache L2, i.e. there are no copies of this sequence in other processors' caches. It is assumed that the data sequence of the address i is not present in the cache memory L1 of the processor PU2. The corresponding validity bit Vi (L1) in the cache memory L2 is then logic 0.
  • The processor PU1 needs the data sequence of the address i exclusively for reading, but has it in neither its L1 nor its L2 cache.
  • the processor PU1 therefore issues a "line fetch due to fetch" command on the bus 7 in order to request the data sequence i from the memory of the microprocessor system.
  • the L2 cache memory of the processor PU2 recognizes this process on the bus 7 by "bus snooping".
  • The fact that another processor fetches a copy of the data sequence i from the memory of the microprocessor system means that this data sequence is no longer exclusively available in the cache memory L2 of the processor PU2. Therefore, the MC bit of address i must be set to logic 1 in processor PU2.
  • In the processor PU1 that requested the data sequence i, it is also necessary to set the MC bit of the address i in its cache memories L1 and L2 to logic 1.
  • the information that the requested data sequence i already exists in the cache memory of another processor - here the processor PU2 - is transmitted to the processor PU1 through the busy line 18 (FIG. 2).
  • The bus protocol ensures that a corresponding signal is generated by the cache memory L2 of the processor PU2 and output on the busy line 18 during a system cycle after the match has been determined.
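The third situation, a "line fetch due to fetch", can be sketched as follows. Each snooping L2 cache that holds a valid copy sets its MC bit and signals a match, and the requester sets its own MC bit when such a signal is seen. Only the L2 level is modelled, the busy line 18 is reduced to a boolean, and the name `linefetch_due_to_fetch` is hypothetical.

```python
def linefetch_due_to_fetch(requester_l2, snooper_l2s, memory, address):
    """Fetch a line for reading; return True if the busy line was asserted."""
    busy = False
    for cache in snooper_l2s:
        line = cache.get(address)
        if line and line["v_l2"]:
            line["mc_bit"] = True   # the copy is no longer exclusive
            busy = True             # match reported on the busy line
    requester_l2[address] = {
        "data": memory[address],
        "v_l2": True,
        "mc_bit": busy,             # another cache already holds the line
    }
    return busy
```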

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention relates to a cache memory L2 of order X for a cache memory of order X-1. The cache memories L2 and L1 can be divided into sections Yi and Zj. The cache memory L2 comprises a register (3) for storing a validity bit Vi (L2) for each memory section Yi. The validity bit of a memory section Yi indicates whether the content of that memory section is also stored in a corresponding section Zj of the cache memory L1. This avoids interrupts of the microprocessor concerned and improves both the efficiency with which the system bus (7) is used and the processing speed of the entire microprocessor system.
PCT/EP1995/001458 1995-04-18 1995-04-18 Antememoire WO1996033462A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP1995/001458 WO1996033462A1 (fr) 1995-04-18 1995-04-18 Antememoire

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP1995/001458 WO1996033462A1 (fr) 1995-04-18 1995-04-18 Antememoire

Publications (1)

Publication Number Publication Date
WO1996033462A1 true WO1996033462A1 (fr) 1996-10-24

Family

ID=8165999

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP1995/001458 WO1996033462A1 (fr) 1995-04-18 1995-04-18 Antememoire

Country Status (1)

Country Link
WO (1) WO1996033462A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0461926A2 * 1990-06-15 1991-12-18 Compaq Computer Corporation Multi-level inclusion in multi-level cache hierarchies
EP0549219A1 * 1991-12-24 1993-06-30 Motorola, Inc. Cache control circuit
EP0649094A1 * 1993-10-14 1995-04-19 International Business Machines Corporation Multiprocessor cache hierarchy

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BAER J -L ET AL: "ON THE INCLUSION PROPERTIES FOR MULTI-LEVEL CACHE HIERARCHIES", PROCEEDINGS OF THE ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, HONOLULU, MAY 30 - JUNE 2, 1988, no. 1988, 30 May 1988 (1988-05-30), INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS, pages 73 - 80, XP000039790 *
JEAN-LOUP BAER ET AL: "MULTILEVEL CACHE HIERARCHIES:ORGANIZATIONS, PROTOCOLS, AND PERFORMANCE*", JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, vol. 6, no. 3, 1 June 1989 (1989-06-01), pages 451 - 476, XP000133092 *

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP US

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE

121 EP: the EPO has been informed by WIPO that EP was designated in this application
122 EP: PCT application non-entry in European phase