WO2016068870A1 - Media controller with coordination buffer - Google Patents
Media controller with coordination buffer
- Publication number
- WO2016068870A1 (PCT/US2014/062593)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- memory
- computing nodes
- shared memory
- media controller
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/084—Multiuser, multiprocessor or multiprocessing cache systems with a shared cache
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
- G06F12/0817—Cache consistency protocols using directory methods
- G06F12/082—Associative directories
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
Definitions
- The shared non-controlled access memory 250 can be accessed by multiple computing nodes upon request, yet writes to it need not be validated by another computing node in the set before the media controller updates the memory 250.
- The memory 230 can also include a shared controlled access memory segment 260.
- The memory segment 260 requires validation by the media controller 210 before it can be modified. For example, validation can require that at least two computing nodes from a set of computing nodes generate the same update request before the media controller 210 proceeds to modify the memory segment 260.
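The three segment types above can be sketched as an access-policy table that a controller consults per write request. The segment boundaries, policy names, and return values below are illustrative assumptions only; the patent does not prescribe a layout:

```python
# Hypothetical segment map for a segmented shared memory like FIG. 2.
PRIVATE, SHARED_OPEN, SHARED_VALIDATED = "private", "shared-open", "shared-validated"

# (start, end, policy, owner_node) -- layout is an assumption for illustration
SEGMENTS = [
    (0x0000, 0x0FFF, PRIVATE, 1),              # like segment 240: reserved for one node
    (0x1000, 0x1FFF, SHARED_OPEN, None),       # like segment 250: no corroboration needed
    (0x2000, 0x2FFF, SHARED_VALIDATED, None),  # like segment 260: corroborating votes required
]

def write_policy(address: int, node_id: int) -> str:
    """Return 'commit', 'reject', or 'hold-for-votes' for a write request."""
    for start, end, policy, owner in SEGMENTS:
        if start <= address <= end:
            if policy == PRIVATE:
                # Private segments update directly, but only for their owner.
                return "commit" if node_id == owner else "reject"
            if policy == SHARED_OPEN:
                return "commit"
            # Validated segments stay pending until matching writes arrive.
            return "hold-for-votes"
    return "reject"  # unmapped address
```

A write to the validated segment would then be parked in the coordination buffer rather than applied immediately.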
- FIG. 3 illustrates an example of a network system 300 of computing nodes that communicate with media controllers that employ a coordination buffer to control access to a shared memory.
- The system 300 can be deployed as a fault tolerant system to serialize concurrent accesses by multiple redundancy controllers to fault tolerant memory according to an example of the present disclosure. It should be understood that the system 300 may include additional components and that one or more of the components described herein may be removed and/or modified without departing from the scope of the system 300.
- The system 300 may include multiple computing nodes 300A-N (where the number of computing nodes is greater than or equal to 1), multiple redundancy controllers 302A-N, a network interconnect module 340, and memory modules 304A-M.
- The multiple compute nodes 300A-N may be coupled to the memory modules 304A-M by the network interconnect module 340.
- The memory modules 304A-M may include media controllers 320A-M and memories 321A-M. Each media controller, for instance, may communicate with its associated memory and control access to the memory.
- The media controllers 320A-M provide access to regions of memory, with each media controller including a respective coordination buffer (not shown) as disclosed herein.
- The regions of memory can be accessed by multiple redundancy controllers 302A-N in the compute nodes 300A-N using access primitives such as read, write, lock, unlock, and so forth.
- The media controllers 320A-M may be accessed by multiple redundancy controllers (e.g., acting on behalf of multiple servers).
- The memory 321A-M may include volatile dynamic random access memory (DRAM) with battery backup, non-volatile phase change random access memory (PCRAM), spin transfer torque-magnetoresistive random access memory (STT-MRAM), resistive random access memory (reRAM), memristor, flash, or other types of memory devices.
- The memory may be solid state, persistent, dense, fast memory.
- Fast memory can be memory having an access time similar to DRAM memory.
- The redundancy controllers 302A-N may maintain fault tolerance across the memory modules 304A-M.
- A redundancy controller 302A-N may receive read or write commands from one or more processors, I/O devices, or other sources. In response to these, it generates sequences of primitive accesses to multiple media controllers 320A-M. The redundancy controller may also generate certain sequences of primitives independently, not directly resulting from processor commands. These include sequences used for scrubbing, initializing, migrating, or error-correcting memory, for example.
- FIG. 4 illustrates an example of a method 400 to control access to a shared memory.
- The method 400 includes comparing a set of data update requests generated from a set of computing nodes (e.g., via coordination buffer 120 of FIG. 1).
- The method 400 includes determining if the set of data update requests from the set of computing nodes represent the same data (e.g., via media controller 110 of FIG. 1).
- The method 400 includes modifying data in a shared memory if a subset of the data update requests generated from the set of computing nodes represents the same data (e.g., via media controller 110 of FIG. 1).
- the method 400 can also include sending a success flag to computer nodes who have succeeded with a respective update request associated with a positive vote and sending a failure flag to computer nodes who have had been rejected for a respective update request associated with a negative vote.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Hardware Redundancy (AREA)
Abstract
A system includes a media controller to control data access from a set of computing nodes to a shared memory. A coordination buffer holds data update requests generated from the set of computing nodes to the media controller. The media controller enables data in the shared memory to be modified in accordance with the data update requests based on a comparison of the data update requests in the coordination buffer.
Description
MEDIA CONTROLLER WITH COORDINATION BUFFER
BACKGROUND
[0001] Nonstop class computing systems refer to systems that have redundant computing nodes such that if any one of the redundant computing nodes fails, the remaining nodes continue system operations. Nonstop class systems must also detect process failures and errors before data is committed to persistent memory. Increasingly, these systems are employing new non-volatile memory (NVM) technologies operating at near main memory latencies to improve system performance by allowing new flat memory hierarchies that permit servers to write directly to non-volatile memory as storage. These systems can also take advantage of both the shorter latencies offered by the NVM memory technology and the reduced software management layers required compared with current separate memory and storage system architectures.
BRIEF DESCRIPTION OF THE DRAWINGS
[0002] FIG. 1 illustrates an example of a media controller that employs a coordination buffer to control access to a shared memory.
[0003] FIG. 2 illustrates an example of a media controller that employs a coordination buffer to control access to a segmented shared memory.
[0004] FIG. 3 illustrates an example of a network of computing nodes that communicate with media controllers that employs a coordination buffer to control access to a shared memory.
[0005] FIG. 4 illustrates an example of a method to control access to a shared memory.
DETAILED DESCRIPTION
[0006] This disclosure relates to a media controller that employs a coordination buffer to control access to a shared memory. The media controller controls data access from a set of computing nodes to the shared memory by processing data update requests from the set of computing nodes. The data update requests represent data that a given computing node desires to modify at a selected address of the shared memory. If data updates from a subset of computing nodes correlate (e.g., data requests from at least two nodes to the same shared memory address match), the media controller enables the shared memory to be modified in accordance with the data update request. For example, if one computing node requests that data be updated to a given data value, the media controller can hold off the actual modification of memory until at least one other node requests the same update to the same shared memory address.
[0007] Different control configurations are possible in the media controller. In some examples, all computing nodes in the set of computing nodes may have to request the same data update before modification (e.g., a write to a shared memory address) of the shared memory can commence. In another example, a proper subset of computing nodes (e.g., some number less than all of the computing nodes in the set) may request an update. If the subset of nodes generates the same update request, then the media controller updates the shared memory. The coordination buffer holds data update requests generated from the set of computing nodes to the media controller, where the data update requests represent desired updates to the shared memory. The media controller enables data in the shared memory to be modified in accordance with the data update requests based on a comparison of the data update requests in the coordination buffer. In one example, the media controller determines if a subset of the set of computing nodes generates the same data update request before enabling the data in the shared memory to be modified.
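The control configurations above reduce to choosing how many matching update requests are required before a write may commit. A minimal sketch, assuming three hypothetical policy names (the patent names no such API):

```python
def votes_to_commit(total_nodes: int, policy: str) -> int:
    """Return how many identical update requests must arrive before the
    media controller may modify the shared memory."""
    if policy == "unanimous":   # all nodes must request the same update
        return total_nodes
    if policy == "majority":    # a proper subset: a simple majority suffices
        return total_nodes // 2 + 1
    if policy == "pair":        # at least two agreeing nodes suffice
        return min(2, total_nodes)
    raise ValueError(f"unknown policy: {policy}")
```

For five cooperating nodes, "unanimous" requires 5 matching requests, "majority" requires 3, and "pair" requires 2.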
[0008] FIG. 1 illustrates an example of a media controller 110 that employs a coordination buffer 120 to control access to a shared memory 130. The shared memory 130 is typically a non-volatile memory (e.g., memristor, PCRAM, spin torque, and so forth), although volatile memory can also be employed. The media controller 110 controls data access from a set of computing nodes 140, shown as nodes 1 through N, with N being a positive integer, to the shared memory 130 by processing data update requests from the set of computing nodes. The coordination buffer 120 holds the data update requests, shown as update requests 1 through M, with M being a positive integer, generated from the set of computing nodes 140 to the media controller 110. The data update requests held in the coordination buffer 120 represent desired data updates to the shared memory 130 at a selected address of memory as requested by the respective computing node from the set 140. As disclosed herein, the media controller 110 and coordination buffer 120 can be provided as a circuit and/or as part of a memory bus system to control access to the shared memory 130.
[0009] The media controller 110 enables data in the shared memory 130 to be modified in accordance with the data update requests based on a comparison of the data update requests in the coordination buffer 120. In one example, the media controller 110 determines if a subset of the set of computing nodes 140 generates the same data update request before enabling the data in the shared memory 130 to be modified. The data update requests represent data that a given computing node from the set of nodes 140 desires to modify at a selected address of the shared memory 130. If data updates from a subset of computing nodes correlate (e.g., data from at least two nodes to the same shared memory address match), the media controller 110 enables the shared memory 130 to be modified in accordance with the data update request. For example, if one computing node requests that data be updated to a given data value, the media controller 110 can hold off the actual modification of shared memory 130 until at least one other node from the set 140 requests the same update to the same shared memory address.
[00010] Different control configurations are possible in the media controller 110. In some examples, all computing nodes in the set of computing nodes 140 may have to request the same data update before modification (e.g., a write to a shared memory address) of the shared memory 130 can commence. In another example, a proper subset of computing nodes (e.g., some number less than all of the computing nodes in the set) may request an update. If the proper subset of nodes (e.g., a simple majority, or a predetermined number defining a majority) generates the same update request, then the media controller 110 updates the shared memory 130. The shared memory 130 and media controller disclosed herein can be employed in one example to serialize concurrent accesses by multiple redundancy controllers to memory (see, e.g., FIG. 3). Redundancy controllers, for instance, may access memory using redundant array of independent disks (RAID) algorithms and/or memory mirroring to provide fault tolerance in the event of a shared memory module [304A-M] failure.
[00011] The computing nodes in the set 140 can include a central processing unit (CPU) that can include a single core or multiple cores, where each core is given similar or dissimilar permissions by the media controller to access the shared memory 130. The CPU can also be bundled with other CPUs to perform a server and/or client function, for example. Multiple servers and/or clients can be employed to access the memory 130 via the media controller 110 (or controllers). Thus, the media controller 110 can control and facilitate access to the memory 130 with respect to a single CPU core, multiple CPU cores, multiple servers, and/or multiple clients, for example.
[00012] In some examples, the media controller 110 can be provided as part of a memory bus architecture to provide access to the shared memory 130. This can also include employment of a memory controller (not shown). In some examples, the functions of the memory controller and media controller 110 can be combined into a single integrated circuit. The media controller 110 controls aspects of the memory interface that are specific to the type of medium attached (e.g., various non-volatile memory types, DRAM, flash, and so forth). These may include, for example, media-specific decoding or interleave (e.g., row/column/bank/rank), media-specific wear management (e.g., wear leveling), media-specific error management (e.g., ECC correction, CRC detection, wear-out relocation, device deletion), and/or media-specific optimization (e.g., conflict scheduling). If a memory controller is also employed, the memory controller controls aspects of the memory interface that are independent of media but specific to the CPU or system features employed. This may include, for example, system address decoding (e.g., interleaving between multiple media controllers, if there are more than one) and redundancy features described below with respect to FIG. 3 (e.g., RAID, mirroring, and so forth).
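The system address decoding mentioned above (interleaving between multiple media controllers) can be sketched as a simple round-robin mapping. The cache-line granularity and interleave scheme below are assumptions for illustration; the patent does not specify either:

```python
LINE_SIZE = 64  # bytes per interleave unit (assumed)

def decode(system_addr: int, num_controllers: int):
    """Map a flat system address to (media controller index, media-local address),
    interleaving consecutive lines across controllers round-robin."""
    line = system_addr // LINE_SIZE
    controller = line % num_controllers       # which media controller owns this line
    local_line = line // num_controllers      # line index within that controller
    return controller, local_line * LINE_SIZE + system_addr % LINE_SIZE
```

With four controllers, address 0 maps to controller 0, the next 64-byte line to controller 1, and so on, wrapping back to controller 0.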
[00013] With a memory based entity (e.g., no I/O directly controlled from a core), the comparison between multiple cores running the same application can be checked for proper operation by monitoring the changes they desire to make to shared memory 130. The memory subsystem including the media controller 110 (or a device between the cores and the memory subsystem) receives an update request for a change and waits for the "other" computing nodes from the set 140 to request the same operation before committing to the shared memory 130. Should the computing nodes not agree on the change, the majority can rule via the media controller 110, whereas the non-matching computing node's update may be rejected.
[00014] The media controller 110 can track new update requests (e.g., writes) to shared memory 130 in the coordination buffer 120, which holds the transaction pending until additional writes to the same memory location from the set of coordinated validating systems (e.g., computing nodes) from the set 140 are received. The coordination buffer 120 can check any newly arriving writes against pending writes and, upon a match of location, compare the data to determine whether the update matches the pending transaction or not. Any number of validating systems may be supported in this memory configuration.
[00015] A matching data update request can be recorded as a "vote for" the update, whereas a mismatch can be recorded as a "vote against" via the coordination buffer 120. If three or more updates are considered, then subsequent updates can continue to be accumulated until all (or a subset of) cooperating system updates are observed, or non-reporting systems are determined to be non-reporting and removed from the set 140. The prevalent data (e.g., that of the majority) can then be committed to shared memory 130. If only one update is received, then the updating system from the set of computing nodes 140 can be determined to be suspect, and the update is discarded (e.g., after a predetermined period of time or after a number of events have occurred). When a conclusion is reached by examining the multiple updates of the data, write responses can be returned to all (or a subset) of the updating systems, with success or failure indicated to each requesting node to trigger a suitable response to mismatched data (e.g., a success flag if the data is written, or an error flag if the update request is rejected by the media controller).
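The vote-for/vote-against flow described above can be sketched as follows, assuming a fixed number of expected voters per address and simple majority rule. The class and method names are hypothetical, not taken from the patent:

```python
class CoordinationBuffer:
    """Illustrative sketch of a coordination buffer holding pending writes."""

    def __init__(self, expected_voters: int):
        self.expected = expected_voters
        self.pending = {}  # address -> list of (node_id, data) votes

    def submit(self, node_id, address, data):
        """Record one node's update request. Returns None while the
        transaction is pending; once all expected voters have reported,
        returns (winning_data, per-node success responses)."""
        votes = self.pending.setdefault(address, [])
        votes.append((node_id, data))
        if len(votes) < self.expected:
            return None                       # hold the transaction pending
        # All voters reported: the prevalent (majority) data wins the vote.
        tally = {}
        for _, d in votes:
            tally[d] = tally.get(d, 0) + 1
        winner, _ = max(tally.items(), key=lambda kv: kv[1])
        del self.pending[address]
        # Success response for matching voters, failure for mismatches.
        responses = {n: (d == winner) for n, d in votes}
        return winner, responses
```

For example, if nodes 1 and 2 submit data "A" to the same address and node 3 submits "B", the buffer commits "A" and returns a failure response for node 3.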
[00016] Example implementations may achieve a tight consistency of data update timing, in which case a direct hardware implementation of the coordination buffer 120 can be sufficient to collect and process pending updates within the expected time window. Pending updates may be rejected in this model when the capacity of the coordination buffer 120 is overrun before the necessary corroborating updates have been received. Other example implementations may allocate a portion of shared memory 130 (see, e.g., FIG. 2) to gather pending updates still requiring validating updates from other systems. Such a configuration can manage pending updates from a hardware front end as described herein. While some multisystem architectures may have a tendency to have their timing drift relative to one another, the media controller 110 and coordination buffer 120 can tend to re-align (e.g., synchronize) the timing by aligning the completion indication of matching updates, for example.
[00017] Not all of the shared memory 130 needs to be considered part of the validated address space. For example, each computing node from the set 140 may be assigned address ranges that are private, to be used to accumulate data before software algorithms gather the data intended for cross-validation. Other address ranges may be shared but not validated, so that they may be used for communication between the cooperating systems to keep them in sync with one another. New cores (or systems) can be added (e.g., in a non-stop redundancy system) by temporarily stopping operation, mirroring the memory image of an existing core to the new core, and then continuing operation of all cores. As noted previously, non-volatile memories may be redundantly deployed, with mirroring and/or RAIDing of the data from the computing nodes 140 across non-volatile memory modules, with each module independently validated.
[00018] In one example, the media controller 110 and coordination buffer 120 enable straightforward majority-rule (e.g., non-stop) operation within the media controller 110, allowing direct memory access performance without special hardware on the processor while providing the system reliability required of non-stop systems. With this non-stop model, a variety of commodity processors may be applied to this computing space. Alternatively, the management of pending transactions may be handled by a "machine-in-the-middle" that provides the described functionality before handing committed updates to standard memory modules. The media controller 110 and coordination buffer 120 provide for quick comparison and sign-off of matching writes to fulfill the expectations of processor load/store latencies.
[00019] FIG. 2 illustrates an example of a media controller 210 that employs a coordination buffer 220 to control access to a segmented shared memory 230. In this example, the memory 230 can include a private memory segment 240. The private memory segment 240 can be reserved for a given node from the set of nodes described above. In some examples, each of the nodes from the set of nodes can be assigned a separate private memory segment, where update requests to the private memory do not require corroboration by the media controller 210 and can be directly applied upon request by the given node. The memory 230 can also include a shared non-controlled access memory segment 250. The memory 250 can be accessed by multiple computing nodes upon request, yet updates to it do not need to be validated by another computing node in the set before the media controller 210 proceeds to update the memory 250. The memory 230 can also include a shared controlled access memory segment 260. The memory segment 260 requires validation by the media controller 210 before it can be modified. For example, validation can include the requirement that at least two computing nodes from a set of computing nodes generate the same update request before the media controller 210 proceeds to modify the memory segment 260.
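The three segment types of FIG. 2 amount to an address-range lookup that decides whether an update needs corroboration. The following sketch is illustrative only; the address ranges and names are hypothetical, not taken from the patent:

```python
from enum import Enum

class Policy(Enum):
    PRIVATE = 1           # segment 240: owner node writes directly
    SHARED_FREE = 2       # segment 250: any node writes, no corroboration
    SHARED_VALIDATED = 3  # segment 260: requires matching requests

# Hypothetical segment map: (low address, high address, policy).
SEGMENTS = [
    (0x0000, 0x0FFF, Policy.PRIVATE),
    (0x1000, 0x1FFF, Policy.SHARED_FREE),
    (0x2000, 0x2FFF, Policy.SHARED_VALIDATED),
]

def policy_for(address):
    """Return the access policy governing `address`."""
    for lo, hi, pol in SEGMENTS:
        if lo <= address <= hi:
            return pol
    raise ValueError("address outside shared memory")

def needs_corroboration(address):
    """True if at least two nodes must issue matching update requests."""
    return policy_for(address) is Policy.SHARED_VALIDATED

assert not needs_corroboration(0x0100)  # private segment (240)
assert not needs_corroboration(0x1100)  # shared non-controlled (250)
assert needs_corroboration(0x2100)      # shared controlled (260)
```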
[00020] FIG. 3 illustrates an example of a network system 300 of computing nodes that communicate with media controllers that employ a coordination buffer to control access to a shared memory. The system 300 can be deployed as a fault tolerant system to serialize concurrent accesses by multiple redundancy controllers to fault tolerant memory according to an example of the present disclosure. It should be understood that the system 300 may include additional components and that one or more of the components described herein may be removed and/or modified without departing from a scope of the system 300. The system 300 may include multiple computing nodes 300A-N (where the number of computing nodes is greater than or equal to 1), multiple redundancy controllers 302A-N, a network interconnect module 340, and memory modules 304A-M.
[00021] The multiple compute nodes 300A-N may be coupled to the memory modules 304A-M by the network interconnect module 340. The memory
modules 304A-M may include media controllers 320A-M and memories 321A-M. Each media controller, for instance, may communicate with its associated memory and control access to the memory. The media controllers 320A-M provide access to regions of memory, with each media controller including a respective coordination buffer (not shown) as disclosed herein. The regions of memory can be accessed by multiple redundancy controllers 302A-N in the compute nodes 300A-N using access primitives such as read, write, lock, unlock, and so forth. In order to support aggregation or sharing of memory, the media controllers 320A-M may be accessed by multiple redundancy controllers (e.g., acting on behalf of multiple servers). Thus, there can be a many-to-many relationship between redundancy controllers and media controllers. The memories 321A-M may include volatile dynamic random access memory (DRAM) with battery backup, non-volatile phase change random access memory (PCRAM), spin transfer torque-magnetoresistive random access memory (STT-MRAM), resistive random access memory (ReRAM), memristor, flash, or other types of memory devices. For example, the memory may be solid state, persistent, dense, fast memory. Fast memory can be memory having an access time similar to DRAM memory.
[00022] As described in the disclosed examples, the redundancy controllers 302A-N may maintain fault tolerance across the memory modules 304A-M. The redundancy controllers 302A-N may receive read or write commands from one or more processors, I/O devices, or other sources. In response to these commands, they generate sequences of primitive accesses to multiple media controllers 320A-M. The redundancy controllers 302A-N may also generate certain sequences of primitives independently, not directly resulting from processor commands. These include sequences used for scrubbing, initializing, migrating, or error-correcting memory, for example.
[00023] In view of the foregoing structural and functional features described above, an example method will be better appreciated with reference to FIG. 4.
While, for purposes of simplicity of explanation, the method is shown and described as executing serially, it is to be understood and appreciated that the method is not limited by the illustrated order, as parts of the method could occur in different orders from and/or concurrently with that shown and described herein. Such a method can be executed by various components, such as an integrated circuit, a computer, or a controller, for example.
[00024] FIG. 4 illustrates an example of a method 400 to control access to a shared memory. At 410, the method 400 includes comparing a set of data update requests generated from a set of computing nodes (e.g., via the coordination buffer 120 of FIG. 1). At 420, the method 400 includes determining if the set of data update requests from the set of computing nodes represent the same data (e.g., via the media controller 110 of FIG. 1). At 430, the method 400 includes modifying data in a shared memory if a subset of the data update requests generated from the set of computing nodes represents the same data (e.g., via the media controller 110 of FIG. 1). The method 400 can also include sending a success flag to computing nodes that have succeeded with a respective update request associated with a positive vote and sending a failure flag to computing nodes that have been rejected for a respective update request associated with a negative vote.
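The three steps of method 400 (compare at 410, determine at 420, modify at 430) can be condensed into a small sketch. It assumes, purely for illustration, that a quorum of two matching requests suffices and models the shared memory as a dictionary; all names are hypothetical:

```python
from collections import Counter

def apply_updates(shared_memory, address, requests, quorum=2):
    """requests: {node_id: proposed_data}. Returns per-node flags."""
    tally = Counter(requests.values())        # 410: compare the requests
    data, count = tally.most_common(1)[0]     # 420: do any represent the same data?
    if count >= quorum:                       # 430: modify if a subset agrees
        shared_memory[address] = data
        return {node: ("success" if d == data else "failure")
                for node, d in requests.items()}
    return {node: "failure" for node in requests}  # no quorum: reject all

mem = {}
flags = apply_updates(mem, 0x2100, {"A": 1, "B": 1, "C": 2})
assert mem[0x2100] == 1                       # majority data committed
assert flags == {"A": "success", "B": "success", "C": "failure"}
```

When no subset agrees, the memory is left unmodified and every requesting node receives the failure flag, matching the rejection path described above.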
[00025] What have been described above are examples. It is, of course, not possible to describe every conceivable combination of components or methods, but one of ordinary skill in the art will recognize that many further combinations and permutations are possible. Accordingly, the invention is intended to embrace all such alterations, modifications, and variations that fall within the scope of this application, including the appended claims. Additionally, where the disclosure or claims recite "a," "an," "a first," or "another" element, or the equivalent thereof, it should be interpreted to include one or more than one such element, neither requiring nor excluding two or more such elements. As used herein, the term
"includes" means includes but not limited to, and the term "including" means including but not limited to. The term "based on" means based at least in part on.
Claims
1. A circuit, comprising:
a media controller to control data access from a set of computing nodes to a shared memory; and
a coordination buffer that holds data update requests generated from the set of computing nodes to the media controller, wherein the media controller enables data in the shared memory to be modified in accordance with the data update requests based on a comparison of the data update requests in the coordination buffer.
2. The circuit of claim 1, wherein the shared memory is employed by redundancy controllers in a redundant array of independent disks (RAID) or memory mirroring configuration.
3. The circuit of claim 1, wherein each data update request in the coordination buffer functions as a vote from a respective member of the set of computing nodes for updating the shared memory.
4. The circuit of claim 3, wherein update requests for data to a given address of the shared memory that match at least one other update request function as positive votes and update requests that do not match data to the given address of shared memory function as negative votes.
5. The circuit of claim 4, wherein the media controller sends a success flag to computing nodes that have succeeded with a respective update request associated with a positive vote and sends a failure flag to computing nodes that have been rejected for a respective update request associated with a negative vote.
6. The circuit of claim 4, wherein an update request that does not receive a matching positive vote is discarded by the media controller after a predetermined period of time.
7. The circuit of claim 1, wherein the media controller waits for a subset of the set of computing nodes to generate the same data update request before enabling the data in the shared memory to be modified.
8. The circuit of claim 7, wherein each of the set of computing nodes includes a redundancy controller to communicate with a plurality of media controllers.
9. The circuit of claim 1, wherein the shared memory is segmented such that a portion of memory is designated as private memory to a given computing node, a portion of memory is designated as shared non-controlled access between computing nodes, and a portion of memory is designated as shared controlled access between computing nodes.
10. The circuit of claim 1, wherein the media controller utilizes the coordination buffer to synchronize timing between multiple computing nodes from the set of computing nodes.
11. A system, comprising:
a media controller to control data access from a set of computing nodes to a shared memory; and
a coordination buffer that holds data update requests generated from the set of computing nodes for comparison by the media controller, wherein the media controller determines if a subset of the set of computing nodes generates the same data update request before enabling the data in the shared memory to be modified.
12. The system of claim 11, wherein update requests for data to a given address of the shared memory that match at least one other update request function as positive votes and update requests that do not match data to the given address of the shared memory function as negative votes.
13. The system of claim 12, wherein the media controller sends a success flag to computing nodes that have succeeded with a respective update request associated with a positive vote and sends a failure flag to computing nodes that have been rejected for a respective update request associated with a negative vote.
14. A method, comprising:
comparing, by a controller, a set of data update requests generated from a set of computing nodes;
determining, by the controller, if the set of data update requests from the set of computing nodes represent the same data; and
modifying, by the controller, data in a shared memory if a subset of the data update requests generated from the set of computing nodes represents the same data.
15. The method of claim 14, further comprising sending a success flag to computing nodes that have succeeded with a respective update request associated with a positive vote and sending a failure flag to computing nodes that have been rejected for a respective update request associated with a negative vote.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2014/062593 WO2016068870A1 (en) | 2014-10-28 | 2014-10-28 | Media controller with coordination buffer |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016068870A1 true WO2016068870A1 (en) | 2016-05-06 |
Family
ID=55857987
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2014/062593 WO2016068870A1 (en) | 2014-10-28 | 2014-10-28 | Media controller with coordination buffer |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2016068870A1 (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6601151B1 (en) * | 1999-02-08 | 2003-07-29 | Sun Microsystems, Inc. | Apparatus and method for handling memory access requests in a data processing system |
US20050108231A1 (en) * | 2003-11-17 | 2005-05-19 | Terrascale Technologies Inc. | Method for retrieving and modifying data elements on a shared medium |
US20070050574A1 (en) * | 2005-09-01 | 2007-03-01 | Hitachi, Ltd. | Storage system and storage system management method |
WO2008047070A1 (en) * | 2006-10-17 | 2008-04-24 | Arm Limited | Handling of write access requests to shared memory in a data processing apparatus |
US20140215159A1 (en) * | 2010-09-23 | 2014-07-31 | International Business Machines Corporation | Managing concurrent accesses to a cache |
Legal Events
Code | Description
---|---
121 | Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 14904954; Country of ref document: EP; Kind code of ref document: A1)
NENP | Non-entry into the national phase (Ref country code: DE)
122 | Ep: pct application non-entry in european phase (Ref document number: 14904954; Country of ref document: EP; Kind code of ref document: A1)