GB2579757A - Handling effective address synonyms in a load-store unit that operates without address translation - Google Patents

Info

Publication number
GB2579757A
Authority
GB
United Kingdom
Prior art keywords
effective address
instruction
entry
address
ert
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB2006344.2A
Other versions
GB2579757B (en)
GB202006344D0 (en)
Inventor
Sinharoy Balaram
Lloyd Bryan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US15/726,627 (US11175924B2)
Priority claimed from US15/726,596 (US10606591B2)
Application filed by International Business Machines Corp
Publication of GB202006344D0
Publication of GB2579757A
Application granted
Publication of GB2579757B
Legal status: Active (current)
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30181Instruction operation extension or modification
    • G06F9/30189Instruction operation extension or modification according to execution mode, e.g. mode flag
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3824Operand accessing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3824Operand accessing
    • G06F9/383Operand prefetching
    • G06F9/3832Value prediction for operands; operand history buffers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3838Dependency mechanisms, e.g. register scoreboarding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3851Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3854Instruction completion, e.g. retiring, committing or graduating
    • G06F9/3856Reordering of instructions, e.g. using queues or age tags
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10Address translation
    • G06F12/1027Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
    • G06F12/1045Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache
    • G06F12/1063Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache the data cache being concurrently virtually addressed
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10Providing a specific technical effect
    • G06F2212/1008Correctness of operation, e.g. memory ordering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/65Details of virtual memory and virtual address translation
    • G06F2212/652Page size control
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/65Details of virtual memory and virtual address translation
    • G06F2212/655Same page detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/65Details of virtual memory and virtual address translation
    • G06F2212/657Virtual address space management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/68Details of translation look-aside buffer [TLB]
    • G06F2212/681Multi-level TLB, e.g. microTLB and main TLB
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3824Operand accessing
    • G06F9/3834Maintaining memory consistency

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Advance Control (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Executing Machine-Instructions (AREA)

Abstract

Technical solutions are described for issuing, by a load-store unit (LSU), a plurality of instructions from an out-of-order (OoO) window. The issuing includes, in response to determining that a first instruction uses a first effective address corresponding to a first real address, creating an effective real table (ERT) entry in an ERT, the ERT entry mapping the first effective address to the first real address. Further, the issuing includes, in response to determining that a second instruction uses an effective address synonym, that is, a second effective address that also corresponds to said first real address: creating a synonym detection table (SDT) entry in an SDT, wherein the SDT entry maps the second effective address to the ERT entry; and relaunching the second instruction by replacing the second effective address in the second instruction with the first effective address.
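
The flow in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the class name `SynonymTracker`, the `page_table` dictionary standing in for address translation, the reverse index `ra_to_ea`, and the use of page-granular integer addresses are all hypothetical. The SDT here maps a synonym to the primary effective address, which serves as the key of the ERT entry.

```python
from dataclasses import dataclass, field


@dataclass
class SynonymTracker:
    """Toy model of the ERT/SDT flow; addresses are page-granular integers."""
    page_table: dict                                # hypothetical EA -> RA translation source
    ert: dict = field(default_factory=dict)         # primary EA -> RA
    ra_to_ea: dict = field(default_factory=dict)    # reverse index used to spot synonyms
    sdt: dict = field(default_factory=dict)         # synonym EA -> primary EA

    def launch(self, ea):
        """Return the effective address the instruction finally executes with."""
        if ea in self.ert:              # ERT hit: the EA is already the primary one
            return ea
        if ea in self.sdt:              # known synonym: relaunch with the primary EA
            return self.sdt[ea]
        ra = self.page_table[ea]        # miss in both tables: translate once
        primary = self.ra_to_ea.get(ra)
        if primary is None:             # first EA seen for this RA: create ERT entry
            self.ert[ea] = ra
            self.ra_to_ea[ra] = ea
            return ea
        self.sdt[ea] = primary          # synonym detected: create SDT entry, relaunch
        return primary


tracker = SynonymTracker(page_table={0x10: 0xA0, 0x20: 0xA0, 0x30: 0xB0})
print(hex(tracker.launch(0x10)))  # 0x10 (creates the ERT entry)
print(hex(tracker.launch(0x20)))  # 0x10 (synonym, relaunched with the first EA)
```

After the second call, later uses of `0x20` are redirected through the SDT without consulting `page_table` again, which mirrors the idea of a load-store unit that works on effective addresses only.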

Claims (20)

1. A processing unit for executing one or more instructions, the processing unit comprising: a load-store unit (LSU) for transferring data between memory and registers, the LSU configured to execute a plurality of instructions in an out-of-order (OoO) window, the execution comprising: in response to determining a first effective address being used by a first instruction, the first effective address corresponding to a first real address, creating an effective real table (ERT) entry in an ERT, the ERT entry mapping the first effective address to the first real address; and in response to determining an effective address synonym used by a second instruction, the effective address synonym being a second effective address that is also corresponding to said first real address: creating a synonym detection table (SDT) entry in an SDT, wherein the SDT entry maps the second effective address to the ERT entry; and relaunching the second instruction by replacing the second effective address in the second instruction with the first effective address.
2. The processing unit of claim 1, wherein, in response to the second effective address also corresponding to said first real address: comparing a first page size associated with the first instruction with a second page size associated with the second instruction; and wherein the SDT entry that maps the second effective address to the ERT is created in response to the first page size being greater than the second page size.
3. The processing unit of claim 2, wherein, in response to the first page size being smaller than the second page size: modifying the ERT entry by replacing the mapping between the first effective address and the first real address with a mapping between the second effective address and the first real address.
4. The processing unit of claim 3, wherein, further in response to the first page size being smaller than the second page size: creating the SDT entry that maps the first effective address to the ERT entry.
5. The processing unit of claim 1, wherein the SDT entry comprises a thread identifier of a thread on which the first instruction is launched, the effective address of the first instruction, a page size of the first instruction, a relaunch effective address of the first instruction, and an ERT entry identifier of the corresponding ERT entry.
6. The processing unit of claim 1, wherein the first instruction is one from a group of instructions consisting of a load instruction and a store instruction.
7. The processing unit of claim 1, wherein a counter is maintained to indicate a number of instructions launched with the first effective address, and in response to the counter crossing a predetermined threshold, invalidating the ERT entry corresponding to the first effective address.
8. A computer-implemented method for executing one or more out-of-order instructions by a processing unit, the method comprising: issuing, by a load-store unit (LSU), a plurality of instructions from an out-of-order (OoO) window, the issuing comprising: in response to determining a first effective address being used by a first instruction, the first effective address corresponding to a first real address, creating an effective real table (ERT) entry in an ERT, the ERT entry mapping the first effective address to the first real address; and in response to determining an effective address synonym used by a second instruction, the effective address synonym being a second effective address that is also corresponding to said first real address: creating a synonym detection table (SDT) entry in an SDT, wherein the SDT entry maps the second effective address to the ERT entry; and relaunching the second instruction by replacing the second effective address in the second instruction with the first effective address.
9. The computer-implemented method of claim 8, wherein, in response to the second effective address also corresponding to said first real address: comparing a first page size associated with the first instruction with a second page size associated with the second instruction; and wherein, the SDT entry that maps the second effective address to the ERT entry is created in response to the first page size being greater than the second page size.
10. The computer-implemented method of claim 9, wherein, in response to the first page size being smaller than the second page size: modifying the ERT entry by replacing the mapping between the first effective address and the first real address with a mapping between the second effective address and the first real address.
11. The computer-implemented method of claim 10, wherein, in response to the first page size being smaller than the second page size: creating the SDT entry that maps the first effective address to the ERT entry.
12. The computer-implemented method of claim 8, wherein the SDT entry comprises a thread identifier of a thread on which the first instruction is launched, the effective address of the first instruction, a page size of the first instruction, a relaunch effective address of the first instruction, and an ERT entry identifier of the corresponding ERT entry.
13. The computer-implemented method of claim 8, wherein the first instruction is one from a group of instructions consisting of a load instruction and a store instruction.
14. The computer-implemented method of claim 8, wherein a counter is maintained to indicate a number of instructions launched with the first effective address, and in response to the counter crossing a predetermined threshold, invalidating the ERT entry corresponding to the first effective address.
15. A computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform operations comprising: issuing, by a load-store unit (LSU), a plurality of instructions from an out-of-order (OoO) window by: in response to determining a first effective address being used by a first instruction, the first effective address corresponding to a first real address, creating an effective real table (ERT) entry in an ERT, the ERT entry mapping the first effective address to the first real address; and in response to determining an effective address synonym used by a second instruction, the effective address synonym being a second effective address that is also corresponding to said first real address: creating a synonym detection table (SDT) entry in an SDT, wherein the SDT entry maps the second effective address to the ERT entry; and relaunching the second instruction by replacing the second effective address in the second instruction with the first effective address.
16. The computer program product of claim 15, wherein, in response to the second effective address also corresponding to said first real address: comparing a first page size associated with the first instruction with a second page size associated with the second instruction; and wherein, the SDT entry that maps the second effective address to the ERT entry is created in response to the first page size being greater than the second page size.
17. The computer program product of claim 16, wherein, in response to the first page size being smaller than the second page size: modifying the ERT entry by replacing the mapping between the first effective address and the first real address with a mapping between the second effective address and the first real address.
18. The computer program product of claim 17, wherein, in response to the first page size being smaller than the second page size: creating the SDT entry that maps the first effective address to the ERT entry.
19. The computer program product of claim 15, wherein the SDT entry comprises a thread identifier of a thread on which the first instruction is launched, the effective address of the first instruction, a page size of the first instruction, a relaunch effective address of the first instruction, and an ERT entry identifier of the corresponding ERT entry.
20. The computer program product of claim 15, wherein the first instruction is one from a group of instructions consisting of a load instruction and a store instruction.
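
The page-size rule of claims 2 to 4, the SDT entry fields of claim 5, and the launch counter of claim 7 can be sketched in software as follows. This is an illustrative model only: the names `ErtEntry`, `SdtEntry`, `resolve_synonym`, and `count_launch` are hypothetical, and the `SdtEntry` fields loosely mirror the list in claim 5. The claims do not say what happens when both page sizes are equal, so this sketch assumes the existing ERT entry is kept; it also assumes that "crossing" the threshold means strictly exceeding it.

```python
from dataclasses import dataclass


@dataclass
class ErtEntry:
    ea: int
    ra: int
    page_size: int
    launches: int = 0      # launch counter (claim 7)
    valid: bool = True


@dataclass
class SdtEntry:
    thread_id: int
    ea: int                # effective address recorded as the synonym
    page_size: int
    relaunch_ea: int       # effective address substituted on relaunch
    ert_id: int            # identifier of the corresponding ERT entry


def resolve_synonym(ert, ert_id, thread_id, new_ea, new_page_size):
    """Apply the page-size rule when new_ea maps to the RA held by ert[ert_id]."""
    entry = ert[ert_id]
    if entry.page_size >= new_page_size:
        # Existing mapping stays; the new EA becomes the synonym (claim 2).
        # The equal-size case is an assumption of this sketch.
        return SdtEntry(thread_id, new_ea, new_page_size, entry.ea, ert_id)
    # The new mapping covers the larger page, so it takes over the ERT entry
    # (claim 3) and the old EA is recorded as the synonym (claim 4).
    sdt_entry = SdtEntry(thread_id, entry.ea, entry.page_size, new_ea, ert_id)
    entry.ea, entry.page_size = new_ea, new_page_size
    return sdt_entry


def count_launch(entry, threshold):
    """Count one launch; invalidate the entry once the count exceeds threshold."""
    entry.launches += 1
    if entry.launches > threshold:
        entry.valid = False
    return entry.valid


ert = {0: ErtEntry(ea=0x1000, ra=0x9000, page_size=64 * 1024)}
small = resolve_synonym(ert, 0, thread_id=1, new_ea=0x2000, new_page_size=4 * 1024)
print(hex(small.relaunch_ea))   # 0x1000 (ERT entry unchanged)
large = resolve_synonym(ert, 0, thread_id=1, new_ea=0x3000, new_page_size=16 * 1024 * 1024)
print(hex(ert[0].ea))           # 0x3000 (ERT entry now holds the larger-page EA)
```

Keeping the mapping with the larger page size in the ERT means a single ERT entry covers the widest address range, while the SDT redirects the smaller-page alias to it.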
GB2006344.2A 2017-10-06 2018-10-03 Handling effective address synonyms in a load-store unit that operates without address translation Active GB2579757B (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US15/726,627 US11175924B2 (en) 2017-10-06 2017-10-06 Load-store unit with partitioned reorder queues with single cam port
US15/726,596 US10606591B2 (en) 2017-10-06 2017-10-06 Handling effective address synonyms in a load-store unit that operates without address translation
US15/825,453 US11175925B2 (en) 2017-10-06 2017-11-29 Load-store unit with partitioned reorder queues with single cam port
US15/825,494 US10606592B2 (en) 2017-10-06 2017-11-29 Handling effective address synonyms in a load-store unit that operates without address translation
PCT/IB2018/057694 WO2019069255A1 (en) 2017-10-06 2018-10-03 Handling effective address synonyms in a load-store unit that operates without address translation

Publications (3)

Publication Number Publication Date
GB202006344D0 GB202006344D0 (en) 2020-06-17
GB2579757A GB2579757A (en) 2020-07-01
GB2579757B GB2579757B (en) 2020-11-18

Family

ID=65994519

Family Applications (2)

Application Number Title Priority Date Filing Date
GB2006344.2A Active GB2579757B (en) 2017-10-06 2018-10-03 Handling effective address synonyms in a load-store unit that operates without address translation
GB2006338.4A Active GB2579534B (en) 2017-10-06 2018-10-03 Load-store unit with partitioned reorder queues with single CAM port

Family Applications After (1)

Application Number Title Priority Date Filing Date
GB2006338.4A Active GB2579534B (en) 2017-10-06 2018-10-03 Load-store unit with partitioned reorder queues with single CAM port

Country Status (5)

Country Link
JP (2) JP7064273B2 (en)
CN (2) CN111133413B (en)
DE (2) DE112018004006B4 (en)
GB (2) GB2579757B (en)
WO (2) WO2019069255A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2023056289A (en) 2021-10-07 2023-04-19 富士通株式会社 Arithmetic processing unit, and arithmetic processing method
CN114780146B (en) * 2022-06-17 2022-08-26 深流微智能科技(深圳)有限公司 Resource address query method, device and system

Citations (4)

Publication number Priority date Publication date Assignee Title
US7343469B1 (en) * 2000-09-21 2008-03-11 Intel Corporation Remapping I/O device addresses into high memory using GART
CN103198028A (en) * 2013-03-18 2013-07-10 华为技术有限公司 Method, device and system for migrating stored data
WO2016105961A1 (en) * 2014-12-26 2016-06-30 Wisconsin Alumni Research Foundation Cache accessed using virtual addresses
US9740409B2 (en) * 2013-12-13 2017-08-22 Ineda Systems, Inc. Virtualized storage systems

Family Cites Families (12)

Publication number Priority date Publication date Assignee Title
US6694425B1 (en) 2000-05-04 2004-02-17 International Business Machines Corporation Selective flush of shared and other pipeline stages in a multithread processor
US6931639B1 (en) * 2000-08-24 2005-08-16 International Business Machines Corporation Method for implementing a variable-partitioned queue for simultaneous multithreaded processors
US20040117587A1 (en) * 2002-12-12 2004-06-17 International Business Machines Corp. Hardware managed virtual-to-physical address translation mechanism
US7730282B2 (en) * 2004-08-11 2010-06-01 International Business Machines Corporation Method and apparatus for avoiding data dependency hazards in a microprocessor pipeline architecture using a multi-bit age vector
US8145887B2 (en) * 2007-06-15 2012-03-27 International Business Machines Corporation Enhanced load lookahead prefetch in single threaded mode for a simultaneous multithreaded microprocessor
US8645974B2 (en) * 2007-08-02 2014-02-04 International Business Machines Corporation Multiple partition adjunct instances interfacing multiple logical partitions to a self-virtualizing input/output device
US7711929B2 (en) 2007-08-30 2010-05-04 International Business Machines Corporation Method and system for tracking instruction dependency in an out-of-order processor
US8639884B2 (en) * 2011-02-28 2014-01-28 Freescale Semiconductor, Inc. Systems and methods for configuring load/store execution units
US9182991B2 (en) * 2012-02-06 2015-11-10 International Business Machines Corporation Multi-threaded processor instruction balancing through instruction uncertainty
US8966232B2 (en) 2012-02-10 2015-02-24 Freescale Semiconductor, Inc. Data processing system operable in single and multi-thread modes and having multiple caches and method of operation
GB2503438A (en) * 2012-06-26 2014-01-01 Ibm Method and system for pipelining out of order instructions by combining short latency instructions to match long latency instructions
US10209995B2 (en) * 2014-10-24 2019-02-19 International Business Machines Corporation Processor core including pre-issue load-hit-store (LHS) hazard prediction to reduce rejection of load instructions

Also Published As

Publication number Publication date
GB2579534B (en) 2020-12-16
CN111133413A (en) 2020-05-08
JP7025100B2 (en) 2022-02-24
DE112018004006B4 (en) 2021-03-25
GB2579757B (en) 2020-11-18
CN111133421B (en) 2023-09-29
WO2019069256A1 (en) 2019-04-11
JP2020536308A (en) 2020-12-10
GB202006338D0 (en) 2020-06-17
CN111133421A (en) 2020-05-08
DE112018004006T5 (en) 2020-04-16
WO2019069255A1 (en) 2019-04-11
JP7064273B2 (en) 2022-05-10
JP2020536310A (en) 2020-12-10
DE112018004004T5 (en) 2020-04-16
GB2579534A (en) 2020-06-24
CN111133413B (en) 2023-09-29
GB202006344D0 (en) 2020-06-17

Similar Documents

Publication Publication Date Title
US9251088B2 (en) Mechanisms for eliminating a race condition between a hypervisor-performed emulation process requiring a translation operation and a concurrent translation table entry invalidation
US9703562B2 (en) Instruction emulation processors, methods, and systems
US10474369B2 (en) Mapping guest pages to disk blocks to improve virtual machine management processes
US10877793B2 (en) Extending the base address register by modifying the number of read-only bits associated with a device to be presented to a guest operating system
US9772867B2 (en) Control area for managing multiple threads in a computer
US10346330B2 (en) Updating virtual machine memory by interrupt handler
US9223574B2 (en) Start virtual execution instruction for dispatching multiple threads in a computer
US9454400B2 (en) Memory duplication by origin host in virtual machine live migration
US20140281398A1 (en) Instruction emulation processors, methods, and systems
US10055136B2 (en) Maintaining guest input/output tables in swappable memory
US10339068B2 (en) Fully virtualized TLBs
RU2016126976A (en) GENERAL DOWNLOAD SEQUENCE FOR A MANAGING SERVICE PROGRAM ABLE TO INITIALIZE IN MULTIPLE ARCHITECTURES
TWI790242B (en) Address translation data invalidation
US10049043B2 (en) Flushing control within a multi-threaded processor
GB2580854A (en) Bulk store and load operations of configuration state registers
US20150277946A1 (en) Dispatching multiple threads in a computer
US20160055108A1 (en) Managing message signaled interrupts in virtualized computer systems
GB2577468A (en) Sharing virtual and real translations in a virtual cache
GB2582095A (en) Context switch by changing memory pointers
GB2579757A (en) Handling effective address synonyms in a load-store unit that operates without address translation
WO2015017129A4 (en) Multi-threaded gpu pipeline
US20170357595A1 (en) Tlb shootdowns for low overhead
CN105989758B (en) Address translation method and apparatus
US9507724B2 (en) Memory access processing method and information processing device
US9389897B1 (en) Exiting multiple threads of a simulation environment in a computer

Legal Events

Date Code Title Description
746 Register noted 'licences of right' (sect. 46/1977)

Effective date: 20201201