CN112363824A - Memory virtualization method and system under Shenwei architecture - Google Patents

Memory virtualization method and system under Shenwei architecture

Info

Publication number
CN112363824A
CN112363824A (application CN202011084199.2A)
Authority
CN
China
Prior art keywords
page table
tlb
address
shadow
client
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011084199.2A
Other languages
Chinese (zh)
Other versions
CN112363824B (en)
Inventor
沙赛
罗英伟
汪小林
张毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Advanced Technology Research Institute
Peking University
Original Assignee
Wuxi Advanced Technology Research Institute
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Advanced Technology Research Institute and Peking University
Priority to CN202011084199.2A
Publication of CN112363824A
Application granted
Publication of CN112363824B
Legal status: Active (current)
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5077Logical partitioning of resources; Management or configuration of virtualized resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10Address translation
    • G06F12/1009Address translation using page tables, e.g. page table structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45583Memory management, e.g. access or allocation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention relates to a memory virtualization method and system under the Shenwei architecture. The method comprises the following steps: establishing a buffer for storing shadow page table base addresses; when the CPU queries the TLB and a TLB miss occurs, the CPU accesses the buffer to obtain the shadow page table base address of the current process, loads it into the memory management unit, and starts a page table walk; when a mapping is missing during the page table walk, the CPU switches from the guest context to the host context to handle the page fault; the virtual-to-physical address mapping obtained from page fault handling is filled directly into the corresponding TLB to realize TLB prefetching; and the CPU queries the TLB again to complete the translation from guest virtual address to host physical address. Based on the software-managed TLB of the Shenwei architecture, the invention refreshes the shadow page table and the TLB simultaneously, thereby keeping the shadow page table synchronized with the guest process page table.

Description

Memory virtualization method and system under Shenwei architecture
Technical Field
The invention relates to the field of Shenwei architecture virtualization, in particular to a method and a system for realizing efficient memory virtualization under the Shenwei architecture.
Background
The Shenwei family of processors, a representative of Chinese domestic processors, has attracted wide attention. The success of the Sunway TaihuLight supercomputer established Shenwei's important position among domestic processors. Shenwei servers are particularly favored in sectors with high requirements for security and autonomous controllability, such as government, where they are mainly used for desktop office systems. The first generation of the Shenwei instruction set was derived from the Alpha instruction set and has since been developed, through continuous improvement, into an independent and autonomous instruction set.
Compared with mainstream international processor architectures such as x86, the Shenwei architecture still has a significant gap in functionality and performance. With the continuous development of information technology, Shenwei processors are no longer limited to desktop office systems and are moving toward broader cloud service systems. Virtualization is one of the main supporting technologies for cloud services. Virtualization turns one physical computer system into one or more virtual computer systems (virtual machines), each of which has its own virtual hardware (CPU, memory, etc.) and provides an independent, complete execution environment. Virtualization mainly targets three kinds of physical resources: CPU virtualization, memory virtualization, and I/O virtualization. Memory virtualization is the most complex of the three, and its quality is often the bottleneck of virtual machine performance.
From the perspective of the operating system, physical memory must satisfy two basic assumptions: physical addresses start at zero, and memory addresses are contiguous. A virtual machine runs on the host as an ordinary process, so these two conditions are difficult to meet directly. Virtualization introduces a new layer of system software, called the virtual machine monitor (hypervisor), which controls guest operating system access to physical resources. To satisfy the two assumptions, the hypervisor introduces a new address space, the guest physical address space. In a computer system, a CPU memory access involves two steps: virtual-to-physical address translation, followed by accessing the memory data at the physical address. Virtual-to-physical address translation converts the virtual addresses used by a program into actual physical addresses.
In a virtualized environment, address translation takes two layers: guest virtual address -> guest physical address -> host physical address. The task of memory virtualization is to complete this two-layer address translation efficiently. The translation overhead consists mainly of three parts: TLB lookup, page table walk, and page fault handling.
Existing memory virtualization solutions on mainstream architectures fall into two types: software memory virtualization, represented by the traditional shadow page table, and hardware-assisted virtualization, represented by the extended page table. Both are implemented on mainstream architectures such as x86, but neither suits a Shenwei processor. The extended page table model relies on hardware to complete the two-layer address translation efficiently; the Shenwei architecture lacks this hardware support, and a pure-software implementation of the model cannot meet practical performance requirements. Moreover, although the extended page table model reduces page fault handling overhead compared with the traditional shadow page table, it introduces additional page table walk overhead.
As for the traditional shadow page table model, on one hand its write-protection synchronization mechanism makes the implementation extremely complex and inefficient; on the other hand it cannot exploit the software flexibility unique to the Shenwei architecture. Compared with x86, the Shenwei architecture has distinctive virtualization advantages. First, the Shenwei architecture uses a software-managed Translation Lookaside Buffer (TLB), which provides the necessary conditions for memory virtualization optimization. The TLB is a small hardware cache that directly stores virtual-to-physical address mappings; it sits next to the CPU, which consults it first on every address translation. Second, the Shenwei architecture has a hardware mode with higher privilege than kernel mode and a unique programmable software interface, HMcode. This interface runs in hardware mode with the highest system privilege and can directly access registers, memory, and devices such as the TLB. This gives the Shenwei architecture very high low-level software flexibility and rich, diverse support for virtualization.
The Shenwei architecture has unique virtualization advantages. In addition to user mode and kernel mode, it has a mode with the highest privilege, called hardware mode. Host and guest under the Shenwei architecture can therefore have three orthogonal privilege levels, namely user mode, kernel mode, and hardware mode, which is similar to the Intel VMX operating modes. The Shenwei HMcode is a programmable interface between the kernel layer and the hardware; it runs in hardware mode and executes privileged instructions. The HMcode interface is transparent to the user layer and even the kernel layer, and can directly access registers and memory by physical address. The operating system can trap into hardware mode through system calls. For example, HMcode provides the kernel with a TLB flush interface called TBI. Similar to VPID (Virtual Processor ID) and PCID (Process Context ID) in the x86 TLB, the VPN (Virtual Processor Number) and UPN (User Process Number) in the Sunway TLB distinguish different virtual processors and processes, respectively. The HMcode interface thus provides software flexibility and can also help compensate for missing virtualization hardware support.
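Purely for illustration, the following C sketch shows how a software-managed TLB entry might carry the VPN and UPN tags described above; the field names and widths are assumptions for exposition, not the actual Shenwei hardware format.

    #include <stdint.h>

    /* Hypothetical layout of a software-managed TLB entry; field names and
     * widths are illustrative assumptions, not the real Shenwei format. */
    struct sw_tlb_entry {
        uint64_t gva;        /* guest virtual page being mapped            */
        uint64_t hpa;        /* host physical frame it translates to       */
        uint8_t  vpn;        /* virtual processor number (2 bits used)     */
        uint8_t  upn;        /* user process number (8 bits used)          */
        uint8_t  valid;      /* entry holds a live mapping                 */
    };

    /* An entry matches a lookup only if the address and both tags agree. */
    static int tlb_entry_matches(const struct sw_tlb_entry *e,
                                 uint64_t gva, uint8_t vpn, uint8_t upn)
    {
        return e->valid && e->gva == gva && e->vpn == vpn && e->upn == upn;
    }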
Disclosure of Invention
The invention aims to realize a memory virtualization system on a Shenwei server that fully exploits the advantages of the Shenwei architecture, in particular its software-managed TLB (translation lookaside buffer). Specifically, a memory virtualization system for the Shenwei 1621 server is built on the shadow page table model, making full use of the Shenwei programmable software interface HMcode and the software-managed TLB. The core idea of the invention is to refresh the shadow page table and the TLB simultaneously, based on the software-managed TLB of the Shenwei architecture, thereby keeping the shadow page table synchronized with the guest process page table.
The technical scheme adopted by the invention is as follows:
a memory virtualization method under Shenwei architecture comprises the following steps:
establishing a buffer area for storing a base address of a shadow page table;
when the CPU queries the TLB and generates TLB miss, the CPU accesses the buffer area to acquire the base address of the shadow page table of the current process by using the TLB characteristic managed by the Shenwei architecture software, loads the base address of the shadow page table into a memory management unit and starts page table query;
when mapping is missing in page table query, the CPU switches the client context to the host context to perform missing page interrupt processing;
filling the TLB by utilizing the characteristic that Shenwei architecture software fills the TLB, and directly filling virtual-real address translation mapping obtained after the missing page interrupt processing into the corresponding TLB to realize the TLB prefetching;
and the CPU inquires the TLB again to complete the address conversion from the virtual address of the client to the physical address of the host machine.
Further, the buffer uses 16 physical pages, each storing 1024 entries; the buffer is indexed by the combination of a 2-bit VPN and an 8-bit UPN, and each entry contains the 64-bit shadow page table base address of one process.
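As an illustration of the buffer layout described above, the following C sketch indexes a per-core page of shadow page table base addresses by concatenating the 2-bit VPN and the 8-bit UPN; the names and exact layout are assumptions for exposition, not code from the patent.

    #include <stdint.h>

    #define SPT_BASE_ENTRIES_PER_PAGE 1024  /* 2-bit VPN + 8-bit UPN -> 1024 slots */
    #define SPT_BASE_PAGES            16    /* one 8 KB page per physical core     */

    /* Per-core buffer of shadow page table base addresses: one 64-bit slot
     * per (VPN, UPN) pair. Names are illustrative, not the patent's code.  */
    typedef uint64_t spt_base_page_t[SPT_BASE_ENTRIES_PER_PAGE];

    static inline unsigned spt_base_index(unsigned vpn, unsigned upn)
    {
        /* concatenate the 2-bit VPN and the 8-bit UPN into a 10-bit index */
        return ((vpn & 0x3u) << 8) | (upn & 0xffu);
    }

    static inline uint64_t spt_base_lookup(const spt_base_page_t buf,
                                           unsigned vpn, unsigned upn)
    {
        return buf[spt_base_index(vpn, upn)];
    }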
Further, a 4-level shadow page table structure is adopted for the page table walk: the memory management unit traverses the 4-level shadow page table to obtain the mapping from guest virtual address to host physical address; if the walk succeeds, the mapping is filled into the TLB and the CPU queries the TLB again to complete the address translation; a mapping miss at any level of the page table results in a page fault.
Further, the page fault handling includes: guest process page table walk, host process page table walk, and shadow page table construction and filling.
Further, the page fault handling comprises:
walking the guest process page table to translate the guest virtual address into a guest physical address, and, if the guest process page table is missing or incompletely mapped, re-entering the virtual machine to complete the guest process page table;
converting the guest physical address into a host virtual address, and walking the host process page table to translate the host virtual address into a host physical address;
and using the guest-virtual-to-host-physical mapping obtained from the two walks, constructing a four-level page table according to the shadow page table organization and filling in the mapping.
Further, synchronization of the shadow page table with the guest process page table is maintained as follows:
the guest operating system flushes the TLB entries of the current process directly through a system call, without exiting the virtual machine;
a shadow page table flusher is implemented in the HMcode interface to monitor the software-managed TLB interface;
the shadow page table flusher decodes the captured TLB flush instruction and invalidates the corresponding shadow page table entries, thereby flushing the TLB and the shadow page table simultaneously.
Further, TLB flush requests are issued through two interfaces in the Shenwei operating system: one is the operating system's process page table fault handling function, and the other is the process context switch handling function.
Further, obsolete shadow page tables are reclaimed in time through the following steps:
when the TLB flush instruction captured by the shadow page table flusher flushes the TLB entries of an entire process, immediately invalidating the corresponding process entry stored in the shadow page table base address buffer;
during shadow page fault handling, when the hypervisor constructs a shadow page table mapping, first checking the valid bit of the current process's base address in the buffer, and directly reclaiming all shadow page tables of the current process if the bit is invalid.
A memory virtualization system under the Shenwei architecture using the above method comprises:
a TLB query module, used for the CPU to query the TLB to obtain the virtual-to-physical address mapping;
a page table query module, used, when a TLB miss occurs, for the CPU to access a pre-established buffer storing shadow page table base addresses (using the software-managed TLB of the Shenwei architecture) to obtain the shadow page table base address of the current process, load it into the memory management unit, and start a page table walk;
a page fault handling module, used for the CPU to switch from the guest context to the host context for page fault handling when a mapping is missing during the page table walk;
and a TLB prefetch module, used to fill the virtual-to-physical address mapping obtained from the page fault handling directly into the corresponding TLB (using the Shenwei architecture's software TLB filling), realizing TLB prefetching, so that the translation from guest virtual address to host physical address completes when the CPU queries the TLB again.
A virtual machine based on the Shenwei architecture adopts the above method for memory virtualization.
The invention provides a novel memory virtualization method and system under the Shenwei architecture, based on the characteristics of the architecture and in particular its software-managed TLB mechanism. On one hand, it builds on the shadow page table, inheriting the efficient page table walk of the traditional shadow page table model while eliminating the page fault overhead caused by that model's write-protection synchronization; on the other hand, it requires no complex hardware support and, unlike the extended page table model, introduces no extra page table walk cost.
Drawings
Fig. 1 is a diagram of the implementation interfaces of the memory virtualization model on the Shenwei architecture.
FIG. 2 is a diagram of the Shenwei memory virtualization overhead results using the SPEC CPU2006 test set.
FIG. 3 is a graph of x86 memory virtualization overhead results using the SPEC CPU2006 test set.
FIG. 4 is a comparison of Shenwei and x86 memory virtualization overhead using large-working-set programs from SPEC CPU2017.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, the present invention shall be described in further detail with reference to the following detailed description and accompanying drawings.
The "software managed TLB" described in the present disclosure refers to an architecture that exposes a TLB software management interface for an operating system to flush/fill TLB entries.
The shadow page table is used for directly caching mapping relation from virtual address of a client to physical address of a host in a virtualization environment. This is an important method for accelerating the efficiency of two-layer address translation.
The page table synchronization refers to the fact that in a shadow page table memory virtualization model, a shadow page table needs to keep mapping consistency with a client process page table, and the process of maintaining the consistency is called page table synchronization.
The present invention relates to a TLB and shadow page table simultaneous refreshing method, which is characterized in that a monitoring-capturing TLB refreshing instruction is realized based on a Shenwei architecture TLB software management interface, and corresponding shadow page table entries are refreshed simultaneously. In a computer system, to ensure the validity of TLB entries, once the process page table is modified by the operating system, the corresponding TLB entry must be flushed, i.e., the old mapping invalidated.
KVM is a module in the Linux kernel and an efficient open-source virtualization solution. It consists of a loadable kernel module that provides the core virtualization infrastructure and a processor-specific module for architecture emulation and interrupt handling. The invention implements a KVM-based memory virtualization model on the Shenwei 1621 server. Fig. 1 shows the implementation interfaces of the memory virtualization model on the Shenwei architecture, where I-TLB denotes the instruction TLB and D-TLB the data TLB. Memory virtualization in the invention covers two main tasks: efficiently completing the translation from guest virtual addresses to host physical addresses, and keeping the shadow page table synchronized with the guest process page table.
1. Address translation flow:
The task of memory virtualization is to complete the translation from a guest virtual address to a host physical address.
1) TLB query. The CPU accesses the TLB and looks up a mapping by the guest virtual address; on a TLB hit, address translation is complete; otherwise a TLB miss occurs and a page table walk begins.
2) Page table walk. The invention designs a buffer of limited size to store shadow page table base addresses. Using the software-managed TLB of the Shenwei architecture, the CPU accesses this buffer before the page table walk to obtain the shadow page table base address of the current process. The Shenwei 1621 has 16 physical cores and an 8 KB page size, so the buffer uses 16 physical pages, each storing 1024 entries; it is indexed by the combination of a 2-bit VPN and an 8-bit UPN, and each entry contains the 64-bit shadow page table base address of one process. The shadow page table base address is loaded into the memory management unit and the page table walk begins. The invention adopts a 4-level shadow page table structure; the memory management unit traverses the 4 levels to obtain the mapping from guest virtual address to host physical address (a sketch of such a walk follows below). If the walk succeeds, the mapping is filled into the TLB and the CPU queries the TLB again, completing the address translation. A mapping miss at any level results in a page fault.
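For illustration only, the following C sketch shows how a hypervisor might walk a 4-level shadow page table in software, assuming 8 KB pages and 1024 entries per level; the constants, the phys_to_virt helper, and the entry format are assumptions, not the actual Shenwei or KVM implementation.

    #include <stdint.h>

    #define PTE_VALID   0x1ull
    #define PAGE_SHIFT  13                 /* 8 KB pages, as stated above       */
    #define LEVEL_BITS  10                 /* assumed index width per level     */
    #define LEVEL_MASK  ((1ull << LEVEL_BITS) - 1)

    /* Assumed helper: map a host physical address to a dereferenceable pointer
     * (a real hypervisor would go through its direct map). */
    extern void *phys_to_virt(uint64_t hpa);

    /* Walk a 4-level shadow page table. Returns the host physical address of
     * the data, or 0 to signal a mapping miss (which triggers a page fault). */
    static uint64_t shadow_walk(uint64_t spt_base, uint64_t gva)
    {
        uint64_t table = spt_base;

        for (int level = 3; level >= 0; level--) {
            unsigned idx = (gva >> (PAGE_SHIFT + level * LEVEL_BITS)) & LEVEL_MASK;
            uint64_t pte = ((uint64_t *)phys_to_virt(table))[idx];

            if (!(pte & PTE_VALID))
                return 0;                  /* miss at any level -> page fault   */

            table = pte & ~((1ull << PAGE_SHIFT) - 1);  /* next table / frame   */
        }
        return table | (gva & ((1ull << PAGE_SHIFT) - 1));
    }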
3) Page fault handling. Once a mapping miss occurs during the page table walk, the CPU switches from the guest context to the host context to handle the page fault. The arguments passed to the page fault handler include the guest virtual address, the error information, and so on. Page fault handling consists of three parts: guest process page table walk, host process page table walk, and shadow page table construction and filling (a sketch of this handler is given after item c) below).
a) The system first walks the guest process page table to translate the guest virtual address into a guest physical address. If the guest process page table is missing (or not fully mapped), the virtual machine must be re-entered so that the guest can complete its own page table.
b) The guest physical address space is contiguous in the host virtual address space, with a direct linear mapping, so the conversion from guest physical address to host virtual address is immediate. The system then walks the host process page table to translate the host virtual address into a host physical address.
c) From the two walks above, the system obtains the mapping from guest virtual address to host physical address. It constructs a four-level page table according to the shadow page table organization and fills in the mapping.
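A minimal C sketch of the three-part fault handler described in a)-c), assuming hypothetical helpers guest_walk, host_walk, gpa_to_hva, shadow_map, and inject_guest_fault; it illustrates the control flow only, not the actual KVM code.

    #include <stdint.h>

    /* Helpers assumed to exist in the hypervisor; names are illustrative.      */
    extern int  guest_walk(uint64_t gva, uint64_t *gpa);   /* 0 on success      */
    extern int  host_walk(uint64_t hva, uint64_t *hpa);    /* 0 on success      */
    extern void shadow_map(uint64_t spt_base, uint64_t gva, uint64_t hpa);
    extern void inject_guest_fault(uint64_t gva);
    extern uint64_t gpa_to_hva(uint64_t gpa);               /* direct linear map */

    /* Sketch of the three-part shadow page fault handler described above. */
    static int shadow_page_fault(uint64_t spt_base, uint64_t gva)
    {
        uint64_t gpa, hpa;

        /* a) guest virtual -> guest physical; an incomplete guest mapping means
         *    re-entering the guest so its own fault handler can fill it in     */
        if (guest_walk(gva, &gpa)) {
            inject_guest_fault(gva);
            return 1;
        }

        /* b) guest physical -> host virtual (linear map) -> host physical */
        if (host_walk(gpa_to_hva(gpa), &hpa))
            return -1;                     /* host mapping must be established  */

        /* c) build the four-level shadow mapping gva -> hpa and fill it */
        shadow_map(spt_base, gva, hpa);
        return 0;
    }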
4) TLB prefetching. After page fault handling finishes, the CPU re-enters the virtual machine to execute the instruction that caused the TLB miss. Before that, the invention uses the Shenwei architecture's software TLB filling to insert the virtual-to-physical mapping produced by page fault handling directly into the corresponding TLB; this is called TLB prefetching. Without prefetching, the CPU would execute the original instruction, take another TLB miss and page table walk, fetch the mapping from the shadow page table, fill the TLB, and only then re-execute the instruction to complete the translation. The TLB prefetch optimization therefore saves one TLB miss and one page table walk.
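For illustration, the following C sketch shows the resume path with TLB prefetching, assuming a hypothetical HMcode-style fill primitive; the names and signatures are not the real interface.

    #include <stdint.h>

    /* Assumed HMcode-style primitives; names and signatures are illustrative. */
    extern void hmcode_tlb_fill(uint64_t gva, uint64_t hpa,
                                uint8_t vpn, uint8_t upn);
    extern void vm_resume(void);

    /* With prefetching, the mapping produced by the fault handler is pushed
     * into the TLB before the guest re-executes the faulting instruction,
     * saving one TLB miss and one page table walk compared to resuming cold. */
    static void resume_after_fault(uint64_t gva, uint64_t hpa,
                                   uint8_t vpn, uint8_t upn)
    {
        hmcode_tlb_fill(gva, hpa, vpn, upn);   /* TLB prefetch                 */
        vm_resume();                           /* re-enter guest at faulting PC */
    }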
2. Page table synchronization:
the guest operating system can flush (invalidate) the TLB entries of the current process directly through system calls without requiring a virtual machine exit (i.e., context switch). In the Shenwei operating system, there are two main interfaces that can issue "TLB flush" requests. One is the operating system process page table page fault handling function. When the operating system updates the process page table, it needs to invalidate the old TLB entries with the same guest virtual address. The other interface is a process context switch handling function. When switching process contexts, all TLB entries under the entire virtual processor should be flushed if a rotation UPN is required. The TLB entry of the Shenwei 1621 contains an 8-bit UPN that identifies the active process. Each process gets a UPN when it is first scheduled on the CPU. If the number of processes exceeds 256, the UPN will rotate, meaning that all TLB entries under the current virtual processor will be flushed, and the system will reassign the UPN to the active process. The present invention implements a shadow page table flusher in the HMcode interface to monitor the TLB interface for software management. The shadow page table flusher decodes the captured TLB flush instruction and invalidates the corresponding shadow page table entry. Therefore, the present invention realizes the simultaneous refreshing of the TLB and the shadow page table.
3. Shadow page table reclamation:
The shadow page table belongs to host process memory and is managed by the hypervisor. In a multitasking virtual machine, each process uses its own shadow page table. Under the Shenwei architecture, 256 process shadow page tables are maintained per virtual processor, i.e., at most 1024 process shadow page tables on one physical core. Frequent process creation and destruction requires the memory virtualization model to reclaim obsolete shadow page tables in a timely manner. TLB flushes have different granularities, such as a single TLB entry or all TLB entries of a process. When the TLB flush instruction captured by the shadow page table flusher flushes the TLB entries of an entire process, the corresponding process entry in the shadow page table base address buffer is immediately invalidated. During shadow page fault handling, when the hypervisor constructs a shadow page table mapping, it first checks the valid bit of the current process's base address in the buffer and directly reclaims all shadow page tables of the current process if the bit is invalid.
4. Experimental evaluation:
To verify the efficiency of the invention, we evaluated it with the SPEC CPU test suites and the STREAM bandwidth benchmark. Since the working sets in SPEC CPU2006 are generally small (under 3 GB), we also selected some large-working-set programs from SPEC CPU2017. Figures 2 and 3 show the SPEC CPU2006 results for the new shadow page table model under the Shenwei architecture and for the traditional shadow page table and extended page table models under x86, respectively. The results show that the average execution-time overhead of memory virtualization with the new shadow page table under the Shenwei architecture is only 1.36%, significantly lower than that of the traditional shadow page table (5.97%) and the extended page table (5.36%) under x86. Figure 4 shows the SPEC CPU2017 results; the new model performs well even with large-working-set programs. In this test the virtualization overhead of the new shadow page table model under the Shenwei architecture is only 3.22%, whereas the traditional shadow page table and extended page table models under x86 reach 9.27% and 11.06%, respectively. STREAM is a classic benchmark for system memory bandwidth; the results show that the bandwidth loss of memory virtualization under the Shenwei architecture is only 0.5%.
Based on the same inventive concept, another embodiment of the present invention provides a memory virtualization system under the Shenwei architecture using the above method, comprising:
a TLB query module, used for the CPU to query the TLB to obtain the virtual-to-physical address mapping;
a page table query module, used, when a TLB miss occurs, for the CPU to access a pre-established buffer storing shadow page table base addresses (using the software-managed TLB of the Shenwei architecture) to obtain the shadow page table base address of the current process, load it into the memory management unit, and start a page table walk;
a page fault handling module, used for the CPU to switch from the guest context to the host context for page fault handling when a mapping is missing during the page table walk;
and a TLB prefetch module, used to fill the virtual-to-physical address mapping obtained from the page fault handling directly into the corresponding TLB (using the Shenwei architecture's software TLB filling), realizing TLB prefetching, so that the translation from guest virtual address to host physical address completes when the CPU queries the TLB again.
Based on the same inventive concept, another embodiment of the present invention provides a virtual machine based on the Shenwei architecture, and the virtual machine performs memory virtualization by using the method of the present invention.
The foregoing disclosure of the specific embodiments of the present invention and the accompanying drawings is directed to an understanding of the present invention and its implementation, and it will be appreciated by those skilled in the art that various alternatives, modifications, and variations may be made without departing from the spirit and scope of the invention. The present invention should not be limited to the disclosure of the embodiments and drawings in the specification, and the scope of the present invention is defined by the scope of the claims.

Claims (10)

1. A memory virtualization method under the Shenwei architecture, characterized by comprising the following steps:
establishing a buffer for storing shadow page table base addresses;
when the CPU queries the TLB and a TLB miss occurs, the CPU, using the software-managed TLB of the Shenwei architecture, accesses the buffer to obtain the shadow page table base address of the current process, loads it into the memory management unit, and starts a page table walk;
when a mapping is missing during the page table walk, the CPU switches from the guest context to the host context to handle the page fault;
using the Shenwei architecture's ability to fill the TLB in software, filling the virtual-to-physical address mapping obtained from the page fault handling directly into the corresponding TLB to realize TLB prefetching;
and the CPU queries the TLB again to complete the translation from guest virtual address to host physical address.
2. The method of claim 1, wherein the buffer uses 16 physical pages, each storing 1024 entries; the buffer is indexed by the combination of a 2-bit VPN and an 8-bit UPN, and each entry contains the 64-bit shadow page table base address of one process.
3. The method of claim 1, wherein a 4-level shadow page table structure is adopted for the page table walk: the memory management unit traverses the 4-level shadow page table to obtain the mapping from guest virtual address to host physical address; if the walk succeeds, the mapping is filled into the TLB and the CPU queries the TLB again to complete the address translation; a mapping miss at any level of the page table results in a page fault.
4. The method of claim 1, wherein the page fault handling comprises: guest process page table walk, host process page table walk, and shadow page table construction and filling.
5. The method of claim 4, wherein the page fault handling comprises:
walking the guest process page table to translate the guest virtual address into a guest physical address, and, if the guest process page table is missing or incompletely mapped, re-entering the virtual machine to complete the guest process page table;
converting the guest physical address into a host virtual address, and walking the host process page table to translate the host virtual address into a host physical address;
and using the guest-virtual-to-host-physical mapping obtained from the two walks, constructing a four-level page table according to the shadow page table organization and filling in the mapping.
6. The method of claim 1, wherein synchronization of the shadow page table with the guest process page table is maintained through the following steps:
the guest operating system flushes the TLB entries of the current process directly through a system call, without exiting the virtual machine;
a shadow page table flusher is implemented in the HMcode interface to monitor the software-managed TLB interface;
the shadow page table flusher decodes the captured TLB flush instruction and invalidates the corresponding shadow page table entries, thereby flushing the TLB and the shadow page table simultaneously.
7. The method of claim 6, wherein the TLB flush request is issued through two interfaces in the Shenwei operating system: one is the operating system's process page table fault handling function, and the other is the process context switch handling function.
8. The method of claim 1, wherein obsolete shadow page tables are reclaimed in time through the following steps:
when the TLB flush instruction captured by the shadow page table flusher flushes the TLB entries of an entire process, immediately invalidating the corresponding process entry stored in the shadow page table base address buffer;
during shadow page fault handling, when the hypervisor constructs a shadow page table mapping, first checking the valid bit of the current process's base address in the buffer, and directly reclaiming all shadow page tables of the current process if the bit is invalid.
9. A memory virtualization system under the Shenwei architecture using the method of any one of claims 1-8, comprising:
a TLB query module, used for the CPU to query the TLB to obtain the virtual-to-physical address mapping;
a page table query module, used, when a TLB miss occurs, for the CPU to access a pre-established buffer storing shadow page table base addresses (using the software-managed TLB of the Shenwei architecture) to obtain the shadow page table base address of the current process, load it into the memory management unit, and start a page table walk;
a page fault handling module, used for the CPU to switch from the guest context to the host context for page fault handling when a mapping is missing during the page table walk;
and a TLB prefetch module, used to fill the virtual-to-physical address mapping obtained from the page fault handling directly into the corresponding TLB (using the Shenwei architecture's software TLB filling), realizing TLB prefetching, so that the translation from guest virtual address to host physical address completes when the CPU queries the TLB again.
10. A virtual machine based on the Shenwei architecture, wherein the virtual machine performs memory virtualization by using the method of any one of claims 1 to 8.
CN202011084199.2A 2020-10-12 2020-10-12 Memory virtualization method and system under Shenwei architecture Active CN112363824B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011084199.2A CN112363824B (en) 2020-10-12 2020-10-12 Memory virtualization method and system under Shenwei architecture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011084199.2A CN112363824B (en) 2020-10-12 2020-10-12 Memory virtualization method and system under Shenwei architecture

Publications (2)

Publication Number Publication Date
CN112363824A true CN112363824A (en) 2021-02-12
CN112363824B CN112363824B (en) 2022-07-22

Family

ID=74506664

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011084199.2A Active CN112363824B (en) 2020-10-12 2020-10-12 Memory virtualization method and system under Shenwei architecture

Country Status (1)

Country Link
CN (1) CN112363824B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112860381A (en) * 2021-03-09 2021-05-28 上海交通大学 Virtual machine memory capacity expansion method and system based on Shenwei processor
CN113297104A (en) * 2021-06-16 2021-08-24 无锡江南计算技术研究所 Address translation device and method facing message transmission mechanism
CN113986775A (en) * 2021-11-03 2022-01-28 苏州睿芯集成电路科技有限公司 Method, system and device for generating page table entries in RISC-V CPU verification
CN114201269A (en) * 2022-02-18 2022-03-18 阿里云计算有限公司 Memory page changing method, system and storage medium
CN114595164A (en) * 2022-05-09 2022-06-07 支付宝(杭州)信息技术有限公司 Method and apparatus for managing TLB cache in virtualized platform
CN114610655A (en) * 2022-05-10 2022-06-10 沐曦集成电路(上海)有限公司 Continuous data access processing device and chip

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102567217A (en) * 2012-01-04 2012-07-11 北京航空航天大学 MIPS platform-oriented memory virtualization method
CN107193759A (en) * 2017-04-18 2017-09-22 上海交通大学 The virtual method of device memory administrative unit
CN110196757A (en) * 2019-05-31 2019-09-03 龙芯中科技术有限公司 TLB filling method, device and the storage medium of virtual machine

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102567217A (en) * 2012-01-04 2012-07-11 北京航空航天大学 MIPS platform-oriented memory virtualization method
CN107193759A (en) * 2017-04-18 2017-09-22 上海交通大学 The virtual method of device memory administrative unit
WO2018192160A1 (en) * 2017-04-18 2018-10-25 上海交通大学 Virtualization method for device memory management unit
CN110196757A (en) * 2019-05-31 2019-09-03 龙芯中科技术有限公司 TLB filling method, device and the storage medium of virtual machine

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CAI Wanwei et al., "Research on Memory Virtualization Based on the MIPS Architecture", Journal of Computer Research and Development *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112860381A (en) * 2021-03-09 2021-05-28 上海交通大学 Virtual machine memory capacity expansion method and system based on Shenwei processor
CN112860381B (en) * 2021-03-09 2022-04-26 上海交通大学 Virtual machine memory capacity expansion method and system based on Shenwei processor
CN113297104A (en) * 2021-06-16 2021-08-24 无锡江南计算技术研究所 Address translation device and method facing message transmission mechanism
CN113297104B (en) * 2021-06-16 2022-11-15 无锡江南计算技术研究所 Address translation device and method facing message transmission mechanism
CN113986775A (en) * 2021-11-03 2022-01-28 苏州睿芯集成电路科技有限公司 Method, system and device for generating page table entries in RISC-V CPU verification
CN113986775B (en) * 2021-11-03 2023-08-18 苏州睿芯集成电路科技有限公司 Page table item generation method, system and device in RISC-V CPU verification
CN114201269A (en) * 2022-02-18 2022-03-18 阿里云计算有限公司 Memory page changing method, system and storage medium
CN114595164A (en) * 2022-05-09 2022-06-07 支付宝(杭州)信息技术有限公司 Method and apparatus for managing TLB cache in virtualized platform
CN114610655A (en) * 2022-05-10 2022-06-10 沐曦集成电路(上海)有限公司 Continuous data access processing device and chip
CN114610655B (en) * 2022-05-10 2022-08-05 沐曦集成电路(上海)有限公司 Continuous data access processing device and chip

Also Published As

Publication number Publication date
CN112363824B (en) 2022-07-22

Similar Documents

Publication Publication Date Title
CN112363824B (en) Memory virtualization method and system under Shenwei architecture
US10303620B2 (en) Maintaining processor resources during architectural events
JP5680179B2 (en) Address mapping in virtual processing systems
US9529611B2 (en) Cooperative memory resource management via application-level balloon
TWI531912B (en) Processor having translation lookaside buffer for multiple context comnpute engine, system and method for enabling threads to access a resource in a processor
US8615643B2 (en) Operational efficiency of virtual TLBs
US6907600B2 (en) Virtual translation lookaside buffer
CN110196757B (en) TLB filling method and device of virtual machine and storage medium
US20020065989A1 (en) Master/slave processing system with shared translation lookaside buffer
US7234038B1 (en) Page mapping cookies
US20200319913A1 (en) System, apparatus and method for accessing multiple address spaces via a virtualization device
JP2021532468A (en) A memory protection unit that uses a memory protection table stored in the memory system
CN112328354A (en) Virtual machine live migration method and device, electronic equipment and computer storage medium
EP3757799B1 (en) System and method to track physical address accesses by a cpu or device
US20220269615A1 (en) Cache-based trace logging using tags in system memory
KR101200083B1 (en) A risc processor device and its instruction address conversion looking-up method
JP2021531583A (en) Binary search procedure for control tables stored in memory system
Chen et al. DMM: A dynamic memory mapping model for virtual machines
CN112363960B (en) Novel memory virtualization method and system based on shadow page table mechanism
CN115061955A (en) Processor, electronic device, address translation method and cache page table entry method
Sha et al. Accelerating address translation for virtualization by leveraging hardware mode
WO2021225896A1 (en) Memory page markings as logging cues for processor-based execution tracing
CN114840299A (en) Improved nested page table memory virtualization method and system under Shenwei architecture
US11687453B2 (en) Cache-based trace logging using tags in an upper-level cache
US11561896B2 (en) Cache-based trace logging using tags in an upper-level cache

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant