WO2017035813A1 - Data access method, apparatus and system - Google Patents

Data access method, apparatus and system

Info

Publication number
WO2017035813A1
WO2017035813A1, PCT/CN2015/088872, CN2015088872W
Authority
WO
WIPO (PCT)
Prior art keywords
memory page
memory
identifier
mru
page
Prior art date
Application number
PCT/CN2015/088872
Other languages
English (en)
French (fr)
Inventor
汪涛
张广飞
宋风龙
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to CN201580001271.1A priority Critical patent/CN107209761B/zh
Priority to PCT/CN2015/088872 priority patent/WO2017035813A1/zh
Publication of WO2017035813A1 publication Critical patent/WO2017035813A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor

Definitions

  • the present invention relates to the field of computers, and in particular, to a data access method, apparatus, and system.
  • the Graphics Processing Unit is a microprocessor that performs image computing operations on personal computers, workstations, game consoles, and some mobile devices such as tablets, smartphones, and the like.
  • Generally, the data that the GPU needs to process is stored in the video memory and the main memory respectively. The video memory is directly connected to the GPU, while the main memory is directly connected to the central processing unit (CPU). When the GPU needs to acquire data in the main memory, because the GPU and the CPU are connected by an AGP (Accelerated Graphics Port) bus or a PCI-E (Peripheral Component Interconnect Express) bus, the GPU can access the corresponding data from the main memory through the CPU.
  • the communication bandwidth between the GPU and the video memory is usually around 200 GB/s, while the communication bandwidth between the CPU and the main memory is usually around 80 GB/s; however, the capacity of the main memory is usually about 8 times that of the video memory. Therefore, when a program is installed, a compiler is usually used to predict hotspot data with a high number of accesses, and that hotspot data is allocated in the video memory. In this way, when the program runs, the hotspot data stored in the video memory is accessed frequently, so the GPU can obtain the data it needs from the video memory relatively quickly.
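As a rough illustration of why placement in the video memory matters, the bandwidth figures quoted above imply roughly a 2.5x difference in transfer time for the same amount of data; a small sketch of that arithmetic (illustrative only, using the typical values stated in this paragraph, not measurements):

```c
#include <stdio.h>

/* Rough transfer-time comparison using the typical bandwidths quoted above
 * (200 GB/s GPU<->video memory, 80 GB/s CPU<->main memory). */
int main(void) {
    double bytes  = 1.0e9;             /* 1 GB of page data            */
    double t_vram = bytes / 200.0e9;   /* seconds via the video memory */
    double t_dram = bytes / 80.0e9;    /* seconds via the main memory  */
    printf("video memory: %.1f ms, main memory: %.1f ms\n",
           t_vram * 1e3, t_dram * 1e3);  /* ~5.0 ms vs ~12.5 ms        */
    return 0;
}
```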
  • However, this data allocation method assigns the predicted hotspot data to the video memory at installation time, and once the program is running, the data stored in the video memory cannot be modified. With the development of multi-core technology, the data accesses of the GPU during program execution are bursty, irregular, and unpredictable, so data originally stored in the main memory may also become hotspot data. In that case, the GPU can only access the data through the CPU from the main memory, whose communication bandwidth is lower, which increases the access latency when the GPU accesses data.
  • Embodiments of the present invention provide a data access method, apparatus, and system, which can reduce an access delay when a GPU accesses data.
  • an embodiment of the present invention provides a data access method, including:
  • the GPU acquires an access request of the first memory page, where the access request carries a physical address of the first memory page;
  • the GPU searches, according to the physical address of the first memory page and a preset correspondence between physical addresses and identifiers of memory pages, whether the MRU entry corresponding to the physical address of the first memory page in a most recently used (MRU) table includes the identifier of the first memory page, where the MRU table includes, for each memory page group, the identifier of the memory page of that group that is stored in the video memory, and the identifier of the first memory page is the unique identifier of the first memory page within the memory page group to which it belongs;
  • if the MRU entry corresponding to the physical address of the first memory page does not include the identifier of the first memory page, the GPU sends a page handling request to the central processing unit CPU, so that the CPU stores the first memory page into the video memory according to the page handling request;
  • the GPU accesses the first memory page from the video memory.
  • In a first possible implementation, the MRU table is composed of a plurality of MRU entries, and the MRU entries are in one-to-one correspondence with the memory page groups, where
  • the GPU searching, according to the physical address of the first memory page and the preset correspondence between physical addresses and identifiers of memory pages, whether the MRU entry corresponding to the physical address of the first memory page in the MRU table includes the identifier of the first memory page includes:
  • the GPU determines, according to X bits in the physical address of the first memory page, a first MRU entry in the MRU table corresponding to the physical address of the first memory page, where the X bits uniquely represent the first memory page group to which the first memory page belongs, X ≥ 1;
  • the GPU determines, according to the physical address of the first memory page and the identifier of a second memory page stored in the first MRU entry, whether the identifier of the first memory page is included in the first MRU entry, which includes: the GPU takes Z bits in the physical address of the first memory page as the identifier of the first memory page, where the Z bits uniquely represent the first memory page within the first memory page group, Z ≥ 1;
  • if the identifier of the second memory page is the same as the identifier of the first memory page, the GPU determines that the identifier of the first memory page is included in the first MRU entry;
  • if the identifier of the second memory page is different from the identifier of the first memory page, the GPU determines that the identifier of the first memory page is not included in the first MRU entry.
  • if the first MRU entry does not include the identifier of the first memory page, the method further includes:
  • the GPU modifies an identifier of the second memory page in the first MRU entry to an identifier of the first memory page.
  • after the GPU modifies the identifier of the second memory page in the first MRU entry to the identifier of the first memory page, the method further includes:
  • the GPU stores the second memory page into the main memory through the CPU;
  • the GPU deletes the second memory page stored in the video memory.
  • if the first MRU entry includes the identifier of the first memory page, the method further includes:
  • the GPU accesses the first memory page from the video memory.
  • an embodiment of the present invention provides a data access apparatus, including:
  • An obtaining unit configured to acquire an access request of the first memory page, where the access request carries a physical address of the first memory page;
  • a searching unit configured to search, according to the physical address of the first memory page and a preset correspondence between physical addresses and identifiers of memory pages, whether the MRU entry corresponding to the physical address of the first memory page in a most recently used MRU table includes the identifier of the first memory page, where the MRU table includes, for each memory page group, the identifier of the memory page of that group stored in the video memory, and the identifier of the first memory page is the unique identifier of the first memory page within the memory page group to which it belongs;
  • a sending unit configured to, if the MRU entry corresponding to the physical address of the first memory page does not include the identifier of the first memory page, send a page handling request to the central processing unit CPU, so that the CPU stores the first memory page into the video memory according to the page handling request;
  • an access unit configured to access the first memory page from the video memory.
  • the device further includes:
  • a determining unit configured to determine, according to X bits in the physical address of the first memory page, a first MRU entry in the MRU table corresponding to the physical address of the first memory page, where the X bits uniquely represent the first memory page group to which the first memory page belongs, X ≥ 1; and to determine, according to the physical address of the first memory page and the identifier of a second memory page stored in the first MRU entry, whether the identifier of the first memory page is included in the first MRU entry;
  • the MRU table is composed of multiple MRU entries, and the MRU entries are in one-to-one correspondence with the memory page group.
  • the determining unit is configured to determine, according to the physical address of the first memory page and the identifier of the second memory page stored in the first MRU entry, whether the first MRU entry includes the identifier of the first memory page; if the identifier of the second memory page is the same as the identifier of the first memory page, the GPU determines that the first MRU entry includes the identifier of the first memory page; if the identifier of the second memory page is different from the identifier of the first memory page, the GPU determines that the first MRU entry does not include the identifier of the first memory page.
  • the apparatus further includes:
  • a modifying unit configured to modify an identifier of the second memory page in the first MRU entry to an identifier of the first memory page.
  • the device further includes:
  • a storage unit configured to store the second memory page into the main memory through the CPU;
  • a deleting unit configured to delete the second memory page stored in the video memory.
  • the access unit is further configured to: if the identifier of the first memory page is included in the first MRU entry, access the first memory page from the video memory.
  • the data access device is a GPU.
  • an embodiment of the present invention provides a data access system, where the system includes a graphics processor GPU, a central processing unit CPU and a video memory connected to the GPU, and a memory connected to the CPU;
  • the GPU is configured to: acquire an access request for a first memory page, where the access request carries the physical address of the first memory page; search, according to the physical address of the first memory page and a preset correspondence between physical addresses and identifiers of memory pages, whether the MRU entry corresponding to the physical address of the first memory page in a most recently used MRU table includes the identifier of the first memory page, where the MRU table includes, for each memory page group, the identifier of the memory page of that group stored in the video memory, and the identifier of the first memory page is the unique identifier of the first memory page within its memory page group; if the MRU entry corresponding to the physical address of the first memory page does not include the identifier of the first memory page, send a page handling request to the central processing unit CPU, so that the CPU stores the first memory page into the video memory according to the page handling request; and access the first memory page from the video memory.
  • the MRU table is composed of a plurality of MRU entries, and the MRU entries are in one-to-one correspondence with the memory page group, where
  • the GPU is further configured to: determine, according to the X bits in the physical address of the first memory page, a first MRU entry corresponding to the physical address of the first memory page in the MRU table, where The X bits are used to uniquely represent the first memory page group to which the first memory page belongs, X ⁇ 1; and, according to the physical address of the first memory page and the first MRU entry, And an identifier of the second memory page, determining whether the identifier of the first memory page is included in the first MRU entry.
  • the GPU is further configured to: determine, according to the X bits in the physical address of the first memory page, a first MRU entry corresponding to the physical address of the first memory page in the MRU table, where The X bits are used to uniquely represent the first memory page group to which the first memory page belongs, X ⁇ 1; and, according to the physical address of the first memory page and the first MRU entry, And an identifier of the second memory page, determining whether the identifier of the first memory page is included in the first MRU entry.
  • the GPU is further configured to: modify an identifier of the second memory page in the first MRU entry to an identifier of the first memory page.
  • the GPU is further configured to: store, by the CPU, the second memory page to the memory; and delete the second memory page stored in the video memory.
  • the GPU is further configured to: if the identifier of the first memory page is included in the first MRU entry, access the first memory page from the video memory.
  • Embodiments of the present invention provide a data access method, apparatus, and system, in which a GPU acquires an access request for a first memory page, where the access request carries the physical address of the first memory page; the GPU then searches, according to that physical address and a preset correspondence between physical addresses and identifiers of memory pages, whether the identifier of the first memory page is stored in the MRU entry corresponding to the physical address of the first memory page in the MRU table, where the MRU table contains, for each memory page group, the identifier of the memory page of that group stored in the video memory. If the MRU entry corresponding to the physical address of the first memory page in the MRU table does not include the identifier of the first memory page, the first memory page is stored in the main memory but not in the video memory; in this case, the GPU sends a page handling request to the CPU, so that the CPU stores the first memory page into the video memory according to the page handling request, and the GPU then accesses the first memory page from the video memory. Because each memory page group in the main memory has one memory page stored in the video memory, the GPU can, according to the physical address of the first memory page to be accessed, dynamically move the page to be accessed in each memory page group into the video memory while the program is running. The GPU can thus make full use of the high communication bandwidth of the video memory and the high capacity of the main memory, dynamically modifying which memory pages are stored in the video memory according to access requirements, thereby reducing the access latency when the GPU accesses data from the main memory and the video memory.
  • FIG. 1 is a schematic diagram of the connections between a GPU, a CPU, a video memory, and a main memory in the prior art
  • FIG. 2 is a schematic flowchart 1 of a data access method according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of multiple memory page groups in a memory according to an embodiment of the present invention.
  • FIG. 4 is a second schematic flowchart of a data access method according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic structural diagram 1 of a data access apparatus according to an embodiment of the present disclosure.
  • FIG. 6 is a schematic structural diagram 2 of a data access apparatus according to an embodiment of the present disclosure.
  • FIG. 7 is a schematic structural diagram 3 of a data access apparatus according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic structural diagram 4 of a data access apparatus according to an embodiment of the present disclosure.
  • FIG. 9 is a schematic structural diagram 1 of a data access system according to an embodiment of the present disclosure.
  • FIG. 10 is a schematic structural diagram 2 of a data access system according to an embodiment of the present invention.
  • first and second are used for descriptive purposes only, and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, features defining “first” and “second” may include one or more of the features either explicitly or implicitly. In the description of the present invention, "a plurality” means two or more unless otherwise stated.
  • An embodiment of the present invention provides a data access method, as shown in FIG. 2, including:
  • 101. The GPU acquires an access request for the first memory page, where the access request carries the physical address of the first memory page.
  • 102. The GPU searches, according to the physical address of the first memory page and the preset correspondence between physical addresses and identifiers of memory pages, whether the MRU entry corresponding to the physical address of the first memory page in the MRU table includes the identifier of the first memory page.
  • 103. If the MRU table does not include the identifier of the first memory page, the GPU sends a page handling request to the CPU, so that the CPU stores the first memory page into the video memory according to the page handling request.
  • 104. The GPU accesses the first memory page from the video memory.
  • Specifically, the GPU needs to retrieve data stored in the video memory or the main memory to carry out its image computing work, and data in the video memory or the main memory is managed and accessed in units of pages; that is, the data in the video memory or the main memory can be divided into multiple memory pages. For example, if a memory page is 64 KB in size, i.e. a memory page contains 64 KB of data, then an 8 GB memory contains 131072 memory pages, and each memory page corresponds to a physical address. The GPU or the CPU can perform an addressing operation through a physical address to find and access the memory page indicated by that physical address.
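To make the figures above concrete, the page count and the page/offset split of a physical address follow directly from the page size; a minimal sketch using the example numbers in this paragraph (the address value is an arbitrary illustration):

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative constants from the example above: 64 KB pages and an 8 GB
 * memory, i.e. 8 GB / 64 KB = 131072 memory pages. */
#define PAGE_SIZE  (64ull * 1024ull)                    /* 64 KB per memory page */
#define MEM_SIZE   (8ull * 1024ull * 1024ull * 1024ull) /* 8 GB                  */
#define NUM_PAGES  (MEM_SIZE / PAGE_SIZE)               /* 131072 memory pages   */

int main(void) {
    uint64_t phys_addr  = 0x12345678ull;            /* arbitrary example address */
    uint64_t page_index = phys_addr / PAGE_SIZE;    /* which memory page         */
    uint64_t page_off   = phys_addr % PAGE_SIZE;    /* offset inside that page   */
    printf("pages=%llu page=%llu offset=%llu\n",
           (unsigned long long)NUM_PAGES,
           (unsigned long long)page_index,
           (unsigned long long)page_off);
    return 0;
}
```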
  • In step 101, while running, the GPU may acquire an access request for the first memory page from the CPU in order to perform image computing work on the data stored in the first memory page. For example, when the user triggers a 3D display instruction, the CPU generates an access request for the first memory page according to the 3D display instruction and then sends it to the GPU through the bus, so that the GPU acquires the first memory page according to the access request and performs the 3D display.
  • the access request carries the physical address of the first memory page.
  • In step 102, the GPU searches, according to the physical address of the first memory page acquired in step 101 and the preset correspondence between physical addresses and identifiers of memory pages, whether the MRU entry corresponding to the physical address of the first memory page in the pre-stored MRU (Most Recently Used) table includes the identifier of the first memory page.
  • the MRU table includes an identifier of a memory page stored in the video memory in each memory page group, and the identifier of the first memory page is a unique identifier of the first memory page in the associated memory page group.
  • Specifically, since the data that the GPU needs to process is stored in the video memory and the main memory respectively, the GPU can either access data quickly and directly from the video memory, or it needs to access data through the CPU from the main memory, which has a larger capacity but a smaller communication bandwidth. In order to reduce the access latency when the GPU accesses data from the main memory and the video memory, while making full use of the high communication bandwidth of the video memory and the high capacity of the main memory, an MRU table can be pre-stored in the system where the GPU runs, where the MRU table records at least, for each memory page group, the identifier of the memory page of that group that is stored in the video memory.
  • For example, a 64 GB main memory can be divided into multiple memory page groups, for example with the 8 memory pages A, B, C, D, E, F, G, and H forming one memory page group; with 64 KB memory pages, the 64 GB main memory then contains 131072 memory page groups. The MRU table records, for each memory page group, the identifier of the memory page of that group stored in the video memory; accordingly, the MRU table is also provided with 131072 MRU entries, the same as the number of memory page groups. When memory page A is stored in the video memory and the identifier of memory page A is 000, the first MRU entry of the MRU table stores the identifier 000 of memory page A.
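Under the example layout just described (131072 memory page groups, 8 pages per group), the MRU table can be pictured as a flat array with one small entry per group; a minimal sketch with names of our own choosing, not taken from the patent:

```c
#include <stdint.h>

/* One MRU entry per memory page group; 8 pages (A..H) per group means a
 * 3-bit identifier per entry suffices. */
#define NUM_GROUPS      131072u   /* memory page groups in the 64 GB example */
#define PAGES_PER_GROUP 8u        /* identifiers 0..7 (e.g. A=000 ... H=111) */

/* Each entry holds the identifier of the page of its group that is
 * currently resident in the video memory. */
static uint8_t mru_table[NUM_GROUPS];

/* Example: record that page A (identifier 000) of group 0 is in video memory. */
static void mru_init_example(void) {
    mru_table[0] = 0;  /* 000 = page A */
}
```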
  • The MRU table may include the preset correspondence between physical addresses and identifiers of memory pages (see Table 1 in the description below). According to the physical address of the first memory page, the GPU can determine which memory page group's address range the physical address falls into, thereby determining the memory page group to which the first memory page belongs (that is, the MRU entry corresponding to the physical address of the first memory page) and the identifier of the memory page stored in the video memory that is recorded in that MRU entry.
  • In this way, when the GPU obtains the physical address of the first memory page, it can look up, according to that physical address and the preset correspondence between physical addresses and identifiers of memory pages, whether the identifier of the first memory page is recorded in the MRU entry corresponding to the physical address of the first memory page. If the identifier of the first memory page is recorded in that MRU entry, the first memory page is stored in the video memory; if the identifier of the first memory page is not recorded in that MRU entry, the first memory page is not stored in the video memory.
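The lookup just described can be sketched as follows, using the bit-field layout given later in the text (bits 1-16 in-page offset, bits 17-33 group index, bits 34-36 page identifier); the function and constant names are illustrative assumptions:

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT   16u   /* 64 KB pages                           */
#define GROUP_BITS   17u   /* 131072 = 2^17 memory page groups      */
#define PAGE_ID_BITS 3u    /* 8 pages per group -> 3-bit identifier */

/* One resident-page identifier per memory page group (see the sketch above). */
extern uint8_t mru_table[1u << GROUP_BITS];

/* Returns true if the page addressed by phys_addr is the page of its group
 * that is currently held in the video memory (an MRU "hit"). */
static bool mru_lookup(uint64_t phys_addr) {
    uint32_t group   = (uint32_t)(phys_addr >> PAGE_SHIFT) & ((1u << GROUP_BITS) - 1u);
    uint8_t  page_id = (uint8_t)((phys_addr >> (PAGE_SHIFT + GROUP_BITS)) &
                                 ((1u << PAGE_ID_BITS) - 1u));
    return mru_table[group] == page_id;
}
```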
  • In addition, a separate storage device (such as a register) may be provided in the system where the GPU is located to store the MRU table, so that when the GPU obtains the physical address of the first memory page, it can access the MRU table directly from that storage device, avoiding the access latency that would be incurred if the GPU accessed the MRU table from the main memory or the video memory.
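One way such a dedicated storage device could appear to software is as a small memory-mapped region that is read without going through the main memory or the video memory; a purely hypothetical sketch (the base address and access style are assumptions, not taken from the patent):

```c
#include <stdint.h>

/* Hypothetical memory-mapped base address of the dedicated MRU-table storage
 * (e.g. an on-chip register file or SRAM). Reading an entry here does not
 * touch the main memory or the video memory. */
#define MRU_TABLE_BASE  0x40000000u   /* assumed, for illustration only */

static volatile uint8_t *const mru_table_mm =
    (volatile uint8_t *)MRU_TABLE_BASE;   /* one byte per memory page group */

static inline uint8_t mru_read_entry(uint32_t group) {
    return mru_table_mm[group];           /* resident page id of that group */
}
```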
  • It should be noted that FIG. 3 only illustrates one way of dividing a 64 GB memory into multiple memory page groups. It should be understood that there are various ways to divide the memory into multiple memory page groups, for example according to the physical addresses of the memory, or according to the access frequency of the memory pages, which is not limited by the embodiments of the present invention.
  • In step 103, if the video memory does not contain the first memory page, that is, the identifier of the first memory page is not recorded in the MRU entry corresponding to the physical address of the first memory page, then the first memory page is stored in the main memory. In this case, the GPU can send a page handling request to the CPU, so that the CPU stores the first memory page into the video memory according to the page handling request.
  • Of course, after the CPU stores the first memory page into the video memory according to the page handling request, the GPU may also modify the MRU entry corresponding to the physical address of the first memory page in the MRU table, recording in that MRU entry the identifier of the first memory page now stored in the video memory, so as to establish the correspondence between the identifier of the first memory page and the physical address of the first memory page. In this way, when the GPU later obtains the physical address of data in the first memory page again, it can directly look up the MRU table to determine that the first memory page is stored in the video memory.
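A compact sketch of this miss path, with the page handling request represented by a hypothetical call into the CPU side; the patent only requires that the request cause the CPU to copy the page into the video memory, so the names below are illustrative:

```c
#include <stdint.h>

extern uint8_t mru_table[];   /* resident page id per memory page group */

/* Hypothetical hook: asks the CPU to copy the page at phys_addr from the
 * main memory into the video memory (the "page handling request"). */
extern void send_page_handling_request(uint64_t phys_addr);

/* Handle an MRU miss: request the page move, then record the new resident
 * page in the matching MRU entry so that later lookups hit. */
static void handle_mru_miss(uint64_t phys_addr, uint32_t group, uint8_t page_id) {
    send_page_handling_request(phys_addr);  /* CPU stores the page into video memory  */
    mru_table[group] = page_id;             /* update the MRU entry as described above */
}
```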
  • It can be seen that, according to the physical address of the first memory page to be accessed, the GPU determines whether the identifier of the first memory page is stored in the MRU entry corresponding to that physical address, and thereby determines whether the first memory page is stored in the video memory. If the identifier of the first memory page is not recorded in the MRU entry corresponding to the physical address of the first memory page in the MRU table, the first memory page is not stored in the video memory, and the GPU can have the first memory page moved into the video memory for access. In this way, the most recently used memory page of each memory page group can be dynamically moved into the video memory while the program is executing, so that the GPU can make full use of the high communication bandwidth of the video memory and the high capacity of the main memory, dynamically modifying which memory pages are stored in the video memory according to access requirements and reducing the access latency when the GPU accesses data.
  • Finally, in step 104, since the first memory page was stored into the video memory in step 103, the GPU can directly access the first memory page from the video memory, which has a higher communication bandwidth.
  • This embodiment of the present invention provides a data access method in which the GPU acquires an access request for a first memory page, where the access request carries the physical address of the first memory page; the GPU then searches, according to that physical address and the preset correspondence between physical addresses and identifiers of memory pages, whether the identifier of the first memory page is stored in the MRU entry corresponding to the physical address of the first memory page in the MRU table, where the MRU table contains, for each memory page group, the identifier of the memory page of that group stored in the video memory. If the MRU entry corresponding to the physical address of the first memory page in the MRU table does not include the identifier of the first memory page, the first memory page is stored in the main memory but not in the video memory; in this case, the GPU sends a page handling request to the CPU, so that the CPU stores the first memory page into the video memory according to the page handling request, and the GPU then accesses the first memory page from the video memory. Because each memory page group in the main memory has one memory page stored in the video memory, the GPU can, according to the physical address of the first memory page to be accessed, dynamically move the page to be accessed in each memory page group into the video memory while the program is running, so that the GPU can make full use of the high communication bandwidth of the video memory and the high capacity of the main memory, dynamically modifying which memory pages are stored in the video memory according to access requirements and reducing the access latency when the GPU accesses data from the main memory and the video memory.
  • An embodiment of the present invention provides a data access method, as shown in FIG. 4, including:
  • 201. The GPU acquires an access request for the first memory page, where the access request carries the physical address of the first memory page.
  • 202. The GPU determines, according to X bits in the physical address of the first memory page, the first MRU entry in the MRU table corresponding to the physical address of the first memory page, where the X bits uniquely represent the first memory page group to which the first memory page belongs.
  • 203. The GPU takes Z bits in the physical address of the first memory page as the identifier of the first memory page, where the Z bits uniquely represent the first memory page within the first memory page group.
  • 204. If the identifier of the first memory page is different from the identifier of the second memory page stored in the first MRU entry, the GPU sends a page handling request to the CPU, so that the CPU stores the first memory page into the video memory according to the page handling request.
  • 205. The GPU modifies, in the MRU table, the identifier of the second memory page in the first MRU entry to the identifier of the first memory page.
  • 206. The GPU stores the second memory page into the main memory through the CPU, and deletes the second memory page stored in the video memory.
  • 207. The GPU accesses the first memory page from the video memory.
  • In step 201, while running, the GPU may obtain an access request for the first memory page from the CPU in order to perform image computing work on the data in the first memory page, where the access request carries the physical address of the first memory page.
  • Specifically, an MRU table is pre-stored in the system where the GPU is located. The MRU table is composed of multiple MRU entries, with each MRU entry corresponding to exactly one memory page group, and each MRU entry stores the identifier of one memory page, namely the identifier of the memory page that is stored in the video memory in the memory page group corresponding to that MRU entry.
  • In step 202, the GPU determines, according to X bits (X ≥ 1) in the physical address of the first memory page acquired in step 201, the first MRU entry in the MRU table corresponding to the physical address of the first memory page. Since MRU entries and memory page groups are in one-to-one correspondence, the first memory page group in which the first memory page is located can be determined from these X bits of the physical address.
  • Continuing with the example of the 64 GB main memory and 8 GB video memory, the 8 memory pages A, B, C, D, E, F, G, and H form one memory page group, and the 64 GB main memory contains 131072 memory page groups. Correspondingly, as shown in Table 2, the MRU table is also provided with 131072 MRU entries (that is, 2^17 MRU entries), the same as the number of memory page groups, and each MRU entry stores the identifier of the memory page of the corresponding memory page group that is stored in the video memory. For example, the first MRU entry stores the identifier 000 of memory page A; that is, in the memory page group composed of A, B, C, D, E, F, G, and H, memory page A is stored in the video memory.
  • At this point, if the physical address obtained in step 201 has Y bits, the GPU can determine, from X of those Y bits, the first MRU entry in the MRU table corresponding to that physical address, that is, the first memory page group in which the first memory page is located, where the first MRU entry stores the identifier of a second memory page, Y ≥ X ≥ 1.
  • For example, suppose the physical address of the first memory page obtained in step 201 is 36 bits and the size of each memory page is 64 KB. Bits 1-16 of the 36-bit physical address indicate the offset within a memory page, and bits 17-33 of the 36-bit physical address serve as the X bits indicating the first MRU entry to which the first memory page belongs, that is, the first memory page group to which the first memory page belongs. The GPU can therefore determine, from bits 17-33 of the 36-bit physical address, which MRU entry in the MRU table of 131072 entries is indicated by the physical address (that is, the first MRU entry); the identifier of the memory page stored in that first MRU entry is then the identifier of the second memory page.
  • For example, if the physical address of the first memory page is E0000FFFF (hexadecimal), which in binary is 111000000000000000001111111111111111 (36 bits), the GPU can determine from bits 17-33, namely 00000000000000000, that the MRU entry indicated by this physical address in the MRU table is the first MRU entry (as shown in Table 2).
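The decomposition of this example address can be checked directly; a small sketch that reproduces the numbers used here (36-bit address E0000FFFF, bits 1-16 offset, bits 17-33 group index, bits 34-36 page identifier):

```c
#include <assert.h>
#include <stdint.h>

int main(void) {
    uint64_t addr = 0xE0000FFFFull;                         /* 36-bit example address */
    uint32_t offset  = (uint32_t)(addr & 0xFFFFu);          /* bits 1-16              */
    uint32_t group   = (uint32_t)((addr >> 16) & 0x1FFFFu); /* bits 17-33             */
    uint32_t page_id = (uint32_t)((addr >> 33) & 0x7u);     /* bits 34-36             */

    assert(offset  == 0xFFFFu);  /* in-page offset within the 64 KB page      */
    assert(group   == 0u);       /* first MRU entry, as stated in the example */
    assert(page_id == 0x7u);     /* identifier 111 of the first memory page   */
    return 0;
}
```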
  • At this point, in step 203, the GPU may take Z bits of the 36-bit physical address as the identifier of the first memory page, where the Z bits uniquely represent the identifier of the first memory page within the first memory page group. For example, if there are 8 memory pages in a memory page group, then 3 bits can uniquely represent the identifier of any one of the 8 memory pages; for example, bits 34-36 of the physical address may be used as the identifier of the first memory page.
  • Further, in step 204, the GPU compares the Z bits taken as the identifier of the first memory page in step 203 with the identifier of the second memory page stored in the first MRU entry in step 202. If the Z bits are the same as the identifier of the second memory page, the second memory page indicated by the first MRU entry in the MRU table is the same as the first memory page, which indicates that the first memory page is stored in the video memory; if the Z bits are different from the identifier of the second memory page, the second memory page indicated by the first MRU entry in the MRU table is different from the first memory page, which indicates that the first memory page is stored in the main memory.
  • Continuing with the 36-bit physical address above, the GPU can compare bits 34-36 of the 36-bit physical address (namely 111) with the identifier of the second memory page stored in the first MRU entry in Table 2 (namely 000). Since 111 and 000 are not the same, it can be determined that the first memory page that the GPU needs to access is stored in the main memory, and that, in the memory page group where the first memory page is located, the memory page stored in the video memory is the second memory page, whose identifier is 000.
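Carrying the example through, the hit/miss decision reduces to the comparison below (the two identifiers are the values quoted in this paragraph and Table 2):

```c
#include <stdio.h>

int main(void) {
    unsigned requested_id = 0x7;  /* 111: bits 34-36 of the example address       */
    unsigned resident_id  = 0x0;  /* 000: stored in the first MRU entry (Table 2) */
    if (requested_id != resident_id)
        printf("miss: the first memory page is in main memory (resident id=%u)\n",
               resident_id);
    else
        printf("hit: the first memory page is already in video memory\n");
    return 0;
}
```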
  • Specifically, if the identifier of the first memory page differs from the identifier of the second memory page stored in the first MRU entry, the first memory page is stored in the main memory and not in the video memory; in this case, the GPU may send a page handling request to the CPU, so that the CPU stores the first memory page into the video memory according to the page handling request.
  • In step 205, after the CPU stores the first memory page into the video memory according to the page handling request, the GPU may modify, in the MRU table, the identifier of the second memory page in the first MRU entry to the identifier of the first memory page. In this way, while the program is running, the memory page to be accessed can be dynamically moved into the video memory according to the access requests received by the GPU, and by modifying the MRU table, when the GPU later obtains the physical address of data in the first memory page again, it can directly look up the MRU table to determine that the first memory page is stored in the video memory.
  • In step 206, if the second memory page is different from the first memory page, that is, the first memory page is stored in the main memory, then, since the GPU needs to move the first memory page into the video memory through the CPU for storage, the GPU can store the second memory page originally kept in the video memory back into the main memory through the CPU, and delete the second memory page from the video memory, so that the GPU can then move the first memory page, through the CPU, into the location in the video memory where the second memory page was originally stored.
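The replacement described in steps 204-206 can be summarised as follows; the three transfer primitives are hypothetical stand-ins for the CPU-assisted copies and the deletion described above, not functions defined by the patent:

```c
#include <stdint.h>

extern uint8_t mru_table[];   /* resident page id per memory page group */

/* Hypothetical CPU-assisted primitives; the patent only describes their
 * effects (write back, delete, copy in), not these functions. */
extern void cpu_copy_page_to_main_memory(uint32_t group, uint8_t page_id);
extern void delete_from_video_memory(uint32_t group, uint8_t page_id);
extern void cpu_copy_page_to_video_memory(uint32_t group, uint8_t page_id);

/* Replace the group's resident page (old_id) in the video memory with the
 * requested page (new_id), mirroring steps 204-206. */
static void swap_resident_page(uint32_t group, uint8_t old_id, uint8_t new_id) {
    cpu_copy_page_to_main_memory(group, old_id);   /* store the second page back        */
    delete_from_video_memory(group, old_id);       /* free its slot in the video memory */
    cpu_copy_page_to_video_memory(group, new_id);  /* move the first page in            */
    mru_table[group] = new_id;                     /* step 205: update the MRU entry    */
}
```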
  • Of course, if the second memory page is the same as the first memory page, the first memory page is already stored in the video memory, so the GPU does not need to send a page handling request or modify the first MRU entry in the MRU table, and can directly access the first memory page from the video memory.
  • Finally, in step 207, since the first memory page was stored into the video memory in step 204, the GPU can directly access the first memory page from the video memory, which has a higher communication bandwidth.
  • It should be further noted that this embodiment of the present invention does not limit the execution order of steps 204-206. That is, if the identifier of the second memory page is different from the identifier of the first memory page, the GPU may first modify the identifier of the second memory page into the identifier of the first memory page in the MRU table, then send the page handling request to the CPU, store the second memory page into the main memory, and delete the second memory page stored in the video memory; alternatively, the GPU may perform the steps in 204-206 at the same time, which is not limited in this embodiment of the present invention.
  • This embodiment of the present invention provides a data access method in which the GPU acquires an access request for a first memory page, where the access request carries the physical address of the first memory page; the GPU then searches, according to that physical address and the preset correspondence between physical addresses and identifiers of memory pages, whether the identifier of the first memory page is stored in the MRU entry corresponding to the physical address of the first memory page in the MRU table, where the MRU table contains, for each memory page group, the identifier of the memory page of that group stored in the video memory. If the MRU table does not include the identifier of the first memory page, the first memory page is stored in the main memory and not in the video memory; in this case, the GPU sends a page handling request to the CPU, so that the CPU stores the first memory page into the video memory according to the page handling request, and the GPU accesses the first memory page from the video memory. Because each memory page group in the main memory has one memory page stored in the video memory, the GPU can, according to the physical address of the first memory page to be accessed, dynamically move the page to be accessed in each memory page group into the video memory while the program is running, so that the GPU can make full use of the high communication bandwidth of the video memory and the high capacity of the main memory, dynamically modifying which memory pages are stored in the video memory according to access requirements and reducing the access latency when the GPU accesses data from the main memory and the video memory.
  • An embodiment of the present invention provides a data access device, as shown in FIG. 5, including:
  • the obtaining unit 01 is configured to acquire an access request of the first memory page, where the access request carries a physical address of the first memory page;
  • the searching unit 02 is configured to search, according to the physical address of the first memory page and the preset correspondence between physical addresses and identifiers of memory pages, whether the MRU entry corresponding to the physical address of the first memory page in the most recently used MRU table includes the identifier of the first memory page, where the MRU table includes, for each memory page group, the identifier of the memory page of that group stored in the video memory, and the identifier of the first memory page is the unique identifier of the first memory page within the memory page group to which it belongs;
  • a sending unit 03 configured to, if the MRU entry corresponding to the physical address of the first memory page does not include the identifier of the first memory page, send a page handling request to the central processing unit CPU, so that the CPU stores the first memory page into the video memory according to the page handling request;
  • the access unit 04 is configured to access the first memory page from the video memory.
  • the device further includes:
  • a determining unit 05 configured to determine, according to X bits in the physical address of the first memory page, a first MRU entry in the MRU table corresponding to the physical address of the first memory page, where the X bits uniquely represent the first memory page group to which the first memory page belongs, X ≥ 1; and to determine, according to the physical address of the first memory page and the identifier of a second memory page stored in the first MRU entry, whether the identifier of the first memory page is included in the first MRU entry;
  • the MRU table is composed of multiple MRU entries, and the MRU entries are in one-to-one correspondence with the memory page group.
  • the determining unit 05 is specifically configured to determine, according to the physical address of the first memory page and the identifier of the second memory page stored in the first MRU entry, whether the first MRU entry includes the identifier of the first memory page; if the identifier of the second memory page is the same as the identifier of the first memory page, the GPU determines that the first MRU entry includes the identifier of the first memory page; if the identifier of the second memory page is different from the identifier of the first memory page, the GPU determines that the first MRU entry does not include the identifier of the first memory page.
  • the device further includes:
  • the modifying unit 06 is configured to modify an identifier of the second memory page in the first MRU entry to an identifier of the first memory page.
  • the device further includes:
  • a storage unit 07 configured to store the second memory page into the main memory through the CPU;
  • the deleting unit 08 is configured to delete the second memory page stored in the video memory.
  • the access unit 04 is further configured to: if the identifier of the first memory page is included in the first MRU entry, access the first memory page from the video memory.
  • the data access device is a GPU.
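As a purely illustrative software view, the functional units listed above map naturally onto a table of callbacks; the structure and all names below are our assumptions, since the patent describes functional units rather than a concrete interface:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software view of the data access apparatus: one callback per
 * functional unit named above. */
typedef struct {
    uint64_t (*obtain_request)(void);                      /* obtaining unit 01   */
    bool     (*lookup_mru)(uint64_t phys_addr);            /* searching unit 02   */
    void     (*send_page_request)(uint64_t phys_addr);     /* sending unit 03     */
    void     (*access_page)(uint64_t phys_addr);           /* access unit 04      */
    uint32_t (*find_entry)(uint64_t phys_addr);            /* determining unit 05 */
    void     (*modify_entry)(uint32_t group, uint8_t id);  /* modifying unit 06   */
    void     (*store_old_page)(uint32_t group);            /* storage unit 07     */
    void     (*delete_old_page)(uint32_t group);           /* deleting unit 08    */
} data_access_device_t;
```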
  • An embodiment of the present invention provides a data access apparatus. The apparatus acquires an access request for a first memory page, where the access request carries the physical address of the first memory page; it then searches, according to that physical address and the preset correspondence between physical addresses and identifiers of memory pages, whether the identifier of the first memory page is stored in the MRU entry corresponding to the physical address of the first memory page in the MRU table, where the MRU table contains, for each memory page group, the identifier of the memory page of that group stored in the video memory. If the MRU entry corresponding to the physical address of the first memory page in the MRU table does not include the identifier of the first memory page, the first memory page is stored in the main memory but not in the video memory; in this case, the GPU sends a page handling request to the CPU, so that the CPU stores the first memory page into the video memory according to the page handling request, and the GPU accesses the first memory page from the video memory. Because each memory page group in the main memory has one memory page stored in the video memory, the GPU can, according to the physical address of the first memory page to be accessed, dynamically move the page to be accessed in each memory page group into the video memory while the program is running, so that the GPU can make full use of the high communication bandwidth of the video memory and the high capacity of the main memory, dynamically modifying which memory pages are stored in the video memory according to access requirements and reducing the access latency when the GPU accesses data from the main memory and the video memory.
  • An embodiment of the present invention provides a data access system. As shown in FIG. 9, the system includes a GPU 11, a CPU 12 and a video memory 13 that are both connected to the GPU 11, and a main memory 14 connected to the CPU 12, where:
  • the GPU 11 is configured to: acquire an access request for a first memory page, where the access request carries the physical address of the first memory page; search, according to the physical address of the first memory page and a preset correspondence between physical addresses and identifiers of memory pages, whether the MRU entry corresponding to the physical address of the first memory page in a most recently used MRU table includes the identifier of the first memory page, where the MRU table includes, for each memory page group, the identifier of the memory page of that group stored in the video memory 13, and the identifier of the first memory page is the unique identifier of the first memory page within its memory page group; if the MRU entry corresponding to the physical address of the first memory page does not include the identifier of the first memory page, send a page handling request to the central processing unit CPU 12, so that the CPU 12 stores the first memory page into the video memory 13 according to the page handling request; and access the first memory page from the video memory 13.
  • The system further includes a register 15 connected to the GPU 11, and the MRU table is stored in the register 15, so that when the GPU 11 obtains the physical address of the first memory page, it can access the MRU table directly from this storage device, which avoids the access latency that would be caused by the GPU 11 accessing the MRU table from the main memory 14 or the video memory 13.
  • the MRU table is composed of a plurality of MRU entries, and the MRU entries are in one-to-one correspondence with the memory page group, where
  • the GPU 11 is further configured to: determine, according to the X bits in the physical address of the first memory page, a first MRU entry corresponding to the physical address of the first memory page in the MRU table, The X bits are used to uniquely represent the first memory page group to which the first memory page belongs, X ⁇ 1; and, according to the physical address of the first memory page and the first MRU entry The identifier of the second memory page determines whether the identifier of the first memory page is included in the first MRU entry.
  • the GPU 11 is further configured to: take Z bits in the physical address of the first memory page as the identifier of the first memory page, where the Z bits uniquely represent the first memory page within the first memory page group, Z ≥ 1; if the identifier of the second memory page is the same as the identifier of the first memory page, the GPU 11 determines that the first MRU entry includes the identifier of the first memory page; if the identifier of the second memory page is different from the identifier of the first memory page, the GPU 11 determines that the first MRU entry does not include the identifier of the first memory page.
  • the GPU 11 is further configured to: modify an identifier of the second memory page in the first MRU entry to an identifier of the first memory page.
  • the GPU 11 is further configured to: store, by the CPU 12, the second memory page to the memory 14; and delete the second memory page stored in the video memory 13.
  • the GPU 11 is further configured to: if the identifier of the first memory page is included in the first MRU entry, access the first memory page from the video memory 13.
  • An embodiment of the present invention provides a data access system in which the GPU acquires an access request for a first memory page, where the access request carries the physical address of the first memory page; the GPU then searches, according to that physical address and the preset correspondence between physical addresses and identifiers of memory pages, whether the identifier of the first memory page is stored in the MRU entry corresponding to the physical address of the first memory page in the MRU table, where the MRU table contains, for each memory page group, the identifier of the memory page of that group stored in the video memory. If the MRU entry corresponding to the physical address of the first memory page in the MRU table does not include the identifier of the first memory page, the first memory page is stored in the main memory but not in the video memory; in this case, the GPU sends a page handling request to the CPU, so that the CPU stores the first memory page into the video memory according to the page handling request, and the GPU accesses the first memory page from the video memory. Because each memory page group in the main memory has one memory page stored in the video memory, the GPU can, according to the physical address of the first memory page to be accessed, dynamically move the page to be accessed in each memory page group into the video memory while the program is running, so that the GPU can make full use of the high communication bandwidth of the video memory and the high capacity of the main memory, dynamically modifying which memory pages are stored in the video memory according to access requirements and reducing the access latency when the GPU accesses data from the main memory and the video memory.
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the modules or units is only a logical function division; in actual implementation there may be other division manners, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • The technical solutions of the present invention essentially, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or part of the steps of the methods described in the embodiments of the present invention.
  • The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

A data access method, apparatus, and system, relating to the field of computers, capable of reducing the access latency when a GPU accesses data. The method includes: the GPU acquires an access request for a first memory page, where the access request carries the physical address of the first memory page (101); the GPU searches, according to the physical address of the first memory page and a preset correspondence between physical addresses and identifiers of memory pages, whether the MRU entry corresponding to the physical address of the first memory page in a most recently used (MRU) table includes the identifier of the first memory page (102); if the MRU entry corresponding to the physical address of the first memory page does not include the identifier of the first memory page, the GPU sends a page handling request to the central processing unit CPU, so that the CPU stores the first memory page into the video memory according to the page handling request (103); the GPU accesses the first memory page from the video memory (104).

Description

Data access method, apparatus and system
Technical Field
The present invention relates to the field of computers, and in particular to a data access method, apparatus, and system.
Background
A graphics processing unit (Graphics Processing Unit, GPU) is a microprocessor dedicated to image computing work on personal computers, workstations, game consoles, and some mobile devices (such as tablet computers and smartphones).
Generally, the data that the GPU needs to process is stored in the video memory and the main memory respectively. As shown in FIG. 1, the video memory is directly connected to the GPU, while the main memory is directly connected to the central processing unit (Central Processing Unit, CPU). When the GPU needs to acquire data in the main memory, because the GPU and the CPU are connected by an AGP (Accelerated Graphics Port, a graphics system interface) bus or a PCI-E (Peripheral Component Interconnect Express, a newer-generation bus interface) bus, the GPU can access the corresponding data from the main memory through the CPU.
The communication bandwidth between the GPU and the video memory is usually around 200 GB/s, while the communication bandwidth between the CPU and the main memory is usually around 80 GB/s; however, the capacity of the main memory is usually about 8 times that of the video memory. Therefore, when a program is installed, a compiler is usually used to predict hotspot data with a high number of accesses, and that hotspot data is allocated in the video memory. In this way, when the program runs, the hotspot data stored in the video memory is accessed frequently, so the GPU can obtain the data it needs from the video memory relatively quickly.
However, this data allocation method assigns the predicted hotspot data to the video memory at installation time, and once the program is running, the data stored in the video memory cannot be modified. With the development of multi-core technology, the data accesses of the GPU during program execution are bursty, irregular, and unpredictable, so data originally stored in the main memory may also become hotspot data. In that case, the GPU can only access the data through the CPU from the main memory, whose communication bandwidth is lower, which increases the access latency when the GPU accesses data.
Summary
Embodiments of the present invention provide a data access method, apparatus, and system, which can reduce the access latency when a GPU accesses data.
To achieve the foregoing objective, the embodiments of the present invention adopt the following technical solutions:
According to a first aspect, an embodiment of the present invention provides a data access method, including:
a GPU acquires an access request for a first memory page, where the access request carries a physical address of the first memory page;
the GPU searches, according to the physical address of the first memory page and a preset correspondence between physical addresses and identifiers of memory pages, whether an MRU entry corresponding to the physical address of the first memory page in a most recently used (MRU) table includes an identifier of the first memory page, where the MRU table includes, for each memory page group, the identifier of the memory page of that group that is stored in a video memory, and the identifier of the first memory page is the unique identifier of the first memory page within the memory page group to which it belongs;
if the MRU entry corresponding to the physical address of the first memory page does not include the identifier of the first memory page, the GPU sends a page handling request to a central processing unit CPU, so that the CPU stores the first memory page into the video memory according to the page handling request;
the GPU accesses the first memory page from the video memory.
With reference to the first aspect, in a first possible implementation of the first aspect, the MRU table is composed of a plurality of MRU entries, and the MRU entries are in one-to-one correspondence with the memory page groups, where
the GPU searching, according to the physical address of the first memory page and the preset correspondence between physical addresses and identifiers of memory pages, whether the MRU entry corresponding to the physical address of the first memory page in the MRU table includes the identifier of the first memory page includes:
the GPU determines, according to X bits in the physical address of the first memory page, a first MRU entry in the MRU table corresponding to the physical address of the first memory page, where the X bits uniquely represent a first memory page group to which the first memory page belongs, X ≥ 1;
the GPU determines, according to the physical address of the first memory page and an identifier of a second memory page stored in the first MRU entry, whether the identifier of the first memory page is included in the first MRU entry.
With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the GPU determining, according to the physical address of the first memory page and the identifier of the second memory page stored in the first MRU entry, whether the identifier of the first memory page is included in the first MRU entry includes:
the GPU takes Z bits in the physical address of the first memory page as the identifier of the first memory page, where the Z bits uniquely represent the first memory page within the first memory page group, Z ≥ 1;
if the identifier of the second memory page is the same as the identifier of the first memory page, the GPU determines that the identifier of the first memory page is included in the first MRU entry;
if the identifier of the second memory page is different from the identifier of the first memory page, the GPU determines that the identifier of the first memory page is not included in the first MRU entry.
With reference to the first or second possible implementation of the first aspect, in a third possible implementation of the first aspect, if the first MRU entry does not include the identifier of the first memory page, the method further includes:
the GPU modifies the identifier of the second memory page in the first MRU entry to the identifier of the first memory page.
With reference to the third possible implementation of the first aspect, after the GPU modifies the identifier of the second memory page in the first MRU entry to the identifier of the first memory page, the method further includes:
the GPU stores the second memory page into the main memory through the CPU;
the GPU deletes the second memory page stored in the video memory.
With reference to the first to fourth possible implementations of the first aspect, in a fifth possible implementation of the first aspect, if the first MRU entry includes the identifier of the first memory page, the method further includes:
the GPU accesses the first memory page from the video memory.
According to a second aspect, an embodiment of the present invention provides a data access apparatus, including:
an obtaining unit, configured to acquire an access request for a first memory page, where the access request carries a physical address of the first memory page;
a searching unit, configured to search, according to the physical address of the first memory page and a preset correspondence between physical addresses and identifiers of memory pages, whether an MRU entry corresponding to the physical address of the first memory page in a most recently used (MRU) table includes an identifier of the first memory page, where the MRU table includes, for each memory page group, the identifier of the memory page of that group stored in a video memory, and the identifier of the first memory page is the unique identifier of the first memory page within the memory page group to which it belongs;
a sending unit, configured to, if the MRU entry corresponding to the physical address of the first memory page does not include the identifier of the first memory page, send a page handling request to a central processing unit CPU, so that the CPU stores the first memory page into the video memory according to the page handling request;
an access unit, configured to access the first memory page from the video memory.
With reference to the second aspect, in a first possible implementation of the second aspect, the apparatus further includes:
a determining unit, configured to determine, according to X bits in the physical address of the first memory page, a first MRU entry in the MRU table corresponding to the physical address of the first memory page, where the X bits uniquely represent a first memory page group to which the first memory page belongs, X ≥ 1; and to determine, according to the physical address of the first memory page and an identifier of a second memory page stored in the first MRU entry, whether the identifier of the first memory page is included in the first MRU entry;
where the MRU table is composed of a plurality of MRU entries, and the MRU entries are in one-to-one correspondence with the memory page groups.
With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect,
the determining unit is specifically configured to determine, according to the physical address of the first memory page and the identifier of the second memory page stored in the first MRU entry, whether the first MRU entry includes the identifier of the first memory page; if the identifier of the second memory page is the same as the identifier of the first memory page, the GPU determines that the first MRU entry includes the identifier of the first memory page; if the identifier of the second memory page is different from the identifier of the first memory page, the GPU determines that the first MRU entry does not include the identifier of the first memory page.
With reference to the first or second possible implementation of the second aspect, in a third possible implementation of the second aspect, the apparatus further includes:
a modifying unit, configured to modify the identifier of the second memory page in the first MRU entry to the identifier of the first memory page.
With reference to the third possible implementation of the second aspect, in a fourth possible implementation of the second aspect, the apparatus further includes:
a storage unit, configured to store the second memory page into the main memory through the CPU;
a deleting unit, configured to delete the second memory page stored in the video memory.
With reference to the first to fourth possible implementations of the second aspect, in a fifth possible implementation of the second aspect,
the access unit is further configured to, if the first MRU entry includes the identifier of the first memory page, access the first memory page from the video memory.
With reference to the first to fifth possible implementations of the second aspect, in a further possible implementation of the second aspect, the data access apparatus is a GPU.
According to a third aspect, an embodiment of the present invention provides a data access system, where the system includes a graphics processing unit GPU, a central processing unit CPU and a video memory that are both connected to the GPU, and a main memory connected to the CPU, where
the GPU is configured to: acquire an access request for a first memory page, where the access request carries a physical address of the first memory page; search, according to the physical address of the first memory page and a preset correspondence between physical addresses and identifiers of memory pages, whether an MRU entry corresponding to the physical address of the first memory page in a most recently used (MRU) table includes an identifier of the first memory page, where the MRU table includes, for each memory page group, the identifier of the memory page of that group stored in the video memory, and the identifier of the first memory page is the unique identifier of the first memory page within the memory page group to which it belongs; if the MRU entry corresponding to the physical address of the first memory page does not include the identifier of the first memory page, send a page handling request to the central processing unit CPU, so that the CPU stores the first memory page into the video memory according to the page handling request; and access the first memory page from the video memory.
With reference to the third aspect, in a first possible implementation of the third aspect, the MRU table is composed of a plurality of MRU entries, and the MRU entries are in one-to-one correspondence with the memory page groups, where
the GPU is further configured to: determine, according to X bits in the physical address of the first memory page, a first MRU entry in the MRU table corresponding to the physical address of the first memory page, where the X bits uniquely represent a first memory page group to which the first memory page belongs, X ≥ 1; and determine, according to the physical address of the first memory page and an identifier of a second memory page stored in the first MRU entry, whether the identifier of the first memory page is included in the first MRU entry.
With reference to the first possible implementation of the third aspect, in a second possible implementation of the third aspect,
the GPU is further configured to: determine, according to X bits in the physical address of the first memory page, a first MRU entry in the MRU table corresponding to the physical address of the first memory page, where the X bits uniquely represent the first memory page group to which the first memory page belongs, X ≥ 1; and determine, according to the physical address of the first memory page and the identifier of the second memory page stored in the first MRU entry, whether the identifier of the first memory page is included in the first MRU entry.
With reference to the first or second possible implementation of the third aspect, in a third possible implementation of the third aspect,
the GPU is further configured to: modify the identifier of the second memory page in the first MRU entry to the identifier of the first memory page.
With reference to the first or second possible implementation of the third aspect, in a fourth possible implementation of the third aspect,
the GPU is further configured to: store the second memory page into the main memory through the CPU; and delete the second memory page stored in the video memory.
With reference to the first or fourth possible implementation of the third aspect, in a fifth possible implementation of the third aspect,
the GPU is further configured to: if the first MRU entry includes the identifier of the first memory page, access the first memory page from the video memory.
Embodiments of the present invention provide a data access method, apparatus, and system, in which a GPU acquires an access request for a first memory page, where the access request carries the physical address of the first memory page; the GPU then searches, according to that physical address and a preset correspondence between physical addresses and identifiers of memory pages, whether the identifier of the first memory page is stored in the MRU entry corresponding to the physical address of the first memory page in the MRU table, where the MRU table contains, for each memory page group, the identifier of the memory page of that group stored in the video memory. If the MRU entry corresponding to the physical address of the first memory page in the MRU table does not include the identifier of the first memory page, the first memory page is stored in the main memory but not in the video memory; in this case, the GPU sends a page handling request to the CPU, so that the CPU stores the first memory page into the video memory according to the page handling request, and the GPU then accesses the first memory page from the video memory. In this way, because each memory page group in the main memory has one memory page stored in the video memory, the GPU can, according to the physical address of the first memory page to be accessed, dynamically move the page to be accessed in each memory page group into the video memory while the program is running. The GPU can thus make full use of the high communication bandwidth of the video memory and the high capacity of the main memory, dynamically modifying which memory pages are stored in the video memory according to access requirements and reducing the access latency when the GPU accesses data from the main memory and the video memory.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below.
FIG. 1 is a schematic diagram of the connections between a GPU, a CPU, a video memory, and a main memory in the prior art;
FIG. 2 is a first schematic flowchart of a data access method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of multiple memory page groups in a main memory according to an embodiment of the present invention;
FIG. 4 is a second schematic flowchart of a data access method according to an embodiment of the present invention;
FIG. 5 is a first schematic structural diagram of a data access apparatus according to an embodiment of the present invention;
FIG. 6 is a second schematic structural diagram of a data access apparatus according to an embodiment of the present invention;
FIG. 7 is a third schematic structural diagram of a data access apparatus according to an embodiment of the present invention;
FIG. 8 is a fourth schematic structural diagram of a data access apparatus according to an embodiment of the present invention;
FIG. 9 is a first schematic structural diagram of a data access system according to an embodiment of the present invention;
FIG. 10 is a second schematic structural diagram of a data access system according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention.
In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Therefore, a feature qualified by "first" or "second" may explicitly or implicitly include one or more such features. In the description of the present invention, unless otherwise stated, "a plurality of" means two or more.
In this document, the term "and/or" merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may represent the three cases of A existing alone, A and B existing at the same time, and B existing alone. In addition, the character "/" in this document generally indicates an "or" relationship between the associated objects before and after it.
Embodiment 1
An embodiment of the present invention provides a data access method, as shown in FIG. 2, including:
101. The GPU acquires an access request for a first memory page, where the access request carries the physical address of the first memory page.
102. The GPU searches, according to the physical address of the first memory page and a preset correspondence between physical addresses and identifiers of memory pages, whether the MRU entry corresponding to the physical address of the first memory page in the MRU table includes the identifier of the first memory page.
103. If the MRU table does not include the identifier of the first memory page, the GPU sends a page handling request to the CPU, so that the CPU stores the first memory page into the video memory according to the page handling request.
104. The GPU accesses the first memory page from the video memory.
Specifically, the GPU needs to retrieve data stored in the video memory or the main memory to carry out its image computing work, and data in the video memory or the main memory is managed and accessed in units of pages; that is, the data in the video memory or the main memory can be divided into multiple memory pages. For example, if a memory page is 64 KB in size, i.e. a memory page contains 64 KB of data, then an 8 GB memory contains 131072 memory pages, and each memory page corresponds to a physical address (Physical Address). The GPU or the CPU can perform an addressing operation through a physical address to find and access the memory page indicated by that physical address.
In step 101, while running, the GPU may acquire an access request for the first memory page from the CPU in order to perform image computing work on the data stored in the first memory page. For example, when the user triggers a 3D display instruction, the CPU generates an access request for the first memory page according to the 3D display instruction and then sends it to the GPU through the bus, so that the GPU acquires the first memory page according to the access request and performs the 3D display.
The access request carries the physical address of the first memory page.
In step 102, the GPU searches, according to the physical address of the first memory page acquired in step 101 and the preset correspondence between physical addresses and identifiers of memory pages, whether the MRU entry corresponding to the physical address of the first memory page in the pre-stored MRU (Most Recently Used) table includes the identifier of the first memory page.
The MRU table includes, for each memory page group, the identifier of the memory page of that group stored in the video memory, and the identifier of the first memory page is the unique identifier of the first memory page within the memory page group to which it belongs.
Specifically, since the data that the GPU needs to process is stored in the video memory and the main memory respectively, the GPU can either access data quickly and directly from the video memory, or it needs to access data through the CPU from the main memory, which has a larger capacity but a smaller communication bandwidth. In order to reduce the access latency when the GPU accesses data from the main memory and the video memory, while making full use of the high communication bandwidth of the video memory and the high capacity of the main memory, an MRU table can be pre-stored in the system where the GPU runs, where the MRU table records at least, for each memory page group, the identifier of the memory page of that group that is stored in the video memory.
For example, as shown in FIG. 3, a 64 GB main memory can be divided into multiple memory page groups, for example with the 8 memory pages A, B, C, D, E, F, G, and H forming one memory page group. Taking 64 KB memory pages as an example, the 64 GB main memory then contains 131072 memory page groups. The MRU table records, for each memory page group, the identifier of the memory page of that group stored in the video memory; for example, the MRU table is also provided with 131072 MRU entries, the same as the number of memory page groups. When memory page A is stored in the video memory and the identifier of memory page A is 000, the first MRU entry of the MRU table stores the identifier 000 of memory page A.
The MRU table may include the preset correspondence between physical addresses and identifiers of memory pages. With the MRU table shown in Table 1, the GPU can determine, according to the physical address of the first memory page, which memory page group's address range the physical address falls into, thereby determining the memory page group to which the first memory page belongs (that is, the MRU entry corresponding to the physical address of the first memory page) and the identifier of the memory page stored in the video memory that is recorded in that MRU entry.
Table 1
| Address range of memory page group | Identifier of memory page |
| Address range of the first memory page group | 000 |
| Address range of the second memory page group | 101 |
| ...... | ...... |
这样,当GPU获取到第一内存页的物理地址时,GPU可以根据该物理地址以及预置的物理地址与内存页的标识之间的对应关系,在与第一内存页的物理地址对应的MRU表项中,查找显存中是否存储有第一内存页的标识,如果第一内存页的标识记录在MRU表中与第一内存页的物理地址对应的MRU表项中,则说明显存中存储有该第一内存页;如果第一内存页的标识未记录在MRU表中与第一内存页的物理地址对应的MRU表项中,则说明显存中没有存储该第一内 存页。
另外,可以在GPU所在的***内单独设置一个存储装置(例如寄存器等)用于存放该MRU表,这样,当GPU获取到第一内存页的物理地址时,可直接从该存储装置中访问该MRU表,避免GPU从内存或显存中访问该MRU表所带来的访问延时。
需要说明的是,图3中仅仅是举例说明将64G的内存划分为多个内存页组的方式,应当理解的是,将内存划分为多个内存页组的方式可以有多种,例如按照内存的物理地址进行划分,或者按照内存页的访问频率进行划分等,本发明实施例对此不做限定。
在步骤103中,若显存中不包括该第一内存页,即第一内存页的标识未记录在该第一内存页的物理地址对应的MRU表项中,则说明内存中存储有该第一内存页,此时,GPU可以向CPU发送页搬运请求,以使得CPU根据页搬运请求将第一内存页存储至显存中。
Certainly, after the CPU stores the first memory page into the video memory according to the page migration request, because the first memory page is now stored in the video memory, the GPU may further modify the MRU entry that is in the MRU table and that corresponds to the physical address of the first memory page, and record in that MRU entry the identifier of the first memory page stored in the video memory, so as to establish the correspondence between the identifier of the first memory page and its physical address. In this way, when the GPU subsequently acquires the physical address of the data in the first memory page again, it can directly look up the MRU table and determine that the first memory page is stored in the video memory.
It can be seen that the GPU determines, according to the physical address of the first memory page to be accessed, whether the identifier of the first memory page is stored in the MRU entry corresponding to that physical address, and thereby determines whether the first memory page is stored in the video memory. If the identifier of the first memory page is not recorded in the MRU entry that is in the MRU table and that corresponds to the physical address of the first memory page, the first memory page is not stored in the video memory, and the GPU can move the first memory page into the video memory for access. In this way, the most recently used memory page in each memory page group can be dynamically moved into the video memory while a program is executing, so that the GPU can make full use of the high communication bandwidth of the video memory and the high capacity of the memory, dynamically change the memory pages stored in the video memory according to access requirements, and reduce the access latency of the GPU when accessing data.
Finally, in step 104, because the first memory page has already been stored into the video memory in step 103, the GPU can access the first memory page directly from the video memory, whose communication bandwidth is higher.
This embodiment of the present invention provides a data access method, in which the GPU acquires an access request of a first memory page, the access request carrying the physical address of the first memory page; then, according to the physical address and the preset correspondence between physical addresses and identifiers of memory pages, the GPU searches whether the MRU entry that is in the MRU table and that corresponds to the physical address of the first memory page stores the identifier of the first memory page, where the MRU table contains, for each memory page group, the identifier of the memory page of that group stored in the video memory. If the MRU entry corresponding to the physical address of the first memory page does not contain the identifier of the first memory page, the first memory page is stored in the memory and not in the video memory; in this case, the GPU sends a page migration request to the CPU, so that the CPU stores the first memory page into the video memory according to the page migration request, and the GPU then accesses the first memory page from the video memory. In this way, because each memory page group in the memory has one memory page stored in the video memory, the GPU can, according to the physical address of the first memory page to be accessed, dynamically move the memory page to be accessed in each memory page group into the video memory for access while a program is running. The GPU can thus make full use of the high communication bandwidth of the video memory and the high capacity of the memory, dynamically change the memory pages stored in the video memory according to access requirements, and reduce the access latency of the GPU when accessing data from the memory and the video memory.
Embodiment 2
An embodiment of the present invention provides a data access method. As shown in FIG. 4, the method includes the following steps.
201. The GPU acquires an access request of a first memory page, where the access request carries a physical address of the first memory page.
202. The GPU determines, according to X bits in the physical address of the first memory page, a first MRU entry that is in an MRU table and that corresponds to the physical address of the first memory page, where the X bits are used to uniquely indicate the first memory page group to which the first memory page belongs.
203. The GPU uses Z bits in the physical address of the first memory page as the identifier of the first memory page, where the Z bits are used to uniquely indicate the identifier of the first memory page within the first memory page group.
204. If the identifier of the first memory page is different from the identifier of a second memory page stored in the first MRU entry, the GPU sends a page migration request to the CPU, so that the CPU stores the first memory page into the video memory according to the page migration request.
205. The GPU modifies, in the MRU table, the identifier of the second memory page in the first MRU entry to the identifier of the first memory page.
206. The GPU stores the second memory page into the memory by using the CPU, and deletes the second memory page stored in the video memory.
207. The GPU accesses the first memory page from the video memory.
In step 201, during running, the GPU may acquire the access request of the first memory page from the CPU, so as to perform image computing on the data in the first memory page, where the access request carries the physical address of the first memory page.
Specifically, an MRU table is pre-stored in the system in which the GPU is located. The MRU table is composed of multiple MRU entries, there is a one-to-one correspondence between MRU entries and memory page groups, and each MRU entry stores the identifier of one memory page, namely the identifier of the memory page that is in the memory page group corresponding to that MRU entry and that is stored in the video memory.
In step 202, the GPU determines, according to X bits (X ≥ 1) in the physical address of the first memory page acquired in step 201, the first MRU entry that is in the MRU table and that corresponds to the physical address of the first memory page. Because there is a one-to-one correspondence between MRU entries and memory page groups, the first memory page group in which the first memory page is located can be determined according to the X bits of the physical address.
The 64 GB memory and 8 GB video memory shown in FIG. 3 of Embodiment 1 are still used as an example. Eight memory pages A, B, C, D, E, F, G, and H form one memory page group, and the 64 GB memory contains 131072 memory page groups in total. Correspondingly, as shown in Table 2, the MRU table is also provided with 131072 MRU entries (that is, 2^17 MRU entries), the same number as the number of memory page groups. Each MRU entry in the MRU table stores the identifier of the memory page that is in the corresponding memory page group and that is stored in the video memory. For example, the first MRU entry stores the identifier 000 of memory page A, that is, in the memory page group formed by A, B, C, D, E, F, G, and H, memory page A is stored in the video memory.
Table 2
[Table 2: the 131072 MRU entries and, in each entry, the identifier of the memory page of the corresponding memory page group that is resident in the video memory; for example, the first MRU entry stores identifier 000 of memory page A.]
In this case, if the physical address acquired in step 201 has Y bits, the GPU can determine, from X bits of the Y-bit physical address, the first MRU entry that is in the MRU table and that corresponds to the physical address, that is, determine the first memory page group in which the first memory page is located, where the first MRU entry stores the identifier of a second memory page, and Y ≥ X ≥ 1.
For example, the physical address of the first memory page acquired in step 201 has 36 bits, and it is assumed that the size of each memory page is 64 KB. Bits 1 to 16 of the 36-bit physical address indicate the intra-page offset within a memory page, and bits 17 to 33 of the 36-bit physical address are used, as the X bits, to indicate the first MRU entry to which the first memory page corresponds, that is, the first memory page group to which the first memory page belongs.
The GPU can then determine, according to bits 17 to 33 of the 36-bit physical address, which of the 131072 MRU entries of the MRU table is the MRU entry indicated by the physical address (that is, the first MRU entry); the identifier of the memory page stored in the first MRU entry is then the identifier of the second memory page described above.
For example, if the physical address of the first memory page is E0000FFFF (hexadecimal), its binary form is 111000000000000000001111111111111111, 36 bits in total. The GPU can then determine, according to bits 17 to 33, the MRU entry indicated by this physical address in the MRU table; that is, according to 00000000000000000, it determines that the MRU entry indicated by this physical address is the first MRU entry (as shown in Table 2).
In step 203, the GPU can use Z bits of the 36-bit physical address as the identifier of the first memory page, where the Z bits are used to uniquely indicate the identifier of the first memory page within the first memory page group. For example, if one memory page group contains eight memory pages, 3 bits are sufficient to uniquely indicate the identifier of any one of the eight memory pages; for example, bits 34 to 36 of the physical address may be used as the identifier of the first memory page.
Further, in step 204, the GPU compares the Z bits used as the identifier of the first memory page in step 203 with the identifier of the second memory page stored in the first MRU entry in step 202. If the Z bits are the same as the identifier of the second memory page, the second memory page indicated by the first MRU entry is the same as the first memory page, which means that the first memory page is stored in the video memory; if the Z bits are different from the identifier of the second memory page, the second memory page indicated by the first MRU entry is different from the first memory page, which means that the first memory page is stored in the memory.
Still taking the 36-bit physical address above as an example, the GPU can compare bits 34 to 36 of the 36-bit physical address (that is, 111) with the identifier of the second memory page stored in the first MRU entry in Table 2 (that is, 000). Because 111 is not the same as 000, it can be determined that the first memory page to be accessed by the GPU is stored in the memory, and that, in the memory page group to which the first memory page belongs, the memory page stored in the video memory is the second memory page whose identifier is 000.
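The bit slicing in this worked example can be written out directly. The sketch below assumes the 36-bit layout described above (bits 1-16 offset, bits 17-33 group index, bits 34-36 page identifier) and reproduces the E0000FFFF example; the function names and the assumed resident identifier are illustrative only.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed 36-bit physical address layout from the example:
 *   bits  1..16  intra-page offset (64 KB pages)
 *   bits 17..33  index of the page group / MRU entry (2^17 groups)
 *   bits 34..36  identifier of the page within its group (8 pages) */

static uint32_t mru_index(uint64_t pa) { return (uint32_t)((pa >> 16) & 0x1FFFFu); }
static uint32_t page_id(uint64_t pa)   { return (uint32_t)((pa >> 33) & 0x7u); }

int main(void)
{
    uint64_t pa = 0xE0000FFFFull;   /* the worked example address              */
    uint32_t stored_id = 0u;        /* identifier 000 of memory page A,
                                       assumed resident in the video memory    */

    printf("MRU entry index: %u\n", mru_index(pa));   /* 0 -> the first MRU entry */
    printf("page identifier: %u\n", page_id(pa));     /* 7 -> binary 111          */

    if (page_id(pa) != stored_id)
        printf("miss: send a page migration request to the CPU\n");
    return 0;
}
```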
Specifically, if the identifier of the first memory page is different from the identifier of the second memory page stored in the first MRU entry, the first memory page is stored in the memory and not in the video memory. In this case, the GPU may send a page migration request to the CPU, so that the CPU stores the first memory page into the video memory according to the page migration request.
In step 205, after the CPU stores the first memory page into the video memory according to the page migration request, the GPU may modify, in the MRU table, the identifier of the second memory page in the first MRU entry to the identifier of the first memory page. In this way, while a program is running, the memory page to be accessed can be dynamically moved into the video memory according to the access requests received by the GPU, and by modifying the MRU table, when the GPU subsequently acquires the physical address of the data in the first memory page again, it can directly look up the MRU table and determine that the first memory page is stored in the video memory.
In step 206, if the second memory page is different from the first memory page, that is, the first memory page is stored in the memory, then, because the GPU needs to move the first memory page into the video memory for storage by using the CPU, the GPU may store, by using the CPU, the second memory page originally held in the video memory into the memory, and delete the second memory page already stored in the video memory. The GPU can then move, by using the CPU, the first memory page into the location in the video memory where the second memory page was originally stored.
Certainly, if the second memory page is the same as the first memory page, the first memory page is stored in the video memory; the GPU then does not need to send a page migration request or modify the first MRU entry in the MRU table, and can access the first memory page directly from the video memory.
Finally, in step 207, because the first memory page has already been stored into the video memory in step 204, the GPU can access the first memory page directly from the video memory, whose communication bandwidth is higher.
It should be further noted that this embodiment of the present invention does not limit the execution order of steps 204 to 206. That is, if the identifier of the second memory page is different from the identifier of the first memory page, the GPU may first modify, in the MRU table, the identifier of the second memory page to the identifier of the first memory page, and then send the page migration request to the CPU, store the second memory page into the memory, and delete the second memory page stored in the video memory; or, if the identifier of the second memory page is different from the identifier of the first memory page, the GPU may perform the steps in steps 204 to 206 at the same time, which is not limited in this embodiment of the present invention.
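A compact way to picture steps 204 to 206 is the following C sketch, which handles the case where the requested page's identifier differs from the one recorded in its MRU entry: the resident page is written back to the memory, the requested page is brought into the video memory, and the MRU entry is updated. The helper functions are stand-ins (assumptions), since the actual moves are performed by the CPU in response to the page migration request, and, as noted above, the order of the sub-steps is not fixed.

```c
#include <stdint.h>
#include <stdio.h>

#define GROUP_BITS 17
#define NUM_GROUPS (1u << GROUP_BITS)

static uint8_t mru_table[NUM_GROUPS];   /* per group: identifier resident in video memory */

/* Stand-ins for the CPU-side copies triggered by the page migration request. */
static void write_back_to_memory(uint32_t group, uint8_t id)
{ printf("group %u: evict page %u from video memory back to memory\n", group, id); }

static void bring_into_video_memory(uint32_t group, uint8_t id)
{ printf("group %u: move page %u from memory into video memory\n", group, id); }

/* Steps 204-206 for one access whose page identifier differs from the
 * identifier stored in the MRU entry of its group.                      */
static void handle_access(uint32_t group, uint8_t wanted_id)
{
    uint8_t resident_id = mru_table[group];
    if (resident_id == wanted_id)
        return;                                 /* already resident: just access it (step 207) */

    write_back_to_memory(group, resident_id);   /* step 206: store old page, delete it from VRAM */
    bring_into_video_memory(group, wanted_id);  /* step 204: page migration request to the CPU   */
    mru_table[group] = wanted_id;               /* step 205: update the first MRU entry          */
}

int main(void)
{
    mru_table[0] = 0u;        /* first group: page A (identifier 000) resident */
    handle_access(0u, 7u);    /* request for the page with identifier 111      */
    return 0;
}
```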
This embodiment of the present invention provides a data access method, in which the GPU acquires an access request of a first memory page, the access request carrying the physical address of the first memory page; then, according to the physical address and the preset correspondence between physical addresses and identifiers of memory pages, the GPU searches whether the MRU entry that is in the MRU table and that corresponds to the physical address of the first memory page stores the identifier of the first memory page, where the MRU table contains, for each memory page group, the identifier of the memory page of that group stored in the video memory. If the MRU table does not contain the identifier of the first memory page, the first memory page is stored in the memory and not in the video memory; in this case, the GPU sends a page migration request to the CPU, so that the CPU stores the first memory page into the video memory according to the page migration request, and the GPU then accesses the first memory page from the video memory. In this way, because each memory page group in the memory has one memory page stored in the video memory, the GPU can, according to the physical address of the first memory page to be accessed, dynamically move the memory page to be accessed in each memory page group into the video memory for access while a program is running. The GPU can thus make full use of the high communication bandwidth of the video memory and the high capacity of the memory, dynamically change the memory pages stored in the video memory according to access requirements, and reduce the access latency of the GPU when accessing data from the memory and the video memory.
Embodiment 3
An embodiment of the present invention provides a data access apparatus. As shown in FIG. 5, the apparatus includes:
an acquiring unit 01, configured to acquire an access request of a first memory page, where the access request carries a physical address of the first memory page;
a searching unit 02, configured to search, according to the physical address of the first memory page and a preset correspondence between physical addresses and identifiers of memory pages, whether an MRU entry that is in a most recently used (MRU) table and that corresponds to the physical address of the first memory page contains the identifier of the first memory page, where the MRU table contains, for each memory page group, the identifier of the memory page of that group stored in a video memory, and the identifier of the first memory page is the unique identifier of the first memory page within the memory page group to which it belongs;
a sending unit 03, configured to: if the MRU entry corresponding to the physical address of the first memory page does not contain the identifier of the first memory page, send a page migration request to a central processing unit (CPU), so that the CPU stores the first memory page into the video memory according to the page migration request; and
an accessing unit 04, configured to access the first memory page from the video memory.
Further, as shown in FIG. 6, the apparatus also includes:
a determining unit 05, configured to determine, according to X bits in the physical address of the first memory page, a first MRU entry that is in the MRU table and that corresponds to the physical address of the first memory page, where the X bits are used to uniquely indicate a first memory page group to which the first memory page belongs, X ≥ 1; and determine, according to the physical address of the first memory page and the identifier of a second memory page stored in the first MRU entry, whether the first MRU entry contains the identifier of the first memory page;
where the MRU table is composed of multiple MRU entries, and there is a one-to-one correspondence between the MRU entries and the memory page groups.
Further, the determining unit 05 is specifically configured to: determine, according to the physical address of the first memory page and the identifier of the second memory page stored in the first MRU entry, whether the first MRU entry contains the identifier of the first memory page; if the identifier of the second memory page is the same as the identifier of the first memory page, determine that the first MRU entry contains the identifier of the first memory page; and if the identifier of the second memory page is different from the identifier of the first memory page, determine that the first MRU entry does not contain the identifier of the first memory page.
Further, as shown in FIG. 7, the apparatus also includes:
a modifying unit 06, configured to modify the identifier of the second memory page in the first MRU entry to the identifier of the first memory page.
Further, as shown in FIG. 8, the apparatus also includes:
a storing unit 07, configured to store the second memory page into the memory by using the CPU; and
a deleting unit 08, configured to delete the second memory page stored in the video memory.
Further, the accessing unit 04 is also configured to: if the first MRU entry contains the identifier of the first memory page, access the first memory page from the video memory.
Optionally, the data access apparatus is a GPU.
This embodiment of the present invention provides a data access apparatus. The apparatus acquires an access request of a first memory page, the access request carrying the physical address of the first memory page; then, according to the physical address and the preset correspondence between physical addresses and identifiers of memory pages, the GPU searches whether the MRU entry that is in the MRU table and that corresponds to the physical address of the first memory page stores the identifier of the first memory page, where the MRU table contains, for each memory page group, the identifier of the memory page of that group stored in the video memory. If the MRU entry corresponding to the physical address of the first memory page does not contain the identifier of the first memory page, the first memory page is stored in the memory; in this case, the GPU sends a page migration request to the CPU, so that the CPU stores the first memory page into the video memory according to the page migration request, and the GPU then accesses the first memory page from the video memory. In this way, because each memory page group in the memory has one memory page stored in the video memory, the GPU can, according to the physical address of the first memory page to be accessed, dynamically move the memory page to be accessed in each memory page group into the video memory for access while a program is running. The GPU can thus make full use of the high communication bandwidth of the video memory and the high capacity of the memory, dynamically change the memory pages stored in the video memory according to access requirements, and reduce the access latency of the GPU when accessing data from the memory and the video memory.
Embodiment 4
An embodiment of the present invention provides a data access system. As shown in FIG. 9, the system includes a GPU 11, a CPU 12 and a video memory 13 that are both connected to the GPU 11, and a memory 14 connected to the CPU 12, where:
the GPU 11 is configured to: acquire an access request of a first memory page, where the access request carries a physical address of the first memory page; search, according to the physical address of the first memory page and a preset correspondence between physical addresses and identifiers of memory pages, whether an MRU entry that is in a most recently used (MRU) table and that corresponds to the physical address of the first memory page contains the identifier of the first memory page, where the MRU table contains, for each memory page group, the identifier of the memory page of that group stored in the video memory 13, and the identifier of the first memory page is the unique identifier of the first memory page within the memory page group to which it belongs; if the MRU entry corresponding to the physical address of the first memory page does not contain the identifier of the first memory page, send a page migration request to the central processing unit CPU 12, so that the CPU 12 stores the first memory page into the video memory 13 according to the page migration request; and access the first memory page from the video memory 13.
Further, as shown in FIG. 10, the system also includes a register 15 connected to the GPU 11, and the register 15 stores the MRU table. In this way, when the GPU 11 acquires the physical address of the first memory page, it can access the MRU table directly from this storage device, avoiding the access latency that would be incurred if the GPU 11 accessed the MRU table from the memory 14 or the video memory 13.
Specifically, the MRU table is composed of multiple MRU entries, and there is a one-to-one correspondence between the MRU entries and the memory page groups, where:
the GPU 11 is further configured to: determine, according to X bits in the physical address of the first memory page, a first MRU entry that is in the MRU table and that corresponds to the physical address of the first memory page, where the X bits are used to uniquely indicate a first memory page group to which the first memory page belongs, X ≥ 1; and determine, according to the physical address of the first memory page and the identifier of a second memory page stored in the first MRU entry, whether the first MRU entry contains the identifier of the first memory page.
Further, the GPU 11 is further configured to: use Z bits in the physical address of the first memory page as the identifier of the first memory page, where the Z bits are used to uniquely indicate the identifier of the first memory page within the first memory page group, Z ≥ 1; if the identifier of the second memory page is the same as the identifier of the first memory page, determine that the first MRU entry contains the identifier of the first memory page; and if the identifier of the second memory page is different from the identifier of the first memory page, determine that the first MRU entry does not contain the identifier of the first memory page.
Further, the GPU 11 is further configured to modify the identifier of the second memory page in the first MRU entry to the identifier of the first memory page.
Further, the GPU 11 is further configured to: store the second memory page into the memory 14 by using the CPU 12; and delete the second memory page stored in the video memory 13.
Further, the GPU 11 is further configured to: if the first MRU entry contains the identifier of the first memory page, access the first memory page from the video memory 13.
This embodiment of the present invention provides a data access system, in which the GPU acquires an access request of a first memory page, the access request carrying the physical address of the first memory page; then, according to the physical address and the preset correspondence between physical addresses and identifiers of memory pages, the GPU searches whether the MRU entry that is in the MRU table and that corresponds to the physical address of the first memory page stores the identifier of the first memory page, where the MRU table contains, for each memory page group, the identifier of the memory page of that group stored in the video memory. If the MRU entry corresponding to the physical address of the first memory page does not contain the identifier of the first memory page, the first memory page is stored in the memory; in this case, the GPU sends a page migration request to the CPU, so that the CPU stores the first memory page into the video memory according to the page migration request, and the GPU then accesses the first memory page from the video memory. In this way, because each memory page group in the memory has one memory page stored in the video memory, the GPU can, according to the physical address of the first memory page to be accessed, dynamically move the memory page to be accessed in each memory page group into the video memory for access while a program is running. The GPU can thus make full use of the high communication bandwidth of the video memory and the high capacity of the memory, dynamically change the memory pages stored in the video memory according to access requirements, and reduce the access latency of the GPU when accessing data from the memory and the video memory.
A person skilled in the art can clearly understand that, for the purpose of convenient and brief description, division into the foregoing functional modules is merely used as an example. In practical applications, the foregoing functions may be allocated to different functional modules as required, that is, the internal structure of the apparatus is divided into different functional modules to complete all or some of the functions described above. For the specific working processes of the system, apparatus, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described herein again.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the described apparatus embodiments are merely examples; the division into modules or units is merely logical function division and may be other division in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings or direct couplings or communication connections may be implemented through some interfaces, or indirect couplings or communication connections between apparatuses or units, and may be in electrical, mechanical, or other forms.
The units described as separate components may or may not be physically separate, and components displayed as units may or may not be physical units; that is, they may be located in one place or may be distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each of the units may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software functional unit.
When the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the present invention essentially, or the part contributing to the prior art, or all or some of the technical solutions, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for instructing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing descriptions are merely specific implementations of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement readily figured out by a person skilled in the art within the technical scope disclosed in the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (19)

1. A data access method, comprising:
    acquiring, by a graphics processing unit (GPU), an access request of a first memory page, wherein the access request carries a physical address of the first memory page;
    searching, by the GPU according to the physical address of the first memory page and a preset correspondence between physical addresses and identifiers of memory pages, a most recently used (MRU) table to determine whether an MRU entry corresponding to the physical address of the first memory page comprises an identifier of the first memory page, wherein the MRU table comprises, for each memory page group, an identifier of a memory page of that group stored in a video memory, and the identifier of the first memory page is a unique identifier of the first memory page in the memory page group to which the first memory page belongs;
    if the MRU entry corresponding to the physical address of the first memory page does not comprise the identifier of the first memory page, sending, by the GPU, a page migration request to a central processing unit (CPU), so that the CPU stores the first memory page into the video memory according to the page migration request; and
    accessing, by the GPU, the first memory page from the video memory.
2. The method according to claim 1, wherein the MRU table is composed of multiple MRU entries, and there is a one-to-one correspondence between the MRU entries and the memory page groups, wherein
    the searching, by the GPU according to the physical address of the first memory page and the preset correspondence between physical addresses and identifiers of memory pages, the MRU table to determine whether the MRU entry corresponding to the physical address of the first memory page comprises the identifier of the first memory page comprises:
    determining, by the GPU according to X bits in the physical address of the first memory page, a first MRU entry that is in the MRU table and that corresponds to the physical address of the first memory page, wherein the X bits are used to uniquely indicate a first memory page group to which the first memory page belongs, and X ≥ 1; and
    determining, by the GPU according to the physical address of the first memory page and an identifier of a second memory page stored in the first MRU entry, whether the first MRU entry comprises the identifier of the first memory page.
3. The method according to claim 2, wherein the determining, by the GPU according to the physical address of the first memory page and the identifier of the second memory page stored in the first MRU entry, whether the first MRU entry comprises the identifier of the first memory page comprises:
    using, by the GPU, Z bits in the physical address of the first memory page as the identifier of the first memory page, wherein the Z bits are used to uniquely indicate the identifier of the first memory page in the first memory page group, and Z ≥ 1;
    if the identifier of the second memory page is the same as the identifier of the first memory page, determining, by the GPU, that the first MRU entry comprises the identifier of the first memory page; and
    if the identifier of the second memory page is different from the identifier of the first memory page, determining, by the GPU, that the first MRU entry does not comprise the identifier of the first memory page.
4. The method according to claim 2 or 3, wherein if the first MRU entry does not comprise the identifier of the first memory page, the method further comprises:
    modifying, by the GPU, the identifier of the second memory page in the first MRU entry to the identifier of the first memory page.
5. The method according to claim 4, wherein after the modifying, by the GPU, the identifier of the second memory page in the first MRU entry to the identifier of the first memory page, the method further comprises:
    storing, by the GPU by using the CPU, the second memory page into the memory; and
    deleting, by the GPU, the second memory page stored in the video memory.
6. The method according to any one of claims 2 to 5, wherein if the first MRU entry comprises the identifier of the first memory page, the method further comprises:
    accessing, by the GPU, the first memory page from the video memory.
7. A data access apparatus, comprising:
    an acquiring unit, configured to acquire an access request of a first memory page, wherein the access request carries a physical address of the first memory page;
    a searching unit, configured to search, according to the physical address of the first memory page and a preset correspondence between physical addresses and identifiers of memory pages, a most recently used (MRU) table to determine whether an MRU entry corresponding to the physical address of the first memory page comprises an identifier of the first memory page, wherein the MRU table comprises, for each memory page group, an identifier of a memory page of that group stored in a video memory, and the identifier of the first memory page is a unique identifier of the first memory page in the memory page group to which the first memory page belongs;
    a sending unit, configured to: if the MRU entry corresponding to the physical address of the first memory page does not comprise the identifier of the first memory page, send a page migration request to a central processing unit (CPU), so that the CPU stores the first memory page into the video memory according to the page migration request; and
    an accessing unit, configured to access the first memory page from the video memory.
8. The apparatus according to claim 7, wherein the apparatus further comprises:
    a determining unit, configured to determine, according to X bits in the physical address of the first memory page, a first MRU entry that is in the MRU table and that corresponds to the physical address of the first memory page, wherein the X bits are used to uniquely indicate a first memory page group to which the first memory page belongs, and X ≥ 1; and determine, according to the physical address of the first memory page and an identifier of a second memory page stored in the first MRU entry, whether the first MRU entry comprises the identifier of the first memory page;
    wherein the MRU table is composed of multiple MRU entries, and there is a one-to-one correspondence between the MRU entries and the memory page groups.
9. The apparatus according to claim 8, wherein
    the determining unit is specifically configured to: determine, according to the physical address of the first memory page and the identifier of the second memory page stored in the first MRU entry, whether the first MRU entry comprises the identifier of the first memory page; if the identifier of the second memory page is the same as the identifier of the first memory page, determine that the first MRU entry comprises the identifier of the first memory page; and if the identifier of the second memory page is different from the identifier of the first memory page, determine that the first MRU entry does not comprise the identifier of the first memory page.
10. The apparatus according to claim 8 or 9, wherein the apparatus further comprises:
    a modifying unit, configured to modify the identifier of the second memory page in the first MRU entry to the identifier of the first memory page.
11. The apparatus according to claim 10, wherein the apparatus further comprises:
    a storing unit, configured to store the second memory page into the memory by using the CPU; and
    a deleting unit, configured to delete the second memory page stored in the video memory.
12. The apparatus according to any one of claims 7 to 11, wherein
    the accessing unit is further configured to: if the first MRU entry comprises the identifier of the first memory page, access the first memory page from the video memory.
13. The apparatus according to any one of claims 7 to 12, wherein the data access apparatus is a graphics processing unit (GPU).
14. A data access system, wherein the system comprises a graphics processing unit (GPU), a central processing unit (CPU) and a video memory that are both connected to the GPU, and a memory connected to the CPU, wherein
    the GPU is configured to: acquire an access request of a first memory page, wherein the access request carries a physical address of the first memory page; search, according to the physical address of the first memory page and a preset correspondence between physical addresses and identifiers of memory pages, a most recently used (MRU) table to determine whether an MRU entry corresponding to the physical address of the first memory page comprises an identifier of the first memory page, wherein the MRU table comprises, for each memory page group, an identifier of a memory page of that group stored in the video memory, and the identifier of the first memory page is a unique identifier of the first memory page in the memory page group to which the first memory page belongs; if the MRU entry corresponding to the physical address of the first memory page does not comprise the identifier of the first memory page, send a page migration request to the central processing unit CPU, so that the CPU stores the first memory page into the video memory according to the page migration request; and access the first memory page from the video memory.
15. The system according to claim 14, wherein the MRU table is composed of multiple MRU entries, and there is a one-to-one correspondence between the MRU entries and the memory page groups, wherein
    the GPU is further configured to: determine, according to X bits in the physical address of the first memory page, a first MRU entry that is in the MRU table and that corresponds to the physical address of the first memory page, wherein the X bits are used to uniquely indicate a first memory page group to which the first memory page belongs, and X ≥ 1; and determine, according to the physical address of the first memory page and an identifier of a second memory page stored in the first MRU entry, whether the first MRU entry comprises the identifier of the first memory page.
16. The system according to claim 15, wherein
    the GPU is further configured to: use Z bits in the physical address of the first memory page as the identifier of the first memory page, wherein the Z bits are used to uniquely indicate the identifier of the first memory page in the first memory page group, and Z ≥ 1; if the identifier of the second memory page is the same as the identifier of the first memory page, determine that the first MRU entry comprises the identifier of the first memory page; and if the identifier of the second memory page is different from the identifier of the first memory page, determine that the first MRU entry does not comprise the identifier of the first memory page.
17. The system according to claim 15 or 16, wherein
    the GPU is further configured to modify the identifier of the second memory page in the first MRU entry to the identifier of the first memory page.
18. The system according to claim 17, wherein
    the GPU is further configured to: store the second memory page into the memory by using the CPU; and delete the second memory page stored in the video memory.
19. The system according to any one of claims 14 to 18, wherein
    the GPU is further configured to: if the first MRU entry comprises the identifier of the first memory page, access the first memory page from the video memory.
PCT/CN2015/088872 2015-09-02 2015-09-02 一种数据访问方法、装置及*** WO2017035813A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201580001271.1A CN107209761B (zh) 2015-09-02 2015-09-02 一种数据访问方法、装置及***
PCT/CN2015/088872 WO2017035813A1 (zh) 2015-09-02 2015-09-02 Data access method, apparatus, and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/088872 WO2017035813A1 (zh) 2015-09-02 2015-09-02 Data access method, apparatus, and system

Publications (1)

Publication Number Publication Date
WO2017035813A1 true WO2017035813A1 (zh) 2017-03-09

Family

ID=58186501

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/088872 WO2017035813A1 (zh) 2015-09-02 2015-09-02 Data access method, apparatus, and system

Country Status (2)

Country Link
CN (1) CN107209761B (zh)
WO (1) WO2017035813A1 (zh)



Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102646074B (zh) * 2012-02-22 2015-04-15 中国人民解放军国防科学技术大学 Address mapping method for large-memory devices on the Loongson 3A platform
CN104426971B (zh) * 2013-08-30 2017-11-17 华为技术有限公司 Remote memory swap partition method, apparatus, and system
CN104572509B (zh) * 2014-12-26 2017-11-07 中国电子科技集团公司第十五研究所 Method for implementing discrete graphics card video memory allocation on a Loongson computing platform

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102981806A (zh) * 2012-11-05 2013-03-20 中国船舶重工集团公司第七二四研究所 Video-memory-pool-based high-speed refresh technique for a situation display area and implementation method thereof
US20140146065A1 (en) * 2012-11-29 2014-05-29 Nvidia Corporation Mpi communication of gpu buffers
CN103020320A (zh) * 2013-01-11 2013-04-03 西安交通大学 Dynamic-search-based runtime GPU video-memory-level data reuse optimization method
CN103198514A (zh) * 2013-03-25 2013-07-10 南京大学 Real-time ray-casting volume rendering method for three-dimensional seismic volume data
CN103200128A (zh) * 2013-04-01 2013-07-10 华为技术有限公司 Network packet processing method, apparatus, and system
CN103995684A (zh) * 2014-05-07 2014-08-20 广东粤铁瀚阳科技有限公司 Method and system for parallel processing and display of massive images on an ultra-high-resolution platform

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107861815A (zh) * 2017-10-31 2018-03-30 华中科技大学 Data communication performance optimization method in a multi-GPU environment
CN107861815B (zh) 2017-10-31 2020-05-19 华中科技大学 Data communication performance optimization method in a multi-GPU environment
CN113469215A (zh) * 2021-05-28 2021-10-01 北京达佳互联信息技术有限公司 Data processing method and apparatus, electronic device, and storage medium
CN113469215B (zh) 2021-05-28 2022-07-08 北京达佳互联信息技术有限公司 Data processing method and apparatus, electronic device, and storage medium
CN118132273A (zh) * 2024-04-29 2024-06-04 阿里云计算有限公司 Data processing method, apparatus, and device

Also Published As

Publication number Publication date
CN107209761B (zh) 2019-08-06
CN107209761A (zh) 2017-09-26


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 15902620; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 15902620; Country of ref document: EP; Kind code of ref document: A1)