CN115114192B - Memory interface, functional core, many-core system and memory data access method - Google Patents

Memory interface, functional core, many-core system and memory data access method

Info

Publication number
CN115114192B
Authority
CN
China
Prior art keywords
target
address
functional core
core
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110309611.4A
Other languages
Chinese (zh)
Other versions
CN115114192A (en)
Inventor
吴臻志
丁瑞强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Lynxi Technology Co Ltd
Original Assignee
Beijing Lynxi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Lynxi Technology Co Ltd
Priority to CN202110309611.4A
Priority to PCT/CN2022/079235 (WO2022199357A1)
Publication of CN115114192A
Application granted
Publication of CN115114192B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1605 Handling requests for interconnection or transfer for access to memory bus based on arbitration
    • G06F13/1652 Handling requests for interconnection or transfer for access to memory bus based on arbitration in a multiprocessor architecture
    • G06F13/1663 Access to shared memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023 Free address space management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G06F13/1605 Handling requests for interconnection or transfer for access to memory bus based on arbitration
    • G06F13/161 Handling requests for interconnection or transfer for access to memory bus based on arbitration with latency improvement
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/16 Handling requests for interconnection or transfer for access to memory bus
    • G06F13/18 Handling requests for interconnection or transfer for access to memory bus based on priority control
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/544 Buffers; Shared memory; Pipes
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The application discloses a memory interface, a functional core, a many-core system and a memory data access method, and belongs to the technical field of computers. The method for accessing stored data is applied to a memory interface located in a first functional core in a many-core system, and comprises the following steps: receiving a first read request, wherein the first read request carries a target global address; and, in response to the first read request, when the target storage space indicated by the target global address is located in the first functional core, determining a first private address corresponding to the target global address according to the mapping relationship between global addresses and private addresses, and transmitting first target data corresponding to the first private address in the target storage space. The embodiments of the application can improve the operation efficiency of the many-core system.

Description

Memory interface, functional core, many-core system and memory data access method
Technical Field
The application belongs to the technical field of computers, and particularly relates to a memory interface, a functional core, a many-core system and a memory data access method.
Background
A many-core system has strong data processing capability and contains many functional cores.
In the related art, when multiple functional cores need to access the same data, access is delayed, so the operation efficiency of the many-core system is low.
Disclosure of Invention
The embodiments of the present application aim to provide a memory interface, a functional core, a many-core system and a method for accessing stored data, which can solve the problem in the related art that data delay in accessing stored data lowers the operation efficiency of the many-core system.
In order to solve the above technical problem, the application is realized as follows:
In a first aspect, an embodiment of the present application provides a memory interface, the memory interface being located in a first functional core in a many-core system, the memory interface comprising:
The target location analyzer is used for determining a first private address corresponding to the target global address under the condition that a target storage space indicated by a target global address carried by a first read request is located in a first functional core, so as to access first target data corresponding to the first private address in the target storage space based on the first private address;
The target location analyzer stores a mapping relation between the target global address and the first private address in advance, and the target storage space is located in the first functional core.
In a second aspect, an embodiment of the present application provides a functional core, where the functional core includes a memory and a memory interface connected to the memory, and the memory interface is a memory interface according to the first aspect.
In a third aspect, an embodiment of the present application provides a many-core system, where the many-core system includes a plurality of functional cores as described in the second aspect, and any two functional cores in the many-core system are communicatively connected.
In a fourth aspect, an embodiment of the present application provides a method for accessing stored data, applied to the memory interface in the first aspect, where the method includes:
receiving a first read request, wherein the first read request carries a target global address;
in response to the first read request, when the target storage space indicated by the target global address is located in the first functional core, determining a first private address corresponding to the target global address according to the mapping relationship between global addresses and private addresses, and transmitting first target data corresponding to the first private address in the target storage space.
In a fifth aspect, an embodiment of the present application provides a storage data access device applied to the memory interface according to the first aspect, the device including:
The first receiving module is used for receiving a first read request, wherein the first read request carries a target global address;
The first transmission module is used for, in response to the first read request, when the target storage space indicated by the target global address is located in the first functional core, determining a first private address corresponding to the target global address according to the mapping relationship between global addresses and private addresses, and transmitting first target data corresponding to the first private address in the target storage space.
In a sixth aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, and a program or instruction stored on the memory and executable on the processor, the program or instruction implementing the steps of the method according to the fourth aspect when executed by the processor.
In a seventh aspect, embodiments of the present application provide a readable storage medium having stored thereon a program or instructions which when executed by a processor implement the steps of the method according to the fourth aspect.
In an eighth aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or instructions to implement a method according to the fourth aspect.
In the embodiments of the present application, a memory interface in a first functional core receives a first read request carrying a target global address and, in response to the first read request, when the target storage space indicated by the target global address is located in the first functional core, determines a first private address corresponding to the target global address according to the mapping relationship between global addresses and private addresses, and transmits first target data corresponding to the first private address in the target storage space. In this way, a functional core in the many-core system can access the private storage space of another functional core by global address, and different shared data can be distributed across the private storage spaces of different functional cores. This avoids the long and unpredictable waiting times that arise when a large amount of shared data is stored in a global shared storage space and multiple functional cores each access that space, and thereby improves the operation efficiency of the many-core system.
Drawings
FIG. 1 is a flow chart of a method for accessing stored data according to an embodiment of the present application;
FIG. 2 is a schematic diagram of data interaction between a first functional core and a second functional core in a method for accessing stored data according to an embodiment of the present application;
FIG. 3 is a schematic diagram of data interaction in a first functional core in a method for accessing stored data according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a memory interface according to an embodiment of the present application;
FIG. 5 is a block diagram of a memory data access device according to an embodiment of the present application;
FIG. 6 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The terms first, second and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged, as appropriate, such that embodiments of the present application may be implemented in sequences other than those illustrated or described herein, and that the objects identified by "first," "second," etc. are generally of a type, and are not limited to the number of objects, such as the first object may be one or more. Furthermore, in the description and claims, "and/or" means at least one of the connected objects, and the character "/", generally means that the associated object is an "or" relationship.
The many-core system comprises a plurality of functional cores, and if the plurality of functional cores in the many-core system need to access the same storage data, in the related art, the plurality of functional cores can acquire the same shared data in the following two ways:
In the first mode, a global shared memory space is provided outside the functional cores, and possibly outside the chip on which the functional cores are located. In use, when a functional core needs to obtain target data from the global shared memory space, it sends a read request to the global shared memory space over a shared memory bus, and the target data is returned over the same shared memory bus.
In this scenario, when multiple cores access the global shared memory space at the same time, contention arbitration must decide which core obtains the right of use. The core that obtains it reads the stored data from the global shared memory space over the shared memory bus, while the other cores must keep waiting, so accessing the global shared memory space takes a long time. In addition, because the data in the global shared memory space must be transmitted to the functional cores over the shared memory bus, the data delay is not fixed and may even be unpredictable, and the shared memory bus is prone to congestion, which significantly reduces the operation efficiency of the many-core system.
In the second mode, the shared data is copied into the private memory space of each functional core that needs it. A private memory space is used only by the functional core it belongs to, and an external functional core cannot access the private memory space of another functional core; in implementation, each functional core obtains the shared data from its own private memory space.
In this mode, when an algorithm with shared data (such as a large neural network) is run, the shared portion needs to be copied to the private memory of every functional core, which wastes resources. In addition, the private memory cannot hold a large array or shared data, so the application range of the many-core system is limited.
In order to solve the above technical problems, the embodiments of the present application convert between a global address and the corresponding private address, so that an external functional core can access the private memory space of another functional core through the global address. On the one hand, shared data does not need to be copied into the private memory of each core, which reduces resource waste and broadens the application range of the many-core system; on the other hand, no additional global shared storage space needs to be provided outside the functional cores, which improves the operation efficiency of the many-core system.
The storage data access method, the storage data access device, the electronic equipment and the readable storage medium provided by the embodiment of the application are described in detail below through specific embodiments and application scenes thereof with reference to the accompanying drawings.
Referring to FIG. 1, which is a flowchart of a method for accessing stored data according to an embodiment of the present application, the method can be applied to a memory interface located in a first functional core in a many-core system. As shown in FIG. 1, the method may include the following steps:
Step 101, receiving a first read request, wherein the first read request carries a target global address.
Step 102, in response to the first read request, when the target storage space indicated by the target global address is located in the first functional core, determining a first private address corresponding to the target global address according to a mapping relationship between the global address and the private address, and transmitting first target data corresponding to the first private address in the target storage space.
In practice, the above-mentioned functional core may also be referred to simply as a "core", which is the smallest unit of the many-core system that can be independently scheduled and has complete computing power.
In some alternative embodiments, the private memory spaces in the functional cores use the same set of private addresses. For example, the first functional core has N private storage spaces whose storage space identifiers are 0 to N-1, and the second functional core also has N private storage spaces whose storage space identifiers are also 0 to N-1.
The global address is unique in the many-core system and can point to a target private storage space in the many-core system, where the target private storage space is located in a target functional core. For example, the global address is a combination of the core identifier of the first functional core and the storage space identifier of the target storage space in the first functional core. In an implementation, the global address may be the starting storage address of the first target data, and the first read request may further include a first length of the first target data. In that case, transmitting the first target data corresponding to the first private address in the target storage space may be understood as: taking the first private address corresponding to the global address as the start address, transmitting the data of the first length that begins at that start address in the first functional core to the requester of the first read request.
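As a minimal sketch of Steps 101 and 102, the Python below models a memory interface that receives a read request carrying a global address and a length, converts the global address to a private address, and returns the data. The names `ReadRequest` and `MemoryInterface`, the value of `N`, and the layout "global address = core identifier × N + private address" are illustrative assumptions, not taken from the patent.

```python
from dataclasses import dataclass, field

N = 8  # assumed number of private addresses per functional core


@dataclass
class ReadRequest:
    global_address: int  # target global address (start of the target data)
    length: int = 1      # "first length" of the target data


@dataclass
class MemoryInterface:
    core_id: int
    memory: list = field(default_factory=lambda: [0] * N)

    def handle_read(self, req: ReadRequest):
        # Assumed layout: global address = core_id * N + private address.
        target_core, private_address = divmod(req.global_address, N)
        if target_core != self.core_id:
            return None  # target storage space is not in this core
        # Transmit `length` items starting at the private start address.
        return self.memory[private_address:private_address + req.length]
```

A request whose global address resolves to a different core returns `None` here; how a real memory interface treats such a request is not specified in this sketch.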
As an alternative embodiment, the first read request includes at least one of:
a read request sent by a first data path of the first functional core;
A read request sent by the second functional core;
Wherein the second functional core is a different functional core in the many-core system than the first functional core.
The second functional core may be any functional core in the many-core system other than the first functional core, that is, an external functional core of the first functional core, and the number of the second functional cores may be one or multiple.
The read request sent by the first data path of the first functional core may be understood as: the first functional core accesses data stored within the first functional core using the global address.
In practical application, the first functional core may directly use the private address to perform private access to the data stored in the first functional core, which is not limited herein.
The read request sent by the second functional core may be understood as: the external functional core uses the global address to make global access to the data stored in the first functional core.
In implementations, the same memory space within the first functional core may respond to only one of global access and private access at any given time. For example, the memory space may be switched, by mode switching, between a first mode of operation (which may also be referred to as a "global mode") and a second mode of operation (which may also be referred to as a "private mode"). In the first mode of operation, the global address in a received access request is converted into a private address by the memory interface in the first functional core, so that the data in the target storage space is accessed based on the private address (i.e., in this mode only global access to the target storage space is responded to, and private access may be refused). In the second mode of operation, the data in the target storage space is accessed via the memory interface within the first functional core based on the private address in the received access request (i.e., in this mode only private access to the target storage space is responded to, and global access may be refused).
In addition, the first functional core may include one or more private memory spaces (i.e., memory slices). Where there are several, some may operate in the private mode and others in the global mode; a private memory space in the private mode is accessed only by the first functional core, and a private memory space in the global mode is accessed only by the second functional core.
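The two modes of operation can be illustrated with a small Python sketch. The class `MemorySlice`, the `Mode` enum, and the choice to raise `PermissionError` when an access does not match the current mode are assumptions made for illustration; the text only says that such an access may be refused.

```python
from enum import Enum


class Mode(Enum):
    GLOBAL = "global"    # first mode of operation
    PRIVATE = "private"  # second mode of operation


class MemorySlice:
    """One private memory space (memory slice) of a functional core."""

    def __init__(self, core_id, size, mode=Mode.PRIVATE):
        self.core_id = core_id
        self.size = size
        self.mode = mode
        self.cells = [0] * size

    def read_global(self, global_address):
        if self.mode is not Mode.GLOBAL:
            raise PermissionError("slice is in private mode")
        # Assumed layout: global address = core_id * size + private address.
        core, private_address = divmod(global_address, self.size)
        if core != self.core_id:
            raise ValueError("global address points to another core")
        return self.cells[private_address]

    def read_private(self, private_address):
        if self.mode is not Mode.PRIVATE:
            raise PermissionError("slice is in global mode")
        return self.cells[private_address]
```

Switching a slice is then just an assignment such as `slice_.mode = Mode.GLOBAL`, after which only `read_global` succeeds.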
In this embodiment, the first functional core and the second functional core in the many-core system may access data in the first functional core by using the global address.
As an alternative embodiment, the first read request and the first target data are transmitted over a network-on-chip or a bus-on-chip.
In an implementation, the first functional core and the second functional core are connected by the network-on-chip or bus-on-chip.
For example, as shown in FIG. 2, a communication connection is established between a request core 21 (i.e., a second functional core) and a destination core 22 (i.e., a first functional core) through an on-chip/inter-chip network 23. The request core 21 includes a first private memory 211, a first dual-mode memory interface 212, a first route 213, and a data path 214; the destination core 22 includes a second private memory 221, a second dual-mode memory interface 222, and a second route 223. When the request core 21 needs to read target data stored in the private memory 224 of the destination core 22, the data path 214 in the request core 21 sends a read request to the on-chip/inter-chip network 23 through the first dual-mode memory interface 212 and the first route 213. The read request carries the core identifier of the request core 21 and the global address of the data to be read. The destination core 22 receives the read request from the on-chip/inter-chip network 23 through the second route 223. When the core identifier in the read request is not consistent with the core identifier of the destination core 22, the second dual-mode memory interface 222 converts the global address in the read request into a private address and transmits the target data stored at that private address in the second private memory 221 to the on-chip/inter-chip network 23 through the second route 223, so that the request core 21 obtains the target data from the on-chip/inter-chip network 23 through the first route 213. In this way, the request core 21 obtains the target data from the private memory of the destination core 22.
It should be noted that, in the embodiment shown in fig. 2, the arrow direction indicates the transmission direction of the signaling or the target data in the process of the request core 21 obtaining the target data from the private memory 224 of the destination core 22.
In this embodiment, the request signaling and the data packets between functional cores are transmitted through the network-on-chip or the on-chip bus, so that the waiting time can be reduced when a plurality of second functional cores access the private memory of the first functional core at the same time.
Of course, in implementation, the first functional core and the second functional core may also be connected through other networks, for example a short-range communication network, which is not particularly limited herein.
In addition, in an alternative embodiment, in a case that the number of the second functional cores is a plurality, that is, a plurality of external functional cores access the private storage space of the first functional core, the first functional core may also respond to the read requests of the plurality of second functional cores one by one.
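The one-by-one handling of read requests from several second functional cores could be sketched as follows. The first-in-first-out order, the `DestinationCore` class, and the address layout `core_id * N + private address` are assumptions for illustration; the text does not specify the order in which requests are served.

```python
from collections import deque

N = 4  # assumed number of private addresses per functional core


class DestinationCore:
    def __init__(self, core_id, data):
        self.core_id = core_id
        self.private_memory = list(data)
        self.pending = deque()  # read requests received through the router

    def receive(self, requester_id, global_address):
        self.pending.append((requester_id, global_address))

    def serve_all(self):
        """Answer queued requests one by one (FIFO order is an assumption)."""
        replies = []
        while self.pending:
            requester_id, global_address = self.pending.popleft()
            core, private_address = divmod(global_address, N)
            if core == self.core_id:
                replies.append((requester_id, self.private_memory[private_address]))
        return replies
```

Each reply is paired with the requester identifier so that the data can be routed back to the core that asked for it.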
As an alternative embodiment, the method further comprises:
storing the mapping relationship between the global address and the private address.
In an implementation, the mapping relationship between the global address and the private address may be stored as a mapping table, so that in operation the private address corresponding to a global address is looked up in the mapping table. Of course, it may also be a pre-stored conversion relationship between the private address and the global address, so that the global address is dynamically converted into the corresponding private address at run time.
Of course, storing the mapping relationship between the global address and the private address may further include: determining the mapping relationship according to the conversion relationship between the global address and the private address.
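The two storage options, a mapping table versus a conversion relationship applied at run time, can be contrasted in a short Python sketch. The constants `N` and `CORE_ID` and the layout `global = CORE_ID * N + private` are illustrative assumptions.

```python
N = 4        # assumed number of private addresses per functional core
CORE_ID = 3  # assumed identifier of the core that owns this memory interface

# Option 1: a pre-stored mapping table from global to private addresses.
mapping_table = {CORE_ID * N + k: k for k in range(N)}


def lookup(global_address):
    """Return the private address from the table, or None if not in this core."""
    return mapping_table.get(global_address)


# Option 2: a stored conversion relationship, evaluated dynamically.
def convert(global_address):
    core, private_address = divmod(global_address, N)
    return private_address if core == CORE_ID else None
```

Under these assumptions both options give the same answer for every address; the table trades storage for lookup simplicity, while the conversion needs no per-address storage.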
As an optional implementation manner, the storing the mapping relationship between the global address and the private address includes:
in the case where the many-core system comprises S chips, storing the mapping relationship according to the global address corresponding to each private address, wherein each global address indicates the target functional core where the corresponding private address is located and the target chip where the target functional core is located;
or
in the case where the many-core system comprises 1 chip and the chip comprises a functional core array arranged in J rows and P columns, storing the mapping relationship according to the global address corresponding to each private address, wherein each global address indicates the target functional core where the corresponding private address is located and the position of the target functional core in the functional core array.
In an optional implementation manner, storing the mapping relationship according to the global address corresponding to each private address, where each global address indicates the target functional core where the corresponding private address is located and the target chip where the target functional core is located, may be understood as: the mapping relationship comprises a conversion relationship between the global address and the private address, and the conversion relationship ties the private address to the global address through the target functional core where the private address is located and the target chip where that functional core is located. In other words, the global address carries the corresponding private address, the identifier of the functional core where the private address is located, and the identifier of the chip where the functional core is located.
For example: under the condition that the many-core system comprises S chips, the mapping relation between the global address and the private address is determined by adopting the following formula:
p=(qV+c)×N+k;
Wherein p represents the global address, k represents the private address, c represents the identifier of the functional core corresponding to k, q represents the identifier of the chip where the functional core corresponding to c is located, N represents the total number of private addresses in the functional core corresponding to k, and V represents the total number of functional cores in each chip.
Of course, in the specific implementation, besides expressing the mapping relationship between the global address and the private address through the above formula, the mapping relationship between the global address and the private address can be expressed in a mode of jointly forming the global address by combining and arranging the private address, the chip identifier and the functional core identifier.
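As a worked illustration of the formula p=(qV+c)×N+k, the Python sketch below computes a global address and recovers (q, c, k) from it. The function names, the assumption that all identifiers are zero-based, and the inverse conversion by integer division are illustrative; the patent text gives only the forward formula.

```python
def to_global(q, c, k, V, N):
    """Global address p = (q*V + c)*N + k, assuming zero-based identifiers."""
    if q < 0 or not (0 <= c < V) or not (0 <= k < N):
        raise ValueError("identifier out of range")
    return (q * V + c) * N + k


def from_global(p, V, N):
    """Recover (chip q, core c, private address k) from a global address p."""
    rest, k = divmod(p, N)
    q, c = divmod(rest, V)
    return q, c, k


# Example: 4 cores per chip, 16 private addresses per core.
p = to_global(q=1, c=2, k=5, V=4, N=16)  # (1*4 + 2)*16 + 5 = 101
```

With zero-based identifiers bounded as above, each (q, c, k) triple yields a distinct p, so `from_global(101, V=4, N=16)` returns `(1, 2, 5)`.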
In another optional implementation manner, storing the mapping relationship according to the global address corresponding to each private address, where each global address indicates the target functional core where the corresponding private address is located and the position of the target functional core in the functional core array, may be understood as: the global address carries the identifier of the functional core where the corresponding private address is located and the position of that functional core in the functional core array.
For example: in the case that the many-core system includes 1 chip, and the chip includes a functional core array arranged in J rows and P columns, the mapping relationship between the global address and the private address is determined by adopting the following formula:
p=(Px+y)×N+k;
Wherein p represents a global address, k represents a private address, x represents a row identifier of a functional core corresponding to k in the functional core array, y represents a column identifier of a functional core corresponding to k in the functional core array, and N represents a total number of private addresses in the functional core corresponding to k.
Of course, in a specific implementation, besides expressing the mapping relationship between the global address and the private address through the above formula, the mapping relationship can also be expressed by concatenating the private address and the chip position so that together they form the global address.
In this embodiment, the association between the private address and the global address is established through the chip identifier, the functional core position, and the like, so that, in application, the global address and the private address can be converted into each other according to the chip identifier, the functional core position, and the like, which simplifies the process of determining the first private address corresponding to the target global address.
As an alternative embodiment, the method further comprises:
receiving a second read request sent by a first data path, wherein the first data path is positioned in the first functional core, and the second read request carries a second private address;
and, in response to the second read request, transmitting second target data corresponding to the second private address in the target storage space to the first data path.
Wherein the first data path represents a data path within a first functional core, for example: a data path between the processor and the memory within the first functional core.
Wherein the second read request is different from the first read request in that: the first read request carries a global address, and the second read request carries a private address, that is, the first read request is a global access request, and the second read request is a private access request.
This embodiment applies to the scenario in which the first functional core reads data from its local private memory. In this case, the first functional core sends the private address directly to the local private memory, without sending the global address corresponding to that private address. Because the second read request carries the identifier of the first functional core, the target storage space can recognize the second read request as a local access request, identify the address as a private address, and avoid mistaking it for a global address.
For example: as shown in fig. 3, the computing module in the first functional core 31 may send a read request to the private memory 313 of the first functional core through the data path 311 via the dual-mode memory interface 312, where the read request carries a private address of the target data, so that the private memory 313 feeds back the target data stored in the private address to the computing module through the dual-mode memory interface 312 and the data path 311.
It should be noted that, in the embodiment shown in fig. 3, the arrow direction indicates the transmission direction of the signaling or the target data in the process of the first functional core 31 obtaining the target data from the private memory 313 of the functional core.
In this embodiment, the first data path in the first functional core sends the second read request carrying the private address to the target storage space of the first functional core, so that the implementation process of reading data from the target storage space is the same as the process of reading data from the local private memory by the functional core in the prior art, which is not described herein.
It should be noted that, in practical applications, the first functional core may include a plurality of memory slices, so that some memory slices in the first functional core may be accessed by the functional core itself while other memory slices are accessed by external functional cores. However, the same memory slice can respond to only one of a local access request and a global access request (i.e., the first read request) at any given time.
In an alternative embodiment, the memory of the functional core may be controlled to operate in a global mode or in a private mode according to a control signal of the control logic.
Case one
In the case where the memory of the functional core operates in the global mode, the memory can only be accessed globally by external functional cores, and access by the functional core itself is denied. In the global mode, a received read request carries a global address; the global address mapping table records the mapping and is used to translate the global address into a private address, and the private address is then used to access the memory.
It should be noted that, in the global mode, the memories of all the functional cores form one large-capacity logical memory. When global access is performed, there is no need to indicate which chip the memory is located in; only the global address is needed for access.
Case two
In the case where the memory of the functional core operates in the private mode, the memory can only be accessed locally by the functional core itself, and global access by external functional cores is denied. In the private mode, the memory can only be accessed by the functional core itself, so the access latency is fixed and predictable.
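The two cases above can be illustrated with a small Python model of a memory that serves only one kind of access at a time. This is a sketch under stated assumptions: the class and method names (Mode, CoreMemory, read_global, read_private) are hypothetical, and the mapping table is assumed to map global address core_id × N + k to private address k.

```python
from enum import Enum


class Mode(Enum):
    GLOBAL = "global"
    PRIVATE = "private"


class CoreMemory:
    """Toy model of one functional core's memory with a mode control signal."""

    def __init__(self, core_id: int, n_private: int, mode: Mode):
        self.mode = mode
        self.data = [0] * n_private
        # Assumed mapping table: global address -> private address.
        self.table = {core_id * n_private + k: k for k in range(n_private)}

    def read_global(self, global_addr: int) -> int:
        """Global access: allowed only in global mode, after translation."""
        if self.mode is not Mode.GLOBAL:
            raise PermissionError("global access denied in private mode")
        return self.data[self.table[global_addr]]

    def read_private(self, private_addr: int) -> int:
        """Local access: allowed only in private mode, no translation."""
        if self.mode is not Mode.PRIVATE:
            raise PermissionError("local access denied in global mode")
        return self.data[private_addr]
```

In this model, changing the mode attribute plays the role of the control signal from the control logic.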
In another alternative embodiment, whether the access request is a local access request inside the functional core or a global access request of the external functional core may be determined according to whether the received read request carries a private address or a global address, so as to determine what response is performed to the read request.
Further, the working modes of the target storage space comprise a first working mode and a second working mode;
In the first working mode, the memory interface converts a global address in a received access request into a private address so as to access data in the target storage space based on the private address; in the second operating mode, the memory interface accesses data in the target memory space based on a private address in the received access request;
the method further comprises at least one of:
determining the working mode of the target storage space according to the control instruction;
determining, in response to the first read request, that the operating mode is the first operating mode;
and determining, in response to the second read request, that the working mode is the second working mode.
In the first working mode, the memory interface converts the global address in the received access request into a private address, so as to access the data in the target storage space based on the private address, which can be understood as: the first mode of operation may be a global mode in which the memory interface is responsive only to global accesses.
In addition, in the second operation mode, the memory interface accesses the data in the target storage space based on the private address in the received access request, which can be understood as: the second mode of operation may be a private mode, in which case the memory interface only responds to private accesses.
In an alternative embodiment, determining the operation mode of the target storage space according to the control instruction may be understood as: the operation mode of the target storage space is determined as indicated by a preset control instruction and is not affected by the received access requests.
In this embodiment, the working mode of the target storage space may be adjusted by a preset control instruction, so as to control whether the data in the target storage space can be accessed only by the functional core or by other functional cores.
In another optional embodiment, determining, in response to the first read request, that the operation mode is the first operation mode, and determining, in response to the second read request, that the operation mode is the second operation mode, may be understood as: the memory interface determines whether to operate in the first operation mode or in the second operation mode based on whether the received access request is a global request or a private request.
Specifically, the above memory interface refuses the second read request to the target storage space until the first read request completes the response, which can be understood as: in the first working mode, only the global access of the target storage space is responded, and when all global access responses are completed, the system can be switched to the second working mode so as to respond to the private access of the target storage space in the second working mode; or when all the global access responses are completed, the private access of the target storage space can be directly responded without mode switching.
Accordingly, the above memory interface refuses the first read request to the target storage space until the second read request completes the response, which can be also understood as: in the second working mode, only the private access of the target storage space is responded, and when all private access responses are completed, the first working mode can be switched to, so that the global access of the target storage space is responded in the first working mode; or after all private access responses are completed, the global access of the target storage space can be directly responded without mode switching.
Further, in an implementation, when the memory interface receives the first read request and switches to the first operation mode and then receives the second read request, the memory interface may respond to each first read request first, and during this period, the received second read request may be stored in the to-be-responded list, so that after the memory interface responds to all the first read requests, the memory interface may switch to the second operation mode to respond to the second read request in the to-be-responded list.
Correspondingly, when the memory interface receives the second read requests and then switches to the second working mode and then receives the first read requests, the memory interface can respond to each second read request first, and the first read requests received in the period can be stored in the to-be-responded list, so that after the memory interface responds to all the second read requests, the memory interface can switch to the first working mode to respond to the first read requests in the to-be-responded list.
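The deferral behaviour described in the two paragraphs above can be sketched as follows. This is a simplified model, not the implementation of the memory interface: it assumes all requests arrive within one service window, that the first request received selects the initial operation mode, and that the function name serve_in_order and the ("global" / "private", name) request tuples are purely illustrative.

```python
from collections import deque


def serve_in_order(requests):
    """Return the order in which requests are answered.

    requests: list of (kind, name) tuples, kind being 'global' or 'private'.
    """
    if not requests:
        return []
    mode = requests[0][0]          # the first request selects the mode
    pending = deque()              # the to-be-responded list
    served = []
    for kind, name in requests:
        if kind == mode:
            served.append((kind, name))    # matches current mode: respond now
        else:
            pending.append((kind, name))   # other kind: defer
    # All requests of the current mode are answered; switch mode and drain.
    while pending:
        served.append(pending.popleft())
    return served
```

For the arrival order global G1, private P1, global G2, private P2, the sketch answers G1 and G2 first and then P1 and P2 after the switch to the second operation mode.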
In this embodiment, the operation mode of the target storage space may be determined according to the type of the received read request, so as to facilitate switching the operation mode of the target storage space according to different read requests, so as to match the target storage space with the received read request.
If the target storage space receives a local access request and a global access request at the same time, arbitration may also be used to select which of the local access request and the global access request is responded to first.
As an alternative embodiment, in a case where the difference in receiving time between the first read request and the second read request is less than a preset time, it is determined to respond to at least one of the first read request and the second read request by arbitration or a preset priority.
The preset time may be any time length, such as 0.1 second or 1 second, and is not specifically limited herein.
Wherein, determining by the preset priority to respond to at least one of the first read request and the second read request may be understood as: in the case where the priority of the first read request is set in advance to be higher than the priority of the second read request, the first read request is responded to preferentially, and the second read request may be rejected, or may be responded to after the response to the first read request is completed; in the case where the priority of the second read request is set in advance to be higher than the priority of the first read request, the second read request is responded to preferentially, and the first read request may be rejected, or may be responded to after the response to the second read request is completed.
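A minimal sketch of the preset-priority rule above is given below. The function name arbitrate and its parameters are hypothetical; in particular, serving the two requests in arrival order when they are farther apart than the preset time is an assumption made here for completeness and is not stated in this application.

```python
def arbitrate(t_first, t_second, preset_time,
              first_has_priority=True, reject_loser=False):
    """Decide the response order for a first (global) and second (private) read.

    t_first / t_second: receiving times of the first and second read requests.
    Returns a list naming the requests that are answered, in order.
    """
    if abs(t_first - t_second) >= preset_time:
        # No conflict (assumed behaviour): answer in arrival order.
        return ["first", "second"] if t_first <= t_second else ["second", "first"]
    winner, loser = ("first", "second") if first_has_priority else ("second", "first")
    # The lower-priority request is either rejected or answered afterwards.
    return [winner] if reject_loser else [winner, loser]
```

Here "first" denotes the first read request (global access) and "second" denotes the second read request (private access).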
As an alternative embodiment, the first read request is a read request sent by a first data path of the first functional core, and the method further includes:
when the storage space indicated by the target global address is determined to be located in a second functional core, sending the first read request to the second functional core, wherein the second functional core is a functional core different from the first functional core in the many-core system;
and receiving third target data returned by the second functional core, and transmitting the third target data to the first data path.
In implementation, a functional core may access data stored in other functional cores. In this case, a data path in the functional core sends a read request for data stored in another functional core, and the functional core compares the functional core identifier carried in the read request with its own identifier. When the two do not match, it can be determined that the requested data is stored in another functional core and that the address carried in the read request is a global address, and the read request is sent to the second functional core corresponding to the functional core identifier carried in the read request.
In addition, when the second functional core receives the first read request, it converts the global address carried in the first read request into a private address, accesses the storage space where the third target data is located based on that private address, and transmits the third target data stored in that storage space to the first functional core.
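The dispatch just described can be sketched as a toy model in which a dictionary stands in for the interconnect between functional cores. The class FunctionalCore, its methods, the fabric dictionary, and the mapping global address = core_id × N + k are all assumptions made for illustration.

```python
class FunctionalCore:
    """Toy functional core that serves local reads and forwards remote ones."""

    def __init__(self, core_id: int, n_private: int, fabric: dict):
        self.core_id = core_id
        self.n_private = n_private
        self.memory = [0] * n_private
        self.fabric = fabric            # stand-in for the network on chip
        fabric[core_id] = self

    def read(self, target_core_id: int, address: int) -> int:
        """Handle a read request issued by this core's own data path."""
        if target_core_id == self.core_id:
            return self.memory[address]             # private address, local read
        # Identifier mismatch: the address is global, forward the request.
        return self.fabric[target_core_id].serve_global(address)

    def serve_global(self, global_addr: int) -> int:
        """Handle a read request received from another functional core."""
        owner, private_addr = divmod(global_addr, self.n_private)
        if owner != self.core_id:
            raise ValueError("global address does not belong to this core")
        return self.memory[private_addr]
```

In this model, core 0 reading global address 1 × 8 + 3 from core 1 (with N = 8) returns the value stored at private address 3 of core 1.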
In this embodiment, the data stored in the first functional core can be accessed by the present functional core and other functional cores, and the first functional core can acquire the third target data stored in the second functional core based on the global address.
The following exemplifies the response procedure of the local access request and the global access request, taking the example that the target storage space includes a memory interface (which may also be referred to as a "dual-mode memory interface") for switching the target storage space between the private mode and the global mode:
in this embodiment, the many-core system includes: a first functional core, a second functional core, and a network-on-chip 50 connecting the first functional core and the second functional core.
Wherein, as shown in fig. 4, the first functional core includes: a data path 41, a memory 42, a routing module 43, and a dual-mode memory interface 44, the routing module 43 being connected to the network on chip 50; the dual-mode memory interface 44 includes a target location resolver 441, a mode switch 442, an address mapping table storage module 443, a request signaling packetizer 444, a request signaling depacketizer 445, a data packetizer 446, and a data depacketizer 447.
Specifically, the target location resolver 441 is connected to the data path 41, the memory 42, the mode switch 442, the address mapping table storage module 443, the request signaling packetizer 444, the request signaling depacketizer 445, the data packetizer 446, and the data depacketizer 447, respectively; and the mode switch 442, the address mapping table storage module 443, the request signaling packetizer 444, the request signaling depacketizer 445, the data packetizer 446, and the data depacketizer 447 are each connected to the routing module 43.
It should be noted that the structure of the second functional core may be the same as that of the first functional core, and will not be described herein. In addition, the arrow direction in the embodiment shown in fig. 4 indicates the transmission direction of the signaling or target data in the process that the memory 42 of the first functional core is accessed locally or accessed globally.
In one case, if the dual-mode memory interface 44 is in the private mode, the target location resolver 441 directly accesses the memory 42 according to the private address provided by the data generating unit in the data path 41, that is, the private address of the target data is carried in the second read request, and the memory 42 directly outputs the data packet of the returned target data to the data path 41.
In another case, if the dual-mode memory interface 44 is in the global mode, when the data path 41 sends a read request to the target location resolver 441, the target location resolver 441 is configured to determine whether the destination address in the read request sent by the data path 41 is located in the functional core or another functional core located outside according to the address mapping table stored in the address mapping table storage module 443, so as to determine whether the read request needs to be generated as a local access request or a global access request.
If the destination address in the read request is located in the functional core, it is determined that the read request is a local access request, so that the memory 42 is directly accessed according to the private address in the local access request.
In addition, if the destination address in the read request is located in another, external functional core, the read request is determined to be a global access request. In this case, the target location resolver 441 sends the requested location in the global access request to the request signaling packetizer 444, which assembles a global access request signaling packet and sends it to the routing module 43. The routing module 43 forwards the global access request signaling packet to the network on chip 50, which delivers it to the storage location of the data to be read. When the target functional core where the storage location is located returns a data packet in response to the global access request signaling packet, the routing module 43 receives the returned data packet from the network on chip 50, and the data depacketizer 447 unpacks it to obtain the target data (i.e., the data that the functional core needs to read). The target location resolver 441 then parses the target data and sends it to the memory 42 or the data path 41 according to the parsed information.
Meanwhile, the routing module 43 is also responsible for receiving global access request signaling packets sent by external functional cores to the functional core through the network on chip 50. The request signaling depacketizer 445 unpacks each such packet to obtain the global address, the data length, and other information it carries, and sends this information to the target location resolver 441. The target location resolver 441 translates the global address of the global access request into a private address, accesses the memory 42 through the private address, and returns the access result (i.e., the target data) to the data packetizer 446. After the data packetizer 446 assembles the target data into a data packet, the data packet is sent to the network on chip 50 through the routing module 43, and the requester of the target data receives the data packet from the network on chip 50. Specifically, in the requester, the data packet received by the routing module is unpacked by the data depacketizer, the unpacked data is sent to the target location resolver, and the target location resolver sends the data to the memory or the data path according to the parsed information.
In implementation, the format of the request signaling packet may be as shown in table 1 below:
TABLE 1
Wherein the signaling identifier is used to distinguish between different signaling; the target functional core address indicates the address of the functional core in which the memory storing the data to be accessed is located; the data start global address represents the start global address of the data to be accessed, and the start global address plus the data length gives the end global address of the data to be accessed; the priority is used when a plurality of functional cores access the storage space where the data to be accessed is located at the same time, in which case the target location resolver can arbitrate based on the priority and respond to the access request signaling with the highest priority; and further information may be carried in the additional information field.
It should be noted that the positions of the fields in the request signaling packet may be exchanged, and the packet may include other fields besides the signaling identifier, the target functional core address, the data start global address, the data length, the priority, and the additional information field, which are not exhaustively listed herein.
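One possible encoding of the request signaling packet fields named above is sketched below in Python. The field order, the field widths chosen in the struct format string, and the names RequestPacket and end_global_addr are assumptions made for illustration; the application states only which fields exist and that their positions may be exchanged.

```python
import struct
from dataclasses import dataclass

# Assumed little-endian layout without padding:
# signaling id (1 byte), target core (2), start global address (4),
# data length (2), priority (1), additional information (2) = 12 bytes.
_REQ = struct.Struct("<BHIHBH")


@dataclass
class RequestPacket:
    signaling_id: int
    target_core: int
    start_global_addr: int
    data_length: int
    priority: int
    extra: int = 0

    @property
    def end_global_addr(self) -> int:
        """Start global address plus data length, as described in the text."""
        return self.start_global_addr + self.data_length

    def pack(self) -> bytes:
        return _REQ.pack(self.signaling_id, self.target_core,
                         self.start_global_addr, self.data_length,
                         self.priority, self.extra)

    @classmethod
    def unpack(cls, raw: bytes) -> "RequestPacket":
        return cls(*_REQ.unpack(raw))
```

A request signaling packetizer would correspond to pack and a request signaling depacketizer to unpack in this sketch.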
In addition, in implementation, the format of the data packet may be as shown in table 2 below:
TABLE 2
Wherein the packet identifier is used to distinguish different data packets; the data body carries the specific data (i.e., the data to be accessed) in the data packet; for the meanings of the target functional core address, the data start global address, the data length, and the additional information field, reference may be made to the corresponding fields in the request signaling packet format shown in table 1, which are not described again herein.
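A data packet with a variable-length data body could be assembled and parsed along the following lines. The header layout, the use of bytes as the unit of the data length, the omission of the additional information field, and the function names are assumptions for this sketch only.

```python
import struct

# Assumed header: packet id (1 byte), target core (2),
# start global address (4), data length in bytes (2) = 9 bytes.
_HDR = struct.Struct("<BHIH")


def build_data_packet(packet_id: int, target_core: int,
                      start_global_addr: int, body: bytes) -> bytes:
    """Assemble header + data body, as a data packetizer might."""
    return _HDR.pack(packet_id, target_core, start_global_addr, len(body)) + body


def parse_data_packet(raw: bytes):
    """Split a packet into its header fields and data body."""
    packet_id, target_core, start, length = _HDR.unpack_from(raw)
    body = raw[_HDR.size:_HDR.size + length]
    if len(body) != length:
        raise ValueError("truncated data packet")
    return packet_id, target_core, start, body
```

The data length in the header lets the receiver detect a truncated data body before handing the data on.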
In the related art, when a plurality of pictures need to be processed with one set of weights, the pictures are input into a plurality of functional cores and processed separately. In that scenario, because the set of weights is needed by a plurality of functional cores and the private memory of each functional core can only be accessed by that functional core, the whole set of weights must be stored in every functional core that uses the weights, so that the pictures can be processed layer by layer.
In the embodiment of the application, the set of weights can be stored in only 1 or a few functional cores (for example, each weight value in the set of weights is stored in the functional core needing to use the weight value, and the functional core does not need to store the whole set of weights), and when one functional core needs to use the weight value which is not stored, the weight value can be obtained from other functional cores storing the weight value in a global access mode.
As can be seen from the above, according to the method for accessing stored data provided by the embodiment of the present application, when an algorithm (such as a large neural network) with shared data is operated, the shared data does not need to be copied to the private memory of each functional core, so that resource waste can be reduced, the application range of a many-core system with a large array and shared data is wider, and a high-speed mode and a sparse mode can be simultaneously supported. In addition, unlike the prior art that shared data needs to be transmitted through a shared memory center line, in the embodiment of the application, signaling and data for global access are transmitted through a network-on-chip, a bus-on-chip or an inter-chip network, so that waiting time of signaling and data transmission can be reduced, and the operation efficiency of a many-core system can be improved.
In the embodiment of the application, a first functional core receives a first read request sent by a second functional core, wherein the first read request carries an identifier of the second functional core and a target global address; responding to the first reading request, determining a first private address corresponding to the target global address, and transmitting first target data corresponding to the first private address in the target storage space to the second functional core; the first functional core stores a mapping relation between the target global address and the first private address in advance, and a target storage space corresponding to the first private address is located in the first functional core. Therefore, the private storage space in the first functional core can be used as the shared storage space to be accessed by the second functional core, and different shared data can be stored in the private storage spaces of different functional cores in a scattered mode, so that the problems that the waiting time is long and the waiting time is uncertain when a plurality of functional cores access the global shared storage space respectively due to the fact that a large amount of shared data are stored in the global shared storage space are avoided, and the operation efficiency of the many-core system is improved.
It should be noted that, the method for accessing storage data provided by the embodiment of the present application may be applied to a memory interface, where the memory interface may be a memory interface located in a first functional core in a many-core system (which may also be referred to as a "dual-mode memory interface"). As shown in fig. 4, the memory interface 44 may include:
A target location resolver 441, configured to determine, when a target storage space indicated by a target global address carried by a first read request is located in a first functional core, a first private address corresponding to the target global address, so as to access first target data corresponding to the first private address in the target storage space based on the first private address;
Wherein the target location resolver 441 stores in advance a mapping relation between the target global address and the first private address, and the target storage space is located in the first functional core.
In an implementation, the memory interface 44 may further include:
A data unpacker 446, if the first read request is sent by the second functional core, the data unpacker 446 is configured to parse the first read request received from the second functional core to obtain the target global address carried by the first read request;
the target location resolver 441 is connected to the data unpacker 446, and the target location resolver 441 obtains the target global address from the data unpacker 446.
In this embodiment, the first read request has the same meaning as the first read request in the method embodiment shown in fig. 1, and the mapping relationship between the global address and the private address is the same as the mapping relationship between the global address and the private address in the method embodiment shown in fig. 1, which is not described herein.
Optionally, the target location resolver 441 is further configured to resolve a second read request received to obtain a second private address carried by the second read request, and access second target data corresponding to the second private address in the target storage space based on the second private address, where the second read request is sent by the first functional core.
Optionally, the memory interface as shown in fig. 4 further includes:
A mode switch 442, the mode switch 442 being connected to the target position resolver 441;
the mode switch 442 is configured to control the target storage space to be in a first working mode and/or a second working mode;
Wherein, in the first operation mode, the target location resolver 441 converts the global address in the received access request into a private address, so as to access the data in the target storage space based on the private address;
In the second mode of operation, the target location resolver 441 accesses data in the target memory space based on a private address in the received access request.
Optionally, when a global address is carried in the received read request for the target storage space, the mode switcher 442 controls the target storage space to be in the first working mode;
The mode switch 442 controls the target storage space to be in the second operation mode when a read request for the target storage space is received and a private address is carried in the read request.
Optionally, the target location resolver 441 is further configured to resolve a third read request sent by the data path in the first functional core, so as to obtain a request address carried by the third read request;
The memory interface 44 further includes:
A signaling packetizer 444, configured to, when the storage space indicated by the request address is located in the second functional core, generate a fourth read request based on the request address and send the fourth read request to the second functional core through the first functional core, wherein the target global address carried in the fourth read request is the global address corresponding to the request address, and the second functional core is a functional core different from the first functional core in the many-core system.
The signaling packetizer 444 may also be referred to as a "request signaling packetizer".
The request address carried by the third read request may be an address generated by a data address generating unit in a data path in the first functional core according to a location where data to be accessed is stored.
In addition, when the second functional core receives the fourth read request from the first functional core, the second functional core processes the fourth read request in the same way that the first functional core processes the first read request, which is not repeated here.
Optionally, as shown in fig. 4, the memory interface 44 further includes:
A storage unit 443 storing an address mapping table, the storage unit 443 being configured to store the mapping relationship between the global address and the private address.
In implementation, when the storage address (private address) of the target data carried in the global access request sent by the functional core is located in another functional core, the global address corresponding to the storage address of the target data may be determined based on the mapping relationship between the global address and the private address stored in the storage unit 443.
In addition, when the functional core receives an access request carrying a global address, the private address corresponding to the global address can be determined based on the mapping relationship between the global address and the private address stored in the storage unit 443, so that the target storage space in the functional core is accessed based on the private address.
Accordingly, as shown in fig. 4, the memory interface 44 further includes:
The signaling unpacker 445 is configured to parse the received access request to obtain access information carried in the access request, for example: global addresses or private addresses, etc. where the target data to be accessed is stored.
As shown in fig. 4, the above-mentioned signaling unpacker 445 may be referred to as a "request signaling unpacker".
Optionally, as shown in fig. 4, the memory interface 44 further includes:
A data packetizer 447 for packetizing the data indicated for access by the access request for transmission to a requester of the access request.
Accordingly, as shown in fig. 4, the memory interface 44 further includes:
The data unpacker 446 is configured to parse the received data. Its function is the counterpart of that of the data packetizer 447, so that the parsed data is convenient for the user to read.
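The relationship between a data packetizer and a data unpacker can be illustrated as a pack/unpack round trip. The packet layout below (a two-field header holding a requester identifier and a payload length) is purely an assumption for illustration; the application does not specify a packet format, and the function names are hypothetical.

```python
import struct

# Assumed header layout: requester id and payload length, both 16-bit little-endian.
HEADER = struct.Struct("<HH")


def pack_data(requester_id: int, payload: bytes) -> bytes:
    """Packetizer side: prepend a header so the data can be routed to the requester."""
    return HEADER.pack(requester_id, len(payload)) + payload


def unpack_data(packet: bytes):
    """Unpacker side: recover the requester id and the payload from a packet."""
    requester_id, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    return requester_id, payload


packet = pack_data(3, b"\x01\x02")
```

Unpacking `packet` returns the same requester identifier and payload that were packed, which is the sense in which the two units correspond to each other.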
The memory interface provided in the embodiment of the present application can execute each process in the embodiment of the method shown in fig. 1, and can obtain the same beneficial effects, so that repetition is avoided, and no further description is given here.
The embodiment of the present application also provides a functional core, which includes a memory and a memory interface connected to the memory, where the memory interface is the memory interface provided in the preceding memory interface embodiment.
Optionally, the functional core includes a plurality of target storage spaces, and the memory interface is connected to each of the plurality of target storage spaces;
the memory interface is configured to control the working mode of each target storage space separately, and the working modes of different target storage spaces in the functional core are not all the same.
The target storage spaces in the functional core provided by the embodiment of the present application can be in different working modes, so that a target storage space is either accessed privately by the functional core itself or accessed globally by an external functional core. The functional core can also execute each process in the method embodiment shown in fig. 1 and obtain the same beneficial effects; to avoid repetition, details are not described again here.
The embodiment of the application also provides a many-core system which comprises a plurality of functional cores provided by the embodiment, and any two functional cores in the many-core system are in communication connection.
Optionally, any two functional cores in the many-core system are connected through a network on chip or a bus on chip.
The many-core system provided by the embodiment of the application does not need to store shared data in each functional core or set an independent shared storage space, can execute each process in the method embodiment shown in fig. 1, and can obtain the same beneficial effects, and is not repeated here.
It should be noted that the storage data access method provided in the embodiment of the present application may be executed by a storage data access device, or by a control module in the storage data access device for executing the storage data access method. In the embodiment of the present application, the storage data access device provided by the embodiment of the present application is described by taking the storage data access device executing the storage data access method as an example.
Referring to fig. 5, which is a block diagram of a storage data access device according to an embodiment of the present application, the storage data access device 500 is applied to any of the memory interfaces according to the embodiment of the present application. As shown in fig. 5, the storage data access device 500 includes:
a first receiving module 501, configured to receive a first read request, where the first read request carries a target global address;
The first transmission module 502 is configured to determine, in response to the first read request, a first private address corresponding to the target global address according to a mapping relationship between the global address and the private address when a target storage space indicated by the target global address is located in the first functional core, and transmit first target data corresponding to the first private address in the target storage space.
Optionally, the first read request includes at least one of:
a read request sent by a first data path of the first functional core;
A read request sent by the second functional core;
Wherein the second functional core is a different functional core in the many-core system than the first functional core.
Optionally, the stored data access device 500 further includes:
The second receiving module is used for receiving a second read request sent by a first data path, wherein the first data path is positioned in the first functional core, and the second read request carries a second private address;
And the second transmission module is used for responding to the second reading request and transmitting second target data corresponding to the second private address in the target storage space to the first data path.
Optionally, the first read request is a read request sent by a first data path of the first functional core, and the storage data access device 500 further includes:
a sending module, configured to send the first read request to a second functional core when it is determined that the storage space indicated by the target global address is located in the second functional core, where the second functional core is a functional core different from the first functional core in the many-core system;
and the receiving module is used for receiving third target data returned by the second functional core and transmitting the third target data to the first data path.
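The local-or-forward behaviour of the modules above can be sketched as follows. This is a simplified software model under stated assumptions: the class `Core`, the shared dictionary `global_map`, and the direct method call that stands in for the network-on-chip or on-chip bus are all hypothetical.

```python
class Core:
    """Hypothetical functional core whose memory interface serves global reads."""

    def __init__(self, core_id, memory, global_map):
        self.core_id = core_id
        self.memory = memory          # private address -> data
        self.global_map = global_map  # global address -> (core id, private address)
        self.peers = {}               # core id -> Core, stands in for the interconnect

    def read_global(self, global_addr):
        owner, private_addr = self.global_map[global_addr]
        if owner == self.core_id:
            # Target storage space is in this core: translate and read locally.
            return self.memory[private_addr]
        # Target storage space is in another core: forward the request and
        # hand the returned data back to the requesting data path.
        return self.peers[owner].read_global(global_addr)


global_map = {0x10: (0, 1), 0x20: (1, 3)}
core0 = Core(0, {1: "a"}, global_map)
core1 = Core(1, {3: "b"}, global_map)
core0.peers[1] = core1
core1.peers[0] = core0
```

Here `core0.read_global(0x10)` is served from the first core's own storage space, while `core0.read_global(0x20)` is forwarded to the second core and the returned data is passed back.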
Optionally, the stored data access device 500 further includes:
and the storage module is used for storing the mapping relation between the global address and the private address.
Optionally, the storage module is specifically configured to:
in a case where the many-core system includes S chips, store the mapping relationship according to the global address corresponding to each private address, where each global address indicates the target functional core in which the corresponding private address is located and the target chip in which that target functional core is located;
or
in a case where the many-core system includes one chip and the chip includes a functional core array arranged in J rows and P columns, store the mapping relationship according to the global address corresponding to each private address, where each global address indicates the target functional core in which the corresponding private address is located and the position of that target functional core in the functional core array.
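One possible way for a global address to indicate the position of the target functional core in a J-row, P-column array is to reserve bit fields for the row, the column, and the private address. The sketch below is an assumption made for illustration only: the field widths, the field order, and the function names are not specified in the application, which only requires that the mapping relationship be stored.

```python
# Assumed field widths; the application does not specify any of them.
PRIVATE_BITS = 16  # width of a private address
COL_BITS = 4       # enough for up to 16 columns (P <= 16)
ROW_BITS = 4       # enough for up to 16 rows (J <= 16)


def encode_global(row, col, private_addr):
    """Build a global address from the core position and the private address."""
    return (row << (COL_BITS + PRIVATE_BITS)) | (col << PRIVATE_BITS) | private_addr


def decode_global(global_addr):
    """Recover (row, col, private address) from a global address."""
    private_addr = global_addr & ((1 << PRIVATE_BITS) - 1)
    col = (global_addr >> PRIVATE_BITS) & ((1 << COL_BITS) - 1)
    row = (global_addr >> (PRIVATE_BITS + COL_BITS)) & ((1 << ROW_BITS) - 1)
    return row, col, private_addr


addr = encode_global(row=2, col=3, private_addr=0x0040)
```

For a system with S chips, a further chip-identifier field could be placed above the row field in the same way; that extension is likewise an assumption.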
Optionally, the working modes of the target storage space include a first working mode and a second working mode;
In the first working mode, the memory interface converts a global address in a received access request into a private address so as to access data in the target storage space based on the private address; in the second operating mode, the memory interface accesses data in the target memory space based on a private address in the received access request;
the stored data access device 500 further comprises at least one of:
The first determining module is used for determining the working mode of the target storage space according to the control instruction;
A second determining module, configured to determine, in response to the first read request, that the operation mode is the first operation mode;
And the third determining module is used for responding to the second reading request and determining that the working mode is the second working mode.
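The three ways of determining the working mode can be sketched in a single function. The names `Mode` and `determine_mode`, the string labels for the address kind, and in particular the rule that an explicit control instruction takes precedence over the request type are assumptions; the application only states that at least one of the determining modules is present.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    FIRST = "translate global address to private address"
    SECOND = "use private address directly"


def determine_mode(address_kind: str, control_instruction: Optional[Mode] = None) -> Mode:
    """Pick the working mode of a target storage space (illustrative only)."""
    if control_instruction is not None:
        # Assumed precedence: an explicit control instruction wins.
        return control_instruction
    if address_kind == "global":
        # A first read request carries a global address.
        return Mode.FIRST
    if address_kind == "private":
        # A second read request carries a private address.
        return Mode.SECOND
    raise ValueError(f"unknown address kind: {address_kind!r}")
```

For example, `determine_mode("global")` selects the first working mode, and `determine_mode("global", Mode.SECOND)` lets the control instruction decide instead.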
Optionally, in a case where the difference between the receiving times of the first read request and the second read request is less than a preset time, it is determined, by arbitration or according to a preset priority, to respond to at least one of the first read request and the second read request.
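A preset-priority rule for two read requests that arrive close together can be sketched as below. The function name `service_order`, the representation of a request as an (arrival time, name) pair, and the decision to serve both requests in priority order are assumptions for illustration; the application leaves the arbitration scheme open.

```python
def service_order(first_req, second_req, preset_time, priority):
    """Return the request names in the order they would be served.

    Each request is an (arrival_time, name) pair; `priority` lists the names
    from highest to lowest preset priority.
    """
    t1, name1 = first_req
    t2, name2 = second_req
    if abs(t1 - t2) < preset_time:
        # The requests conflict: the preset priority decides the order.
        return sorted([name1, name2], key=priority.index)
    # No conflict: serve the requests in order of arrival.
    return [name for _, name in sorted([first_req, second_req])]


priority = ["second", "first"]  # assumed: private (second) requests win conflicts
```

With `preset_time=5`, requests arriving at times 10 and 11 are ordered by priority, whereas requests arriving at times 10 and 30 are served in arrival order.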
Optionally, the first read request and the first target data are transmitted over a network-on-chip or a bus-on-chip.
The storage data access device 500 provided in the embodiment of the present application can execute each process executed by the memory interface in the embodiment of the method shown in fig. 1, and can improve the operation efficiency of the many-core system and save the storage resources, and has the same advantages as the embodiment of the method shown in fig. 1, and for avoiding repetition, the description is omitted here.
The storage data access device in the embodiment of the present application may be a device, or may be a component, an integrated circuit, or a chip in a terminal. The device may be a mobile electronic device or a non-mobile electronic device. By way of example, the mobile electronic device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a vehicle-mounted electronic device, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA), and the non-mobile electronic device may be a personal computer (PC), an automated teller machine, a self-service machine, or the like; the embodiments of the present application are not specifically limited in this respect.
The storage data access device provided by the embodiment of the present application can implement each process implemented by the method embodiment shown in fig. 1, and in order to avoid repetition, a description is omitted here.
Optionally, as shown in fig. 6, an embodiment of the present application further provides an electronic device 600, including a processor 601, a memory 602, and a program or an instruction stored in the memory 602 and capable of running on the processor 601, where the program or instruction, when executed by the processor 601, implements each process of the above stored data access method embodiment and can achieve the same technical effect; to avoid repetition, details are not described again here.
It should be noted that, the electronic device in the embodiment of the present application includes the mobile electronic device and the non-mobile electronic device described above.
The embodiment of the application also provides a readable storage medium, on which a program or an instruction is stored, which when executed by a processor, implements each process of the above-described stored data access method embodiment, and can achieve the same technical effects, and in order to avoid repetition, the description is omitted here.
The processor is the processor in the electronic device described in the above embodiment. The readable storage medium includes a computer-readable storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The embodiment of the present application further provides a chip. The chip includes a processor and a communication interface coupled with the processor. The processor is configured to run a program or instructions to implement each process of the above stored data access method embodiment and can achieve the same technical effect; to avoid repetition, details are not described again here.
It should be understood that the chip referred to in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip chip.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, but may also include performing the functions in a substantially simultaneous manner or in an opposite order depending on the functions involved, e.g., the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, and may of course also be implemented by hardware, although in many cases the former is the preferred implementation. Based on such understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product. The software product is stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) and includes instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the method according to the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those having ordinary skill in the art without departing from the spirit of the present application and the scope of the claims, which are to be protected by the present application.

Claims (20)

1. A memory interface for use in a many-core system, the memory interface being located within a first functional core in the many-core system, the memory interface comprising:
A target location resolver, configured to determine a first private address corresponding to a target global address carried by a first read request in a case where a target storage space indicated by the target global address is located in the first functional core, so as to access first target data corresponding to the first private address in the target storage space based on the first private address; the target location resolver stores a mapping relation between the target global address and the first private address in advance;
A mode switcher connected with the target location resolver; the mode switcher is used for controlling the target storage space to be in a first working mode and/or a second working mode;
In the first working mode, the target location resolver converts a global address in a received access request into a private address so as to access data in the target storage space based on the private address; in the second working mode, the target location resolver accesses data in the target storage space based on a private address in the received access request.
2. The memory interface of claim 1, wherein the target location resolver is further configured to resolve a received second read request to obtain a second private address carried by the second read request, and access second target data corresponding to the second private address in the target storage space based on the second private address, wherein the second read request is sent by the first functional core.
3. The memory interface of claim 1, wherein:
Under the condition that a global address is carried in a received reading request of the target storage space, the mode switcher controls the target storage space to be in the first working mode;
and under the condition that a read request of the target storage space carries a private address, the mode switcher controls the target storage space to be in the second working mode.
4. The memory interface of claim 1, wherein the target location resolver is further configured to resolve a third read request sent by a data path in the first functional core to obtain a request address of the third read request;
The memory interface further comprises:
A signaling packetizer, configured to generate a fourth read request based on the request address and send the fourth read request to the second functional core through the first functional core in a case where the storage space indicated by the request address is located in the second functional core, wherein a target global address carried in the fourth read request is a global address corresponding to the request address, and the second functional core is a functional core different from the first functional core in the many-core system.
5. A functional core, characterized in that the functional core comprises a memory and a memory interface connected to the memory, the memory interface being a memory interface according to any of claims 1-4.
6. The functional core according to claim 5, wherein the number of target storage spaces included in the functional core is plural, and the memory interface is connected to the plural target storage spaces, respectively;
The memory interface is used for controlling the working mode of each target storage space respectively.
7. A many-core system, comprising a plurality of functional cores according to claim 5 or 6, wherein any two of the functional cores in the many-core system are communicatively coupled.
8. The many-core system of claim 7, wherein any two functional cores in the many-core system are connected through a network-on-chip or a bus-on-chip.
9. A method of accessing stored data, applied to a memory interface as claimed in any one of claims 1 to 4, the method comprising:
receiving a first read request, wherein the first read request carries a target global address;
And responding to the first reading request, determining a first private address corresponding to the target global address according to the mapping relation between the global address and the private address when the target storage space indicated by the target global address is positioned in the first functional core, and transmitting first target data corresponding to the first private address in the target storage space.
10. The method of claim 9, wherein the first read request comprises at least one of:
a read request sent by a first data path of the first functional core;
A read request sent by the second functional core;
Wherein the second functional core is a different functional core in the many-core system than the first functional core.
11. The method of claim 9, further comprising:
receiving a second read request sent by a first data path, wherein the first data path is positioned in the first functional core, and the second read request carries a second private address;
And responding to the second reading request, and transmitting second target data corresponding to the second private address in the target storage space to the first data path.
12. The method of claim 9, wherein the first read request is a read request sent by a first datapath of the first functional core, the method further comprising:
when the storage space indicated by the target global address is determined to be located in a second functional core, sending the first read request to the second functional core, wherein the second functional core is a functional core different from the first functional core in the many-core system;
and receiving third target data returned by the second functional core, and transmitting the third target data to the first data path.
13. The method of claim 9, further comprising:
And storing the mapping relation between the global address and the private address.
14. The method of claim 13, wherein storing the mapping relationship between the global address and the private address comprises:
in a case where the many-core system includes S chips, storing the mapping relationship according to the global address corresponding to each private address, where each global address indicates the target functional core in which the corresponding private address is located and the target chip in which that target functional core is located;
or
in a case where the many-core system includes one chip and the chip includes a functional core array arranged in J rows and P columns, storing the mapping relationship according to the global address corresponding to each private address, where each global address indicates the target functional core in which the corresponding private address is located and the position of that target functional core in the functional core array.
15. The method of claim 11, wherein the operating modes of the target storage space include a first operating mode and a second operating mode;
In the first working mode, the memory interface converts a global address in a received access request into a private address so as to access data in the target storage space based on the private address; in the second operating mode, the memory interface accesses data in the target memory space based on a private address in the received access request;
the method further comprises at least one of:
determining the working mode of the target storage space according to the control instruction;
determining, in response to the first read request, that the operating mode is the first operating mode;
and responding to the second reading request, and determining the working mode to be the second working mode.
16. The method of claim 11, wherein:
and in the case that the receiving time difference between the first reading request and the second reading request is smaller than the preset time, determining to respond to at least one of the first reading request and the second reading request through arbitration or preset priority.
17. The method of claim 9, wherein the first read request and the first target data are transmitted over a network-on-chip or a bus-on-chip.
18. A storage data access device, applied to a memory interface as claimed in any one of claims 1 to 4, the device comprising:
The first receiving module is used for receiving a first reading request, wherein the first reading request carries a target global address;
The first transmission module is used for responding to the first reading request, determining a first private address corresponding to the target global address according to the mapping relation between the global address and the private address when the target storage space indicated by the target global address is located in the first functional core, and transmitting first target data corresponding to the first private address in the target storage space.
19. An electronic device comprising a processor, a memory and a program or instruction stored on the memory and executable on the processor, which when executed by the processor implements the steps of the stored data access method of any one of claims 9 to 17.
20. A readable storage medium, characterized in that the readable storage medium has stored thereon a program or instructions which, when executed by a processor, implement the steps of the stored data access method according to any of claims 9-17.
CN202110309611.4A 2021-03-23 2021-03-23 Memory interface, functional core, many-core system and memory data access method Active CN115114192B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110309611.4A CN115114192B (en) 2021-03-23 2021-03-23 Memory interface, functional core, many-core system and memory data access method
PCT/CN2022/079235 WO2022199357A1 (en) 2021-03-23 2022-03-04 Data processing method and apparatus, electronic device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110309611.4A CN115114192B (en) 2021-03-23 2021-03-23 Memory interface, functional core, many-core system and memory data access method

Publications (2)

Publication Number Publication Date
CN115114192A CN115114192A (en) 2022-09-27
CN115114192B true CN115114192B (en) 2024-06-14

Family

ID=83323938

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110309611.4A Active CN115114192B (en) 2021-03-23 2021-03-23 Memory interface, functional core, many-core system and memory data access method

Country Status (1)

Country Link
CN (1) CN115114192B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115525583B (en) * 2022-11-29 2023-04-07 太初(无锡)电子科技有限公司 Memory data access method of many-core processor

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102326145A (en) * 2011-08-10 2012-01-18 华为技术有限公司 Reset vector code realization method, system and apparatus

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2441005A2 (en) * 2009-06-09 2012-04-18 Martin Vorbach System and method for a cache in a multi-core processor
WO2013186694A2 (en) * 2012-06-11 2013-12-19 Stefanos Kaxiras System and method for data classification and efficient virtual cache coherence without reverse translation
IN2013CH04449A (en) * 2013-09-30 2015-04-03 Empire Technology Dev Llc
KR101815171B1 (en) * 2013-10-31 2018-01-04 인텔 코포레이션 A method, apparatus and system for dynamically controlling an addressing mode for a cache memory
CN105740164B (en) * 2014-12-10 2020-03-17 阿里巴巴集团控股有限公司 Multi-core processor supporting cache consistency, reading and writing method, device and equipment
CN104699631B (en) * 2015-03-26 2018-02-02 中国人民解放军国防科学技术大学 It is multi-level in GPDSP to cooperate with and shared storage device and access method
CN105677580B (en) * 2015-12-30 2019-04-12 杭州华为数字技术有限公司 The method and apparatus of access cache
CN107229593B (en) * 2016-03-25 2020-02-14 华为技术有限公司 Cache consistency operation method of multi-chip multi-core processor and multi-chip multi-core processor
CN108628676A (en) * 2017-03-16 2018-10-09 哈尔滨英赛克信息技术有限公司 A kind of memory management device and method towards multiple nucleus system
US10482024B2 (en) * 2017-07-20 2019-11-19 Alibaba Group Holding Limited Private caching for thread local storage data access
US10860487B2 (en) * 2019-04-17 2020-12-08 Chengdu Haiguang Integrated Circuit Design Co. Ltd. Multi-core processing device and method of transferring data between cores thereof
US11403234B2 (en) * 2019-06-29 2022-08-02 Intel Corporation Cryptographic computing using encrypted base addresses and used in multi-tenant environments
CN110704362B (en) * 2019-09-12 2021-03-12 无锡江南计算技术研究所 Processor array local storage hybrid management method

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102326145A (en) * 2011-08-10 2012-01-18 华为技术有限公司 Reset vector code realization method, system and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
众核***私有存储自适应共享化架构设计与实现;叶英;刘佩林;;计算机与现代化;20130415(第04期);48-52 *

Also Published As

Publication number Publication date
CN115114192A (en) 2022-09-27


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant