CN114970409A

CN114970409A - Integrated circuit based on multi-die interconnection

Info

Publication number: CN114970409A
Application number: CN202210890029.6A
Authority: CN
Inventors: 马恺声; 伍毅夫; 谭展宏; 张延年; 张武科
Original assignee: Arctic Xiongxin Information Technology Xi'an Co ltd
Current assignee: Arctic Xiongxin Information Technology Xi'an Co ltd
Priority date: 2022-07-27
Filing date: 2022-07-27
Publication date: 2022-08-30

Abstract

Embodiments of the present disclosure relate to an integrated circuit based on multi-die interconnection, including: a substrate comprising an interconnect structure; a first die comprising a control module, a communication module, a storage module, and a D2D interconnect component, and configured to control or schedule a second die; and one or more second dies, each comprising a D2D interconnect component and an IP core with a particular circuit function, and configured to: communicate with the first die via its D2D interconnect component and the D2D interconnect component of the first die; and receive data from the first die, perform the operation of the IP core based on the particular circuit function, and send the result of the operation to the first die, wherein the first die and the second die are packaged on the substrate via the interconnect structure.

Description

Integrated circuit based on multi-die interconnection

Technical Field

Embodiments of the present disclosure generally relate to the field of semiconductors, and more particularly, to a multi-die interconnect based integrated circuit.

Background

A conventional single-die system on chip (SoC) is designed to integrate the components required by a system into one chip; for example, integrated circuits that originally had different functions are integrated into a single chip. This approach reduces both the chip volume and the distance between the different integrated circuits, which improves the computing speed of the chip. Such an SoC is characterized by a large hardware scale and is usually designed on the basis of IP cores.

IP cores (intellectual property cores) are divided into analog IP cores and digital IP cores. A digital IP core is a hardware description language program with a specific circuit function; the program is independent of the integrated circuit process and can be ported to different semiconductor processes to produce integrated circuit chips. Analog IP cores are mostly PLLs or various interfaces; they are built directly on the circuit library of a particular process and are therefore tied to that integrated circuit process.

Whether digital or analog, IP cores are mature circuits with a general or common function that the IP core provider has already implemented on some process. Traditional SoC development is based on IP modules: developers purchase commonly used IP cores such as PLL, DDR, and PCIe from different IP providers and then integrate them on one chip together with self-developed algorithm modules to complete the SoC.

However, in the era of AI and 5G, chips need higher computing performance and more processor cores, and must meet requirements for high bandwidth, low latency, and a large number of wires. Chips therefore need a higher degree of integration and more memory, so more and more IP cores must be integrated during chip development and the integrated architecture becomes more and more complex. Because algorithms change constantly, the development cycle demanded of a chip also becomes shorter and shorter. Meanwhile, as Moore's law approaches its end and process nodes move toward the physical limits of 3 nm and 1 nm, the development cost of chips rises sharply.

In summary, the conventional single-die SoC development scheme has the following disadvantages: the research and development cycle of a traditional single-die SoC chip is long; development or iteration of a traditional single-die SoC chip requires IP cores to be purchased, or purchased repeatedly, which results in high economic cost; development requires a great deal of manpower for IP core integration, which results in high labor cost; and development carries risks related to IP core integration and IP core quality.

Disclosure of Invention

In view of the above, the present disclosure provides an integrated circuit based on multi-die interconnection, which greatly reduces the labor cost and time cost of developing complex chips that involve IP cores.

According to a first aspect of the present disclosure, there is provided a multi-die interconnect based integrated circuit comprising: a substrate comprising an interconnect structure; a first die comprising a control module, a communication module, a storage module, and a D2D interconnect component, and configured to control or schedule a second die; and one or more second dies, each comprising a D2D interconnect component and an IP core with a particular circuit function, and configured to: communicate with the first die via its D2D interconnect component and the D2D interconnect component of the first die; and receive data from the first die, perform the operation of the IP core based on the particular circuit function, and send the result of the operation to the first die, wherein the first die and the second die are packaged on the substrate via the interconnect structure.

In one embodiment, the D2D interconnect component includes an outbound data path module, an on-chip data path module, an interconnect data path module, and a D2D interface, wherein when the first die receives incoming data from a device external to the integrated circuit, the D2D interconnect component is configured to: resolve the destination operation address of the data to determine the target die to which the data is to be transmitted; in response to the data being destined for the first die, translate the destination operation address into an on-die address of the first die, thereby saving the data on the first die; and in response to the data being destined for the second die, translate the destination operation address into an on-die address of the second die, thereby forwarding the data to the second die.
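As a non-limiting illustration, the address handling described in this embodiment may be sketched in Python as follows. The address layout (upper bits select the die, lower bits give the on-die address), the constant names, and the `Die` class are assumptions made for this sketch only and are not specified by the disclosure.

```python
from dataclasses import dataclass, field

# Hypothetical layout: upper bits of the destination operation address select
# the die, lower bits are the on-die address. Not taken from the disclosure.
DIE_SHIFT = 32
OFFSET_MASK = (1 << DIE_SHIFT) - 1
FIRST_DIE_ID = 0


@dataclass
class Die:
    die_id: int
    memory: dict = field(default_factory=dict)  # on-die address -> data


def resolve(dest_op_addr: int) -> tuple[int, int]:
    """Split a destination operation address into (target die, on-die address)."""
    return dest_op_addr >> DIE_SHIFT, dest_op_addr & OFFSET_MASK


def handle_incoming(dies: dict[int, Die], dest_op_addr: int, data: bytes) -> int:
    """Save external data on the first die or forward it to a second die."""
    target, on_die_addr = resolve(dest_op_addr)
    if target not in dies:
        raise ValueError(f"unknown die {target}")
    # Writing into the target's memory stands in for either the local save
    # (first die) or the transfer over the D2D interface (second die).
    dies[target].memory[on_die_addr] = data
    return target


dies = {FIRST_DIE_ID: Die(FIRST_DIE_ID), 1: Die(1)}
target = handle_incoming(dies, (1 << DIE_SHIFT) | 0x100, b"frame")
```

In this sketch a single lookup decides between the local save and the forward; a hardware implementation would more likely use a configurable address map than a fixed bit split.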

In one embodiment, the D2D interconnect component includes an outbound data path module, an on-chip data path module, an interconnect data path module, and a D2D interface, wherein when the first die sends stored data, the D2D interconnect component is configured to: obtain the data from the storage module of the first die based on a source address of the data; resolve the destination operation address of the data to determine the second die that holds the IP core to which the data is to be transmitted; and forward the data to the second die based on the resolved destination operation address.
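By way of example only, the three steps of this embodiment (read from the storage module, resolve the destination, forward) may be sketched as below. The address windows, the die names, and the `links` queues that stand in for the D2D transfer are hypothetical and chosen for this sketch.

```python
import bisect

# Hypothetical address map: one (base, size, second die) window per IP core.
ADDRESS_MAP = [
    (0x1000_0000, 0x1000_0000, "side_die_0"),
    (0x2000_0000, 0x1000_0000, "side_die_1"),
]
BASES = [base for base, _, _ in ADDRESS_MAP]


def lookup(dest_op_addr: int) -> tuple[str, int]:
    """Return (second die, on-die address) for a destination operation address."""
    i = bisect.bisect_right(BASES, dest_op_addr) - 1
    if i < 0:
        raise ValueError("address below every mapped window")
    base, size, die = ADDRESS_MAP[i]
    if dest_op_addr >= base + size:
        raise ValueError("address not mapped to any second die")
    return die, dest_op_addr - base


def send_stored(hub_memory: dict, src_addr: int, dest_op_addr: int, links: dict) -> str:
    """Read data from the first die's memory and forward it to the resolved second die."""
    data = hub_memory[src_addr]                   # step 1: fetch by source address
    die, on_die_addr = lookup(dest_op_addr)       # step 2: resolve the target die
    links[die].append((on_die_addr, data))        # step 3: stands in for the D2D transfer
    return die


links = {"side_die_0": [], "side_die_1": []}
sent_to = send_stored({0x40: b"weights"}, 0x40, 0x2000_0010, links)
```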

In one embodiment, the D2D interconnect component includes an outbound data path module, an on-chip data path module, an interconnect data path module, and a D2D interface, wherein when the second die reads data in the first die, the D2D interconnect component is configured to: send the on-die address of the data to the second die, so that the second die can read the data in the first die.

In some embodiments, the control module of the first die is configured to implement an IP core having a die control function; the communication module of the first die is configured to implement an IP core for data interaction with devices outside the integrated circuit and an IP core for data interaction with other dies within the integrated circuit; and the storage module of the first die is configured to implement an IP core for data interaction with a memory or memory chips.

In some embodiments, the D2D interface is any one of a low speed parallel interface, a medium speed serial interface, and a high speed serial interface.

In some embodiments, the first die and the second die being packaged on the substrate via the interconnect structure comprises: packaging the first die and the second die on the substrate through substrate interconnection, redistribution layer (RDL) interconnection, or interposer interconnection.

In some embodiments, the second die is iteratively developed based on different IP cores, and the first die is reused after each iterative development of the second die.

According to a second aspect of the present disclosure, there is provided a multi-die interconnect based integrated circuit comprising: a substrate comprising an interconnect structure; a first die comprising a control module, a communication module, a storage module, and a D2D interconnect component, and configured to control or schedule the second die and the third die; one or more second dies, each comprising a D2D interconnect component and an IP core having a particular circuit function, and configured to: communicate with the first die via its D2D interconnect component and the D2D interconnect component of the first die, or communicate with a third die via its D2D interconnect component and the D2D interconnect component of the third die; and receive data from the first die or the third die, perform the operation of the IP core based on the particular circuit function, and send the result of the operation to the first die or the third die; and one or more third dies, each comprising a control module, a communication module, a storage module, and a D2D interconnect component, and configured to: communicate with any of the first die and the other third dies via the D2D interconnect component; and receive data from the first die or another third die, send the received data to a second die interconnected with the third die, receive the result of the operation of the second die, and send the result to the first die or another third die, wherein the first die, the second die, and the third die are packaged on the substrate via the interconnect structure.

In some embodiments, the D2D interconnect components of the first die and the third die each include an outbound data path module, an on-chip data path module, an interconnect data path module, and a D2D interface, wherein when the first die receives incoming data from a device external to the integrated circuit, the D2D interconnect components are configured to: resolve the destination operation address of the data to determine the target die to which the data is to be transmitted; in response to the data being destined for the first die, translate the destination operation address into an on-die address of the first die, thereby saving the data on the first die; in response to the data being destined for a second die, determine the relay die interconnected with that second die; in response to the second die being interconnected with the first die, translate the destination operation address into an on-die address of the second die, thereby forwarding the data to the target second die; in response to the second die being interconnected with a third die, translate the destination operation address into an on-die address of that third die, thereby forwarding the data to the relay third die; in response to the data being destined for a third die, determine the relay die interconnected with that third die; in response to the third die being interconnected with the first die, translate the destination operation address into an on-die address of the third die, thereby forwarding the data to the target third die; and in response to the third die being interconnected with another third die, translate the destination operation address into an on-die address of the other third die serving as the relay die, so that the data is forwarded through it to the target third die.

In one embodiment, the D2D interconnect components of the first die and the third die each include an outbound data path module, an on-chip data path module, an interconnect data path module, and a D2D interface, wherein when the first die transmits data stored in the first die, the D2D interconnect components are configured to: obtain the data from the storage module of the first die based on a source address of the data; resolve the destination operation address of the data to determine the target die to which the data is to be transmitted; in response to the data being destined for a second die, determine the relay die interconnected with that second die; in response to the second die being interconnected with the first die, translate the destination operation address into an on-die address of the second die, thereby forwarding the data to the target second die; in response to the second die being interconnected with a third die, translate the destination operation address into an on-die address of that third die, thereby forwarding the data to the relay third die; in response to the data being destined for a third die, determine the relay die interconnected with that third die; in response to the third die being interconnected with the first die, translate the destination operation address into an on-die address of the third die, thereby forwarding the data to the target third die; and in response to the third die being interconnected with another third die, translate the destination operation address into an on-die address of the other third die serving as the relay die, so that the data is forwarded through it to the target third die.
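As a non-limiting illustration of how a relay die might be determined when the target die is not directly interconnected with the first die, the following Python sketch runs a breadth-first search over a die topology. The topology, the die names, and the use of breadth-first search are assumptions of this sketch; the disclosure does not prescribe a particular route-selection method.

```python
from collections import deque

# Hypothetical topology: which dies share a direct D2D link.
LINKS = {
    "first": ["second_a", "third_1"],
    "third_1": ["first", "second_b", "third_2"],
    "third_2": ["third_1", "second_c"],
    "second_a": ["first"],
    "second_b": ["third_1"],
    "second_c": ["third_2"],
}


def route(src: str, dst: str) -> list[str]:
    """Breadth-first search for a shortest die-to-die path from src to dst."""
    parents = {src: None}
    queue = deque([src])
    while queue:
        die = queue.popleft()
        if die == dst:
            path = []
            while die is not None:
                path.append(die)
                die = parents[die]
            return path[::-1]
        # Second dies only hold IP cores here, so they never relay traffic.
        if die.startswith("second") and die != src:
            continue
        for nxt in LINKS[die]:
            if nxt not in parents:
                parents[nxt] = die
                queue.append(nxt)
    raise ValueError(f"no route from {src} to {dst}")


def next_hop(src: str, dst: str) -> str:
    """Die the data is forwarded to first: the target itself or a relay die."""
    path = route(src, dst)
    return path[1] if len(path) > 1 else src
```

For a second die linked directly to the first die, `next_hop` returns the target itself; for a second die reached through third dies, it returns the first relay third die, whose own D2D interconnect component would repeat the same step.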

In one embodiment, the D2D interconnect components of the first die and the third die each include an outbound data path module, an on-chip data path module, an interconnect data path module, and a D2D interface, wherein when a second die or a third die reads data in the first die or in a third die, the D2D interconnect components are configured to: determine, based on the source address of the data, the die on which the data to be read is stored; in response to the data being stored on the first die, determine whether the requesting second die or third die is interconnected with the first die; in response to the requesting die being interconnected with the first die, send the on-die address of the data to the requesting die; in response to the requesting die not being interconnected with the first die, obtain a relay third die between the requesting die and the first die and send the on-die address of the data to the relay third die, so that the on-die address is sent to the requesting die through the relay third die; in response to the data being stored on a third die, determine whether the requesting second die or third die is interconnected with that third die; in response to the requesting die being interconnected with that third die, send the on-die address of the data to the requesting die; and in response to the requesting die not being interconnected with that third die, obtain a relay third die between the requesting die and that third die and send the on-die address of the data to the relay third die, so that the on-die address is sent to the requesting die through the relay third die.
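By way of example only, the read flow of this embodiment may be sketched as follows. The link table, the mapping from source-address windows to storing dies, and the restriction to a single relay third die are simplifying assumptions of this sketch and are not taken from the disclosure.

```python
# Hypothetical direct D2D links between dies.
LINKS = {
    "first": {"second_a", "third_1"},
    "third_1": {"first", "second_b"},
    "second_a": {"first"},
    "second_b": {"third_1"},
}

# Hypothetical map from the source-address window to the die storing the data.
WINDOW_SHIFT = 28
OWNER_BY_WINDOW = {0x0: "first", 0x1: "third_1"}


def address_delivery_path(src_addr: int, requester: str) -> list[str]:
    """Dies that the on-die address passes through on its way to the requester."""
    owner = OWNER_BY_WINDOW[src_addr >> WINDOW_SHIFT]
    if requester in LINKS[owner]:
        # Requester and storing die are directly interconnected.
        return [owner, requester]
    for relay in sorted(LINKS[owner]):
        # Only third dies act as relays in this sketch.
        if relay.startswith("third") and requester in LINKS[relay]:
            return [owner, relay, requester]
    raise ValueError(f"no single relay third die between {owner} and {requester}")
```

A deeper topology would need a multi-hop search instead of the single-relay loop used here.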

In some embodiments, the third die is further configured to store data in a distributed manner together with the first die and/or other third dies.

In some embodiments, the control module of the third die is configured to implement an IP core having a die control function; the communication module of the third die is configured to implement an IP core for data interaction with devices outside the integrated circuit and an IP core for data interaction with other dies within the integrated circuit; and the storage module of the third die is configured to implement an IP core for data interaction with a memory or memory chips.

In some embodiments, the first die, the second die, and the third die being packaged on the substrate via the interconnect structure includes: packaging the first die, the second die, and the third die on the substrate through substrate interconnection, redistribution layer (RDL) interconnection, or interposer interconnection.

In some embodiments, the second die is iteratively developed based on different IP cores, and the first die and the third die are reused after each iterative development of the second die.

It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. In the drawings, like or similar reference characters designate like or similar elements.

FIG. 1 shows a schematic diagram of a multi-die interconnect based integrated circuit according to an embodiment of the invention.

Fig. 2 shows a block diagram of a multi-die interconnect based integrated circuit according to an embodiment of the present disclosure.

Fig. 3 shows a schematic diagram of a master die according to an embodiment of the disclosure.

Fig. 4 shows a schematic diagram of a slave die according to an embodiment of the present disclosure.

Fig. 5 shows a schematic diagram of another multi-die interconnect based integrated circuit according to an embodiment of the present disclosure.

Fig. 6 shows a schematic diagram of a D2D interconnect assembly, according to an embodiment of the disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

The term "include" and variations thereof as used herein is meant to be inclusive in an open-ended manner, i.e., "including but not limited to". The term "or" means "and/or" unless specifically stated otherwise. The term "based on" means "based at least in part on". The terms "one example embodiment" and "one embodiment" mean "at least one example embodiment". The term "another embodiment" means "at least one additional embodiment". The terms "first," "second," and the like may refer to different or the same object. Other explicit and implicit definitions are also possible below.

As described above, conventional SoC development is costly. Traditional SoC development usually requires purchasing functional IP cores or interface IP cores, each priced at hundreds of thousands, millions, or even tens of millions, so the economic cost of SoC development grows geometrically. As the chip process shrinks from 10 nm to 7 nm and then to 5 nm, each shrink also greatly increases the economic cost of the IP cores. Many small and medium-sized enterprises therefore find it hard to bear the high economic cost of IP cores; in particular, when chip shipments do not reach a certain volume, the chip cost is hard to reduce and the enterprise can hardly make a profit.

Meanwhile, different IP cores have different interface protocols or usage specifications. In the SoC development process, at least one architecture design engineer is therefore needed at the architecture design stage to be responsible for an integration scheme compatible with the multiple IP cores; at least one circuit design engineer per IP core is needed at the module design stage to be responsible for integration; and at least one verification engineer per IP core is needed at the verification stage to be responsible for verification. These requirements directly increase both the labor cost and the time cost of SoC development.
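To make the staffing arithmetic above concrete, the minimum headcount implied by these figures can be computed as below. The function name and the example of eight IP cores are illustrative assumptions; only the "at least one" figures come from the text.

```python
def minimum_integration_headcount(num_ip_cores: int) -> dict[str, int]:
    """Lower bound on engineers needed to integrate num_ip_cores IP cores."""
    if num_ip_cores < 1:
        raise ValueError("at least one IP core is required")
    return {
        "architecture": 1,               # one architect for the integration scheme
        "circuit_design": num_ip_cores,  # one circuit design engineer per IP core
        "verification": num_ip_cores,    # one verification engineer per IP core
        "total": 1 + 2 * num_ip_cores,
    }


# Illustrative case: an SoC that integrates eight purchased IP cores.
staff = minimum_integration_headcount(8)
```

The lower bound grows linearly with the number of integrated IP cores, which is the cost that the disclosure aims to avoid.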

In addition, when a product needs to be updated and iterated, even if the enterprise only needs to update its own algorithm module and the purchased IP does not need to be updated, the license fee for the IP must be paid to the IP provider again, and the integration, verification, and back-end work must be repeated, which increases the iteration cost. Thus both the price of the IP cores themselves and the manpower consumed in integrating them lead to high SoC development cost. Meanwhile, as the number of IP cores integrated in one chip increases, the production yield of the chip is directly or indirectly affected, which increases the risk cost.

Second, the conventional SoC development cycle is long. In traditional SoC development, a soft core (RTL code) or a hard core (layout) is purchased from an IP provider, the purchased IP core is integrated with self-developed modules for simulation and verification, and the back end then performs placement and routing on the integrated design. The number of IP cores therefore directly affects the time needed for simulation and verification of the whole chip and for placement and routing. Advanced processes also make power closure and timing closure more difficult, which inevitably lengthens design and verification time.

As algorithms iterate, the requirements on the computing speed, functionality, and compatibility of chips keep rising, so the integration level of chips becomes higher, the architecture more complex, and the development cycle of SoC chips longer. However, in fields where algorithms iterate quickly, such as AI, if the development cycle of a chip is too long, a new algorithm may appear before the current development is completed. The development cycle of a conventional SoC therefore has difficulty keeping up with the market and with the pace of algorithm updates.

Finally, the flexibility of traditional SoC development is low. In the back-end design stage of SoC development, IP cores, and analog IP cores in particular, impose demanding placement and routing requirements, which directly limits the flexibility of back-end placement and routing for the whole SoC and indirectly increases the difficulty of timing design.

In summary, conventional SoC development has the following problems: higher development cost, a longer development cycle, and lower development flexibility.

In the present disclosure, a die refers to a piece cut from a wafer in the field of integrated circuits; it is also referred to by names including, but not limited to, bare die, chip, and bare chip. For brevity, this application generally uses the term die, but the aspects of the present disclosure apply equally to the other integrated circuits named above.

To address, at least in part, one or more of the above problems and other potential problems, example embodiments of the present disclosure propose a chiplet integrated circuit based on multi-die interconnection. The chiplet is composed of a master die (hereinafter referred to as the first die or Hub Die) and a slave die (hereinafter referred to as the second die or Side Die). With this Hub Die and Side Die structure, the general or common IP is integrated into a mature Hub Die, so expensive soft-core or hard-core IP does not need to be purchased during design; SoC development can be completed simply by purchasing a Hub Die, whose per-chip price is relatively low, and integrating it, which directly reduces the economic cost of purchasing IP cores. Meanwhile, because the design of the Hub Die is mature and stable, IP core compatibility does not need to be considered during design; only the dedicated algorithm needs to be designed into the Side Die, and the Hub Die and the Side Die are then packaged together, which greatly reduces the labor cost and time cost of development. Whether because expensive IP cores no longer need to be purchased or because the workload of integrating IP cores is reduced, the cost of SoC development is reduced directly or indirectly.

In the conventional SoC development process, every SoC developer must go to IP providers (e.g., Synopsys, TCI, core activation, core source, etc.) to purchase IP cores that are mature on a certain process node. IP cores are delivered as either hard cores or soft cores. A hard core is delivered as the final GDS file, which has the benefit of reducing the purchaser's own back-end workload and the drawback of being inflexible for the back-end top level. A soft core is delivered as code (encrypted or not), and its benefits and drawbacks are the opposite of those of a hard core. In either case, during integration, verification, and back-end work the designer must spend considerable manpower and resources to understand the purchased IP in depth, must modify the architecture according to the characteristics of the IP, and must bear the risk of IP integration errors.

With the Hub Die and Side Die structure, the design process does not need to consider the integration of IP cores, their verification, or their back-end placement and routing, so the SoC development cycle is greatly shortened. For chip iteration, for example for an AI chip whose core algorithm changes frequently, only the Side Die needs to be iterated, and the other parts do not need to be redeveloped. Eliminating IP core integration greatly shortens the development cycle of both the first development and subsequent iterations.

As described above, many chip application scenarios are fragmented, each with its own proprietary algorithm, yet many of them do not have a large chip demand. For example, a sweeping robot may have an annual demand of only 100,000 chips. Investing a great deal of manpower and money to develop a dedicated SoC chip for such a scenario means that the cost of the product cannot be brought down. In addition, the algorithm iteration period in such scenarios is very short. For example, in the AI field, the development cycle of a conventional SoC is one year at the shortest and two to three years at the longest; during this time the algorithm has been updated and iterated many times, and the manufactured SoC may not be suited to the latest algorithm. At the same time, many scenarios share the same requirements for memory, interfaces, and CPU. The present disclosure therefore proposes a solution: the more general IP is integrated into a central bare die, called the master die (hereinafter referred to as the first die or Hub Die), and the algorithm specific to the developer or its client is implemented on another bare die, called the slave die (hereinafter referred to as the second die or Side Die).

FIG. 1 shows a schematic diagram of a multi-die interconnect based integrated circuit according to an embodiment of the invention. As shown in Fig. 1, the integrated circuit, hereinafter also referred to as a chiplet, includes: a substrate 100 comprising an interconnect structure; a first die (Hub Die) 101 comprising a control module, a communication module, a storage module, and a D2D interconnect component; and one or more second dies (Side Die) 102, each comprising a D2D interconnect component. The Hub Die 101 and the Side Die 102 may be interconnected in a 2D or 2.5D packaging manner based on a substrate, a redistribution layer (RDL), or an interposer, so as to be packaged into a complete chip.

In the present invention, the more general IP cores are integrated and packaged as a mature Hub Die 101. The Hub Die 101 carries its own D2D interconnect component IP core, so a designer does not need to purchase these IP cores from an IP supplier or spend time on their integration, verification, and back-end work, and does not need to design the architecture of a whole SoC. The designer only needs to design the architecture of its own algorithm subsystem, complete that design, and add to it a D2D interconnect component IP core of the same type as that of the Hub Die 101; the Hub Die 101 and the Side Die 102 can then be connected through the D2D interface at packaging. This effectively addresses the problems of long development cycle, high development cost, and high development risk.

In one embodiment, the control module of the first die (Hub Die) is configured to implement an IP core having a die control function. The communication module of the first die is configured to implement an IP core for data interaction with other dies within the integrated circuit, and also an IP core for data interaction with devices external to the integrated circuit. The storage module of the first die is configured to implement an IP core for data interaction with a memory or memory chips.

The second die (Side Die) may be iteratively developed based on different IP cores, e.g., by developing multiple different IP cores on one or more second dies, and the first die (Hub Die) is reused after each iterative development of the second die.

The Hub Die, Side Die, and Die-to-Die (hereinafter referred to as D2D) interfaces will be described in detail below.

Fig. 2 shows a block diagram of a multi-die interconnect based integrated circuit according to an embodiment of the present disclosure. As shown in Fig. 2, a multi-die interconnect based integrated circuit may be composed of one master die (Hub Die) and a plurality of slave dies (Side Die). The Hub Die and the Side Dies are interconnected through their respective D2D interconnect components (e.g., a D2D interface or a D2D PHY) and are packaged on a substrate by means of substrate, RDL, or interposer interconnection, thereby forming a chiplet.
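As a non-limiting illustration, the composition shown in Fig. 2 may be modelled as a small Python data structure in which one Hub Die is reused across product iterations while only the Side Die changes. All class names, field names, and the IP core names `npu_v1` and `npu_v2` are hypothetical; the three interface types and three packaging options are those named elsewhere in this disclosure.

```python
from dataclasses import dataclass, field
from enum import Enum


class D2DInterface(Enum):
    LOW_SPEED_PARALLEL = "low-speed parallel"
    MEDIUM_SPEED_SERIAL = "medium-speed serial"
    HIGH_SPEED_SERIAL = "high-speed serial"


class Packaging(Enum):
    SUBSTRATE = "substrate"
    RDL = "redistribution layer"
    INTERPOSER = "interposer"


@dataclass(frozen=True)
class HubDie:
    d2d: D2DInterface
    modules: tuple = ("control", "communication", "storage")


@dataclass(frozen=True)
class SideDie:
    ip_core: str  # name of the dedicated algorithm IP core (hypothetical)
    d2d: D2DInterface


@dataclass
class Chiplet:
    hub: HubDie
    packaging: Packaging
    side_dies: list = field(default_factory=list)

    def attach(self, side: SideDie) -> None:
        """Add a Side Die, requiring the same D2D interface type as the Hub Die."""
        if side.d2d is not self.hub.d2d:
            raise ValueError("Side Die D2D interface does not match the Hub Die")
        self.side_dies.append(side)


hub = HubDie(D2DInterface.HIGH_SPEED_SERIAL)
v1 = Chiplet(hub, Packaging.INTERPOSER)
v1.attach(SideDie("npu_v1", D2DInterface.HIGH_SPEED_SERIAL))
v2 = Chiplet(hub, Packaging.INTERPOSER)  # same Hub Die, new Side Die iteration
v2.attach(SideDie("npu_v2", D2DInterface.HIGH_SPEED_SERIAL))
```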

Fig. 3 shows a schematic diagram of a master die according to an embodiment of the disclosure. As described above, the master Die (Hub Die) includes at least a control module 301, a communication module 302, a storage module 303, and a D2D interconnect assembly 304.

In one embodiment, the control module 301 of the master die (Hub Die) is configured to provide CPU control functions and coordinates the scheduling of the data processing modules inside the Hub Die and on the Side Dies. The communication module 302 is configured to provide data interaction with a host, so that data can be exchanged with the host through the Hub Die. The communication module 302 is also configured to provide data interaction with off-chip devices and may offer multiple types of interfaces, such as MIPI, ETH, EMMC, and SPI. The storage module 303 is configured to provide an off-chip memory interface, through which data transmitted from a host computer or an external device can be stored in DDR for the second die to operate on.
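By way of example only, the cooperation of the three Hub Die modules may be sketched as follows: data from the host is stored in DDR, the control function schedules a Side Die to operate on it, and the result is written back. The class names, method names, and the summation used as the Side Die operation are hypothetical stand-ins chosen for this sketch.

```python
class SideDie:
    """Stands in for a second die whose IP core performs one operation."""

    def __init__(self, operation):
        self.operation = operation

    def run(self, data):
        return self.operation(data)


class HubDie:
    """Stands in for the first die with control, communication, and storage roles."""

    def __init__(self, side_dies):
        self.ddr = {}               # storage module: off-chip DDR contents
        self.side_dies = side_dies  # dies reachable over the D2D interface

    def receive_from_host(self, addr, data):
        """Communication module: accept host data and store it in DDR."""
        self.ddr[addr] = data

    def schedule(self, side_name, src_addr, result_addr):
        """Control module: send stored data to a Side Die and store its result."""
        result = self.side_dies[side_name].run(self.ddr[src_addr])
        self.ddr[result_addr] = result
        return result


hub = HubDie({"side_0": SideDie(lambda samples: sum(samples))})
hub.receive_from_host(0x0, [1, 2, 3])
result = hub.schedule("side_0", 0x0, 0x100)
```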

In one embodiment, the master die may be configured to implement: interface IP cores, such as PCIe, EMMC, USB, or ETH, which are mainly used for data interaction between the chip and the outside; memory IP cores, such as DDR3/4, LPDDR4/4X, or GDDR5/6, which are mainly used for data interaction between the chip and memory devices; controller IP cores, such as ARM or RISC-V CPUs, which are mainly used for control and some computation on the chip; and interconnect IP cores, such as an AXI bus, which are mainly used for data interaction among subsystems (IPs) within the chip.

In one embodiment, the Hub Die may provide the communication functions of a conventional SoC other than the dedicated data processing modules, and may provide data interaction channels and control scheduling for multiple Side Dies. An advantage of this arrangement is that the Hub Die can be reused across multiple scenarios and multiple iterations: for a different application scenario or iteration, a Side Die within a certain size range can be designed according to the requirements of the Hub Die, and a D2D (die-to-die) interface IP consistent with that of the Hub Die is integrated in the Side Die. During packaging, the Hub Die and the Side Die can be packaged in the same chip on a substrate, an RDL layer, or an interposer to form a chip with complete system functionality.

The D2D interconnect component 304 may use either a parallel interface or a serial interface. The parallel interface has the advantages of low power consumption and low latency. Its disadvantage is that, although its rate per unit area is better than that of the serial interface, meeting a given bandwidth requirement takes far more IOs than a serial interface would need, so a more complex packaging design (for example, interposer, RDL, or EMIB) is often required at present, which increases packaging difficulty and cost. The serial interface has the advantage that the bandwidth of a single IO is higher than that of a parallel interface, so far fewer IOs are needed when the bandwidth requirement is large, and the packaging cost is low; however, the high power consumption and high latency that come with a high per-IO rate cannot be avoided.
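The IO-count trade-off described above is simple arithmetic, and a minimal sketch may make it concrete. The function name and the example rates below are illustrative assumptions, not figures from the disclosure.

```python
import math


def lanes_needed(target_gbps: float, per_lane_gbps: float) -> int:
    """Smallest number of IO lanes whose combined rate meets the target."""
    if target_gbps <= 0 or per_lane_gbps <= 0:
        raise ValueError("rates must be positive")
    return math.ceil(target_gbps / per_lane_gbps)


# Illustrative figures only: the same 512 Gb/s link built from slow
# parallel pins versus fast serial lanes.
parallel_ios = lanes_needed(512, 2)   # 2 Gb/s per parallel pin (assumed)
serial_ios = lanes_needed(512, 32)    # 32 Gb/s per serial lane (assumed)
```

With these assumed rates the parallel option needs sixteen times as many IOs as the serial one, which is the packaging pressure the paragraph describes.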

Based on the above advantages and disadvantages, the D2D interface in the D2D interconnect component 304 may be any one or more of a low-speed parallel interface, a medium-speed serial interface, and a high-speed serial interface, and may be adjusted by the customer according to the customer's own trade-offs among bandwidth, power consumption, area, and packaging cost.

The low-speed D2D interface may be a low-speed parallel interface for substrate packaging, with a frequency below 1.6 GHz. In one embodiment, the bandwidth requirement is modest: a D2D channel is only expected to carry some information exchange between dies, but the bandwidth of ordinary IO is insufficient, and a high-speed parallel interface would introduce higher packaging cost. The low-speed interface is therefore defined as a low-speed parallel interface that supports D2D communication on a substrate while still providing a certain bandwidth.

The medium-speed D2D interface may be either a high-speed parallel interface using advanced packaging such as RDL or interposer, with a frequency of 1.6 GHz to 16 GHz, or a low-speed SerDes serial interface at 32 Gb/s or below. In one embodiment, the customer has a certain bandwidth requirement, and such customers fall into two groups: those with low tolerance for latency and those for whom latency is acceptable. To serve both, the medium-speed interface is defined with a parallel scheme and a serial scheme. The parallel scheme meets the bandwidth requirement with low data transmission latency, satisfies real-time communication needs, and has certain advantages in power consumption and area, but it requires advanced packaging to guarantee the bandwidth, which raises packaging cost. The serial scheme meets the bandwidth requirement without advanced packaging, but its data transmission latency is longer because of functions such as error correction.

The high-speed D2D interface adopts a high-speed SerDes serial interface of up to 112 Gb/s, and the rate can be configured downward for backward compatibility. In one embodiment, the customer's bandwidth requirement is particularly large. A parallel interface would then introduce many IO connections, greatly increasing packaging cost and greatly reducing packaging yield, and its area and power consumption would no longer be outstanding advantages, so the parallel interface is no longer suitable. The high-speed interface is therefore defined with a serial interface only.
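The three interface tiers can be summarized as a small classification routine. The sketch below uses only the thresholds stated above (1.6 GHz, 16 GHz, 32 Gb/s, 112 Gb/s); the function name, the string labels, and the choice to reject rates outside the described ranges are assumptions made for illustration.

```python
def d2d_tier(kind: str, rate: float) -> str:
    """Return 'low', 'medium' or 'high' for a D2D interface.

    kind is 'parallel' (rate given in GHz) or 'serial' (rate in Gb/s).
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    if kind == "parallel":
        if rate < 1.6:
            return "low"       # substrate-packaged parallel interface
        if rate <= 16:
            return "medium"    # parallel interface with RDL or interposer
        raise ValueError("no parallel tier above 16 GHz is described")
    if kind == "serial":
        if rate <= 32:
            return "medium"    # lower-rate SerDes
        if rate <= 112:
            return "high"      # high-speed SerDes
        raise ValueError("no serial tier above 112 Gb/s is described")
    raise ValueError(f"unknown interface kind: {kind!r}")
```

For instance, `d2d_tier("parallel", 1.0)` falls in the low-speed tier, while `d2d_tier("serial", 56)` falls in the high-speed tier.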

In one embodiment, a medium-speed serial D2D interface (32 Gb/s) may also be employed. The medium-speed serial interface is a D2D interface dedicated to substrate interconnect, designed for the condition that the signal-to-noise characteristics of a substrate path are far superior to those of a PCB path. This D2D interface adopts a purely analog architecture, which reduces the architectural complexity of the port physical layer (PHY) and the controller, and in turn reduces link establishment time and path latency. Power consumption is lower, area is smaller, system integration is simple, packaging cost is low, stability is higher, and the interface can enter and exit the low-power state quickly.

Fig. 4 shows a schematic diagram of a slave die according to an embodiment of the present disclosure. As described above, the slave die (Side Die) includes at least a D2D interconnect component 401 and a Socket 402 for an IP core of a particular circuit function. The slave die may be any of several types of die, each customized with an IP core of a specific circuit function according to user needs. A master die may be interconnected with a plurality of slave dies. The second die, i.e., the slave die, is configured to implement an IP core with a specific circuit function using its Socket 402 and to communicate with the first die via the D2D interconnect component 401 and the D2D interconnect component of the first die. For example, the second die may be configured by a user to implement an IP core for neural network operations, such that the neural network operations (e.g., convolution operations) are completed on the second die.

In one embodiment, the second die (Side Die) may also be an input/output (IO) die, i.e., a die that provides input/output interfaces. The interface types available on the Hub Die may be extended by having the first die (Hub Die) work together with the IO Die. For example, the interface types and the number of interfaces on the Hub Die are limited, but some application scenarios may need a new interface type or more interfaces, such as more DDR memory capacity. In that case, an extended DDR PHY can be implemented on an IO Die (a type of Side Die), and the combination of Hub Die and IO Die then provides the additional interface types or interfaces.

Thus, the second die (Side Die) is further configured to receive data from the first die, perform operations of the IP core based on the particular circuit function, and send results of the operations to the first die.

The D2D interconnect component may be a port physical layer (PHY) that matches the D2D interconnect component of the first die. Therefore, the D2D interface in the D2D interconnection component 401 may also be any one of the low-speed parallel interface, the medium-speed serial interface, and the high-speed serial interface described above, and will not be described herein again.

With the above technical means, chips are developed by reusing the Hub Die. When developing algorithm chips for other application scenarios or iterated algorithm chips, this reduces the cost of repeatedly purchasing IP as well as the workload and risk of integrating that IP.

Fig. 5 shows a schematic diagram of another multi-die interconnect based integrated circuit according to an embodiment of the present disclosure. In one embodiment, there is provided another multi-die interconnect based integrated circuit, comprising: a substrate comprising an interconnect structure; a first die comprising a control module, a communication module, a storage module, and a D2D interface; one or more second dies comprising D2D interconnect components; one or more third die comprising a control module, a communication module, a memory module, and a D2D interface, wherein the first, second, and third die are packaged on the substrate via the interconnect structure. In the example of fig. 5, 1 first Die (Hub Die) and 1 third Die (sub-Hub Die) are included, and the first Die (Hub Die) and the third Die (sub-Hub Die) each have a plurality of second dies (Side Die) connected to itself.

The integrated circuit in this embodiment is similar to that described in the previous embodiments, but includes one or more third dies similar to the first die. The third die may be referred to as a secondary master die or a secondary Hub Die. The secondary Hub Die is interconnected with the main Hub Die and likewise has one or more Side Dies of its own attached. In such a chiplet, the second die is also configured to implement an IP core with a specific circuit function and to communicate either with the first die, via its own D2D interconnect component and that of the first die, or with the third die, via its own D2D interconnect component and that of the third die.

In one embodiment, the second die is configured to receive data from the first die connected thereto or the third die connected thereto, perform an operation of the IP core based on the specific circuit function, and send a result of the operation to the first die connected thereto or the third die connected thereto.

In one embodiment, the third die is configured to communicate, via the D2D interconnect component, with the first die, with other third dies, or with both, thereby enabling expansion of the first die.

In one embodiment, the secondary Hub Die has similar control, memory, communication and D2D interconnect components as the primary Hub Die.

In one embodiment, the third die (secondary Hub Die) is further configured to receive data from the first die (main Hub Die) or from another third die (secondary Hub Die), send the received data to a second die (Side Die) in interconnect communication with the third die, receive the result of the operation from the second die, and send the result to the first die or to the other third die. For example, an integrated circuit that includes one or more third dies (secondary Hub Dies) may implement the following three types of communication. (1) External data path: the main Hub Die has many external data interfaces, such as PCIe, MIPI, and USB, so data may be input from off-chip through different interfaces or output from on-chip to off-chip through different interfaces. Input data may be stored in the on-chip storage of the main Hub Die, in the on-chip system of a Side Die, or in the on-chip storage of a secondary Hub Die. (2) On-chip data path: the main Hub Die has DDR storage and SRAM storage, so a Side Die or a secondary Hub Die can read data from and write data to the on-chip storage of the main Hub Die, and the main Hub Die can also actively send data from its on-chip storage to a Side Die or a secondary Hub Die. (3) Interconnect data path: a Side Die can exchange data directly with other Side Dies through the main Hub Die; similarly, a secondary Hub Die can exchange data with other secondary Hub Dies through the main Hub Die, and data can even be relayed across multiple secondary Hub Dies.

Based on the communication described above, one or more third dies (secondary Hub Dies) may also be configured to store data in a distributed manner together with the first die (main Hub Die) and/or other third dies.

In one embodiment, the control modules of the first die (main Hub Die) and the third die (secondary Hub Die) are configured to implement an IP core having a die control function. Their communication modules are configured to implement IP cores for data interaction with other dies within the integrated circuit, and also IP cores for data interaction with devices external to the integrated circuit. Their storage modules are configured to implement IP cores for data interaction with memory or memory devices.

Fig. 6 shows a schematic diagram of a D2D interconnect component according to an embodiment of the disclosure. In one embodiment, the D2D interconnect component includes an external data path module 601, an on-chip data path module 602, an interconnect data path module 603, and a D2D interface 604.

The external data path module 601 may implement an external data path. For example, the main Hub Die has many external data interfaces, such as PCIe, MIPI, and USB, so data may be input from off-chip through different interfaces or output from on-chip to off-chip through different interfaces. Input data may be stored in the on-chip storage of the main Hub Die, in the on-chip system of a Side Die, or in the on-chip storage of a secondary Hub Die.

The on-chip data path module 602 may implement on-chip data paths. The main Hub Die has DDR storage and SRAM storage, so a Side Die or a secondary Hub Die can read data from and write data to the on-chip storage of the main Hub Die, and the main Hub Die can also actively send data from its on-chip storage to a Side Die or a secondary Hub Die.

The interconnect data path module 603 may implement an interconnect data path. A Side Die can exchange data directly with other Side Dies through the main Hub Die; similarly, a secondary Hub Die can exchange data with other secondary Hub Dies through the main Hub Die, and data can even be relayed across multiple secondary Hub Dies.

The D2D interconnect component, which is built from the above three data paths, implements an interconnect forwarding mechanism. The mechanism performs address mapping on top of the standard AXI protocol. Using a custom packet protocol, basic information about the data being transferred, such as the source ID, destination ID, data operation address, data length, and data type, is packed into a packet header before the data is transmitted. After a parsing module receives the packet header, it can identify the source and destination of the current data from the header information, thereby enabling data and information exchange between a Side Die and the main Hub Die, between a Side Die and a secondary Hub Die, and between the main Hub Die and a secondary Hub Die.
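A minimal sketch of such a packet header follows. The disclosure names the fields but not their widths or order, so the 16-byte little-endian layout, the class name `PacketHeader`, and the attribute names are all hypothetical.

```python
import struct
from dataclasses import dataclass

# Hypothetical layout (not from the source): little-endian,
# u8 source ID, u8 destination ID, u8 data type, 1 pad byte,
# u32 data length, u64 data operation address.
_HEADER = struct.Struct("<BBBxIQ")


@dataclass(frozen=True)
class PacketHeader:
    source_id: int
    dest_id: int
    data_type: int
    length: int
    op_address: int

    def pack(self) -> bytes:
        """Serialize the header so it can be sent ahead of the payload."""
        return _HEADER.pack(self.source_id, self.dest_id, self.data_type,
                            self.length, self.op_address)

    @classmethod
    def unpack(cls, raw: bytes) -> "PacketHeader":
        """Parse the leading header bytes; any payload after them is ignored."""
        return cls(*_HEADER.unpack(raw[:_HEADER.size]))
```

A receiving node would call `PacketHeader.unpack` on the first bytes of a packet and then inspect `dest_id` to decide whether the data is for itself or must be forwarded.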

When the external data path module 601 implements the external data path, for example when data is input from an external interface, the external data path module 601 needs to parse the destination operation address. It thereby determines whether the data is to be transmitted to the on-chip storage of the main Hub Die, to a Side Die interconnected with the main Hub Die, to the on-chip storage of an adjacent secondary Hub Die, or to a non-adjacent secondary Hub Die. Then, according to the destination of the data, the original destination operation address is converted into a path address (the data must pass through the D2D interface 604 to reach the destination Side Die or secondary Hub Die), and the original storage address, together with basic information such as the source ID, destination ID, data operation address, data length, and data type, is packed into a packet header and transmitted along with the original data.

As described above, when the first die (main Hub Die) receives incoming data from a device external to the integrated circuit, the external data path module is configured to parse the destination operation address of the incoming data and thereby determine the die to which the data is to be transmitted. For example, if the incoming data corresponds to an IP core on a second die (Side Die), that second die is determined to be the die to which the data is to be transmitted. In response to the data being destined for the first die, the destination operation address is translated to an on-die address of the first die (hereinafter simply referred to as an address), so that the data is saved on the first die. In response to the data being destined for the second die, the destination operation address is translated to an address of the second die (Side Die) connected to the first die, so that the data is forwarded to the second die via the first die.

In the case where the integrated circuit further includes a plurality of third dies (secondary Hub Dies), the external data path module 601 of the D2D interconnect component parses the destination operation address of the data and thereby determines the die to which the data is to be transmitted. In response to the data being destined for the first die, the destination operation address is translated to an address of the first die, so that the data is saved on the first die. In response to the data being destined for a second die, the module determines the die with which that second die is interconnected, i.e., whether the second die is connected to the first die (main Hub Die) or to a third die (secondary Hub Die). In response to the second die being interconnected with the first die, the destination operation address is translated to an address of the second die, so that the data is forwarded to the second die. In response to the second die being interconnected with a third die, the destination operation address is translated to an address of that third die, so that the data is forwarded to the third die. In response to the data being destined for a third die, the module determines the die with which that third die is interconnected, i.e., whether the third die is connected directly to the first die or through another third die. In response to the third die being interconnected with the first die, the destination operation address is translated to an address of the third die, so that the data is forwarded to the third die. In response to the third die being interconnected with another third die, the destination operation address is translated to an address of that other third die, so that its D2D interface forwards the data on to the third die that is the final destination.
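The case analysis above amounts to finding the first hop on the way from the main Hub Die to the destination die. The sketch below models the topology as a hypothetical `upstream` table; the die names and the table itself are assumptions for illustration, and translating an address for the chosen die is left out.

```python
from typing import Mapping

HUB = "hub"  # the first die (main Hub Die)


def next_hop(dest: str, upstream: Mapping[str, str]) -> str:
    """Die to which the main Hub Die forwards so that data reaches `dest`.

    `upstream` maps every Side Die to the hub it is attached to, and every
    secondary Hub Die to the hub one step closer to the main Hub Die.
    """
    if dest == HUB:
        return HUB  # the data is saved on the first die itself
    die = dest
    seen = set()
    # Walk towards the main Hub Die until reaching a die attached to it.
    while upstream[die] != HUB:
        if die in seen:
            raise ValueError("topology contains a cycle")
        seen.add(die)
        die = upstream[die]
    return die


# Illustrative topology: side0 and sub1 attach to the main hub, side1 and
# sub2 attach to sub1, and side2 attaches to sub2.
TOPOLOGY = {"side0": HUB, "sub1": HUB, "side1": "sub1",
            "sub2": "sub1", "side2": "sub2"}
```

In this topology, data for `side2` leaves the main Hub Die towards `sub1`, which is then responsible for forwarding it further.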

When the on-chip data path module 602 implements an on-chip data path, for example when data stored on-chip needs to be transmitted to a Side Die interconnected with the main Hub Die, to the on-chip storage of a secondary Hub Die, or to a Side Die interconnected with a secondary Hub Die, a register of the on-chip data path module 602 needs to be configured, which is equivalent to sending an instruction to the module. According to the instruction, the on-chip data path module 602 fetches the data from the corresponding on-chip storage system, packs the destination operation address and basic information such as the source ID, destination ID, data operation address, data length, and data type into a packet header, and sends the header together with the data. To prevent deadlock, the on-chip data path module 602 may read back all of the data to be transmitted before sending it together.
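The register-driven transfer can be pictured as follows. This is a behavioral sketch only: `TransferCommand` stands in for the register contents, on-chip storage is modeled as a byte string, and the header is a plain dictionary; none of these names or shapes come from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class TransferCommand:
    """Hypothetical register contents describing one outgoing transfer."""
    source_id: int
    dest_id: int
    src_address: int    # where the data sits in local on-chip storage
    dest_address: int   # destination operation address
    length: int
    data_type: int = 0


def build_packet(cmd: TransferCommand, memory: bytes):
    """Read the whole payload first, then return (header, payload) together."""
    end = cmd.src_address + cmd.length
    if cmd.src_address < 0 or cmd.length < 0 or end > len(memory):
        raise ValueError("transfer exceeds on-chip storage")
    # Complete read-back before anything is sent, as the text suggests
    # for avoiding deadlock.
    payload = bytes(memory[cmd.src_address:end])
    header = {
        "source_id": cmd.source_id,
        "dest_id": cmd.dest_id,
        "op_address": cmd.dest_address,
        "length": cmd.length,
        "data_type": cmd.data_type,
    }
    return header, payload
```

Keeping the read and the send as separate phases means the sender never holds the link while still waiting on its own memory, which is one plausible reading of the deadlock remark.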

When the first die (main Hub Die) transmits data stored on the main Hub Die, the on-chip data path module 602 of the D2D interconnect component is configured to: obtain the data from the storage module of the first die based on a source address of the data; parse the IP core to which the data is to be transmitted and determine the second die corresponding to that IP core as the die to which the data is to be transmitted; and forward the data to the second die based on the determined destination operation address of the second die.

Where the integrated circuit further includes a plurality of third dies (secondary Hub Dies), the on-chip data path module 602 may further obtain the data from the storage module of the first die based on a source address of the data, and parse the IP core to which the data is to be transmitted, thereby determining the destination die, i.e., whether the data is destined for a third die or a second die and, if for a second die, whether that second die is connected to the first die or to a third die. If the destination second die is connected to a third die, that third die is in turn either directly connected or connected through another third die in between.

Thus, in response to a data transfer to a second die, the on-chip datapath module 602 determines a die that is interconnected with the second die; in response to the second die being interconnected with the first die, the on-chip datapath module 602 translates the destination operating address to an address of the second die, forwarding the data to the second die; in response to the second die being interconnected with the third die, the on-chip datapath module 602 translates the destination operating address to an address of the third die, forwarding the data to the third die; in response to the data transfer to the third die, on-chip datapath module 602 determines a die that is interconnected with the third die; in response to the third die being interconnected with the first die, the on-chip datapath module 602 translates the destination operating address to an address of the third die, forwarding the data to the third die; in response to the third die interconnecting with other third dies, the on-chip datapath module 602 translates the destination operating address to an address of the other third dies, forwarding the data to the other third dies.

The on-chip data path module 602 may also provide reserved information registers that are configured through software. The user defines the meaning of the information, which is packed into an information packet, thereby implementing software-level information communication between Hub Dies.

When the interconnect data path module 603 of the D2D interconnect component implements an interconnect data path, it may perform different functions. First, the interconnect data path module 603 may enable a Side Die (second die) or a secondary Hub Die (third die) to read and write data in the on-chip storage of the main Hub Die (first die), for example the storage module of the first die. Because on-chip storage addresses are already allocated at design time, on this data path the interconnect data path module 603 does not modify or map the operation address, and the Side Die or the secondary Hub Die may operate directly on the on-chip storage of the main Hub Die.

When the second die (Side Die) reads data in the first die (main Hub Die), the interconnect data path module 603 is configured to send the address of the data to the second die.

In the case where the integrated circuit further includes a plurality of third dies (secondary Hub Dies), the interconnect data path module 603 may further determine, based on the source address of the data, the die on which the data to be read is stored, i.e., whether it is stored on the first die (main Hub Die) or on a third die (secondary Hub Die). In response to the data being stored on the first die, the module determines whether the requesting second die or third die is interconnected with the first die; if it is, the address of the data is sent to that second die or third die; if it is not, the module identifies the third die located between the requesting die and the first die and sends the address of the data to that third die. In response to the data being stored on a third die, the module determines whether the requesting second die or third die is interconnected with that third die; if it is, the address of the data is sent to the requesting die; if it is not, the module identifies the other third die located between the requesting die and the third die and sends the address of the data to that other third die.

Second, the interconnect data path module 603 may enable a Side Die or a secondary Hub Die to operate, through the main Hub Die, on the on-chip storage of another Side Die or another secondary Hub Die. If the other Side Die or secondary Hub Die is adjacent to the Hub Die, the address mapping register of the interconnect data path module 603 needs to be configured through software; the module then maps the operation address sent by the Side Die or secondary Hub Die to the on-chip storage address of the other (adjacent) Side Die or secondary Hub Die, so that data can be read and written directly. If the other Side Die or secondary Hub Die is not adjacent to the Hub Die, software needs to configure a register of the interconnect data path module 603 to switch the module into multi-Hub-Die interconnect mode. The module then identifies the ID of the destination Side Die or secondary Hub Die from the operation address, packs basic information such as the source ID, destination ID, data operation address, data length, and data type into a packet header, converts the source operation address into a path address, and then issues an AXI operation instruction.
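For the adjacent-die case, the address mapping register can be thought of as a set of address windows. The sketch below assumes a simple base-plus-offset translation; the `AddressWindow` fields, the die names, and the example addresses are hypothetical, since the disclosure does not specify the register format.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class AddressWindow:
    """Hypothetical contents of one address-mapping register."""
    base: int          # start of the window in the requester's address space
    size: int          # window length in bytes
    target_die: str    # adjacent Side Die or secondary Hub Die
    target_base: int   # start of the matching region in the target's storage


def map_address(addr: int, windows: Iterable[AddressWindow]) -> Tuple[str, int]:
    """Translate an incoming operation address to (target die, on-die address)."""
    for w in windows:
        if w.base <= addr < w.base + w.size:
            return w.target_die, w.target_base + (addr - w.base)
    raise LookupError(f"address {addr:#x} is not mapped to an adjacent die")


# Illustrative configuration written by software at start-up.
WINDOWS = [
    AddressWindow(0x40000000, 0x1000, "side1", 0x0),
    AddressWindow(0x40001000, 0x1000, "sub1", 0x8000),
]
```

An address that falls in no window would, in the scheme described above, be a candidate for the multi-Hub-Die interconnect mode instead.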

When the interconnect data path module 603 acts as a node on a path and receives a packet header, it parses the information in the header. If the destination ID is the module's own Hub Die or a Side Die of that Hub Die, the interconnect data path module 603 takes the data operation address from the packet header as the AXI operation address and initiates an operation request. When the operation is completed, an interrupt signal is generated to notify the CPU at the destination.

If parsing shows that the destination ID is neither the module's own Hub Die nor one of its Side Dies, the operation address in the packet header is converted into a new path address according to the path address set in a register (which may be configured by software during system initialization), and an AXI operation instruction is then initiated to send the packet header on to the adjacent Hub Die through the D2D interface.
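The deliver-or-forward decision made at each node can be sketched as below. The header is modeled as a dictionary and the path register as a lookup table keyed by destination ID; both representations, and the function name `handle_header`, are assumptions for illustration.

```python
def handle_header(header: dict, local_ids: set, path_table: dict):
    """Decide what a node on the path does with an incoming packet header.

    Returns ("deliver", op_address) when the destination is this Hub Die or
    one of its Side Dies, otherwise ("forward", new_header) where new_header
    carries the path address read from the (hypothetical) path register.
    """
    dest = header["dest_id"]
    if dest in local_ids:
        # Use the data operation address directly as the bus operation address.
        return "deliver", header["op_address"]
    if dest not in path_table:
        raise LookupError(f"no path configured for destination {dest}")
    # Copy the header so the original stays unchanged, then attach the path.
    forwarded = dict(header, path_address=path_table[dest])
    return "forward", forwarded
```

For example, a node whose local IDs are `{0, 1}` and whose path table maps destination 5 to some path address would return a `"forward"` result for a header with `dest_id` 5.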

The present disclosure relates to methods, apparatuses, systems, electronic devices, computer-readable storage media and/or computer program products. The computer program product may include computer-readable program instructions for performing various aspects of the present disclosure.

The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be interpreted as a transitory signal per se, such as a radio wave or other freely propagating electromagnetic wave, an electromagnetic wave propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or an electrical signal transmitted through an electrical wire.

The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge computing devices. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.

The computer program instructions for carrying out operations of the present disclosure may be assembler instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, the electronic circuitry that can execute the computer-readable program instructions implements aspects of the present disclosure by utilizing the state information of the computer-readable program instructions to personalize the electronic circuitry, such as a programmable logic circuit, a Field Programmable Gate Array (FPGA), or a Programmable Logic Array (PLA).

Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processing unit of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processing unit of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims

1. A multi-die interconnect based integrated circuit comprising:

a substrate comprising an interconnect structure;

a first die comprising a control module, a communication module, a storage module, and D2D interconnect components, and configured to control or schedule a second die; and

one or more second dies comprising a D2D interconnect component and an IP core having a particular circuit function, and configured to:

communicate with the first die via the D2D interconnect component of the second die and a D2D interconnect component of the first die; and

receive data from the first die, perform operations of the IP core based on the particular circuit function, and send results of the operations to the first die,

wherein the first die and the second die are packaged on the substrate via the interconnect structure.
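
By way of non-limiting illustration, the division of roles recited in claim 1 can be modelled in software. The sketch below is only a behavioural analogy: the class names `FirstDie` and `SecondDie`, the method names, and the use of a Python callable to stand in for an IP core are all assumptions made for the example and do not appear in the claims.

```python
from typing import Callable, Dict, List


class SecondDie:
    """Holds one IP core, modelled here (as an assumption) by a plain callable."""

    def __init__(self, ip_core: Callable[[List[int]], int]) -> None:
        self.ip_core = ip_core

    def run(self, data: List[int]) -> int:
        # Perform the operation of the IP core on the received data.
        return self.ip_core(data)


class FirstDie:
    """Schedules work onto second dies and collects their results."""

    def __init__(self, second_dies: Dict[str, SecondDie]) -> None:
        self.second_dies = second_dies
        self.results: Dict[str, int] = {}

    def dispatch(self, name: str, data: List[int]) -> int:
        # The method call stands in for a transfer over the D2D interconnect.
        result = self.second_dies[name].run(data)
        self.results[name] = result
        return result
```

In this model, replacing the callable passed to `SecondDie` changes the circuit function without touching `FirstDie`, which mirrors the reuse of the first die described in claim 8.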

2. The integrated circuit of claim 1, wherein the D2D interconnect components include an outbound data path module, an on-chip data path module, an interconnect data path module, and a D2D interface, wherein when the first die receives incoming data from a device external to the integrated circuit, the D2D interconnect components are configured to:

parse the destination operating address of the data to determine the target die to which the data is to be transmitted;

in response to the data being transmitted to the first die, translate the destination operating address to an on-chip address of the first die, thereby saving the data on the first die; and

in response to the data being transmitted to the second die, translate the destination operating address to an on-chip address of the second die, thereby forwarding the data to the second die.
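
As a minimal sketch of the three steps of claim 2, assume that the destination operating address is a flat integer whose upper bits identify the target die and whose lower bits give the on-chip address. This layout, the constants `DIE_ID_SHIFT` and `FIRST_DIE_ID`, and the function names are hypothetical; the claims do not specify an address format.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Hypothetical layout: the claims do not define an address format.
DIE_ID_SHIFT = 32                       # assumed: upper bits select the target die
OFFSET_MASK = (1 << DIE_ID_SHIFT) - 1   # assumed: lower bits are the on-chip address
FIRST_DIE_ID = 0                        # assumed identifier of the first die


@dataclass
class Die:
    die_id: int
    memory: Dict[int, bytes] = field(default_factory=dict)  # on-chip address -> data


def parse_destination(dest_addr: int) -> Tuple[int, int]:
    """Split a destination operating address into (target die, on-chip address)."""
    return dest_addr >> DIE_ID_SHIFT, dest_addr & OFFSET_MASK


def handle_inbound(first_die: Die, second_dies: Dict[int, Die],
                   dest_addr: int, data: bytes) -> str:
    """Save inbound data on the first die or forward it to a second die."""
    target, on_chip = parse_destination(dest_addr)
    if target == FIRST_DIE_ID:
        first_die.memory[on_chip] = data
        return "saved"
    # Writing into the second die's dictionary stands in for a D2D transfer.
    second_dies[target].memory[on_chip] = data
    return "forwarded"
```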

3. The integrated circuit of claim 1, wherein the D2D interconnect component includes an outbound data path module, an on-chip data path module, an interconnect data path module, and a D2D interface, wherein when the first die sends the stored data, the D2D interconnect component is configured to:

obtain the data from the storage module of the first die based on a source address of the data;

resolve a destination operating address of the data, thereby determining the second die whose IP core is to receive the data; and

forward the data to the second die based on the determined destination operating address of the second die.
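
The send path of claim 3 can be sketched in the same spirit. The example assumes, purely for illustration, that the storage module is a dictionary keyed by source address, that each D2D link is a dictionary keyed by on-chip address, and that the upper bits of the destination operating address name the target second die. None of these details come from the claims.

```python
from typing import Dict, Tuple

DIE_ID_SHIFT = 32                       # assumed address layout, not taken from the claims
OFFSET_MASK = (1 << DIE_ID_SHIFT) - 1


def send_stored_data(storage: Dict[int, bytes],
                     d2d_links: Dict[int, Dict[int, bytes]],
                     source_addr: int,
                     dest_addr: int) -> Tuple[int, int]:
    """Read data at source_addr and forward it to the second die named by dest_addr."""
    data = storage[source_addr]              # obtain the data from the storage module
    target_die = dest_addr >> DIE_ID_SHIFT   # resolve which second die receives it
    on_chip = dest_addr & OFFSET_MASK
    d2d_links[target_die][on_chip] = data    # dictionary write models the D2D transfer
    return target_die, on_chip
```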

4. The integrated circuit of claim 1, wherein the D2D interconnect components include an outbound data path module, an on-chip data path module, an interconnect data path module, and a D2D interface, wherein when the second die reads data in the first die, the D2D interconnect components are configured to:

send an on-chip address of the data to the second die, so that the second die reads the data in the first die.

5. The integrated circuit of any of claims 1-4, wherein the control module of the first die is configured to implement an IP core having die control functionality, the communication module of the first die is configured to implement an IP core having data interaction functionality with devices outside the integrated circuit, the communication module of the first die is configured to implement an IP core having data interaction functionality with other dies within the integrated circuit, and the storage module of the first die is configured to implement an IP core having data interaction functionality with memory or storage particles.

6. The integrated circuit according to any of claims 1-4, wherein the D2D interface is any one of a low speed parallel interface, a medium speed serial interface, and a high speed serial interface.

7. The integrated circuit of any of claims 1-4, wherein the first die and the second die being packaged on the substrate via the interconnect structure comprises:

the first die and the one or more second dies are packaged on the substrate by substrate interconnect, a redistribution layer (RDL), or interposer interconnect.

8. The integrated circuit of any of claims 1-4, wherein the second die is developed based on different iterations of the IP core, and the first die is reused after the second die is iteratively developed.

9. A multi-die interconnect based integrated circuit comprising:

a substrate comprising an interconnect structure;

a first die comprising a control module, a communication module, a storage module, and D2D interconnect components, and configured to control or schedule a second die and a third die;

one or more second dies comprising a D2D interconnect component and an IP core having a particular circuit function, and configured to:

communicate with the first die via the D2D interconnect component of the second die and a D2D interconnect component of the first die, or communicate with a third die via the D2D interconnect component of the second die and a D2D interconnect component of the third die;

receive data from the first die or the third die, perform an operation of the IP core based on the particular circuit function, and send a result of the operation to the first die or the third die; and

one or more third dies comprising a control module, a communication module, a storage module, and D2D interconnect components, and configured to:

communicate with the first die, with other third dies, or with any combination of the first die and other third dies, via the D2D interconnect components;

receive data from the first die or another third die, send the received data to a second die interconnected with the third die, receive a result of an operation of the second die, and send the result of the operation to the first die or the other third die,

wherein the first die, second die, and third die are packaged on the substrate via the interconnect structure.

10. The integrated circuit of claim 9, wherein the D2D interconnect components of the first die and the third die include an outbound data path module, an on-chip data path module, an interconnect data path module, and a D2D interface, wherein when the first die receives data incoming from a device external to the integrated circuit, the D2D interconnect components are configured to:

in response to the data being transmitted to the first die, translate the destination operating address to an on-chip address of the first die, thereby saving the data on the target first die;

in response to the data being transmitted to the second die, determine a transit die interconnected with the second die;

in response to the second die being interconnected with the first die, translate the destination operating address to an on-chip address of the second die, thereby forwarding the data to the target second die;

in response to the second die being interconnected with the third die, translate the destination operating address to an on-chip address of the third die, thereby forwarding the data to the transit third die;

in response to the data being transmitted to the third die, determine a transit die interconnected with the third die;

in response to the third die being interconnected with the first die, translate the destination operating address to an on-chip address of the third die, thereby forwarding the data to the target third die; and

in response to the third die being interconnected with another third die, translate the destination operating address to an on-chip address of the other third die serving as the transit die, thereby forwarding the data to the target third die.
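
The transit-die selection of claim 10 can be illustrated as a path search over the D2D links. The claims do not state how the transit die is determined; the breadth-first search below is one assumed strategy, and the die names (`F`, `T1`, `T2`, `S1`, `S2`) and the `LINKS` topology are hypothetical. Only third dies are allowed to relay, reflecting that second dies in claim 9 communicate only with the first die or a third die.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def route(links: Dict[str, Set[str]], transit_dies: Set[str],
          source: str, target: str) -> List[str]:
    """Return the die sequence from source to target using the fewest hops."""
    previous: Dict[str, Optional[str]] = {source: None}
    queue = deque([source])
    while queue:
        die = queue.popleft()
        if die == target:
            path = []
            step: Optional[str] = die
            while step is not None:        # walk back to the source
                path.append(step)
                step = previous[step]
            return path[::-1]
        if die != source and die not in transit_dies:
            continue                       # second dies never relay data
        for neighbour in sorted(links.get(die, ())):
            if neighbour not in previous:
                previous[neighbour] = die
                queue.append(neighbour)
    raise ValueError(f"no D2D path from {source} to {target}")


# Hypothetical topology: first die F, third dies T1 and T2, second dies S1 and S2.
LINKS = {
    "F": {"S1", "T1"},
    "T1": {"F", "T2"},
    "T2": {"T1", "S2"},
    "S1": {"F"},
    "S2": {"T2"},
}
```

With this topology, data for `S1` is forwarded directly by the first die, while data for `S2` passes through the transit dies `T1` and `T2`.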

11. The integrated circuit of claim 9, wherein the D2D interconnect components of the first die and third die include an outbound data path module, an on-chip data path module, an interconnect data path module, and a D2D interface, wherein when the first die sends data stored in the first die, the D2D interconnect components are configured to:

parse a destination operating address of the data to be transmitted to determine the target die of the data;

in response to the data being transmitted to the second die, determine a transit die interconnected with the second die;

in response to the second die being interconnected with the third die, translate the destination operating address to an on-chip address of the third die, thereby forwarding the data to the third die;

12. The integrated circuit of claim 9, wherein the D2D interconnect components of the first die and third die include an outbound data path module, an on-chip data path module, an interconnect data path module, and a D2D interface, wherein when the second die or third die reads data in the first die or third die, the D2D interconnect components are configured to:

determine the die on which the data to be read is stored based on the source address of the data;

in response to the data being stored on the first die, determine whether the second die or the third die is interconnected with the first die;

in response to the second die or the third die being interconnected with the first die, send an on-chip address of the data to the second die or the third die;

in response to the second die or the third die not being interconnected with the first die, obtain a transit third die between the second die or the third die and the first die, and send the on-chip address of the data to the transit third die, thereby sending the on-chip address of the data to the second die or the third die via the transit third die;

in response to the data being stored on a third die, determine whether the second die or the third die reading the data is interconnected with the third die on which the data is stored;

in response to the second die or the third die being interconnected with the third die on which the data is stored, send an on-chip address of the data to the second die or the third die; and

in response to the second die or the third die not being interconnected with the third die on which the data is stored, obtain a transit third die between the second die or the third die and that third die, and send the on-chip address of the data to the transit third die, thereby sending the on-chip address of the data to the second die or the third die via the transit third die.
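
The read path of claim 12 can be sketched as follows. The example assumes that each die owns one contiguous range of source addresses and that at most one transit third die lies between the reading die and the die storing the data. The names `ADDRESS_MAP`, `LINKS`, `THIRD_DIES`, `locate`, and `address_hops` are hypothetical and serve only to make the sequence of decisions concrete.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical address map: each die owns one contiguous range of source addresses.
ADDRESS_MAP: Dict[str, range] = {
    "F": range(0x0000, 0x1000),    # first die
    "T1": range(0x1000, 0x2000),   # a third die
}
# Hypothetical D2D links between dies.
LINKS: Dict[str, Set[str]] = {
    "F": {"S1", "T1"},
    "T1": {"F", "S2"},
    "S1": {"F"},
    "S2": {"T1"},
}
THIRD_DIES: Set[str] = {"T1"}


def locate(source_addr: int) -> Tuple[str, int]:
    """Return (die storing the data, on-chip address) for a source address."""
    for die, span in ADDRESS_MAP.items():
        if source_addr in span:
            return die, source_addr - span.start
    raise LookupError(f"address {source_addr:#x} is not mapped")


def address_hops(requester: str, source_addr: int) -> List[Tuple[str, str, int]]:
    """List the (sender, receiver, on-chip address) messages that reach the requester."""
    owner, on_chip = locate(source_addr)
    if requester in LINKS[owner]:                  # directly interconnected
        return [(owner, requester, on_chip)]
    for transit in sorted(THIRD_DIES - {owner}):   # otherwise look for a transit third die
        if transit in LINKS[owner] and requester in LINKS[transit]:
            return [(owner, transit, on_chip), (transit, requester, on_chip)]
    raise LookupError(f"no transit die between {owner} and {requester}")
```

A reader that is directly linked to the storing die receives the on-chip address in one message; otherwise the address is relayed once through the transit third die.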

13. The integrated circuit of any of claims 9-12, wherein the third die is further configured to store data in a distributed manner together with the first die and/or other third dies.

14. The integrated circuit of claim 13, wherein the control module of the third die is configured to implement an IP core having a die control function, the communication module of the third die is configured to implement an IP core having a data interaction function with devices outside the integrated circuit, the communication module of the third die is configured to implement an IP core having a data interaction function with other dies within the integrated circuit, and the storage module of the third die is configured to implement an IP core having a data interaction function with memory or storage particles.

15. The integrated circuit of claim 13, wherein the first die, the second die, and the third die being packaged on the substrate via the interconnect structure comprises:

16. The integrated circuit of claim 13, wherein the second die is iteratively developed based on different IP cores, and the first die and the third die are reused after the iterative development of the second die.