CN113254390B - Reconfigurable computing structure, computing method and hardware architecture - Google Patents

Reconfigurable computing structure, computing method and hardware architecture

Info

Publication number
CN113254390B
Authority
CN
China
Prior art keywords
algorithm
semiconductor structure
reconfigurable
array
operator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110640708.3A
Other languages
Chinese (zh)
Other versions
CN113254390A (en
Inventor
尚会滨
杨施洋
陈巍
江博
耿云川
李冰倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qianxin Semiconductor Technology Beijing Co ltd
Original Assignee
Qianxin Semiconductor Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qianxin Semiconductor Technology Beijing Co ltd
Priority to CN202110640708.3A
Publication of CN113254390A
Application granted
Publication of CN113254390B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7867 Architectures of general purpose stored program computers comprising a single central processing unit with reconfigurable architecture
    • G06F15/7871 Reconfiguration support, e.g. configuration loading, configuration switching, or hardware OS
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)
  • Logic Circuits (AREA)

Abstract

The invention provides a reconfigurable computing structure, a computing method and a hardware architecture, wherein the computing structure comprises: a first semiconductor structure and a second semiconductor structure; the first semiconductor structure is connected with the second semiconductor structure, the first semiconductor structure comprises a memory and a logic circuit, the second semiconductor structure comprises a reconfigurable array, and the reconfigurable array comprises a plurality of reconfigurable operators; the logic circuit is used for acquiring an operation algorithm to be operated from the memory, configuring the reconfigurable operators in the reconfigurable array based on the operation algorithm, and determining the operation array corresponding to the operation algorithm so that the second semiconductor structure executes the operation algorithm based on the operation array. According to the invention, corresponding reconfigurable operators are flexibly configured according to operation algorithms under different scenes, and meanwhile, the data transfer distance between the memory and the logic circuit is reduced, so that the power consumption and the time delay are reduced, the plane process limitation is broken through, and the storage space of the computational logic resource is increased.

Description

Reconfigurable computing structure, computing method and hardware architecture
Technical Field
The invention relates to the technical field of computers, in particular to a reconfigurable computing structure, a computing method and a hardware architecture.
Background
Currently, many applications require complex operations, for example, a multimedia application may include sub-tasks such as data parallel processing, bit processing, irregular computation, high-precision word operation, operation with real-time requirement, etc., and the processing system is required to flexibly process the sub-tasks.
However, the conventional computing structure adopts fixed hardware resources for computing, and cannot flexibly configure corresponding hardware resources according to different computing tasks. In addition, in the conventional scheme, the memory and the logic circuit are separately arranged on two chips, so that a data carrying path from the memory to the logic circuit is longer, and power consumption and time delay are increased.
Disclosure of Invention
The invention provides a reconfigurable computing structure, a computing method and a hardware architecture, which are used for solving the defects that hardware resources cannot be flexibly configured according to different computing tasks, the data carrying distance from a memory to a logic circuit is long, and the power consumption and the time delay are increased in the prior art.
The invention provides a reconfigurable computing structure comprising:
a first semiconductor structure and a second semiconductor structure;
the first semiconductor structure is connected to the second semiconductor structure, the first semiconductor structure including a memory and a logic circuit, the second semiconductor structure including a reconfigurable array, the reconfigurable array including a plurality of reconfigurable operators;
the logic circuit is used for acquiring an operation algorithm to be operated from the memory, configuring a reconfigurable operator in the reconfigurable array based on the operation algorithm, and determining an operation array corresponding to the operation algorithm, so that the second semiconductor structure executes the operation algorithm based on the operation array.
According to the reconfigurable computing structure provided by the invention, the second semiconductor structure further comprises a basic computing array, and the basic computing array is a pre-configured fixed array.
According to the reconfigurable computing structure provided by the invention, the operation algorithm comprises a first operation sub-algorithm and a second operation sub-algorithm, and the complexity of the first operation sub-algorithm is higher than that of the second operation sub-algorithm;
the reconfigurable array is used for executing the first operation sub-algorithm, and the basic computing array is used for executing the second operation sub-algorithm.
According to the reconfigurable computing structure provided by the invention, the first semiconductor structure and the second semiconductor structure are connected after bonding processing is carried out on the first bonding layer and the second bonding layer; the first bonding layer is arranged on the first semiconductor structure, and the second bonding layer is arranged on the second semiconductor structure.
According to the reconfigurable computing structure provided by the invention, the first semiconductor structure further comprises an interface circuit, which is used for acquiring the operation algorithm from an external data stream so that the memory stores the operation algorithm, and for outputting an operation result of the operation algorithm.
According to the reconfigurable computing structure provided by the invention, the interface circuit includes an interface for communication.
The invention also provides a computing method based on the reconfigurable computing structure, which comprises the following steps:
acquiring the operation algorithm and operation data corresponding to the operation algorithm;
and inputting the operation data into the first semiconductor structure, and executing the operation algorithm through the second semiconductor structure.
According to the computing method based on the reconfigurable computing structure, the operation algorithm comprises a first operation sub-algorithm and a second operation sub-algorithm, and the complexity of the first operation sub-algorithm is higher than that of the second operation sub-algorithm; the second semiconductor structure further comprises a basic computing array, the basic computing array being a pre-configured fixed array;
the inputting the operation data into the first semiconductor structure and the executing the operation algorithm through the second semiconductor structure comprise:
inputting the operation data into the first semiconductor structure, executing the first operation sub-algorithm through the reconfigurable array, and executing the second operation sub-algorithm through the basic computing array.
According to the computing method based on the reconfigurable computing structure provided by the invention, after the operation algorithm is executed through the second semiconductor structure, the method further comprises the following steps:
and inputting the operation result of the operation algorithm to the first semiconductor structure, and outputting the operation result through an interface circuit of the first semiconductor structure.
The invention also provides a hardware architecture based on the reconfigurable computing structure, the hardware architecture comprising the reconfigurable computing structure described above.
According to the reconfigurable computing structure, the computing method and the hardware architecture provided by the invention, the logic circuit configures the reconfigurable operators in the reconfigurable array and determines the operation array corresponding to the operation algorithm, so that the corresponding reconfigurable operators can be flexibly configured according to the operation algorithms under different scenes, and the second semiconductor structure executes the operation algorithm based on the corresponding operation array. Meanwhile, the memory and the logic circuit are both arranged on the first semiconductor structure, so that the data carrying distance between the memory and the logic circuit can be effectively reduced, and the power consumption and the time delay are reduced. In addition, the memory, the logic circuit and the reconfigurable array are respectively arranged in the two semiconductor structures, so that the plane process limitation is broken through, and the storage space of the computational logic resource is increased.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a first schematic diagram of the reconfigurable computing structure provided by the present invention;
FIG. 2 is a second schematic diagram of the reconfigurable computing structure provided by the present invention;
FIG. 3 is a flow chart of a computing method based on a reconfigurable computing structure provided by the present invention;
FIG. 4 is a flow chart of executing the operation algorithm in the second semiconductor structure provided by the present invention;
Reference numerals:
110: a first semiconductor structure; 120: a second semiconductor structure; 111: a memory; 112: a logic circuit;
113: a first bonding layer; 114: an interface circuit; 121: a reconfigurable array; 122: a basic computing array;
123: a second bonding layer.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In view of this, the present invention provides a reconfigurable computing architecture. Fig. 1 is a schematic structural diagram of a reconfigurable computing structure provided by the present invention, and as shown in fig. 1, the computing structure includes: a first semiconductor structure 110 and a second semiconductor structure 120.
The first semiconductor structure 110 is connected to the second semiconductor structure 120. The first semiconductor structure 110 comprises a memory 111 and a logic circuit 112; the second semiconductor structure 120 comprises a reconfigurable array 121, which comprises a plurality of reconfigurable operators. The first semiconductor structure 110 and the second semiconductor structure 120 may be connected through a semiconductor process such as bonding, so that the operation data in the first semiconductor structure 110 can be transmitted to the second semiconductor structure 120 to complete the corresponding operation.
The memory 111 on the first semiconductor structure 110 may obtain the operation algorithm to be executed from an external data stream through the interface, and store the operation algorithm. The logic circuit 112 is configured to obtain the operation algorithm from the memory 111, configure the reconfigurable operators in the reconfigurable array 121 based on the operation algorithm, and determine the operation array corresponding to the operation algorithm, so that the second semiconductor structure 120 executes the operation algorithm based on the operation array.
It can be understood that, since the reconfigurable array 121 has a plurality of reconfigurable operators, the logic circuit 112 can obtain different operation arrays by configuring the reconfigurable operators, and each operation array can implement a different operator function. For example, when a function expression is computed, the corresponding reconfigurable operators can be flexibly configured according to the weights, coefficients and the like of different scenes to complete the corresponding operation algorithm.
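The configuration step described above can be illustrated with a minimal software sketch. It is not the patented hardware: the class names, the `configure_linear` helper and the choice of a linear function y = weight * x + bias are all assumptions made for illustration. It only shows how one pool of operators can be reconfigured into different operation arrays when the weights change between scenes.

```python
from typing import Callable, List


class ReconfigurableOperator:
    """Hypothetical operator whose function is set at configuration time."""

    def __init__(self) -> None:
        self.func: Callable[[float], float] = lambda x: x  # identity until configured

    def configure(self, func: Callable[[float], float]) -> None:
        self.func = func

    def run(self, x: float) -> float:
        return self.func(x)


class ReconfigurableArray:
    """Hypothetical pool of reconfigurable operators (stands in for array 121)."""

    def __init__(self, size: int) -> None:
        self.operators: List[ReconfigurableOperator] = [
            ReconfigurableOperator() for _ in range(size)
        ]

    def configure(self, funcs: List[Callable[[float], float]]) -> List[ReconfigurableOperator]:
        """Configure the first len(funcs) operators and return them as the operation array."""
        if len(funcs) > len(self.operators):
            raise ValueError("not enough reconfigurable operators")
        for op, func in zip(self.operators, funcs):
            op.configure(func)
        return self.operators[: len(funcs)]


def configure_linear(array: ReconfigurableArray, weight: float, bias: float) -> List[ReconfigurableOperator]:
    """Plays the logic-circuit role: map y = weight * x + bias onto two operators."""
    return array.configure([lambda x: weight * x, lambda x: x + bias])


def execute(operation_array: List[ReconfigurableOperator], x: float) -> float:
    """Run the input through the operation array, one operator after another."""
    for op in operation_array:
        x = op.run(x)
    return x


array = ReconfigurableArray(size=4)
scene_a = execute(configure_linear(array, weight=2.0, bias=1.0), 3.0)   # 2*3 + 1 = 7.0
scene_b = execute(configure_linear(array, weight=0.5, bias=-1.0), 4.0)  # 0.5*4 - 1 = 1.0
```

The same four operators serve both scenes; only the configuration differs, which is the property the passage attributes to the reconfigurable array.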
Compared with an Application Specific Integrated Circuit (ASIC) which cannot process other tasks except for a specific calculation task, the reconfigurable calculation structure provided by the embodiment of the invention not only can flexibly execute corresponding calculation algorithms according to different scenes, but also can ensure higher performance. For example, a multimedia application may include sub-tasks such as data parallel processing, bit processing, irregular computation, high-precision word operation, and operation with real-time requirement, and the processing system is required to flexibly process the sub-tasks and achieve certain performance. Many other applications also have similar requirements, such as data encryption, artificial intelligence and the like, and the computing structure provided by the embodiment of the invention can flexibly configure corresponding reconfigurable operators according to computing tasks in different scenes to complete corresponding operation algorithms.
In addition, compared with the conventional method in which the memory and the logic circuit are separately arranged on two chips, in the embodiment of the present invention, the memory 111 and the logic circuit 112 are both arranged on the first semiconductor structure 110, so that the data transfer distance between the memory 111 and the logic circuit 112 can be effectively reduced, the power consumption and the time delay are reduced, and the efficiency of executing the operation algorithm is improved.
It should be noted that the first semiconductor structure 110 and the second semiconductor structure 120 may be stacked vertically, so that the data transfer distance from the memory 111 to the reconfigurable array 121 is greatly shortened, which reduces the data transfer delay and, in turn, the power consumption. In addition, with the two structures stacked vertically, the area they occupy is equivalent to the area of one semiconductor structure; that is, the footprint of the computing structure is reduced, so the computing structure can be integrated in a smaller chip and the manufacturing cost of the chip is reduced. Meanwhile, because the memory, the logic circuit and the reconfigurable array are distributed across the two semiconductor structures, the storage space of the computational logic resource is increased.
Therefore, in the reconfigurable computing structure provided by the embodiment of the invention, the logic circuit configures the reconfigurable operators in the reconfigurable array and determines the operation array corresponding to the operation algorithm, so that the corresponding reconfigurable operators can be flexibly configured according to the operation algorithms under different scenes, and the second semiconductor structure executes the operation algorithm based on the corresponding operation array. Meanwhile, the memory and the logic circuit are both arranged on the first semiconductor structure, so that the data carrying distance between the memory and the logic circuit can be effectively reduced, and the power consumption and the time delay are reduced. In addition, the memory, the logic circuit and the reconfigurable array are respectively arranged in the two semiconductor structures, so that the plane process limitation is broken through, and the storage space of the computational logic resource is increased.
Based on the above embodiment, the second semiconductor structure 120 further includes the basic computation array 122, and the basic computation array 122 is a pre-configured fixed array.
Specifically, the basic computing array 122 is a pre-configured fixed array that can execute fixed operation algorithms (e.g., basic addition and multiplication).
As shown in FIG. 2, a basic computing array 122 is further disposed on the second semiconductor structure 120. When the operation algorithm does not need its weights or coefficients adjusted for different scenes, that is, when the calculation formula corresponding to the operation algorithm is a fixed basic operation such as addition or multiplication, the operation algorithm can be executed by the basic computing array 122 without the logic circuit 112 configuring the reconfigurable operators, which reduces the logic computation load of the logic circuit 112.
It should be noted that the first semiconductor structure 110 and the second semiconductor structure 120 may be stacked up and down, so that the data transfer distance from the memory 111 to the reconfigurable array 121 and the basic calculation array 122 may be greatly shortened, the data transfer delay from the memory 111 to the reconfigurable array 121 and the basic calculation array 122 may be reduced, and further, the power consumption and the delay may be reduced.
Based on any one of the above embodiments, the operation algorithm comprises a first operation sub-algorithm and a second operation sub-algorithm, wherein the complexity of the first operation sub-algorithm is higher than that of the second operation sub-algorithm;
the reconfigurable array 121 is used to execute the first operation sub-algorithm, and the basic computing array 122 is used to execute the second operation sub-algorithm.
Specifically, the operation algorithm comprises a first operation sub-algorithm and a second operation sub-algorithm, and the complexity of the first operation sub-algorithm is higher than that of the second operation sub-algorithm. For example, the first operation sub-algorithm may be a function operation whose weights or coefficients in the corresponding calculation formula change with the scene, whereas the second operation sub-algorithm may be a basic addition or subtraction whose calculation formula is fixed.
On this basis, as a preferred embodiment, the reconfigurable array 121 is used to execute the first operation sub-algorithm; that is, the logic circuit 112 may configure the reconfigurable operators in the reconfigurable array 121 according to different scenes and determine an operation array matched with the first operation sub-algorithm, so that the first operation sub-algorithm is executed flexibly across scenes. Meanwhile, since the calculation formula of the second operation sub-algorithm is fixed across scenes, the hardware resources it requires are also fixed, and therefore the second operation sub-algorithm can be executed by the basic computing array 122.
It should be noted that the reconfigurable array 121 and the basic computing array 122 execute their sub-algorithms in the order in which the first and second operation sub-algorithms appear in the operation algorithm. For example, if the order of the sub-algorithms in the operation algorithm is "first operation sub-algorithm → second operation sub-algorithm", the arrays of the second semiconductor structure 120 execute in the order "reconfigurable array → basic computing array"; if the order is "second operation sub-algorithm → first operation sub-algorithm", the arrays execute in the order "basic computing array → reconfigurable array".
Therefore, the reconfigurable array 121 and the basic computing array 122 each execute the sub-algorithm suited to them: an operation array is flexibly configured for the first operation sub-algorithm, which has higher complexity, while the second operation sub-algorithm, which has lower complexity, is executed by the basic computing array. The logic circuit 112 therefore does not need to configure reconfigurable operators for the second operation sub-algorithm, which reduces its logic computation load.
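The ordering rule for the two sub-algorithms can be sketched in software as follows. This is an illustrative model only: the step encoding (`"first"` / `"second"`), the `BASIC_OPS` table and the use of a polynomial with scene-dependent coefficients as the higher-complexity sub-algorithm are assumptions, not details given in the patent.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical fixed operations offered by the basic computing array.
BASIC_OPS: Dict[str, Callable[[float, float], float]] = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def configure_polynomial(coefficients: Sequence[float]) -> Callable[[float], float]:
    """Stand-in for configuring the reconfigurable array with scene-specific coefficients.

    Coefficients are given highest degree first and evaluated with Horner's rule.
    """
    def evaluate(x: float) -> float:
        acc = 0.0
        for c in coefficients:
            acc = acc * x + c
        return acc
    return evaluate


def run_operation_algorithm(steps: List[Tuple[str, object]], x: float) -> Tuple[float, List[str]]:
    """Execute sub-algorithms in their listed order and record which array ran each one."""
    order: List[str] = []
    for kind, payload in steps:
        if kind == "first":      # higher complexity: needs configuration
            x = configure_polynomial(payload)(x)
            order.append("reconfigurable array")
        elif kind == "second":   # lower complexity: fixed formula, no configuration
            name, operand = payload
            x = BASIC_OPS[name](x, operand)
            order.append("basic computing array")
        else:
            raise ValueError(f"unknown sub-algorithm kind: {kind}")
    return x, order


# "first -> second": (2x + 1) followed by + 1, with x = 3
result_a, order_a = run_operation_algorithm(
    [("first", [2.0, 1.0]), ("second", ("add", 1.0))], 3.0
)
# "second -> first": + 1 followed by (2x + 1), with x = 3
result_b, order_b = run_operation_algorithm(
    [("second", ("add", 1.0)), ("first", [2.0, 1.0])], 3.0
)
```

Here `order_a` is `["reconfigurable array", "basic computing array"]` and `order_b` is the reverse, mirroring the two orderings in the text. Only the `"first"` branch involves a configuration call, which corresponds to the stated saving in logic-circuit work.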
According to any of the above embodiments, the first semiconductor structure 110 and the second semiconductor structure 120 are connected by performing a bonding process on the first bonding layer 113 and the second bonding layer 123; the first bonding layer 113 is disposed on the first semiconductor structure 110, and the second bonding layer 123 is disposed on the second semiconductor structure 120.
Specifically, as shown in fig. 1, a first bonding layer 113 is disposed on the first semiconductor structure 110, a second bonding layer 123 is disposed on the second semiconductor structure 120, and the first bonding layer 113 and the second bonding layer 123 are connected by bonding, so that data interaction between the first semiconductor structure 110 and the second semiconductor structure 120 can be realized, for example, the first semiconductor structure 110 can transmit operation data corresponding to an operation algorithm to the second semiconductor structure 120, and the second semiconductor structure 120 can transmit an operation result corresponding to the operation algorithm to the first semiconductor structure 110.
Bonding here refers to directly joining, under certain conditions, two homogeneous or heterogeneous semiconductor materials whose surfaces are clean and atomically flat, after surface cleaning and activation treatment, so that the wafers are bonded into a whole through van der Waals forces, molecular forces, or even atomic forces. For example, the embodiment of the present invention may adopt a semiconductor direct bonding technology to bond the first semiconductor structure 110 and the second semiconductor structure 120, which not only realizes data interaction between the first semiconductor structure 110 and the second semiconductor structure 120, but also allows the two structures to be integrated into a smaller chip, saving chip manufacturing cost.
According to any of the above embodiments, the first semiconductor structure 110 further includes an interface circuit 114, and the interface circuit 114 is configured to obtain the operation algorithm from the external data stream, to make the memory 111 store the operation algorithm, and to output the operation result of the operation algorithm.
Specifically, as shown in fig. 1, the first semiconductor structure 110 further includes an interface circuit 114, and the interface circuit 114 may obtain the operation algorithm from the external data stream and store the operation algorithm in the memory 111, so that the logic circuit 112 may obtain the operation algorithm from the memory 111.
In addition, after the second semiconductor structure 120 finishes executing the operation algorithm, the corresponding operation result may be input to the interface circuit 114, so that the interface circuit 114 can output it.
It can be seen that, in the embodiment of the present invention, the interface circuit 114 is disposed on the first semiconductor structure 110, so that data interaction between the reconfigurable computing structure and an external data stream can be realized.
The interface circuit 114 comprises an interface for communication according to any of the above embodiments.
Specifically, the interface circuit 114 includes an interface for communication, through which data interaction with an external data stream can be achieved. The interface may be an Ethernet interface or an interface of another communication protocol, which is not specifically limited in this embodiment of the present invention.
For example, the interface circuit 114 may obtain the operation algorithm from the external data stream through the interface, the memory 111 may store the operation algorithm, and the logic circuit 112 may obtain the operation algorithm from the memory 111 and determine the operation array based on it, so that the second semiconductor structure 120 can execute the operation algorithm. In addition, the second semiconductor structure 120 may transmit the operation result corresponding to the operation algorithm to the interface circuit 114, so that the interface circuit 114 can output the operation result through the interface.
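The data path in this example (interface, memory, logic circuit, second semiconductor structure, and back to the interface) can be sketched as a small software model. All class names, the `OPERATOR_LIBRARY` table, and the representation of an operation algorithm as a list of operator names are assumptions made for illustration; the patent does not specify any of them.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

# Hypothetical catalogue of operator functions the logic circuit can select from.
OPERATOR_LIBRARY: Dict[str, Callable[[float], float]] = {
    "double": lambda x: 2 * x,
    "increment": lambda x: x + 1,
    "square": lambda x: x * x,
}


class Memory:
    """Stands in for memory 111: stores the operation algorithm and operation data."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def write(self, key: str, value: Any) -> None:
        self._store[key] = value

    def read(self, key: str) -> Any:
        return self._store[key]


class InterfaceCircuit:
    """Stands in for interface circuit 114: moves data between the outside and the memory."""

    def __init__(self, memory: Memory) -> None:
        self.memory = memory
        self.outbox: List[float] = []  # results sent to the outside

    def receive(self, external_stream: Iterable[Tuple[str, Any]]) -> None:
        for key, value in external_stream:
            self.memory.write(key, value)

    def send(self, result: float) -> None:
        self.outbox.append(result)


class LogicCircuit:
    """Stands in for logic circuit 112: reads the algorithm and determines the operation array."""

    def __init__(self, memory: Memory) -> None:
        self.memory = memory

    def build_operation_array(self) -> List[Callable[[float], float]]:
        return [OPERATOR_LIBRARY[name] for name in self.memory.read("algorithm")]


def second_structure_execute(operation_array: List[Callable[[float], float]], data: float) -> float:
    """Stands in for the second semiconductor structure executing the operation array."""
    for op in operation_array:
        data = op(data)
    return data


memory = Memory()
interface = InterfaceCircuit(memory)
interface.receive([("algorithm", ["double", "increment"]), ("data", 5)])
operation_array = LogicCircuit(memory).build_operation_array()
interface.send(second_structure_execute(operation_array, memory.read("data")))
# interface.outbox is now [11]: (5 * 2) + 1
```

In this model the interface is the only component that touches the external stream, which matches the role the text assigns to the interface circuit.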
Based on any one of the above embodiments, the present invention provides a computing method based on the reconfigurable computing structure according to any one of the above embodiments, the method including:
acquiring an operation algorithm and operation data corresponding to the operation algorithm;
inputting the operation data into the first semiconductor structure, and executing the operation algorithm through the second semiconductor structure.
Specifically, after the operation algorithm and the operation data are obtained, the operation data are input to the first semiconductor structure, and the memory in the first semiconductor structure stores the operation algorithm and the operation data, so that the logic circuit in the first semiconductor structure can obtain the operation algorithm from the memory. The memory can obtain the operation algorithm to be executed from the external data stream through the interface and store it.
After the logic circuit obtains the operation algorithm, it configures the reconfigurable operators in the reconfigurable array based on the operation algorithm and determines the operation array corresponding to the operation algorithm, so that the second semiconductor structure executes the operation algorithm based on the operation array. It is understood that the first semiconductor structure and the second semiconductor structure may be connected through a semiconductor process such as bonding, so that the operation data in the first semiconductor structure can be transmitted to the second semiconductor structure to complete the corresponding operation.
Because the reconfigurable array has a plurality of reconfigurable operators, the logic circuit can obtain different operation arrays by configuring the reconfigurable operators, and each operation array can implement a different operator function. For example, when a function expression is computed, the corresponding reconfigurable operators can be flexibly configured according to the weights, coefficients and the like of different scenes to complete the corresponding operation algorithm.
In addition, compared with computing on a conventional computing structure, in the embodiment of the invention the memory and the logic circuit are both arranged on the first semiconductor structure, so that the data transfer distance between the memory and the logic circuit can be effectively reduced, the power consumption and the time delay are reduced, and the efficiency of executing the operation algorithm is improved.
In addition, the first semiconductor structure and the second semiconductor structure can be arranged in an up-and-down overlapping mode, so that the data conveying distance from the memory to the reconfigurable array can be greatly shortened, the data conveying delay from the memory to the reconfigurable array is reduced, and further the power consumption and the time delay are reduced. In addition, the first semiconductor structure and the second semiconductor structure are arranged in an up-and-down overlapping mode, the occupied area of the first semiconductor structure and the occupied area of the second semiconductor structure are equivalent to the area of one semiconductor structure, the occupied area of the calculation structure is reduced, the calculation structure can be integrated in a smaller chip, and the manufacturing cost of the chip is reduced. Meanwhile, the memory, the logic circuit and the reconfigurable array are respectively arranged in the two semiconductor structures, so that the storage space of the computational logic resource is increased.
As shown in fig. 3, the operation data is input to the first semiconductor structure, which transmits the operation data to the second semiconductor structure. The second semiconductor structure performs the data operation based on the operation data and, after completing it, transmits the operation result to the first semiconductor structure. If no further calculation is needed, the operation result is output from the first semiconductor structure. If a further calculation is needed on this basis, the first semiconductor structure inputs the data for the next calculation into the second semiconductor structure, and the second semiconductor structure processes the data in the same manner.
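The round trip between the two semiconductor structures can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the patented circuit: the classes `FirstStructure` and `SecondStructure`, the `trace` log, and the example calculation steps are all hypothetical.

```python
from typing import Callable, List, Sequence


class SecondStructure:
    """Stands in for the structure that performs the data operation."""

    def compute(self, step: Callable[[float], float], data: float) -> float:
        return step(data)


class FirstStructure:
    """Stands in for the structure that receives input, forwards data,
    collects each result, and finally outputs the result."""

    def __init__(self, compute_structure: SecondStructure) -> None:
        self.compute_structure = compute_structure
        self.trace: List[str] = []

    def run(self, data: float, steps: Sequence[Callable[[float], float]]) -> float:
        self.trace.append("input->first")
        for step in steps:
            # Forward the data for this calculation to the second structure.
            self.trace.append("first->second")
            data = self.compute_structure.compute(step, data)
            # The result always returns to the first structure before
            # either the next calculation or the final output.
            self.trace.append("second->first")
        self.trace.append("first->output")
        return data


first = FirstStructure(SecondStructure())
# Two successive calculations: multiply by 3, then subtract 1.
result = first.run(2.0, [lambda x: x * 3.0, lambda x: x - 1.0])
```

Each extra calculation adds one "first->second" and one "second->first" hop, which is why the description stresses a short transfer distance between the two structures.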
Therefore, according to the computing method based on the reconfigurable computing structure provided by the embodiment of the invention, the logic circuit configures the reconfigurable operators in the reconfigurable array and determines the operation array corresponding to the operation algorithm, so that the corresponding reconfigurable operators can be flexibly configured according to the operation algorithms in different scenes, and the second semiconductor structure executes the operation algorithm based on the corresponding operation array. Meanwhile, the memory and the logic circuit are both arranged on the first semiconductor structure, so that the data transfer distance between the memory and the logic circuit is effectively reduced, and power consumption and time delay are reduced. In addition, because the memory and the logic circuit are arranged in one semiconductor structure and the reconfigurable array in the other, the limitation of the planar process is overcome, and the storage space for computational logic resources is increased.
Based on any one of the above embodiments, the operation algorithm comprises a first operator algorithm and a second operator algorithm, wherein the complexity of the first operator algorithm is higher than that of the second operator algorithm; the second semiconductor structure further comprises a basic computation array, the basic computation array being a pre-configured fixed array;
inputting the operation data into the first semiconductor structure and executing the operation algorithm through the second semiconductor structure comprises:
inputting the operation data into the first semiconductor structure, executing the first operator algorithm through the reconfigurable array, and executing the second operator algorithm through the basic computation array.
Specifically, the operation algorithm comprises a first operator algorithm and a second operator algorithm, and the complexity of the first operator algorithm is higher than that of the second operator algorithm. For example, the first operator algorithm may be a function operation whose weights or coefficients in the corresponding calculation formula change with the scene, while the second operator algorithm may be a basic addition or subtraction operation whose calculation formula is fixed.
On this basis, the reconfigurable array is used for executing the first operator algorithm; that is, the logic circuit can configure the reconfigurable operators in the reconfigurable array according to the scene, determine a computing array matched with the first operator algorithm, and thus flexibly execute the first operator algorithm in different scenes. Meanwhile, the operation formula of the second operator algorithm is the same in different scenes, so the hardware resources it requires are also fixed, and the second operator algorithm can be executed by the basic computation array.
It should be noted that the reconfigurable array and the basic computation array may execute the corresponding operator algorithms based on the order of the first operator algorithm and the second operator algorithm.
As shown in fig. 4, after the operation algorithm is obtained, if the first operator algorithm precedes the second operator algorithm in the calculation order of the operation algorithm, the arrays in the second semiconductor structure execute in the order "reconfigurable array executes the first operator algorithm → basic computation array executes the second operator algorithm"; if the first operator algorithm follows the second operator algorithm in the calculation order, the arrays execute in the order "basic computation array executes the second operator algorithm → reconfigurable array executes the first operator algorithm".
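The ordering rule can be illustrated with a small dispatch sketch. It assumes, purely for illustration, that the first operator algorithm is a multiplication by a scene-dependent weight and that the second operator algorithm is a fixed "add 1" operation; the function names `reconfigurable_array`, `basic_array` and `execute` are hypothetical.

```python
from typing import List, Sequence, Tuple


def reconfigurable_array(x: float, weight: float) -> float:
    """First operator algorithm (assumed): the weight varies by scene."""
    return weight * x


def basic_array(x: float) -> float:
    """Second operator algorithm (assumed): a fixed addition."""
    return x + 1.0


def execute(order: Sequence[str], x: float, weight: float) -> Tuple[float, List[str]]:
    """Run the operator algorithms in their calculation order and record
    which array handled each one."""
    trace: List[str] = []
    for kind in order:
        if kind == "first":
            x = reconfigurable_array(x, weight)
            trace.append("reconfigurable array")
        elif kind == "second":
            x = basic_array(x)
            trace.append("basic computation array")
        else:
            raise ValueError(f"unknown operator algorithm: {kind}")
    return x, trace


# First operator algorithm before the second: (3 * 2) + 1
result_fs, trace_fs = execute(["first", "second"], 2.0, weight=3.0)

# Second operator algorithm before the first: 3 * (2 + 1)
result_sf, trace_sf = execute(["second", "first"], 2.0, weight=3.0)
```

Only the `"first"` branch takes a scene-dependent parameter; the `"second"` branch needs no configuration, mirroring the fixed basic computation array.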
Therefore, the reconfigurable array and the basic computation array each execute the corresponding operator algorithm according to the complexity of the operator algorithms in the operation algorithm: a corresponding computing array can be flexibly configured for the first operator algorithm with higher complexity, while the second operator algorithm with lower complexity is executed by the basic computation array. The second operator algorithm therefore does not require the logic circuit to configure any reconfigurable operator, which saves logic computation in the logic circuit.
Based on any of the above embodiments, after causing the second semiconductor structure to execute the operation algorithm, the method further includes:
and inputting the operation result of the operation algorithm into the first semiconductor structure so that the interface circuit of the first semiconductor structure outputs the operation result.
Specifically, the first semiconductor structure may acquire the operation algorithm from an external data stream through the interface circuit and store the operation algorithm in the memory, so that the logic circuit may acquire the operation algorithm from the memory.
In addition, after the second semiconductor structure finishes executing the operation algorithm, the corresponding operation result may be input to the interface circuit, so that the interface circuit outputs the operation result.
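The role of the interface circuit and the memory can be sketched as follows. This is an assumption-laden software model: the classes `Memory` and `InterfaceCircuit`, the dictionary encoding of an operation algorithm, and the placeholder computation standing in for the second semiconductor structure are all hypothetical.

```python
from typing import Any, Dict, Iterable, List, Tuple


class Memory:
    """Holds operation algorithms for the logic circuit to read."""

    def __init__(self) -> None:
        self._cells: Dict[str, Any] = {}

    def store(self, key: str, value: Any) -> None:
        self._cells[key] = value

    def load(self, key: str) -> Any:
        return self._cells[key]


class InterfaceCircuit:
    """Acquires algorithms from an external stream and outputs results."""

    def __init__(self, memory: Memory) -> None:
        self.memory = memory
        self.sent: List[float] = []

    def receive(self, stream: Iterable[Tuple[str, Any]]) -> None:
        # Store every (name, algorithm) record arriving on the stream.
        for name, algorithm in stream:
            self.memory.store(name, algorithm)

    def output(self, result: float) -> float:
        self.sent.append(result)
        return result


memory = Memory()
interface = InterfaceCircuit(memory)

# External data stream carrying one operation algorithm (assumed encoding).
interface.receive([("scale_then_shift", {"weight": 2.0, "offset": 1.0})])

# The logic circuit reads the algorithm back from the memory.
algorithm = memory.load("scale_then_shift")

# Placeholder for the computation done by the second semiconductor structure.
result = algorithm["weight"] * 4.0 + algorithm["offset"]

# The result returns through the interface circuit.
returned = interface.output(result)
```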
Based on any one of the above embodiments, a hardware architecture of a reconfigurable computing structure includes: the reconfigurable computing structure described in any of the above embodiments.
Specifically, in the reconfigurable computing structure provided in the above embodiments, the logic circuit configures the reconfigurable operators in the reconfigurable array and determines the operation array corresponding to the operation algorithm, so that the corresponding reconfigurable operators can be flexibly configured according to the operation algorithms in different scenes, and the second semiconductor structure executes the operation algorithm based on the corresponding operation array. Meanwhile, the memory and the logic circuit are both arranged on the first semiconductor structure, so that the data transfer distance between the memory and the logic circuit is effectively reduced, and power consumption and time delay are reduced. In addition, because the memory and the logic circuit are arranged in one semiconductor structure and the reconfigurable array in the other, the limitation of the planar process is overcome, and the storage space for computational logic resources is increased.
Therefore, the hardware architecture including the reconfigurable computing structure of any of the above embodiments also has all the advantages of the reconfigurable computing structure described above. The reconfigurable computing structure can be packaged in a wafer to obtain the hardware architecture, so that data transmission and data transfer take place at the wafer level; as a result, data signals travel from the first semiconductor structure to the second semiconductor structure faster, which effectively reduces time delay and power consumption.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (9)

1. A reconfigurable computing architecture, comprising:
a first semiconductor structure and a second semiconductor structure;
the first semiconductor structure is connected to the second semiconductor structure, the first semiconductor structure including a memory and a logic circuit, the second semiconductor structure including a reconfigurable array, the reconfigurable array including a plurality of reconfigurable operators;
the logic circuit is used for acquiring an operation algorithm to be operated from the memory, configuring a reconfigurable operator in the reconfigurable array based on the operation algorithm, and determining an operation array corresponding to the operation algorithm so that the second semiconductor structure executes the operation algorithm based on the operation array;
the second semiconductor structure further includes a basic compute array, which is a pre-configured fixed array.
2. The reconfigurable computing architecture of claim 1, wherein the operational algorithms include a first operator algorithm and a second operator algorithm, the first operator algorithm having a higher complexity than the second operator algorithm;
the reconfigurable array is used for executing the first operator algorithm, and the basic compute array is used for executing the second operator algorithm.
3. The reconfigurable computing structure of claim 1, wherein the first semiconductor structure and the second semiconductor structure are connected after a bonding process is performed on a first bonding layer and a second bonding layer; the first bonding layer is arranged on the first semiconductor structure, and the second bonding layer is arranged on the second semiconductor structure.
4. The reconfigurable computing architecture of claim 1, wherein the first semiconductor architecture further comprises an interface circuit for obtaining the operational algorithm from an external data stream, for causing the memory to store the operational algorithm, and for outputting an operational result of the operational algorithm.
5. The reconfigurable computing architecture of claim 4, wherein the interface circuitry includes an interface for communication.
6. A computing method based on a reconfigurable computing architecture according to any of claims 1 to 5, comprising:
acquiring the operation algorithm and operation data corresponding to the operation algorithm;
and inputting the operation data into the first semiconductor structure, and executing the operation algorithm through the second semiconductor structure.
7. The reconfigurable computing architecture-based computing method of claim 6, wherein the operation algorithm includes a first operator algorithm and a second operator algorithm, the first operator algorithm having a higher complexity than the second operator algorithm; the second semiconductor structure further comprises a basic compute array, the basic compute array being a pre-configured fixed array;
the inputting the operation data into the first semiconductor structure and the executing the operation algorithm through the second semiconductor structure comprise:
inputting the operational data into the first semiconductor structure, executing the first operator algorithm through the reconfigurable array, and executing the second operator algorithm through the basic compute array.
8. The reconfigurable computing architecture-based computing method of claim 6, further comprising, after the performing of the operational algorithm by the second semiconductor architecture:
and inputting the operation result of the operation algorithm to the first semiconductor structure, and outputting the operation result through an interface circuit of the first semiconductor structure.
9. A hardware architecture for a reconfigurable computing architecture, comprising: a reconfigurable computing structure according to any of claims 1 to 5.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110640708.3A CN113254390B (en) 2021-06-09 2021-06-09 Reconfigurable computing structure, computing method and hardware architecture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110640708.3A CN113254390B (en) 2021-06-09 2021-06-09 Reconfigurable computing structure, computing method and hardware architecture

Publications (2)

Publication Number Publication Date
CN113254390A CN113254390A (en) 2021-08-13
CN113254390B true CN113254390B (en) 2021-10-29

Family

ID=77187165

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110640708.3A Active CN113254390B (en) 2021-06-09 2021-06-09 Reconfigurable computing structure, computing method and hardware architecture

Country Status (1)

Country Link
CN (1) CN113254390B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114661656B (en) * 2022-05-25 2022-08-30 广州万协通信息技术有限公司 Reconfigurable array configuration method, device, equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102043761A (en) * 2011-01-04 2011-05-04 东南大学 Fourier transform implementation method based on reconfigurable technology
CN102438149A (en) * 2011-10-10 2012-05-02 上海交通大学 Realization method of AVS (Audio Video Standard) inverse transformation based on reconfiguration technology
CN211719590U (en) * 2020-01-21 2020-10-20 深圳市汇顶科技股份有限公司 Communication interface and packaging structure
CN112463719A (en) * 2020-12-04 2021-03-09 上海交通大学 In-memory computing method realized based on coarse-grained reconfigurable array

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10031733B2 (en) * 2001-06-20 2018-07-24 Scientia Sol Mentis Ag Method for processing data
JP4438000B2 (en) * 2005-11-15 2010-03-24 株式会社半導体理工学研究センター Reconfigurable logic block, programmable logic circuit device having reconfigurable logic block, and method for configuring reconfigurable logic block
EP2894572B1 (en) * 2014-01-09 2018-08-29 Université de Rennes 1 Method and device for programming a FPGA
US9972536B2 (en) * 2014-10-08 2018-05-15 Taiyo Yuden Co., Ltd. Reconfigurable semiconductor device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant